#raft #server #cluster #state-machine #framework #top #built

raftpico

A simple Raft framework built on top of the raftbare crate

3 unstable releases

0.1.1 Dec 8, 2024
0.1.0 Dec 3, 2024
0.0.1 Oct 28, 2024

#262 in Algorithms

Download history 146/week @ 2024-10-28 3/week @ 2024-11-18 246/week @ 2024-12-02 71/week @ 2024-12-09

320 downloads per month

MIT license

85KB
2K SLoC

raftpico

raftpico Documentation Actions Status License

A simple Raft framework for Rust built on top of the raftbare crate.

Features

  • JSON-RPC API: The framework exposes a JSON-RPC API for various operations including:
    • CreateCluster: Initialize a new Raft cluster.
    • AddServer: Add a new server to the cluster.
    • RemoveServer: Remove a server from the cluster.
    • Apply: Submit a command, perform a consistent query, or execute a local query on the state machine.
    • TakeSnapshot: Trigger a snapshot of the current state.
    • GetServerState: Retrieve the current state of an individual server.
  • Custom State Machines: Provides a Machine trait that users can implement to define their own state machines that will be replicated across the Raft cluster.
  • Simple Codebase: Designed to be easily understandable and modifiable, enabling users to add their own features with ease.
  • JSON Serialization: Utilizes JSON Lines as the serialization format for persistent storage and communication between servers, offering simplicity and human-readability.

Example: Key-Value Store

The code snippets provided below are from examples/kvs.rs and demonstrate the implementation of a key-value store (KVS) using raftpico:

#[derive(Debug, Default, Serialize, Deserialize)]
struct KvsMachine {
    entries: HashMap<String, serde_json::Value>,
}

impl Machine for KvsMachine {
    type Input = KvsInput;

    fn apply(&mut self, ctx: &mut ApplyContext, input: Self::Input) {
        match input {
            KvsInput::Put(key, value) => {
                let value = self.entries.insert(key, value);
                ctx.output(&value);
            }
            KvsInput::Get(key) => {
                let value = self.entries.get(&key);
                ctx.output(&value);
            }
            KvsInput::Delete(key) => {
                let value = self.entries.remove(&key);
                ctx.output(&value);
            }
            KvsInput::List => {
                ctx.output(&self.entries.keys().collect::<Vec<_>>());
            }
        }
    }
}

#[derive(Debug, Clone, Serialize, Deserialize)]
enum KvsInput {
    Put(String, serde_json::Value),
    Get(String),
    Delete(String),
    List,
}

First, launch two KVS servers to form the cluster:

/// Start a KVS server in one terminal.
$ cargo run --release --example kvs 127.0.0.1:4000

/// Start another KVS server in a different terminal.
$ cargo run --release --example kvs 127.0.0.1:4001

To set up a Raft cluster, run the following commands:

$ cargo install jlot

// Initialize a Raft cluster with a server.
$ jlot req CreateCluster | jlot call :4000
{"jsonrpc":"2.0","id":0,"result":{"members":["127.0.0.1:4000"]}}

// Add an additional server to the cluster.
$ jlot req AddServer '{"addr":"127.0.0.1:4001"}' | jlot call :4000
{"jsonrpc":"2.0","id":0,"result":{"members":["127.0.0.1:4000","127.0.0.1:4001"]}}

To perform operations on the replicated state machine, use an Apply request as demonstrated below:

// Insert an entry into the KVS.
$ jlot req Apply '{"input":{"Put":["foo", 123]}, "kind":"Command"}' | jlot call :4000
{"jsonrpc":"2.0","id":0,"result":null}

// Retrieve the value of the previously inserted entry from another server.
$ jlot req Apply '{"input":{"Get":"foo"}, "kind":"Query"}' | jlot call :4001
{"jsonrpc":"2.0","id":0,"result":123}

Limitations

  • JSON Serialization:
    • JSON is evidently not the most efficient format for storage and internal communication.
      • This is particularly true if the replicated state machine (or its input/output) includes BLOBs.
    • Despite this drawback, raftpico intentionally uses JSON due to its emphasis on simplicity and readability.
      • Besides, the performance of raftpico appears adequate for many use cases (see the Benchmark section).
  • JSON-RPC (over TCP) API:
    • When the Raft server runs in the same process as the client, this can introduce unnecessary overhead.
    • [NOTE] Adding methods such as Server::apply() would be straightforward.
  • Single-Threaded Server:
    • Currently, Server does not utilize multi-threading.
    • Consequently, intensive tasks like handling large snapshots can block the server's running thread.
  • In-Memory Log Entries:
    • To maintain simplicity and minimize storage reads, raftpico also keeps log entries in memory.
    • This means memory usage increases with each proposed command until the next snapshot is taken.

In summary, raftpico is not suitable for the following purposes:

  • Processing large BLOBs that are inefficient to serialize using JSON
  • Managing large state machines where taking snapshots is prohibitively costly

Benchmark

This is a casual benchmark of the above example KVS server.

Environment:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal

$ cat /proc/cpuinfo | grep 'model name' | head -1
model name      : AMD EPYC 7763 64-Core Processor

$ cat /proc/cpuinfo | grep 'model name' | wc -l
8

Create a KVS cluster with three nodes:

$ parallel -u ::: 'cargo run --release --example kvs 127.0.0.1:4000 kvs-4000.jsonl' \
                  'cargo run --release --example kvs 127.0.0.1:4001 kvs-4001.jsonl' \
                  'cargo run --release --example kvs 127.0.0.1:4002 kvs-4002.jsonl'

$ echo $(jlot req CreateCluster) \
       $(jlot req AddServer '{"addr":"127.0.0.1:4001"}') \
       $(jlot req AddServer '{"addr":"127.0.0.1:4002"}') | jlot call :4000

Put command benchmark:

// Generate 100,000 random requests.
$ cargo install rjg
$ rjg --count 100000 \
      --var key='{"$str":["$alpha", "$alpha", "$alpha"]}' \
      --var params='{"kind":"Command", "input":{"Put":["$key", "$u32"]}}' \
      '{"jsonrpc":"2.0", "id":"$i", "method":"Apply", "params":"$params"}' > commands.jsonl

// Execute the requests with a concurrency of 1000.
$ cat commands.jsonl | jlot call :4000 :4001 :4002 -a -c 1000 | jlot stats | jq .
{
  "rpc_calls": 100000,
  "duration": 1.874101709,
  "max_concurrency": 1000,
  "rps": 53358.896969022506,
  "latency": {
    "min": 0.003427635,
    "p25": 0.015225686,
    "p50": 0.018969559,
    "p75": 0.02225708,
    "max": 0.037078306,
    "avg": 0.018675618
  }
}

Get query (consistent query) benchmark:

// Generate 100,000 random requests.
$ rjg --count 100000 \
      --var key='{"$str":["$alpha", "$alpha", "$alpha"]}' \
      --var params='{"kind":"Query", "input":{"Get":"$key"}}' \
      '{"jsonrpc":"2.0", "id":"$i", "method":"Apply", "params":"$params"}' > queries.jsonl

// Execute the requests with a concurrency of 1000.
$ cat queries.jsonl | jlot call :4000 :4001 :4002 -a -c 1000 | jlot stats | jq .
{
  "rpc_calls": 100000,
  "duration": 1.362893895,
  "max_concurrency": 1000,
  "rps": 73373.28339855833,
  "latency": {
    "min": 0.002436916,
    "p25": 0.01140849,
    "p50": 0.014045573,
    "p75": 0.01576976,
    "max": 0.024304805,
    "avg": 0.013583492
  }
}

Get local query benchmark:

// Generate 100,000 random requests.
$ rjg --count 100000 \
      --var key='{"$str":["$alpha", "$alpha", "$alpha"]}' \
      --var params='{"kind":"LocalQuery", "input":{"Get":"$key"}}' \
      '{"jsonrpc":"2.0", "id":"$i", "method":"Apply", "params":"$params"}' > local-queries.jsonl

// Execute the requests with a concurrency of 1000.
$ cat local-queries.jsonl | jlot call :4000 :4001 :4002 -a -c 1000 | jlot stats | jq .
{
  "rpc_calls": 100000,
  "duration": 0.410423888,
  "max_concurrency": 993,
  "rps": 243650.53527293712,
  "latency": {
    "min": 0.000639122,
    "p25": 0.003991307,
    "p50": 0.004037483,
    "p75": 0.00421849,
    "max": 0.004560348,
    "avg": 0.004078114
  }
}

Dependencies

~1.4–9MB
~89K SLoC