9 releases

new 0.1.8 Jun 17, 2024
0.1.7 Jun 7, 2024
0.1.2 May 28, 2024

#217 in Database interfaces

Download history 201/week @ 2024-05-22 314/week @ 2024-05-29 471/week @ 2024-06-05

986 downloads per month

MIT license

145KB
3K SLoC

RLDB: A Rust implementation of the Amazon Dynamo paper


Introduction

RLDB (Rusty Learning Dynamo Database) is an educational project that provides a Rust implementation of the Amazon Dynamo paper. It aims to help developers and students understand the principles behind distributed key-value data stores.

Features

| Feature | Description | Status | Resources |
| --- | --- | --- | --- |
| InMemory Storage Engine | A simple in-memory storage engine | Implemented | Designing Data-Intensive Applications - chapter 3 |
| LSMTree | An LSMTree-backed storage engine | TODO | Designing Data-Intensive Applications - chapter 3 |
| Log Structured HashTable | Similar to the Bitcask storage engine | TODO | Bitcask intro paper |
| TCP server | A tokio-backed TCP server for incoming requests | Implemented | tokio |
| PUT/GET/DEL client APIs | TCP APIs for PUT, GET and DELETE | WIP | N/A |
| PartitioningScheme via Consistent Hashing | A functional consistent-hashing implementation | Implemented | Designing Data-Intensive Applications - chapter 6, Consistent Hashing by David Karger |
| Leaderless replication of partitions | Replicating partition data using the leaderless replication approach | Implemented | Designing Data-Intensive Applications - chapter 5 |
| Quorum | Quorum-based reads and writes for tunable consistency guarantees | Implemented | Designing Data-Intensive Applications - chapter 5 |
| Node discovery and failure detection | A gossip-based mechanism to discover cluster nodes and detect failures | Implemented | Dynamo Paper |
| Re-sharding/rebalancing | Moving data between nodes after cluster state changes | TODO | Designing Data-Intensive Applications - chapter 6 |
| Data versioning | Versioning and conflict detection / resolution (via VersionVectors) | WIP | Vector clock wiki, Lamport clock paper (not that easy to parse) |
| Reconciliation via Read repair | GETs can trigger repair in case of missing replicas | TODO | Dynamo Paper |
| Active anti-entropy | Use Merkle trees to detect missing replicas and trigger reconciliation | TODO | Dynamo Paper |
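
To make the partitioning, replication and quorum entries above more concrete, here is a minimal sketch of a consistent-hashing ring with a Dynamo-style preference list and a quorum overlap check. It is an illustration only, not RLDB's actual code: `Ring`, `ring_position`, `preference_list` and `quorums_overlap` are hypothetical names, the node addresses are placeholders, and the sketch omits virtual nodes and uses the standard library's `DefaultHasher` rather than whatever hash function the project uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Map any hashable value to a position on the ring (hypothetical helper).
fn ring_position<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical ring: sorted map from ring position to node name.
/// One position per node; no virtual nodes in this sketch.
#[derive(Default)]
struct Ring {
    nodes: BTreeMap<u64, String>,
}

impl Ring {
    fn add_node(&mut self, name: &str) {
        self.nodes.insert(ring_position(&name), name.to_string());
    }

    fn remove_node(&mut self, name: &str) {
        self.nodes.remove(&ring_position(&name));
    }

    /// Walk clockwise from the key's position, wrapping around the ring,
    /// and return up to `n` nodes that should hold replicas of the key.
    fn preference_list(&self, key: &str, n: usize) -> Vec<&str> {
        let start = ring_position(&key);
        self.nodes
            .range(start..)
            .chain(self.nodes.range(..start))
            .take(n)
            .map(|(_, name)| name.as_str())
            .collect()
    }
}

/// With N replicas, a read quorum R and a write quorum W share at least
/// one replica whenever R + W > N.
fn quorums_overlap(n: usize, r: usize, w: usize) -> bool {
    r + w > n
}

fn main() {
    let mut ring = Ring::default();
    for node in ["127.0.0.1:3001", "127.0.0.1:3002", "127.0.0.1:3003"] {
        ring.add_node(node);
    }
    println!("replicas for foo: {:?}", ring.preference_list("foo", 2));

    // Removing a node only changes ownership for keys that mapped to it.
    ring.remove_node("127.0.0.1:3002");
    println!("after removal: {:?}", ring.preference_list("foo", 2));

    println!("N=3, R=2, W=2 overlap: {}", quorums_overlap(3, 2, 2));
}
```

Keeping the ring in a `BTreeMap` makes the clockwise walk a pair of range scans. A real implementation would typically place several virtual nodes per physical node to even out the key distribution.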

Running the server

  1. Start each node in its own terminal, using its config file
cargo run --bin rldb-server -- --config-path conf/node_1.json
cargo run --bin rldb-server -- --config-path conf/node_2.json
cargo run --bin rldb-server -- --config-path conf/node_3.json
  2. Add the new nodes to the cluster

In this example, we assume the node on port 3001 is the initial cluster node, and we join the other nodes to it.

cargo run --bin rldb-client join-cluster -p 3002 --known-cluster-node 127.0.0.1:3001
cargo run --bin rldb-client join-cluster -p 3003 --known-cluster-node 127.0.0.1:3001

PUT

cargo run --bin rldb-client put -p 3001 -k foo -v bar

GET

cargo run --bin rldb-client get -p 3001 -k foo

Documentation

See rldb docs

License

This project is licensed under the MIT license. See License for details.

Acknowledgments

This project was inspired by the original Dynamo paper but also by many other authors and resources like:

and many others. When a module in this project is based on a specific resource, that resource is cited in the module documentation.

Dependencies

~6–17MB
~209K SLoC