6 releases
0.1.9 | Jun 14, 2020 |
---|---|
0.1.8 | Jun 13, 2020 |
#28 in #algorithms
21 downloads per month
34KB
777 lines
lsm_engine
A rust implementation of a key-value store that uses Log Structured Merge Trees and leverages a Write-Ahead log (WAL) for data recovery.
Install
[dependencies]
lsm_engine = "*"
Docs
lib.rs
:
A rust implementation of a key-value store using Log Structured Merge Trees
Example Usage
use lsm_engine::{LSMEngine, LSMBuilder} ;
use std::fs::File;
fn main() -> Result<(), Box< dyn std::error::Error>> {
let mut lsm = LSMBuilder::new().
segment_size(2000). // each sst file will have up to 2000 entries
inmemory_capacity(100). //store only 100 entries in memory
sparse_offset(20). //store one out of every 20 entries written into segments in memory
wal_path("my_write_ahead_log.txt"). //path
build();
let mut default_lsm = LSMBuilder::new().build(); //an lsm engine with default parameters
let dataset = vec![("k1", "v1"), ("k2", "v2"), ("k1", "v_1_1")];
for (k, v) in dataset.iter() {
lsm.write(String::from(*k), String::from(*v));
}
assert_eq!(lsm.read("k1")?, Some("v_1_1".to_owned()));
let mut wal = File::open("my_write_ahead_log.txt")?;
default_lsm.recover_from(wal)?;
for (k, v) in dataset {
assert!(default_lsm.contains(k)?);
}
//cleanup
std::fs::remove_file("my_write_ahead_log.txt")?;
Ok(())
}
Design
lsm_engine
is an embedded key-value store that uses LSM-trees and leverages a Write-Ahead log (WAL) file for
data recovery.
The basic architecture is illustrated below:
Write
When a write comes in, the following happens:
- The entry is written into the WAL file (unless an explicit request is made not to)
- If the size of the internal is at full capacity, the contents are dumped into a segment file, with compaction performed in the end.
- The entry is then inserted into the now-empty memtable.
Read
When a request for a read is made, the following happens:
- It first checks its internal memtable for the value corresponding to the requested key. If it exists, it returns the value
- Otherwise, it looks up the offset of the closest key with its sparse memory index. This is a balanced tree that maintains
the position of 1 out of every
sparse_offset
entries in memeory. - It then linearly scans forward from that offset, looking for the desired key-value entry.
Delete
This is just a special case of write, with value being a special tombstone string. For more details with visual illustrations, check out this blog post
Dependencies
~4–13MB
~170K SLoC