#lsm-tree #key-value-store #algorithm #key-value-engine

lsm_engine

A rust implementation of a key-value store using LSM trees

6 releases

0.1.9 Jun 14, 2020
0.1.8 Jun 13, 2020

#2707 in Database interfaces

MIT/Apache

34KB
777 lines

lsm_engine

A rust implementation of a key-value store that uses Log Structured Merge Trees and leverages a Write-Ahead log (WAL) for data recovery.

Docs.rs Docs.rs

Install

[dependencies]
lsm_engine = "*"

Docs

https://docs.rs/lsm_engine/0.1.1/lsm_engine/


lib.rs:

A rust implementation of a key-value store using Log Structured Merge Trees

Example Usage

use lsm_engine::{LSMEngine, LSMBuilder} ;
use std::fs::File;


fn main() -> Result<(), Box< dyn std::error::Error>> {

  let mut lsm = LSMBuilder::new().
       segment_size(2000). // each sst file will have up to 2000 entries
       inmemory_capacity(100). //store only 100 entries in memory
       sparse_offset(20). //store one out of every 20 entries written into segments in memory
       wal_path("my_write_ahead_log.txt"). //path
       build();

  let mut default_lsm = LSMBuilder::new().build(); //an lsm engine with default parameters

  let dataset = vec![("k1", "v1"), ("k2", "v2"), ("k1", "v_1_1")];

  for (k, v) in dataset.iter() {
       lsm.write(String::from(*k), String::from(*v));
   }
   assert_eq!(lsm.read("k1")?, Some("v_1_1".to_owned()));


   let mut wal  = File::open("my_write_ahead_log.txt")?;
   default_lsm.recover_from(wal)?;
   for (k, v) in dataset {
       assert!(default_lsm.contains(k)?);
   }

   //cleanup
  std::fs::remove_file("my_write_ahead_log.txt")?;




   Ok(())
}

Design

lsm_engine is an embedded key-value store that uses LSM-trees and leverages a Write-Ahead log (WAL) file for data recovery.

The basic architecture is illustrated below:

Write

When a write comes in, the following happens:

  • The entry is written into the WAL file (unless an explicit request is made not to)
  • If the size of the internal is at full capacity, the contents are dumped into a segment file, with compaction performed in the end.
  • The entry is then inserted into the now-empty memtable.

Read

When a request for a read is made, the following happens:

  • It first checks its internal memtable for the value corresponding to the requested key. If it exists, it returns the value
  • Otherwise, it looks up the offset of the closest key with its sparse memory index. This is a balanced tree that maintains the position of 1 out of every sparse_offset entries in memeory.
  • It then linearly scans forward from that offset, looking for the desired key-value entry.

Delete

This is just a special case of write, with value being a special tombstone string. For more details with visual illustrations, check out this blog post

Dependencies

~3–13MB
~160K SLoC