#cache #key-value-store #disk-cache #hashmap

scdb

A very simple and fast key-value store but persisting data to disk, with a 'localStorage-like' API

6 releases

0.2.1 Mar 6, 2023
0.2.0 Jan 16, 2023
0.1.2 Jan 12, 2023
0.0.2 Nov 9, 2022
0.0.1 Oct 26, 2022

#214 in Database interfaces

209 downloads per month

Custom license

250KB
5K SLoC

scdb

A very simple and fast key-value store but persisting data to disk, with a "localStorage-like" API.

scdb may not be production-ready yet. It works quite well, but it requires more rigorous testing.

Purpose

In front-end web development, localStorage was always a convenient way of quickly persisting data to be used later by a given application, even after a restart. Its API was extremely simple: localStorage.getItem(), localStorage.setItem(), localStorage.removeItem() and localStorage.clear().

In backend (or even desktop) development, such an embedded persistent data store with a simple API was hard to come by.

scdb is meant to be like the 'localStorage' of backend and desktop (and possibly mobile) systems. Of course, to make it a little more appealing, it has some extra features like:

  • Time-to-live (TTL), where a key-value pair expires after a given time.
  • Non-blocking reads from separate processes and threads.
  • Fast sequential writes to the store, queueing any writes from multiple processes and threads.
  • Optional searching of keys that begin with a given subsequence. This option is turned on when the store is created, via the last argument of Store::new().
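
To make the 'localStorage-like' API and the TTL feature concrete, here is a minimal in-memory sketch using only the standard library. `ToyStore` and its methods are hypothetical names made up for illustration. This is not how scdb is implemented (scdb persists to disk); it only mirrors the set/get/delete/clear shape and the idea that a pair with a TTL stops being readable once the TTL has elapsed.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-memory stand-in; NOT scdb's implementation.
struct ToyStore {
    // Each value is stored with an optional expiry instant.
    entries: HashMap<Vec<u8>, (Vec<u8>, Option<Instant>)>,
}

impl ToyStore {
    fn new() -> Self {
        ToyStore { entries: HashMap::new() }
    }

    /// Like localStorage.setItem(), plus an optional time-to-live.
    fn set(&mut self, key: &[u8], value: &[u8], ttl: Option<Duration>) {
        let expiry = ttl.map(|d| Instant::now() + d);
        self.entries.insert(key.to_vec(), (value.to_vec(), expiry));
    }

    /// Like localStorage.getItem(); expired pairs read as None.
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        let (value, expiry) = self.entries.get(key)?;
        match expiry {
            Some(t) if Instant::now() >= *t => None, // the ttl has elapsed
            _ => Some(value.clone()),
        }
    }

    /// Like localStorage.removeItem().
    fn delete(&mut self, key: &[u8]) {
        self.entries.remove(key);
    }

    /// Like localStorage.clear().
    fn clear(&mut self) {
        self.entries.clear();
    }
}

fn main() {
    let mut store = ToyStore::new();
    store.set(b"hey", b"English", None);
    store.set(b"hola", b"Spanish", Some(Duration::from_millis(50)));

    std::thread::sleep(Duration::from_millis(100));
    println!("hey  -> {:?}", store.get(b"hey")); // still readable, no ttl
    println!("hola -> {:?}", store.get(b"hola")); // expired by now

    store.delete(b"hey");
    store.clear();
}
```

The sketch checks expiry lazily on read, which keeps it short; whether scdb does the same internally is not something this README states.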

Documentation

Find the following documentation sites, depending on the programming language.

Quick Start

  • Create a new cargo project

    cargo new hello_scdb && cd hello_scdb
    
  • Add scdb to your dependencies in the Cargo.toml file

    [dependencies]
    scdb = { version = "0.2" }
    
  • Update your src/main.rs to the following.

use scdb::Store;
use std::thread;
use std::time::Duration;

/// Converts a byte array to string
macro_rules! to_str {
    ($arr:expr) => {
        std::str::from_utf8($arr).expect("bytes to str")
    };
}

/// Prints data from store to the screen in a pretty way
macro_rules! pprint_data {
    ($title:expr, $data:expr) => {
        println!("\n");
        println!("{}", $title);
        println!("===============");

        for (k, got) in $data {
            let got_str = match got {
                None => "None",
                Some(v) => to_str!(v),
            };
            println!("For key: '{}', str: '{}', raw: '{:?}',", k, got_str, got);
        }
    };
}

fn main() {
  // Create the store. You can configure its `max_keys`, `redundant_blocks` etc. The defaults are usable though.
  // One very important config is `max_keys`. With it, you can limit the store size to a number of keys.
  // By default, the limit is 1 million keys
  let mut store =
          Store::new("db", Some(1000), Some(1), Some(10), Some(1800), true).expect("create store");
  let records = [
    ("hey", "English"),
    ("hi", "English"),
    ("salut", "French"),
    ("bonjour", "French"),
    ("hola", "Spanish"),
    ("oi", "Portuguese"),
    ("mulimuta", "Runyoro"),
  ];
  let updates = [
    ("hey", "Jane"),
    ("hi", "John"),
    ("hola", "Santos"),
    ("oi", "Ronaldo"),
    ("mulimuta", "Aliguma"),
  ];
  let keys: Vec<&str> = records.iter().map(|(k, _)| *k).collect();

  // Setting the values
  println!("Let's insert data\n{:?}...", &records);
  for (k, v) in &records {
    let _ = store.set(k.as_bytes(), v.as_bytes(), None);
  }

  // Getting the values (this is similar to what is in the `get_all(&mut store, &keys)` function)
  let data: Vec<(&str, Option<Vec<u8>>)> = keys
          .iter()
          .map(|k| (*k, store.get(k.as_bytes()).expect(&format!("get {}", k))))
          .collect();
  pprint_data!("After inserting data", &data);

  // Setting the values with time-to-live
  println!(
    "\n\nLet's insert data with 1 second time-to-live (ttl) for keys {:?}...",
    &keys[3..]
  );
  for (k, v) in &records[3..] {
    let _ = store.set(k.as_bytes(), v.as_bytes(), Some(1));
  }

  println!("We will wait for 2 seconds, so that the 1 second ttl elapses...");
  thread::sleep(Duration::from_secs(2));

  let data = get_all(&mut store, &keys);
  pprint_data!("After inserting keys with ttl", &data);

  // Updating the values
  println!("\n\nLet's update with data {:?}...", &updates);
  for (k, v) in &updates {
    let _ = store.set(k.as_bytes(), v.as_bytes(), None);
  }

  let data = get_all(&mut store, &keys);
  pprint_data!("After updating keys", &data);

  // Search for keys that begin with the given bytes. It returns a list of key-value tuples.
  let data = store
          .search(&b"h"[..], 0, 0)
          .expect("search for keys starting with h");
  println!("\nSearching for keys starting with 'h'");
  println!("=======================================");
  for (k, v) in &data {
    // note that to_str! is a custom macro changing byte array to UTF-8 string
    println!("{}: {}", to_str!(k), to_str!(v));
  }

  // Search with pagination
  let data = store
          .search(&b"h"[..], 1, 1)
          .expect("search for keys starting with h");
  println!("\nPaginated search for keys starting with 'h'");
  println!("==============================================");
  println!("Skipping 1, returning 1 record only");
  println!("---");
  for (k, v) in &data {
    // note that to_str! is a custom macro changing byte array to UTF-8 string
    println!("{}: {}", to_str!(k), to_str!(v));
  }

  // Deleting some values
  let keys_to_delete = ["oi", "hi"];
  println!("\n\nLet's delete keys {:?}...", &keys_to_delete);
  for k in keys_to_delete {
    store
            .delete(k.as_bytes())
            .expect(&format!("delete key {}", k));
  }

  let data = get_all(&mut store, &keys);
  pprint_data!("After deleting keys", &data);

  // Deleting all values
  println!("\n\nClear all data...");
  store.clear().expect("clear store");

  let data = get_all(&mut store, &keys);
  pprint_data!("After clearing", &data);
}

/// Gets all from store for the given keys
fn get_all<'a>(store: &mut Store, keys: &[&'a str]) -> Vec<(&'a str, Option<Vec<u8>>)> {
  keys.iter()
          .map(|k| (*k, store.get(k.as_bytes()).expect(&format!("get {}", k))))
          .collect()
}

  • Run the main.rs file

    cargo run
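
The two `search` calls in main.rs take a term, a number of matches to skip and a limit. The sketch below shows one way to read those semantics, using only the standard library. `prefix_search` is a hypothetical helper, not part of scdb, and treating a limit of 0 as "no limit" is an assumption read off the `search(.., 0, 0)` call above. A `BTreeMap` returns matches in sorted key order, which may differ from the order scdb itself returns.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper, not part of scdb: returns the key-value pairs whose key
/// starts with `prefix`, skipping the first `skip` matches and returning at most
/// `limit` of the rest. `limit == 0` is treated as "no limit" (an assumption).
fn prefix_search(
    map: &BTreeMap<Vec<u8>, Vec<u8>>,
    prefix: &[u8],
    skip: usize,
    limit: usize,
) -> Vec<(Vec<u8>, Vec<u8>)> {
    let take = if limit == 0 { usize::MAX } else { limit };
    // In a sorted map, all keys sharing a prefix are contiguous and start at
    // the first key that is >= the prefix itself.
    map.range(prefix.to_vec()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .skip(skip)
        .take(take)
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

fn main() {
    let mut map = BTreeMap::new();
    let records = [
        ("hey", "English"),
        ("hi", "English"),
        ("salut", "French"),
        ("bonjour", "French"),
        ("hola", "Spanish"),
    ];
    for (k, v) in records {
        map.insert(k.as_bytes().to_vec(), v.as_bytes().to_vec());
    }

    println!("All keys starting with 'h':");
    for (k, v) in prefix_search(&map, b"h", 0, 0) {
        println!("{}: {}", String::from_utf8_lossy(&k), String::from_utf8_lossy(&v));
    }

    println!("Skipping 1, returning 1 record only:");
    for (k, v) in prefix_search(&map, b"h", 1, 1) {
        println!("{}: {}", String::from_utf8_lossy(&k), String::from_utf8_lossy(&v));
    }
}
```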
    

Contributing

Contributions are welcome. The docs have to be maintained, the code has to be made cleaner, more idiomatic and faster, and there might be a need for someone else to take over this repo in case I move on to other things. It happens!

Please look at the CONTRIBUTIONS GUIDELINES

You can also look in the ./docs folder to get up to speed with the internals of scdb.

Bindings

scdb is meant to be used in multiple languages of choice. However, the bindings for most of them are yet to be developed. Here are those that have been developed:

TODO:

  • compare benchmarks with those of redis, sqlite, lmdb etc.

How to Test

  • Make sure you have Rust installed on your computer.

  • Clone the repo and enter its root folder

    git clone https://github.com/sopherapps/scdb.git && cd scdb
    
  • Run the example

    cargo run --example hello_scdb
    
  • Lint

    cargo clippy
    
  • Run the test command

    cargo test
    
  • Run the bench test command

    cargo bench
    

Benchmarks

On an average PC (Core i7, 16GB RAM):

set(no ttl): 'foo'      time:   [8.4622 µs 9.3052 µs 10.396 µs]
set(ttl): 'foo'         time:   [9.0695 µs 9.2830 µs 9.5413 µs]
set(no ttl) with search: 'foo'
                        time:   [40.573 µs 41.152 µs 41.825 µs]
set(ttl) with search: 'foo'
                        time:   [42.494 µs 43.880 µs 45.353 µs]
update(no ttl): 'foo'   time:   [8.0398 µs 8.1054 µs 8.1814 µs]
update(ttl): 'fenecans' time:   [8.2151 µs 8.3078 µs 8.4137 µs]
update(no ttl) with search: 'foo'
                        time:   [40.757 µs 40.854 µs 40.960 µs]
update(ttl) with search: 'fenecans'
                        time:   [40.901 µs 40.985 µs 41.076 µs]
                        time:   [7.9638 µs 8.0066 µs 8.0609 µs]
get(no ttl): 'hey'      time:   [209.98 ns 213.70 ns 218.01 ns]
get(no ttl): 'hi'       time:   [205.34 ns 207.45 ns 209.70 ns]
get(no ttl): 'salut'    time:   [203.01 ns 204.54 ns 206.45 ns]
get(no ttl): 'bonjour'  time:   [206.43 ns 208.68 ns 210.97 ns]
get(no ttl): 'hola'     time:   [268.69 ns 297.50 ns 334.32 ns]
get(no ttl): 'oi'       time:   [192.04 ns 192.62 ns 193.25 ns]
get(no ttl): 'mulimuta' time:   [202.74 ns 203.14 ns 203.56 ns]
get(with ttl): 'hey'    time:   [230.27 ns 230.65 ns 231.06 ns]
get(with ttl): 'hi'     time:   [229.39 ns 229.89 ns 230.50 ns]
get(with ttl): 'salut'  time:   [231.72 ns 232.10 ns 232.51 ns]
get(with ttl): 'bonjour'
                        time:   [232.30 ns 232.68 ns 233.10 ns]
get(with ttl): 'hola'   time:   [231.98 ns 232.56 ns 233.16 ns]
get(with ttl): 'oi'     time:   [228.74 ns 229.30 ns 229.87 ns]
get(with ttl): 'mulimuta'
                        time:   [237.61 ns 237.94 ns 238.29 ns]
get(no ttl) with search: 'hey'
                        time:   [194.52 ns 194.86 ns 195.25 ns]
get(no ttl) with search: 'hi'
                        time:   [195.36 ns 195.61 ns 195.86 ns]
get(no ttl) with search: 'salut'
                        time:   [198.78 ns 199.01 ns 199.25 ns]
get(no ttl) with search: 'bonjour'
                        time:   [199.74 ns 200.18 ns 200.79 ns]
get(no ttl) with search: 'hola'
                        time:   [199.81 ns 200.20 ns 200.60 ns]
get(no ttl) with search: 'oi'
                        time:   [191.97 ns 192.37 ns 192.80 ns]
get(no ttl) with search: 'mulimuta'
                        time:   [198.39 ns 198.80 ns 199.22 ns]
get(with ttl) without search: 'hey'
                        time:   [232.84 ns 234.11 ns 235.46 ns]
get(with ttl) without search: 'hi'
                        time:   [230.81 ns 231.25 ns 231.76 ns]
get(with ttl) without search: 'salut'
                        time:   [233.56 ns 234.07 ns 234.67 ns]
get(with ttl) without search: 'bonjour'
                        time:   [233.81 ns 234.23 ns 234.67 ns]
get(with ttl) without search: 'hola'
                        time:   [234.02 ns 234.43 ns 234.86 ns]
get(with ttl) without search: 'oi'
                        time:   [228.52 ns 228.84 ns 229.18 ns]
get(with ttl) without search: 'mulimuta'
                        time:   [233.36 ns 233.74 ns 234.15 ns]
search (not paged): 'h' time:   [18.156 µs 18.274 µs 18.429 µs]
search (not paged): 'h' #2
                        time:   [18.093 µs 18.139 µs 18.192 µs]
search (not paged): 's' time:   [8.6507 µs 8.6653 µs 8.6807 µs]
search (not paged): 'b' time:   [8.6318 µs 8.6531 µs 8.6766 µs]
search (not paged): 'h' #3
                        time:   [18.106 µs 18.147 µs 18.188 µs]
search (not paged): 'o' time:   [8.6288 µs 8.6415 µs 8.6557 µs]
search (not paged): 'm' time:   [8.6453 µs 8.6657 µs 8.6873 µs]
search (paged): 'h'     time:   [16.161 µs 16.230 µs 16.319 µs]
search (paged): 'h' #2  time:   [15.949 µs 16.016 µs 16.093 µs]
search (paged): 's'     time:   [6.0744 µs 6.1114 µs 6.1544 µs]
search (paged): 'b'     time:   [6.2516 µs 6.3119 µs 6.3827 µs]
search (paged): 'h' #3  time:   [15.990 µs 16.026 µs 16.063 µs]
search (paged): 'o'     time:   [6.1061 µs 6.1790 µs 6.2617 µs]
search (paged): 'm'     time:   [6.5727 µs 6.6862 µs 6.7921 µs]
delete(no ttl): 'foo'   time:   [51.172 µs 52.554 µs 54.057 µs]
delete(ttl): 'foo'      time:   [53.211 µs 54.964 µs 56.804 µs]
delete(no ttl) with search: 'foo'
                        time:   [70.327 µs 70.698 µs 71.226 µs]
delete(ttl) with search: 'foo'
                        time:   [70.753 µs 71.086 µs 71.520 µs]
clear(no ttl)           time:   [144.05 µs 153.14 µs 170.79 µs]
clear(ttl)              time:   [142.17 µs 142.68 µs 143.23 µs]
clear(no ttl) with search
                        time:   [221.58 µs 223.04 µs 224.52 µs]
clear(ttl) with search  time:   [218.17 µs 226.53 µs 242.62 µs]
compact                 time:   [126.76 ms 128.26 ms 129.86 ms]
compact with search     time:   [128.80 ms 131.45 ms 134.50 ms]
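
Each `time: [...]` triple above is in Criterion's usual format: a lower bound, a point estimate and an upper bound for the time per operation. As a rough aid for reading them, the sketch below turns a few of the middle values into operations per second and into a ratio. `ops_per_sec` is a hypothetical helper, and the throughput figures assume operations run back to back on one thread, which a microbenchmark does not guarantee for a real workload.

```rust
/// Converts a time per operation, in nanoseconds, to operations per second.
fn ops_per_sec(nanos_per_op: f64) -> f64 {
    1e9 / nanos_per_op
}

fn main() {
    // Middle (point) estimates copied from the listing above, in nanoseconds.
    let set_no_search_ns = 9.3052e3; // set(no ttl): 9.3052 µs
    let set_with_search_ns = 41.152e3; // set(no ttl) with search: 41.152 µs
    let get_ns = 213.70; // get(no ttl): 'hey'

    println!("set (no search):   ~{:.0} ops/s", ops_per_sec(set_no_search_ns));
    println!("set (with search): ~{:.0} ops/s", ops_per_sec(set_with_search_ns));
    println!("get:               ~{:.0} ops/s", ops_per_sec(get_ns));
    println!(
        "cost of enabling search on set: ~{:.1}x",
        set_with_search_ns / set_no_search_ns
    );
}
```

On these numbers, a set without search works out to a little over 100,000 operations per second, and enabling search makes a set about 4.4 times slower, while gets stay in the low hundreds of nanoseconds either way.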

Acknowledgement

License

Copyright (c) 2022 Martin Ahindura. Licensed under the MIT License.

Gratitude

"For My Father’s will is that everyone who looks to the Son and believes in Him shall have eternal life, and I will raise them up at the last day."

-- John 6: 40

All glory be to God.

Dependencies

~1.4–8.5MB
~37K SLoC