
DiskLRU

DiskLRU is an experimental LRU store implemented in Rust.

It works as a simple key-value store and is powered by sled, a lightweight, pure-Rust, high-performance transactional embedded database.

It automatically expires entries once their number exceeds a given limit, following the LRU (least recently used) strategy, a fairly common cache replacement policy.

This comes at a cost, but it is sometimes better suited than simply expiring old keys in FIFO order (first-in, first-out): a key-value pair may have been added to the store a very long time ago yet still be relevant. With DiskLRU, reading a value often keeps its key alive, so it does not disappear.
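For instance, a key that keeps being read can outlive many newer insertions. Below is a minimal sketch of that behavior; it only reuses the API shown in the Usage section further down:

use disklru::Store;

// capacity of 2: only the 2 most recently used keys survive
let mut store = Store::open_temporary(2).unwrap();
store.insert(&1, &10).unwrap();
store.insert(&2, &20).unwrap();
for i in 3..=5 {
    // reading key 1 before each insert keeps it the most recently used,
    // so eviction always hits one of the newer keys instead
    assert_eq!(Some(10), store.get(&1).unwrap());
    store.insert(&i, &(i * 10)).unwrap();
}
// key 1 is still there, while keys 2, 3 and 4 have been dropped along the way
assert_eq!(Some(10), store.get(&1).unwrap());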


  • HashLRU is very similar in design, but acts as an in-memory cache, whereas DiskLRU is a persistent store.
  • MenhirKV solves the same problem DiskLRU addresses, namely "store stuff, keep the most used values at hand, drop the rest". It does so in a less pedantic and precise way, but much more efficiently.

Status

For now this is a toy project, clearly NOT suitable for production use.

It has at least 2 significant flaws:

  • it is very (very) slow compared to other alternatives; this slowness is what motivated me to start MenhirKV, which takes a much more pragmatic approach to expiration.
  • it has serious stability issues; among other things, turning on disk writes (i.e. not using a temporary store) on my arm64-based laptop causes the benchmarks to simply hang, with symptoms of a deadlock.


Usage

use disklru::Store;

let mut store = Store::open_temporary(4).unwrap();
store.insert(&1, &10).unwrap();
store.insert(&2, &20).unwrap();
store.insert(&3, &30).unwrap();
store.insert(&4, &40).unwrap();
store.insert(&5, &50).unwrap();
// key1 has been dropped, size is limited to 4
assert_eq!(Some(2), store.lru().unwrap());
assert_eq!(Some(20), store.get(&2).unwrap());
// getting key2 has made key3 the least recently used item
assert_eq!(Some(3), store.lru().unwrap());
assert_eq!(Some(40), store.get(&4).unwrap());
// getting key4 makes it the most recently used item
assert_eq!("[3: 30, 5: 50, 2: 20, 4: 40]", format!("{}", store));
store.flush().unwrap(); // commit

Benchmarks

Taken from a random CI job:

running 12 tests
test tests::bench_read_usize_disklru_compress    ... bench:     169,738 ns/iter (+/- 21,901)
test tests::bench_read_usize_disklru_persistent  ... bench:     111,211 ns/iter (+/- 12,312)
test tests::bench_read_usize_disklru_temporary   ... bench:     105,257 ns/iter (+/- 25,468)
test tests::bench_read_usize_hashlru             ... bench:         157 ns/iter (+/- 16)
test tests::bench_read_usize_hashmap             ... bench:          40 ns/iter (+/- 10)
test tests::bench_read_usize_lru                 ... bench:          34 ns/iter (+/- 15)
test tests::bench_write_usize_disklru_compress   ... bench:     151,871 ns/iter (+/- 15,723)
test tests::bench_write_usize_disklru_persistent ... bench:     100,999 ns/iter (+/- 5,399)
test tests::bench_write_usize_disklru_temporary  ... bench:      95,390 ns/iter (+/- 15,908)
test tests::bench_write_usize_hashlru            ... bench:         239 ns/iter (+/- 32)
test tests::bench_write_usize_hashmap            ... bench:         117 ns/iter (+/- 26)
test tests::bench_write_usize_lru                ... bench:          84 ns/iter (+/- 25)

test result: ok. 0 passed; 0 failed; 0 ignored; 12 measured; 0 filtered out; finished in 98.17s

This is not the result of extensive, thorough benchmarking, just a random snapshot at some point in development history.

TL;DR -> performance is not great.

Granted, the comparison is with non-persistent LRU caches, so some performance loss is expected, but the gap here is huge. Another project I started, which uses a Bloom filter to expire old entries, is about 10X faster.
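For the curious, here is a rough sketch of the general idea behind Bloom-filter-based expiration. It is only an illustration of the concept, not MenhirKV's actual implementation; the Bloom type and the sweep helper are toy code written for this example:

use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// A toy Bloom filter: k seeded hash probes into a fixed bit array.
struct Bloom {
    bits: Vec<u64>,
    k: u64,
}

impl Bloom {
    fn new(num_bits: usize, k: u64) -> Self {
        Bloom { bits: vec![0; (num_bits + 63) / 64], k }
    }
    fn probe<T: Hash>(&self, item: &T, seed: u64) -> usize {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        item.hash(&mut h);
        (h.finish() as usize) % (self.bits.len() * 64)
    }
    fn set<T: Hash>(&mut self, item: &T) {
        for seed in 0..self.k {
            let i = self.probe(item, seed);
            self.bits[i / 64] |= 1 << (i % 64);
        }
    }
    fn contains<T: Hash>(&self, item: &T) -> bool {
        (0..self.k).all(|seed| {
            let i = self.probe(item, seed);
            (self.bits[i / 64] & (1 << (i % 64))) != 0
        })
    }
    fn clear(&mut self) {
        self.bits.iter_mut().for_each(|w| *w = 0);
    }
}

// Mark keys in the filter on every read/write; once the store grows past its
// limit, sweep out the keys that are (probably) cold and start a new generation.
fn sweep<K: Hash + Eq, V>(map: &mut HashMap<K, V>, recent: &mut Bloom) {
    // false positives may spare a few cold keys, but no recently used key is lost
    map.retain(|k, _| recent.contains(k));
    recent.clear();
}

The trade-off is that recency tracking becomes approximate, but there is no per-key bookkeeping on the hot path, which is why this kind of approach can be much faster than strict LRU.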

To run the benchmarks:

cd bench
rustup default nightly
cargo bench

Links

  • crate on crates.io
  • doc on docs.rs
  • source on gitlab.com
  • sled, the database powering this LRU store
  • HashLRU, a similar in-memory cache
  • MenhirKV, recommended alternative

License

DiskLRU is licensed under the MIT license.
