#arena-allocator #skip-list #arena #lock-free

no-std skl

A lock-free thread-safe concurrent ARENA based (heap backend or memory map backend) skiplist implementation which helps develop MVCC memtable for LSM-Tree

2 unstable releases

new 0.20.0 Nov 12, 2024
0.18.6 Nov 1, 2024
0.13.1 Jul 13, 2024
0.4.0-alpha.1 Sep 19, 2023
0.1.0 Oct 25, 2021

#98 in Database interfaces

Download history 4/week @ 2024-07-23 60/week @ 2024-07-30 1/week @ 2024-08-06 15/week @ 2024-08-13 113/week @ 2024-09-10 111/week @ 2024-09-17 122/week @ 2024-09-24 8/week @ 2024-10-01 390/week @ 2024-10-08 215/week @ 2024-10-15 971/week @ 2024-10-22 1370/week @ 2024-10-29 173/week @ 2024-11-05

2,744 downloads per month
Used in orderwal

MIT/Apache

1MB
19K SLoC

SKL

github LoC Build codecov

docs.rs crates.io crates.io

wakatime license
  1. A lock-free thread-safe concurrent SkipMap implementation based on ARENA skiplist which helps develop MVCC memtable for LSM-Tree.
  2. A lock-free thread-safe concurrent memory map based on-disk SkipMap.

Installation

  • Only use heap backend (suppport no_std)

    [dependencies]
    skl = "0.20"
    
  • Enable memory map backend

    [dependencies]
    skl = { version = "0.20", features = ["memmap"] }
    

Features

  • MVCC and 3D access: Builtin MVCC (multiple versioning concurrency control) and key-value-version access support.
  • Lock-free and Concurrent-Safe: SkipMap provide lock-free operations, ensuring efficient concurrent access without the need for explicit locking mechanisms.
  • Extensible for Key-Value Database Developers: Designed as a low-level crate, SkipMap offer a flexible foundation for key-value database developers. You can easily build your own memtable or durable storage using these structures.
  • Memory Efficiency: These data structures are optimized for minimal memory overhead. They operate around references, avoiding unnecessary allocations and deep copies, which can be crucial for efficient memory usage.
    • Segment tracker: Builtin segment recollection support, a lock-free freelist helps reuse free segments.
    • Prefix compression:
      • Same key will only be stored once.
      • Keys with common prefix will be stored once (longest one must be inserted first).
      • (experimental) Keys are sub-slice of negeibours will be stored once (require CompressionPolicy::High).
    • Discard tracker: Builtin discard tracker, which will track discarded bytes to help end-users decide when to compact or rewrite.
  • Efficient Iteration: Enjoy fast forward and backward iteration through the elements in your SkipMap. Additionally, bounded iterators are supported, allowing you to traverse only a specified range of elements efficiently.
  • Snapshot Support: Create snapshots of your SkipMap, offering a read-only view of the contents at a specific moment in time. Snapshots provide a consistent view of the data, enabling implementations of transactional semantics and other use cases where data consistency is crucial.
  • Memory Management Options:
    • Heap Allocation: Memory allocation is handled by Rust's allocator, ensuring all data resides in RAM.
    • Mmap: Data can be mapped to a disk file by the operating system, making it suitable for durable storage.
    • Mmap Anonymous: Mapped to anonymous memory (virtual memory) by the OS, ideal for large in-memory memtables, optimizing memory utilization.

Examples

Please see examples folder for more details.

Q & A

  • Does the on-disk version SkipMap ensure crash safety or power failure resilience?

    No, If you really need a crash safe, power failure resilience, concurrent-safe and durable ordered write-ahead log implementation, see orderwal project.

    On-disk version SkipMap does not ensure crash safety or power failure resilience. Hence, it is not recommended to directly use the SkipMap as a durable database. It is recommended to use the on-disk version SkipMap as a final frozen file for quick lookup.

  • aol: Yet another generic purpose, append-only write-ahead log implementation.
  • orderwal: A generic-purpose, atomic, ordered, zero-copy, concurrent-safe, pre-allocate style (memory map) write-ahead-log for developing databases.

Tests

  • test:

    cargo test --all-features
    
  • miri (Stack Borrows)

    MIRIFLAGS="-Zmiri-strict-provenance -Zmiri-disable-isolation -Zmiri-symbolic-alignment-check" \
    RUSTFLAGS = "--cfg all_skl_tests" \
    cargo miri test --all-features
    
  • miri (Tree Borrows)

    MIRIFLAGS="-Zmiri-strict-provenance -Zmiri-disable-isolation -Zmiri-symbolic-alignment-check -Zmiri-tree-borrows" \
    RUSTFLAGS = "--cfg all_skl_tests" \
    cargo miri test --all-features
    

Support Platforms

See cross section in GitHub CI file.

Pedigree

This code is inspired and modified based on Cockroachdb's pebble arenaskl and Dgraph's badger skl code:

https://github.com/cockroachdb/pebble/tree/master/internal/arenaskl

https://github.com/dgraph-io/badger/tree/master/skl

The pebble's arenaskl code is based on Andy Kimball's arenaskl code:

https://github.com/andy-kimball/arenaskl

The arenaskl code is based on the skiplist found in Badger, a Go-based KV store:

https://github.com/dgraph-io/badger/tree/master/skl

The skiplist in Badger is itself based on a C++ skiplist built for Facebook's RocksDB:

https://github.com/facebook/rocksdb/tree/master/memtable

License

skl is under the terms of both the MIT license and the Apache License (Version 2.0).

See LICENSE-APACHE, LICENSE-MIT for details.

Copyright (c) 2022 Al Liu.

Dependencies

~3–12MB
~160K SLoC