#map #concurrency #lockless #data-structures #key-value

nightly bin+lib cmap

Concurrent multi-writer hash-map using trie

3 releases (breaking)

0.3.0 Sep 18, 2021
0.2.0 Mar 10, 2021
0.1.0 Mar 8, 2021

#1887 in Database interfaces

MIT license

345KB
2.5K SLoC


Concurrent Hash map

This package implements a concurrent hash map.

Quoting from Wikipedia:

A data structure is partially persistent if all versions can be accessed but only the newest version can be modified. The data structure is fully persistent if every version can be both accessed and modified. If there is also a meld or merge operation that can create a new version from two previous versions, the data structure is called confluently persistent. Structures that are not persistent are called ephemeral data structures.

This hash map implementation cannot be strictly classified under any of the above definitions. It supports concurrent writes, using atomic Load, Store and CAS (compare-and-swap) operations under the hood, and does not provide point-in-time snapshots for transactional or iterative operations.
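To make the Load / Store / CAS vocabulary concrete, here is a minimal sketch of a CAS retry loop using only Rust's standard atomics. It is not taken from cmap's source; the function names `cas_add` and `concurrent_total` are hypothetical, and a plain counter stands in for the trie nodes that a real concurrent map would update.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Add `delta` to `cell` with an explicit load + compare-and-swap retry loop.
/// (Hypothetical helper for illustration; not part of cmap.)
fn cas_add(cell: &AtomicUsize, delta: usize) -> usize {
    // Load: read the value we believe is current.
    let mut current = cell.load(Ordering::Acquire);
    loop {
        let next = current + delta;
        // CAS: publish `next` only if nobody changed the cell since our read.
        match cell.compare_exchange_weak(current, next, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => return next,
            // Another writer won the race (or the weak CAS failed spuriously):
            // retry with the value that was actually observed.
            Err(observed) => current = observed,
        }
    }
}

/// Spawn `n_threads` writers that each perform `n_ops` increments,
/// then return the final value of the shared cell.
fn concurrent_total(n_threads: usize, n_ops: usize) -> usize {
    let cell = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let cell = Arc::clone(&cell);
            thread::spawn(move || {
                for _ in 0..n_ops {
                    cas_add(&cell, 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    cell.load(Ordering::Acquire)
}
```

No writer ever blocks another: a losing writer simply retries, which is why no update is lost even without a lock.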

If point-in-time snapshots are needed, refer to the ppom package, which implements an ordered map with multi-reader concurrency and serialised writes.

  • Each entry in a Map instance corresponds to a {Key, Value} pair.
  • Parametrised over key-type and value-type.
  • Parametrised over hash-builder for application defined hashing.
  • API - set(), get(), remove() using key.
  • Uses ownership model and borrow semantics to ensure safety.
  • Implements a custom epoch-based garbage collector to handle write concurrency and reduce memory usage.
  • No durability guarantee.
  • Thread safe for both concurrent writes and concurrent reads.
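The shape described by the list above (a map generic over key type, value type and hash builder, with set(), get() and remove()) can be sketched with the standard library alone. Everything named here is an assumption made for illustration: `StandInMap`, `IdentityU32Hasher` and the method signatures are hypothetical and are not cmap's API, and the stand-in uses an `RwLock` around `HashMap` rather than cmap's lock-free trie. Refer to the rustdoc for the real types.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical application-defined hasher for u32 keys: the key is its own hash.
#[derive(Default)]
struct IdentityU32Hasher(u64);

impl Hasher for IdentityU32Hasher {
    fn finish(&self) -> u64 {
        self.0
    }
    // Fallback for non-u32 input: fold the bytes into the state.
    fn write(&mut self, bytes: &[u8]) {
        for b in bytes {
            self.0 = (self.0 << 8) | u64::from(*b);
        }
    }
    fn write_u32(&mut self, n: u32) {
        self.0 = u64::from(n);
    }
}

/// Lock-based stand-in, parametrised over key, value and hash builder.
struct StandInMap<K, V, S> {
    inner: RwLock<HashMap<K, V, S>>,
}

impl<K: Hash + Eq, V: Clone, S: BuildHasher> StandInMap<K, V, S> {
    fn with_hasher(hash_builder: S) -> Self {
        StandInMap { inner: RwLock::new(HashMap::with_hasher(hash_builder)) }
    }
    /// Insert or overwrite; returns the previous value, if any.
    fn set(&self, key: K, value: V) -> Option<V> {
        self.inner.write().unwrap().insert(key, value)
    }
    /// Returns a clone of the value, if the key is present.
    fn get(&self, key: &K) -> Option<V> {
        self.inner.read().unwrap().get(key).cloned()
    }
    /// Removes the entry; returns the removed value, if any.
    fn remove(&self, key: &K) -> Option<V> {
        self.inner.write().unwrap().remove(key)
    }
}

type U32Map = StandInMap<u32, u64, BuildHasherDefault<IdentityU32Hasher>>;

/// Four writer threads fill disjoint key ranges, then one key is read and removed.
fn demo() -> (Option<u64>, Option<u64>, Option<u64>) {
    let map: Arc<U32Map> = Arc::new(StandInMap::with_hasher(BuildHasherDefault::default()));
    let handles: Vec<_> = (0..4u32)
        .map(|t| {
            let map = Arc::clone(&map);
            thread::spawn(move || {
                for i in 0..100u32 {
                    let key = t * 100 + i;
                    map.set(key, u64::from(key) * 2);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let found = map.get(&250);
    let removed = map.remove(&250);
    let after = map.get(&250);
    (found, removed, after)
}
```

Because the map is shared through `Arc` and every method takes `&self`, the borrow checker accepts concurrent use from several threads without any `unsafe` in caller code, which is the property the ownership bullet refers to.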

Refer to rustdoc for details.

Performance

Machine: Gen-1 Threadripper (16 cores / 32 threads) with 64GB RAM. All measurements use 32-bit keys and 64-bit values, with the U32Hasher from cmap.

With 16 concurrent threads on a data set of 10 million entries, cmap can perform ~12 million get operations.

  • Wikipedia link on hamt.
  • Research paper on ctrie.
  • Default hashing algorithm is city-hash.

Contribution

  • Simple workflow. Fork - Modify - Pull request.
  • Before creating a PR,
    • Run make build to confirm that all build variants pass with 0 warnings and 0 errors.
    • Run check.sh and confirm 0 warnings, 0 errors and all test cases passing.
    • Run perf.sh and confirm 0 warnings, 0 errors and all test cases passing.
    • Install and run cargo spellcheck to remove common spelling mistakes.
  • Developer certificate of origin is preferred.

Dependencies

~1.7–2.9MB
~56K SLoC