
immutable-chunkmap

A fast immutable map and set with batch insert and update methods, COW operations, and big O efficient implementations of set and merge operations

29 releases (11 stable)

2.0.4 Feb 4, 2024
2.0.2 Oct 23, 2023
2.0.0 Jun 22, 2023
1.1.1 Jun 20, 2023
0.1.2 Dec 26, 2017

#116 in Data structures


3,652 downloads per month
Used in 17 crates (4 directly)

MIT/Apache

775KB
4K SLoC

immutable chunk map

A cache-efficient immutable map, written using only safe Rust, with lookup performance close to BTreeMap and reasonably good insertion performance. Optional copy-on-write (COW) mutable operations bring modification performance within 2x of BTreeMap in the best case, while still offering the snapshotting and asymptotically efficient set operations of a persistent data structure.
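To make the snapshotting and copy-on-write idea concrete, here is a minimal sketch that uses only the standard library. It is not how this crate is implemented (a real persistent map shares structure between versions rather than copying the whole map), and the `Snapshottable` type and its `cow_insert` method are hypothetical names made up for illustration.

```rust
use std::collections::BTreeMap;
use std::rc::Rc;

/// Hypothetical wrapper (not part of this crate): a map handle whose
/// snapshots are cheap because they share the underlying storage.
#[derive(Clone)]
struct Snapshottable {
    inner: Rc<BTreeMap<u64, String>>,
}

impl Snapshottable {
    fn new() -> Self {
        Snapshottable { inner: Rc::new(BTreeMap::new()) }
    }

    /// Copy-on-write insert: mutates in place when this handle is the only
    /// owner, and clones the underlying map first when a snapshot shares it.
    fn cow_insert(&mut self, k: u64, v: String) -> Option<String> {
        Rc::make_mut(&mut self.inner).insert(k, v)
    }

    fn get(&self, k: &u64) -> Option<&String> {
        self.inner.get(k)
    }
}

fn main() {
    let mut live = Snapshottable::new();
    live.cow_insert(1, "one".to_string());

    // Taking a snapshot only bumps a reference count.
    let snapshot = live.clone();

    // The storage is now shared, so this write copies before mutating.
    live.cow_insert(2, "two".to_string());

    // The snapshot still sees the old contents; the live handle sees the new.
    assert_eq!(snapshot.get(&2), None);
    assert_eq!(live.get(&2).map(String::as_str), Some("two"));
}
```

The point of the sketch is the ownership pattern: writes are as cheap as ordinary mutation while nothing else holds the data, and a snapshot taken at any time stays valid afterwards.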

A graph of lookup performance of various data structures using usize keys. Full test data in the bench/charts directory. Tests performed on an Intel Core i7 8550U under Linux with a locked frequency of 1.8 GHz.

  • OCaml: core map (from the Jane Street core library), an AVL tree with distinct leaf nodes and a relaxed balance constraint.
  • Chunkmap: this library
  • Chunkmap COW: this library using only COW operations
  • BTreeMap: from the Rust standard library
  • HashMap: from the Rust standard library

[Figure: lookup performance using usize keys]

Chunkmap is very close to BTreeMap for random accesses using keys without hashing. Obviously, if you don't need ordered data, use a HashMap.
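As a reminder of what "ordered data" buys you, the sketch below contrasts the two standard library maps. It uses only `std`, and the `keys_in_range` helper is a hypothetical name introduced here.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical helper: collect the keys of `m` that fall in `lo..hi`,
/// in ascending order. This relies on the map being ordered.
fn keys_in_range(m: &BTreeMap<usize, &'static str>, lo: usize, hi: usize) -> Vec<usize> {
    m.range(lo..hi).map(|(k, _)| *k).collect()
}

fn main() {
    let pairs = [(30, "c"), (10, "a"), (20, "b"), (40, "d")];
    let ordered: BTreeMap<usize, &str> = pairs.iter().copied().collect();
    let hashed: HashMap<usize, &str> = pairs.iter().copied().collect();

    // Point lookups work on both kinds of map.
    assert_eq!(ordered.get(&20), Some(&"b"));
    assert_eq!(hashed.get(&20), Some(&"b"));

    // Range queries and sorted iteration are only available on the ordered map.
    assert_eq!(keys_in_range(&ordered, 15, 35), vec![20, 30]);
}
```

If your workload never needs range queries or sorted iteration, the hash map is the simpler choice.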

[Figure: insertion performance]

Insertion performance, while not as good as that of most mutable data structures, is not awful when using COW mode exclusively. When you have many updates to do at once, you can go even faster by using insert_many. In some cases, e.g. building a map from scratch from sorted inputs, this can be faster than even a HashMap. The case below is more typical: adding 10% of a data set to the map.

[Figure: performance when adding 10% of a data set to the map]

A note about the COW bar on this graph: it represents using only mutable COW operations on the map. It is perfectly possible to use an actual insert_many call instead of mutable COW operations if that is faster in your application, which, as you can see, depends on the size of the map.
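One way to see why a batch insert can beat many single inserts is to look at the simplest possible case: merging a sorted batch into already sorted data in a single pass. The sketch below is only an illustration under that assumption. It works on plain vectors, not on this crate's tree of chunks, and `merge_batch` is a hypothetical function, not part of the library.

```rust
/// Hypothetical helper: merge a batch of updates into sorted, duplicate-free
/// `base` in one linear pass. On duplicate keys the batch value wins, and
/// within the batch the last value for a key wins.
fn merge_batch(base: &[(u32, u32)], mut batch: Vec<(u32, u32)>) -> Vec<(u32, u32)> {
    // Stable sort, so repeated keys in the batch keep their original order.
    batch.sort_by_key(|(k, _)| *k);

    let mut out: Vec<(u32, u32)> = Vec::with_capacity(base.len() + batch.len());
    let (mut i, mut j) = (0, 0);
    while i < base.len() || j < batch.len() {
        let take_batch =
            j < batch.len() && (i == base.len() || batch[j].0 <= base[i].0);
        if take_batch {
            // An existing entry with the same key is replaced, so skip it.
            if i < base.len() && base[i].0 == batch[j].0 {
                i += 1;
            }
            // A repeated key inside the batch overwrites the entry just written.
            let repeats_last = out.last().map_or(false, |last| last.0 == batch[j].0);
            if repeats_last {
                let n = out.len();
                out[n - 1] = batch[j];
            } else {
                out.push(batch[j]);
            }
            j += 1;
        } else {
            out.push(base[i]);
            i += 1;
        }
    }
    out
}

fn main() {
    let base = [(1, 10), (3, 30), (5, 50)];
    let merged = merge_batch(&base, vec![(4, 40), (3, 33), (6, 60)]);
    assert_eq!(merged, vec![(1, 10), (3, 33), (4, 40), (5, 50), (6, 60)]);
}
```

Sorting the batch once and then walking both sequences together costs one sort plus one linear pass, whereas inserting the same items one at a time repeats the search for each key. Whether that wins in practice depends on the batch and map sizes, as the graph above indicates.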

License

This project is dual-licensed under the MIT license or the Apache 2 license, at your discretion.

Dependencies

~3MB
~67K SLoC