#map #set #immutable #persistent #functional

immutable-chunkmap

A cache-efficient immutable map and set with lookup performance equivalent to BTreeMap and BTreeSet, fast batch insert and update methods, and efficient implementations of all set operations.

18 releases

0.5.9 Oct 30, 2020
0.5.8 Nov 3, 2019
0.5.7 Oct 28, 2019
0.5.4 Feb 13, 2019
0.1.2 Dec 26, 2017

#134 in Data structures

2,913 downloads per month
Used in 4 crates (via netidx)

MIT/Apache

1MB
3K SLoC

Rust 2.5K SLoC // 0.0% comments JavaScript 328 SLoC // 0.1% comments OCaml 80 SLoC Shell 1 SLoC

immutable chunk map

A cache-efficient immutable map with lookup performance equivalent to BTreeMap, written using only safe Rust (and std). Insertion performance is reasonably good and in line with other persistent map libraries.

The graph below shows the lookup performance of various data structures using i64 keys. Full test data is in the results.gnumeric spreadsheet. Tests were performed on an Intel Core i7 6700HQ under Linux, with the CPU frequency governor set to 'performance'.

  • OCaml: core map (from the Jane Street core library), an AVL tree with distinct leaf nodes and a relaxed balance constraint.
  • Std Avl: a classical AVL tree map from the 'immutable-map' cargo package
  • Chunkmap: this library
  • BTreeMap: from the Rust standard library
  • Binary Search: binary search in a sorted array of keys
  • HashMap: from the Rust standard library
  • Array: random access to a Vec

[Graph: lookup performance of the data structures listed above]

Chunkmap is very close to both BTreeMap and binary search, which is probably optimal for random access by key without hashing. If you don't need ordered data, use a HashMap or even a Vec.
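To make the comparison concrete, here is a small sketch of the same lookup done three ways using only std: binary search over a sorted Vec, BTreeMap, and HashMap. The helper `sorted_lookup` and the sample data are made up for illustration and are not part of this library. The benchmark code itself is not shown here.

```rust
use std::collections::{BTreeMap, HashMap};

/// Look up `key` in a slice of (key, value) pairs sorted by key.
/// (Hypothetical helper, for illustration only.)
fn sorted_lookup(pairs: &[(i64, i64)], key: i64) -> Option<i64> {
    pairs
        .binary_search_by_key(&key, |&(k, _)| k)
        .ok()
        .map(|i| pairs[i].1)
}

fn main() {
    // Even keys 0, 2, ..., 18, each mapped to its square. Already sorted by key.
    let pairs: Vec<(i64, i64)> = (0..10).map(|i| (2 * i, 4 * i * i)).collect();
    let btree: BTreeMap<i64, i64> = pairs.iter().copied().collect();
    let hash: HashMap<i64, i64> = pairs.iter().copied().collect();

    // All three structures agree on hits (4) and misses (5).
    for key in [4i64, 5] {
        let from_slice = sorted_lookup(&pairs, key);
        assert_eq!(from_slice, btree.get(&key).copied());
        assert_eq!(from_slice, hash.get(&key).copied());
        println!("{}: {:?}", key, from_slice);
    }

    // Only the ordered structures can answer range queries directly.
    let in_range: Vec<i64> = btree.range(3..9).map(|(k, _)| *k).collect();
    println!("keys in 3..9: {:?}", in_range);
}
```

The range query at the end is the kind of operation that justifies paying for an ordered structure instead of a HashMap.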

Single insertion is slightly more expensive than in a classical AVL tree; however, the multi-insertion modes can significantly reduce this overhead. That said, if you care a lot about insertion performance you really shouldn't use a persistent data structure, as you pay a heavy price compared to a mutable one.
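To see where that price comes from, here is a minimal sketch of a persistent insert using path copying in a plain, unbalanced binary search tree. This is not chunkmap's actual layout. The `Node`, `Tree`, `insert`, and `contains` names are invented for this example. The point is only that every insert allocates new nodes along the search path while the old version stays valid.

```rust
use std::rc::Rc;

/// A node of a minimal persistent (unbalanced) binary search tree.
/// Illustrative only; not how chunkmap stores its data.
struct Node {
    key: i64,
    left: Tree,
    right: Tree,
}

type Tree = Option<Rc<Node>>;

/// Return a new tree containing `key`. Only nodes on the search path are
/// copied; every other subtree is shared with the original through `Rc`.
fn insert(tree: &Tree, key: i64) -> Tree {
    match tree {
        None => Some(Rc::new(Node { key, left: None, right: None })),
        Some(n) if key < n.key => Some(Rc::new(Node {
            key: n.key,
            left: insert(&n.left, key),
            right: n.right.clone(),
        })),
        Some(n) if key > n.key => Some(Rc::new(Node {
            key: n.key,
            left: n.left.clone(),
            right: insert(&n.right, key),
        })),
        // Key already present: share the existing node unchanged.
        Some(n) => Some(Rc::clone(n)),
    }
}

fn contains(tree: &Tree, key: i64) -> bool {
    match tree {
        None => false,
        Some(n) if key < n.key => contains(&n.left, key),
        Some(n) if key > n.key => contains(&n.right, key),
        Some(_) => true,
    }
}

fn main() {
    let v1: Tree = [5i64, 2, 8].iter().fold(None, |t, &k| insert(&t, k));
    let v2 = insert(&v1, 9);

    // The old version is untouched; the new one sees the new key.
    assert!(!contains(&v1, 9));
    assert!(contains(&v2, 9));

    // The left subtree was not on the search path, so both versions share it.
    let (r1, r2) = (v1.as_ref().unwrap(), v2.as_ref().unwrap());
    assert!(Rc::ptr_eq(r1.left.as_ref().unwrap(), r2.left.as_ref().unwrap()));
}
```

A mutable map would update in place with no copying, which is why single inserts into any persistent structure tend to cost more.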

However, in one very specific case, inserting into a chunkmap can be faster than even a HashMap. If you have a lot of data to add at once (at least 10% of what's already in the map), you can sort it first and use insert_many, which is much faster than add. The graph below shows building maps from scratch using sorted data, which is the best possible case.

[Graph: building maps from scratch using sorted data]
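The intuition for why sorting a batch first helps can be sketched with a sorted Vec standing in for the map. This is an analogy, not chunkmap's insert_many implementation, and the `insert_one` and `merge_sorted` functions are hypothetical names for this example. Inserting keys one at a time repeats a search and a shift for every key, while a sorted batch can be merged in a single linear pass.

```rust
/// Insert one key into a sorted, deduplicated Vec.
/// Each call pays for a binary search plus an O(n) shift.
fn insert_one(sorted: &mut Vec<i64>, key: i64) {
    if let Err(pos) = sorted.binary_search(&key) {
        sorted.insert(pos, key);
    }
}

/// Merge a sorted batch into a sorted slice in one linear pass.
/// Returns a new Vec and leaves both inputs untouched.
fn merge_sorted(existing: &[i64], batch: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(existing.len() + batch.len());
    let (mut i, mut j) = (0, 0);
    while i < existing.len() && j < batch.len() {
        let next = if existing[i] <= batch[j] {
            i += 1;
            existing[i - 1]
        } else {
            j += 1;
            batch[j - 1]
        };
        // Skip keys that are already in the output.
        if out.last() != Some(&next) {
            out.push(next);
        }
    }
    // At most one of these two tails is non-empty.
    for &k in existing[i..].iter().chain(batch[j..].iter()) {
        if out.last() != Some(&k) {
            out.push(k);
        }
    }
    out
}

fn main() {
    let existing = vec![1i64, 3, 5, 7];
    let mut batch = vec![6i64, 2, 3];

    // One key at a time.
    let mut one_by_one = existing.clone();
    for &k in &batch {
        insert_one(&mut one_by_one, k);
    }

    // Sort the batch once, then merge in a single pass.
    batch.sort();
    let merged = merge_sorted(&existing, &batch);

    assert_eq!(merged, one_by_one);
    println!("{:?}", merged);
}
```

Both paths produce the same result; the batch path just does the work once instead of once per key.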
