#finite-state #h3 #set #map #fst

h3o-ice

Frozen{Map,Set} for H3 cells, based on finite state transducers

5 releases

0.1.4 Nov 15, 2024
0.1.3 Mar 25, 2024
0.1.2 Mar 20, 2024
0.1.1 Jan 15, 2024
0.1.0 Sep 11, 2023

#292 in Data structures

Download history 77/week @ 2024-08-17 93/week @ 2024-08-24 140/week @ 2024-08-31 191/week @ 2024-09-07 198/week @ 2024-09-14 235/week @ 2024-09-21 309/week @ 2024-09-28 319/week @ 2024-10-05 299/week @ 2024-10-12 273/week @ 2024-10-19 191/week @ 2024-10-26 303/week @ 2024-11-02 291/week @ 2024-11-09 403/week @ 2024-11-16 266/week @ 2024-11-23 257/week @ 2024-11-30

1,305 downloads per month

BSD-3-Clause

67KB
667 lines

h3o-ice — Frozen (aka read-only) sets and maps for H3 cell index

Crates.io Docs.rs CI Status Coverage License

Those data structures are built on top of finite state transducers, which allows them to be extremely compact while offering fast lookup.

This is especially for read-only indexes (both in-memory or on-disk).

Installation

Cargo

  • Install the rust toolchain in order to have cargo installed by following this guide.
  • run cargo install h3o-ice

Usage

use h3o::{LatLng, Resolution};
use h3o_ice::FrozenSet;

let coord = LatLng::new(48.872280706 2.332697839).expect("valid coord");
let set = FrozenSet::try_from_iter(
    coord.to_cell(Resolution::Nine)
        .children(Resolution::Ten)
)
.expect("failed to create set");

set.contains(CellIndex::try_from(0x8a1fb4666417fff).expect("valid cell"));

Comparison with other data structures

Frozen{Set,Map} BTree{Set,Map} Hash{Set,Map}
Memory overhead ✅ (negative [^1]) ❌ (50%[^2]) ❌ (12-125%[^2])
Search complexity O(key size) O(log #items) O(1)
Range query
Prefix lookup
Insertion/Deletion
Direct lookup 163 ns 55 ns 19 ns
Compacted lookup 67 ns 401 ns 125 ns

About the lookup time:

  • input dataset is France coverage at resolution 10:
    • raw dataset: 44 250 550 cells (333.60M).
    • compacted (i.e. replacing clusters of cells their ancestors): 127 264 cells (0.97M)
  • Lookup of a resolution 10 cell:
    • single lookup in the raw dataset.
    • prefix lookup (or repetitive lookup if not supported) in the compacted dataset.

You can run the provided benchmarks to get measures relevant to your machine.

To sum up:

  • if you can afford the memory usage and don't care about ordering (e.g. range query) go with a HashSet for maximum speed.
  • if you need ordering but don't need to work on compacted set, go with a BTreeSet.
  • if you have a large read-only dataset, wants to optimize memory usage and be able to query compacted data directly then FrozenSet is your friend.

Frozen{Map,Set} are not general purpose data structures. They come with a bunch of extra constraints (read-only, must be build from pre-sorted input, ...) but they really shine in their niche and provides a bunch a goodies (e.g. key compressions, can be mmap'd, prefix lookup, ...).

License

BSD 3-Clause

[^1]: Frozen{Set,Map} compresses both common prefixes and suffixes of keys, resulting in a smaller size than the input data set (e.g. the 333MB test dataset fits into a 176KB FrozenSet) [^2]: Measuring the overhead of HashMaps in Rust

Dependencies

~6.5MB
~45K SLoC