5 releases
0.1.4 | Nov 15, 2024 |
---|---|
0.1.3 | Mar 25, 2024 |
0.1.2 | Mar 20, 2024 |
0.1.1 | Jan 15, 2024 |
0.1.0 | Sep 11, 2023 |
#292 in Data structures
1,305 downloads per month
67KB
667 lines
h3o-ice — Frozen (aka read-only) sets and maps for H3 cell index
Those data structures are built on top of finite state transducers, which allows them to be extremely compact while offering fast lookup.
This is especially for read-only indexes (both in-memory or on-disk).
Installation
Cargo
- Install the rust toolchain in order to have cargo installed by following this guide.
- run
cargo install h3o-ice
Usage
use h3o::{LatLng, Resolution};
use h3o_ice::FrozenSet;
let coord = LatLng::new(48.872280706 2.332697839).expect("valid coord");
let set = FrozenSet::try_from_iter(
coord.to_cell(Resolution::Nine)
.children(Resolution::Ten)
)
.expect("failed to create set");
set.contains(CellIndex::try_from(0x8a1fb4666417fff).expect("valid cell"));
Comparison with other data structures
Frozen{Set,Map} |
BTree{Set,Map} |
Hash{Set,Map} |
|
---|---|---|---|
Memory overhead | ✅ (negative [^1]) | ❌ (50%[^2]) | ❌ (12-125%[^2]) |
Search complexity | O(key size) | O(log #items) | O(1) |
Range query | ✅ | ✅ | ❌ |
Prefix lookup | ✅ | ❌ | ❌ |
Insertion/Deletion | ❌ | ✅ | ✅ |
Direct lookup | 163 ns | 55 ns | 19 ns |
Compacted lookup | 67 ns | 401 ns | 125 ns |
About the lookup time:
- input dataset is France coverage at resolution 10:
- raw dataset: 44 250 550 cells (333.60M).
- compacted (i.e. replacing clusters of cells their ancestors): 127 264 cells (0.97M)
- Lookup of a resolution 10 cell:
- single lookup in the raw dataset.
- prefix lookup (or repetitive lookup if not supported) in the compacted dataset.
You can run the provided benchmarks to get measures relevant to your machine.
To sum up:
- if you can afford the memory usage and don't care about ordering (e.g. range
query) go with a
HashSet
for maximum speed. - if you need ordering but don't need to work on compacted set, go with a
BTreeSet
. - if you have a large read-only dataset, wants to optimize memory usage and be
able to query compacted data directly then
FrozenSet
is your friend.
Frozen{Map,Set}
are not general purpose data structures. They come with a
bunch of extra constraints (read-only, must be build from pre-sorted input, ...)
but they really shine in their niche and provides a bunch a goodies (e.g. key
compressions, can be mmap'd, prefix lookup, ...).
License
[^1]: Frozen{Set,Map}
compresses both common prefixes and suffixes of keys, resulting in a smaller size than the input data set (e.g. the 333MB test dataset fits into a 176KB FrozenSet
)
[^2]: Measuring the overhead of HashMaps in Rust
Dependencies
~6.5MB
~45K SLoC