19 releases
| 0.1.1 | May 6, 2026 |
|---|---|
| 0.1.0 | Mar 17, 2026 |
| 0.0.15 | Mar 3, 2026 |
| 0.0.14 | Nov 17, 2025 |
| 0.0.4-dev |
|
#2495 in Algorithms
Used in kermit
93KB
1.5K
SLoC
kermit-ds
Relation data structures for the Kermit workspace. Provides two trie-based implementations that store tuples of usize keys:
TreeTrie— a pointer-based trie where each node owns its sorted children. Simple and direct; preferable for small relations or pedagogical use.ColumnTrie— a column-oriented (flattened) trie that stores each depth in paralleldata/intervalarrays. More compact and cache-friendly on large relations.
Both implement Relation and TrieIterable, so they're interchangeable in the join algorithms in kermit-algos.
Surface
- Core traits —
Relation,Projectable,RelationFileExt,HeapSize. - Metadata —
RelationHeader,ModelType,RelationError. - Data structures —
TreeTrie,ColumnTrie, plus theIndexStructureCLI enum.
File loading
Any Relation automatically gains from_csv and from_parquet via the blanket RelationFileExt impl. CSV files must be usize-valued with a header row; Parquet files must be Int64-valued. Both extract the relation name from the file stem.
use kermit_ds::{RelationFileExt, TreeTrie};
let edges = TreeTrie::from_csv("edges.csv")?;
Extending
To add a new data structure:
- Implement
Relation + TrieIterable + HeapSizein this crate. - Add a variant to
IndexStructureinsrc/ds/mod.rs. - Wire it into the CLI dispatch (
run_ds_bench/run_benchmarkin../kermit/src/main.rs).
See ARCHITECTURE.md for the design rationale behind the trie layouts.
Dependencies
~34MB
~744K SLoC