19 releases

0.1.1 May 6, 2026
0.1.0 Mar 17, 2026
0.0.15 Mar 3, 2026
0.0.14 Nov 17, 2025
0.0.4-dev Aug 8, 2024

#2495 in Algorithms


Used in kermit

Apache-2.0 OR MIT

93KB
1.5K SLoC

kermit-ds

Relation data structures for the Kermit workspace. Provides two trie-based implementations that store tuples of usize keys:

  • TreeTrie — a pointer-based trie where each node owns its sorted children. Simple and direct; preferable for small relations or pedagogical use.
  • ColumnTrie — a column-oriented (flattened) trie that stores each depth in parallel data/interval arrays. More compact and cache-friendly on large relations.

Both implement Relation and TrieIterable, so they're interchangeable in the join algorithms in kermit-algos.

Surface

  • Core traitsRelation, Projectable, RelationFileExt, HeapSize.
  • MetadataRelationHeader, ModelType, RelationError.
  • Data structuresTreeTrie, ColumnTrie, plus the IndexStructure CLI enum.

File loading

Any Relation automatically gains from_csv and from_parquet via the blanket RelationFileExt impl. CSV files must be usize-valued with a header row; Parquet files must be Int64-valued. Both extract the relation name from the file stem.

use kermit_ds::{RelationFileExt, TreeTrie};

let edges = TreeTrie::from_csv("edges.csv")?;

Extending

To add a new data structure:

  1. Implement Relation + TrieIterable + HeapSize in this crate.
  2. Add a variant to IndexStructure in src/ds/mod.rs.
  3. Wire it into the CLI dispatch (run_ds_bench / run_benchmark in ../kermit/src/main.rs).

See ARCHITECTURE.md for the design rationale behind the trie layouts.

Dependencies

~34MB
~744K SLoC