#hash-table #disk #memory #mapped #structure #decoding

odht

A Rust crate for hash tables that can be mapped from disk into memory without the need for up-front decoding

5 unstable releases

0.3.1 Oct 29, 2021
0.3.0 Sep 20, 2021
0.2.1 Sep 17, 2021
0.2.0 Sep 10, 2021
0.1.0 Jul 20, 2021

#289 in Data structures

Download history 8312/week @ 2024-01-02 9387/week @ 2024-01-09 9531/week @ 2024-01-16 9776/week @ 2024-01-23 13648/week @ 2024-01-30 16549/week @ 2024-02-06 16671/week @ 2024-02-13 15851/week @ 2024-02-20 14590/week @ 2024-02-27 17250/week @ 2024-03-05 17305/week @ 2024-03-12 16615/week @ 2024-03-19 15834/week @ 2024-03-26 15126/week @ 2024-04-02 15028/week @ 2024-04-09 13802/week @ 2024-04-16

62,272 downloads per month
Used in 2 crates

MIT/Apache

76KB
1.5K SLoC

CI Status

odht

A Rust crate for hash tables that can be mapped from disk into memory without the need for up-front decoding. The goal of the implementation is to provide a data structure that

  • can be used exactly in the format it is stored on disk,
  • provides roughly the same performance as a HashMap from Rust's standard library,
  • has a completely deterministic binary representation,
  • is platform and endianess independent, so that data serialized on one system can be used on any other system, and
  • is independent of alignment requirements so that
    • its use is not restricted to certain classes of CPUs, and
    • the data structure can be mapped to arbitrary memory addresses.

This crate is developed and maintained by the Rust compiler team for internal use within rustc. This crate will have regular breaking changes and provides no stability guarantees.


lib.rs:

This crate implements a hash table that can be used as is in its binary, on-disk format. The goal is to provide a high performance data structure that can be used without any significant up-front decoding. The implementation makes no assumptions about alignment or endianess of the underlying data, so a table encoded on one platform can be used on any other platform and the binary data can be mapped into memory at arbitrary addresses.

Usage

In order to use the hash table one needs to implement the Config trait. This trait defines how the table is encoded and what hash function is used. With a Config in place the HashTableOwned type can be used to build and serialize a hash table. The HashTable type can then be used to create an almost zero-cost view of the serialized hash table.


use odht::{HashTable, HashTableOwned, Config, FxHashFn};

struct MyConfig;

impl Config for MyConfig {

    type Key = u64;
    type Value = u32;

    type EncodedKey = [u8; 8];
    type EncodedValue = [u8; 4];

    type H = FxHashFn;

    #[inline] fn encode_key(k: &Self::Key) -> Self::EncodedKey { k.to_le_bytes() }
    #[inline] fn encode_value(v: &Self::Value) -> Self::EncodedValue { v.to_le_bytes() }
    #[inline] fn decode_key(k: &Self::EncodedKey) -> Self::Key { u64::from_le_bytes(*k) }
    #[inline] fn decode_value(v: &Self::EncodedValue) -> Self::Value { u32::from_le_bytes(*v)}
}

fn main() {
    let mut builder = HashTableOwned::<MyConfig>::with_capacity(3, 95);

    builder.insert(&1, &2);
    builder.insert(&3, &4);
    builder.insert(&5, &6);

    let serialized = builder.raw_bytes().to_owned();

    let table = HashTable::<MyConfig, &[u8]>::from_raw_bytes(
        &serialized[..]
    ).unwrap();

    assert_eq!(table.get(&1), Some(2));
    assert_eq!(table.get(&3), Some(4));
    assert_eq!(table.get(&5), Some(6));
}

Dependencies