3 releases
Uses new Rust 2024
0.1.2 | Apr 7, 2025 |
---|---|
0.1.1 | Mar 12, 2025 |
0.1.0 | Dec 16, 2024 |
#262 in Data structures
120 downloads per month
115KB, 2K SLoC
MuleMap<🫏,🗺>
MuleMap is a hybrid between a HashMap and a lookup table. It improves performance for frequently accessed keys in a known range. If an integer key is in the user-specified range, its value is stored directly in the lookup table. Benchmarks (using random selection) start to show speed improvements when about 50% of the key accesses are in the lookup table. Below 50%, performance is almost identical to HashMap. MuleMap tries to match the API of the standard library HashMap, making it a drop-in replacement for HashMap.
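To make the hybrid idea concrete, here is a minimal sketch using only the standard library. It is not MuleMap's actual implementation: the `HybridMap` type, its fields, and the `table_index` helper are hypothetical names chosen for illustration, and the table bounds are passed at runtime rather than as const generics.

```rust
use std::collections::HashMap;

/// Hypothetical illustration of a lookup table backed by a HashMap.
/// This is NOT MuleMap's real layout or API.
struct HybridMap<V> {
    table_min: u32,
    table: Vec<Option<V>>,
    overflow: HashMap<u32, V>,
}

impl<V> HybridMap<V> {
    fn new(table_min: u32, table_size: usize) -> Self {
        Self {
            table_min,
            table: (0..table_size).map(|_| None).collect(),
            overflow: HashMap::new(),
        }
    }

    /// Returns the table slot for `key` if it falls inside the table range.
    fn table_index(&self, key: u32) -> Option<usize> {
        let offset = key.checked_sub(self.table_min)? as usize;
        (offset < self.table.len()).then_some(offset)
    }

    /// In-range keys go to the table; everything else goes to the HashMap.
    fn insert(&mut self, key: u32, value: V) -> Option<V> {
        match self.table_index(key) {
            Some(i) => self.table[i].replace(value),
            None => self.overflow.insert(key, value),
        }
    }

    fn get(&self, key: u32) -> Option<&V> {
        match self.table_index(key) {
            Some(i) => self.table[i].as_ref(),
            None => self.overflow.get(&key),
        }
    }
}

fn main() {
    let mut map = HybridMap::new(0, 128);
    map.insert(5, "table");
    map.insert(1_000_000, "hash");
    assert_eq!(map.get(5), Some(&"table"));
    assert_eq!(map.get(1_000_000), Some(&"hash"));
    assert_eq!(map.get(6), None);
}
```

In-range lookups in this sketch cost one bounds check and one array index, which is where the speedup for frequently accessed keys would come from.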
Example
use mule_map::MuleMap;
use std::num::NonZero;
type Hash = fnv_rs::FnvBuildHasher; // Use whatever hash function you prefer
// Using Entry API
let mut mule_map = MuleMap::<u32, usize, Hash>::new();
assert_eq!(mule_map.get(5), None);
let entry = mule_map.entry(5);
entry.or_insert(10);
assert_eq!(mule_map.get(5), Some(&10));
// Using NonZero and bump
let mut mule_map_non_zero = MuleMap::<u32, NonZero<i32>, Hash>::default();
mule_map_non_zero.bump_non_zero(10);
mule_map_non_zero.bump_non_zero(10);
mule_map_non_zero.bump_non_zero(999_999);
mule_map_non_zero.bump_non_zero(999_999);
assert_eq!(mule_map_non_zero.get(10), NonZero::<i32>::new(2).as_ref());
assert_eq!(mule_map_non_zero.get(999_999), NonZero::<i32>::new(2).as_ref());
Highlights
- All primitive integer types are supported for keys (u8, u16, u32, u64, u128, usize, i8, i16, i32, i64, i128, and isize).
- All corresponding NonZero<T> types are supported for keys.
- NonZero<T> value types take advantage of the niche optimizations (guaranteed by the Rust compiler) by being stored as an Option<NonZero<T>>. This is used by bump_non_zero() to directly cast Option<NonZero<T>> to its underlying integer type (using bytemuck - no unsafe code) and directly increment its value. See benchmarks for details.
- NOTE: Currently the type of a const generic can't depend on another generic type argument, so TABLE_MIN_VALUE can't use the same type as the key. Because of this, I am using i128, but that means we can't represent values near u128::MAX. Hopefully having frequent keys near u128::MAX is extremely rare.
- No unsafe code used in safe APIs.
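The niche optimization mentioned above can be checked with the standard library alone. The sketch below is an assumption-laden illustration, not MuleMap's code: the `bump` function is a hypothetical, safe stand-in that rebuilds the `Option` instead of casting it with bytemuck. `NonZeroU32` is used for compatibility with older toolchains; on recent Rust versions it is a type alias for `NonZero<u32>`.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical safe counterpart to a "bump": treat `None` as zero,
/// add one, and store the result back. MuleMap is described as doing
/// this via a bytemuck cast instead; that path is not shown here.
fn bump(slot: &mut Option<NonZeroU32>) {
    let current = slot.map_or(0, NonZeroU32::get);
    let next = current.checked_add(1).expect("counter overflow");
    *slot = NonZeroU32::new(next);
}

fn main() {
    // `None` occupies the otherwise unused all-zero bit pattern,
    // so the Option adds no extra space.
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());
    // A plain u32 has no spare bit pattern, so its Option is larger.
    assert!(size_of::<Option<u32>>() > size_of::<u32>());

    let mut slot = None;
    bump(&mut slot);
    bump(&mut slot);
    assert_eq!(slot, NonZeroU32::new(2));
}
```

Because an empty slot and a zero count share one representation, a table of such slots is as compact as a table of plain integers.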
Benchmarks
Benchmark Setup
- Takes 2 random uniform distributions of small and large keys, and counts the frequency of all of the keys. MuleMap stores the small keys (near 0) in its lookup table.
- The benchmarks are run with and without shuffling the 2 random distributions of keys.
  - If you expect your lookup table keys to appear in clumps, then the "No Shuffle" graph is more representative of your use case.
  - If you don't expect runs of small keys (random order), then the graph with the keys shuffled is more representative of your use case.
- "Input" is the percentage of small keys using the lookup table vs large keys that use the HashMap.
- Benchmarks were run on a MacBook Pro 15-inch, Mid 2015 - 2.8 GHz Quad-Core Intel Core i7 (Sonoma). Both MuleMap and HashMap are using fnv_rs::FnvBuildHasher. I tried other hash functions like GxHash, but they were slower (likely because my older CPU has slower AES/SSE2 instructions than more modern CPUs).
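For readers who want to reproduce a similar workload, the setup above could be sketched roughly as follows. This is not the crate's benchmark code: the `make_keys` function, the `XorShift64` helper, and the range constants are all hypothetical, and a tiny hand-written PRNG stands in for a real random number crate so the example has no dependencies.

```rust
/// Tiny xorshift PRNG so the sketch needs no external crates (hypothetical helper).
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Roughly uniform value in `0..bound` (modulo bias is ignored here).
    fn below(&mut self, bound: u64) -> u64 {
        self.next_u64() % bound
    }
}

// Assumed ranges; the real benchmark's ranges are not stated in this document.
const SMALL_RANGE: u64 = 128;
const LARGE_START: u64 = 1_000_000;
const LARGE_RANGE: u64 = 1_000_000;

/// Builds `total` keys, `small_percent`% of them small, optionally shuffled.
fn make_keys(total: usize, small_percent: usize, shuffle: bool, rng: &mut XorShift64) -> Vec<u32> {
    let small_count = total * small_percent / 100;
    let mut keys = Vec::with_capacity(total);
    for _ in 0..small_count {
        keys.push(rng.below(SMALL_RANGE) as u32);
    }
    for _ in small_count..total {
        keys.push((LARGE_START + rng.below(LARGE_RANGE)) as u32);
    }
    if shuffle {
        // Fisher-Yates shuffle: interleaves small and large keys.
        for i in (1..keys.len()).rev() {
            let j = rng.below(i as u64 + 1) as usize;
            keys.swap(i, j);
        }
    }
    keys
}

fn main() {
    let mut rng = XorShift64(0x9E37_79B9_7F4A_7C15); // any nonzero seed works
    let clumped = make_keys(1_000, 30, false, &mut rng);
    let shuffled = make_keys(1_000, 30, true, &mut rng);
    let small = |keys: &[u32]| keys.iter().filter(|&&k| (k as u64) < SMALL_RANGE).count();
    println!("small keys: {} clumped, {} shuffled", small(&clumped), small(&shuffled));
}
```

Without shuffling, all small keys arrive in one run followed by all large keys, which corresponds to the "clumps" case; shuffling keeps the same proportion but randomizes the order.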
Types of Maps Compared
- Hand Rolled - Simple loop with an if block that switches between a HashMap and indexing into a lookup table. This is the baseline to show that MuleMap is a zero cost abstraction.
- HashMap - Uses HashMap<u32, usize, fnv_rs::FnvBuildHasher>
- MuleMap - Uses MuleMap<u32, usize, fnv_rs::FnvBuildHasher>
- MuleMap (NonZero) - Uses MuleMap<u32, NonZero<usize>, fnv_rs::FnvBuildHasher>. This takes advantage of the niche optimizations by directly casting Option<NonZero<usize>> to usize using bytemuck (no unsafe code)
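A hand-rolled baseline of the kind described in the first item might look like the sketch below. It is an assumption, not the benchmark's actual source: the `count_keys` name and the `TABLE_SIZE` value are hypothetical, and it uses the standard library's default hasher rather than fnv_rs::FnvBuildHasher.

```rust
use std::collections::HashMap;

// Hypothetical table size; the real benchmark's size is not stated here.
const TABLE_SIZE: usize = 128;

/// Counts key frequencies with an `if` that picks between a fixed-size
/// lookup table (small keys) and a HashMap (everything else).
fn count_keys(keys: &[u32]) -> ([usize; TABLE_SIZE], HashMap<u32, usize>) {
    let mut table = [0usize; TABLE_SIZE];
    let mut map = HashMap::new();
    for &key in keys {
        if (key as usize) < TABLE_SIZE {
            table[key as usize] += 1;
        } else {
            *map.entry(key).or_insert(0) += 1;
        }
    }
    (table, map)
}

fn main() {
    let (table, map) = count_keys(&[3, 3, 7, 500_000, 500_000, 900_000]);
    assert_eq!(table[3], 2);
    assert_eq!(table[7], 1);
    assert_eq!(map.get(&500_000), Some(&2));
    assert_eq!(map.get(&900_000), Some(&1));
}
```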
License
Licensed under either of:
- Apache License, Version 2.0, (LICENSE-APACHE or https://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or https://opensource.org/licenses/MIT)
at your option.
Dependencies
~0.4–1MB
~19K SLoC