#hash-map #simd-accelerated #simd #small-vec #map #heap-allocation

small-map

An inline SIMD accelerated hashmap designed for small amount of data

3 releases

0.1.3 Jan 17, 2024
0.1.2 Jan 17, 2024
0.1.1 Nov 2, 2023
0.1.0 Nov 2, 2023

#418 in Data structures

Download history 3678/week @ 2024-08-12 2708/week @ 2024-08-19 3429/week @ 2024-08-26 3804/week @ 2024-09-02 3311/week @ 2024-09-09 2449/week @ 2024-09-16 3466/week @ 2024-09-23 3297/week @ 2024-09-30 3177/week @ 2024-10-07 3250/week @ 2024-10-14 2135/week @ 2024-10-21 2261/week @ 2024-10-28 1939/week @ 2024-11-04 4496/week @ 2024-11-11 3475/week @ 2024-11-18 3009/week @ 2024-11-25

13,010 downloads per month
Used in 9 crates (4 directly)

MIT/Apache

115KB
1.5K SLoC

Small-Map

Crates.io

An inline SIMD accelerated hashmap designed for small amount of data.

Usage

use small_map::SmallMap;
// Don't worry about the 16 here.
// When the size grows, it will automatically switch to heap impl.
type MySmallMap<K, V> = SmallMap<16, K, V>;

let mut map = MySmallMap::new();
map.insert(1_u8, 2_u8);
assert_eq!(*map.get(&1).unwrap(), 2);

Usually you can use this map for short lifetime and small key/values, for example, http or RPC headers.

You can use it like other normal hashmaps without worrying about its size.

Performance

The performance of SmallMap depends on the capacity, Key/Value size and operation scenario.

It is recommended to set the size to 16(or 32) because with SSE2 it can search 16 items within a single instruction. It is only recommended to use SmallMap for small Key/Values, such as numbers and strings.

benchmark result

Here is a benchmark result with size 16 and u8 key/value. This benchmark was conducted on an Intel Xeon.

[!NOTE] Comparing to HashBrown, it saves 25%~43% CPU in test scenario; and comparing to std HashMap, it saves 25%~54% CPU in test scenario.

How it Works

Like SmallVec, for HashMap with a small amount of data, inlining the data can avoid heap allocation overhead and improve performance.

SmallMap inlines a fixed-size sequential structure, and refers to the design of swisstable to additionally store the hash of the data in Group units, and uses SIMD to accelerate the search.

When the data capacity exceeds the inline capacity, the data will be transferred to HashMap implemented by hashbrown.

Credit

Hashbrown is heavily referenced to and copied by this project, and is a very elegant and efficient implementation.

Dependencies

~1–1.3MB
~19K SLoC