12 releases (stable)
new 2.4.1 | May 24, 2025 |
---|---|
2.3.0 | May 24, 2025 |
1.0.0 | Mar 6, 2025 |
0.2.1 | Mar 5, 2025 |
0.1.0 | Mar 5, 2025 |
#660 in Data structures
64 downloads per month
83KB
1.5K
SLoC
firims
FIxed Range Integer Maps and Sets
General
This crate implements a map ([VecMap]) and set ([BitSet]) for unsigned integer keys of a known range. For example, these structures are useful whenever you need a map with integer keys, and you know those keys can only ever be numbers ranging between 10 and 200.
While this is a major limitation, because you can no longer use arbitrary keys, constraints like this you unlock a lot of performance gains, because you don't need to hash numbers. Instead, in case of the bitset, you just track booleans (or in this case bits) telling you whether the number is in the set or not.
The implementation for this type of data structure is fairly simple, so you should be able to easily adapt this implementation to suit your own needs; just remember to honor the MIT license.
Implementation details
To ensure some of the limitations on the set members and keys, and to have some useful generic functionality, this crate uses [num_traits] as a dependency. With that, the keys / set members are limited to types implementing the [Integer] type.
The two data structures, the [VecMap] and the [BitSet], have very simple
implementations. The former is just a Vec<V>
, where V
is your value type,
and it maps keys to indexes in the Vec
through a simple addition. As
simple as that may be, the benefit of the [VecMap] is that it implements most of
the of the same interface of the std::collections::HashMap, making
this an almost drop-in replacement for most use cases (given that the
[VecMap] constraints fit your use case).
The [BitSet] is, well, a bit set. Each key is mapped to a bit in an element in a
Vec<u64>
, and a value being present just means that the bit is set to 1.
Again, very simple implementation, pretty easy to replicate, but it implements
the majority of the std::collections::HashSet interface in order to make the
bitset a drop-in replacement.
Interface differences to HashMap and HashSet
[VecMap] and [BitSet] are almost drop-in replacements for std::collections::HashMap and std::collections::HashSet... almost. Most importantly, some iterators return slightly different types than you might otherwise expect. The most important examples are any iterators in the [VecMap] that you would expect to return a reference to the key values. Due to the underlying implementation of the [VecMap], no such references actually exist, because the keys are actually calculated on the fly, and thus those iterators return copies of those keys.
Feature flags
This crate has no default features, and one optional feature:
serde
: Add derives forserde::Serialize
and::Deserialize
for [VecMap] and [BitSet]
Benchmarks
This crate contains a benchmark file benches/bench.rs
, which benchmarks the
following operations:
- construction of new sets
- insertions
.contains()
checks
The benchmarks compares the [BitSet] with an IntSet
from the integer_hasher
crate, and the [VecMap] with an IntMap
. The benchmarks have been run on two
different machines: An M2 macbook pro, as well as an x86_64-linux
machine
using a Ryzen 7 9800x3D CPU.
In general, the data structures in this crate beat the their competitors in all
those benchmarks, but note note that, for example, the IntSet
is also way
more flexible then the these data structures, allowing you to insert any sort
of integer at any time. The range constraint on the values / keys is what
unlocks those performance gains, but obviously it does not fit any use case.
Also note that the benchmark results are very different between the M2 and the
x86 architecture, with the latter producing a way larger gap between, for
example, the IntSet
and the [BitSet]s.
M2
IntSet
vs [BitSet]
bench \ input size | 100 | 1,000 | 10,000 | 100,000 | 1,000,000 |
---|---|---|---|---|---|
construction | ... | ... | ... | ... | ... |
IntSet |
281 ns | 2,576 ns | 25,711 ns | 254,961 ns | 2,585,656 ns |
[BitSet] | 189 ns | 2,015 ns | 19,890 ns | 198,217 ns | 1,987,995 ns |
insertion | ... | ... | ... | ... | ... |
IntSet |
281 ns | 2,576 ns | 25,711 ns | 254,961 ns | 2,585,656 ns |
[BitSet] | 189 ns | 2,015 ns | 19,890 ns | 198,217 ns | 1,987,995 ns |
contains | ... | ... | ... | ... | ... |
IntSet |
62 ns | 597 ns | 6,207 ns | 70,974 ns | 913,086 ns |
[BitSet] | 22 ns | 229 ns | 2,279 ns | 23,698 ns | 247,242 ns |
IntMap
vs [VecMap]
bench \ input size | 100 | 1,000 | 10,000 | 100,000 | 1,000,000 |
---|---|---|---|---|---|
insertion | ... | ... | ... | ... | ... |
IntMap |
178 ns | 1,790 ns | 18,243 ns | 186,263 ns | 3,044,192 ns |
[VecMap] | 41 ns | 417 ns | 4,665 ns | 67,046 ns | 837,838 ns |
contains | ... | ... | ... | ... | ... |
IntMap |
126 ns | 1,121 ns | 16,951 ns | 667,448 ns | 9,376,399 ns |
[VecMap] | 30 ns | 306 ns | 3,010 ns | 35,056 ns | 458,998 ns |
Ryzen7 9800X3D
IntSet
vs [BitSet]
bench \ input size | 100 | 1,000 | 10,000 | 100,000 | 1,000,000 |
---|---|---|---|---|---|
construction | ... | ... | ... | ... | ... |
IntSet |
263 ns | 2,541 ns | 23,419 ns | 251,295 ns | 2,335,786 ns |
[BitSet] | 75 ns | 681 ns | 6,622 ns | 66,238 ns | 721,745 ns |
insertion | ... | ... | ... | ... | ... |
IntSet |
83 ns | 804 ns | 8,814 ns | 106,878 ns | 1,629,813 ns |
[BitSet] | 58 ns | 339 ns | 2,990 ns | 26,420 ns | 348,972 ns |
contains | ... | ... | ... | ... | ... |
IntSet |
52 ns | 489 ns | 4,973 ns | 67,423 ns | 1,096,985 ns |
[BitSet] | 19 ns | 196 ns | 1,922 ns | 19,189 ns | 194,224 ns |
IntMap
vs [VecMap]
bench \ input size | 100 | 1,000 | 10,000 | 100,000 | 1,000,000 |
---|---|---|---|---|---|
insertion | ... | ... | ... | ... | ... |
IntMap |
80 ns | 761 ns | 8,198 ns | 121,147 ns | 1,778,716 ns |
[VecMap] | 30 ns | 297 ns | 3,231 ns | 34,430 ns | 484,658 ns |
contains | ... | ... | ... | ... | ... |
IntMap |
152 ns | 1,503 ns | 16,344 ns | 640,985 ns | 8,398,847 ns |
[VecMap] | 16 ns | 171 ns | 2,012 ns | 27,972 ns | 357,651 ns |
License
MIT License
Copyright (c) 2025 Tommy Breslein
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Dependencies
~89–420KB