4 releases
0.1.2 | Oct 2, 2024 |
---|---|
0.1.1 | Feb 25, 2024 |
0.1.0 | Feb 25, 2024 |
0.0.1 | Dec 26, 2023 |
#70 in Compression
255KB
3.5K
SLoC
cseq_benchmark
is the program (by Piotr Beling) for benchmarking compact sequences and bitmaps.
It can test the listed algorithms contained in the following crates:
- cseq: Elias-Fano (experimental);
- bitm: rank and select queries on bit vectors;
- sucds: rank and select queries on bit vectors;
- succinct: rank and select queries on bit vectors;
- sux: select queries on bit vectors;
- vers (only if compiled with
vers-vecs
feature): rank and select queries on bit vectors.
Please run the program with the --help
switch to see the available options.
Below you can find instruction for installing cseq_benchmark
and
reproducing experiments performed with it,
which can be found in published or under review papers.
Note that these instructions have been tested under GNU/Linux and may require some modifications for other systems.
Installation
cseq_benchmark
can be compiled and installed from sources. To do this, a Rust compiler is needed.
The easiest way to obtain the compiler along with other necessary tools (like cargo
) is
to use rustup.
Please follow the instructions at https://www.rust-lang.org/tools/install.
Once Rust is installed, just execute the following to install cseq_benchmark
with native optimizations:
RUSTFLAGS="-C target-cpu=native" cargo install --features=vers-vecs cseq_benchmark
The --features=vers-vecs
flag enables compilation of the non-portable vers crate.
It should be omitted in case of compilation problems.
Reproducing experiments from the papers
Rust libraries and programs focused on succinct data structures
(Piotr Beling, BSuccinct: Rust libraries and programs focused on succinct data structures, SoftwareX, Volume 26, 2024, 101681, ISSN 2352-7110, https://doi.org/10.1016/j.softx.2024.101681)
Results for structures that support rank and select queries on bit vectors, included in libraries written in Rust (we used rustc 1.75.0 to compile), can be obtained by running:
cseq_benchmark -f -t 60 -c 20 -q 10000000 -u 1000000000 -n 500000000 bv
cseq_benchmark -f -t 60 -c 20 -q 10000000 -u 1000000000 -n 100000000 bv
Notes:
- The
-t 60 -c 20
switches force a long testing time (60s for warming up + about 60s for performing each test + 20s cooling/sleeping between tests). They can be omitted to get results faster, but averaged over fewer repetitions. - The
-f
switch causes the results to be written to files. It also can be skipped, as the results are printed to the screen anyway.
The results for the methods contained in SDSL2 (which is written in C++; we used clang 14.0.6 to compile) can be obtained using the program available at https://github.com/beling/benchmark-succinct (the page also contains compilation instructions) by running:
rank_sel 1000000000 500000000 60 10000000
rank_sel 1000000000 100000000 60 10000000
Benchmark results
Notes on benchmarks of structures supporting rank and select queries on bit vectors
- We do not distinguish rank0 from rank1 as each is trivially computable from the other.
- In a bit vector with adversarial distribution and n ones, 99% of them occupy the last n indices.
- Versions of the tested crates: bitm 0.4.1, succinct 0.5.2, sucds 0.8.1, sux 0.2.0, vers 1.1.0.
- Structures supporting select marked with * use or are integrated with (in the case of vers RsVec) the corresponding rank structure and the space overhead given for them is additional.
- We conducted the benchmarks using: long measurement (t=60) and cooling (c=20) times, 107 random query arguments, AMD Ryzen 5600G @3.9GHz CPU and compilation with native optimizations enabled.
Rank for uniform distribution of ones in the bit vector
bit vector length | 1010 | 109 | 108 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
percent of ones | 50 | 10 | 50 | 10 | 50 | 10 | ||||||
space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | |
bitm RankSelect101111 | 3.1 | 23 | 3.1 | 23 | 3.1 | 21 | 3.1 | 21 | 3.1 | 6 | 3.1 | 7 |
succinct JacobsonRank | 22.8 | 52 | 22.8 | 52 | 18.8 | 45 | 18.8 | 44 | 22.7 | 12 | 22.7 | 13 |
succinct Rank9 | 25.0 | 19 | 25.0 | 19 | 25.0 | 18 | 25.0 | 17 | 25.0 | 7 | 25.0 | 7 |
sucds Rank9Sel | 25.0 | 19 | 25.0 | 18 | 25.0 | 17 | 25.0 | 17 | 25.0 | 7 | 25.0 | 7 |
vers RsVec | 4.7 | 26 | 5.3 | 26 | 4.7 | 24 | 5.3 | 24 | 4.7 | 6 | 5.3 | 6 |
Select1 for uniform distribution of ones in the bit vector
bit vector length | 1010 | 109 | 108 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
percent of ones | 50 | 10 | 50 | 10 | 50 | 10 | ||||||
space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | |
bitm RankSelect101111 binary search* | 0.0 | 395 | 0.0 | 396 | 0.0 | 177 | 0.0 | 178 | 0.0 | 87 | 0.0 | 87 |
bitm RankSelect101111 combined sampling* | 0.4 | 91 | 0.3 | 92 | 0.4 | 60 | 0.3 | 61 | 0.4 | 29 | 0.3 | 31 |
succinct JacobsonRank* | 0.0 | 1486 | 0.0 | 1449 | 0.0 | 931 | 0.0 | 931 | 0.0 | 436 | 0.0 | 437 |
succinct Rank9* | 0.0 | 838 | 0.0 | 806 | 0.0 | 534 | 0.0 | 516 | 0.0 | 207 | 0.0 | 208 |
sucds Rank9Sel + hints* | 3.1 | 168 | 0.6 | 144 | 3.1 | 107 | 0.6 | 109 | 3.1 | 33 | 0.6 | 41 |
sucds Rank9Sel* | 0.0 | 511 | 0.0 | 448 | 0.0 | 249 | 0.0 | 252 | 0.0 | 103 | 0.0 | 105 |
sux SelectFixed1 | 12.5 | 71 | 2.5 | 204 | 12.5 | 50 | 2.5 | 138 | 12.5 | 15 | 2.5 | 34 |
sux SelectFixed2 | 15.6 | 39 | 3.1 | 90 | 15.6 | 33 | 3.1 | 55 | 15.6 | 10 | 3.1 | 19 |
vers RsVec* | 0.0 | 151 | 0.0 | 159 | 0.0 | 82 | 0.0 | 101 | 0.0 | 32 | 0.0 | 38 |
Select0 for uniform distribution of ones in the bit vector
bit vector length | 1010 | 109 | 108 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
percent of ones | 50 | 10 | 50 | 10 | 50 | 10 | ||||||
space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | |
bitm RankSelect101111 binary search* | 0.0 | 417 | 0.0 | 416 | 0.0 | 193 | 0.0 | 195 | 0.0 | 89 | 0.0 | 90 |
bitm RankSelect101111 combined sampling* | 0.4 | 120 | 0.4 | 122 | 0.4 | 77 | 0.4 | 77 | 0.4 | 32 | 0.4 | 33 |
succinct JacobsonRank* | 0.0 | 1508 | 0.0 | 1451 | 0.0 | 979 | 0.0 | 975 | 0.0 | 452 | 0.0 | 454 |
succinct Rank9* | 0.0 | 841 | 0.0 | 798 | 0.0 | 547 | 0.0 | 531 | 0.0 | 217 | 0.0 | 219 |
sucds Rank9Sel + hints* | 3.1 | 170 | 5.6 | 165 | 3.1 | 108 | 5.6 | 112 | 3.1 | 35 | 5.6 | 34 |
sucds Rank9Sel* | 0.0 | 534 | 0.0 | 456 | 0.0 | 272 | 0.0 | 273 | 0.0 | 107 | 0.0 | 109 |
sux SelectFixed1 | 12.5 | 93 | 22.5 | 61 | 12.5 | 62 | 22.5 | 50 | 12.5 | 17 | 22.5 | 15 |
sux SelectFixed2 | 15.6 | 41 | 28.1 | 37 | 15.6 | 34 | 28.1 | 34 | 15.6 | 10 | 28.1 | 12 |
vers RsVec* | 0.0 | 138 | 0.0 | 146 | 0.0 | 74 | 0.0 | 72 | 0.0 | 30 | 0.0 | 30 |
Rank for adversarial distribution with 99% of ones near the end of the vector
bit vector length | 1010 | 109 | 108 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
percent of ones | 50 | 10 | 50 | 10 | 50 | 10 | ||||||
space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | |
bitm RankSelect101111 | 3.1 | 23 | 3.1 | 23 | 3.1 | 21 | 3.1 | 21 | 3.1 | 7 | 3.1 | 7 |
succinct JacobsonRank | 22.8 | 52 | 22.8 | 52 | 18.8 | 44 | 18.8 | 44 | 22.7 | 12 | 22.7 | 12 |
succinct Rank9 | 25.0 | 19 | 25.0 | 18 | 25.0 | 17 | 25.0 | 17 | 25.0 | 6 | 25.0 | 7 |
sucds Rank9Sel | 25.0 | 19 | 25.0 | 18 | 25.0 | 17 | 25.0 | 17 | 25.0 | 6 | 25.0 | 7 |
vers RsVec | 4.7 | 26 | 5.3 | 26 | 4.7 | 24 | 5.3 | 24 | 4.7 | 6 | 5.3 | 6 |
Select1 for adversarial distribution with 99% of ones near the end of the vector
bit vector length | 1010 | 109 | 108 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
percent of ones | 50 | 10 | 50 | 10 | 50 | 10 | ||||||
space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | |
bitm RankSelect101111 binary search* | 0.0 | 296 | 0.0 | 187 | 0.0 | 166 | 0.0 | 97 | 0.0 | 82 | 0.0 | 72 |
bitm RankSelect101111 combined sampling* | 0.4 | 82 | 0.3 | 65 | 0.4 | 58 | 0.3 | 33 | 0.4 | 28 | 0.3 | 25 |
succinct JacobsonRank* | 0.0 | 1318 | 0.0 | 1045 | 0.0 | 819 | 0.0 | 477 | 0.0 | 415 | 0.0 | 369 |
succinct Rank9* | 0.0 | 715 | 0.0 | 571 | 0.0 | 477 | 0.0 | 232 | 0.0 | 195 | 0.0 | 168 |
sucds Rank9Sel + hints* | 3.1 | 159 | 0.6 | 115 | 3.1 | 97 | 0.6 | 46 | 3.1 | 26 | 0.6 | 21 |
sucds Rank9Sel* | 0.0 | 435 | 0.0 | 278 | 0.0 | 195 | 0.0 | 127 | 0.0 | 93 | 0.0 | 73 |
sux SelectFixed1 | 12.5 | 51 | 2.5 | 50 | 12.5 | 39 | 2.5 | 31 | 12.5 | 12 | 2.5 | 16 |
sux SelectFixed2 | 15.6 | 37 | 3.1 | 41 | 15.6 | 33 | 3.1 | 32 | 15.6 | 10 | 3.1 | 13 |
vers RsVec* | 0.0 | 142 | 0.0 | 87 | 0.0 | 77 | 0.0 | 39 | 0.0 | 31 | 0.0 | 30 |
Select0 for adversarial distribution with 99% of ones near the end of the vector
bit vector length | 1010 | 109 | 108 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
percent of ones | 50 | 10 | 50 | 10 | 50 | 10 | ||||||
space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | space overhead [%] | time / query [ns] | |
bitm RankSelect101111 binary search* | 0.0 | 335 | 0.0 | 418 | 0.0 | 182 | 0.0 | 192 | 0.0 | 84 | 0.0 | 88 |
bitm RankSelect101111 combined sampling* | 0.4 | 107 | 0.4 | 118 | 0.4 | 74 | 0.4 | 76 | 0.4 | 31 | 0.4 | 32 |
succinct JacobsonRank* | 0.0 | 1391 | 0.0 | 1427 | 0.0 | 894 | 0.0 | 963 | 0.0 | 437 | 0.0 | 449 |
succinct Rank9* | 0.0 | 737 | 0.0 | 808 | 0.0 | 490 | 0.0 | 519 | 0.0 | 207 | 0.0 | 216 |
sucds Rank9Sel + hints* | 3.1 | 160 | 5.6 | 164 | 3.1 | 98 | 5.6 | 110 | 3.1 | 27 | 5.6 | 30 |
sucds Rank9Sel* | 0.0 | 415 | 0.0 | 450 | 0.0 | 220 | 0.0 | 264 | 0.0 | 96 | 0.0 | 105 |
sux SelectFixed1 | 12.5 | 56 | 22.5 | 54 | 12.5 | 43 | 22.5 | 46 | 12.5 | 13 | 22.5 | 14 |
sux SelectFixed2 | 15.6 | 39 | 28.1 | 36 | 15.6 | 34 | 28.1 | 34 | 15.6 | 10 | 28.1 | 13 |
vers RsVec* | 0.0 | 132 | 0.0 | 146 | 0.0 | 70 | 0.0 | 71 | 0.0 | 30 | 0.0 | 31 |
Dependencies
~16–47MB
~800K SLoC