### 11 releases

0.3.8 | Feb 17, 2024 |
---|---|

0.3.7 | Feb 17, 2024 |

0.3.6 | Aug 12, 2023 |

0.3.5 | Jul 17, 2023 |

0.1.0 | Jul 4, 2023 |

#**271** in Data structures

**191** downloads per month

**Apache-2.0 OR MIT**

135KB

2.5K
SLoC

# indexset

A pure-Rust two-level dynamic order-statistic b-tree.

This crate implements a compact set data structure that preserves its elements' sorted order and allows lookups of entries by value or sorted order position.

Also, it is(mostly) a drop-in replacement for the stdlib BTree.

# Background

This was heavily inspired by

, and
python's `indexmap`

.`sortedcontainers`

It differs from both in that:

is a hashmap that provides numerical lookups, but does not maintain order in case of removals, while`indexmap`

is a b-tree that always maintains order, irrespective of which mutating operation is run.`indexset`

is similar in spirit, but utilizes a different routine for balancing the tree, and relies on a heap for numerical lookups.`sortecontainers`

provides the following features:`indexset`

- As fast to iterate as a vec.
- Zero indirection.
- Lookups by position and range.
- Minimal amount of allocations.

(lookups by position) and`select`

operations in near constant time.`rank`

## Performance

and `BTreeSet`

derive a couple of performance facts directly from how it is constructed, which is roughly:`BTreeMap`

A two-level B-Tree with a fenwick tree as a low-cost index for numerical lookups

- Iteration is very fast since it inherits a vec's iter struct.
- Insertions and removals are constant time in the best-case scenario, and logarithmic on the node size in the worst case
- Lookups are very fast, and rely only on two binary searches over very little data.

## Benchmarks

run

and see it for yourself.`cargo`` bench`

On a lowest-specced M1 macbook pro I got the following numbers:

### Inserting 100k random usize

: 8.25ms`stdlib``::``collections`BTreeSet`::``.``insert``(`i`)`

: 17.3ms`indexset`BTreeSet`::``.``insert``(`i`)`

### Checking that each 100k random usize integers exist

: 7.5ms`stdlib``::``collections`BTreeSet`::``.``contains``(`i`)`

: 6.8ms`indexset`BTreeSet`::``.``contains``(`i`)`

### Getting all 100k i-th elements

:`stdlib``::``collections`BTreeSet`::``.`iter`.``nth``(`i`)`**13.28s**yes, seconds!

:`indexset`BTreeSet`::``.``get_index``(`i`)`**3.93ms**

### Iterating over all 100k elements and then collecting it into a vec

:`Vec`from_iter`::``(``stdlib``::``collections`BTreeSet`::``.``iter``(``)``)`**227.28us**

:`Vec`from_iter`::``(``indexset`BTreeSet`::``.``iter``(``)``)`**123.21.us**

Yes.

Getting the i-th element is **3400x** faster than stdlib's btree,

is 10% faster, and iterating
is twice as fast, at the cost of insertions being half as fast.`contains`

If your use case of

and `std ::`

`collections`BTreeSet

`::``BTreeMap`

is more read-heavy, or if you really need to index by
sorted-order position, it might be worth checking out this `indexset`

instead.# Limitations

is less polished than`BTreeMap`

. This crate has been optimised for a leaner`BTreeSet`

.`BTreeSet`- This is
**beta**quality software, so use it at your own risk. I'm not 100% certain about every single iterator implementation(everything has tests though). - There's quite a couple utility traits that have not been implemented yet.

# Naming

This library is called

, because the base data structure is `indexset`

. `BTreeSet`

is a `BTreeMap`

with
a `BTreeSet`

item type.`Pair <K, V>`

# Changelog

See CHANGELOG.md.

#### Dependencies

~0.5–1MB

~24K SLoC