#range-set #symmetric-difference #operation #no-std #range #set

no-std range-set-blaze

Integer sets as fast, sorted integer ranges; Maps with integer-range keys; Full set operations

24 releases

Uses new Rust 2024

0.2.0 May 14, 2025
0.2.0-alpha1 May 1, 2024
0.1.16 Mar 9, 2024
0.1.14 Dec 19, 2023
0.1.9 Jun 29, 2023

#7 in WebAssembly

Download history 28838/week @ 2025-02-01 32176/week @ 2025-02-08 21881/week @ 2025-02-15 24900/week @ 2025-02-22 27080/week @ 2025-03-01 31443/week @ 2025-03-08 28723/week @ 2025-03-15 24742/week @ 2025-03-22 27117/week @ 2025-03-29 25444/week @ 2025-04-05 23409/week @ 2025-04-12 20211/week @ 2025-04-19 20086/week @ 2025-04-26 18894/week @ 2025-05-03 22238/week @ 2025-05-10 19080/week @ 2025-05-17

84,690 downloads per month
Used in 43 crates (7 directly)

MIT/Apache

565KB
9K SLoC

range-set-blaze

github crates.io docs.rs

Integer sets as fast, sorted integer ranges; Maps with integer-range keys; Full set operations

Supports all of Rust's integer-like types, [u8] to u128, [i8] to i128, char (Unicode characters), Ipv4Addr, and Ipv6Addr. Set operations—union, intersection, difference, symmetric difference, and complement— are available on both sets and maps.

The crate's main structs are:

Unlike the standard BTreeSet/BTreeMap and HashSet/HashMap, RangeSetBlaze does not store every integer in the set. Rather, it stores sorted & disjoint ranges of integers in a cache-efficient BTreeMap. It differs from other interval libraries -- that we know of -- by offering full set operations and by being optimized for sets of clumpy integers.

We can construct a RangeSetBlaze or RangeMapBlaze from unsorted & redundant integers (or ranges). When the inputs are clumpy, construction will be linear in the number of inputs and set operations will be sped up quadratically.

The crate's main traits are

  • SortedDisjoint, implemented by iterators of sorted & disjoint ranges of integers. See documentation for details.
  • SortedDisjointMap, implemented by iterators of pairs, where the first item is a sorted & disjoint range of integers. The second item is a value. See documentation for details.

With any SortedDisjoint or SortedDisjointMap iterator we can perform set operations in one pass through the ranges and with minimal (constant) memory. The package enforces the "sorted & disjoint" constraint at compile time (making invalid states unrepresentable).

The crate supports no_std, WASM, and embedded (with alloc) projects. For no_std, etc., Use the command:

cargo add range-set-blaze --features "alloc" --no-default-features

Benchmarks

See the set benchmarks and map benchmarks for performance comparisons with other range-related crates.

Generally, for many tasks involving clumpy integers and ranges, RangeSetBlaze is much faster than alternatives.

The benchmarks are in the benches directory. To run them, use cargo bench.

Articles

Examples

Example 1: Set Operations


Here we take the union (operator “|”) of two RangeSetBlaze's:

Example 1

use range_set_blaze::prelude::*;

 // a is the set of integers from 100 to 499 (inclusive) and 501 to 1000 (inclusive)
let a = RangeSetBlaze::from_iter([100..=499, 501..=999]);
 // b is the set of integers -20 and the range 400 to 599 (inclusive)
let b = RangeSetBlaze::from_iter([-20..=-20, 400..=599]);
// c is the union of a and b, namely -20 and 100 to 999 (inclusive)
let c = a | b;
assert_eq!(c, RangeSetBlaze::from_iter([-20..=-20, 100..=999]));

Example 2: Maps (and Network Addresses)


In networking, suppose we want to simplify a routing table. Here RangeMapBlaze merges identical routes -- if adjacent or overlapping. It also remove all overlaps (respecting priority) and sorts. The result is a fast BTree from address regions to the next network hop. Similar code can simplify font tables.

use range_set_blaze::prelude::*;
use std::net::Ipv4Addr;

// A routing table, sorted by priority
let routing = [
    // destination, prefix, next hop, interface
    ("10.0.1.8", 30, "10.1.1.0", "eth2"),
    ("10.0.1.12", 30, "10.1.1.0", "eth2"),
    ("10.0.1.7", 32, "10.1.1.0", "eth2"),
    ("10.0.0.0", 8, "10.3.4.2", "eth1"),
    ("0.0.0.0", 0, "152.10.0.0", "eth0"),
];

// Create a RangeMapBlaze from the routing table
let range_map = routing
    .iter()
    .map(|(dest, prefix_len, next_hop, interface)| {
        let dest: Ipv4Addr = dest.parse().unwrap();
        let next_hop: Ipv4Addr = next_hop.parse().unwrap();
        let mask = u32::MAX.checked_shr(*prefix_len).unwrap_or(0);
        let range_start = Ipv4Addr::from(u32::from(dest) & !mask);
        let range_end = Ipv4Addr::from(u32::from(dest) | mask);
        (range_start..=range_end, (next_hop, interface))
    })
    .collect::<RangeMapBlaze<_, _>>();

// Print the now disjoint, sorted ranges and their associated values
for (range, (next_hop, interface)) in range_map.range_values() {
    println!("{range:?} -> ({next_hop}, {interface})");
}

// Look up an address
assert_eq!(
    range_map.get(Ipv4Addr::new(10, 0, 1, 6)),
    Some(&(Ipv4Addr::new(10, 3, 4, 2), &"eth1"))
);

Output:

0.0.0.0..=9.255.255.255 -> (152.10.0.0, eth0)
10.0.0.0..=10.0.1.6 -> (10.3.4.2, eth1)
10.0.1.7..=10.0.1.15 -> (10.1.1.0, eth2)
10.0.1.16..=10.255.255.255 -> (10.3.4.2, eth1)
11.0.0.0..=255.255.255.255 -> (152.10.0.0, eth0)

Example 3: Biology


In biology, suppose we want to find the intron regions of a gene but we are given only the transcription region and the exon regions.

Example 3

We create a RangeSetBlaze for the transcription region and a RangeSetBlaze for all the exon regions. Then we take the difference between the transcription region and exon regions to find the intron regions.

use range_set_blaze::prelude::*;

let line = "chr15   29370   37380   29370,32358,36715   30817,32561,37380";

// split the line on white space
let mut iter = line.split_whitespace();
let chrom = iter.next().unwrap();

// Parse the start and end of the transcription region into a RangeSetBlaze
let trans_start: i32 = iter.next().unwrap().parse().unwrap();
let trans_end: i32 = iter.next().unwrap().parse().unwrap();
let trans = RangeSetBlaze::from_iter([trans_start..=trans_end]);
assert_eq!(trans, RangeSetBlaze::from_iter([29370..=37380]));

// Parse the start and end of the exons into a RangeSetBlaze
let exon_starts = iter.next().unwrap().split(',').map(|s| s.parse::<i32>());
let exon_ends = iter.next().unwrap().split(',').map(|s| s.parse::<i32>());
let exon_ranges = exon_starts
    .zip(exon_ends)
    .map(|(s, e)| s.unwrap()..=e.unwrap());
let exons = RangeSetBlaze::from_iter(exon_ranges);
assert_eq!(exons, RangeSetBlaze::from_iter([29370..=30817, 32358..=32561, 36715..=37380]));

// Use 'set difference' to find the introns
let intron = trans - exons;
assert_eq!(intron, RangeSetBlaze::from_iter([30818..=32357, 32562..=36714]));
for range in intron.ranges() {
    let (start, end) = range.into_inner();
    println!("{chrom}\t{start}\t{end}");
}

Dependencies

~1MB
~15K SLoC