18 releases (5 breaking)

0.6.0 Apr 22, 2024
0.5.1 Mar 8, 2024
0.4.0 Mar 8, 2024
0.3.0 Dec 7, 2023
0.1.9 Apr 26, 2019

#343 in Data structures


Used in bytebraise

MIT license

77KB
1.5K SLoC

nested_intervals


This crate deals with interval sets, which are lists of Ranges that may be both overlapping and nested.

The implementation is based on nested containment lists as proposed by Alekseyenko et al. 2007, which offer the same big-O complexity as interval trees (O(n * log(n)) construction, O(n + m) queries). The query data structure is constructed lazily: it is built only the first time a method relying on it is called.
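The lazy construction described above can be illustrated with a small standalone sketch. It does not use this crate: `LazyIntervals`, `ensure_index` and the `builds` counter are hypothetical names, and the "index" is just a sorted copy of the intervals rather than a nested containment list. The point is only that the first query has to mutate the set to build its index, which is presumably why the example further down declares `interval_set` as `mut`.

```rust
use std::ops::Range;

/// Hypothetical stand-in for an interval container; not the crate's real type.
struct LazyIntervals {
    intervals: Vec<Range<u32>>,
    /// Built on first use. Here the "index" is only the intervals sorted by start.
    index: Option<Vec<Range<u32>>>,
    /// Counts how often the index was built, to make the laziness observable.
    builds: usize,
}

impl LazyIntervals {
    fn new(intervals: &[Range<u32>]) -> Self {
        LazyIntervals {
            intervals: intervals.to_vec(),
            index: None,
            builds: 0,
        }
    }

    /// Builds the index if it is missing. This is why queries need `&mut self`.
    fn ensure_index(&mut self) -> &[Range<u32>] {
        if self.index.is_none() {
            let mut sorted = self.intervals.clone();
            sorted.sort_by_key(|r| (r.start, std::cmp::Reverse(r.end)));
            self.index = Some(sorted);
            self.builds += 1;
        }
        self.index.as_deref().unwrap()
    }

    /// True if any stored interval overlaps the (half-open) query range.
    fn has_overlap(&mut self, query: &Range<u32>) -> bool {
        self.ensure_index()
            .iter()
            .take_while(|r| r.start < query.end)
            .any(|r| r.end > query.start)
    }
}
```

Constructing a `LazyIntervals` does no sorting; the first call to `has_overlap` builds the index once, and later calls reuse it.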

Each interval has a vec of u32 ids attached, which allows results to be linked back to other data structures.
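One way to use the attached ids is as indices into an external table. The sketch below is standalone and assumes the ids were chosen to be positions in a slice of names; `names_for_hits` is a hypothetical helper, not part of this crate.

```rust
/// Hypothetical helper: resolve each hit's ids to entries of an external table.
/// Assumes every id is a valid index into `names`; panics otherwise.
fn names_for_hits<'a>(hit_ids: &[Vec<u32>], names: &[&'a str]) -> Vec<Vec<&'a str>> {
    hit_ids
        .iter()
        .map(|ids| ids.iter().map(|&id| names[id as usize]).collect())
        .collect()
}
```

Given a names table `["geneA", "geneB", "geneC"]` and a single merged hit carrying ids `[0, 1]`, the helper returns one entry holding `geneA` and `geneB`.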

Full documentation at docs.rs Source at GitHub

Example


    use nested_intervals::IntervalSet;

    fn test_example() {
        let intervals = vec![0..20, 15..30, 50..100];
        let mut interval_set = IntervalSet::new(&intervals);
        // automatic ids, use new_with_ids otherwise
        assert_eq!(interval_set.ids, vec![vec![0], vec![1], vec![2]]);
        // intervals overlapping the query range
        let hits = interval_set.query_overlapping(10..16);
        assert_eq!(hits.intervals, [0..20, 15..30]);
        // merge overlapping hits into their hull; the ids are combined
        let merged = hits.merge_hull();
        assert_eq!(merged.intervals, [0..30]);
        assert_eq!(merged.ids, vec![vec![0,1]]);
    }
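To make the hull merge in the example concrete, here is a standalone sketch of the idea: overlapping intervals collapse into one covering interval, and their id lists are concatenated. It is not the crate's implementation. `merge_hull_sketch` is a hypothetical name, and treating merely adjacent intervals (such as 0..10 and 10..20) as separate is an assumption of this sketch.

```rust
use std::ops::Range;

/// Hypothetical sketch of a hull merge over (interval, ids) pairs.
/// Assumes `intervals` and `ids` have the same length.
fn merge_hull_sketch(
    intervals: &[Range<u32>],
    ids: &[Vec<u32>],
) -> (Vec<Range<u32>>, Vec<Vec<u32>>) {
    // Pair each interval with its ids and sort by start, then end.
    let mut pairs: Vec<(Range<u32>, Vec<u32>)> =
        intervals.iter().cloned().zip(ids.iter().cloned()).collect();
    pairs.sort_by_key(|(r, _)| (r.start, r.end));

    let mut out_intervals: Vec<Range<u32>> = Vec::new();
    let mut out_ids: Vec<Vec<u32>> = Vec::new();
    for (r, id) in pairs {
        // Assumption: only true overlap merges; touching intervals stay apart.
        let overlaps = out_intervals.last().map_or(false, |last| r.start < last.end);
        if overlaps {
            let last = out_intervals.last_mut().unwrap();
            last.end = last.end.max(r.end);
            out_ids.last_mut().unwrap().extend(id);
        } else {
            out_intervals.push(r);
            out_ids.push(id);
        }
    }
    (out_intervals, out_ids)
}
```

Applied to the two hits from the example, 0..20 with id 0 and 15..30 with id 1, the sketch yields the single interval 0..30 carrying ids 0 and 1.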

Functionality

Not (yet) supported

We currently cannot

  • find the interval with the closest end

  • find the interval with the closest end to the left of a point (going to be expensive: O(n/2))

  • find the interval with the closest end to the right of a point (going to be expensive: O(n/2))

  • intersect two interval sets (i.e. covered units in both sets)

  • intersect more than two interval sets (i.e. covered units in multiple sets, possibly applying a 'k' threshold)

  • merge internally overlapping intervals by intersecting them (what would that even mean for nested sets?)
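Until set intersection is supported, the "covered units in both sets" case can be handled outside the crate on plain ranges. The following is a minimal sketch under that assumption: `flatten` and `intersect_covered` are hypothetical helpers, they work on `Range<u32>` slices rather than on this crate's types, and they discard ids.

```rust
use std::ops::Range;

/// Hypothetical helper: collapse a list of possibly overlapping or nested
/// ranges into sorted, disjoint ranges covering the same units.
fn flatten(intervals: &[Range<u32>]) -> Vec<Range<u32>> {
    let mut sorted: Vec<Range<u32>> = intervals
        .iter()
        .filter(|r| r.start < r.end) // skip empty ranges
        .cloned()
        .collect();
    sorted.sort_by_key(|r| r.start);

    let mut out: Vec<Range<u32>> = Vec::new();
    for r in sorted {
        let touches = out.last().map_or(false, |last| r.start <= last.end);
        if touches {
            let last = out.last_mut().unwrap();
            last.end = last.end.max(r.end);
        } else {
            out.push(r);
        }
    }
    out
}

/// Hypothetical helper: units covered by both `a` and `b`, found with a
/// two-pointer sweep over the flattened inputs.
fn intersect_covered(a: &[Range<u32>], b: &[Range<u32>]) -> Vec<Range<u32>> {
    let (a, b) = (flatten(a), flatten(b));
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        let start = a[i].start.max(b[j].start);
        let end = a[i].end.min(b[j].end);
        if start < end {
            out.push(start..end);
        }
        // Advance whichever range ends first.
        if a[i].end < b[j].end {
            i += 1;
        } else {
            j += 1;
        }
    }
    out
}
```

Flattening first keeps the sweep simple, at the cost of losing the nesting structure and the per-interval ids, so this only answers the coverage question.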

Changelog

  • 0.6.0 - Fixed a bug with has_overlap, which would produce false negatives if intervals with identical starts were in the dataset.

Dependencies

~1MB
~18K SLoC