47 releases
0.2.25 | May 9, 2024 |
---|---|
0.2.24 | May 1, 2024 |
0.1.11 | Sep 29, 2023 |
0.1.7 | Aug 18, 2023 |
0.0.14 | Sep 30, 2022 |
bedrs
bedtools-like functionality for interval sets in Rust
Summary
This is an interval library written in Rust that takes advantage of the trait system, generics, monomorphization, and procedural macros to provide high-efficiency interval operations with quality-of-life features for developers.
It centers on the `Coordinates` trait, which, once implemented on an arbitrary interval type, allows for a wide range of genomic interval arithmetic.
It also introduces a new collection type, `IntervalContainer`, which acts as a collection of `Coordinates` and has many set operations implemented.
Interval arithmetic can be thought of as set theoretic operations (like intersection, union, difference, complement, etc.) on intervals with associated chromosomes, strands, and other genomic markers.
This library facilitates the development of these types of operations on arbitrary types and lets the user tailor their structures to minimize computational overhead, but also remains a flexible library for general interval operations.
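To make the set-theoretic view concrete, here is a minimal std-only sketch of intersection and union on two intervals. The `Iv` type and its methods are hypothetical and written only for illustration; they are not bedrs types. The sketch also assumes half-open `[start, end)` coordinates, which may differ from the conventions bedrs uses.

```rust
/// A minimal genomic interval: half-open `[start, end)` on a chromosome.
/// Hypothetical type for illustration only; not one of the bedrs types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Iv {
    pub chr: usize,
    pub start: usize,
    pub end: usize,
}

impl Iv {
    pub fn new(chr: usize, start: usize, end: usize) -> Self {
        Self { chr, start, end }
    }

    /// Two intervals overlap only if they share a chromosome and
    /// their coordinate ranges have at least one base in common.
    pub fn overlaps(&self, other: &Iv) -> bool {
        self.chr == other.chr && self.start < other.end && other.start < self.end
    }

    /// Intersection: the region covered by both intervals, if any.
    pub fn intersect(&self, other: &Iv) -> Option<Iv> {
        if !self.overlaps(other) {
            return None;
        }
        Some(Iv::new(
            self.chr,
            self.start.max(other.start),
            self.end.min(other.end),
        ))
    }

    /// Union of two overlapping intervals: the region covered by either.
    /// Returns `None` when the intervals do not overlap, because the
    /// result would not be a single contiguous interval.
    pub fn union(&self, other: &Iv) -> Option<Iv> {
        if !self.overlaps(other) {
            return None;
        }
        Some(Iv::new(
            self.chr,
            self.start.min(other.start),
            self.end.max(other.end),
        ))
    }
}
```

The chromosome check is what separates genomic interval arithmetic from plain numeric ranges: two intervals with identical coordinates on different chromosomes never intersect.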
Usage
The main benefit of this library is that it is trait-based.
You can define your own types, and as long as they implement the `Coordinates` trait they can use the other functions within the library.
For detailed usage and examples please review the documentation.
`Coordinates` Trait
The library centers around the `Coordinates` trait.
This trait defines the minimal set of functions required for all set operations, such as getting the chromosome ID of an interval, its start and end points, or its strand.
This can be implemented by hand, or, if you follow the common naming conventions used in the library (`chr`, `start`, `end`, `strand`), you can `#[derive(Coordinates)]` on your custom interval type.
use bedrs::prelude::*;
// define a custom interval struct for testing
#[derive(Default, Coordinates)]
struct MyInterval {
    chr: usize,
    start: usize,
    end: usize,
}
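For readers curious about what a trait-based design buys you, the following std-only sketch shows the general pattern: a small accessor trait, two unrelated types that implement it, and one generic function that works on both. The `Coords` trait, `NamedInterval`, and `overlaps` are hypothetical names made up for this sketch. The real `Coordinates` trait in bedrs has its own signature, so consult the documentation before implementing it by hand.

```rust
/// Hypothetical stand-in for a coordinates trait: the minimal accessors
/// that generic interval code needs. This is NOT the bedrs `Coordinates` trait.
pub trait Coords {
    fn chr(&self) -> usize;
    fn start(&self) -> usize;
    fn end(&self) -> usize;

    /// Length in bases, available for free once the accessors exist.
    fn len(&self) -> usize {
        self.end().saturating_sub(self.start())
    }
}

/// A user-defined interval that carries extra data (a name).
pub struct NamedInterval {
    pub chr: usize,
    pub start: usize,
    pub end: usize,
    pub name: String,
}

impl Coords for NamedInterval {
    fn chr(&self) -> usize {
        self.chr
    }
    fn start(&self) -> usize {
        self.start
    }
    fn end(&self) -> usize {
        self.end
    }
}

/// A bare `(chr, start, end)` tuple can implement the same trait.
impl Coords for (usize, usize, usize) {
    fn chr(&self) -> usize {
        self.0
    }
    fn start(&self) -> usize {
        self.1
    }
    fn end(&self) -> usize {
        self.2
    }
}

/// Generic over any two types implementing `Coords`. The compiler
/// monomorphizes this function for each concrete pair of types, so
/// there is no dynamic dispatch. Assumes half-open `[start, end)`.
pub fn overlaps<A: Coords, B: Coords>(a: &A, b: &B) -> bool {
    a.chr() == b.chr() && a.start() < b.end() && b.start() < a.end()
}
```

Because `overlaps` only depends on the trait, a struct with extra fields and a plain tuple can be compared directly without converting one into the other.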
Interval Types
While you can create your own interval types, there are plenty of 'batteries-included' types you can use in your own libraries already.
These include:
These are pre-built interval types and can be used in many use cases:
use bedrs::prelude::*;
// An interval on chromosome 1 and spanning base 20 <-> 40
let a = Bed3::new(1, 20, 40);
// An interval on chromosome 1 and spanning base 30 <-> 50
let b = Bed3::new(1, 30, 50);
// Find the intersecting interval of the two
// This returns an Option<Bed3> because they may not intersect.
let c = a.intersect(&b).unwrap();
assert_eq!(c.chr(), &1);
assert_eq!(c.start(), 30);
assert_eq!(c.end(), 40);
Interval Operations
Interval Set Operations
Set operations are performed using the methods of the `IntervalContainer`.
We can build an `IntervalContainer` easily on any collection of intervals:
use bedrs::prelude::*;
let set = IntervalContainer::new(vec![
    Bed3::new(1, 20, 30),
    Bed3::new(1, 30, 40),
    Bed3::new(1, 40, 50),
]);
assert_eq!(set.len(), 3);
For more details on each of the operations listed below, and more, please explore the `IntervalContainer` documentation for all associated methods.
- Bound
- Closest
- Complement
- Find
- Internal
- Merge
- Sample
- Intersect
- Segment
- Subtract
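As a rough illustration of what one of these operations does, here is a std-only sketch of a merge over `(chr, start, end)` tuples. The `merge` function below is hypothetical and is not the bedrs implementation or API. It assumes half-open coordinates and treats adjacent intervals (where one ends exactly where the next begins) as mergeable, which is a design choice of this sketch and may not match the behavior of bedrs.

```rust
/// Merge overlapping or adjacent intervals on the same chromosome.
/// Intervals are `(chr, start, end)` tuples with half-open coordinates.
/// Hypothetical sketch; not the bedrs `merge` implementation.
pub fn merge(mut ivs: Vec<(usize, usize, usize)>) -> Vec<(usize, usize, usize)> {
    // Tuple ordering sorts by chromosome, then start, then end.
    ivs.sort_unstable();

    let mut out: Vec<(usize, usize, usize)> = Vec::with_capacity(ivs.len());
    for (chr, start, end) in ivs {
        if let Some(last) = out.last_mut() {
            // Same chromosome, and the new interval starts inside or
            // exactly at the end of the current run: extend the run.
            if last.0 == chr && start <= last.2 {
                last.2 = last.2.max(end);
                continue;
            }
        }
        // Otherwise start a new run.
        out.push((chr, start, end));
    }
    out
}
```

Sorting first means a single linear pass is enough afterwards, since any interval that can extend the current run must appear immediately after it in sorted order.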
Other Work
This library is heavily inspired by other interval libraries in Rust, which are listed below:
It also was motivated by the following interval toolkits in C++ and C respectively: