2 releases

Uses new Rust 2024

new 0.5.1 Nov 4, 2025
0.5.0 Sep 17, 2025

#423 in Biology

61 downloads per month
Used in 6 crates

MIT license

120KB
2.5K SLoC

Core infrastructure for high-performance genomic interval overlap operations in Rust.

This crate provides efficient data structures and algorithms for finding overlapping intervals in genomic data. It is part of the gtars project, which provides tools for working with genomic interval data in Rust, Python, and R.

Features

  • Fast overlap queries: Efficiently find all intervals that overlap with a query interval
  • Iterator-based API: Memory-efficient iteration over overlapping intervals
  • Thread-safe: All data structures implement Send and Sync for concurrent access

All overlap computation logic should live here. Higher-level modules (scoring, tokenizers) wrap this functionality for their specific use cases but should not reimplement overlap algorithms.
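The thread-safety point above can be illustrated with a minimal sketch. It does not use this crate's types: `ToyIndex`, `count_overlaps`, and `parallel_counts` are hypothetical stand-ins invented for the illustration, and the half-open `[start, end)` overlap rule is an assumption. The pattern is the usual one for any read-only structure that is `Send` and `Sync`: build it once, wrap it in an `Arc`, and let each thread query its own clone of the handle.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a read-only overlap index (not a crate type).
struct ToyIndex {
    ivs: Vec<(u32, u32, &'static str)>,
}

impl ToyIndex {
    /// Linear-scan overlap count, assuming half-open [start, end) intervals.
    fn count_overlaps(&self, start: u32, end: u32) -> usize {
        self.ivs
            .iter()
            .filter(|iv| iv.0 < end && iv.1 > start)
            .count()
    }
}

/// Runs each query on its own thread against one shared, immutable index.
fn parallel_counts(index: Arc<ToyIndex>, queries: Vec<(u32, u32)>) -> Vec<usize> {
    let handles: Vec<_> = queries
        .into_iter()
        .map(|(start, end)| {
            // Each worker gets its own reference-counted handle to the same data.
            let index = Arc::clone(&index);
            thread::spawn(move || index.count_overlaps(start, end))
        })
        .collect();

    // Joining in spawn order keeps results aligned with the input queries.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    let index = Arc::new(ToyIndex {
        ivs: vec![(100, 200, "gene1"), (150, 300, "gene2"), (400, 500, "gene3")],
    });
    let counts = parallel_counts(index, vec![(180, 250), (450, 460), (600, 700)]);
    println!("{:?}", counts);
}
```

No locking is needed here because the index is never mutated after it is built; the `Arc` only manages shared ownership.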

Quick Start

use gtars_overlaprs::{AIList, Overlapper, Interval};

// create some genomic intervals (e.g., annotated genes)
let intervals = vec![
    Interval { start: 100u32, end: 200, val: "gene1" },
    Interval { start: 150, end: 300, val: "gene2" },
    Interval { start: 400, end: 500, val: "gene3" },
];

// build the AIList data structure
let ailist = AIList::build(intervals);

// query for overlapping intervals
let overlaps = ailist.find(180, 250);
assert_eq!(overlaps.len(), 2); // gene1 and gene2 overlap

// or use an iterator for memory-efficient processing
for interval in ailist.find_iter(180, 250) {
    println!("Found overlap: {:?}", interval);
}

Performance

The AIList data structure is optimized for overlap queries on genomic-scale datasets. It uses a decomposition strategy to handle intervals efficiently, particularly in the high-coverage regions common in genomic data.
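To give a feel for the kind of query this enables, here is a simplified sketch of a sorted interval list augmented with a running maximum end. It is not the crate's implementation: it keeps a single list and omits the decomposition step entirely, and the names `Iv`, `SimpleList`, `build`, and `find`, as well as the half-open `[start, end)` semantics, are assumptions made for the illustration.

```rust
/// Illustrative interval type, assumed half-open: [start, end).
#[derive(Debug, Clone, PartialEq)]
struct Iv<T> {
    start: u32,
    end: u32,
    val: T,
}

/// Intervals sorted by start, plus the running maximum of `end` over each prefix.
struct SimpleList<T> {
    ivs: Vec<Iv<T>>,
    max_end: Vec<u32>,
}

impl<T> SimpleList<T> {
    fn build(mut ivs: Vec<Iv<T>>) -> Self {
        ivs.sort_by_key(|iv| iv.start);
        let mut max_end = Vec::with_capacity(ivs.len());
        let mut running = 0u32;
        for iv in &ivs {
            running = running.max(iv.end);
            max_end.push(running);
        }
        SimpleList { ivs, max_end }
    }

    fn find(&self, start: u32, end: u32) -> Vec<&Iv<T>> {
        // Binary search: intervals at or after `upper` start at or past the
        // query end, so they cannot overlap.
        let upper = self.ivs.partition_point(|iv| iv.start < end);
        let mut hits = Vec::new();
        // Scan backwards; once the prefix maximum end is at or before the
        // query start, no earlier interval can reach the query.
        for i in (0..upper).rev() {
            if self.max_end[i] <= start {
                break;
            }
            if self.ivs[i].end > start {
                hits.push(&self.ivs[i]);
            }
        }
        hits.reverse(); // report hits in ascending start order
        hits
    }
}

fn main() {
    let list = SimpleList::build(vec![
        Iv { start: 100, end: 200, val: "gene1" },
        Iv { start: 150, end: 300, val: "gene2" },
        Iv { start: 400, end: 500, val: "gene3" },
    ]);
    let names: Vec<&str> = list.find(180, 250).iter().map(|iv| iv.val).collect();
    println!("{:?}", names);
}
```

In this single-list form, one very long interval keeps the prefix maximum high and forces the backward scan to visit many non-overlapping entries. That is the situation a decomposition into several sub-lists is meant to mitigate.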

Examples

Finding all genes that overlap a query region

use gtars_overlaprs::{AIList, Overlapper, Interval};

let genes = vec![
    Interval { start: 1000u32, end: 2000, val: "BRCA1" },
    Interval { start: 3000, end: 4000, val: "TP53" },
    Interval { start: 5000, end: 6000, val: "EGFR" },
];

let gene_index = AIList::build(genes);

// query a specific region (e.g., chr17:1500-3500)
let overlapping_genes: Vec<&str> = gene_index
    .find_iter(1500, 3500)
    .map(|interval| interval.val)
    .collect();

println!("Genes in region: {:?}", overlapping_genes);
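The intervals in these examples carry only `start`, `end`, and `val`, with no chromosome field, so a query cannot by itself tell chromosomes apart. One common way to handle this is to keep a separate collection per chromosome and look up the right one before querying. The sketch below shows that grouping with plain standard-library types; `Region`, `group_by_chrom`, and `find_on_chrom` are hypothetical names, the linear scan stands in for a real index, and the half-open overlap rule is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical record: a named region on one chromosome, half-open [start, end).
struct Region {
    chrom: &'static str,
    start: u32,
    end: u32,
    name: &'static str,
}

/// Groups regions by chromosome so a query only sees same-chromosome intervals.
fn group_by_chrom(regions: Vec<Region>) -> HashMap<&'static str, Vec<Region>> {
    let mut by_chrom: HashMap<&'static str, Vec<Region>> = HashMap::new();
    for r in regions {
        by_chrom.entry(r.chrom).or_default().push(r);
    }
    by_chrom
}

/// Names of regions on `chrom` that overlap [start, end), ordered by start.
fn find_on_chrom(
    by_chrom: &HashMap<&'static str, Vec<Region>>,
    chrom: &str,
    start: u32,
    end: u32,
) -> Vec<&'static str> {
    // An unknown chromosome yields an empty result rather than an error.
    let mut hits: Vec<&Region> = by_chrom
        .get(chrom)
        .map(|v| {
            v.iter()
                .filter(|r| r.start < end && r.end > start)
                .collect::<Vec<_>>()
        })
        .unwrap_or_default();
    hits.sort_by_key(|r| r.start);
    hits.iter().map(|r| r.name).collect()
}

fn main() {
    // Toy coordinates reused from the example above, not real gene positions.
    let by_chrom = group_by_chrom(vec![
        Region { chrom: "chr17", start: 1000, end: 2000, name: "BRCA1" },
        Region { chrom: "chr17", start: 3000, end: 4000, name: "TP53" },
        Region { chrom: "chr7", start: 5000, end: 6000, name: "EGFR" },
    ]);
    println!("{:?}", find_on_chrom(&by_chrom, "chr17", 1500, 3500));
    println!("{:?}", find_on_chrom(&by_chrom, "chr7", 1500, 3500));
}
```

With a real index type, each map value would presumably be one index built from that chromosome's intervals instead of a plain `Vec`.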
