#dna #bioinformatics #sequencing #mismatch

disambiseq

Create unambiguous one-off mismatch libraries for DNA sequences

11 releases

0.1.10 Jun 15, 2023
0.1.9 Oct 12, 2022

MIT license

Creates unambiguous nucleotide mismatch libraries for a set of nucleotide sequences.

Usage

I've rewritten this functionality a few times for different use cases and put it into a standalone crate since it might be useful to others.

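The crate generates an unambiguous one-off mismatch library for a set of DNA sequences: each single-base substitution of an input (parent) sequence is mapped back to that parent, and any mismatch that could have come from more than one parent is set aside as ambiguous.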
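To make the idea concrete, here is a minimal standalone sketch of how such a library could be built. It is not the crate's actual implementation: the function name `build_mismatch_library` and the returned tuple are hypothetical, and it assumes uppercase ASCII sequences over the alphabet A, C, G, T.

```rust
use std::collections::{HashMap, HashSet};

const BASES: [u8; 4] = [b'A', b'C', b'G', b'T'];

/// Hypothetical helper (not part of disambiseq): returns a map from each
/// unambiguous one-off mismatch to its parent, plus the set of mismatches
/// reachable from more than one parent.
fn build_mismatch_library(parents: &[&str]) -> (HashMap<String, String>, HashSet<String>) {
    let parent_set: HashSet<&str> = parents.iter().copied().collect();
    let mut unambiguous: HashMap<String, String> = HashMap::new();
    let mut ambiguous: HashSet<String> = HashSet::new();

    for &parent in parents {
        let bytes = parent.as_bytes();
        for pos in 0..bytes.len() {
            for &base in BASES.iter() {
                if base == bytes[pos] {
                    continue; // not a substitution
                }
                let mut mutant = bytes.to_vec();
                mutant[pos] = base;
                let mutant = String::from_utf8(mutant).expect("input assumed to be ASCII");

                // Skip mismatches that are themselves parents or already known to be ambiguous.
                if parent_set.contains(mutant.as_str()) || ambiguous.contains(&mutant) {
                    continue;
                }

                // Does an entry already exist, and does it point at a different parent?
                let conflict = unambiguous.get(&mutant).map(|p| p.as_str() != parent);
                match conflict {
                    Some(true) => {
                        // Reachable from two different parents: demote to ambiguous.
                        unambiguous.remove(&mutant);
                        ambiguous.insert(mutant);
                    }
                    Some(false) => {} // duplicate parent in the input, nothing to do
                    None => {
                        unambiguous.insert(mutant, parent.to_string());
                    }
                }
            }
        }
    }
    (unambiguous, ambiguous)
}
```

Calling `build_mismatch_library(&["ACT", "AGT"])` in this sketch yields 12 unambiguous mismatches and leaves `AAT` and `ATT` as ambiguous, since both can be reached from either parent by changing the middle base.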

Creating a new unambiguous set

use disambiseq::Disambiseq;

let sequences = vec![
    "ACT".to_string(),
    "AGT".to_string()
];
let dsq = Disambiseq::from_slice(&sequences);
println!("{:#?}", dsq);

Visualizing the set

Disambiseq {
    unambiguous: {
        "TCT": "ACT",
        "ACA": "ACT",
        "CCT": "ACT",
        "ACC": "ACT",
        "CGT": "AGT",
        "GGT": "AGT",
        "AGA": "AGT",
        "GCT": "ACT",
        "ACG": "ACT",
        "TGT": "AGT",
        "AGC": "AGT",
        "AGG": "AGT",
    },
    parents: {
        "AGT",
        "ACT",
    },
    ambiguous: {
        "ATT",
        "AAT",
    },
}

Querying the set

use disambiseq::Disambiseq;

let sequences = vec![
    "ACT".to_string(),
    "AGT".to_string()
];
let dsq = Disambiseq::from_slice(&sequences);

// retrieve a parental sequence
assert_eq!(dsq.get_parent("ACT"), Some(&"ACT".to_string()));

// retrieve a mutation sequence's parent
assert_eq!(dsq.get_parent("TCT"), Some(&"ACT".to_string()));

// exclude sequences with ambiguous parents
assert_eq!(dsq.get_parent("AAT"), None);
assert_eq!(dsq.get_parent("ATT"), None);
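A typical use of these queries is assigning sequencing reads to their parent sequences. The sketch below is one way that could look. The helper `count_by_parent` is hypothetical and not part of the crate. It takes the lookup as a closure, so it compiles on its own, and it assumes the lookup has the same shape as `get_parent` above (a `&str` in, an `Option<&String>` out).

```rust
use std::collections::HashMap;

/// Hypothetical helper: counts reads per parent using any lookup that
/// resolves a read to its parent, or returns None when it cannot.
/// Returns the per-parent counts and the number of unassigned reads.
fn count_by_parent<'a, F>(reads: &[&str], lookup: F) -> (HashMap<String, usize>, usize)
where
    F: Fn(&str) -> Option<&'a String>,
{
    let mut counts = HashMap::new();
    let mut unassigned = 0;
    for &read in reads {
        match lookup(read) {
            // Exact parents and unambiguous mismatches both resolve to a parent.
            Some(parent) => *counts.entry(parent.clone()).or_insert(0) += 1,
            // Ambiguous or unknown reads are tallied separately.
            None => unassigned += 1,
        }
    }
    (counts, unassigned)
}
```

With a `Disambiseq` value `dsq` built as above, the lookup would presumably be passed as `|read| dsq.get_parent(read)`.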
