
noodles-sam

Sequence Alignment/Map (SAM) format reader and writer

28 releases (breaking)

new 0.23.0 Feb 3, 2023
0.22.1 Nov 29, 2022
0.21.0 Oct 28, 2022
0.17.0 Jul 5, 2022
0.2.0 Jul 30, 2021

#109 in Science


796 downloads per month
Used in 12 crates (4 directly)

MIT license

690KB
15K SLoC

noodles-sam handles the reading and writing of the SAM (Sequence Alignment/Map) format.

SAM is a format typically used to store biological sequences, either mapped to a reference sequence or unmapped. It has two sections: a header and a list of records.

The header mostly holds metadata about the data: a header line describing the file format version, the reference sequences that reads map to, the read groups that reads belong to, the programs that previously manipulated the data, and free-form comments. The header is optional and may be empty.
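To make the header layout concrete, here is a minimal standard-library sketch. It does not use the noodles-sam API. It assumes the plain-text layout from the SAM specification, in which header lines start with `@` followed by a two-letter record type code (`@HD`, `@SQ`, `@RG`, `@PG`, `@CO`). The names `HeaderLineKind`, `classify_header_line`, and `split_sections` are hypothetical and exist only for illustration.

```rust
/// The kinds of header lines defined by the SAM specification.
/// (Hypothetical type for illustration; not part of noodles-sam.)
#[derive(Debug, PartialEq, Eq)]
enum HeaderLineKind {
    Header,            // @HD: file format version, sort order
    ReferenceSequence, // @SQ: a reference sequence reads may map to
    ReadGroup,         // @RG: a read group reads may belong to
    Program,           // @PG: a program that manipulated the data
    Comment,           // @CO: a free-form comment
}

/// Maps a raw header line to its kind using the two-letter record type
/// code. Returns `None` if the line is not a recognized header line.
fn classify_header_line(line: &str) -> Option<HeaderLineKind> {
    // A header line is '@' + code, followed by tab-separated content.
    let code = line.strip_prefix('@')?.split('\t').next()?;

    match code {
        "HD" => Some(HeaderLineKind::Header),
        "SQ" => Some(HeaderLineKind::ReferenceSequence),
        "RG" => Some(HeaderLineKind::ReadGroup),
        "PG" => Some(HeaderLineKind::Program),
        "CO" => Some(HeaderLineKind::Comment),
        _ => None,
    }
}

/// Splits raw SAM text into (header lines, record lines).
/// Simplification: this does not check that the header precedes the records.
fn split_sections(text: &str) -> (Vec<&str>, Vec<&str>) {
    text.lines()
        .filter(|line| !line.is_empty())
        .partition(|line| line.starts_with('@'))
}
```

Given SAM text with an empty header, `split_sections` simply returns an empty first vector, which matches the statement that the header is optional.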

Each record represents a read, a linear alignment of a segment. Records have fields describing how a read was mapped (or not) to a reference sequence.
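As an illustration of what a record's fields look like in the text format, the following standard-library sketch parses the first six of the 11 mandatory tab-separated columns defined by the SAM specification and checks flag bit `0x4`, which marks a segment as unmapped. It is not how noodles-sam models records; `RawRecord` and `parse_record` are hypothetical names, and validation is deliberately minimal.

```rust
/// The first six mandatory SAM record fields, named as in the SAM
/// specification. (Hypothetical type for illustration; not part of
/// noodles-sam.)
#[derive(Debug, PartialEq, Eq)]
struct RawRecord<'a> {
    qname: &'a str, // read name
    flag: u16,      // bitwise flags
    rname: &'a str, // reference sequence name, or "*" if unavailable
    pos: u32,       // 1-based leftmost mapping position, 0 if unavailable
    mapq: u8,       // mapping quality
    cigar: &'a str, // CIGAR string, or "*" if unavailable
}

impl RawRecord<'_> {
    /// Flag bit 0x4 means the segment is unmapped.
    fn is_unmapped(&self) -> bool {
        (self.flag & 0x4) != 0
    }
}

/// Parses one record line. Returns `None` if the line has fewer than the
/// 11 mandatory columns or if a numeric column fails to parse.
fn parse_record(line: &str) -> Option<RawRecord<'_>> {
    let fields: Vec<&str> = line.split('\t').collect();

    if fields.len() < 11 {
        return None;
    }

    Some(RawRecord {
        qname: fields[0],
        flag: fields[1].parse().ok()?,
        rname: fields[2],
        pos: fields[3].parse().ok()?,
        mapq: fields[4].parse().ok()?,
        cigar: fields[5],
    })
}
```

A mapped read carries a reference name and a position, while an unmapped read typically uses the placeholder `*` for the reference name and sets the `0x4` flag.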

Examples

Read all records from a file

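use std::{fs::File, io::BufReader};

use noodles_sam as sam;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open the file and wrap it in a buffered SAM reader.
    let mut reader = File::open("sample.sam")
        .map(BufReader::new)
        .map(sam::Reader::new)?;

    // Read the raw header text, then parse it into a typed header.
    let header = reader.read_header()?.parse()?;

    // Each item is a result, so I/O and parse errors surface per record.
    for result in reader.records(&header) {
        let record = result?;
        // ...
    }

    Ok(())
}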
