#bioinformatics #sequencing #dna

fxread

A barebones fastx reader for rust

19 releases

0.2.12 Jan 3, 2024
0.2.11 Oct 24, 2023
0.2.10 Jul 28, 2023
0.2.5 Sep 28, 2022
0.1.8 Jul 21, 2022

#180 in Biology

156 downloads per month
Used in 5 crates

MIT license

57KB
1.5K SLoC

Summary

This crate aims to be a faster, more lightweight alternative to bio-rs and provides a standardized interface for working with fasta and fastq formats. The goal is to be fast and flexible: on average it is about twice as fast as bio-rs but about half as fast as fastq for standard fastx files. The gap between the crates narrows considerably once gzip files are included (see benchmark).

The speedup comes from reducing the total number of vectors allocated for each record. The limitation compared to fastq is that each record owns its data, which is allocated once per record. This creates extra overhead, but it is very convenient because you can treat the reader directly as an iterator.
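To make the trade-off above concrete, here is a minimal sketch of an iterator that yields owned records. It is not fxread's implementation: `OwnedRecord` and `TwoLineFasta` are hypothetical names, the input is assumed to be two-line fasta (one header line, one sequence line), and read errors simply end the iteration.

```rust
use std::io::BufRead;

/// Hypothetical owned record; not fxread's actual `Record` type.
#[derive(Debug, PartialEq)]
pub struct OwnedRecord {
    pub id: String,
    pub seq: String,
}

/// Hypothetical reader over two-line fasta entries.
pub struct TwoLineFasta<R: BufRead> {
    inner: R,
}

impl<R: BufRead> TwoLineFasta<R> {
    pub fn new(inner: R) -> Self {
        Self { inner }
    }
}

impl<R: BufRead> Iterator for TwoLineFasta<R> {
    type Item = OwnedRecord;

    fn next(&mut self) -> Option<OwnedRecord> {
        let mut header = String::new();
        // Zero bytes read means end of input; an I/O error also stops iteration.
        if self.inner.read_line(&mut header).ok()? == 0 {
            return None;
        }
        let mut seq = String::new();
        self.inner.read_line(&mut seq).ok()?;
        // Each record gets its own freshly allocated strings, so it can
        // outlive the reader's internal buffer.
        Some(OwnedRecord {
            id: header.trim_end().trim_start_matches('>').to_string(),
            seq: seq.trim_end().to_string(),
        })
    }
}
```

Because every item owns its data, the records can be collected, filtered, or sent to another thread with ordinary iterator adaptors, at the cost of allocating for each record.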

Usage

One benefit of this interface is that FastaReader and FastqReader both share the FastxReader trait and act as iterators over Records.
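The idea of a shared trait over record iterators can be sketched in isolation. The names below (`Rec`, `RecordSource`, `total_bases`) are hypothetical stand-ins and are not part of fxread; the point is only that downstream code can accept a trait object without knowing which format produced it.

```rust
/// Hypothetical record type; fxread's own record type may differ.
pub struct Rec {
    pub seq: Vec<u8>,
}

/// Hypothetical trait standing in for a shared reader trait.
pub trait RecordSource: Iterator<Item = Rec> {}

/// Any iterator over `Rec` qualifies, whichever format it parses.
impl<T: Iterator<Item = Rec>> RecordSource for T {}

/// Downstream code takes the trait object and never inspects the format.
pub fn total_bases(reader: Box<dyn RecordSource>) -> usize {
    reader.map(|r| r.seq.len()).sum()
}
```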

initialize_reader can determine the file format (fasta or fastq) from the path name:

use fxread::initialize_reader;

let path = "example/sequences.fq";
let reader = initialize_reader(path).unwrap();
assert_eq!(reader.count(), 10);
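How a format might be inferred from a path name can be sketched as follows. This is an assumption-laden illustration, not fxread's code: `Format` and `guess_format` are hypothetical names, and the accepted extensions (`fa`, `fasta`, `fq`, `fastq`) are chosen for the example.

```rust
use std::path::Path;

/// Hypothetical format tag for this sketch.
#[derive(Debug, PartialEq)]
pub enum Format {
    Fasta,
    Fastq,
}

/// Guess the format from the file name, ignoring a trailing `.gz`.
/// The accepted extensions are an assumption made for this example.
pub fn guess_format(path: &str) -> Option<Format> {
    let name = Path::new(path).file_name()?.to_str()?;
    let stem = name.strip_suffix(".gz").unwrap_or(name);
    // Take whatever follows the last dot as the extension.
    match stem.rsplit_once('.')?.1 {
        "fa" | "fasta" => Some(Format::Fasta),
        "fq" | "fastq" => Some(Format::Fastq),
        _ => None,
    }
}
```

Returning `Option` leaves the caller to decide what to do with an unrecognized extension rather than guessing.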

initialize_reader also handles gzip-compressed files without any change to the downstream usage:

use fxread::initialize_reader;

let path = "example/sequences.fq.gz";
let reader = initialize_reader(path).unwrap();
assert_eq!(reader.count(), 10);
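The text does not say how gzip input is recognized. One common approach, shown here purely as an assumption, is to peek at the first two bytes of the stream and compare them with the gzip magic number from RFC 1952. `looks_gzipped` is a hypothetical helper and is not part of fxread.

```rust
use std::io::{self, BufRead};

/// First two bytes of every gzip member (RFC 1952).
const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];

/// Peek at the buffered input without consuming it.
/// Assumes the reader's buffer holds at least two bytes when available.
pub fn looks_gzipped<R: BufRead>(reader: &mut R) -> io::Result<bool> {
    let buf = reader.fill_buf()?;
    Ok(buf.starts_with(&GZIP_MAGIC))
}
```

Because `fill_buf` does not consume anything, the same reader can then be handed to either a decompressor or a plain-text parser.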

See the API documentation for further usage.

Dependencies

~430KB