16 releases

Version | Date
---|---
0.5.3 | Mar 1, 2024
0.5.1 |
0.5.0 |
0.3.1 | Jun 20, 2022
0.2.6 | Nov 16, 2021
kseq
`kseq` is a simple FASTA/FASTQ (fastx) parser library for Rust. Its main function is to iterate over the records in fastx files, similar to `kseq` in C. It uses a shared buffer to read and store records, so parsing is fast. It supports plain or gz fastx files, `io::stdin`, and fofn (file-of-file-names) files, which list multiple plain or gz fastx files, one per line.
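To make the fofn layout concrete, here is a minimal sketch of how such a file could be expanded into a list of paths. The helper `fofn_paths` is hypothetical and not part of `kseq`; skipping blank lines and trimming whitespace are assumptions, not documented behavior of the crate.

```rust
use std::path::PathBuf;

/// Hypothetical helper (not part of kseq): turn the text of a fofn file
/// into a list of paths, one per non-empty line.
fn fofn_paths(contents: &str) -> Vec<PathBuf> {
    contents
        .lines()
        .map(str::trim) // assumption: surrounding whitespace is not significant
        .filter(|line| !line.is_empty()) // assumption: blank lines are skipped
        .map(PathBuf::from)
        .collect()
}
```

Each returned path would then be opened in turn as a plain or gz fastx file.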
Using `kseq` is very simple. Call `parse_path` to parse a path or `parse_reader` to parse a reader, then use the `iter_record` method to get each record.
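The difference between a path and a reader can be illustrated with a small standard-library sketch that maps a command-line argument to a boxed reader, treating `-` as standard input. The function `open_input` is hypothetical and only illustrates the idea; it does not show how `kseq` is implemented, and it does not handle gz or fofn input.

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};

/// Hypothetical helper (not part of kseq): map a command-line argument to a
/// reader, treating "-" as standard input and anything else as a file path.
fn open_input(arg: &str) -> io::Result<Box<dyn Read>> {
    if arg == "-" {
        Ok(Box::new(io::stdin()))
    } else {
        // File::open fails here if the path is missing or unreadable.
        Ok(Box::new(BufReader::new(File::open(arg)?)))
    }
}
```

Since the result implements `std::io::Read`, a reader built this way is the kind of value `parse_reader` is described as accepting. In practice `parse_path` already covers the `-` case, so a helper like this is only needed for custom input sources.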
- `parse_path`
  This function takes a path that implements `AsRef<std::path::Path>` as input. The path can be a `fastx` file, `-` for `io::stdin`, or a `fofn` file. It returns a `Result` type:
  - `Ok(T)`: A struct `T` with the `iter_record` method.
  - `Err(E)`: An error `E`, such as missing input, failure to open or read the input, wrong fastx format, or an invalid path or file.
- `parse_reader`
  This function takes a reader that implements `std::io::Read` as input. It returns a `Result` type:
  - `Ok(T)`: A struct `T` with the `iter_record` method.
  - `Err(E)`: An error `E`, such as missing input, failure to open or read the input, wrong fastx format, or an invalid path or file.
- `iter_record`
  This function can be called in a loop. It returns a `Result<Option<Record>>` type:
  - `Ok(Some(Record))`: A struct `Record` with methods:
    - `head -> &str`: get sequence id/identifier
    - `seq -> &str`: get sequence
    - `des -> &str`: get sequence description/comment
    - `sep -> &str`: get separator
    - `qual -> &str`: get quality scores
    - `len -> usize`: get sequence length

    Note: calling `des`, `sep` or `qual` returns `""` if the `Record` doesn't have that attribute.
  - `Ok(None)`: The stream has reached `EOF`.
  - `Err(ParseError)`: A `ParseError`, one of `IO`, `TruncateFile`, `InvalidFasta` or `InvalidFastq`.
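The `Result<Option<Record>>` contract and the shared-buffer idea can be sketched with the standard library alone. The types `FastaReader` and `FastaRecord` below are hypothetical and are not `kseq`'s implementation. The sketch assumes single-line FASTA records and reports errors as `std::io::Error` rather than `ParseError`.

```rust
use std::io::{self, BufRead};

/// Hypothetical borrowed view of one record; kseq's `Record` offers more.
struct FastaRecord<'a> {
    head: &'a str,
    seq: &'a str,
}

/// Hypothetical reader for single-line FASTA records. The two String
/// buffers are cleared and reused for every record instead of reallocated.
struct FastaReader<R: BufRead> {
    reader: R,
    head: String,
    seq: String,
}

impl<R: BufRead> FastaReader<R> {
    fn new(reader: R) -> Self {
        FastaReader { reader, head: String::new(), seq: String::new() }
    }

    /// Returns Ok(Some(record)) for a record, Ok(None) at EOF,
    /// and Err for malformed or truncated input.
    fn next_record(&mut self) -> io::Result<Option<FastaRecord<'_>>> {
        self.head.clear();
        self.seq.clear();
        if self.reader.read_line(&mut self.head)? == 0 {
            return Ok(None); // clean EOF before a new record starts
        }
        if !self.head.starts_with('>') {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "expected '>'"));
        }
        if self.reader.read_line(&mut self.seq)? == 0 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "truncated record"));
        }
        Ok(Some(FastaRecord {
            head: self.head[1..].trim_end(), // drop '>' and the newline
            seq: self.seq.trim_end(),
        }))
    }
}
```

Because each record borrows from the reader's buffers, it has to be used before the next call, which is why the pattern is a `while let Some(record) = reader.next_record()? { ... }` loop rather than a standard `Iterator`.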
Example
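use std::env::args;
use kseq::parse_path;

fn main() {
    // First command-line argument: a fastx file, a fofn file, or "-" for stdin.
    let path: String = args().nth(1).unwrap();
    let mut records = parse_path(path).unwrap();
    // Alternative for any reader (needs `use std::fs::File;` and `use kseq::parse_reader;`):
    // let mut records = parse_reader(File::open(path).unwrap()).unwrap();
    // iter_record returns Ok(None) at EOF, which ends the loop.
    while let Some(record) = records.iter_record().unwrap() {
        println!("head:{} des:{} seq:{} qual:{} len:{}",
            record.head(), record.des(), record.seq(),
            record.qual(), record.len());
    }
}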
Installation
cargo add kseq
Benchmarking
cargo bench