uniprot

Rust data structures and parser for the Uniprot database(s)

6 releases (3 breaking)

new 0.4.0 Jul 24, 2021
0.3.1 Jan 19, 2020
0.2.0 Jan 18, 2020
0.1.1 Jan 15, 2020

#480 in Parser implementations

MIT license

2MB
4K SLoC

uniprot.rs

Rust data structures and parser for the UniprotKB database(s).

🔌 Usage

The uniprot::uniprot::parse function can be used to obtain an iterator over the entries of a UniprotKB database in XML format (either SwissProt or TrEMBL).

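extern crate uniprot;

// Open the XML file and wrap it in a buffered reader.
let f = std::fs::File::open("tests/uniprot.xml")
    .map(std::io::BufReader::new)
    .unwrap();

// Each item yielded by the iterator must be unwrapped (or handled) first.
for r in uniprot::uniprot::parse(f) {
    let entry = r.unwrap();
    // ... process the Uniprot entry ...
}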

XML files for UniRef and UniParc can also be parsed, with uniprot::uniref::parse and uniprot::uniparc::parse, respectively.

Any BufRead implementor can be used as input, so database files can be streamed directly from their online location using an HTTP library such as reqwest or the ftp library.
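To illustrate what "any BufRead implementor" means in practice, here is a minimal sketch that uses only the standard library. The count_entries function is a hypothetical stand-in for a parser entry point and is not part of the uniprot crate; it simply counts lines that open an `<entry` element. The point is that a byte slice, a Cursor over owned bytes (for example a downloaded response body), and a BufReader wrapping any Read all satisfy the same bound.

```rust
use std::io::{self, BufRead, BufReader, Cursor};

/// Hypothetical stand-in for a parser entry point (NOT the uniprot API):
/// counts the lines that open an `<entry` element.
fn count_entries<B: BufRead>(reader: B) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        // `</entry>` does not match, since it starts with `</`.
        if line?.trim_start().starts_with("<entry") {
            count += 1;
        }
    }
    Ok(count)
}

/// Feeds the same XML text through three different BufRead implementors.
fn count_from_three_sources(xml: &str) -> io::Result<(usize, usize, usize)> {
    // A byte slice implements BufRead directly.
    let from_slice = count_entries(xml.as_bytes())?;
    // A Cursor over owned bytes, such as a buffered download, also does.
    let from_cursor = count_entries(Cursor::new(xml.as_bytes().to_vec()))?;
    // Any plain Read can be adapted by wrapping it in a BufReader.
    let from_reader = count_entries(BufReader::new(xml.as_bytes()))?;
    Ok((from_slice, from_cursor, from_reader))
}
```

The same pattern should carry over to the real parsers: whatever reader the HTTP or FTP client hands back, wrapping it in std::io::BufReader is usually enough to satisfy the BufRead bound.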

See the online documentation at docs.rs for more examples, and some details about the different features available.

📝 Features

  • threading (enabled by default): compiles the multithreaded parser, which offers a 90% speed increase when processing XML files.

🤝 Credits

uniprot.rs is developed and maintained by:

📋 Changelog

This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.

📜 License

This library is provided under the open-source MIT license.

Dependencies

~3–4MB
~100K SLoC