#bioinformatics #ensembl #ncbi #uniprot #enrichr

bin+lib ggetrs

Efficient querying of biological databases from the command line

13 releases

Uses new Rust 2021

0.1.69 Nov 7, 2022
0.1.68 Nov 4, 2022

#268 in Science

MIT license

305KB
7.5K SLoC

ggetrs


Introduction

What is ggetrs?

ggetrs is a free, open-source command-line tool that enables efficient querying of genomic databases. It consists of a collection of separate but interoperable modules, each designed to facilitate one type of database querying in a single line of code.

This is a Rust reimplementation of the original Python-based program gget. It was rewritten to take advantage of Rust's powerful HTTP and asynchronous functionality for a faster user experience.

There are some minor syntactic differences in function calls compared with the original gget; a description of each tool is provided on the modules page.

Installation

The command-line tool is distributed via crates.io and can be installed with cargo, the Rust package manager.

cargo install ggetrs

Alternative installation instructions, as well as instructions for installing the Python module, can be found on the install page.

Documentation

Command Line and Python

All usage and function documentation for the command-line and Python utilities can be found on the ggetrs homepage.

Rust

ggetrs is implemented as a Rust library as well as a standalone binary, so all documentation for API calls and data structures can be found on docs.rs.

Contributing

This project is intended to be open-source and contributions are very welcome!

If you are new to Rust or to open source in general but still want to contribute, please don't hesitate to reach out! I would be more than happy to help guide you through building your first module.

All new additions must pass the existing tests and follow current testing standards.

FAQ

What makes this different than gget?

ggetrs takes advantage of Rust's powerful asynchronous features and lets you perform a large number of HTTP requests without increasing wait times.

Because it is also a compiled program, there is no start-up time between commands, and you can run your favorite tool in a for loop with no overhead.

However, ggetrs stays true to the original gget mindset and tries to make usage as simple as possible no matter the interface (from Python to CLI).
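
The point about many HTTP requests not increasing wait times can be illustrated with a small sketch. This is not ggetrs code and does not use the ggetrs API: it is a generic Python asyncio example in which fake_query is a hypothetical stand-in that only sleeps to simulate network latency, and the gene identifiers are made up. It shows the general idea that requests issued concurrently take roughly as long as the slowest one, rather than the sum of all of them.

```python
import asyncio
import time


async def fake_query(gene_id, latency=0.1):
    # Hypothetical stand-in for one HTTP request: it only sleeps to
    # simulate network latency and never contacts a real database.
    await asyncio.sleep(latency)
    return f"result for {gene_id}"


async def query_all(gene_ids, latency=0.1):
    # Start every simulated request at once and wait for all of them.
    # gather returns the results in the same order as the inputs.
    return await asyncio.gather(*(fake_query(g, latency) for g in gene_ids))


def timed_run(gene_ids, latency=0.1):
    # Run the whole batch and measure the wall-clock time it took.
    start = time.perf_counter()
    results = asyncio.run(query_all(gene_ids, latency))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    ids = [f"GENE{i}" for i in range(20)]  # made-up identifiers
    results, elapsed = timed_run(ids)
    print(f"{len(results)} simulated requests in {elapsed:.2f}s")
```

With 20 simulated requests of 0.1 s each, a sequential loop would need about 2 s, while the concurrent version should finish in a small fraction of that. Real network requests are also limited by server rate limits and bandwidth, so actual gains will vary.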

Does this have functions that gget doesn't?

We're working on having both tools mirror each other's functionality. Currently, ggetrs additionally provides the Chembl bioactivity database, more endpoints from the Ensembl API, and direct queries to NCBI and Uniprot.

Does gget have functions that ggetrs does not?

ggetrs will likely not support the gget muscle and gget alphafold functionalities for the time being. The reason is that these are wrappers around existing binaries rather than HTTP requests.

Do I need to know rust to use this tool?

No. This tool is written entirely in Rust but provides a Python interface through pyo3. Not every tool has a Python API yet, but the remaining ones are planned.

All of the currently supported gget modules have their Python API implemented.

Dependencies

~20–31MB
~619K SLoC