105 releases (30 breaking)
Version | Released |
---|---|
0.32.2 | Sep 4, 2024 |
0.31.0 | Aug 17, 2024 |
0.29.1 | Jul 16, 2024 |
0.15.0 | Mar 27, 2024 |
0.2.6 | Jul 31, 2023 |
#60 in Biology
172 downloads per month
Used in 3 crates
4MB, 27K SLoC
Exon is an execution engine designed to work with bioinformatics data. It features:
- SQL-based access to bioinformatics data -- general DML and some DDL support
- Support for many file formats from bioinformatics, proteomics, and others
- Local filesystem and object storage support
- Arrow FFI primitives for multi-language support
Installation
Exon is available via crates.io. To install, run:
cargo add exon
Documentation
Related Projects
Benchmarks
Please see the benchmarks README for more information.
lib.rs:
Exon is a library to facilitate open-ended analysis of scientific data, ease the application of ML models, and provide a common data interface for science and engineering teams.
Overview
The main interface for users is DataFusion's `SessionContext` plus the `ExonSessionExt` extension trait, which adds a number of convenience methods for loading data from various sources. See the `read_*` methods on `ExonSessionExt` for more information, for example `read_fasta` or `read_gff`. There's also a `read_inferred_exon_table` method that attempts to infer the data type and compression from the file extension for ease of use.
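To make the idea of extension-based inference concrete, here is a minimal, standard-library-only sketch. It is not Exon's implementation: the `FileKind` and `Compression` enums, the `infer_from_path` function, and the specific extensions it recognizes are all hypothetical names chosen for illustration. It shows one plausible approach, which is to peel off a compression suffix first and then match the remaining extension against known formats.

```rust
use std::path::Path;

/// Hypothetical compression kinds (illustrative only, not Exon's types).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Compression {
    None,
    Gzip,
    Zstd,
}

/// Hypothetical file kinds (illustrative only, not Exon's types).
#[derive(Debug, PartialEq, Clone, Copy)]
enum FileKind {
    Fasta,
    Fastq,
    Gff,
    Vcf,
}

/// Infer the file kind and compression from a path's extension(s).
/// Returns `None` when the extension is missing or not recognized.
fn infer_from_path(path: &str) -> Option<(FileKind, Compression)> {
    // Work on the lowercased file name so matching is case-insensitive.
    let name = Path::new(path).file_name()?.to_str()?.to_ascii_lowercase();

    // Peel off a trailing compression suffix, if any.
    let (stem, compression) = if let Some(s) = name.strip_suffix(".gz") {
        (s, Compression::Gzip)
    } else if let Some(s) = name.strip_suffix(".zst") {
        (s, Compression::Zstd)
    } else {
        (name.as_str(), Compression::None)
    };

    // Whatever follows the last remaining dot is the format extension.
    let ext = stem.rsplit_once('.')?.1;
    let kind = match ext {
        "fasta" | "fa" | "fna" => FileKind::Fasta,
        "fastq" | "fq" => FileKind::Fastq,
        "gff" | "gff3" => FileKind::Gff,
        "vcf" => FileKind::Vcf,
        _ => return None,
    };
    Some((kind, compression))
}

fn main() {
    for p in ["genome.fasta", "reads.fq.gz", "annotations.gff.zst", "notes.txt"] {
        println!("{p}: {:?}", infer_from_path(p));
    }
}
```

Returning `None` for unknown extensions, rather than guessing, leaves the caller free to fall back to an explicit `read_*` method.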
To facilitate those methods, Exon implements a number of traits for DataFusion that serve as a good base for scientific data work. See the `datasources` module for more information.
Dependencies
~95MB, ~1.5M SLoC