16 releases

0.9.4 Mar 1, 2023
0.9.3 Dec 28, 2022
0.9.2 Jul 30, 2022
0.9.1 May 31, 2021
0.5.0 Nov 30, 2020

#44 in Biology

279 downloads per month

Apache-2.0

3MB
396 lines

Contains (ELF lib, 1.5MB) libgenomicsqlite.so, (Mach-o library, 1.5MB) libgenomicsqlite.dylib

Genomics Extension for SQLite

("GenomicSQLite")

This SQLite3 loadable extension adds features to the ubiquitous embedded RDBMS to support applications in genome bioinformatics:

  • genomic range indexing for overlap queries & joins
  • in-SQL utility functions, e.g. reverse-complement DNA, parse "chr1:2,345-6,789"
  • automatic streaming storage compression (also available standalone)
  • reading directly from HTTP(S) URLs (also available standalone)
  • pre-tuned settings for "big data"
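
To make the first two bullets concrete, here is a minimal Python sketch that uses only the standard-library sqlite3 module, not the extension itself. The names revcomp_sketch, parse_range_sketch, overlapping and the features table are hypothetical. The plain B-tree index is only a stand-in for the extension's genomic range indexing. The sketch keeps coordinates exactly as written and treats ranges as closed intervals; the extension's own functions and conventions are described in the full documentation and may differ.

```python
import re
import sqlite3

# Complement table for unambiguous bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
# "chr1:2,345-6,789" -> sequence name, begin, end (commas allowed in numbers).
_RANGE_RE = re.compile(r"^([^:\s]+):([0-9,]+)-([0-9,]+)$")


def revcomp_sketch(dna):
    """Reverse-complement a DNA string (hypothetical helper)."""
    return dna.translate(_COMPLEMENT)[::-1]


def parse_range_sketch(text):
    """Split a range string into (chrom, begin, end), numbers as written."""
    m = _RANGE_RE.match(text)
    if m is None:
        raise ValueError(f"not a genomic range: {text!r}")
    begin = int(m.group(2).replace(",", ""))
    end = int(m.group(3).replace(",", ""))
    return m.group(1), begin, end


conn = sqlite3.connect(":memory:")
# Expose the Python helper as a SQL function, so it can be called in-SQL.
conn.create_function("revcomp_sketch", 1, revcomp_sketch)

conn.execute(
    "CREATE TABLE features(name TEXT, chrom TEXT, pos_begin INTEGER, pos_end INTEGER)"
)
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?, ?)",
    [
        ("geneA", "chr1", 1000, 2500),
        ("geneB", "chr1", 6000, 7000),
        ("geneC", "chr1", 9000, 9500),
        ("geneD", "chr2", 2000, 3000),
    ],
)
# Ordinary index; an assumption-free stand-in, NOT the extension's range index.
conn.execute("CREATE INDEX features_pos ON features(chrom, pos_begin)")


def overlapping(connection, range_text):
    """Names of features overlapping the query range (closed intervals)."""
    chrom, qbeg, qend = parse_range_sketch(range_text)
    rows = connection.execute(
        "SELECT name FROM features"
        " WHERE chrom = ? AND pos_begin <= ? AND pos_end >= ?"
        " ORDER BY pos_begin",
        (chrom, qend, qbeg),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(conn.execute("SELECT revcomp_sketch('AACG')").fetchone()[0])
    print(overlapping(conn, "chr1:2,345-6,789"))
```

Two intervals overlap when each begins no later than the other ends, which is what the WHERE clause tests. On large tables an ordinary index handles this condition poorly, which is the motivation for a dedicated range index.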

This November 2021 poster discusses the context and long-run ambitions:

GenomicSQLite Poster

Our Colab notebook demonstrates key features with Python, one of several language bindings.

USE AT YOUR OWN RISK: This project is not associated with the SQLite developers. The database storage extensions are designed to preserve ACID transaction safety, but they're young and unlikely to be totally bug-free.

Installation & Programming Guide

Start Here 👉 full documentation site

We supply the extension prepackaged for Linux and macOS on x86-64. An up-to-date version of SQLite itself is also required, as specified in the docs.
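
As a quick sanity check from Python, the sketch below compares the SQLite version linked into the interpreter against a minimum. The threshold 3.31.0 is an assumption borrowed from the build requirements listed under "Building from source"; the runtime requirement stated in the docs takes precedence. The helper name meets_minimum is hypothetical.

```python
import sqlite3

# Assumption: same minimum as the SQLite dev package needed to build from source.
MIN_SQLITE = (3, 31, 0)


def meets_minimum(version, minimum=MIN_SQLITE):
    """True if a (major, minor, patch) version tuple is at least `minimum`."""
    return tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    # sqlite_version_info is the version of the SQLite library Python is using.
    ok = meets_minimum(sqlite3.sqlite_version_info)
    print(f"SQLite {sqlite3.sqlite_version}: {'OK' if ok else 'too old'}")
```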

Programming language support:

  • C/C++
  • Python ≥3.6
  • Java & JVM languages
  • Rust

More to come. (Help wanted; see Language Bindings Guide)

Building from source

Most users will prefer to install a pre-built shared library (see above). To build from source, see our GitHub Actions YAML workflow (Ubuntu 20.04) or the Dockerfile (CentOS 7) used to build the more-portable releases. Briefly, you'll need:

  • C++11 build system
  • CMake ≥ 3.14
  • Dev packages: SQLite ≥ 3.31.0, Zstandard ≥ 1.3.4, libcurl

And incantations:

cmake -DCMAKE_BUILD_TYPE=Release -B build .
cmake --build build -j 4 --target genomicsqlite

...generating build/libgenomicsqlite.so. To run the test suite, you'll furthermore need:

  • htslib ≥ 1.9, samtools, and tabix
  • pigz
  • Python ≥ 3.6 and packages: pytest pytest-xdist pre-commit black pylint flake8
  • JDK, Maven (mvn), and Rust
  • clang-format & cppcheck

to:

pre-commit run --all-files  # formatters+linters
cmake -DCMAKE_BUILD_TYPE=Debug -B build .
cmake --build build -j 4
env -C build ctest -V
