12 releases

0.3.1 Jan 23, 2022
0.3.0 Jan 23, 2022
0.2.4 Jan 22, 2022
0.2.3 Aug 11, 2021
0.1.2 May 27, 2021

#1477 in Data structures

31 downloads per month

MIT license

2.5MB
4K SLoC

Docs Matrix CI Discussions

SDSL-RS

A Rust interface for the Succinct Data Structure Library (SDSL-lite).

Introduction

SDSL-lite is a C++11 library which implements succinct data structures. Example structures include: arbitrary width integer vectors, wavelet trees, and compressed suffix arrays. The library is commonly used within bioinformatics among other fields.

Many of the data structures provided by the library are defined using C++ templates. This poses a challenge when interfacing with the library from languages other than C++. The primary aim of SDSL-RS is to take on the heavy lifting of interfacing with the library from Rust!

Progress

The following list includes all structures which will be supported. The list was derived from the cheatsheet found here.

Checked entries in the list are currently supported. The near term goal is to support one data structure from each section.

Please contact the chat channel for prioritization requests.

Full Structures Status List

Integer vectors

  • IntVector

Bit vectors

  • BitVector (plain bit vector)
  • BitVectorIl (interleaved bit vector)
  • RrrVector (H0 compressed bit vector)
  • SdVector (sparse bit vector)
  • HybVector (hybrid bit vector)

Rank Supports

  • RankSupportV
  • RankSupportV5
  • RankSupportScan
  • RankSupportIl
  • RankSupportRrr
  • RankSupportSd
  • RankSupportHyb

Select Supports

  • SelectSupportMcl
  • SelectSupportScan
  • SelectSupportIl
  • SelectSupportRrr
  • SelectSupportSd

Wavelet Trees

  • WtHuff
  • WtInt
  • WtRlmn
  • WtGmr
  • WtAp
  • WmInt
  • WtBlcd
  • WtHutu

Compressed Suffix Arrays

  • CsaBitcompressed
  • CsaSada
  • CsaWt

Longest Common Prefix Arrays

  • LcpBitcompressed
  • LcpDac
  • LcpByte
  • LcpWt
  • LcpVlc
  • LcpSupportSada
  • LcpSupportTree
  • LcpSupportTree2

Balanced Parentheses Supports

  • BpSupportG
  • BpSupportGg
  • BpSupportSada

Compressed Suffix Trees

  • CstSada
  • CstSct3

Range Min/Max Query

  • RmqSupportSparseTable
  • RmqSuccintSada
  • RmqSuccintSct

Requirements

SDSL-lite

SDSL-lite must be compilable within the development environment. Requirements can be found here.

Commonly missing dependencies include libdivsufsort-dev.

SDSL-RS

SDSL-RS uses const generics and therefore may require the beta Rust toolchain.

Projects which use SDSL-RS must include a build script (build.rs) with contents such as:

// build.rs
fn main() {
    match sdsl::build() {
        Ok(_) => {}
        Err(e) => panic!("Error: {}", e),
    };
}

The project's Cargo.toml file must therefore include a build-dependencies section such as:

[dependencies]
sdsl = "0.3.0"
# ... other dependencies ...

[build-dependencies]
sdsl = "0.3.0"

The sdsl::build() function call allows SDSL-RS to analyse the current project's code base (via MIR) and build an appropriate interface in the top level target directory. The initial compilation of the project after adding SDSL-RS takes a while because SDSL-lite is compiled as a dependency. Subsequent compilations should be quick.

Examples

An example project can be found here. It contains examples for all supported data structures.

This example shows how to construct a H0 compressed bit vector (sdsl::bit_vectors::RrrVector):

let bv = sdsl::bit_vector! {1, 1, 0, 1};
let rv = sdsl::bit_vectors::RrrVector::<sdsl::int_vectors::IntVector<0>, 10, 2>::new(&bv)?;

let result = rv.get_bv_element(2);
let expected = 0;
assert_eq!(result, expected);

Dependencies

~7–16MB
~205K SLoC