#bioinformatics #dna

seq-hash

A SIMD-accelerated library to compute hashes of DNA sequences

3 unstable releases

Uses new Rust 2024

0.1.1 Oct 15, 2025
0.1.0 Oct 1, 2025
0.0.1 Sep 26, 2025

#531 in Biology


54 downloads per month
Used in 5 crates (3 directly)

MIT license

38KB
817 lines

seq-hash


A SIMD-accelerated library for iterating over k-mer hashes of DNA sequences, building on packed_seq. Building block for simd-minimizers.

Paper: Please cite the simd-minimizers paper, for which this crate was developed.

Requirements

This library supports AVX2 and NEON instruction sets. Make sure to set RUSTFLAGS="-C target-cpu=native" when compiling to use the instruction sets available on your architecture.

RUSTFLAGS="-C target-cpu=native" cargo run --release
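To confirm that the flag above took effect, one option is to ask the compiler which target features were enabled for the build. The sketch below uses only the standard `cfg!` macro; the function name `compiled_simd_backend` is hypothetical and not part of seq-hash.

```rust
/// Report which of the two instruction sets named above this binary was
/// compiled with. `cfg!(target_feature = "...")` is evaluated at compile
/// time, so the result reflects the RUSTFLAGS used for the build, not what
/// the CPU running the binary supports.
/// (Hypothetical helper for illustration; not part of seq-hash.)
fn compiled_simd_backend() -> &'static str {
    if cfg!(target_feature = "avx2") {
        "avx2"
    } else if cfg!(target_feature = "neon") {
        "neon"
    } else {
        "none"
    }
}
```

Printing the result of `compiled_simd_backend()` from a build made with and without `-C target-cpu=native` should show whether the flag changes the enabled feature set on a given machine.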

Usage example

Full documentation can be found on docs.rs.

use packed_seq::{PackedSeqVec, SeqVec};
use seq_hash::{KmerHasher, NtHasher};

let seq = b"ACGGCAGCGCATATGTAGT";
let packed_seq = PackedSeqVec::from_ascii(seq);

let k = 3;
// Default `NtHasher` is canonical.
let hasher = <NtHasher>::new(k);

// Consider a 'context' of a single kmer.
let hashes: Vec<_> = hasher.hash_kmers_simd(packed_seq.as_slice(), 1).collect();
// A sequence of length n contains n - (k - 1) k-mers.
assert_eq!(hashes.len(), seq.len() - (k - 1));
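For intuition about what a canonical k-mer hash is, here is a minimal scalar sketch in plain Rust with no dependencies. It is not the crate's implementation. It computes an ntHash-style hash naively for each window instead of rolling it or using SIMD. The per-base constants, the use of `min` to combine the two strands, and all function names are assumptions chosen for illustration.

```rust
/// Illustrative 64-bit constant per base (assumed values, not necessarily
/// the ones seq-hash uses).
fn base_hash(b: u8) -> u64 {
    match b {
        b'A' => 0x3c8b_fbb3_95c6_0474,
        b'C' => 0x3193_c185_62a0_2b4c,
        b'G' => 0x2032_3ed0_8257_2324,
        b'T' => 0x2955_49f5_4be2_4456,
        _ => panic!("expected one of A, C, G, T"),
    }
}

/// Watson-Crick complement of a single base.
fn complement(b: u8) -> u8 {
    match b {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        _ => panic!("expected one of A, C, G, T"),
    }
}

/// Reverse complement of a sequence.
fn revcomp(seq: &[u8]) -> Vec<u8> {
    seq.iter().rev().map(|&b| complement(b)).collect()
}

/// ntHash-style forward hash: XOR of the base constants, each rotated left
/// by its distance from the end of the k-mer.
fn forward_hash(kmer: &[u8]) -> u64 {
    let k = kmer.len() as u32;
    kmer.iter()
        .enumerate()
        .fold(0u64, |h, (i, &b)| h ^ base_hash(b).rotate_left(k - 1 - i as u32))
}

/// Canonical hash: the same value for a k-mer and its reverse complement.
/// Taking the minimum of the two strand hashes is one simple symmetric choice.
fn canonical_hash(kmer: &[u8]) -> u64 {
    forward_hash(kmer).min(forward_hash(&revcomp(kmer)))
}

/// Canonical hashes of all k-mers of `seq`, in order of position.
fn canonical_kmer_hashes(seq: &[u8], k: usize) -> Vec<u64> {
    assert!(k > 0, "k must be positive");
    seq.windows(k).map(canonical_hash).collect()
}
```

With this sketch, `canonical_kmer_hashes(b"ACGGCAGCGCATATGTAGT", 3)` yields `seq.len() - (k - 1)` values, and hashing the reverse complement of the sequence yields the same values in reverse order. That strand independence is what "canonical" refers to.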

Dependencies

~2MB
~44K SLoC