2 releases

0.2.2 Oct 12, 2023
0.2.1 Oct 10, 2023

#139 in Biology

GPL-3.0-or-later

62KB
2K SLoC

K-mers and associated operations.

This library provides functionality for extracting k-mers from sequences and manipulating them in useful ways. Each k-mer is stored in a single 64-bit unsigned integer (u64) at two bits per base, so k > 32 is not supported by this library.

K-mers (or q-grams in some computer science contexts) are k-length sequences of DNA/RNA "letters" represented as unsigned integers. Following usual practice, each letter is encoded in two bits:

  • "A" -> b00
  • "C" -> b01
  • "G" -> b10
  • "T" or "U" -> b11

This encoding has the nice property that complementary bases (A and T/U, C and G) are bitwise complements of each other.
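To make the encoding concrete, here is a minimal sketch in plain Rust. It does not use this crate's API: the function names `encode_base`, `encode_kmer`, `kmers`, and `reverse_complement` are hypothetical and chosen only for illustration. The sketch also assumes that the first base of a k-mer occupies the most significant bit pair, and that any character outside A/C/G/T/U restarts the sliding window; the library itself may make different choices.

```rust
/// Hypothetical helper (not the crate's API): map one base to its 2-bit code.
fn encode_base(b: u8) -> Option<u64> {
    match b.to_ascii_uppercase() {
        b'A' => Some(0b00),
        b'C' => Some(0b01),
        b'G' => Some(0b10),
        b'T' | b'U' => Some(0b11),
        _ => None, // ambiguity codes such as N have no 2-bit representation
    }
}

/// Pack up to 32 bases into a u64, first base in the most significant pair.
/// Returns None for an invalid base or a sequence longer than 32.
fn encode_kmer(seq: &[u8]) -> Option<u64> {
    if seq.len() > 32 {
        return None;
    }
    let mut x = 0u64;
    for &b in seq {
        x = (x << 2) | encode_base(b)?;
    }
    Some(x)
}

/// Extract all k-mers from a sequence with a rolling update:
/// shift in the new base and mask off the base that left the window.
fn kmers(seq: &[u8], k: usize) -> Vec<u64> {
    assert!((1..=32).contains(&k), "k must be between 1 and 32");
    let mask = u64::MAX >> (64 - 2 * k);
    let mut out = Vec::new();
    let mut x = 0u64;
    let mut valid = 0usize; // consecutive valid bases seen so far
    for &b in seq {
        match encode_base(b) {
            Some(code) => {
                x = ((x << 2) | code) & mask;
                valid += 1;
                if valid >= k {
                    out.push(x);
                }
            }
            None => {
                // Assumption: an invalid base restarts the window.
                x = 0;
                valid = 0;
            }
        }
    }
    out
}

/// Reverse complement: bitwise NOT complements every base at once,
/// then the k two-bit groups are read back in reverse order.
fn reverse_complement(x: u64, k: usize) -> u64 {
    assert!((1..=32).contains(&k), "k must be between 1 and 32");
    let mut fwd = !x;
    let mut rc = 0u64;
    for _ in 0..k {
        rc = (rc << 2) | (fwd & 0b11);
        fwd >>= 2;
    }
    rc
}
```

Under these assumptions, `encode_kmer(b"ACG")` gives `Some(0b000110)`, `kmers(b"ACGT", 3)` gives the codes for ACG and CGT, and `reverse_complement` of the ACG code with k = 3 gives the CGT code. The complement step is a single NOT precisely because of the bitwise-complement property described above.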

No runtime deps