12 stable releases (3 major)
Uses new Rust 2024
| Version | Released |
|---|---|
| 4.11.0 | Jul 26, 2025 |
| 4.10.0 | Feb 19, 2025 |
| 4.9.0 | Oct 12, 2024 |
| 4.0.0 | Jul 12, 2024 |
| 1.0.0 | Apr 21, 2022 |
bio-seq-derive
bio-seq-derive is a procedural macro crate that provides the Codec derive macro for the bio-seq library. It allows users to define custom bit-packed alphabets from an enum. The bit representation of the symbols is derived from the enum discriminants.
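To make the idea of discriminants doubling as bit patterns concrete, here is a minimal hand-written sketch. It is not the code the Codec derive generates: the `Base` enum, `WIDTH`, `to_bits`, `from_bits`, and `pack` are hypothetical names chosen for illustration, and the layout (first symbol in the least significant bits) is an assumption.

```rust
/// Hypothetical 2-bit alphabet; each discriminant is also the packed bit pattern.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Base {
    A = 0b00,
    C = 0b01,
    G = 0b10,
    T = 0b11,
}

/// Bits per symbol for this toy alphabet.
pub const WIDTH: u32 = 2;

impl Base {
    /// Encode a symbol as its discriminant.
    pub fn to_bits(self) -> u8 {
        self as u8
    }

    /// Decode a bit pattern, rejecting values outside the alphabet.
    pub fn from_bits(bits: u8) -> Option<Self> {
        match bits {
            0b00 => Some(Base::A),
            0b01 => Some(Base::C),
            0b10 => Some(Base::G),
            0b11 => Some(Base::T),
            _ => None,
        }
    }
}

/// Pack symbols into a u64, first symbol in the least significant bits
/// (an assumed layout for this sketch).
pub fn pack(symbols: &[Base]) -> u64 {
    assert!(symbols.len() as u32 * WIDTH <= 64, "sequence does not fit in a u64");
    symbols
        .iter()
        .enumerate()
        .fold(0u64, |acc, (i, s)| acc | (u64::from(s.to_bits()) << (i as u32 * WIDTH)))
}
```

With these definitions, `pack(&[Base::G, Base::C, Base::A])` yields `0b000110`. A derive macro spares the user from writing the `match` arms by hand and keeping them in sync with the enum.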
This crate also provides the dna!() and iupac!() macros that are reexported by bio-seq for declaring static sequences at compile time.
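As a rough illustration of what declaring a sequence at compile time can mean, the sketch below packs a DNA string in a `const fn`. It is not how dna!() is implemented: `pack_dna` is a hypothetical helper, and the 2-bit encoding (A=00, C=01, G=10, T=11, first base in the low bits) is an assumption.

```rust
/// Hypothetical compile-time packer: two bits per base, first base in the low bits.
pub const fn pack_dna(seq: &str) -> u64 {
    let bytes = seq.as_bytes();
    assert!(bytes.len() <= 32, "at most 32 bases fit in a u64");
    let mut packed = 0u64;
    let mut i = 0;
    // `for` loops are not allowed in const fn, so iterate with `while`.
    while i < bytes.len() {
        let bits = match bytes[i] {
            b'A' => 0b00u64,
            b'C' => 0b01,
            b'G' => 0b10,
            b'T' => 0b11,
            _ => panic!("not a DNA base"),
        };
        packed |= bits << (2 * i as u32);
        i += 1;
    }
    packed
}

/// Evaluated by the compiler; an invalid base here would fail the build.
pub const GCA: u64 = pack_dna("GCA");
```

Because `GCA` is a `const`, the packing work happens during compilation and a typo in the literal is reported as a build error rather than a runtime failure.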
You probably don't want to directly include this crate as a dependency.
Please refer to the bio-seq documentation for a complete guide on defining custom alphabets.
Features
- width attribute: Specify the number of bits required to represent each variant in the custom alphabet. Default is optimal.
- alt attribute: Define alternate bit representations for the same variant.
- display attribute: Set a custom character representation for a variant.
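The three attributes can be pictured with a hand-written equivalent. This is only a sketch of the behaviour they describe, not the macro's output: the `Sym` enum and its methods are hypothetical, and treating the discriminant as the canonical pattern when encoding is an assumption.

```rust
/// Hypothetical alphabet with a redundant encoding and a custom display character.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Sym {
    A = 0b00,
    B = 0b01,
    Stop = 0b10,
}

impl Sym {
    /// Bits per symbol: the role played by the width attribute.
    pub const WIDTH: u32 = 2;

    /// Decode, accepting 0b11 as a second pattern for `Stop`:
    /// the role played by the alt attribute.
    pub fn from_bits(bits: u8) -> Option<Self> {
        match bits {
            0b00 => Some(Sym::A),
            0b01 => Some(Sym::B),
            0b10 | 0b11 => Some(Sym::Stop),
            _ => None,
        }
    }

    /// Encoding emits the discriminant (assumed canonical in this sketch).
    pub fn to_bits(self) -> u8 {
        self as u8
    }

    /// Character form; `Stop` prints as '*': the role played by the display attribute.
    pub fn to_char(self) -> char {
        match self {
            Sym::A => 'A',
            Sym::B => 'B',
            Sym::Stop => '*',
        }
    }
}
```

Here both `0b10` and `0b11` decode to `Sym::Stop`, while `Sym::Stop.to_bits()` always returns `0b10`, so decoding is many-to-one and encoding stays deterministic.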
Usage
To derive a custom encoding, use the Codec derive macro as reexported in the bio-seq prelude:
use bio_seq::prelude::*;
Codecs can be annotated with #[repr(u8)] for convenient casting.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, Codec)]
#[width(6)]
pub enum Amino {
    #[alt(0b110110, 0b010110, 0b100110)]
    A = 0b000110, // GCA
    #[alt(0b111011)]
    C = 0b011011, // TGC
    #[alt(0b110010)]
    D = 0b010010, // GAC
    #[alt(0b100010)]
    E = 0b000010, // GAA
    #[alt(0b111111)]
    F = 0b011111, // TTC
    #[alt(0b101010, 0b011010, 0b111010)]
    G = 0b001010, // GGA
    #[alt(0b110001)]
    H = 0b010001, // CAC
    #[alt(0b011100, 0b111100)]
    I = 0b001100, // ATA
    #[alt(0b100000)]
    K = 0b000000, // AAA
    #[alt(0b001111, 0b101111, 0b111101, 0b011101, 0b101101)]
    L = 0b001101, // CTA
    M = 0b101100, // ATG
    #[alt(0b110000)]
    N = 0b010000, // AAC
    #[alt(0b010101, 0b100101, 0b110101)]
    P = 0b000101, // CCA
    #[alt(0b100001)]
    Q = 0b000001, // CAA
    #[alt(0b101000, 0b111001, 0b011001, 0b001001, 0b101001)]
    R = 0b001000, // AGA
    #[alt(0b110111, 0b010111, 0b000111, 0b100111, 0b111000)]
    S = 0b011000, // AGC
    #[alt(0b110100, 0b010100, 0b100100)]
    T = 0b000100, // ACA
    #[alt(0b011110, 0b111110, 0b101110)]
    V = 0b001110, // GTA
    W = 0b101011, // TGG
    #[alt(0b110011)]
    Y = 0b010011, // TAC
    #[display('*')]
    #[alt(0b001011, 0b100011)]
    X = 0b000011, // TAA (stop)
}
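The codon comments in the enum are consistent with each 6-bit value holding three 2-bit bases. The sketch below decodes such a value back into a codon string so the discriminants and alt patterns can be checked by hand. The encoding (A=00, C=01, G=10, T=11, first base in the two least significant bits) is inferred from those comments, and `codon_to_string` is a hypothetical helper that is not part of bio-seq.

```rust
/// Decode a 6-bit codon into its three bases, assuming A=00, C=01, G=10, T=11
/// with the first base stored in the two least significant bits.
pub fn codon_to_string(bits: u8) -> String {
    const BASES: [char; 4] = ['A', 'C', 'G', 'T'];
    (0..3u32)
        // Shift the i-th base down to the low bits and mask off the rest.
        .map(|i| BASES[usize::from((bits >> (2 * i)) & 0b11)])
        .collect()
}
```

Under this assumption, the discriminant of `A` (`0b000110`) decodes to GCA, and its alt patterns `0b110110`, `0b010110`, and `0b100110` decode to GCT, GCC, and GCG, which are the other alanine codons.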