10 stable releases (3 major)
Version | Released |
---|---|
4.9.0 | Oct 12, 2024 |
4.0.0 | Jul 12, 2024 |
3.9.0 | Jun 21, 2024 |
2.0.2 | May 14, 2024 |
1.0.0 | Apr 21, 2022 |
#234 in #bit
Used in 3 crates (via bio-seq)
19KB
323 lines
bio-seq-derive
bio-seq-derive is a procedural macro crate that provides the Codec derive macro for the bio-seq library. It allows users to define custom bit-packed alphabets from an enum. The bit representation of each symbol is derived from the enum discriminants.
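To make "bit-packed alphabet" concrete, here is a minimal hand-written sketch that uses only the standard library and not bio-seq itself. The `Base` enum, the `WIDTH` constant, and the `pack`/`unpack` helpers are hypothetical names chosen for illustration, and placing the first symbol in the lowest bits is an assumption, not a statement about how bio-seq stores sequences.

```rust
/// Hypothetical 2-bit alphabet: each discriminant is the symbol's bit pattern.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Base {
    A = 0b00,
    C = 0b01,
    G = 0b10,
    T = 0b11,
}

/// Bits per symbol in this toy alphabet.
const WIDTH: u32 = 2;

/// Pack symbols into one u64, first symbol in the lowest bits (an assumption).
fn pack(symbols: &[Base]) -> u64 {
    assert!(symbols.len() as u32 * WIDTH <= 64, "too many symbols for a u64");
    symbols
        .iter()
        .enumerate()
        .fold(0u64, |acc, (i, &s)| acc | ((s as u64) << (i as u32 * WIDTH)))
}

/// Read back the symbol stored at position `i`.
fn unpack(word: u64, i: u32) -> Base {
    match (word >> (i * WIDTH)) & 0b11 {
        0b00 => Base::A,
        0b01 => Base::C,
        0b10 => Base::G,
        _ => Base::T,
    }
}

fn main() {
    let word = pack(&[Base::G, Base::C, Base::A]);
    assert_eq!(word, 0b000110);
    assert_eq!(unpack(word, 0), Base::G);
    assert_eq!(unpack(word, 2), Base::A);
}
```

Three 2-bit symbols occupy six bits of the word, which is the kind of compact layout a derived codec is meant to enable.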
This crate also provides the dna!() and iupac!() macros that are reexported by bio-seq for declaring static sequences at compile time.
You probably don't want to directly include this crate as a dependency. Please refer to the bio-seq documentation for a complete guide on defining custom alphabets.
Features
width attribute: Specify the number of bits required to represent each variant in the custom alphabet. Default is optimal.
alt attribute: Define alternate bit representations for the same variant.
display attribute: Set a custom character representation for a variant.
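The behaviour these three attributes describe can be sketched by hand. The code below is not what the macro generates; it is a standard-library illustration under assumed names (`Toy`, `WIDTH`, `from_bits`, `to_char`) of a fixed bit width, several bit patterns decoding to one variant, and a variant printed with a custom character.

```rust
/// Hypothetical three-symbol alphabet used only for this illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Toy {
    K,    // canonical pattern 0b00, alternate pattern 0b10
    N,    // canonical pattern 0b01
    Stop, // canonical pattern 0b11, displayed as '*'
}

/// Plays the role of `width`: every symbol occupies exactly this many bits.
const WIDTH: u32 = 2;

/// Plays the role of `alt`: more than one pattern may decode to a variant.
fn from_bits(bits: u8) -> Option<Toy> {
    if bits >> WIDTH != 0 {
        return None; // pattern does not fit in WIDTH bits
    }
    match bits {
        0b00 | 0b10 => Some(Toy::K),
        0b01 => Some(Toy::N),
        0b11 => Some(Toy::Stop),
        _ => None,
    }
}

/// Plays the role of `display`: a variant may print as a custom character.
fn to_char(sym: Toy) -> char {
    match sym {
        Toy::K => 'K',
        Toy::N => 'N',
        Toy::Stop => '*',
    }
}

fn main() {
    // The canonical and the alternate pattern decode to the same symbol.
    assert_eq!(from_bits(0b00), from_bits(0b10));
    assert_eq!(from_bits(0b11).map(to_char), Some('*'));
}
```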
Usage
To derive a custom encoding, use the Codec derive macro as reexported in the bio-seq prelude:
use bio_seq::prelude::*;
Codecs can be annotated with #[repr(u8)] for convenient casting.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, Codec)]
#[width(6)]
pub enum Amino {
    #[alt(0b110110, 0b010110, 0b100110)]
    A = 0b000110, // GCA
    #[alt(0b111011)]
    C = 0b011011, // TGC
    #[alt(0b110010)]
    D = 0b010010, // GAC
    #[alt(0b100010)]
    E = 0b000010, // GAA
    #[alt(0b111111)]
    F = 0b011111, // TTC
    #[alt(0b101010, 0b011010, 0b111010)]
    G = 0b001010, // GGA
    #[alt(0b110001)]
    H = 0b010001, // CAC
    #[alt(0b011100, 0b111100)]
    I = 0b001100, // ATA
    #[alt(0b100000)]
    K = 0b000000, // AAA
    #[alt(0b001111, 0b101111, 0b111101, 0b011101, 0b101101)]
    L = 0b001101, // CTA
    M = 0b101100, // ATG
    #[alt(0b110000)]
    N = 0b010000, // AAC
    #[alt(0b010101, 0b100101, 0b110101)]
    P = 0b000101, // CCA
    #[alt(0b100001)]
    Q = 0b000001, // CAA
    #[alt(0b101000, 0b111001, 0b011001, 0b001001, 0b101001)]
    R = 0b001000, // AGA
    #[alt(0b110111, 0b010111, 0b000111, 0b100111, 0b111000)]
    S = 0b011000, // AGC
    #[alt(0b110100, 0b010100, 0b100100)]
    T = 0b000100, // ACA
    #[alt(0b011110, 0b111110, 0b101110)]
    V = 0b001110, // GTA
    W = 0b101011, // TGG
    #[alt(0b110011)]
    Y = 0b010011, // TAC
    #[display('*')]
    #[alt(0b001011, 0b100011)]
    X = 0b000011, // TAA (stop)
}
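The six-bit discriminants above line up with the codons in the trailing comments if each nucleotide takes two bits (A=00, C=01, G=10, T=11) and the first base of the codon sits in the lowest two bits. That mapping is inferred from the comments rather than taken from the bio-seq documentation, and the helper names `base_bits` and `codon_bits` are hypothetical. A standard-library sketch that reproduces the values:

```rust
/// Two-bit code for one nucleotide, as inferred from the codon comments.
fn base_bits(base: char) -> Option<u8> {
    match base {
        'A' => Some(0b00),
        'C' => Some(0b01),
        'G' => Some(0b10),
        'T' => Some(0b11),
        _ => None,
    }
}

/// Six-bit pattern for a codon, first base in the lowest two bits.
/// Returns None for input that is not exactly three of A, C, G, T.
fn codon_bits(codon: &str) -> Option<u8> {
    if codon.chars().count() != 3 {
        return None;
    }
    let mut bits = 0u8;
    for (i, base) in codon.chars().enumerate() {
        bits |= base_bits(base)? << (2 * i);
    }
    Some(bits)
}

fn main() {
    // Canonical patterns from the enum above.
    assert_eq!(codon_bits("GCA"), Some(0b000110)); // A
    assert_eq!(codon_bits("ATG"), Some(0b101100)); // M
    assert_eq!(codon_bits("TAA"), Some(0b000011)); // X (stop)
    // GCT gives one of the alternate patterns listed for A.
    assert_eq!(codon_bits("GCT"), Some(0b110110));
}
```

Under this reading, the alt patterns appear to be the synonymous codons of each amino acid, so any codon for a residue decodes to the same variant.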
Dependencies
~220–660KB
~16K SLoC