1 unstable release

0.1.0 Dec 25, 2020


MIT/Apache

565KB
100 lines


chordclust

BCLUST

Bclust implements similarity clustering using rust-bio.

Algorithm

The algorithm is a greedy search, similar to the UCLUST algorithm described at https://www.drive5.com/usearch/manual/uclust_algo.html. For now, it uses similarity instead of identity.

  1. Sort the sequences by length, longest first.
  2. For each sequence, compare it with the database of centroids:
     • If the similarity with the best match is > T (the threshold): add the sequence to the cluster of the best match.
     • Else: form a new cluster with the sequence as its centroid.
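
The steps above can be sketched in plain Rust as follows. This is a minimal illustration, not the crate's actual code: the names `similarity`, `Cluster` and `greedy_cluster` are hypothetical, and the scoring function is a simple stand-in (fraction of matching positions) rather than the rust-bio based similarity that Bclust uses.

```rust
/// Hypothetical stand-in score (an assumption, NOT the crate's scoring):
/// fraction of positions that match, relative to the longer sequence.
fn similarity(a: &[u8], b: &[u8]) -> f64 {
    let longest = a.len().max(b.len());
    if longest == 0 {
        return 1.0; // two empty sequences are treated as identical
    }
    let matches = a.iter().zip(b.iter()).filter(|(x, y)| x == y).count();
    matches as f64 / longest as f64
}

/// A cluster stores the index of its centroid and the indices of all
/// its members (the centroid included).
#[derive(Debug, PartialEq)]
struct Cluster {
    centroid: usize,
    members: Vec<usize>,
}

/// Greedy clustering: longest sequences first, each sequence joins the
/// cluster of its best-matching centroid if the score exceeds `threshold`.
fn greedy_cluster(seqs: &[&[u8]], threshold: f64) -> Vec<Cluster> {
    // Step 1: visit sequences from longest to shortest.
    // The sort is stable, so ties keep their input order.
    let mut order: Vec<usize> = (0..seqs.len()).collect();
    order.sort_by_key(|&i| std::cmp::Reverse(seqs[i].len()));

    let mut clusters: Vec<Cluster> = Vec::new();
    for i in order {
        // Step 2: find the centroid with the highest score for this sequence.
        let mut best: Option<(usize, f64)> = None;
        for (c, cluster) in clusters.iter().enumerate() {
            let score = similarity(seqs[i], seqs[cluster.centroid]);
            if best.map_or(true, |(_, b)| score > b) {
                best = Some((c, score));
            }
        }
        match best {
            // Best match is above the threshold: join that cluster.
            Some((c, score)) if score > threshold => clusters[c].members.push(i),
            // Otherwise the sequence becomes the centroid of a new cluster.
            _ => clusters.push(Cluster { centroid: i, members: vec![i] }),
        }
    }
    clusters
}
```

Calling `greedy_cluster(&seqs, 0.7)` on a slice of byte sequences returns the clusters in the order their centroids were created. Because every sequence is compared only against centroids, the result depends on the visiting order, which is why the sort in step 1 matters.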

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

README.md is automatically generated on CI using cargo-readme. Please modify README.tpl or lib.rs instead (check the GitHub workflow for more details).

Dependencies

~14MB
~237K SLoC