Version 0.1.0 (unstable), released Dec 25, 2020
chordclust
BCLUST
Bclust implements similarity clustering using rust-bio.
Algorithm
The algorithm is a greedy search, similar to the UCLUST algorithm described at https://www.drive5.com/usearch/manual/uclust_algo.html. For now, it scores matches by similarity rather than identity.
- Sort the sequences by length, longest first.
- For each sequence, compare it with the database of centroids:
  - If the similarity with the best match is greater than the threshold T, add the sequence to the best match's cluster.
  - Otherwise, the sequence forms a new cluster.
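The steps above can be sketched in a few lines of Rust. This is a minimal illustration, not the crate's actual code: the names `similarity` and `greedy_cluster` are hypothetical, and the scoring function is a simple position-by-position match fraction standing in for the alignment-based score that rust-bio would provide. The sketch also assumes that the first sequence placed in a cluster acts as its centroid.

```rust
/// Hypothetical stand-in for an alignment-based score: the fraction of
/// positions (relative to the longer sequence) where the two sequences agree.
/// A real implementation would score an alignment instead.
fn similarity(a: &[u8], b: &[u8]) -> f64 {
    let longer = a.len().max(b.len());
    if longer == 0 {
        return 1.0; // two empty sequences are treated as identical
    }
    let matches = a.iter().zip(b.iter()).filter(|(x, y)| x == y).count();
    matches as f64 / longer as f64
}

/// Greedy clustering sketch. Returns one list of input indices per cluster;
/// the first index in each list is assumed to be that cluster's centroid.
fn greedy_cluster(seqs: &[&[u8]], threshold: f64) -> Vec<Vec<usize>> {
    // Step 1: visit sequences from longest to shortest (stable sort keeps
    // the input order among sequences of equal length).
    let mut order: Vec<usize> = (0..seqs.len()).collect();
    order.sort_by(|&i, &j| seqs[j].len().cmp(&seqs[i].len()));

    let mut clusters: Vec<Vec<usize>> = Vec::new();
    for &i in &order {
        // Step 2: find the best-scoring centroid among existing clusters.
        let mut best: Option<(usize, f64)> = None;
        for (c, members) in clusters.iter().enumerate() {
            let score = similarity(seqs[i], seqs[members[0]]);
            if best.map_or(true, |(_, s)| score > s) {
                best = Some((c, score));
            }
        }
        // Step 3: join the best cluster if the score exceeds the threshold,
        // otherwise start a new cluster.
        match best {
            Some((c, score)) if score > threshold => clusters[c].push(i),
            _ => clusters.push(vec![i]),
        }
    }
    clusters
}
```

For example, calling `greedy_cluster` on the sequences `ACGTACGT`, `ACGTACGA` and `TTTT` with a threshold of 0.8 groups the first two together and leaves the third in its own cluster. Comparing each sequence against every centroid keeps the sketch short; the exhaustive scan is a simplification chosen for readability.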
License
Licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
README.md is automatically generated on CI using cargo-readme. Please modify README.tpl or lib.rs instead (check the GitHub workflow for more details).