2 unstable releases
0.2.0 | Oct 3, 2020 |
---|---|
0.1.0 | Sep 20, 2020 |
#2412 in Algorithms
15KB
299 lines
README
A generic implementation of the Levenshtein distance that allows arbitrarily weighting operations for different elements.
Generic
This crate can work on slices of any kind. It can:
- Compute a distance in characters between two strings:
assert_eq!(distance("abc", "aaxcc"), 3);
- Compute a distance in words between two strings:
assert_eq!(
distance(
"The quick brown fox".split (' ').collect::<Vec<_>>(),
"The very quick brown cat".split (' ').collect()),
2);
- Or compute a distance between arbitrary sequences:
assert_eq!(distance(vec![1, 2, 3], vec![0, 1, 3, 3, 4]), 3);
Weighting
This crate allows defining custom weights for each operation on each symbol.
These weights can be specified for custom types by implementing the EditWeight
trait.
For example:
enum MyType {
A,
B,
}
impl EditWeight for MyType {
fn add_cost(&self) -> usize {
match *self {
MyType::A => 1,
MyType::B => 2,
}
}
fn rm_cost(&self) -> usize {
match *self {
MyType::A => 1,
MyType::B => 2,
}
}
fn sub_cost(&self, other: &Self) -> usize {
if self == other {
0
} else {
3
}
}
}
assert_eq!(distance(vec![MyType::A], vec![MyType::B, MyType::B]), 5)