4 releases
Uses old Rust 2015
0.1.3 | Feb 17, 2018 |
---|---|
0.1.2 | Feb 17, 2018 |
0.1.1 | Feb 17, 2018 |
0.1.0 | Feb 17, 2018 |
#2613 in Algorithms
16KB
217 lines
Alcs
All Longest Common Substrings and Fuzzy String Search for Rust
Link to crates.io
Implementation of An all-substrings common subsequence algorithm
Given two strings s1 and s2, it is possible construct, in O(|s1|*|s2|) time and O(|s1|+|s2|) space, a structure that can be queried to find the length of all Longest Common Subsequences between s1 and all possible substrings of s2, each query requiring constant time.
Some accessor functions are provided to retrieve the matrices and the vectors defined in the paper
All LCS
extern crate alcs;
use alcs::Alcs;
fn main() {
let a = "word";
let b = "hello world";
let va = a.chars().collect::<Vec<char>>();
let vb = b.chars().collect::<Vec<char>>();
let alcs = Alcs::new(&va, &vb);
for i in 0..b.len() {
for (i, j, cij) in alcs.suffix(i) {
println!(r#"LCS between "{}" and "{}" has length {}"#,a,&b[i..j],cij);
}
}
}
Output:
LCS between "word" and "h" has length 0
LCS between "word" and "he" has length 0
LCS between "word" and "hel" has length 0
...
LCS between "word" and " world" has length 4
LCS between "word" and "w" has length 1
LCS between "word" and "wo" has length 2
LCS between "word" and "wor" has length 3
...
LCS between "word" and "d" has length 1
Fuzzy substring
Also, it is defined a trait that allows to fuzzy search a string:
extern crate alcs;
use alcs::FuzzyStrstr;
fn main() {
let tsh = 0.7;
let tests = [
("he.llo.wor.ld.!", "world"),
("he.llo.word", "world"),
("hello world", "word"),
("hello world", "banana"),
];
for &(h, n) in &tests {
match h.fuzzy_find_str(n, tsh) {
None => {
println!(r#""{}" does not contain "{}""#, h, n);
}
Some((score, sub)) => {
println!(r#""{}" contains "{}" ("{}") with score {}"#, h, n, sub, score);
}
}
}
}
Output:
"he.llo.wor.ld.!" contains "world" ("wor.ld") with score 0.8333333
"he.llo.word" contains "world" ("word") with score 0.8
"hello world" contains "word" ("world") with score 0.8
"hello world" does not contain "banana"