1 unstable release: 0.1.0 (Jun 25, 2019)
rust-darts: Double-Array Trie Rust implementation.
This library is in an alpha state; PRs are welcome. An optional Forward Maximum Matching searcher is available behind a feature flag.
Installation
Add it to your Cargo.toml:
[dependencies]
darts = "0.1"
Then you are good to go. If you are using Rust 2015, you also have to add extern crate darts; to your crate root.
Example
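use std::fs::File;
use darts::DoubleArrayTrie;

fn main() {
    // Loading a serialized trie relies on the "serialization" feature.
    let mut f = File::open("./priv/dict.big.bincode").unwrap();
    let da = DoubleArrayTrie::load(&mut f).unwrap();
    let string = "中华人民共和国";
    // Each match carries the end index of a matched prefix plus its value;
    // slice the input up to every end index to recover the prefixes.
    let prefixes = da
        .common_prefix_search(string)
        .map(|matches| {
            matches
                .iter()
                .map(|(end_idx, _v)| &string[..*end_idx])
                .collect::<Vec<&str>>()
        })
        .unwrap_or_default();
    assert_eq!(vec!["中", "中华", "中华人民", "中华人民共和国"], prefixes);
}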

// Exact-match lookup against the same dictionary file.
use std::fs::File;
use darts::DoubleArrayTrie;

fn main() {
    let mut f = File::open("./priv/dict.big.bincode").unwrap();
    let da = DoubleArrayTrie::load(&mut f).unwrap();
    assert!(da.exact_match_search("东湖高新技术开发区").is_some());
}
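
For readers unfamiliar with the data structure, the following is a minimal, self-contained sketch of how a double-array trie answers exact-match and common-prefix queries. It is not the code of this crate: the DoubleArray type, its methods, and the naive construction strategy are all hypothetical and chosen for brevity. It only illustrates the classic transition rule, where a move from state s on code c goes to base[s] + c and is valid only if check of that slot points back to s.

```rust
/// Toy double-array trie over bytes (illustrative only, not the darts crate).
/// Codes: 0 marks the end of a key, every byte b is shifted to b + 1.
struct DoubleArray {
    base: Vec<usize>,
    check: Vec<Option<usize>>, // parent state of each slot; None means free
}

impl DoubleArray {
    fn build(keys: &[&str]) -> Self {
        let mut sorted: Vec<&[u8]> = keys.iter().map(|k| k.as_bytes()).collect();
        sorted.sort();
        sorted.dedup();
        // State 0 is the root; it is its own parent so the slot is never reused.
        let mut da = DoubleArray { base: vec![0], check: vec![Some(0)] };
        da.place(0, &sorted, 0);
        da
    }

    fn code(key: &[u8], depth: usize) -> usize {
        key.get(depth).map_or(0, |&b| b as usize + 1)
    }

    /// Places the children of `state`; all `keys` share their first `depth` bytes.
    fn place(&mut self, state: usize, keys: &[&[u8]], depth: usize) {
        // Sorted input keeps keys with the same next code contiguous.
        let mut groups: Vec<(usize, usize, usize)> = Vec::new(); // (code, start, end)
        for (i, key) in keys.iter().enumerate() {
            let c = Self::code(key, depth);
            if groups.last().map_or(false, |g| g.0 == c) {
                groups.last_mut().unwrap().2 = i + 1;
            } else {
                groups.push((c, i, i + 1));
            }
        }
        // Smallest base (>= 1) at which every child slot is still free.
        let mut b = 1;
        while !groups.iter().all(|&(c, _, _)| self.is_free(b + c)) {
            b += 1;
        }
        self.base[state] = b;
        for &(c, _, _) in &groups {
            self.claim(b + c, state);
        }
        // Recurse only after all siblings are claimed; code 0 is a finished key.
        for &(c, start, end) in &groups {
            if c != 0 {
                self.place(b + c, &keys[start..end], depth + 1);
            }
        }
    }

    fn is_free(&self, slot: usize) -> bool {
        self.check.get(slot).map_or(true, |p| p.is_none())
    }

    fn claim(&mut self, slot: usize, parent: usize) {
        if slot >= self.check.len() {
            self.check.resize(slot + 1, None);
            self.base.resize(slot + 1, 0);
        }
        self.check[slot] = Some(parent);
    }

    /// The transition rule: follow `code` from `state` if the target slot belongs to it.
    fn step(&self, state: usize, code: usize) -> Option<usize> {
        let next = self.base[state] + code;
        if self.check.get(next) == Some(&Some(state)) { Some(next) } else { None }
    }

    fn exact_match(&self, key: &str) -> bool {
        let mut state = 0;
        for &b in key.as_bytes() {
            match self.step(state, b as usize + 1) {
                Some(next) => state = next,
                None => return false,
            }
        }
        self.step(state, 0).is_some()
    }

    /// Byte offsets at which a stored key ends while walking `text` from its start.
    fn common_prefix_ends(&self, text: &str) -> Vec<usize> {
        let mut ends = Vec::new();
        let mut state = 0;
        for (i, &b) in text.as_bytes().iter().enumerate() {
            match self.step(state, b as usize + 1) {
                Some(next) => state = next,
                None => break,
            }
            if self.step(state, 0).is_some() {
                ends.push(i + 1);
            }
        }
        ends
    }
}

fn main() {
    let da = DoubleArray::build(&["中", "中华", "中华人民", "中华人民共和国", "人民"]);
    let text = "中华人民共和国";
    let prefixes: Vec<&str> =
        da.common_prefix_ends(text).into_iter().map(|end| &text[..end]).collect();
    assert_eq!(prefixes, vec!["中", "中华", "中华人民", "中华人民共和国"]);
    assert!(da.exact_match("人民"));
    assert!(!da.exact_match("共和国"));
}
```

Each lookup step costs one addition and one comparison, which is the main appeal of the layout. The linear scan for a free base in this sketch is the slow part, and real implementations take more care there.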
Enabling Additional Features
searcher feature: enables the forward maximum matching searcher
serialization feature: enables saving and loading serialized DoubleArrayTrie data
[dependencies]
darts = { version = "0.1", features = ["searcher", "serialization"] }
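
As background for the searcher feature, here is a rough sketch of what forward maximum matching means. The crate's own searcher API is not shown in this README, so the example deliberately avoids it: forward_max_match is a hypothetical helper, and a plain HashSet stands in for the trie. At each position it takes the longest dictionary word that starts there, and falls back to a single character when nothing matches.

```rust
use std::collections::HashSet;

/// Forward maximum matching over a word set (illustrative helper, not the darts API).
fn forward_max_match<'a>(dict: &HashSet<&str>, text: &'a str) -> Vec<&'a str> {
    // Byte offsets of every character boundary, including the end of the text.
    let bounds: Vec<usize> = text
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(text.len()))
        .collect();
    let mut out = Vec::new();
    let mut i = 0; // index into `bounds` of the current start position
    while i + 1 < bounds.len() {
        // Try the longest candidate first and shrink it one character at a time.
        let mut j = bounds.len() - 1;
        while j > i + 1 && !dict.contains(&text[bounds[i]..bounds[j]]) {
            j -= 1;
        }
        // Either a dictionary word or, as a fallback, a single character.
        out.push(&text[bounds[i]..bounds[j]]);
        i = j;
    }
    out
}

fn main() {
    let dict: HashSet<&str> = ["中华", "中华人民", "中华人民共和国", "东湖"]
        .iter()
        .copied()
        .collect();
    let words = forward_max_match(&dict, "中华人民共和国东湖");
    assert_eq!(words, vec!["中华人民共和国", "东湖"]);
}
```

With a trie, the inner shrinking loop can be replaced by a single common-prefix search from the current position, keeping only the longest match.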
To Rebuild Dictionary
# This can take several minutes; be patient.
time cargo test -- --nocapture --ignored test_dat_basic
To run benchmark tests
cargo bench --all-features
License
This work is released under the MIT license. A copy of the license is provided in the LICENSE file.