5 releases (2 stable)
Version | Released |
---|---|
1.0.1 | Aug 2, 2022 |
1.0.0 | May 8, 2022 |
0.1.2 | May 13, 2020 |
0.1.1 | May 5, 2020 |
0.1.0 | May 5, 2020 |
#1535 in Text processing
133 downloads per month
4KB
segmenter
v1.0.0
About
Segment Chinese sentences into their component words using a dictionary-driven, largest-first matching approach.
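To make the idea concrete, here is a minimal sketch of dictionary-driven, largest-first matching. It is not the crate's actual implementation: the `segment` function, the `max_word_chars` parameter, and the caller-supplied dictionary are all hypothetical names chosen for illustration. Unlike the crate's documented output, this sketch does not drop punctuation.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of chinese_segmenter): at each position,
/// take the longest dictionary word that starts there, and fall back to a
/// single character when no multi-character word matches.
fn segment<'a>(text: &'a str, dict: &HashSet<&str>, max_word_chars: usize) -> Vec<&'a str> {
    // Byte offsets of every character boundary, including the end of the string,
    // so that slicing never splits a multi-byte character.
    let mut bounds: Vec<usize> = text.char_indices().map(|(i, _)| i).collect();
    bounds.push(text.len());

    let n = bounds.len() - 1; // number of characters in `text`
    let mut words = Vec::new();
    let mut start = 0;
    while start < n {
        let longest = max_word_chars.min(n - start);
        // Try candidate lengths from longest to shortest; default to one character.
        let len = (2..=longest)
            .rev()
            .find(|&l| dict.contains(&text[bounds[start]..bounds[start + l]]))
            .unwrap_or(1);
        words.push(&text[bounds[start]..bounds[start + len]]);
        start += len;
    }
    words
}
```

With a toy dictionary containing only 今天, 晚上, and 羊肉, and `max_word_chars` set to 4, calling `segment` on 今天晚上想吃羊肉吗 yields 今天, 晚上, 想, 吃, 羊肉, 吗. The greedy choice keeps the sketch simple, but it can mis-segment text where a shorter first word would allow a better overall split.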
Usage
extern crate chinese_segmenter;
use chinese_segmenter::{initialize, tokenize};

fn main() {
    let sentence = "今天晚上想吃羊肉吗?";
    initialize(); // Optional initialization to load the dictionary data up front
    let result: Vec<&str> = tokenize(sentence);
    println!("{:?}", result); // --> ["今天", "晚上", "想", "吃", "羊肉", "吗"]
}
Dependencies
~4.5MB
~19K SLoC