#nlp #chinese #segmentation

jieba-rs

The Jieba Chinese Word Segmentation Implemented in Rust

18 releases

✓ Uses Rust 2018 edition

0.4.1 Jun 16, 2019
0.3.2 Jun 5, 2019
0.2.5 Oct 30, 2018
0.2.3 Jul 6, 2018

#47 in Text processing

188 downloads per month
Used in 3 crates

MIT license

4.5MB
1.5K SLoC

Installation

Add it to your Cargo.toml:

[dependencies]
jieba-rs = "0.4"

then you are good to go. If you are using Rust 2015, you also need to add extern crate jieba_rs; to your crate root.
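
For a Rust 2015 crate, that declaration would sit at the top of the crate root (src/lib.rs or src/main.rs):

extern crate jieba_rs;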

Example

use jieba_rs::Jieba;

fn main() {
    let jieba = Jieba::new();
    // The second argument disables the HMM model used to recognize out-of-vocabulary words
    let words = jieba.cut("我们中出了一个叛徒", false);
    assert_eq!(words, vec!["我们", "中", "出", "了", "一个", "叛徒"]);
}
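
The same struct offers a few related entry points. Here is a minimal sketch, assuming this release already exposes the cut_all, cut_for_search, and add_word methods carried over from the Python Jieba API; the sample sentence, custom word, and frequency value are only illustrative:

use jieba_rs::Jieba;

fn main() {
    // add_word mutates the in-memory dictionary, so the instance must be mutable
    let mut jieba = Jieba::new();

    // Register a custom word with an optional frequency and an optional POS tag
    jieba.add_word("叛徒", Some(10000), None);

    // Full mode: emit every dictionary word found in the sentence, overlaps included
    println!("{:?}", jieba.cut_all("我们中出了一个叛徒"));

    // Search-engine mode: long words are further split to improve recall
    println!("{:?}", jieba.cut_for_search("我们中出了一个叛徒", true));
}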

Enabling Additional Features

  • tfidf feature enables the TF-IDF keyword extractor
  • textrank feature enables the TextRank keyword extractor

[dependencies]
jieba-rs = { version = "0.4", features = ["tfidf", "textrank"] }
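
With the tfidf feature enabled, keyword extraction goes through the TFIDF type. A minimal sketch, assuming the KeywordExtract trait and the TFIDF::new_with_jieba constructor are available in this release line; the sentence, the top_k of 3, and the empty POS filter are illustrative:

use jieba_rs::{Jieba, KeywordExtract, TFIDF};

fn main() {
    let jieba = Jieba::new();
    // The extractor borrows the segmenter to tokenize its input
    let extractor = TFIDF::new_with_jieba(&jieba);
    // Top 3 keywords; an empty Vec means no part-of-speech filtering
    let keywords = extractor.extract_tags("今天纽约的天气真好啊", 3, vec![]);
    println!("{:?}", keywords);
}

With the textrank feature on, the TextRank type would be used analogously.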

License

This work is released under the MIT license. A copy of the license is provided in the LICENSE file.

Dependencies