
tantivy-jieba

A library that bridges between tantivy and jieba-rs

12 breaking releases

0.14.0 May 29, 2025
0.12.0 May 23, 2025
0.11.0 Apr 28, 2024
0.10.0 Oct 13, 2023
0.1.1 Feb 13, 2019

MIT license

An adapter that bridges tantivy and jieba-rs, so that tantivy can tokenize Chinese text with jieba-rs.

Usage

Add tantivy-jieba as a dependency in your Cargo.toml.

Example

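use tantivy::tokenizer::*;

// The tokenizer is constructed without any configuration.
let mut tokenizer = tantivy_jieba::JiebaTokenizer {};
let mut token_stream = tokenizer.token_stream("测试");
// "测试" ("test") is segmented as a single word, so the stream yields one token.
assert_eq!(token_stream.next().unwrap().text, "测试");
assert!(token_stream.next().is_none());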

Register tantivy tokenizer

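use tantivy::schema::Schema;
use tantivy::tokenizer::*;
use tantivy::Index;

// An empty schema is enough to demonstrate registration.
let schema = Schema::builder().build();
let tokenizer = tantivy_jieba::JiebaTokenizer {};
let index = Index::create_in_ram(schema);
// Register the tokenizer under the name "jieba" so schema fields can refer to it.
index.tokenizers()
     .register("jieba", tokenizer);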

See examples/mod.rs for a detailed example.

License

Licensed under the MIT license.
