3 releases (breaking)

0.3.0 Apr 18, 2024
0.2.0 Oct 14, 2023
0.1.0 Sep 8, 2023

#15 in #tantivy

639 downloads per month
Used in 3 crates (via izihawa-tantivy)

MIT license

7KB
117 lines

#Tokenizer-API

An API to interface a tokenizer with tantivy.

The API will be kept stable in order to not break support for existing tokenizers.


lib.rs:

Tokenizers are in charge of chopping text into a stream of tokens ready for indexing. This is a separate crate from tantivy, so tokenizer implementors don't need to update for each new tantivy version.

To add support for a tokenizer, implement the Tokenizer trait. Check out the tantivy repo for some examples.
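
As a rough illustration of what implementing the trait can look like, here is a minimal whitespace tokenizer. To stay self-contained, the sketch declares local stand-ins for `Token`, `TokenStream` and `Tokenizer` instead of importing them; their shape is an assumption modeled on this crate, and the real definitions may have additional fields and methods, so check the crate docs before copying signatures. `WhitespaceTokenizer`, `WhitespaceTokenStream` and `collect_tokens` are hypothetical names made up for this example.

```rust
// Local stand-ins so the sketch compiles without dependencies.
// ASSUMPTION: these only approximate the crate's types; the real `Token`
// and traits may carry more fields and provided methods.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Token {
    pub offset_from: usize, // byte offset of the first byte of the token
    pub offset_to: usize,   // byte offset just past the last byte
    pub position: usize,    // index of the token in the stream
    pub text: String,
}

pub trait TokenStream {
    /// Moves to the next token; returns false when the text is exhausted.
    fn advance(&mut self) -> bool;
    /// Returns the current token (valid after `advance` returned true).
    fn token(&self) -> &Token;
}

pub trait Tokenizer: 'static + Clone + Send + Sync {
    type TokenStream<'a>: TokenStream;
    fn token_stream<'a>(&'a mut self, text: &'a str) -> Self::TokenStream<'a>;
}

/// Hypothetical tokenizer that splits text on Unicode whitespace.
#[derive(Clone, Default)]
pub struct WhitespaceTokenizer;

pub struct WhitespaceTokenStream<'a> {
    text: &'a str,
    cursor: usize,        // byte offset where scanning resumes
    next_position: usize, // position assigned to the next token
    token: Token,         // reused buffer for the current token
}

impl Tokenizer for WhitespaceTokenizer {
    type TokenStream<'a> = WhitespaceTokenStream<'a>;

    fn token_stream<'a>(&'a mut self, text: &'a str) -> WhitespaceTokenStream<'a> {
        WhitespaceTokenStream { text, cursor: 0, next_position: 0, token: Token::default() }
    }
}

impl<'a> TokenStream for WhitespaceTokenStream<'a> {
    fn advance(&mut self) -> bool {
        // Skip whitespace to find the start of the next token.
        let rest = &self.text[self.cursor..];
        let start = match rest.find(|c: char| !c.is_whitespace()) {
            Some(i) => self.cursor + i,
            None => return false,
        };
        // The token ends at the next whitespace character or at end of text.
        let end = self.text[start..]
            .find(char::is_whitespace)
            .map(|i| start + i)
            .unwrap_or(self.text.len());

        self.token.offset_from = start;
        self.token.offset_to = end;
        self.token.position = self.next_position;
        self.token.text.clear();
        self.token.text.push_str(&self.text[start..end]);

        self.next_position += 1;
        self.cursor = end;
        true
    }

    fn token(&self) -> &Token {
        &self.token
    }
}

/// Hypothetical helper: drains a token stream into a Vec for inspection.
pub fn collect_tokens<T: Tokenizer>(tokenizer: &mut T, text: &str) -> Vec<Token> {
    let mut stream = tokenizer.token_stream(text);
    let mut tokens = Vec::new();
    while stream.advance() {
        tokens.push(stream.token().clone());
    }
    tokens
}
```

Calling `collect_tokens(&mut WhitespaceTokenizer, "hello big world")` yields three tokens with positions 0, 1 and 2 and byte offsets into the original text. The stream reuses a single `Token` buffer between calls to `advance`, which avoids allocating a new `String` per token. In a real integration, the local stand-ins would be replaced by the types exported from this crate.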

Dependencies

~0.4–1MB
~23K SLoC