4 releases (2 breaking)

0.3.0 Apr 12, 2024
0.2.0 Sep 1, 2023
0.1.1 Jun 23, 2023
0.1.0 Jun 9, 2023

#16 in #tantivy


177,330 downloads per month
Used in 27 crates (8 directly)

MIT license

7KB
117 lines

# Tokenizer-API

An API to interface a tokenizer with tantivy.

The API will be kept stable in order to not break support for existing tokenizers.


lib.rs:

Tokenizers are in charge of chopping text into a stream of tokens ready for indexing. This is a separate crate from tantivy, so implementors don't need to update for each new tantivy version.

To add support for a tokenizer, implement the Tokenizer trait. Check out the tantivy repo for some examples.
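
As a rough illustration of what "chopping text into a stream of tokens" can look like, here is a minimal whitespace tokenizer sketch. To stay self-contained it defines its own simplified `Token`, `TokenStream`, and `Tokenizer` stand-ins; these are assumptions made for this example and are not copied from the crate, whose actual trait definitions and signatures may differ (check the crate documentation before implementing against it). `WhitespaceTokenizer`, `WhitespaceTokenStream`, and `collect_tokens` are likewise hypothetical names.

```rust
// Local stand-ins for illustration only (assumption): the real definitions
// live in the tantivy-tokenizer-api crate and may differ in shape.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Token {
    pub offset_from: usize, // byte offset of the token's first byte
    pub offset_to: usize,   // byte offset one past the token's last byte
    pub position: usize,    // index of the token within the stream
    pub text: String,       // the token's text
}

pub trait TokenStream {
    /// Moves to the next token; returns false once the stream is exhausted.
    fn advance(&mut self) -> bool;
    /// Returns the current token; only meaningful after `advance` returned true.
    fn token(&self) -> &Token;
}

pub trait Tokenizer {
    /// Creates a token stream over `text`.
    fn token_stream<'a>(&self, text: &'a str) -> Box<dyn TokenStream + 'a>;
}

/// Hypothetical tokenizer that splits on Unicode whitespace.
#[derive(Clone, Default)]
pub struct WhitespaceTokenizer;

pub struct WhitespaceTokenStream<'a> {
    text: &'a str,
    chars: std::str::CharIndices<'a>,
    token: Token,
    next_position: usize,
}

impl<'a> TokenStream for WhitespaceTokenStream<'a> {
    fn advance(&mut self) -> bool {
        // Skip leading whitespace to find the start of the next token.
        let start = loop {
            match self.chars.next() {
                Some((idx, c)) if !c.is_whitespace() => break idx,
                Some(_) => continue,
                None => return false,
            }
        };
        // Consume characters until the next whitespace or the end of the text.
        let mut end = self.text.len();
        for (idx, c) in self.chars.by_ref() {
            if c.is_whitespace() {
                end = idx;
                break;
            }
        }
        // Reuse the token's String buffer instead of allocating a new one.
        self.token.offset_from = start;
        self.token.offset_to = end;
        self.token.position = self.next_position;
        self.next_position += 1;
        self.token.text.clear();
        self.token.text.push_str(&self.text[start..end]);
        true
    }

    fn token(&self) -> &Token {
        &self.token
    }
}

impl Tokenizer for WhitespaceTokenizer {
    fn token_stream<'a>(&self, text: &'a str) -> Box<dyn TokenStream + 'a> {
        Box::new(WhitespaceTokenStream {
            text,
            chars: text.char_indices(),
            token: Token::default(),
            next_position: 0,
        })
    }
}

/// Helper that drains a stream into a Vec, e.g. for inspection in tests.
pub fn collect_tokens(text: &str) -> Vec<Token> {
    let tokenizer = WhitespaceTokenizer;
    let mut stream = tokenizer.token_stream(text);
    let mut tokens = Vec::new();
    while stream.advance() {
        tokens.push(stream.token().clone());
    }
    tokens
}
```

Calling `collect_tokens("hello  world")` in this sketch yields two tokens, `hello` and `world`, with byte offsets that point back into the original text. The stream reuses a single `Token` buffer between calls to `advance`, which avoids one allocation per token; whether the real crate expects the same pattern is something to confirm against its documentation.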

Dependencies

~0.5–1MB
~25K SLoC