3 unstable releases

0.2.0 Sep 1, 2023
0.1.1 Jun 23, 2023
0.1.0 Jun 9, 2023

#147 in Database implementations

Download history 50005/week @ 2023-08-20 61046/week @ 2023-08-27 55939/week @ 2023-09-03 42070/week @ 2023-09-10 4664/week @ 2023-09-17 3524/week @ 2023-09-24 4580/week @ 2023-10-01 4725/week @ 2023-10-08 4881/week @ 2023-10-15 4533/week @ 2023-10-22 4466/week @ 2023-10-29 4981/week @ 2023-11-05 7395/week @ 2023-11-12 10855/week @ 2023-11-19 15491/week @ 2023-11-26 13005/week @ 2023-12-03

47,056 downloads per month
Used in 18 crates (4 directly)

MIT license

7KB
117 lines

#Tokenizer-API

An API to interface a tokenizer with tantivy.

The API will be kept stable in order to not break support for existing tokenizers.


lib.rs:

Tokenizer are in charge of chopping text into a stream of tokens ready for indexing. This is an seperate crate from tantivy, so implementors don't need to update for each new tantivy version.

To add support for a tokenizer, implement the Tokenizer trait. Checkout the tantivy repo for some examples.

Dependencies

~0.4–1MB
~24K SLoC