4 releases (2 breaking)

new 0.3.0 Apr 12, 2024
0.2.0 Sep 1, 2023
0.1.1 Jun 23, 2023
0.1.0 Jun 9, 2023

#294 in Database implementations

Download history: 2,728/week @ 2023-12-23, peaking at 35,000/week @ 2024-03-16, 25,947/week @ 2024-04-06

129,008 downloads per month
Used in 25 crates (6 directly)

MIT license

7KB
117 lines

# Tokenizer-API

An API to interface a tokenizer with tantivy.

The API will be kept stable in order to not break support for existing tokenizers.


lib.rs:

Tokenizers are in charge of chopping text into a stream of tokens ready for indexing. This is a separate crate from tantivy, so implementors don't need to update for each new tantivy version.

To add support for a tokenizer, implement the Tokenizer trait. Check out the tantivy repo for some examples.
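As a rough illustration of what "chopping text into a stream of tokens" can look like, here is a minimal, dependency-free sketch of a whitespace tokenizer. The `Token`, `TokenStream`, and `Tokenizer` definitions below are simplified local stand-ins written for this example, and `WhitespaceTokenizer` and `collect_tokens` are hypothetical names. They are assumptions, not the crate's actual items, so consult the crate documentation for the real trait signatures before implementing against it.

```rust
// Simplified local stand-ins for illustration only. The real crate's
// types may differ in names, fields, and signatures.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct Token {
    pub offset_from: usize, // byte offset of the first byte of the token
    pub offset_to: usize,   // byte offset one past the last byte
    pub position: usize,    // index of the token within the text
    pub text: String,
}

pub trait TokenStream {
    /// Moves to the next token; returns false once the text is exhausted.
    fn advance(&mut self) -> bool;
    /// Returns the current token (valid only after `advance` returned true).
    fn token(&self) -> &Token;
}

pub trait Tokenizer {
    fn token_stream<'a>(&self, text: &'a str) -> Box<dyn TokenStream + 'a>;
}

/// Hypothetical tokenizer that splits on Unicode whitespace.
#[derive(Clone, Default)]
pub struct WhitespaceTokenizer;

pub struct WhitespaceTokenStream<'a> {
    text: &'a str,
    chars: std::str::CharIndices<'a>,
    token: Token,
    next_position: usize,
}

impl Tokenizer for WhitespaceTokenizer {
    fn token_stream<'a>(&self, text: &'a str) -> Box<dyn TokenStream + 'a> {
        Box::new(WhitespaceTokenStream {
            text,
            chars: text.char_indices(),
            token: Token::default(),
            next_position: 0,
        })
    }
}

impl<'a> TokenStream for WhitespaceTokenStream<'a> {
    fn advance(&mut self) -> bool {
        // Skip whitespace until the first character of the next token.
        let start = loop {
            match self.chars.next() {
                Some((idx, c)) if !c.is_whitespace() => break idx,
                Some(_) => continue,
                None => return false,
            }
        };
        // Consume characters until the next whitespace or the end of text.
        let mut end = self.text.len();
        for (idx, c) in self.chars.by_ref() {
            if c.is_whitespace() {
                end = idx;
                break;
            }
        }
        // Reuse the token's String buffer instead of allocating a new one.
        self.token.offset_from = start;
        self.token.offset_to = end;
        self.token.text.clear();
        self.token.text.push_str(&self.text[start..end]);
        self.token.position = self.next_position;
        self.next_position += 1;
        true
    }

    fn token(&self) -> &Token {
        &self.token
    }
}

/// Hypothetical helper: drains a stream into a Vec for inspection.
pub fn collect_tokens(text: &str) -> Vec<Token> {
    let mut stream = WhitespaceTokenizer.token_stream(text);
    let mut tokens = Vec::new();
    while stream.advance() {
        tokens.push(stream.token().clone());
    }
    tokens
}
```

Calling `collect_tokens("hello  tantivy world")` yields three tokens with positions 0, 1, and 2. The stream keeps a single `Token` and overwrites it on each `advance`, which avoids a fresh allocation per token; that is a design choice of this sketch rather than a documented requirement of the crate.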

Dependencies

~0.4–1MB
~24K SLoC