21 releases (12 breaking)
|new 0.13.1||Sep 19, 2020|
|0.12.0||Feb 19, 2020|
|0.11.3||Dec 20, 2019|
|0.10.3||Nov 10, 2019|
|0.1.1||Aug 14, 2016|
Tantivy is a full text search engine library written in Rust.
Tantivy is, in fact, strongly inspired by Lucene's design.
The following benchmark breaks down performance for different types of queries and collections.
In general, Tantivy tends to be
- slower than Lucene on union with a Top-K, due to Lucene's Block-WAND optimization.
- faster than Lucene on intersection and phrase queries.
Your mileage WILL vary depending on the nature of queries and their load.
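To make the two query shapes in this comparison concrete, here is a minimal sketch of intersection and union over sorted lists of document ids. It is a textbook two-pointer merge, not Tantivy's or Lucene's actual implementation, and the function names `intersect_sorted` and `union_sorted` are made up for this illustration.

```rust
use std::cmp::Ordering;

/// Intersect two sorted, deduplicated lists of document ids.
/// Illustrative only: real engines skip ahead in compressed posting lists.
fn intersect_sorted(a: &[u32], b: &[u32]) -> Vec<u32> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                out.push(a[i]);
                i += 1;
                j += 1;
            }
        }
    }
    out
}

/// Union of two sorted, deduplicated lists of document ids.
fn union_sorted(a: &[u32], b: &[u32]) -> Vec<u32> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(a.len() + b.len());
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => {
                out.push(a[i]);
                i += 1;
            }
            Ordering::Greater => {
                out.push(b[j]);
                j += 1;
            }
            Ordering::Equal => {
                out.push(a[i]);
                i += 1;
                j += 1;
            }
        }
    }
    // At most one of the two tails is non-empty at this point.
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}

fn main() {
    // Hypothetical posting lists for the terms "michael" and "jackson".
    let michael = [1u32, 4, 7, 9, 12];
    let jackson = [2u32, 4, 9, 10];
    println!("{:?}", intersect_sorted(&michael, &jackson)); // [4, 9]
    println!("{:?}", union_sorted(&michael, &jackson)); // [1, 2, 4, 7, 9, 10, 12]
}
```

An intersection can discard most candidates early, while a union with a Top-K has to consider every document that matches any term unless the engine can prove that some of them cannot reach the top K. That is the kind of pruning Block-WAND provides.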
- Full-text search
- Configurable tokenizer (stemming available for 17 Latin languages, with third-party support for Chinese (tantivy-jieba and cang-jie), Japanese (lindera and tantivy-tokenizer-tiny-segmenter), and Korean (lindera + lindera-ko-dic-builder))
- Fast (check out the 🐎 ✨ benchmark ✨ 🐎)
- Tiny startup time (<10ms), perfect for command line tools
- BM25 scoring (the same as Lucene)
- Natural query language (e.g. (michael AND jackson) OR "king of pop")
- Phrase query search
- Incremental indexing
- Multithreaded indexing (indexing English Wikipedia takes < 3 minutes on my desktop)
- Mmap directory
- SIMD integer compression when the platform/CPU includes the SSE2 instruction set
- Single valued and multivalued u64, i64, and f64 fast fields (equivalent of doc values in Lucene)
- Text, i64, u64, f64, dates, and hierarchical facet fields
- LZ4 compressed document store
- Range queries
- Faceted search
- Configurable indexing (optional term frequency and position indexing)
- Cheesy logo with a horse
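For readers unfamiliar with the BM25 scoring listed among the features above, here is a minimal sketch of the textbook per-term formula. It is not Tantivy's code: the function names are hypothetical, `K1 = 1.2` and `B = 0.75` are the conventional defaults, and real implementations (Tantivy and Lucene included) differ in details such as how document lengths are stored.

```rust
/// Inverse document frequency in a common smoothed form:
/// ln(1 + (N - n + 0.5) / (n + 0.5)).
/// `doc_count` is the number of documents, `doc_freq` the number containing the term.
fn idf(doc_count: u64, doc_freq: u64) -> f32 {
    let n = doc_count as f32;
    let df = doc_freq as f32;
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// Textbook BM25 score contribution of one term for one document.
/// All parameter names are assumptions made for this sketch.
fn bm25_term_score(
    term_freq: u32,
    doc_len: u32,
    avg_doc_len: f32,
    doc_count: u64,
    doc_freq: u64,
) -> f32 {
    const K1: f32 = 1.2; // term-frequency saturation
    const B: f32 = 0.75; // strength of length normalization
    let tf = term_freq as f32;
    // Longer-than-average documents get a larger normalization term.
    let norm = K1 * (1.0 - B + B * doc_len as f32 / avg_doc_len);
    idf(doc_count, doc_freq) * tf * (K1 + 1.0) / (tf + norm)
}

fn main() {
    // A term found in 10 of 1000 documents, occurring 3 times in a
    // document of average length.
    let score = bm25_term_score(3, 100, 100.0, 1000, 10);
    println!("score = {score}");
}
```

The score grows with term frequency but saturates, shrinks for longer documents, and is higher for rarer terms.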
- Distributed search is out of the scope of Tantivy. That being said, Tantivy is a library upon which one could build a distributed search. Serializable/mergeable collector state, for instance, is within the scope of Tantivy.
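To illustrate what mergeable collector state could look like, here is a sketch in which each shard keeps its own top-K hits and a coordinator merges them. The `Hit` and `TopK` types are hypothetical and written for this example only; they are not Tantivy's collector API.

```rust
/// A scored hit, tagged with the shard it came from (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    score: f32,
    shard: u32,
    doc: u32,
}

/// Hypothetical per-shard collector state: the best `k` hits seen so far,
/// kept sorted by descending score.
#[derive(Debug, Clone)]
struct TopK {
    k: usize,
    hits: Vec<Hit>,
}

impl TopK {
    fn new(k: usize) -> Self {
        TopK { k, hits: Vec::new() }
    }

    /// Record one hit, keeping only the best `k`.
    fn collect(&mut self, hit: Hit) {
        self.hits.push(hit);
        self.keep_best();
    }

    /// Merge the state of another shard into this one.
    fn merge(mut self, other: TopK) -> TopK {
        self.hits.extend(other.hits);
        self.keep_best();
        self
    }

    // Sorting on every call is simple but not efficient; a binary heap
    // would be the usual choice outside of a sketch.
    fn keep_best(&mut self) {
        self.hits.sort_by(|a, b| b.score.total_cmp(&a.score));
        self.hits.truncate(self.k);
    }
}

fn main() {
    let mut shard0 = TopK::new(3);
    shard0.collect(Hit { score: 0.9, shard: 0, doc: 1 });
    shard0.collect(Hit { score: 0.2, shard: 0, doc: 2 });
    shard0.collect(Hit { score: 0.5, shard: 0, doc: 3 });

    let mut shard1 = TopK::new(3);
    shard1.collect(Hit { score: 0.7, shard: 1, doc: 1 });
    shard1.collect(Hit { score: 0.95, shard: 1, doc: 4 });

    for hit in shard0.merge(shard1).hits {
        println!("{hit:?}");
    }
}
```

Because merging only needs the per-shard state and not the underlying index, such a state could be serialized and sent over the network by a layer built on top of the library.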
Tantivy works on stable Rust (>= 1.27) and supports Linux, MacOS, and Windows.
- Tantivy's simple search example
- tantivy-cli and its tutorial: tantivy-cli is an actual command line interface that makes it easy for you to create a search engine, index documents, and search via the CLI or a small server with a REST API. It walks you through getting a Wikipedia search engine up and running in a few minutes.
- Reference doc for the last released version
There are many ways to support this project.
- Use Tantivy and tell us about your experience on Gitter or by email (firstname.lastname@example.org)
- Report bugs
- Write a blog post
- Help with documentation by asking questions or submitting PRs
- Contribute code (you can join our Gitter)
- Talk about Tantivy around you
We use the GitHub Pull Request workflow: reference a GitHub ticket and/or include a comprehensive commit message when opening a PR.
Tantivy compiles on stable Rust but requires Rust >= 1.27.
To check out and run tests, you can simply run:
git clone https://github.com/tantivy-search/tantivy.git
cd tantivy
cargo build
Some tests will not run with just cargo test because of
To run the tests exhaustively, run
You might find it useful to step through the programme with a debugger.
Make sure you haven't run cargo clean after the most recent cargo test or cargo build, so that the target/ directory exists. Use this bash script to find the name of the most recent debug build of Tantivy and run it under rust-gdb:
find target/debug/ -maxdepth 1 -executable -type f -name "tantivy*" -printf '%TY-%Tm-%Td %TT %p\n' | sort -r | cut -d " " -f 3 | xargs -I RECENT_DBG_TANTIVY rust-gdb RECENT_DBG_TANTIVY
Now that you are in rust-gdb, you can set breakpoints on lines and methods that match your source code and run the debug executable with the flags that you normally pass to cargo test, like this:
$gdb run --test-threads 1 --test $NAME_OF_TEST
rustc compiles everything in the examples/ directory in debug mode. This makes it easy for you to make examples to reproduce bugs:
rust-gdb target/debug/examples/$EXAMPLE_NAME
$ gdb run