
libsimple

Rust bindings to simple, a SQLite3 FTS5 tokenizer that supports Chinese and Pinyin

13 unstable releases (4 breaking)

Uses new Rust 2024

new 0.5.0 Apr 24, 2025
0.4.0 Mar 6, 2025
0.3.7 Mar 6, 2025
0.3.4 Oct 6, 2024
0.3.1 Jul 25, 2024


MIT license



Description

Rust bindings to simple, a SQLite3 FTS5 tokenizer that supports Chinese and Pinyin.

Usage

Add this to your Cargo.toml:

[dependencies]
libsimple = "~0.5"

Example

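use anyhow::Result;
use tempfile::tempdir;

fn main() -> Result<()> {
    // Register the tokenizer as a SQLite auto extension, so connections
    // opened afterwards can use it.
    libsimple::enable_auto_extension()?;
    // Write the bundled jieba dictionary files into a temporary directory.
    let dir = tempdir()?;
    libsimple::release_dict(&dir)?;

    let conn = rusqlite::Connection::open_in_memory()?;
    // Point this connection at the dictionary directory.
    libsimple::set_dict(&conn, &dir)?;

    conn.execute_batch("
        CREATE VIRTUAL TABLE d USING fts5(id, text, tokenize = 'simple');
        INSERT INTO d (id, text) VALUES (1, '中华人民共和国国歌');
        INSERT INTO d (id, text) VALUES (2, '周杰伦');
    ")?;
    // jieba_query segments the Chinese search string into words before matching.
    assert_eq!(1, conn.query_row(
        "SELECT id FROM d WHERE text MATCH jieba_query('中华国歌')",
        [], |row| row.get::<_, i64>(0)
    )?);
    // simple_query also matches Pinyin input; 'zhoujiel' is a prefix of 'zhoujielun'.
    assert_eq!(2, conn.query_row(
        "SELECT id FROM d WHERE text MATCH simple_query('zhoujiel')",
        [], |row| row.get::<_, i64>(0)
    )?);
    Ok(())
}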

License

Licensed under the MIT license (LICENSE or http://opensource.org/licenses/MIT).

Version map

Each libsimple version is compatible with the following rusqlite version:

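| libsimple version | rusqlite version |
| ----------------- | ---------------- |
| =0.5.0            | ~0.35            |
| =0.4.0            | ~0.34            |
| =0.3.7            | ~0.34            |
| =0.3.6            | ~0.33            |
| =0.3.5            | ~0.33            |
| =0.3.4            | ~0.32            |
| =0.3.3            | ~0.32            |
| =0.3.2            | ~0.32            |
| =0.3.1            | ~0.32            |
| =0.3.0            | ~0.31            |
| =0.2.2            | ~0.31            |
| =0.2.1            | ~0.31            |
| =0.2.0            | ~0.31            |
| =0.1.0            | ~0.31            |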
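
As a rough illustration of how the map might be consulted when pinning both crates to a matching pair, the sketch below transcribes the rows into a lookup table. The names `VERSION_MAP` and `rusqlite_requirement` are hypothetical and are not part of libsimple; the data is copied from the map as listed here and would need updating for later releases.

```rust
/// Compatible rusqlite requirement for each libsimple release,
/// transcribed from the version map (newest first).
const VERSION_MAP: &[(&str, &str)] = &[
    ("0.5.0", "~0.35"),
    ("0.4.0", "~0.34"),
    ("0.3.7", "~0.34"),
    ("0.3.6", "~0.33"),
    ("0.3.5", "~0.33"),
    ("0.3.4", "~0.32"),
    ("0.3.3", "~0.32"),
    ("0.3.2", "~0.32"),
    ("0.3.1", "~0.32"),
    ("0.3.0", "~0.31"),
    ("0.2.2", "~0.31"),
    ("0.2.1", "~0.31"),
    ("0.2.0", "~0.31"),
    ("0.1.0", "~0.31"),
];

/// Hypothetical helper (not provided by libsimple): returns the rusqlite
/// requirement for an exact libsimple version, or `None` if it is unlisted.
fn rusqlite_requirement(libsimple_version: &str) -> Option<&'static str> {
    VERSION_MAP
        .iter()
        .find(|(version, _)| *version == libsimple_version)
        .map(|(_, requirement)| *requirement)
}
```

For example, `rusqlite_requirement("0.5.0")` yields `Some("~0.35")`, which corresponds to declaring `rusqlite = "~0.35"` next to `libsimple = "=0.5.0"` in Cargo.toml.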

Generate CMRC

This is only required when simple/contrib/pinyin.txt is updated. Normal users can skip this step.

cd simple && mkdir build && cd build
cmake .. -DBUILD_SQLITE3=off -DSIMPLE_WITH_JIEBA=off -DBUILD_TEST_EXAMPLE=off
make
cp -f _cmrc/include/cmrc/cmrc.hpp ../../cmrc/include/cmrc/cmrc.hpp
cp -f __cmrc_PINYIN_TEXT/lib.cpp ../../cmrc/pinyin.txt/lib.cpp
cp -f __cmrc_PINYIN_TEXT/intermediate/contrib/pinyin.txt.cpp ../../cmrc/pinyin.txt/pinyin.txt.cpp
cd .. && rm -r build && cd ..
