17 unstable releases (3 breaking)

✓ Uses Rust 2018 edition

new 0.4.1 May 30, 2020
0.4.0 May 22, 2020
0.3.5 Apr 30, 2020
0.3.4 Feb 25, 2020
0.1.6 Feb 7, 2020

#117 in Text processing

Download history (downloads per week, 2020-02-08 to 2020-05-23): between 21 and 844 per week, peaking at 844/week @ 2020-05-02; 307/week @ 2020-05-23

819 downloads per month
Used in 9 crates (3 directly)

MIT license

10MB
6K SLoC

F* 5K SLoC // 0.4% comments Rust 1K SLoC // 0.1% comments

Lindera

License: MIT Join the chat at https://gitter.im/lindera-morphology/lindera

A Japanese morphological analysis library in Rust. This project is a fork of fulmicoton's kuromoji-rs.

Lindera aims to be a library that is easy to install and provides concise APIs for various Rust applications.

Build

The following tools are required to build Lindera:

  • Rust >= 1.39.0
  • make >= 3.81

% cargo build --release

Usage

Basic example

This example covers the basic usage of Lindera.

It will:

  • Create a tokenizer in normal mode
  • Tokenize the input text
  • Output the tokens

use lindera::tokenizer::Tokenizer;

fn main() -> std::io::Result<()> {
    // create tokenizer
    let mut tokenizer = Tokenizer::new("normal", "");

    // tokenize the text
    let tokens = tokenizer.tokenize("関西国際空港限定トートバッグ");

    // output the tokens
    for token in tokens {
        println!("{}", token.text);
    }

    Ok(())
}

The above example can be run as follows:

% cargo run --example basic_example

It prints the following result:

関西国際空港
限定
トートバッグ
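
The basic example prints one token per line. If a single space-separated line is more convenient, the token texts can be joined after tokenization. The following is a minimal sketch using only the standard library: `join_tokens` is a hypothetical helper, not part of Lindera's API, and the input is the token text copied from the result above rather than a live call to the tokenizer. It assumes the `text` field of each token can be borrowed as a `&str`.

```rust
/// Joins token surface forms with `sep`.
/// `join_tokens` is a hypothetical helper, not part of Lindera's API.
fn join_tokens<'a, I>(texts: I, sep: &str) -> String
where
    I: IntoIterator<Item = &'a str>,
{
    // Collect the borrowed strings, then join them with the separator.
    texts.into_iter().collect::<Vec<&str>>().join(sep)
}

fn main() {
    // Token texts copied from the result shown above. With Lindera, these
    // would come from the `text` field of each token (assumed to be usable
    // as `&str`).
    let texts = vec!["関西国際空港", "限定", "トートバッグ"];

    // prints "関西国際空港 限定 トートバッグ"
    println!("{}", join_tokens(texts, " "));
}
```

Taking any iterator of `&str` keeps the helper independent of the token type, so it should need no changes if the token struct gains or loses fields.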

API reference

The API reference is available at the following URL:

Dependencies

~7.5MB
~133K SLoC