3 unstable releases

✓ Uses Rust 2018 edition

0.2.2 Feb 26, 2020
0.2.1 Feb 11, 2020
0.1.0 Feb 10, 2020

#35 in Machine learning

46 downloads per month

MIT license

2.5MB
262 lines

NNSplit Rust Bindings


Fast, robust sentence splitting with bindings for Python, Rust, and JavaScript, and pretrained models for English and German.

Installation

Add NNSplit as a dependency to your Cargo.toml:

[dependencies]
# ...
nnsplit = "<version>"
# ...

Usage

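use nnsplit::NNSplit;

// `failure::Fallible` comes from the `failure` crate, which must also be listed in Cargo.toml.
fn main() -> failure::Fallible<()> {
    // Load the prepackaged English model.
    let splitter = NNSplit::new("en")?;

    // `split` takes a list of texts; note the missing punctuation between the two sentences.
    let input = vec!["This is a test This is another test."];
    println!("{:#?}", splitter.split(input));

    Ok(())
}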

Models for German (NNSplit::new("de")) and English (NNSplit::new("en")) come prepackaged with NNSplit. You can also load your own model with NNSplit::from_model(model: tch::CModule).

Advanced

Run cargo test to test the NNSplit Rust bindings. The bindings also come with a simple CLI example that splits the text passed on the command line.

cargo run --example cli -- <text> <language>

For example:

cargo run --example cli -- "This is a test This is another test." en
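
The sketch below illustrates how a command of the form <text> <language> could be parsed and validated before the text is handed to a splitter. It is not the example shipped with the bindings: CliArgs, parse_args, and SUPPORTED_LANGUAGES are hypothetical names introduced here, and the only facts taken from this README are the argument order and the two prepackaged languages ("en" and "de"). It uses the standard library only and stops short of calling NNSplit.

```rust
use std::env;
use std::process;

/// Language codes of the prepackaged models mentioned in this README.
/// (Hypothetical constant for this sketch.)
const SUPPORTED_LANGUAGES: [&str; 2] = ["en", "de"];

/// Parsed command-line arguments (hypothetical struct for this sketch).
#[derive(Debug, PartialEq)]
struct CliArgs {
    text: String,
    language: String,
}

/// Parse `<text> <language>` from the arguments that follow the program name.
fn parse_args(args: &[String]) -> Result<CliArgs, String> {
    match args {
        // Exactly two arguments: the text, then the language code.
        [text, language] => {
            if SUPPORTED_LANGUAGES.iter().any(|l| *l == language.as_str()) {
                Ok(CliArgs {
                    text: text.clone(),
                    language: language.clone(),
                })
            } else {
                Err(format!(
                    "unsupported language `{}`; expected one of {:?}",
                    language, SUPPORTED_LANGUAGES
                ))
            }
        }
        // Any other argument count is a usage error.
        _ => Err(String::from("usage: cli <text> <language>")),
    }
}

fn main() {
    // Skip the program name; `cargo run --example cli -- a b` yields ["a", "b"].
    let args: Vec<String> = env::args().skip(1).collect();
    match parse_args(&args) {
        Ok(cli) => {
            // A real CLI would create the splitter for `cli.language` here
            // and print the sentences it finds in `cli.text`.
            println!("text: {:?}, language: {:?}", cli.text, cli.language);
        }
        Err(message) => {
            eprintln!("{}", message);
            process::exit(1);
        }
    }
}
```

Validating the language code up front keeps the failure mode a short usage message rather than an error raised while loading a model.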

To benchmark the Rust bindings, run cargo bench.
