#nlp #language #processing #natural

yanked so_many_words

Not Linear Programming

19 releases

0.1.12 Jul 10, 2020
0.1.11 Jul 8, 2020
0.0.8 Jul 6, 2020

MIT/Apache

34KB
744 lines

So Many Words!!

This crate writes a lot of words, though not so much for reading. It may eventually be useful for directed automatic translation.

cargo run --bin tokenize [input]
cargo run --bin stem [language] [input]
cargo run --bin detect [input]
cargo run --bin eudex [input]
cargo run --bin build_phoneme [language] [input]
cargo run --bin search_phoneme [terms]
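
To give a rough idea of what the tokenize step in the list above does conceptually, here is a minimal sketch of a word tokenizer. It is an assumption for illustration only: the function name `tokenize` and its behavior are hypothetical and are not taken from so_many_words, whose actual implementation may differ considerably.

```rust
/// Split text into lowercase word tokens.
/// Hypothetical helper for illustration; not the crate's tokenizer.
fn tokenize(text: &str) -> Vec<String> {
    text
        // Treat every non-alphanumeric character as a separator.
        .split(|c: char| !c.is_alphanumeric())
        // Consecutive separators produce empty pieces; drop them.
        .filter(|piece| !piece.is_empty())
        // Normalize case so "Words" and "words" compare equal.
        .map(|piece| piece.to_lowercase())
        .collect()
}
```

Calling `tokenize("So Many Words!!")` yields the tokens `so`, `many`, and `words`. This naive approach also splits on apostrophes and hyphens, so a real tokenizer would likely need language-aware rules.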

Partial support for these languages: Arabic, Danish, Dutch, English, French, German, Greek, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish, Tamil, Turkish

If you would like to contribute to this project, or more generally to any Open NLP Project, then check out my TODO page for open issues.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in so_many_words by you shall be dual licensed under the MIT and Apache 2.0 licenses, without any additional terms or conditions.