10 releases

0.3.2	Jan 5, 2022
0.3.1	Jan 5, 2022
0.2.0	Jan 5, 2022
0.1.5	Apr 10, 2021
0.1.3	Mar 24, 2021

#1462 in Text processing

36 downloads per month
Used in lingo

MIT license

18KB
422 lines

textcat-rs

Library to extract N-Grams from text. This is a low-level library. Lingo is built on top of it to detect human languages in text.
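
To make the idea of N-Gram extraction concrete, here is a minimal sketch of counting character N-Grams. It uses only the Rust standard library. The function name `count_ngrams` and the choice to lowercase the input are assumptions made for illustration and are not taken from the textcat-rs API.

```rust
use std::collections::HashMap;

/// Count character n-grams of length `n` in `text`.
/// Hypothetical helper for illustration; not part of the textcat-rs API.
fn count_ngrams(text: &str, n: usize) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    // `windows(0)` would panic, so treat n == 0 as "no n-grams".
    if n == 0 {
        return counts;
    }
    // Work on chars rather than bytes so non-ASCII text is split correctly.
    // Lowercasing is an assumption of this sketch.
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    for window in chars.windows(n) {
        let gram: String = window.iter().collect();
        *counts.entry(gram).or_insert(0) += 1;
    }
    counts
}
```

For example, `count_ngrams("banana", 2)` counts "an" twice, "na" twice, and "ba" once. A text shorter than `n` characters yields an empty map.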

This library provides tools to extract N-Grams from sample texts and to create and train categories from them. The trained data can be serialized for later use. The library also provides tools to detect which pretrained category a given text is closest to.
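
The train-then-detect workflow described above can be sketched as follows. This is not the textcat-rs implementation: it assumes the classic "out-of-place" rank distance from the original TextCat approach, and the names `Profile`, `train`, `distance`, and `closest`, as well as the trigram size and the limit and penalty of 300, are hypothetical choices for illustration. Serialization is left out.

```rust
use std::collections::HashMap;

/// Ranked n-gram profile: maps each kept n-gram to its rank (0 = most frequent).
/// Hypothetical type for illustration; not the textcat-rs API.
struct Profile {
    ranks: HashMap<String, usize>,
}

impl Profile {
    /// "Train" a profile from sample text using character n-grams of
    /// length `n`, keeping the `limit` most frequent ones.
    fn train(sample: &str, n: usize, limit: usize) -> Profile {
        let chars: Vec<char> = sample.to_lowercase().chars().collect();
        let mut counts: HashMap<String, usize> = HashMap::new();
        // `n.max(1)` avoids the panic that `windows(0)` would cause.
        for window in chars.windows(n.max(1)) {
            *counts.entry(window.iter().collect()).or_insert(0) += 1;
        }
        let mut grams: Vec<(String, usize)> = counts.into_iter().collect();
        // Descending count, then alphabetical, so ranking is deterministic.
        grams.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        let ranks = grams
            .into_iter()
            .take(limit)
            .enumerate()
            .map(|(rank, (gram, _))| (gram, rank))
            .collect();
        Profile { ranks }
    }

    /// Out-of-place distance: sum of rank differences, with a fixed
    /// `penalty` for each n-gram that the category does not contain.
    fn distance(&self, category: &Profile, penalty: usize) -> usize {
        self.ranks
            .iter()
            .map(|(gram, &rank)| match category.ranks.get(gram) {
                Some(&other) => rank.abs_diff(other),
                None => penalty,
            })
            .sum()
    }
}

/// Return the name of the category whose profile is closest to `text`.
/// Trigrams and the value 300 are assumptions of this sketch.
fn closest<'a>(text: &str, categories: &'a [(&'a str, Profile)]) -> Option<&'a str> {
    let doc = Profile::train(text, 3, 300);
    categories
        .iter()
        .min_by_key(|(_, profile)| doc.distance(profile, 300))
        .map(|(name, _)| *name)
}
```

To use it, build a slice of `(name, Profile::train(sample, 3, 300))` pairs, one per category, and pass it to `closest` together with the text to classify. A smaller distance means the text's most frequent N-Grams rank similarly to those of the category. `closest` returns `None` only when no categories are given.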

Dependencies

~1–2MB
~38K SLoC