2 releases

0.0.2 Nov 5, 2024
0.0.1 Oct 24, 2024

MIT license

15MB
177K SLoC

common-words-all

Most common words sorted by ngram frequency.

Available in the following languages:

  • Chinese
  • English
  • French
  • German
  • Hebrew
  • Italian
  • Russian
  • Spanish

Available ngram sizes:

  • 1
  • 2
  • 3
  • 4
  • 5

Usage

Get the top 10 English 1-grams (single words):

let top = get_top(Language::English, 10, NgramSize::One);
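
The README does not show the crate's import path or the return type of get_top, so the following is only a self-contained sketch of the call shape above. The Language and NgramSize enums, the get_top signature, the Vec<&'static str> return type, and the five-word table are all hypothetical stand-ins defined locally for illustration; they are not the crate's actual definitions or data.

```rust
// Hypothetical stand-ins: the crate's real types and return type may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Language {
    English,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum NgramSize {
    One,
}

// Placeholder table, assumed to be sorted by descending frequency.
// These five entries are illustrative only, not the crate's dataset.
const ENGLISH_ONE: &[&str] = &["the", "of", "and", "to", "in"];

/// Return up to `n` entries from the front of the frequency-sorted table
/// for the given language and ngram size.
fn get_top(language: Language, n: usize, size: NgramSize) -> Vec<&'static str> {
    let table = match (language, size) {
        (Language::English, NgramSize::One) => ENGLISH_ONE,
    };
    table.iter().copied().take(n).collect()
}

fn main() {
    let top = get_top(Language::English, 3, NgramSize::One);
    // Print each ngram with its 1-based rank.
    for (rank, word) in top.iter().enumerate() {
        println!("{}. {}", rank + 1, word);
    }
}
```

In this sketch, asking for more entries than the table holds returns the whole table rather than failing; whether the real crate behaves the same way is not stated in the README.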

Examples

Simple

Select the language and ngram size with Cargo features. This command disables the default features and enables only english and one:

cargo run --example simple --no-default-features -F english -F one --release
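
The same feature selection can presumably be made when depending on the crate from another project. A sketch of the Cargo.toml entry, assuming the english and one feature names from the command above apply to dependents as well, and using the 0.0.2 release:

```toml
[dependencies]
# Disable default features, then enable only the English 1-gram data.
common-words-all = { version = "0.0.2", default-features = false, features = ["english", "one"] }
```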

Data

Google Books Ngram dataset, version 20200217.

License

MIT

© 2024, Eugene Hauptmann

No runtime deps