
kanji

A library for the handling and analysis of Japanese text, particularly Kanji

4 stable releases

2.0.0 Mar 19, 2022
1.1.0 Aug 25, 2020
1.0.1 Jun 15, 2020
1.0.0 May 31, 2020


MIT license

55KB
760 lines

Kanji


A library for the handling and analysis of Japanese text, particularly Kanji. It can be used to find the density of Kanji in given texts according to their Level classification, as defined by the Japan Kanji Aptitude Testing Foundation (日本漢字能力検定協会).

The Kanji data presented here matches the Foundation's official charts from February 2020. Note that the Foundation reassigned the levels of some Kanji as of 2020 (pdf).

See the documentation for further explanation and usage examples.

For the Haskell version of this library, see here.


kanjiは日本文を分析するライブラリです。漢字を中心とし、日本漢字能力検定協会が指定する「級」に従って文の中の漢字の密度や難度を計算する事ができます。

「級」自体は2020年2月現在。注意:協会の2月の報告によるといくつかの級の配当漢字に変更がありました。

ライブラリの詳しい使い方はドキュメンテーションをご覧ください。

kanjiのHaskell版はこちら。

Example・例

To count how many Kanji of each exam level appear in a text:

ある文の漢字はどの級に所属するかを計算するには:

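// Build the Kanji-to-level lookup table once and reuse it for every text.
let level_table = kanji::level_table();
let texts = vec![
    "非常に面白い文章",
    "誰でも読んだ事のある名作",
    "飛行機で空を飛ぶ",
];

// Print the per-level Kanji counts for each text.
for t in texts {
    let counts = kanji::kanji_counts(t, &level_table);
    println!("{:#?}", counts);
}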
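The introduction also mentions measuring the density of Kanji in a text. As a rough illustration of that idea, here is a minimal standalone sketch that does not use this crate at all. The helpers `is_kanji` and `kanji_density` are hypothetical names written for this example, not part of the library's API, and the check assumes a character counts as Kanji when it falls in the main CJK Unified Ideographs block (U+4E00 to U+9FFF), which ignores the extension blocks.

```rust
/// Returns true if `c` lies in the main CJK Unified Ideographs block.
/// Assumption: this range check stands in for the crate's own Kanji
/// detection and does not cover the CJK extension blocks.
fn is_kanji(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// Fraction of the characters in `text` that are Kanji.
/// Returns 0.0 for empty input to avoid dividing by zero.
fn kanji_density(text: &str) -> f64 {
    let total = text.chars().count();
    if total == 0 {
        return 0.0;
    }
    let kanji = text.chars().filter(|c| is_kanji(*c)).count();
    kanji as f64 / total as f64
}

fn main() {
    let texts = [
        "非常に面白い文章",
        "誰でも読んだ事のある名作",
        "飛行機で空を飛ぶ",
    ];

    // Print each text alongside its Kanji density.
    for t in texts.iter() {
        println!("{}: {:.3}", t, kanji_density(t));
    }
}
```

Counting by `char` rather than by byte matters here, since each Japanese character occupies several bytes in UTF-8. For per-level densities as defined by the Foundation, use the crate's own functions described in its documentation.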

Dependencies

~165KB