84 releases (21 stable)

Uses new Rust 2024

new 2.3.2 Mar 14, 2026
2.2.0 Feb 10, 2026
1.5.1 Jan 3, 2026
1.5.0 Dec 30, 2025
0.12.2 Mar 23, 2022

#2377 in Text processing


70,699 downloads per month
Used in 15 crates (via lindera)

MIT license

400KB
8K SLoC

Lindera CC-CEDICT


Dictionary version

This repository contains CC-CEDICT-MeCab.

Dictionary format

Refer to the manual for details on the CC-CEDICT-MeCab dictionary format and part-of-speech tags.

Index Name (Chinese) Name (English) Notes
0 表面形式 Surface
1 左语境ID Left context ID
2 右语境ID Right context ID
3 成本 Cost
4 词类 Part-of-speech
5 词类1 Part-of-speech subcategory 1
6 词类2 Part-of-speech subcategory 2
7 词类3 Part-of-speech subcategory 3
8 拼音 Pinyin
9 繁体字 Traditional
10 简体字 Simplified
11 定义 Definition
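
As a rough illustration of the column layout above, the following sketch parses one comma-separated row into a struct using only the standard library. The names `DictEntry` and `parse_entry` are hypothetical and not part of this crate's API, the integer types are a guess, and the sketch assumes that no field before the definition contains a comma or CSV quoting. Because the definition is the last column, `splitn(12, ',')` leaves any commas inside it untouched.

```rust
/// One dictionary row, following the 12-column layout listed above.
/// Hypothetical type for illustration only; not part of the crate's API.
#[derive(Debug, PartialEq)]
struct DictEntry {
    surface: String,
    left_context_id: i32,
    right_context_id: i32,
    cost: i32,
    /// Part-of-speech followed by its three subcategories (columns 4 to 7).
    pos: [String; 4],
    pinyin: String,
    traditional: String,
    simplified: String,
    definition: String,
}

/// Parses a single row. Assumes plain comma separation with no quoting.
fn parse_entry(line: &str) -> Result<DictEntry, String> {
    // Split into at most 12 pieces so the definition keeps its own commas.
    let cols: Vec<&str> = line.splitn(12, ',').collect();
    if cols.len() != 12 {
        return Err(format!("expected 12 columns, found {}", cols.len()));
    }
    // Helper that parses a numeric column and reports its index on failure.
    let num = |i: usize| -> Result<i32, String> {
        cols[i]
            .trim()
            .parse::<i32>()
            .map_err(|e| format!("column {}: {}", i, e))
    };
    Ok(DictEntry {
        surface: cols[0].to_string(),
        left_context_id: num(1)?,
        right_context_id: num(2)?,
        cost: num(3)?,
        pos: [
            cols[4].to_string(),
            cols[5].to_string(),
            cols[6].to_string(),
            cols[7].to_string(),
        ],
        pinyin: cols[8].to_string(),
        traditional: cols[9].to_string(),
        simplified: cols[10].to_string(),
        definition: cols[11].to_string(),
    })
}
```

Calling `parse_entry` on a made-up row such as `你好,0,0,1000,*,*,*,*,ni3 hao3,你好,你好,hello` yields an entry whose `surface` is `你好` and whose `cost` is `1000`. The context IDs, cost, and part-of-speech values in that row are invented for the example and are not taken from the dictionary.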

User dictionary format (CSV)

Simple version

Index Name (Chinese) Name (English) Notes
0 表面形式 Surface
1 词类 Part-of-speech
2 拼音 Pinyin
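
The three-column layout above can be produced with a small helper. This is a minimal sketch using only the standard library. The function name `simple_user_row` is hypothetical, and rejecting empty fields and embedded commas is an assumption of this sketch (it does no CSV quoting), not a rule stated by the dictionary.

```rust
/// Formats one row of the simple user dictionary: surface, part-of-speech, pinyin.
/// Hypothetical helper for illustration; not part of the crate's API.
fn simple_user_row(surface: &str, pos: &str, pinyin: &str) -> Result<String, String> {
    let fields = [
        ("surface", surface),
        ("part-of-speech", pos),
        ("pinyin", pinyin),
    ];
    for (name, value) in fields {
        if value.is_empty() {
            return Err(format!("{} must not be empty", name));
        }
        // This sketch does no quoting, so separators inside a field are rejected.
        if value.contains(|c: char| matches!(c, ',' | '\n' | '\r')) {
            return Err(format!("{} must not contain a comma or line break", name));
        }
    }
    Ok(format!("{},{},{}", surface, pos, pinyin))
}
```

For example, `simple_user_row("你好", "*", "ni3 hao3")` returns `Ok` with the row `你好,*,ni3 hao3`. The part-of-speech value `*` is a placeholder chosen for the example.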

Detailed version

Index Name (Chinese) Name (English) Notes
0 表面形式 Surface
1 左语境ID Left context ID
2 右语境ID Right context ID
3 成本 Cost
4 词类 Part-of-speech
5 词类1 Part-of-speech subcategory 1
6 词类2 Part-of-speech subcategory 2
7 词类3 Part-of-speech subcategory 3
8 拼音 Pinyin
9 繁体字 Traditional
10 简体字 Simplified
11 定义 Definition
12 - - Additional fields can be freely appended from index 12 onward.
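
The two user dictionary layouts can be told apart by their column count. The sketch below shows one way to do that with the standard library. The names `UserRow` and `classify_user_row` are hypothetical, and treating exactly 3 columns as the simple version and 12 or more as the detailed version is an assumption based on the tables above, not a description of how Lindera itself loads the file. It also assumes plain comma separation, so a definition containing commas would be split into extra fields.

```rust
/// A user dictionary row in either of the two layouts described above.
/// Hypothetical type for illustration; not part of the crate's API.
#[derive(Debug, PartialEq)]
enum UserRow<'a> {
    /// Simple version: surface, part-of-speech, pinyin.
    Simple {
        surface: &'a str,
        pos: &'a str,
        pinyin: &'a str,
    },
    /// Detailed version: the 12 fixed columns plus any freely added extras.
    Detailed {
        fields: Vec<&'a str>,
        extra: Vec<&'a str>,
    },
}

/// Classifies a row by its number of comma-separated columns.
fn classify_user_row(line: &str) -> Result<UserRow<'_>, String> {
    let cols: Vec<&str> = line.split(',').collect();
    match cols.len() {
        3 => Ok(UserRow::Simple {
            surface: cols[0],
            pos: cols[1],
            pinyin: cols[2],
        }),
        n if n >= 12 => {
            // Columns 0 to 11 are fixed; everything from index 12 on is extra.
            let (fields, extra) = cols.split_at(12);
            Ok(UserRow::Detailed {
                fields: fields.to_vec(),
                extra: extra.to_vec(),
            })
        }
        n => Err(format!("expected 3 or at least 12 columns, found {}", n)),
    }
}
```

A row with 13 columns is classified as `Detailed` with one entry in `extra`, while a row with 4 columns is rejected.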

API reference

The API reference is available at the following URL:

Dependencies

~14–23MB
~457K SLoC