50 releases (26 breaking)
| Version | Release date |
|---|---|
| 0.32.3 | Mar 18, 2025 |
| 0.32.2 | Jun 29, 2024 |
| 0.31.0 | May 28, 2024 |
| 0.29.0 | Mar 18, 2024 |
| 0.3.2 | Feb 20, 2020 |
Lindera UniDic Builder
UniDic builder for Lindera.
Dictionary version
This repository contains unidic-mecab.
Dictionary format
Refer to the manual for details on the unidic-mecab dictionary format and part-of-speech tags.
| Index | Name (Japanese) | Name (English) | Notes |
|---|---|---|---|
| 0 | 表層形 | Surface | |
| 1 | 左文脈ID | Left context ID | |
| 2 | 右文脈ID | Right context ID | |
| 3 | コスト | Cost | |
| 4 | 品詞大分類 | Major POS classification | |
| 5 | 品詞中分類 | Middle POS classification | |
| 6 | 品詞小分類 | Small POS classification | |
| 7 | 品詞細分類 | Fine POS classification | |
| 8 | 活用型 | Conjugation type | |
| 9 | 活用形 | Conjugation form | |
| 10 | 語彙素読み | Lexeme reading | |
| 11 | 語彙素(語彙素表記 + 語彙素細分類) | Lexeme | |
| 12 | 書字形出現形 | Orthography appearance type | |
| 13 | 発音形出現形 | Pronunciation appearance type | |
| 14 | 書字形基本形 | Orthography basic type | |
| 15 | 発音形基本形 | Pronunciation basic type | |
| 16 | 語種 | Word type | |
| 17 | 語頭変化型 | Prefix of a word type | |
| 18 | 語頭変化形 | Prefix of a word form | |
| 19 | 語末変化型 | Suffix of a word type | |
| 20 | 語末変化形 | Suffix of a word form | |
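
As a rough illustration of how the column indices above map onto a row, the following sketch pulls a few fields out of one comma-separated line. The names `EntrySummary` and `summarize_row` are hypothetical and not part of the Lindera API, and the sketch assumes that fields contain no quoted commas, which may not hold for every row of the real dictionary.

```rust
/// Selected 0-based column indices, taken from the table above.
const SURFACE: usize = 0;
const COST: usize = 3;
const MAJOR_POS: usize = 4;
const LEXEME_READING: usize = 10;
/// Number of columns listed in the table above (indices 0 to 20).
const FIELD_COUNT: usize = 21;

/// A few fields from one dictionary row (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
struct EntrySummary {
    surface: String,
    cost: i32,
    major_pos: String,
    lexeme_reading: String,
}

/// Splits a row on commas and picks out the selected columns.
/// Assumption: no field contains a quoted comma; a real CSV parser
/// would be needed to handle that case.
/// Returns None if the row has fewer than 21 fields or the cost is not an integer.
fn summarize_row(line: &str) -> Option<EntrySummary> {
    let fields: Vec<&str> = line.trim_end().split(',').collect();
    if fields.len() < FIELD_COUNT {
        return None;
    }
    Some(EntrySummary {
        surface: fields[SURFACE].to_string(),
        cost: fields[COST].parse().ok()?,
        major_pos: fields[MAJOR_POS].to_string(),
        lexeme_reading: fields[LEXEME_READING].to_string(),
    })
}
```

Calling `summarize_row` on a line with 21 or more fields returns `Some(EntrySummary)`; shorter lines yield `None`.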
User dictionary format (CSV)
Simple version
| Index | Name (Japanese) | Name (English) | Notes |
|---|---|---|---|
| 0 | 表層形 | Surface | |
| 1 | 品詞大分類 | Major POS classification | |
| 2 | 語彙素読み | Lexeme reading | |
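
To make the three-column layout concrete, here is a minimal sketch that renders one simple-format entry as a CSV line. `SimpleEntry` and `to_csv_line` are hypothetical names introduced for illustration, and the sketch does no quoting or escaping, so it simply rejects values that contain a comma or a line break.

```rust
/// One simple-format user dictionary entry (hypothetical type, for illustration only).
struct SimpleEntry<'a> {
    surface: &'a str,
    major_pos: &'a str,
    lexeme_reading: &'a str,
}

/// Renders the entry as one CSV line in column order 0, 1, 2.
/// Returns None if any value contains a comma or a line break,
/// because this sketch does not quote or escape fields.
fn to_csv_line(entry: &SimpleEntry) -> Option<String> {
    let fields = [entry.surface, entry.major_pos, entry.lexeme_reading];
    if fields
        .iter()
        .any(|f| f.contains(',') || f.contains('\n') || f.contains('\r'))
    {
        return None;
    }
    Some(fields.join(","))
}
```

An entry with surface `東京スカイツリー`, major POS `名詞`, and reading `トウキョウスカイツリー` is rendered as those three values joined by commas.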
Detailed version
| Index | Name (Japanese) | Name (English) | Notes |
|---|---|---|---|
| 0 | 表層形 | Surface | |
| 1 | 左文脈ID | Left context ID | |
| 2 | 右文脈ID | Right context ID | |
| 3 | コスト | Cost | |
| 4 | 品詞大分類 | Major POS classification | |
| 5 | 品詞中分類 | Middle POS classification | |
| 6 | 品詞小分類 | Small POS classification | |
| 7 | 品詞細分類 | Fine POS classification | |
| 8 | 活用型 | Conjugation type | |
| 9 | 活用形 | Conjugation form | |
| 10 | 語彙素読み | Lexeme reading | |
| 11 | 語彙素(語彙素表記 + 語彙素細分類) | Lexeme | |
| 12 | 書字形出現形 | Orthography appearance type | |
| 13 | 発音形出現形 | Pronunciation appearance type | |
| 14 | 書字形基本形 | Orthography basic type | |
| 15 | 発音形基本形 | Pronunciation basic type | |
| 16 | 語種 | Word type | |
| 17 | 語頭変化型 | Prefix of a word type | |
| 18 | 語頭変化形 | Prefix of a word form | |
| 19 | 語末変化型 | Suffix of a word type | |
| 20 | 語末変化形 | Suffix of a word form | |
| 21 | - | - | Additional fields may be appended freely from index 21 onward. |
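
The two layouts differ in column count: 3 fields for the simple version, and 21 or more for the detailed version. The sketch below classifies a row on that basis. It assumes that field count alone is enough to tell the formats apart, which this document does not confirm as the builder's actual rule. `RowKind` and `classify_row` are hypothetical names, and quoted commas are not handled.

```rust
/// Field counts taken from the two tables above.
const SIMPLE_FIELDS: usize = 3;
const DETAILED_FIELDS: usize = 21;

/// Which user dictionary layout a row appears to follow (hypothetical type).
#[derive(Debug, PartialEq)]
enum RowKind {
    Simple,
    /// Detailed row; `extra_fields` counts columns at index 21 and beyond.
    Detailed { extra_fields: usize },
    /// Neither 3 fields nor at least 21 fields.
    Invalid,
}

/// Classifies a row purely by its number of comma-separated fields.
/// Assumption: no field contains a quoted comma.
fn classify_row(line: &str) -> RowKind {
    let count = line.trim_end().split(',').count();
    if count == SIMPLE_FIELDS {
        RowKind::Simple
    } else if count >= DETAILED_FIELDS {
        RowKind::Detailed {
            extra_fields: count - DETAILED_FIELDS,
        }
    } else {
        RowKind::Invalid
    }
}
```

Reporting `extra_fields` separately keeps the freely expandable tail visible to the caller instead of silently discarding it.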
How to use the UniDic dictionary
For more details about the lindera command, refer to the following URL:
API reference
The API reference is available at the following URL: