#japanese #morphological #dictionary #builder #ipadic

lindera-ipadic-builder

A Japanese morphological dictionary builder for IPADIC

45 releases (21 breaking)

0.23.0 Feb 23, 2023
0.21.0 Jan 22, 2023
0.19.2 Dec 27, 2022
0.18.0 Oct 26, 2022
0.3.2 Feb 20, 2020

#706 in Text processing

Download history: between 1,749 and 2,959 downloads per week from 2022-11-26 to 2023-03-11

11,436 downloads per month
Used in 20 crates (2 directly)

MIT license

69KB
1.5K SLoC

Lindera IPADIC Builder

License: MIT

Join the chat at https://gitter.im/lindera-morphology/lindera

IPADIC dictionary builder for Lindera. This project is a fork of kuromoji-rs.

Dictionary version

This repository contains mecab-ipadic-2.7.0-20070801.

Dictionary format

Refer to the manual for details on the IPADIC dictionary format and part-of-speech tags.

| Index | Name (Japanese) | Name (English) | Notes |
| --- | --- | --- | --- |
| 0 | 表層形 | Surface | |
| 1 | 左文脈ID | Left context ID | |
| 2 | 右文脈ID | Right context ID | |
| 3 | コスト | Cost | |
| 4 | 品詞 | Major POS classification | |
| 5 | 品詞細分類1 | Middle POS classification | |
| 6 | 品詞細分類2 | Small POS classification | |
| 7 | 品詞細分類3 | Fine POS classification | |
| 8 | 活用型 | Conjugation type | |
| 9 | 活用形 | Conjugation form | |
| 10 | 原形 | Base form | |
| 11 | 読み | Reading | |
| 12 | 発音 | Pronunciation | |
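
As a rough illustration of the layout above, the following sketch splits one row into named fields. It is not the builder's actual parser: the `IpadicEntry` struct, the `parse_ipadic_row` function, and the sample row are hypothetical, the input is assumed to be already decoded to UTF-8, and the naive split on `,` assumes no quoted fields.

```rust
/// One row of an IPADIC-style CSV, following the 13-column layout above.
/// Hypothetical type for illustration; not part of the lindera API.
#[allow(dead_code)]
#[derive(Debug)]
struct IpadicEntry {
    surface: String,
    left_context_id: i32,
    right_context_id: i32,
    cost: i32,
    /// Columns 4 to 7: the four POS classification levels.
    pos: [String; 4],
    conjugation_type: String,
    conjugation_form: String,
    base_form: String,
    reading: String,
    pronunciation: String,
}

/// Splits a single CSV line on commas and maps columns 0..=12 to fields.
/// Assumes UTF-8 input with no quoted or escaped commas.
fn parse_ipadic_row(line: &str) -> Result<IpadicEntry, String> {
    let cols: Vec<&str> = line.trim().split(',').collect();
    if cols.len() < 13 {
        return Err(format!("expected at least 13 columns, found {}", cols.len()));
    }
    // Helper that parses a numeric column and reports its index on failure.
    let int = |i: usize| -> Result<i32, String> {
        cols[i]
            .parse::<i32>()
            .map_err(|e| format!("column {}: {}", i, e))
    };
    Ok(IpadicEntry {
        surface: cols[0].to_string(),
        left_context_id: int(1)?,
        right_context_id: int(2)?,
        cost: int(3)?,
        pos: [
            cols[4].to_string(),
            cols[5].to_string(),
            cols[6].to_string(),
            cols[7].to_string(),
        ],
        conjugation_type: cols[8].to_string(),
        conjugation_form: cols[9].to_string(),
        base_form: cols[10].to_string(),
        reading: cols[11].to_string(),
        pronunciation: cols[12].to_string(),
    })
}

fn main() {
    // Illustrative row in the 13-column layout; the numeric values are examples only.
    let row = "東京,1293,1293,3003,名詞,固有名詞,地域,一般,*,*,東京,トウキョウ,トーキョー";
    match parse_ipadic_row(row) {
        Ok(entry) => println!("{:#?}", entry),
        Err(err) => eprintln!("parse error: {}", err),
    }
}
```

A production parser would more likely rely on a CSV library to handle quoting and would decode the source encoding first; the sketch only shows how the column indexes map to fields.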

User dictionary format (CSV)

Simple version

| Index | Name (Japanese) | Name (English) | Notes |
| --- | --- | --- | --- |
| 0 | 表層形 | Surface | |
| 1 | 品詞 | Major POS classification | |
| 2 | 読み | Reading | |
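
To make the three-column layout concrete, here is a minimal sketch that reads simple user dictionary rows into a struct. The `SimpleUserEntry` type, the `parse_simple_row` function, and the sample entries are hypothetical and only illustrate the column order; they are not taken from the builder's source.

```rust
/// A simple user dictionary row: surface, major POS, reading.
/// Hypothetical type for illustration; not part of the lindera API.
#[derive(Debug, PartialEq)]
struct SimpleUserEntry {
    surface: String,
    pos: String,
    reading: String,
}

/// Returns `Some` only when the line has exactly three comma-separated columns.
fn parse_simple_row(line: &str) -> Option<SimpleUserEntry> {
    let cols: Vec<&str> = line.trim().split(',').collect();
    match cols.as_slice() {
        [surface, pos, reading] => Some(SimpleUserEntry {
            surface: surface.to_string(),
            pos: pos.to_string(),
            reading: reading.to_string(),
        }),
        _ => None,
    }
}

fn main() {
    // Illustrative user dictionary content, one entry per line.
    let csv = "東京スカイツリー,カスタム名詞,トウキョウスカイツリー\n\
               とうきょうスカイツリー駅,カスタム名詞,トウキョウスカイツリーエキ";
    for line in csv.lines() {
        match parse_simple_row(line) {
            Some(entry) => println!("{:?}", entry),
            None => eprintln!("skipping malformed row: {}", line),
        }
    }
}
```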

Detailed version

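| Index | Name (Japanese) | Name (English) | Notes |
| --- | --- | --- | --- |
| 0 | 表層形 | Surface | |
| 1 | 左文脈ID | Left context ID | |
| 2 | 右文脈ID | Right context ID | |
| 3 | コスト | Cost | |
| 4 | 品詞 | POS | |
| 5 | 品詞細分類1 | POS subcategory 1 | |
| 6 | 品詞細分類2 | POS subcategory 2 | |
| 7 | 品詞細分類3 | POS subcategory 3 | |
| 8 | 活用型 | Conjugation type | |
| 9 | 活用形 | Conjugation form | |
| 10 | 原形 | Base form | |
| 11 | 読み | Reading | |
| 12 | 発音 | Pronunciation | |
| 13+ | - | - | Columns from index 13 onward can be added freely. |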
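
One way to tell the two user dictionary layouts apart is by column count. The sketch below assumes that a row with exactly 3 columns is a simple entry and that a row with 13 or more columns is a detailed entry whose trailing columns are free-form extensions. That rule is inferred from the tables above, and the `UserRowKind` enum and `classify_user_row` function are hypothetical names, not the builder's API.

```rust
/// Classification of a user dictionary row by its column count.
/// Hypothetical type for illustration; not part of the lindera API.
#[derive(Debug, PartialEq)]
enum UserRowKind {
    /// Exactly 3 columns: surface, POS, reading.
    Simple,
    /// 13 fixed columns plus `extra_columns` free-form ones.
    Detailed { extra_columns: usize },
    /// Any other column count.
    Invalid { found: usize },
}

/// Counts comma-separated columns (no quoting support) and classifies the row.
fn classify_user_row(line: &str) -> UserRowKind {
    let n = line.trim().split(',').count();
    match n {
        3 => UserRowKind::Simple,
        n if n >= 13 => UserRowKind::Detailed { extra_columns: n - 13 },
        n => UserRowKind::Invalid { found: n },
    }
}

fn main() {
    // Illustrative rows; the context IDs and cost are example values only.
    let rows = [
        "東京スカイツリー,カスタム名詞,トウキョウスカイツリー",
        "東京スカイツリー,1288,1288,-1000,名詞,固有名詞,一般,*,*,*,東京スカイツリー,トウキョウスカイツリー,トウキョウスカイツリー",
        "東京スカイツリー,カスタム名詞",
    ];
    for row in rows {
        println!("{:?}", classify_user_row(row));
    }
}
```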

How to use IPADIC dictionary

For more details about the lindera command, please refer to the following URL:

API reference

The API reference is available at the following URL:

Dependencies

~8.5MB
~222K SLoC