lindera-ipadic-neologd-builder

A Japanese morphological dictionary builder for IPADIC NEologd

15 releases (7 breaking)

new 0.30.0 Apr 13, 2024
0.29.0 Mar 18, 2024
0.28.0 Feb 23, 2024
0.27.2 Dec 30, 2023
0.1.2 Feb 20, 2020

MIT license

Lindera IPADIC NEologd Builder

License: MIT. Join the chat at https://gitter.im/lindera-morphology/lindera

IPADIC NEologd dictionary builder for Lindera. This project is forked from kuromoji-rs.

Dictionary version

This repository contains mecab-ipadic-neologd.

Dictionary format

Refer to the manual for details on the IPADIC dictionary format and part-of-speech tags.

| Index | Name (Japanese) | Name (English) | Notes |
|---|---|---|---|
| 0 | 表層形 | Surface | |
| 1 | 左文脈ID | Left context ID | |
| 2 | 右文脈ID | Right context ID | |
| 3 | コスト | Cost | |
| 4 | 品詞 | Major POS classification | |
| 5 | 品詞細分類1 | Middle POS classification | |
| 6 | 品詞細分類2 | Small POS classification | |
| 7 | 品詞細分類3 | Fine POS classification | |
| 8 | 活用型 | Conjugation type | |
| 9 | 活用形 | Conjugation form | |
| 10 | 原形 | Base form | |
| 11 | 読み | Reading | |
| 12 | 発音 | Pronunciation | |
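
As a rough illustration of the 13-column layout above, the following sketch reads one CSV line into a struct. The `DictEntry` type and `parse_entry` function are hypothetical names made up for this example and are not part of Lindera's API. The sketch also assumes that fields contain no quoted or embedded commas, which a real CSV reader would have to handle.

```rust
/// One entry in the 13-column layout described above.
/// Hypothetical type for illustration; not Lindera's own data structure.
#[derive(Debug, PartialEq)]
struct DictEntry {
    surface: String,
    left_context_id: u32,
    right_context_id: u32,
    cost: i32,
    /// Indexes 4 to 7: the four POS classification levels.
    pos: [String; 4],
    conjugation_type: String,
    conjugation_form: String,
    base_form: String,
    reading: String,
    pronunciation: String,
}

/// Parses one line by splitting on commas.
/// Assumption: no quoted fields and no commas inside a field.
fn parse_entry(line: &str) -> Result<DictEntry, String> {
    let f: Vec<&str> = line.trim_end().split(',').collect();
    if f.len() < 13 {
        return Err(format!("expected at least 13 fields, found {}", f.len()));
    }
    let left_context_id = f[1]
        .parse::<u32>()
        .map_err(|e| format!("left context ID: {}", e))?;
    let right_context_id = f[2]
        .parse::<u32>()
        .map_err(|e| format!("right context ID: {}", e))?;
    // Costs may be negative, so a signed integer is used here.
    let cost = f[3].parse::<i32>().map_err(|e| format!("cost: {}", e))?;
    Ok(DictEntry {
        surface: f[0].to_string(),
        left_context_id,
        right_context_id,
        cost,
        pos: [
            f[4].to_string(),
            f[5].to_string(),
            f[6].to_string(),
            f[7].to_string(),
        ],
        conjugation_type: f[8].to_string(),
        conjugation_form: f[9].to_string(),
        base_form: f[10].to_string(),
        reading: f[11].to_string(),
        pronunciation: f[12].to_string(),
    })
}
```

Calling `parse_entry("東京,1293,1293,3003,名詞,固有名詞,地域,一般,*,*,東京,トウキョウ,トーキョー")` would yield an entry whose `surface` is `東京` and whose `cost` is `3003`. The numeric values in that line are made up for the example and are not taken from the dictionary.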

User dictionary format (CSV)

Simple version

| Index | Name (Japanese) | Name (English) | Notes |
|---|---|---|---|
| 0 | 表層形 | Surface | |
| 1 | 品詞 | Major POS classification | |
| 2 | 読み | Reading | |
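
To make the three-column simple format concrete, here is a minimal sketch that builds one row from its parts. The helper name `simple_user_row` is hypothetical and not part of Lindera. Because the sketch does no CSV quoting, it rejects empty fields and fields that contain a comma or a line break.

```rust
/// Builds one row of the simple user dictionary format: surface, POS, reading.
/// Hypothetical helper for illustration; not part of Lindera.
fn simple_user_row(surface: &str, pos: &str, reading: &str) -> Result<String, String> {
    let fields = [surface, pos, reading];
    for field in fields.iter() {
        // No quoting is done here, so separators inside a field are rejected.
        let has_separator = field.contains(|c: char| c == ',' || c == '\n' || c == '\r');
        if field.is_empty() || has_separator {
            return Err(format!("invalid field: {:?}", field));
        }
    }
    Ok(fields.join(","))
}
```

For example, `simple_user_row("東京スカイツリー", "カスタム名詞", "トウキョウスカイツリー")` returns the row `東京スカイツリー,カスタム名詞,トウキョウスカイツリー`. The POS label in this example is an arbitrary custom value chosen for illustration.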

Detailed version

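| Index | Name (Japanese) | Name (English) | Notes |
|---|---|---|---|
| 0 | 表層形 | Surface | |
| 1 | 左文脈ID | Left context ID | |
| 2 | 右文脈ID | Right context ID | |
| 3 | コスト | Cost | |
| 4 | 品詞 | POS | |
| 5 | 品詞細分類1 | POS subcategory 1 | |
| 6 | 品詞細分類2 | POS subcategory 2 | |
| 7 | 品詞細分類3 | POS subcategory 3 | |
| 8 | 活用型 | Conjugation type | |
| 9 | 活用形 | Conjugation form | |
| 10 | 原形 | Base form | |
| 11 | 読み | Reading | |
| 12 | 発音 | Pronunciation | |
| 13 | - | - | Fields from index 13 onward can be added freely. |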
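
One way to tell the two user dictionary formats apart is by field count: three fields for the simple version, thirteen or more for the detailed version. The sketch below assumes exactly that rule, which is inferred from the tables above and may not match how Lindera's own loader decides. `UserRow` and `classify_user_row` are hypothetical names, and the sketch again assumes no quoted or embedded commas.

```rust
/// A user dictionary row in one of the two layouts described above.
/// Hypothetical type for illustration; not part of Lindera.
#[derive(Debug, PartialEq)]
enum UserRow<'a> {
    /// Simple version: surface, POS, reading.
    Simple {
        surface: &'a str,
        pos: &'a str,
        reading: &'a str,
    },
    /// Detailed version: the 13 fixed fields plus any fields from index 13 onward.
    Detailed {
        core: Vec<&'a str>,
        extra: Vec<&'a str>,
    },
}

/// Classifies a row by its number of comma-separated fields.
/// Assumption: 3 fields means simple, 13 or more means detailed.
fn classify_user_row<'a>(line: &'a str) -> Result<UserRow<'a>, String> {
    let fields: Vec<&str> = line.trim_end().split(',').collect();
    match fields.len() {
        3 => Ok(UserRow::Simple {
            surface: fields[0],
            pos: fields[1],
            reading: fields[2],
        }),
        n if n >= 13 => {
            // Everything from index 13 onward is free-form extension data.
            let extra = fields[13..].to_vec();
            let mut core = fields;
            core.truncate(13);
            Ok(UserRow::Detailed { core, extra })
        }
        n => Err(format!("expected 3 or at least 13 fields, found {}", n)),
    }
}
```

A row with 14 fields is classified as `Detailed` with 13 core fields and one extra field, while a row with 4 fields is rejected as neither format.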

How to use the IPADIC NEologd dictionary

For more details about the lindera command, refer to the following URL:

API reference

The API reference is available at the following URL:
