17 releases (9 breaking)

0.32.2 Jun 29, 2024
0.30.0 Apr 13, 2024
0.29.0 Mar 18, 2024
0.27.2 Dec 30, 2023
0.1.2 Feb 20, 2020

#1314 in Text processing

12,187 downloads per month

MIT license

150KB
3K SLoC

Lindera IPADIC NEologd Builder

Join the chat at https://gitter.im/lindera-morphology/lindera

IPADIC NEologd dictionary builder for Lindera. This project is forked from kuromoji-rs.

Dictionary version

This repository contains mecab-ipadic-neologd.

Dictionary format

Refer to the manual for details on the IPADIC dictionary format and part-of-speech tags.

| Index | Name (Japanese) | Name (English) | Notes |
| --- | --- | --- | --- |
| 0 | 表層形 | Surface | |
| 1 | 左文脈ID | Left context ID | |
| 2 | 右文脈ID | Right context ID | |
| 3 | コスト | Cost | |
| 4 | 品詞 | Major POS classification | |
| 5 | 品詞細分類1 | Middle POS classification | |
| 6 | 品詞細分類2 | Small POS classification | |
| 7 | 品詞細分類3 | Fine POS classification | |
| 8 | 活用型 | Conjugation type | |
| 9 | 活用形 | Conjugation form | |
| 10 | 原形 | Base form | |
| 11 | 読み | Reading | |
| 12 | 発音 | Pronunciation | |
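
To make the column layout above concrete, here is a minimal sketch in Rust that reads one dictionary row into a struct. The names `IpadicEntry` and `parse_ipadic_row` are hypothetical and are not part of Lindera's API. The sketch also assumes that fields contain no quoted commas, so a plain split on `,` is enough, and the row used in the usage note below carries illustrative values only.

```rust
/// One row of an IPADIC-style dictionary CSV.
/// Hypothetical type for illustration; not Lindera's own data structure.
#[derive(Debug, PartialEq)]
pub struct IpadicEntry {
    pub surface: String,          // index 0
    pub left_context_id: u16,     // index 1
    pub right_context_id: u16,    // index 2
    pub cost: i16,                // index 3
    pub pos: [String; 4],         // indices 4-7, major to fine classification
    pub conjugation_type: String, // index 8
    pub conjugation_form: String, // index 9
    pub base_form: String,        // index 10
    pub reading: String,          // index 11
    pub pronunciation: String,    // index 12
}

/// Parses a single CSV line with exactly 13 comma-separated fields.
/// Assumption: no field contains a quoted comma.
pub fn parse_ipadic_row(line: &str) -> Result<IpadicEntry, String> {
    let f: Vec<&str> = line.trim_end().split(',').collect();
    if f.len() != 13 {
        return Err(format!("expected 13 fields, found {}", f.len()));
    }
    let number_error = |name: &str| format!("field `{}` is not a valid number", name);
    Ok(IpadicEntry {
        surface: f[0].to_string(),
        left_context_id: f[1]
            .parse::<u16>()
            .map_err(|_| number_error("left context ID"))?,
        right_context_id: f[2]
            .parse::<u16>()
            .map_err(|_| number_error("right context ID"))?,
        cost: f[3].parse::<i16>().map_err(|_| number_error("cost"))?,
        pos: [
            f[4].to_string(),
            f[5].to_string(),
            f[6].to_string(),
            f[7].to_string(),
        ],
        conjugation_type: f[8].to_string(),
        conjugation_form: f[9].to_string(),
        base_form: f[10].to_string(),
        reading: f[11].to_string(),
        pronunciation: f[12].to_string(),
    })
}
```

For example, calling `parse_ipadic_row("東京,1293,1293,3003,名詞,固有名詞,地域,一般,*,*,東京,トウキョウ,トーキョー")` yields an entry whose `surface` is `東京` and whose `reading` is `トウキョウ`. A row with any other number of fields is rejected with an error.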

User dictionary format (CSV)

Simple version

| Index | Name (Japanese) | Name (English) | Notes |
| --- | --- | --- | --- |
| 0 | 表層形 | Surface | |
| 1 | 品詞 | Major POS classification | |
| 2 | 読み | Reading | |
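
As a rough illustration of the simple format, the following Rust sketch parses a three-field user dictionary row. The type `SimpleUserEntry` and the function `parse_simple_user_row` are hypothetical names introduced here, not Lindera's API. Trimming whitespace and rejecting an empty surface are assumptions of this sketch, not documented behavior.

```rust
/// A simple user dictionary row: surface, major POS, reading.
/// Hypothetical type for illustration; not part of Lindera's API.
#[derive(Debug, PartialEq)]
pub struct SimpleUserEntry {
    pub surface: String,
    pub pos: String,
    pub reading: String,
}

/// Parses one line of the simple (3-field) user dictionary format.
/// Assumption: fields contain no quoted commas.
pub fn parse_simple_user_row(line: &str) -> Result<SimpleUserEntry, String> {
    let fields: Vec<&str> = line.trim().split(',').map(str::trim).collect();
    match fields.as_slice() {
        // Exactly three fields with a non-empty surface form.
        [surface, pos, reading] if !surface.is_empty() => Ok(SimpleUserEntry {
            surface: surface.to_string(),
            pos: pos.to_string(),
            reading: reading.to_string(),
        }),
        // Three fields, but the surface form is empty.
        [_, _, _] => Err("surface must not be empty".to_string()),
        other => Err(format!("expected 3 fields, found {}", other.len())),
    }
}
```

A row such as `東京スカイツリー,カスタム名詞,トウキョウスカイツリー` (an illustrative entry) parses into a `SimpleUserEntry` with those three values in order.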

Detailed version

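| Index | Name (Japanese) | Name (English) | Notes |
| --- | --- | --- | --- |
| 0 | 表層形 | Surface | |
| 1 | 左文脈ID | Left context ID | |
| 2 | 右文脈ID | Right context ID | |
| 3 | コスト | Cost | |
| 4 | 品詞 | POS | |
| 5 | 品詞細分類1 | POS subcategory 1 | |
| 6 | 品詞細分類2 | POS subcategory 2 | |
| 7 | 品詞細分類3 | POS subcategory 3 | |
| 8 | 活用型 | Conjugation type | |
| 9 | 活用形 | Conjugation form | |
| 10 | 原形 | Base form | |
| 11 | 読み | Reading | |
| 12 | 発音 | Pronunciation | |
| 13 | - | - | Fields from index 13 onward can be added freely. |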
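
One way to tell the two user dictionary formats apart is by field count. The Rust sketch below assumes that a row with exactly 3 fields is the simple version and a row with 13 or more fields is the detailed version, with everything from index 13 onward kept as free-form extras. `UserRowKind` and `classify_user_row` are hypothetical names, and this rule is an assumption for illustration, not a description of how Lindera itself decides.

```rust
/// Which user dictionary format a row appears to use.
/// Hypothetical type for illustration; not part of Lindera's API.
#[derive(Debug, PartialEq)]
pub enum UserRowKind<'a> {
    /// Exactly 3 fields: surface, POS, reading.
    Simple,
    /// At least 13 fields; `extra` holds any fields from index 13 onward.
    Detailed { extra: Vec<&'a str> },
}

/// Classifies a CSV row by its number of comma-separated fields.
/// Assumption: fields contain no quoted commas.
pub fn classify_user_row(line: &str) -> Result<UserRowKind<'_>, String> {
    let fields: Vec<&str> = line.trim_end().split(',').collect();
    match fields.len() {
        3 => Ok(UserRowKind::Simple),
        n if n >= 13 => Ok(UserRowKind::Detailed {
            // Empty when the row has exactly 13 fields.
            extra: fields[13..].to_vec(),
        }),
        n => Err(format!("expected 3 or at least 13 fields, found {}", n)),
    }
}
```

With this rule, a detailed row that carries one extra trailing field such as `memo` is classified as `Detailed` with `extra` containing that single value, while a 4-field row is reported as an error.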

How to use IPADIC dictionary

For more details about the lindera command, please refer to the following URL:

API reference

The API reference is available at the following URL:

Dependencies

~9MB
~213K SLoC