2 unstable releases

new 0.2.0 Jan 30, 2025
0.1.0 Jan 25, 2025

#3 in #katakana

97 downloads per month

MIT/Apache

125KB
4.5K SLoC

jp-deinflector

Introduction

This is a Rust crate for deinflecting Japanese words, optimized for maximum performance. It currently exposes a function deinflect(word: &str) -> Vec<String> that returns a list of possible deinflections of the input word. Because the crate does no dictionary lookups, the list can (and will) contain false deinflections, that is, candidates that are not real words.

There is also a function kata_to_hira(kata: &str) that converts all katakana characters in kata into their hiragana counterparts.
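To make the conversion concrete, here is a minimal sketch of how such a mapping can work. It is not the crate's implementation: the name kata_to_hira_sketch and the String return type are assumptions made for illustration. It relies on the Unicode layout, in which the katakana U+30A1 to U+30F6 sit exactly 0x60 code points above their hiragana counterparts.

```rust
/// Hypothetical stand-in for illustration only, not the crate's code.
/// Maps each katakana character in U+30A1..=U+30F6 to the hiragana
/// character 0x60 code points below it; everything else is kept as-is.
fn kata_to_hira_sketch(kata: &str) -> String {
    kata.chars()
        .map(|c| match c {
            // Katakana small A through small KE have direct hiragana twins.
            '\u{30A1}'..='\u{30F6}' => char::from_u32(c as u32 - 0x60).unwrap_or(c),
            // Kanji, hiragana, the prolonged sound mark, ASCII, etc. pass through.
            _ => c,
        })
        .collect()
}
```

Under these assumptions, calling kata_to_hira_sketch("タベル") yields "たべる", while characters outside the mapped range, such as the prolonged sound mark in "ラーメン", are left unchanged.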

This crate is meant for use in dictionary applications to obtain a list of possible deinflections that can then be looked up in a dictionary.
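The intended workflow can be sketched as a filter over the candidate list. The function filter_known and the HashSet-backed dictionary below are hypothetical; a real application would substitute its own dictionary lookup, and the candidates would come from deinflect rather than being written by hand.

```rust
use std::collections::HashSet;

/// Hypothetical helper: keeps only the candidates that exist in the
/// dictionary, preserving their original order. The HashSet stands in
/// for whatever dictionary backend the application actually uses.
fn filter_known(candidates: Vec<String>, dictionary: &HashSet<&str>) -> Vec<String> {
    candidates
        .into_iter()
        .filter(|candidate| dictionary.contains(candidate.as_str()))
        .collect()
}
```

Given candidates such as 食べる and the non-word 食ぶ, only the entries present in the dictionary survive, which is how false deinflections get discarded.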

Performance

This crate stores its deinflection rules in a perfect hash table, so each rule lookup takes constant time with no collisions to resolve. A single deinflection usually completes in the nanosecond range.
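The following sketch illustrates the general idea of table-driven deinflection. It is not the crate's rule set or algorithm: it uses a standard HashMap as a stand-in for the perfect hash table, applies rules in a single pass only, and the names toy_rules and deinflect_sketch as well as the two rules are assumptions chosen for illustration.

```rust
use std::collections::HashMap;

type Rules = HashMap<&'static str, &'static [&'static str]>;

/// Two toy rules (hypothetical, far from a complete rule set):
/// an inflected ending maps to the dictionary-form endings it may come from.
fn toy_rules() -> Rules {
    let mut rules: Rules = HashMap::new();
    rules.insert("た", &["る"]); // e.g. 食べた -> 食べる
    rules.insert("んだ", &["む", "ぬ", "ぶ"]); // e.g. 飲んだ -> 飲む
    rules
}

/// Tries every suffix of `word` against the table and, on a hit,
/// swaps the suffix for each possible dictionary-form ending.
fn deinflect_sketch(word: &str, rules: &Rules) -> Vec<String> {
    let mut out = Vec::new();
    // char_indices yields byte offsets that are valid split points.
    for (idx, _) in word.char_indices() {
        let (stem, suffix) = word.split_at(idx);
        if let Some(endings) = rules.get(suffix) {
            for ending in endings.iter() {
                out.push(format!("{}{}", stem, ending));
            }
        }
    }
    out
}
```

With these toy rules, 飲んだ produces 飲む, 飲ぬ and 飲ぶ. Only the first is a real word, which shows why the output needs a dictionary check. Each table probe is a single hash lookup, so the cost per word grows with the number of suffixes tried rather than with the number of rules.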

Dependencies

~0.7–1.4MB
~29K SLoC