#sentence #chinese #pinyin #hanzi #characters #match #parser

match-pinyin-with-hanzi

Checks whether a sentence written in Chinese characters (汉字) matches its pinyin (拼音) transcription. Erhua is supported.

5 releases

0.1.4 Jan 6, 2024
0.1.3 Aug 22, 2021
0.1.2 Jun 28, 2021
0.1.1 Jun 28, 2021
0.1.0 Jun 28, 2021

#711 in Text processing

MIT license

6KB
61 lines

match-pinyin-with-hanzi

How can I check that a sentence written in Chinese characters (汉字) matches the same sentence written in pinyin (拼音)? Well, first I have to parse the pinyin (which is not so easy), then I have to iterate over the Chinese characters... wait, 儿 might or might not attach to the previous syllable (erhua)...
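To make the 儿 problem concrete, here is a minimal sketch of matching whitespace-separated syllables against characters when 儿 may either keep its own syllable (nǎ ér) or merge into the previous one (nǎr). It is not this crate's implementation: `reading` is a hypothetical three-entry lookup standing in for a real dictionary, `matches_with_erhua` is an illustrative name, tones must match exactly, and punctuation is not handled.

```rust
/// Hypothetical mini dictionary: one reading per character.
/// A real matcher would need a full character-to-pinyin table.
fn reading(c: char) -> Option<&'static str> {
    match c {
        '哪' => Some("nǎ"),
        '花' => Some("huā"),
        '儿' => Some("ér"),
        _ => None,
    }
}

/// Returns true if every character is covered by exactly one syllable,
/// except that a following 儿 may be absorbed as a trailing "r".
fn matches_with_erhua(pinyin: &str, hanzi: &str) -> bool {
    let syllables: Vec<&str> = pinyin.split_whitespace().collect();
    let chars: Vec<char> = hanzi.chars().collect();
    let (mut s, mut c) = (0, 0);

    while c < chars.len() {
        // Unknown character or too few syllables: no match.
        let Some(expected) = reading(chars[c]) else { return false };
        let Some(&syllable) = syllables.get(s) else { return false };

        let next_is_er = chars.get(c + 1) == Some(&'儿');
        if next_is_er && syllable.strip_suffix('r') == Some(expected) {
            // Merged erhua: one syllable consumes this character and the 儿.
            c += 2;
        } else if syllable == expected {
            // Plain case: one syllable, one character.
            c += 1;
        } else {
            return false;
        }
        s += 1;
    }

    // Leftover syllables mean the pinyin is longer than the characters.
    s == syllables.len()
}
```

With this sketch, both `matches_with_erhua("nǎr", "哪儿")` and `matches_with_erhua("nǎ ér", "哪儿")` return true, while `matches_with_erhua("nǎ", "哪儿")` returns false because 儿 is left without a syllable.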

This crate takes care of all that mess. With it, all you need is:

use match_pinyin_with_hanzi::match_pinyin_with_hanzi;

// unwrap() panics here if the crate reports a mismatch.
match_pinyin_with_hanzi(
    "māmā qí mǎ, mǎ màn, māma mà mǎ.",
    "妈妈骑马,马慢,妈妈骂马。"
).unwrap();

Note that both māmā and māma match 妈妈. This crate assumes that any syllable can lose its tone, so a toneless pinyin syllable is always accepted for a Chinese character.
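The tone rule can be pictured with a small sketch. It is an illustration only, not the crate's code: `strip_tone` and `syllable_matches` are hypothetical names, and the sketch assumes tone-marked vowels arrive as precomposed characters (NFC) rather than as a base letter plus a combining mark.

```rust
/// Maps a tone-marked pinyin vowel to its plain form.
/// Every other character is returned unchanged.
fn strip_tone_char(c: char) -> char {
    match c {
        'ā' | 'á' | 'ǎ' | 'à' => 'a',
        'ē' | 'é' | 'ě' | 'è' => 'e',
        'ī' | 'í' | 'ǐ' | 'ì' => 'i',
        'ō' | 'ó' | 'ǒ' | 'ò' => 'o',
        'ū' | 'ú' | 'ǔ' | 'ù' => 'u',
        'ǖ' | 'ǘ' | 'ǚ' | 'ǜ' => 'ü',
        other => other,
    }
}

/// Removes the tone mark from a single syllable, e.g. "mā" becomes "ma".
fn strip_tone(syllable: &str) -> String {
    syllable.chars().map(strip_tone_char).collect()
}

/// `reference` is the reading expected for a character (assumed to come
/// from some dictionary). `written` is accepted if it equals that reading
/// exactly or equals the reading with its tone dropped.
fn syllable_matches(written: &str, reference: &str) -> bool {
    written == reference || written == strip_tone(reference)
}
```

Under these assumptions, `syllable_matches("ma", "mā")` and `syllable_matches("mā", "mā")` both hold, while `syllable_matches("mà", "mā")` does not: a syllable may lose its tone, but it may not swap it for a different one.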

Dependencies

~1.5MB
~35K SLoC