2 releases

| Version | Date |
|---|---|
| 0.1.1 | Jan 25, 2024 |
| 0.1.0 | Nov 26, 2023 |
Nomnom 🥘
Nado - CLI
A small utility that converts the cedict_ts.u8 file into a JSON or CSV file. Additional features:
- Add pinyin with tone marks (accents) based on these rules
- Add the HSK level of each character, fetched from mandarinbean. The HSK 7-9 levels are parsed from a different website by wohok
- Add zhuyin support based on these conversion rules (link)
- Add Wade-Giles support based on these conversion rules (link)
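To make the conversion step more concrete, here is a minimal sketch of how a single line of cedict_ts.u8 could be parsed before being serialized to JSON or CSV. It relies only on the public CC-CEDICT line layout (`Traditional Simplified [pinyin] /definition/definition/`). The `Entry` struct and the `parse_line` function are hypothetical names chosen for illustration; they are not taken from this project and may differ from how the tool actually does it.

```rust
/// One parsed CC-CEDICT entry.
/// Hypothetical struct for illustration, not the crate's own `Item` type.
#[derive(Debug, PartialEq)]
struct Entry {
    traditional: String,
    simplified: String,
    pinyin: String,
    translations: Vec<String>,
}

/// Parse a line such as `我 我 [wo3] /I/me/my/`.
/// Returns `None` for comment lines (starting with '#'), empty lines
/// and lines that do not match the expected layout.
fn parse_line(line: &str) -> Option<Entry> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    // The two headwords come before the first '[';
    // the pinyin sits between that '[' and the next ']'.
    let (heads, rest) = line.split_once('[')?;
    let (pinyin, defs) = rest.split_once(']')?;

    let mut words = heads.split_whitespace();
    let traditional = words.next()?.to_string();
    let simplified = words.next()?.to_string();

    // Definitions are wrapped in and separated by '/'.
    let translations = defs
        .split('/')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect();

    Some(Entry {
        traditional,
        simplified,
        pinyin: pinyin.to_string(),
        translations,
    })
}
```

Calling `parse_line` on every line of the file and collecting the `Some` values would give a list of entries ready to be written out in either format.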
Usage
Clone this project and run one of the cargo commands below. If needed, I can provide the generated JSON and CSV files.
JSON
cargo run -- generate -e ../cedict_ts.u8 -o ../cedict.json -f json
CSV
cargo run -- generate -e ../cedict_ts.u8 -o ../cedict.csv -f csv
Dodo - Lib
A small crate is also available. It provides a set of utility methods for interacting with the cedict and doing some pinyin conversion. Below is how you can use the crate to load the cedict:
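use dodo_zh::KeyVariant;

fn main() {
    // Path to the CC-CEDICT source file; adjust the value (and its type,
    // if the function expects something other than a string) to your setup
    let path = "../cedict_ts.u8";
    // The KeyVariant can be either Traditional or Simplified Chinese
    let cedict = dodo_zh::load_cedict_dictionary(path, KeyVariant::Traditional).unwrap();
    // Returns an Item struct
    let wo = cedict.items.get("我").unwrap();
    // println! needs a format string; this assumes `translations` implements Debug
    println!("{:?}", wo.translations);
}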
A set of examples is available to show how to do some pinyin manipulation, for instance converting pinyin with tone numbers into pinyin with tone marks.
You can run the example with the following command:
cargo run --example pinyin
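For readers who want to see what the tone-number to tone-mark conversion involves, below is a minimal standalone sketch. It is not the crate's implementation, and the function names `numbered_to_marked` and `convert_phrase` are hypothetical. It assumes lowercase syllables, the CC-CEDICT convention of writing ü as `u:`, and the usual placement rule: the mark goes on `a` or `e` if present, on the `o` of `ou`, and otherwise on the last vowel.

```rust
/// Convert one numbered syllable (e.g. "hao3") to its tone-marked form.
/// Hypothetical helper for illustration; handles lowercase input only.
fn numbered_to_marked(syllable: &str) -> String {
    // CC-CEDICT writes ü as "u:"; normalise it first.
    let s = syllable.replace("u:", "ü");

    // Read the trailing tone digit; without a digit 1-5, return the input as-is.
    let tone = match s.chars().last().and_then(|c| c.to_digit(10)) {
        Some(t @ 1..=5) => t as usize,
        _ => return s,
    };
    // The digit is ASCII, so it occupies exactly one byte.
    let body = &s[..s.len() - 1];
    if tone == 5 {
        // The neutral tone carries no mark.
        return body.to_string();
    }

    // Choose the vowel that takes the mark:
    // 'a' or 'e' first, then the 'o' of "ou", otherwise the last vowel.
    let idx = body
        .find('a')
        .or_else(|| body.find('e'))
        .or_else(|| body.find("ou"))
        .or_else(|| body.rfind(|c: char| "iouü".contains(c)));
    let idx = match idx {
        Some(i) => i,
        None => return body.to_string(), // no vowel to mark
    };

    let vowel = body[idx..].chars().next().unwrap();
    let marked = match vowel {
        'a' => "āáǎà",
        'e' => "ēéěè",
        'i' => "īíǐì",
        'o' => "ōóǒò",
        'u' => "ūúǔù",
        _ => "ǖǘǚǜ",
    };
    let mark = marked.chars().nth(tone - 1).unwrap();
    format!("{}{}{}", &body[..idx], mark, &body[idx + vowel.len_utf8()..])
}

/// Convert a space-separated pinyin string syllable by syllable.
fn convert_phrase(pinyin: &str) -> String {
    pinyin
        .split_whitespace()
        .map(numbered_to_marked)
        .collect::<Vec<_>>()
        .join(" ")
}
```

For example, `convert_phrase("ni3 hao3")` goes through each syllable in turn and places the third-tone mark on the `i` and on the `a`.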