17 releases (9 breaking)
Version | Released |
---|---|
0.10.0 | Aug 14, 2024 |
0.9.1 | Apr 29, 2024 |
0.8.1 | Mar 23, 2024 |
0.6.3 | Dec 16, 2023 |
0.2.0 | Jun 6, 2023 |
#423 in Text processing
323 downloads per month
Used in sbv2_core
330KB, 8K SLoC
jpreprocess
A Japanese text preprocessor for text-to-speech applications.
This project is a rewrite of OpenJTalk in Rust.
Usage
Add the following to Cargo.toml:
[dependencies]
jpreprocess = "0.10.0"
If you want control over how njd and jpcommon are processed, you may also need to add jpreprocess-njd and/or jpreprocess-jpcommon as dependencies.
Example
In this example, jpreprocess takes a lindera dictionary and preprocesses a text into jpcommon labels.
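use jpreprocess::*;

// `path` is the location of the lindera dictionary to load.
// The `?` operators assume this code runs inside a function that returns a `Result`.
let config = JPreprocessConfig {
    dictionary: SystemDictionaryConfig::File(path),
    user_dictionary: None,
};
let jpreprocess = JPreprocess::from_config(config)?;

// Convert the sentence into jpcommon (full-context) labels.
let jpcommon_label = jpreprocess
    .extract_fullcontext("日本語文を解析し、音声合成エンジンに渡せる形式に変換します.")?;

// Check the label at index 2 against its expected string form.
assert_eq!(
    jpcommon_label[2].to_string(),
    concat!(
        "sil^n-i+h=o",
        "/A:-3+1+7",
        "/B:xx-xx_xx",
        "/C:02_xx+xx",
        "/D:02+xx_xx",
        "/E:xx_xx!xx_xx-xx",
        "/F:7_4#0_xx@1_3|1_12",
        "/G:4_4%0_xx_1",
        "/H:xx_xx",
        "/I:3-12@1+2&1-8|1+41",
        "/J:5_29",
        "/K:2+8-41"
    )
);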
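The label string in the assertion above is a phoneme context (`sil^n-i+h=o`) followed by fields named `/A:` through `/K:`. As a rough sketch of how such a string could be inspected, the following uses only the Rust standard library. `split_label` and `current_phoneme` are hypothetical helpers written for this illustration and are not part of the jpreprocess API. The sketch assumes field values never contain `/`, and that the current phoneme sits between `-` and `+`, as in the usual HTS-style full-context layout.

```rust
/// Splits a full-context label string into its phoneme context and its
/// single-letter fields (`/A:` ... `/K:`).
/// Hypothetical helper for illustration; not part of jpreprocess.
fn split_label(label: &str) -> (&str, Vec<(char, &str)>) {
    let mut parts = label.split('/');
    // Everything before the first '/' is the phoneme context.
    let phonemes = parts.next().unwrap_or("");
    let fields = parts
        .filter_map(|part| {
            // Each remaining part looks like "A:-3+1+7".
            let (name, value) = part.split_once(':')?;
            let mut chars = name.chars();
            match (chars.next(), chars.next()) {
                // Keep only parts whose name is exactly one character.
                (Some(c), None) => Some((c, value)),
                _ => None,
            }
        })
        .collect();
    (phonemes, fields)
}

/// Returns the token between '-' and '+' in the phoneme context,
/// assumed here to be the current phoneme.
fn current_phoneme(phonemes: &str) -> Option<&str> {
    let (_, rest) = phonemes.split_once('-')?;
    let (current, _) = rest.split_once('+')?;
    Some(current)
}
```

Passing the string from the assertion to `split_label` would give `sil^n-i+h=o` as the phoneme context plus eleven named fields, and `current_phoneme` would pick out `i` from that context.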
Other examples can be found on GitHub.
Copyrights
This software includes source code from:
- OpenJTalk. Copyright (c) 2008-2016 Nagoya Institute of Technology Department of Computer Science
- Lindera. Copyright (c) 2019 by the project authors
License
BSD-3-Clause
Dependencies
~17MB, ~341K SLoC