#japanese #tts #text-to-speech #openjtalk #library

bin+lib jpreprocess

Japanese text preprocessor for text-to-speech applications (a rewrite of OpenJTalk in Rust)

15 releases (8 breaking)

0.9.0 Apr 14, 2024
0.8.1 Mar 23, 2024
0.8.0 Feb 24, 2024
0.6.3 Dec 16, 2023
0.2.0 Jun 6, 2023

#341 in Text processing


300 downloads per month

BSD-3-Clause

2MB
7.5K SLoC

jpreprocess

A Japanese text preprocessor for text-to-speech applications.

This project is a rewrite of OpenJTalk in Rust.

Usage

Add the following to Cargo.toml:

[dependencies]
jpreprocess = "0.9.0"

If you need finer control over how NJD and JPCommon are processed, you may also need to add jpreprocess-njd and/or jpreprocess-jpcommon as dependencies.
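As a rough sketch, a Cargo.toml that pulls in both companion crates might look like the following. The version numbers for jpreprocess-njd and jpreprocess-jpcommon are an assumption (that they are published in step with jpreprocess itself); check the registry for the versions that match your jpreprocess release.

```toml
[dependencies]
jpreprocess = "0.9.0"
# Assumed versions: verify that these match your jpreprocess release.
jpreprocess-njd = "0.9.0"
jpreprocess-jpcommon = "0.9.0"
```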

Example

In this example, jpreprocess loads a Lindera dictionary and converts a Japanese text into JPCommon full-context labels.

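use std::error::Error;
use std::path::PathBuf;

use jpreprocess::*;

fn main() -> Result<(), Box<dyn Error>> {
    // Placeholder path: point this at your Lindera dictionary directory.
    let path = PathBuf::from("path/to/dictionary");

    let config = JPreprocessConfig {
        dictionary: SystemDictionaryConfig::File(path),
        user_dictionary: None,
    };
    let jpreprocess = JPreprocess::from_config(config)?;

    // Convert the text into one full-context label per phoneme.
    let jpcommon_label = jpreprocess
        .extract_fullcontext("日本語文を解析し、音声合成エンジンに渡せる形式に変換します.")?;

    // Check the third label (index 2) against its expected full-context string.
    assert_eq!(
        jpcommon_label[2].to_string(),
        concat!(
            "sil^n-i+h=o",
            "/A:-3+1+7",
            "/B:xx-xx_xx",
            "/C:02_xx+xx",
            "/D:02+xx_xx",
            "/E:xx_xx!xx_xx-xx",
            "/F:7_4#0_xx@1_3|1_12",
            "/G:4_4%0_xx_1",
            "/H:xx_xx",
            "/I:3-12@1+2&1-8|1+41",
            "/J:5_29",
            "/K:2+8-41"
        )
    );

    Ok(())
}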
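
To make the structure of the label string above easier to read, here is a minimal sketch that splits a full-context label into its leading five-phoneme window (p1^p2-p3+p4=p5, where p3 is the current phoneme) and its lettered /A: through /K: sections. It uses only the standard library; split_label and PhonemeContext are hypothetical names introduced for illustration and are not part of jpreprocess.

```rust
use std::collections::BTreeMap;

/// The five-phoneme window at the start of a label: `p1^p2-p3+p4=p5`.
/// Hypothetical type for illustration; not part of jpreprocess.
#[derive(Debug, PartialEq)]
struct PhonemeContext {
    prev2: String,
    prev1: String,
    current: String,
    next1: String,
    next2: String,
}

/// Splits a label into its phoneme window and its `/X:` sections.
/// Returns `None` if the string does not have the expected shape.
/// Hypothetical helper for illustration; not part of jpreprocess.
fn split_label(label: &str) -> Option<(PhonemeContext, BTreeMap<char, String>)> {
    // Sections are separated by '/', and the first piece is the phoneme window.
    let mut parts = label.split('/');
    let head = parts.next()?;

    // Peel off the phonemes one delimiter at a time: ^ - + =
    let (prev2, rest) = head.split_once('^')?;
    let (prev1, rest) = rest.split_once('-')?;
    let (current, rest) = rest.split_once('+')?;
    let (next1, next2) = rest.split_once('=')?;

    // Each remaining piece looks like "A:-3+1+7": a one-letter key, then the raw value.
    let mut sections = BTreeMap::new();
    for part in parts {
        let (key, value) = part.split_once(':')?;
        let mut chars = key.chars();
        let letter = chars.next()?;
        if chars.next().is_some() {
            return None; // the key must be exactly one character
        }
        sections.insert(letter, value.to_string());
    }

    let context = PhonemeContext {
        prev2: prev2.to_string(),
        prev1: prev1.to_string(),
        current: current.to_string(),
        next1: next1.to_string(),
        next2: next2.to_string(),
    };
    Some((context, sections))
}

fn main() {
    // A shortened version of the label from the example above.
    let label = "sil^n-i+h=o/A:-3+1+7/B:xx-xx_xx/C:02_xx+xx";
    let (context, sections) = split_label(label).expect("label should be well formed");

    println!("current phoneme: {}", context.current);
    for (letter, value) in &sections {
        println!("{letter}: {value}");
    }
}
```

The sketch keeps each section's value as a raw string, because the delimiters inside the sections differ from one section to the next. In real code, prefer the label type returned by extract_fullcontext over re-parsing its string form.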

More examples are available in the GitHub repository.

Copyrights

This software includes source code from:

  • OpenJTalk. Copyright (c) 2008-2016 Nagoya Institute of Technology Department of Computer Science
  • Lindera. Copyright (c) 2019 by the project authors

License

BSD-3-Clause

API Reference

Dependencies

~12–16MB
~332K SLoC