bin+lib jpreprocess

Japanese text preprocessor for text-to-speech applications (a rewrite of OpenJTalk in Rust)

17 releases (9 breaking)

0.10.0 Aug 14, 2024
0.9.1 Apr 29, 2024
0.8.1 Mar 23, 2024
0.6.3 Dec 16, 2023
0.2.0 Jun 6, 2023

#423 in Text processing

323 downloads per month
Used in sbv2_core

BSD-3-Clause

330KB
8K SLoC

jpreprocess

A Japanese text preprocessor for text-to-speech applications.

This project is a rewrite of OpenJTalk in Rust.

Usage

Add the following to your Cargo.toml:

[dependencies]
jpreprocess = "0.10.0"

If you need finer control over the njd and jpcommon processing stages, you may also need to add jpreprocess-njd and/or jpreprocess-jpcommon as dependencies.

Example

In this example, jpreprocess loads a lindera dictionary and converts a Japanese sentence into jpcommon full-context labels.

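use std::error::Error;
use std::path::PathBuf;

use jpreprocess::*;

fn main() -> Result<(), Box<dyn Error>> {
    // Placeholder path: replace it with the location of your dictionary.
    let path = PathBuf::from("path/to/dictionary");

    let config = JPreprocessConfig {
        dictionary: SystemDictionaryConfig::File(path),
        user_dictionary: None,
    };
    let jpreprocess = JPreprocess::from_config(config)?;

    // Convert the text into jpcommon full-context labels.
    let jpcommon_label = jpreprocess
        .extract_fullcontext("日本語文を解析し、音声合成エンジンに渡せる形式に変換します.")?;

    // Check the third label against its expected string form.
    assert_eq!(
        jpcommon_label[2].to_string(),
        concat!(
            "sil^n-i+h=o",
            "/A:-3+1+7",
            "/B:xx-xx_xx",
            "/C:02_xx+xx",
            "/D:02+xx_xx",
            "/E:xx_xx!xx_xx-xx",
            "/F:7_4#0_xx@1_3|1_12",
            "/G:4_4%0_xx_1",
            "/H:xx_xx",
            "/I:3-12@1+2&1-8|1+41",
            "/J:5_29",
            "/K:2+8-41"
        )
    );
    Ok(())
}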

More examples are available in the project's GitHub repository.
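The label string asserted in the example follows the HTS-style full-context layout: a five-phoneme window (p1^p2-p3+p4=p5, where p3 is the current phoneme) followed by the fields /A: through /K:. The following is a minimal sketch of how such a string could be taken apart using only the standard library. The names PhonemeContext and parse_label are hypothetical helpers written for illustration and are not part of the jpreprocess API.

```rust
/// Five-phoneme window at the head of a full-context label.
/// (Hypothetical helper type, not part of jpreprocess.)
#[derive(Debug, PartialEq)]
struct PhonemeContext<'a> {
    prev2: &'a str,
    prev1: &'a str,
    current: &'a str,
    next1: &'a str,
    next2: &'a str,
}

/// Splits a label of the form `p1^p2-p3+p4=p5/A:.../B:...` into the
/// phoneme window and a list of (field letter, raw field value) pairs.
/// Returns `None` if the string does not match that layout.
fn parse_label(label: &str) -> Option<(PhonemeContext<'_>, Vec<(char, &str)>)> {
    let mut parts = label.split('/');

    // The first segment holds the phoneme window.
    let head = parts.next()?;
    let (prev2, rest) = head.split_once('^')?;
    let (prev1, rest) = rest.split_once('-')?;
    let (current, rest) = rest.split_once('+')?;
    let (next1, next2) = rest.split_once('=')?;
    let context = PhonemeContext { prev2, prev1, current, next1, next2 };

    // Every remaining segment looks like `A:-3+1+7`.
    let mut fields = Vec::new();
    for part in parts {
        let (name, value) = part.split_once(':')?;
        let mut chars = name.chars();
        let key = chars.next()?;
        if chars.next().is_some() {
            return None; // field names are expected to be a single letter
        }
        fields.push((key, value));
    }
    Some((context, fields))
}
```

Applied to the label from the example, parse_label reports "i" as the current phoneme, "sil" and "n" as the two preceding phonemes, and eleven fields from A to K. The field values are left as raw strings here; interpreting them (accent positions, mora counts and so on) is left out of this sketch.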

Copyrights

This software includes source code from:

  • OpenJTalk. Copyright (c) 2008-2016 Nagoya Institute of Technology Department of Computer Science
  • Lindera. Copyright (c) 2019 by the project authors

License

BSD-3-Clause