#tts #open-j-talk #library

jlabel-question

HTS-style full-context label question parser and matcher

5 releases

0.1.4 Mar 25, 2024
0.1.3 Mar 2, 2024
0.1.2 Feb 6, 2024
0.1.1 Feb 4, 2024
0.1.0 Jan 25, 2024

#1218 in Encoding


Used in jbonsai

BSD-3-Clause

83KB
2K SLoC

jlabel-question

Parser and matcher of htsvoice full-context-label questions.

Note: serde support is experimental. Changes may not be treated as breaking.

Usage

Put the following in Cargo.toml

[dependencies]
jlabel-question = "0.1.4"

Copyrights

The question.hed file in the tests directory comes from The Nitech Japanese Speech Database "NIT ATR503 M001".

  • Creative Commons Attribution 3.0 license
    • Copyright (c) 2003-2015 Nagoya Institute of Technology Department of Computer Science

License

BSD-3-Clause

API Reference


lib.rs:

HTS-style full-context label question parser and matcher.

The main structure for parsing and matching is AllQuestion. It can parse most patterns, but not all of them. For details, see the section "Condition for parsing as AllQuestion".

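use jlabel::Label;
use jlabel_question::{AllQuestion, QuestionMatcher};

use std::error::Error;
use std::str::FromStr;

fn main() -> Result<(), Box<dyn Error>> {
    // Both patterns are about the first element of field A (A1) with a minus sign.
    let question = AllQuestion::parse(&["*/A:-??+*", "*/A:-?+*"])?;
    let label_str = concat!(
        "sil^n-i+h=o",
        "/A:-3+1+7",
        "/B:xx-xx_xx",
        "/C:02_xx+xx",
        "/D:02+xx_xx",
        "/E:xx_xx!xx_xx-xx",
        "/F:7_4#0_xx@1_3|1_12",
        "/G:4_4%0_xx_1",
        "/H:xx_xx",
        "/I:3-12@1+2&1-8|1+41",
        "/J:5_29",
        "/K:2+8-41"
    );
    // Parse the string into a Label, then test it against the question.
    let label = Label::from_str(label_str)?;
    assert!(question.test(&label));
    Ok(())
}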

Condition for parsing as AllQuestion

The following conditions are necessary for a set of patterns to parse as AllQuestion. They are not sufficient: some questions may fail to parse even if they fulfill all of these requirements.

  • Each pattern must be a valid htsvoice question pattern.
    • It uses * and ? as wildcards and must match the entire full-context label.
    • Patterns that cannot match a full-context label in any situation (e.g. */A:-?????+*) are not allowed.
    • A minus sign (-) in a numerical field can only be used in the first element of A (A1).
  • All the patterns must be about the same position.
    • e.g. A question whose first pattern is about the first element of Phoneme and whose second pattern is about the last element of field J is not allowed.
  • Each pattern must not have conditions on two or more positions.
  • When the pattern is about a position in a numerical field (that is, not a categorical field such as B, C, or D):
    • The values matched by the pattern must be continuous (a range with no gaps).
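To make the wildcard rule above concrete, here is a minimal, standalone sketch of whole-string matching with * and ?. It is not how this crate implements matching, and the function name wildcard_match is hypothetical. It only illustrates what "matches the entire full-context label" means for a single pattern.

```rust
/// Returns true if `pattern` matches the whole of `text`.
/// `*` matches any run of characters (including none); `?` matches exactly one.
/// Illustrative helper only; not part of jlabel-question.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Pattern index of the most recent `*` and the text index it currently ends at.
    let mut star: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star and first try letting it match nothing.
            star = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            // `?` or an identical character consumes one character of text.
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Mismatch: backtrack and let the last `*` absorb one more character.
            pi = sp + 1;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // The text is exhausted, so only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}
```

With a label beginning "sil^n-i+h=o/A:-3+1+7", the pattern */A:-?+* matches under this sketch, while */A:-??+* does not, because only one character sits between "/A:-" and the next "+".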

Fallback

AllQuestion parsing does not always succeed, even if the pattern is correct, so you may need to write a fallback.

If you just want to ignore those patterns, you can simply return false instead of the result of test().
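One way to express this "ignore on failure" behaviour is a small wrapper that keeps the parsed question when parsing succeeded and answers false otherwise. The sketch below is deliberately generic so that it does not depend on this crate's exact signatures. The type Lenient and its methods are hypothetical names, not part of jlabel-question.

```rust
/// Holds a successfully parsed question, or nothing if parsing failed.
/// Hypothetical helper; `Q` stands in for a question type such as AllQuestion.
struct Lenient<Q>(Option<Q>);

impl<Q> Lenient<Q> {
    /// Keeps the parsed value and silently drops the parse error.
    fn from_result<E>(result: Result<Q, E>) -> Self {
        Self(result.ok())
    }

    /// Runs `test` on the parsed question, or returns false if there is none.
    fn test_with(&self, test: impl FnOnce(&Q) -> bool) -> bool {
        self.0.as_ref().map_or(false, test)
    }
}
```

Assuming the calls shown elsewhere on this page, it could be used as Lenient::from_result(AllQuestion::parse(patterns)) followed by lenient.test_with(|q| q.test(&label)). Note that the parse error is discarded, so log it first if you need to know which patterns were skipped.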

If you need to successfully parse patterns that AllQuestion fails to parse, regex::RegexQuestion is the best choice.

use jlabel::Label;
use jlabel_question::{regex::RegexQuestion, AllQuestion, ParseError, QuestionMatcher};

// A question that is either parsed as AllQuestion or, failing that, as RegexQuestion.
enum Pattern {
    AllQuestion(AllQuestion),
    Regex(RegexQuestion),
}
impl Pattern {
    fn parse(patterns: &[&str]) -> Result<Self, ParseError> {
        // Try AllQuestion first and fall back to RegexQuestion only on failure.
        match AllQuestion::parse(patterns) {
            Ok(question) => Ok(Self::AllQuestion(question)),
            Err(_) => Ok(Self::Regex(RegexQuestion::parse(patterns)?)),
        }
    }
    fn test(&self, label: &Label) -> bool {
        // Delegate to whichever matcher was successfully parsed.
        match self {
            Self::AllQuestion(question) => question.test(label),
            Self::Regex(question) => question.test(label),
        }
    }
}

Dependencies

~0.3–1.5MB
~32K SLoC