5 releases

Version | Released |
---|---|
0.1.4 | Mar 25, 2024 |
0.1.3 | Mar 2, 2024 |
0.1.2 | Feb 6, 2024 |
0.1.1 | Feb 4, 2024 |
0.1.0 | Jan 25, 2024 |
#849 in Encoding
Used in jbonsai
83KB, 2K SLoC
jlabel-question
Parser and matcher of htsvoice full-context-label questions.
Note: serde support is experimental. Changes to it may not be treated as breaking changes.
Usage
Put the following in Cargo.toml:
[dependencies]
jlabel-question = "0.1.4"
Copyrights
The question.hed file in the tests directory comes from The Nitech Japanese Speech Database "NIT ATR503 M001".
- Creative Commons Attribution 3.0 license
- Copyright (c) 2003-2015 Nagoya Institute of Technology Department of Computer Science
License
BSD-3-Clause
API Reference
lib.rs:
HTS-style full-context label question parser and matcher.
The main structure for parsing and matching is AllQuestion.
It can parse most patterns, but not all of them.
For details, see "Condition for parsing as AllQuestion".
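use jlabel::Label;
use jlabel_question::{AllQuestion, QuestionMatcher};
use std::str::FromStr;

// Wrapped in `main` so that the `?` operator has a function to return from.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Both patterns constrain the same position (the first element of A).
    let question = AllQuestion::parse(&["*/A:-??+*", "*/A:-?+*"])?;
    let label_str = concat!(
        "sil^n-i+h=o",
        "/A:-3+1+7",
        "/B:xx-xx_xx",
        "/C:02_xx+xx",
        "/D:02+xx_xx",
        "/E:xx_xx!xx_xx-xx",
        "/F:7_4#0_xx@1_3|1_12",
        "/G:4_4%0_xx_1",
        "/H:xx_xx",
        "/I:3-12@1+2&1-8|1+41",
        "/J:5_29",
        "/K:2+8-41"
    );
    let label = Label::from_str(label_str)?;
    assert!(question.test(&label));
    Ok(())
}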
Condition for parsing as AllQuestion
The following conditions are necessary for a set of patterns to parse as AllQuestion, but they are not sufficient: some questions may fail to parse even if they fulfill all of them.
- The patterns must be valid htsvoice question patterns.
  - Using * and ? as wildcards, each pattern must match the entire full-context label.
  - Patterns that cannot match a full-context label in any situation (e.g. */A:-?????+*) are not allowed.
  - The minus sign (-) in a numerical field can only be used in the first element of A (A1).
- All the patterns must be about the same position.
  - e.g. A question whose first pattern is about the first element of Phoneme and whose second pattern is about the last element of field J is not allowed.
- Each pattern must not have conditions on two or more positions.
- When the pattern is about a position in a numerical field (that is, not a categorical field such as B, C, or D),
  - The pattern must be continuous.
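To make the wildcard rule in the first condition concrete, here is a minimal sketch of plain whole-string matching with * and ?. It is an illustration only: the function name wildcard_match is hypothetical, it is not part of jlabel-question, and it does not show how AllQuestion is implemented internally.

```rust
/// Hypothetical helper (not part of jlabel-question).
/// Returns true if `pattern` matches the whole of `text`.
/// `*` matches any run of characters (including none); `?` matches exactly one.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Pattern index of the most recent `*` and the text index it currently ends at.
    let mut backtrack: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Start with the `*` matching nothing; remember where to retry from.
            backtrack = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = backtrack {
            // Mismatch: let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            ti = star_ti + 1;
            backtrack = Some((star_pi, star_ti + 1));
        } else {
            return false;
        }
    }
    // The text is consumed; only trailing `*` may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}
```

Because the whole label must be matched, a pattern for a field in the middle of the label needs * on both sides: under this sketch, */A:-?+* matches a label containing /A:-3+1+7, while /A:-?+* (no leading *) does not.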
Fallback
AllQuestion parsing does not always succeed, even if the pattern is correct, so you may need to write a fallback.
If you just want to ignore the patterns that fail to parse, you can simply return false instead of the result of test().
If you need to successfully parse patterns that AllQuestion fails to parse, regex::RegexQuestion is the best choice.
use jlabel::Label;
use jlabel_question::{regex::RegexQuestion, AllQuestion, ParseError, QuestionMatcher};

// Either a question parsed as AllQuestion, or the regex-based fallback.
enum Pattern {
    AllQuestion(AllQuestion),
    Regex(RegexQuestion),
}

impl Pattern {
    // Try AllQuestion first; fall back to RegexQuestion if that fails.
    fn parse(patterns: &[&str]) -> Result<Self, ParseError> {
        match AllQuestion::parse(patterns) {
            Ok(question) => Ok(Self::AllQuestion(question)),
            Err(_) => Ok(Self::Regex(RegexQuestion::parse(patterns)?)),
        }
    }

    fn test(&self, label: &Label) -> bool {
        match self {
            Self::AllQuestion(question) => question.test(label),
            Self::Regex(question) => question.test(label),
        }
    }
}
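The simpler option mentioned above, ignoring patterns that fail to parse and returning false for them, could be sketched as follows. The Lenient type and its methods are hypothetical names introduced for illustration; the sketch is generic and uses only the standard library, so it makes no assumptions about the crate beyond what is shown above.

```rust
/// Hypothetical helper: holds a parsed question, or nothing if parsing failed.
struct Lenient<Q>(Option<Q>);

impl<Q> Lenient<Q> {
    /// Keeps the parsed value and discards the parse error, if any.
    fn from_result<E>(result: Result<Q, E>) -> Self {
        Self(result.ok())
    }

    /// Runs `test` on the parsed value; reports `false` if parsing had failed.
    fn test_with(&self, test: impl FnOnce(&Q) -> bool) -> bool {
        self.0.as_ref().map_or(false, test)
    }
}
```

With the crate, this would presumably be used as Lenient::from_result(AllQuestion::parse(patterns)) followed by question.test_with(|q| q.test(&label)). The trade-off is that an unparseable question silently never matches, which is why the RegexQuestion fallback is preferable when every question matters.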
Dependencies
~0.3–1.5MB
~31K SLoC