12 releases (5 breaking)
Version | Released |
---|---|
0.6.4 | Dec 2, 2021 |
0.6.3 | Dec 2, 2021 |
0.5.0 | Nov 30, 2021 |
0.4.1 | Nov 30, 2021 |
0.1.0 | Nov 27, 2021 |
GenEx
GenEx is a text template expansion library.
lib.rs:
A Rust library implementing a custom text generation/templating system. GenEx is similar to Tracery, but with some extra functionality for working with external data.
Usage
First create a grammar, then generate an expansion or multiple expansions from it.
use std::collections::HashSet;
use std::str::FromStr;

use maplit::hashmap;
use genex::Grammar;

let grammar = Grammar::from_str(
    r#"
RULES:
top = The <adj> <noun> #action|ed# #object|a#?:[ with gusto] in <place>.
adj = [glistening|#adj#]
noun = key
place = [the #room#|#city#]
WEIGHTS:
room = 2
city = 1
"#,
)
.unwrap();

let data = hashmap! {
    "action".to_string() => "pick".to_string(),
    "object".to_string() => "lizard".to_string(),
    "room".to_string() => "kitchen".to_string(),
    "city".to_string() => "New York".to_string(),
};

// Now we find the top-scoring expansion. The score is the sum of the
// weights of all variables used in an expansion. We know that the
// top-scoring expansion is going to end with "the kitchen" because we gave
// `room` a higher weight than `city`.
let best_expansion = grammar.generate("top", &data).unwrap().unwrap();
assert_eq!(
    best_expansion,
    "The glistening key picked a lizard in the kitchen.".to_string()
);

// Now get all possible expansions:
let all_expansions = grammar.generate_all("top", &data).unwrap();
assert_eq!(
    HashSet::<_>::from_iter(all_expansions),
    HashSet::<_>::from_iter(vec![
        "The glistening key picked a lizard in New York.".to_string(),
        "The glistening key picked a lizard with gusto in New York.".to_string(),
        "The glistening key picked a lizard with gusto in the kitchen.".to_string(),
        "The glistening key picked a lizard in the kitchen.".to_string(),
    ])
);
Features
GenEx tries to make it easy to generate text based on varying amounts of external data. For example, you can write a single expansion grammar that works when all you know is the name of an object, but uses additional information if you also know the object's size, location, color, or other qualities.
By default, GenEx tries to find the expansion that uses the most external data possible. By changing the weights assigned to variables you can prioritize which variables are used, even prioritizing a single important variable over several less important ones.
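The scoring idea can be sketched without the library. The snippet below is an illustration only: `Candidate`, `score`, and `best` are hypothetical names, not part of the genex API. It assumes integer weights and that a variable with no explicit weight counts as 1, which matches the "most external data" default described above but is not confirmed by the source.

```rust
use std::collections::HashMap;

/// A candidate expansion plus the names of the variables it consumed.
/// (Hypothetical type for illustration; not part of the genex API.)
struct Candidate {
    text: String,
    vars_used: Vec<String>,
}

/// Sum the weights of the variables a candidate used.
/// Assumption: a variable with no explicit weight counts as 1.
fn score(c: &Candidate, weights: &HashMap<String, u32>) -> u32 {
    c.vars_used
        .iter()
        .map(|v| weights.get(v).copied().unwrap_or(1))
        .sum()
}

/// Return the highest-scoring candidate, or `None` if there are no candidates.
fn best<'a>(cands: &'a [Candidate], weights: &HashMap<String, u32>) -> Option<&'a Candidate> {
    cands.iter().max_by_key(|c| score(c, weights))
}
```

With no explicit weights, a candidate that uses two variables beats one that uses a single variable. Giving that single variable a weight of 5 reverses the outcome, which is the "one important variable over several less important ones" case.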
Grammar syntax
Rules
"RULES:" indicates the rules section of the grammar. Rules are defined by
a left-hand side (LHS) and a right-hand side (RHS). The LHS is the name of
the rule. The RHS is a sequence of terms.
Terms:
- Sequence: [term1 term2 ...]
- Choice: [term1|term2|...] (You can put a newline after a | character.)
- Optional: ?:[term1 term2 ...]
- Variable: #variable# or #variable|modifier#
- Non-terminal: <rule-name>
- Plain text: I am some plain text. I hope I get expanded.
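To make the term kinds concrete, here is a minimal sketch of how they could be modeled and expanded. It is not the crate's implementation: the `Term` enum and `expand_all` function are hypothetical, modifiers are left out, and recursive rules are not guarded against. It assumes that a variable with no value in the data produces no expansions, which is what the usage example suggests (the `#adj#` alternative never appears because no `adj` value is supplied).

```rust
use std::collections::HashMap;

/// Hypothetical model of the term kinds listed above (not the crate's types).
enum Term {
    Text(String),
    Var(String),
    NonTerminal(String),
    Sequence(Vec<Term>),
    Choice(Vec<Term>),
    Optional(Box<Term>),
}

/// Enumerate every expansion of a term.
/// Assumption: a variable missing from `data` yields no expansions, so any
/// sequence that contains it is dropped.
fn expand_all(
    term: &Term,
    rules: &HashMap<String, Term>,
    data: &HashMap<String, String>,
) -> Vec<String> {
    match term {
        Term::Text(s) => vec![s.clone()],
        Term::Var(name) => data.get(name).cloned().into_iter().collect(),
        Term::NonTerminal(name) => rules
            .get(name)
            .map(|t| expand_all(t, rules, data))
            .unwrap_or_default(),
        // Cartesian product of the expansions of each part, in order.
        Term::Sequence(parts) => parts.iter().fold(vec![String::new()], |acc, part| {
            let tails = expand_all(part, rules, data);
            acc.iter()
                .flat_map(|head| tails.iter().map(move |tail| format!("{}{}", head, tail)))
                .collect()
        }),
        // Union of the expansions of each alternative.
        Term::Choice(alts) => alts
            .iter()
            .flat_map(|alt| expand_all(alt, rules, data))
            .collect(),
        // Either nothing at all, or any expansion of the inner term.
        Term::Optional(inner) => {
            let mut out = vec![String::new()];
            out.extend(expand_all(inner, rules, data));
            out
        }
    }
}
```

Treating a missing variable as "no expansions" is what lets one grammar degrade gracefully: alternatives that need unavailable data simply drop out of the result.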
Weights
"WEIGHTS:" indicates the weights section of the grammar. Weights are of
the form <name> = <number>, where the name is the variable being weighted
(for example, room = 2).
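As an illustration of the `name = number` form, the following sketch parses the body of a weights section into a lookup table. `parse_weights` is a hypothetical helper, not the crate's parser. It assumes unsigned integer weights and skips blank lines; the source does not say which number types or error behavior the real parser uses.

```rust
use std::collections::HashMap;

/// Parse lines of the form `name = number` into a weight table.
/// Sketch only: assumes unsigned integer weights and skips blank lines.
fn parse_weights(section: &str) -> Result<HashMap<String, u32>, String> {
    let mut weights = HashMap::new();
    for line in section.lines().map(str::trim).filter(|l| !l.is_empty()) {
        // Split at the first '=' into the name and the numeric value.
        let (name, value) = line
            .split_once('=')
            .ok_or_else(|| format!("expected `name = number`, got `{}`", line))?;
        let value: u32 = value
            .trim()
            .parse()
            .map_err(|e| format!("bad weight in `{}`: {}", line, e))?;
        weights.insert(name.trim().to_string(), value);
    }
    Ok(weights)
}
```

Calling `parse_weights("room = 2\ncity = 1")` yields a table in which `room` maps to 2 and `city` maps to 1; a line without `=` or with a non-numeric value returns an error.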
Modifiers
Modifiers are used to transform variable values during expansion.
Modifiers:
- capitalize: Capitalizes the first letter of the value.
- capitalizeAll: Capitalizes the first letter of each word in the value.
- inQuotes: Surrounds the value with double quotes.
- comma: Adds a comma after the value, if it doesn't already end with punctuation.
- s: Pluralizes the value.
- a: Prefixes the value with an "a"/"an" article as appropriate.
- ed: Changes the first word of the value to be past tense.
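The simpler modifiers can be approximated with plain string functions. The sketch below is not the crate's code: the function names are hypothetical, "punctuation" is assumed to mean ASCII punctuation, and the article rule only checks for a leading vowel, so it gets words like "hour" or "unicorn" wrong. The s and ed modifiers are omitted because English pluralization and past tense need more than a few lines.

```rust
/// Uppercase the first character of the value, leaving the rest unchanged.
fn capitalize(s: &str) -> String {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().chain(chars).collect(),
        None => String::new(),
    }
}

/// Uppercase the first character of each space-separated word.
fn capitalize_all(s: &str) -> String {
    s.split(' ').map(capitalize).collect::<Vec<_>>().join(" ")
}

/// Surround the value with double quotes.
fn in_quotes(s: &str) -> String {
    format!("\"{}\"", s)
}

/// Append a comma unless the value already ends with punctuation.
/// Assumption: "punctuation" means ASCII punctuation.
fn comma(s: &str) -> String {
    match s.chars().last() {
        Some(c) if c.is_ascii_punctuation() => s.to_string(),
        _ => format!("{},", s),
    }
}

/// Prefix the value with "a" or "an".
/// Naive rule: "an" before a leading a, e, i, o, or u; "a" otherwise.
fn with_article(s: &str) -> String {
    let starts_with_vowel = s
        .chars()
        .next()
        .map_or(false, |c| "aeiou".contains(c.to_ascii_lowercase()));
    if starts_with_vowel {
        format!("an {}", s)
    } else {
        format!("a {}", s)
    }
}
```

For instance, `with_article("lizard")` gives "a lizard", matching the `#object|a#` result in the usage example, and `capitalize_all("new york")` gives "New York".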