bin+lib genere

A library for randomization of text respecting grammatical gender of sentences

4 releases

0.1.2 Apr 19, 2019
0.1.1 Apr 14, 2019
0.1.0 Apr 13, 2019
0.0.1 Apr 11, 2019

MPL-2.0 license

41KB
644 lines

genere

Genere is a library to generate (possibly randomized) text with options to match the (grammatical) gender of various elements.

Example

use genere::Generator;
let json = r#"
{
   "hero": ["John[m]", "Joan[f]"],
   "job[hero]": ["wizard/witch"],
   "main[hero]": ["{hero}. He/She is a {job}."]
}"#;

let mut gen = Generator::new();
gen.add_json(json).unwrap();
let result = gen.instantiate("main").unwrap();
assert!(&result == "John. He is a wizard."
       || &result == "Joan. She is a witch.");

Features

Binary or Rust library

It is possible to use Genere as a binary:

$ genere main < file.json

will instantiate the main symbol in the file.json file.

Genere is, however, primarily a Rust library. To use it in a program written in Rust, add

genere = "0.1"

to the dependencies section of your Cargo.toml file.

Text generation

Genere is inspired by Tracery and thus has a similar syntax that lets you easily generate randomized text:

let json = r#"
{
    "name": ["John", "Johana", "Vivienne", "Eric"],
    "last_name": ["StrongArm", "Slayer", "The Red"],
    "class": ["mage", "warrior", "thief", "rogue", "barbarian"],
    "race": ["human", "dwarvish", "elvish", "vampire"],
    "text": ["{name} {last_name} is a {race} {class}.",
             "Meet {name} {last_name}, a proud {class}!"]
}
"#;

Instantiating "text" might display "Johana Slayer is a vampire warrior."

Basically, you define a list of symbols. When you "call" a symbol using the {symbol} syntax, it is replaced by one string picked at random from the corresponding array.

Note that once a symbol has been "instantiated", its value is fixed. So if you had:

"text": ["Meet {name} {last_name}. {name} is a proud {class}."]

it is guaranteed that both replacements for {name} will be identical.
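The fixed-value rule, together with the {{symbol}} form for fresh values, can be illustrated with a small standalone sketch. This is not Genere's implementation or API: the Expander type and its methods are hypothetical, and the random choice is replaced by a round-robin counter so that the output is predictable.

```rust
use std::collections::HashMap;

/// Toy expander (hypothetical, not part of Genere).
struct Expander {
    symbols: HashMap<String, Vec<String>>,
    fixed: HashMap<String, String>, // values already chosen for `{symbol}`
    counter: usize,                 // stand-in for a random number generator
}

impl Expander {
    fn new() -> Self {
        Expander { symbols: HashMap::new(), fixed: HashMap::new(), counter: 0 }
    }

    fn add(&mut self, name: &str, options: &[&str]) {
        let owned: Vec<String> = options.iter().map(|s| s.to_string()).collect();
        self.symbols.insert(name.to_string(), owned);
    }

    /// Picks the next alternative in round-robin order.
    /// Panics if the symbol is unknown or has no alternatives.
    fn pick(&mut self, name: &str) -> String {
        let options = &self.symbols[name];
        let choice = options[self.counter % options.len()].clone();
        self.counter += 1;
        choice
    }

    /// Replaces `{symbol}` with a remembered value and `{{symbol}}` with a new pick.
    fn expand(&mut self, template: &str) -> String {
        let mut out = String::new();
        let mut rest = template;
        while let Some(start) = rest.find('{') {
            out.push_str(&rest[..start]);
            let fresh = rest[start..].starts_with("{{");
            let (open, close) = if fresh { (2, "}}") } else { (1, "}") };
            let after = &rest[start + open..];
            let end = after.find(close).expect("unclosed brace");
            let name = &after[..end];
            let cached = self.fixed.get(name).cloned();
            let value = match cached {
                // A single-brace symbol reuses its earlier value.
                Some(v) if !fresh => v,
                _ => {
                    let v = self.pick(name);
                    if !fresh {
                        self.fixed.insert(name.to_string(), v.clone());
                    }
                    v
                }
            };
            out.push_str(&value);
            rest = &after[end + close.len()..];
        }
        out.push_str(rest);
        out
    }
}
```

With "name" defined as ["John", "Johana"], expanding "Meet {name}. {name} is proud. There is also {{name}}." with this sketch reuses the first pick for both {name} occurrences and makes a new pick for {{name}}.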

If you want to get a (possibly) different instantiation, you need to use {{symbol}}:

"text": ["Meet {name} {last_name}. {name} is a proud {class}. There is also {{name}}, a {{class}}."]

Capitalization

When declared, symbols are case-insensitive. When they are referred to in content replacements, the capitalization of the symbol affects the capitalization of the replacement: if the symbol is in lowercase, the content is left untouched; if only the first letter of the symbol is in uppercase, the first letter of the replacement content is changed to uppercase; and if the symbol is all in uppercase, the whole replacement content is changed to uppercase.

let json = r#"
{
    "dog": ["a good dog"],
    "text1": ["This is {dog}"],
    "text2": ["This is {DOG}"],
    "text3": ["{Dog}"]
}
"#;

will display "This is a good dog", "This is A GOOD DOG" and "A good dog" for "text1", "text2" and "text3" respectively.
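The three capitalization cases can be written down as a small standalone function. This is only a sketch of the rule as described here, not Genere's own code; the name apply_case is hypothetical.

```rust
/// Applies the case of `symbol` to `content` (hypothetical helper, not part of Genere).
fn apply_case(symbol: &str, content: &str) -> String {
    let has_letters = symbol.chars().any(char::is_alphabetic);
    let all_upper = has_letters && !symbol.chars().any(char::is_lowercase);
    let first_upper = symbol.chars().next().map_or(false, char::is_uppercase);

    if all_upper {
        // {DOG}: the whole replacement is uppercased.
        content.to_uppercase()
    } else if first_upper {
        // {Dog}: only the first letter of the replacement is uppercased.
        let mut chars = content.chars();
        match chars.next() {
            Some(first) => first.to_uppercase().chain(chars).collect(),
            None => String::new(),
        }
    } else {
        // {dog}: the replacement is left untouched.
        content.to_string()
    }
}
```

Under this sketch, apply_case("Dog", "a good dog") gives "A good dog" and apply_case("DOG", "a good dog") gives "A GOOD DOG".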

Gender adaptation

Genere seeks to allow easy generation of sentences that are grammatically gender accurate:

let json = r#"
{
    "name": ["John[m]", "Johana[f]", "Vivienne[f]", "Eric[m]"],
    "class": ["mage", "warrior", "thief", "rogue", "barbarian"],
    "text[name]": ["Meet {name}. He/She is a proud {class}!"]
}
"#;

will make sure to display "He" or "She" according to the gender specified in the symbol name.

You can assign a gender to a value by appending [m], [f] or [n] to it. Similarly, you can tell Genere that a symbol depends on another symbol's gender by using [symbol] in the symbol name. E.g., text[name] means that the gender in text's replacement strings will be determined by name's gender. It is also possible to specify a neutral gender, by using [n] in the definition and by adding a second / followed by a neutral variant in the replacement string (e.g. He/She/They). If no neutral variant is specified in the replacement string, both the male and female versions will be output (e.g. He/She instead of They).
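The choice between the variants of a single slash-separated expression can be sketched as follows. This is an illustration of the behaviour described above, not Genere's implementation; the Gender enum and pick_variant function are hypothetical names, and the sketch handles one expression at a time with no escaping.

```rust
/// Grammatical gender of the symbol an expression depends on (hypothetical type).
#[derive(Clone, Copy)]
enum Gender {
    Male,
    Female,
    Neutral,
}

/// Picks the variant of a `male/female/neutral` expression matching `gender`.
/// If the expression has no variant for that gender, it is returned unchanged,
/// so "He/She" stays "He/She" for a neutral gender.
fn pick_variant(expr: &str, gender: Gender) -> &str {
    let index = match gender {
        Gender::Male => 0,
        Gender::Female => 1,
        Gender::Neutral => 2,
    };
    expr.split('/').nth(index).unwrap_or(expr)
}
```

For instance, pick_variant("He/She/They", Gender::Neutral) selects "They", while pick_variant("He/She", Gender::Neutral) falls back to the whole "He/She".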

Sometimes a sentence might use various gendered elements and not just depend on only one symbol's gender. For each gender variation, it is possible to specify a "dependency":

"text[hero]": ["He/She is called {hero}. His/Her son/daughter[child] is named {child}."]

Here, the gender of hero will be used to determine between He/She and His/Her, but the gender of child will be used to pick between son/daughter.

Spaces in gender adaptation

When you use this gender syntax, the '/' only considers the word immediately before it and the word immediately after it, so a variant cannot contain an unescaped space. If you want to insert a space in a gender adaptation expression, you must escape it with ~, e.g.: "du/de~ la".
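One way to picture how an escaped space stays inside a variant is to split an expression on unescaped '/' only. The function below is a hypothetical sketch under that assumption, not Genere's parser.

```rust
/// Splits a gender expression on unescaped '/' and resolves `~` escapes
/// (hypothetical helper, not part of Genere).
fn split_variants(expr: &str) -> Vec<String> {
    let mut variants = vec![String::new()];
    let mut chars = expr.chars();
    while let Some(c) = chars.next() {
        match c {
            // The character after '~' is kept literally, even a space or a '/'.
            '~' => {
                if let Some(next) = chars.next() {
                    variants.last_mut().unwrap().push(next);
                }
            }
            // An unescaped '/' starts the next variant.
            '/' => variants.push(String::new()),
            _ => variants.last_mut().unwrap().push(c),
        }
    }
    variants
}
```

With this sketch, "du/de~ la" splits into the two variants "du" and "de la".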

Additional gender syntax

It is also possible to use the "median point" syntax used e.g. in French: "C'est un·e sorci·er·ère." is equivalent to "C'est un/une sorcier/sorcière.".
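The equivalence in the example above can be sketched for a single word. The rules below are an assumption inferred from that one example (two parts mean base plus feminine suffix, three parts mean stem plus masculine and feminine endings); Genere's actual rules may differ, and expand_median is a hypothetical name.

```rust
/// Rewrites one median-point word into the slash form
/// (hypothetical helper; rules inferred from the "un·e sorci·er·ère" example).
fn expand_median(word: &str) -> String {
    let parts: Vec<&str> = word.split('·').collect();
    match parts.as_slice() {
        // "un·e": masculine is the base, feminine is base + suffix.
        [base, fem] => format!("{}/{}{}", base, base, fem),
        // "sorci·er·ère": shared stem with one ending per gender.
        [stem, masc, fem] => format!("{}{}/{}{}", stem, masc, stem, fem),
        // No median point, or a shape this sketch does not cover.
        _ => word.to_string(),
    }
}
```

Trailing punctuation is not handled here, so the word would need to be separated from a following period before calling the function.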

Escaping

If you want to use the '[', ']', '{', '}', '/' and '·' characters in your text, you can use the escape character '~'. E.g., "~{foo}" will display "{foo}" instead of trying to find the symbol foo and replace it with its content. You can also use "~~" if you want to display the tilde symbol.
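When text comes from elsewhere (user input, for example), the special characters listed above can be escaped programmatically before the text is placed in a replacement string. The helper below is a hypothetical sketch and not a function provided by Genere.

```rust
/// Prefixes every character that has a special meaning with the `~` escape
/// character (hypothetical helper, not part of Genere).
fn escape(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        // '~' itself must be escaped too, giving "~~".
        if matches!(c, '[' | ']' | '{' | '}' | '/' | '·' | '~') {
            out.push('~');
        }
        out.push(c);
    }
    out
}
```

Spaces are left alone in this sketch, since they only need escaping inside a gender adaptation expression.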

License

Genere is published under the Mozilla Public License, version 2.0. For more information, see the License.

ChangeLog

See ChangeLog.
