2 releases

0.1.1 May 6, 2021
0.1.0 May 5, 2021

MIT/Apache

Layered NLP

Incrementally build up recognizers over an abstract token type; the recognizers combine to produce multiple possible interpretations of the input.

Key features:

  • Abstracts over the token type to support "rich" tokens, like those used at Storyscript.
  • May generate multiple interpretations of the same token span.
  • Produces a set of ranges over the input token list, each carrying different attributes (see the diagrams under Layering).
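
One way to picture the last two features is as a flat list of (range, attribute) pairs in which the same range may appear more than once. The following is a minimal sketch of that idea only; `Span`, `attrs_at`, `example_spans`, and the string labels are hypothetical names chosen for illustration, not this crate's API.

```rust
use std::ops::Range;

/// Hypothetical span type: a half-open range of token indices plus one attribute label.
#[derive(Debug, Clone, PartialEq)]
struct Span {
    range: Range<usize>,
    attr: &'static str,
}

/// Collect every attribute recorded for exactly this token range.
fn attrs_at(spans: &[Span], range: &Range<usize>) -> Vec<&'static str> {
    spans
        .iter()
        .filter(|s| s.range == *range)
        .map(|s| s.attr)
        .collect()
}

/// Illustrative spans for the tokens ["See", "you", "in", "May"].
fn example_spans() -> Vec<Span> {
    vec![
        Span { range: 0..1, attr: "Verb" },
        Span { range: 1..2, attr: "Pronoun" },
        // Two interpretations of the same single-token span "May".
        Span { range: 3..4, attr: "Month" },
        Span { range: 3..4, attr: "ModalVerb" },
    ]
}

fn main() {
    let spans = example_spans();
    println!("{:?}", attrs_at(&spans, &(3..4)));
}
```

Keeping both readings side by side means a later recognizer can choose the one that fits its context, rather than an earlier step having to commit to one.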

Layering

The key idea is to start from a handful of vague tags and build up meaning step by step, with each layer adding information that builds on the layers before it.

Simplification: Money = '$' + Number

    $   123   .     00
                    ╰Natural
              ╰Punct
        ╰Natural
        ╰Amt(Decimal)
    ╰Money($/£, Num)─╯
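
The diagram above can be sketched as three small layers, each reading the spans produced so far and contributing new ones. This is an illustrative sketch under stated assumptions, not the crate's implementation: `Attr`, `Span`, `tag_tokens`, `find_amounts`, `find_money`, and `recognize` are hypothetical names, and the Money layer only accepts a decimal amount after the currency symbol.

```rust
use std::ops::Range;

/// Hypothetical attribute set mirroring the labels in the diagram.
#[derive(Debug, Clone, PartialEq)]
enum Attr {
    Natural(u64),
    Punct(char),
    Amount(f64),
    Money(char, f64),
}

#[derive(Debug, Clone, PartialEq)]
struct Span {
    range: Range<usize>,
    attr: Attr,
}

/// Layer 1: tag single tokens as natural numbers or as the '.' punctuation mark.
fn tag_tokens(tokens: &[&str]) -> Vec<Span> {
    let mut spans = Vec::new();
    for (i, tok) in tokens.iter().enumerate() {
        if let Ok(n) = tok.parse::<u64>() {
            spans.push(Span { range: i..i + 1, attr: Attr::Natural(n) });
        } else if *tok == "." {
            spans.push(Span { range: i..i + 1, attr: Attr::Punct('.') });
        }
    }
    spans
}

fn is_natural(spans: &[Span], i: usize) -> bool {
    spans.iter().any(|s| s.range == (i..i + 1) && matches!(s.attr, Attr::Natural(_)))
}

fn is_dot(spans: &[Span], i: usize) -> bool {
    spans.iter().any(|s| s.range == (i..i + 1) && matches!(s.attr, Attr::Punct('.')))
}

/// Layer 2: Amount = Natural + '.' + Natural, built only from layer-1 tags.
fn find_amounts(tokens: &[&str], spans: &[Span]) -> Vec<Span> {
    let mut found = Vec::new();
    for i in 0..tokens.len().saturating_sub(2) {
        if is_natural(spans, i) && is_dot(spans, i + 1) && is_natural(spans, i + 2) {
            if let Ok(v) = format!("{}.{}", tokens[i], tokens[i + 2]).parse::<f64>() {
                found.push(Span { range: i..i + 3, attr: Attr::Amount(v) });
            }
        }
    }
    found
}

/// Layer 3: Money = currency symbol + Amount, built on layer-2 output.
fn find_money(tokens: &[&str], spans: &[Span]) -> Vec<Span> {
    let mut found = Vec::new();
    for (i, tok) in tokens.iter().enumerate() {
        let symbol = match *tok {
            "$" => '$',
            "£" => '£',
            _ => continue,
        };
        for s in spans.iter().filter(|s| s.range.start == i + 1) {
            if let Attr::Amount(v) = &s.attr {
                found.push(Span { range: i..s.range.end, attr: Attr::Money(symbol, *v) });
            }
        }
    }
    found
}

/// Run the layers in order; each one sees everything recorded before it.
fn recognize(tokens: &[&str]) -> Vec<Span> {
    let mut spans = tag_tokens(tokens);
    let amounts = find_amounts(tokens, &spans);
    spans.extend(amounts);
    let money = find_money(tokens, &spans);
    spans.extend(money);
    spans
}

fn main() {
    for span in recognize(&["$", "123", ".", "00"]) {
        println!("{:?}", span);
    }
}
```

Earlier spans are never removed, so the final list still holds the low-level tags alongside the Money span, much as the diagram stacks them.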

Simplification:

  • Location(NYC) = 'New' + 'York' + 'City'
  • Location(AMS) = 'Amsterdam'
  • Address(Person, Location) = Person + Verb('live') + Predicate('in') + Location

    I     live      in      New York City
                                     ╰Noun
                                ╰Noun
                            ╰Adj
                    ╰Predicate
          ╰Verb
    ╰Noun
    ╰Person(Self)
                            ╰──Location─╯
    ╰────Address(Person, Location)─────╯
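
The three rules above might be sketched as follows: a first layer tags single words and the multi-word location, and a second layer chains adjacent spans into an Address. This is a simplified illustration, not the crate's API; all type and function names are hypothetical, the word tables are hard-coded, and the part-of-speech tags (Noun, Adj) from the diagram are omitted.

```rust
use std::ops::Range;

/// Hypothetical attributes mirroring the rules in the list above.
#[derive(Debug, Clone, PartialEq)]
enum Attr {
    Person(&'static str),
    Verb(&'static str),
    Predicate(&'static str),
    Location(&'static str),
    Address { person: &'static str, location: &'static str },
}

#[derive(Debug, Clone, PartialEq)]
struct Span {
    range: Range<usize>,
    attr: Attr,
}

/// Layer 1: single-word tags plus the three-word location.
fn tag_words(tokens: &[&str]) -> Vec<Span> {
    let mut spans = Vec::new();
    for (i, tok) in tokens.iter().enumerate() {
        let attr = match *tok {
            "I" => Attr::Person("Self"),
            "live" => Attr::Verb("live"),
            "in" => Attr::Predicate("in"),
            "Amsterdam" => Attr::Location("AMS"),
            _ => continue,
        };
        spans.push(Span { range: i..i + 1, attr });
    }
    // Location(NYC) = 'New' + 'York' + 'City'
    for (i, w) in tokens.windows(3).enumerate() {
        if w[0] == "New" && w[1] == "York" && w[2] == "City" {
            spans.push(Span { range: i..i + 3, attr: Attr::Location("NYC") });
        }
    }
    spans
}

/// All spans that begin at token index `pos`.
fn starting_at(spans: &[Span], pos: usize) -> impl Iterator<Item = &Span> {
    spans.iter().filter(move |s| s.range.start == pos)
}

/// Layer 2: Address = Person + Verb('live') + Predicate('in') + Location,
/// where each part must start where the previous one ends.
fn find_addresses(spans: &[Span]) -> Vec<Span> {
    let mut found = Vec::new();
    for p in spans {
        let person = match &p.attr {
            Attr::Person(name) => *name,
            _ => continue,
        };
        for v in starting_at(spans, p.range.end).filter(|s| s.attr == Attr::Verb("live")) {
            for pr in starting_at(spans, v.range.end).filter(|s| s.attr == Attr::Predicate("in")) {
                for loc in starting_at(spans, pr.range.end) {
                    if let Attr::Location(code) = &loc.attr {
                        found.push(Span {
                            range: p.range.start..loc.range.end,
                            attr: Attr::Address { person, location: *code },
                        });
                    }
                }
            }
        }
    }
    found
}

fn recognize(tokens: &[&str]) -> Vec<Span> {
    let mut spans = tag_words(tokens);
    let addresses = find_addresses(&spans);
    spans.extend(addresses);
    spans
}

fn main() {
    for span in recognize(&["I", "live", "in", "New", "York", "City"]) {
        println!("{:?}", span);
    }
}
```

The Address layer never inspects the words "New", "York", or "City" directly. It only asks for a Location span in the right position, so the same rule applies unchanged to the one-word "Amsterdam".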
