2 releases

Uses old Rust 2015

0.1.1 Sep 24, 2016
0.1.0 Sep 22, 2016

GPL-3.0 license

malk-lexer

A Unicode lexer for use as a first pass when writing a parser.

The main function exported by this library is lex, which takes a &str and a table of valid symbols and converts the input into a token tree.

The kinds of token recognized by the lexer are:

  • Idents: A string starting with an XID_Start character followed by a sequence of XID_Continue characters.
  • Whitespace: Any sequence of whitespace characters.
  • Brackets: Any opening bracket character, its corresponding closing bracket, and the tokens in between, returned as a sub-tree.
  • Symbols: Any string that appears in the symbol table provided to lex.
  • Strings: A string enclosed in either " or ', which may contain escaped characters.
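
To make the token kinds above concrete, here is a minimal, self-contained sketch of a lexer that produces a comparable token tree. It is not malk-lexer's actual API: the Token enum, this lex signature, and the helper functions are all hypothetical names chosen for illustration. It also approximates XID_Start and XID_Continue with the standard library's char::is_alphabetic and char::is_alphanumeric, handles only the ASCII bracket pairs, and assumes the longest matching symbol wins.

```rust
// Illustrative sketch only: these types and this `lex` signature are
// assumptions for the example, not malk-lexer's real API.
#[derive(Debug, PartialEq)]
pub enum Token {
    Ident(String),
    Whitespace(String),
    Symbol(String),
    Str(String),
    Bracket { open: char, close: char, inner: Vec<Token> },
}

// Only ASCII bracket pairs are handled here (an assumption of this sketch).
fn closing(open: char) -> Option<char> {
    match open { '(' => Some(')'), '[' => Some(']'), '{' => Some('}'), _ => None }
}

fn starts_with(rest: &[char], sym: &str) -> bool {
    let mut it = rest.iter();
    !sym.is_empty() && sym.chars().all(|c| it.next() == Some(&c))
}

pub fn lex(src: &str, symbols: &[&str]) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut pos = 0;
    lex_until(&chars, &mut pos, symbols, None)
}

// Lexes tokens until `until` (a closing bracket) or end of input is reached.
fn lex_until(chars: &[char], pos: &mut usize, symbols: &[&str],
             until: Option<char>) -> Result<Vec<Token>, String> {
    let mut out = Vec::new();
    while *pos < chars.len() {
        let c = chars[*pos];
        let rest = &chars[*pos..];
        if Some(c) == until {
            *pos += 1;
            return Ok(out);
        } else if let Some(close) = closing(c) {
            // Brackets: recurse so the enclosed tokens become a sub-tree.
            *pos += 1;
            let inner = lex_until(chars, pos, symbols, Some(close))?;
            out.push(Token::Bracket { open: c, close, inner });
        } else if matches!(c, ')' | ']' | '}') {
            return Err(format!("unexpected {:?} at char {}", c, *pos));
        } else if c.is_whitespace() {
            let n = rest.iter().take_while(|ch| ch.is_whitespace()).count();
            out.push(Token::Whitespace(rest[..n].iter().collect()));
            *pos += n;
        } else if c.is_alphabetic() {
            // Approximates XID_Start followed by XID_Continue characters.
            let n = rest.iter().take_while(|ch| ch.is_alphanumeric() || **ch == '_').count();
            out.push(Token::Ident(rest[..n].iter().collect()));
            *pos += n;
        } else if c == '"' || c == '\'' {
            // Strings: read up to the matching quote, resolving escapes.
            *pos += 1;
            let mut text = String::new();
            loop {
                match chars.get(*pos).copied() {
                    None => return Err("unterminated string".to_string()),
                    Some(q) if q == c => { *pos += 1; break; }
                    Some('\\') => {
                        let esc = chars.get(*pos + 1).copied()
                            .ok_or_else(|| "dangling escape".to_string())?;
                        text.push(match esc { 'n' => '\n', 't' => '\t', other => other });
                        *pos += 2;
                    }
                    Some(other) => { text.push(other); *pos += 1; }
                }
            }
            out.push(Token::Str(text));
        } else if let Some(sym) = symbols.iter().copied()
            .filter(|s| starts_with(rest, s))
            .max_by_key(|s| s.chars().count())
        {
            // Symbols: the longest entry of the table that matches here.
            out.push(Token::Symbol(sym.to_string()));
            *pos += sym.chars().count();
        } else {
            return Err(format!("unrecognized input {:?} at char {}", c, *pos));
        }
    }
    match until {
        Some(close) => Err(format!("missing closing {:?}", close)),
        None => Ok(out),
    }
}
```

In this sketch, calling lex on the input foo -> (bar) with the symbol table ["->", "-"] yields an Ident, a Whitespace, the Symbol "->", another Whitespace, and a Bracket whose inner tokens hold the Ident "bar". Unbalanced brackets and characters that match no rule are reported as errors instead of being skipped.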

Patches welcome!
