
tuker

A small tokenizer/parser library with an emphasis on usability

1 unstable release

0.1.0 Oct 2, 2024


MIT license

46KB
922 lines

Tuker

Tuker is a small tokenizer/parser library with an emphasis on usability. Tokenize text in two lines of code, then parse it in two more. Navigate parse trees using simple functions that aren't brittle.

Tuker is only the latest in a long series of lexer/parser libraries I've written in a number of languages. This one is an evolution of tuckey, using many of the same principles while also being much easier to work with.

# example.toml

[tokenizer]

l_paren = "[(]"
r_paren = "[)]"
word = "[a-zA-Z_]+"
number = "[0-9]+"

[parser]

main = "expr"
expr = "[word number list]"
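list = "(l_paren expr* r_paren)"

// main.rs

use std::fs;

// Tokenizer and Parser come from the tuker crate.
// The same TOML table configures both the tokenizer and the parser.
let table_input = &fs::read_to_string("example.toml")
    .expect("Error reading table");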
let input_text = "(add 1 2)";

let tokenizer = Tokenizer::from_toml(table_input)
    .expect("Error constructing tokenizer");
let tokens = tokenizer.tokenize_str(&input_text);

let parser = Parser::from_toml(table_input, &tokenizer)
    .expect("Error constructing parser");
let parse_tree = parser.parse_tokens("main", &tokens, &tokenizer)
    .expect("Error parsing tokens");

See the examples folder for more comprehensive examples.
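
To make the example table more concrete, here is a hand-written sketch of what it appears to describe, using only the standard library and none of Tuker's code. Everything in it is an assumption made for illustration: the names `Token`, `Node`, `tokenize`, `parse_expr`, `parse_list` and `parse` are hypothetical, the square brackets in `expr` are read as "any one of", the parentheses in `list` are read as "these in order", whitespace is assumed to be skipped, and `main` is assumed to require that every token is consumed. Tuker's actual behavior may differ.

```rust
// Standalone sketch of the grammar in example.toml. Not Tuker's API.

#[derive(Debug, PartialEq)]
enum Token {
    LParen,
    RParen,
    Word(String),
    Number(String),
}

#[derive(Debug, PartialEq)]
enum Node {
    Word(String),
    Number(String),
    List(Vec<Node>),
}

/// Split the input into the four token kinds from the [tokenizer] table.
/// Assumption: whitespace is skipped and any other character is an error.
fn tokenize(input: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '(' {
            tokens.push(Token::LParen); // l_paren = "[(]"
            i += 1;
        } else if c == ')' {
            tokens.push(Token::RParen); // r_paren = "[)]"
            i += 1;
        } else if c.is_ascii_alphabetic() || c == '_' {
            // word = "[a-zA-Z_]+"
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphabetic() || chars[i] == '_') {
                i += 1;
            }
            tokens.push(Token::Word(chars[start..i].iter().collect()));
        } else if c.is_ascii_digit() {
            // number = "[0-9]+"
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            tokens.push(Token::Number(chars[start..i].iter().collect()));
        } else {
            return Err(format!("unexpected character {:?} at index {}", c, i));
        }
    }
    Ok(tokens)
}

/// expr: exactly one of word, number, or list.
fn parse_expr(tokens: &[Token], pos: &mut usize) -> Result<Node, String> {
    match tokens.get(*pos) {
        Some(Token::Word(w)) => {
            *pos += 1;
            Ok(Node::Word(w.clone()))
        }
        Some(Token::Number(n)) => {
            *pos += 1;
            Ok(Node::Number(n.clone()))
        }
        Some(Token::LParen) => parse_list(tokens, pos),
        Some(Token::RParen) => Err(format!("unexpected ')' at token {}", *pos)),
        None => Err("unexpected end of input".to_string()),
    }
}

/// list: l_paren, then zero or more expr, then r_paren.
fn parse_list(tokens: &[Token], pos: &mut usize) -> Result<Node, String> {
    *pos += 1; // consume the l_paren that the caller already matched
    let mut items = Vec::new();
    loop {
        match tokens.get(*pos) {
            Some(Token::RParen) => {
                *pos += 1;
                return Ok(Node::List(items));
            }
            None => return Err("missing ')'".to_string()),
            Some(_) => items.push(parse_expr(tokens, pos)?),
        }
    }
}

/// main: a single expr that consumes every token.
fn parse(tokens: &[Token]) -> Result<Node, String> {
    let mut pos = 0;
    let node = parse_expr(tokens, &mut pos)?;
    if pos != tokens.len() {
        return Err(format!("unexpected trailing token at {}", pos));
    }
    Ok(node)
}
```

Calling `tokenize("(add 1 2)")` and passing the result to `parse` yields a `Node::List` holding the word `add` followed by the numbers `1` and `2`. Writing this by hand takes dozens of lines for even a tiny grammar, which is the kind of work a table-driven library like Tuker is meant to replace.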
