#tokenization #utils #parser #tokenize

macro build-trie

Procedural macro for generating match and state code representing a trie structure

2 releases

0.1.1 Mar 26, 2021
0.1.0 Mar 10, 2021

MIT license

10KB
206 lines

This crate has been replaced by derive-finite-automaton, which provides the same functionality with a redesigned API.


build-trie


A procedural macro for building a state machine / trie. Its main use is lexing tokens that are more than one character wide; see example/src/main.rs.

Run example:

cargo run -p example

View example macro expansion (requires cargo-expand):

cargo expand -p example

Probably over engineered ©

Design is WIP and subject to change.

Example

The example below works as follows:

  • function: ... defines the name of the function to generate (in this case get_symbol_from_state_and_char), which the lexer then calls.
  • The function takes two arguments: a reference to the previous state (if there is no state yet, use state_enum::None) and a character.
  • It returns a result_enum, which has two forms. result_enum::Result(result, character_consumed) carries the result that matched and whether the character was used in constructing it (if not, rerun the lex loop on that character). result_enum::NewState indicates a new state, which should be assigned somewhere.
  • result: ... indicates the return type of this trie.

use build_trie::build_trie;

#[derive(Debug)]
pub enum Tokens {
    OpenBrace, CloseBrace, ArrowFunction, Equal, StrictEqual, Assign, Literal(String)
}

build_trie! {
    function: fn get_symbol_from_state_and_char;
    result: Tokens;
    state_enum: enum SymbolState;
    result_enum: enum SymbolStateResult;
    mappings: {
        "{" => Tokens::OpenBrace,
        "}" => Tokens::CloseBrace,
        "=>" => Tokens::ArrowFunction,
        "==" => Tokens::Equal,
        "===" => Tokens::StrictEqual,
        "=" => Tokens::Assign
    }
}
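
The lex loop described in the bullets can be sketched without the macro. The code below is a hand-written stand-in, not the actual expansion of build_trie!: the variant names SymbolState::Eq and SymbolState::EqEq, the Option return type (None for a character that cannot start a symbol), and the helper lex are all assumptions made for illustration. Run cargo expand -p example to see what the macro really generates.

```rust
#[derive(Debug, PartialEq)]
pub enum Tokens {
    OpenBrace, CloseBrace, ArrowFunction, Equal, StrictEqual, Assign,
}

// Hypothetical state enum: one variant per partially matched prefix.
#[derive(Debug, PartialEq)]
pub enum SymbolState { None, Eq, EqEq }

// Hypothetical result enum mirroring the two forms described above.
pub enum SymbolStateResult {
    Result(Tokens, bool), // (matched token, was the character consumed?)
    NewState(SymbolState),
}

use SymbolStateResult as R;

// Hand-written stand-in for the generated function (assumed signature).
// Returns None when `chr` cannot start any symbol.
pub fn get_symbol_from_state_and_char(state: &SymbolState, chr: char) -> Option<R> {
    Some(match (state, chr) {
        (SymbolState::None, '{') => R::Result(Tokens::OpenBrace, true),
        (SymbolState::None, '}') => R::Result(Tokens::CloseBrace, true),
        (SymbolState::None, '=') => R::NewState(SymbolState::Eq),
        (SymbolState::None, _) => return None,
        (SymbolState::Eq, '>') => R::Result(Tokens::ArrowFunction, true),
        (SymbolState::Eq, '=') => R::NewState(SymbolState::EqEq),
        // "=" followed by something else: emit Assign, leave chr unconsumed.
        (SymbolState::Eq, _) => R::Result(Tokens::Assign, false),
        (SymbolState::EqEq, '=') => R::Result(Tokens::StrictEqual, true),
        // "==" followed by something else: emit Equal, leave chr unconsumed.
        (SymbolState::EqEq, _) => R::Result(Tokens::Equal, false),
    })
}

// Hypothetical lex loop: feeds characters through the state machine.
pub fn lex(source: &str) -> Vec<Tokens> {
    let mut tokens = Vec::new();
    let mut state = SymbolState::None;
    let mut chars = source.chars().peekable();
    while let Some(&chr) = chars.peek() {
        match get_symbol_from_state_and_char(&state, chr) {
            Some(R::NewState(next)) => {
                state = next; // assign the new state and advance
                chars.next();
            }
            Some(R::Result(token, consumed)) => {
                tokens.push(token);
                state = SymbolState::None;
                // If the character was not consumed, the loop reruns on it.
                if consumed {
                    chars.next();
                }
            }
            // Not part of any symbol (e.g. whitespace): skip it.
            None => {
                chars.next();
            }
        }
    }
    // End of input: flush a pending prefix that is itself a complete token.
    match state {
        SymbolState::Eq => tokens.push(Tokens::Assign),
        SymbolState::EqEq => tokens.push(Tokens::Equal),
        SymbolState::None => {}
    }
    tokens
}
```

Calling lex("{ = => == === }") in this sketch yields OpenBrace, Assign, ArrowFunction, Equal, StrictEqual, CloseBrace. The unconsumed-character case is what lets "=" and "==" be emitted only once the following character shows that no longer symbol matches.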

Dependencies

~1.5MB
~34K SLoC