3 releases (breaking)
| Version | Released |
|---|---|
| 0.3.0 | Mar 12, 2023 |
| 0.2.0 | Mar 5, 2023 |
| 0.0.1 | Feb 24, 2023 |
Syntax
This crate holds the Hebi lexer, parser, and AST.
The lexer is automatically generated using logos. The parser is a hand-written recursive descent parser.
Indentation is lexed by assigning the first non-whitespace token on each line the number of whitespace characters that precede it. For example:
asdf
  asdf
  asdf asdf
Would produce the following tokens:
Identifier("asdf", indentation_level=0)
Identifier("asdf", indentation_level=2)
Identifier("asdf", indentation_level=2)
Identifier("asdf", indentation_level=None)
Note the last token, which doesn't have any indentation, because it is not the first non-whitespace token on its line.
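The rule above can be illustrated with a minimal sketch. It does not use logos and is not Hebi's actual lexer: the `Token` struct and the `lex_with_indentation` function are hypothetical names introduced here, and tokens are simply whitespace-separated words.

```rust
/// A token paired with the indentation of its line, if it starts that line.
/// (Hypothetical type for illustration; not part of the Hebi crate.)
#[derive(Debug, PartialEq)]
pub struct Token<'a> {
    pub text: &'a str,
    /// `Some(n)` for the first non-whitespace token on a line, `None` otherwise.
    pub indentation_level: Option<usize>,
}

/// Splits `src` into whitespace-separated words and attaches the line's
/// leading whitespace count to the first word of each line.
pub fn lex_with_indentation(src: &str) -> Vec<Token<'_>> {
    let mut tokens = Vec::new();
    for line in src.lines() {
        // Count the whitespace characters that precede the first token.
        let indent = line.chars().take_while(|c| c.is_whitespace()).count();
        for (i, text) in line.split_whitespace().enumerate() {
            tokens.push(Token {
                text,
                // Only the first token on the line carries indentation.
                indentation_level: if i == 0 { Some(indent) } else { None },
            });
        }
    }
    tokens
}
```

Calling `lex_with_indentation("asdf\n  asdf\n  asdf asdf")` yields four tokens whose indentation levels are `Some(0)`, `Some(2)`, `Some(2)`, and `None`, matching the token list shown above.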
The parser uses the indentation levels to track blocks using these functions:
- no_indent: no indentation may be attached to the current token.
- indent_eq: the indentation level of the current token is equal to the level on top of the indentation stack.
- indent_gt: the indentation level of the current token is greater than the level on top of the indentation stack. This function also pushes the new indentation level onto the indentation stack.
- dedent: the indentation level of the current token is lower than the level on top of the indentation stack. This function also pops the last indentation level off the indentation stack.
These functions are used to query for indentation at strategic places, but the parser code can be written without caring about the indentation where it doesn't matter. For example, see the import_stmt node, which does not care about indentation at all, and so it doesn't have to track it, either!
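As a rough sketch of how the four indentation queries might behave, the following models them on a plain stack of levels. The method names come from the list above, but the `Indent` type, the signatures, and the choice to start the stack at level 0 are assumptions made for illustration, not the crate's actual implementation.

```rust
/// Hypothetical indentation tracker; the signatures are assumptions.
pub struct Indent {
    stack: Vec<usize>,
}

impl Indent {
    pub fn new() -> Self {
        // Assumption: top-level code starts at indentation 0.
        Indent { stack: vec![0] }
    }

    /// The level currently on top of the stack.
    fn current(&self) -> usize {
        *self.stack.last().expect("stack always holds the base level")
    }

    /// True if the token carries no indentation (it is not first on its line).
    pub fn no_indent(&self, level: Option<usize>) -> bool {
        level.is_none()
    }

    /// True if the token's indentation equals the top of the stack.
    pub fn indent_eq(&self, level: Option<usize>) -> bool {
        level == Some(self.current())
    }

    /// True if the token is indented deeper; also pushes the new level.
    pub fn indent_gt(&mut self, level: Option<usize>) -> bool {
        match level {
            Some(n) if n > self.current() => {
                self.stack.push(n);
                true
            }
            _ => false,
        }
    }

    /// True if the token is indented shallower; also pops one level.
    pub fn dedent(&mut self, level: Option<usize>) -> bool {
        match level {
            // Never pop the base level.
            Some(n) if n < self.current() && self.stack.len() > 1 => {
                self.stack.pop();
                true
            }
            _ => false,
        }
    }
}
```

In this sketch a block parser would call `indent_gt` on the first token of a block body, `indent_eq` on each following statement, and `dedent` when the block ends, while a node such as import_stmt would call none of them.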