2 stable releases
| Version | Released |
|---|---|
| 1.0.1 | Mar 19, 2024 |
| 1.0.0 | Mar 5, 2024 |
Lexer module
The lexer module is responsible for tokenising input strings. It supports token types such as identifiers, numbers, strings, and operators, and uses a cursor-based approach to iterate over the input string and extract tokens.
The lexer is implemented as a struct called `Lexer`, which provides methods for tokenising input strings into individual tokens. The `Lexer` struct contains an iterator over the characters of the input string and uses this iterator to extract tokens from the input.
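As a rough illustration of the cursor-based approach, the sketch below wraps a peekable character iterator and tracks a byte offset. The name `CharCursor` and its methods (`peek`, `bump`, `eat_while`, `pos`) are hypothetical and chosen for this example; the crate's actual `Lexer` fields and helpers may differ.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical stand-in for the iterator-holding part of a lexer.
pub struct CharCursor<'a> {
    chars: Peekable<Chars<'a>>,
    /// Byte offset of the next unread character.
    pos: usize,
}

impl<'a> CharCursor<'a> {
    pub fn new(input: &'a str) -> Self {
        CharCursor { chars: input.chars().peekable(), pos: 0 }
    }

    /// Look at the next character without consuming it.
    pub fn peek(&mut self) -> Option<char> {
        self.chars.peek().copied()
    }

    /// Consume one character and advance the byte offset.
    pub fn bump(&mut self) -> Option<char> {
        let c = self.chars.next()?;
        self.pos += c.len_utf8();
        Some(c)
    }

    /// Consume characters while `pred` holds; return the number of bytes consumed.
    pub fn eat_while(&mut self, pred: impl Fn(char) -> bool) -> usize {
        let start = self.pos;
        while let Some(c) = self.peek() {
            if !pred(c) {
                break;
            }
            self.bump();
        }
        self.pos - start
    }

    /// Current byte offset into the input.
    pub fn pos(&self) -> usize {
        self.pos
    }
}
```

Tracking the offset in bytes rather than characters keeps the positions usable for slicing the original `&str` later.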
The `Lexer` struct provides a method called `next_token`, which advances the lexer to the next token in the input stream and returns the token. This method is essentially a large switch statement, containing branches corresponding to every token type. The `next_token` method skips any whitespace and comments before identifying the next token.
The token is represented by a `Token` struct, which contains information about its kind (e.g., identifier, operator, literal) and its span in the input stream.
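The following is a minimal, self-contained sketch of how a `next_token` method and a `Token` with a kind and a span could fit together. It is not the crate's implementation: the `TokenKind` variants, the `Span` layout, the `#`-to-end-of-line comment syntax, the operator set, and the `text` helper are all assumptions made for illustration.

```rust
use std::iter::Peekable;
use std::str::CharIndices;

/// Assumed token kinds; the real crate may define more or different ones.
#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind { Ident, Number, Str, Operator, Unknown, Eof }

/// Half-open byte range `[start, end)` into the input (assumed layout).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span { pub start: usize, pub end: usize }

#[derive(Debug, Clone, PartialEq)]
pub struct Token { pub kind: TokenKind, pub span: Span }

pub struct Lexer<'a> {
    src: &'a str,
    chars: Peekable<CharIndices<'a>>,
}

impl<'a> Lexer<'a> {
    pub fn new(src: &'a str) -> Self {
        Lexer { src, chars: src.char_indices().peekable() }
    }

    /// Byte offset of the next unread character (input length at the end).
    fn offset(&mut self) -> usize {
        self.chars.peek().map_or(self.src.len(), |&(i, _)| i)
    }

    /// Consume characters while `pred` holds.
    fn eat_while(&mut self, pred: impl Fn(char) -> bool) {
        while self.chars.next_if(|&(_, c)| pred(c)).is_some() {}
    }

    /// Skip whitespace and `#` line comments (comment syntax is an assumption).
    fn skip_trivia(&mut self) {
        loop {
            self.eat_while(char::is_whitespace);
            let at_comment = matches!(self.chars.peek(), Some(&(_, '#')));
            if !at_comment {
                break;
            }
            self.eat_while(|c| c != '\n');
        }
    }

    /// Advance to the next token and return it; one `match` arm per token type.
    pub fn next_token(&mut self) -> Token {
        self.skip_trivia();
        let start = self.offset();
        let kind = match self.chars.next() {
            None => TokenKind::Eof,
            Some((_, c)) if c.is_alphabetic() || c == '_' => {
                self.eat_while(|c| c.is_alphanumeric() || c == '_');
                TokenKind::Ident
            }
            Some((_, c)) if c.is_ascii_digit() => {
                self.eat_while(|c| c.is_ascii_digit());
                TokenKind::Number
            }
            Some((_, '"')) => {
                self.eat_while(|c| c != '"');
                self.chars.next(); // consume the closing quote, if present
                TokenKind::Str
            }
            Some((_, '+' | '-' | '*' | '/' | '=')) => TokenKind::Operator,
            Some(_) => TokenKind::Unknown,
        };
        let end = self.offset();
        Token { kind, span: Span { start, end } }
    }

    /// Source text covered by a span.
    pub fn text(&self, span: Span) -> &'a str {
        &self.src[span.start..span.end]
    }
}
```

Storing only a kind and a span keeps tokens small; the token text can be recovered on demand by slicing the input, as the hypothetical `text` helper does. A caller would typically invoke `next_token` in a loop until it sees an end-of-input kind.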
The lexer module is used by the parser to tokenise the input string before parsing it into an abstract syntax tree (AST).