1 unstable release: 0.1.0 (Sep 11, 2024)
Note: This readme is auto generated. Please refer to the docs.
Lexerus
Lexerus is a lexer dinosaur that consumes a [Buffer] constructed from [str] and spits out a structure through the lexer::Lexer::lex call.
This library uses the lexer_derive::Token and lexer_derive::Lexer macros to decorate a structure for automatic parsing. See those macros for additional options.
This library was developed in conjunction with SPEW, where examples of real-world usage can be found.
Example
// Create and decorate a struct
#[derive(Lexer, Token, Debug)]
struct Trex<'code>(#[pattern = "trex::"] Buffer<'code>);
#[derive(Lexer, Token, Debug)]
struct TrexCall<'code>(
#[pattern = "RAWR"] Buffer<'code>,
);
#[derive(Lexer, Token, Debug)]
struct Call<'code> {
rex: Trex<'code>,
call: TrexCall<'code>,
}
// Create a raw buffer
let mut buffer = Buffer::from("trex::RAWR");
// Attempt to parse the trex
let trex_calling = Call::lex(&mut buffer).unwrap();
// Extract the buffer from trex
let trex = trex_calling.rex.buffer().unwrap();
let trex_calling = trex_calling.buffer().unwrap();
// Buffer should contain the exact matched string
assert_eq!(trex_calling.to_string(), "trex::RAWR");
assert_eq!(trex.to_string(), "trex::");
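For readers who want a feel for what sequential pattern matching over a borrowed string involves, here is a minimal std-only sketch. It is not the code the derive macros generate and it does not use the crate's API: `Cursor`, `eat`, and `lex_call` are hypothetical names introduced only for illustration. It mirrors the `Call` example above by matching `"trex::"` followed by `"RAWR"` and returning slices borrowed from the source, without allocating.

```rust
/// Hypothetical cursor over a source string; NOT the crate's `Buffer`.
#[derive(Debug, Clone, Copy)]
struct Cursor<'code> {
    source: &'code str,
    pos: usize,
}

impl<'code> Cursor<'code> {
    fn new(source: &'code str) -> Self {
        Cursor { source, pos: 0 }
    }

    /// Consume `pattern` if the remaining input starts with it and
    /// return the matched slice, borrowed from the original source.
    fn eat(&mut self, pattern: &str) -> Option<&'code str> {
        let source: &'code str = self.source;
        let rest = &source[self.pos..];
        if rest.starts_with(pattern) {
            self.pos += pattern.len();
            Some(&rest[..pattern.len()])
        } else {
            None
        }
    }
}

/// Hand-written stand-in for the derived `Call` struct (illustrative only).
#[derive(Debug)]
struct Call<'code> {
    rex: &'code str,
    call: &'code str,
}

/// Match each field in declaration order; fail if any field fails.
fn lex_call<'code>(cursor: &mut Cursor<'code>) -> Option<Call<'code>> {
    // Work on a copy so a failed match leaves the caller's cursor untouched.
    let mut attempt = *cursor;
    let rex = attempt.eat("trex::")?;
    let call = attempt.eat("RAWR")?;
    *cursor = attempt;
    Some(Call { rex, call })
}

fn main() {
    let mut cursor = Cursor::new("trex::RAWR");
    let parsed = lex_call(&mut cursor).expect("input should match");
    assert_eq!(parsed.rex, "trex::");
    assert_eq!(parsed.call, "RAWR");
    assert_eq!(cursor.pos, 10);
}
```

Copying the cursor before matching is one simple way to make a failed parse leave the input position unchanged; whether Lexerus behaves this way on failure is not stated in this readme.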
Goals
- No heap allocations when parsing, with some exceptions:
  - Helpers such as [GroupUntil] allocate a [Vec] to store each parsed [Buffer] as an individual unit. Contrast this with [Group], which only captures the overall [Buffer] output without separating individual units.
  - Calling Token::buffer on non-contiguous or repeated sections of text allocates on the heap. This is unavoidable because the separate [str] sections have to be stitched together, and the only way to do so is with a heap allocation.
- Proper debuggable information, i.e. the [Buffer] retains information about its source and the exact range on the source.
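The two goals above can be illustrated with a small std-only sketch. It assumes a hypothetical `Span` type (a borrowed source plus a byte range) and a hypothetical `join` function; neither is part of the crate, and this is not how `Token::buffer` is implemented. It only shows why adjacent ranges of one source can be returned as a borrowed slice, while non-contiguous ranges have to be copied into a newly allocated `String`.

```rust
use std::borrow::Cow;
use std::ops::Range;

/// Hypothetical span: remembers its source and the exact byte range in it.
#[derive(Debug, Clone)]
struct Span<'code> {
    source: &'code str,
    range: Range<usize>,
}

impl<'code> Span<'code> {
    /// The text this span covers, borrowed from the source.
    fn as_str(&self) -> &'code str {
        let source: &'code str = self.source;
        &source[self.range.clone()]
    }
}

/// Join two spans. Adjacent spans of the same source are re-sliced
/// without allocating; anything else is copied into a new `String`.
fn join<'code>(a: &Span<'code>, b: &Span<'code>) -> Cow<'code, str> {
    let source: &'code str = a.source;
    if std::ptr::eq(a.source, b.source) && a.range.end == b.range.start {
        Cow::Borrowed(&source[a.range.start..b.range.end])
    } else {
        let mut owned = String::with_capacity(a.range.len() + b.range.len());
        owned.push_str(a.as_str());
        owned.push_str(b.as_str());
        Cow::Owned(owned)
    }
}

fn main() {
    let source = "trex::RAWR and more";
    let rex = Span { source, range: 0..6 };
    let call = Span { source, range: 6..10 };
    let more = Span { source, range: 15..19 };

    // Contiguous: the result still points into `source`.
    let joined = join(&rex, &call);
    assert!(matches!(joined, Cow::Borrowed(_)));
    assert_eq!(joined, "trex::RAWR");

    // Non-contiguous: the pieces must be stitched into a heap allocation.
    let stitched = join(&rex, &more);
    assert!(matches!(stitched, Cow::Owned(_)));
    assert_eq!(stitched, "trex::more");

    // The span keeps its position in the source for debugging.
    assert_eq!(more.range, 15..19);
}
```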