3 releases
Uses new Rust 2024
Version | Released |
---|---|
0.0.3 (new) | Apr 20, 2025 |
0.0.2 | Apr 19, 2025 |
0.0.1 | Apr 19, 2025 |
#986 in Algorithms
159 downloads per month
Used in incpa-tokio
43KB, 1K SLoC
incpa
incpa is an incremental parser composition crate.
Incremental parsers process a chunk of input, then produce one of three results: an error, a parsed output, or an updated parser state ready for future input. This primitive, codified by parsing::ParserState::feed, lets the same parser definition parse streaming input from async or sync sources, and supports other "incremental" use cases such as interactive REPL parsing.
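The feed cycle described above can be sketched with a stand-alone toy parser. This is not incpa's actual API: the `Step` enum, the `DecimalParser` type, its `feed` method, and the `parse_chunks` driver are hypothetical names chosen for illustration. Only the three-way outcome (error, output, or updated state) comes from the description.

```rust
/// Outcome of feeding one chunk (hypothetical; not incpa's real types).
#[derive(Debug, PartialEq)]
enum Step<S, O, E> {
    /// More input is needed; carry this state into the next call.
    Pending(S),
    /// Parsing finished; `consumed` counts bytes used from the final chunk.
    Done { output: O, consumed: usize },
    /// Parsing failed.
    Failed(E),
}

/// Toy incremental parser: one or more ASCII digits terminated by b'\n'.
#[derive(Debug, Default, PartialEq)]
struct DecimalParser {
    value: u64,
    seen_digit: bool,
}

impl DecimalParser {
    /// Consumes the parser state and one chunk, returning the next step.
    fn feed(mut self, chunk: &[u8]) -> Step<Self, u64, String> {
        for (i, &b) in chunk.iter().enumerate() {
            match b {
                b'0'..=b'9' => {
                    let digit = u64::from(b - b'0');
                    let next = self.value.checked_mul(10).and_then(|v| v.checked_add(digit));
                    self.value = match next {
                        Some(v) => v,
                        None => return Step::Failed("integer overflow".to_string()),
                    };
                    self.seen_digit = true;
                }
                b'\n' if self.seen_digit => {
                    return Step::Done { output: self.value, consumed: i + 1 };
                }
                other => return Step::Failed(format!("unexpected byte {other:#04x}")),
            }
        }
        // The chunk ended before the terminator: hand back the updated state.
        Step::Pending(self)
    }
}

/// Drives the parser over a sequence of chunks, as a sync or async read loop would.
fn parse_chunks(chunks: &[&[u8]]) -> Result<u64, String> {
    let mut parser = DecimalParser::default();
    for chunk in chunks {
        match parser.feed(chunk) {
            Step::Pending(next) => parser = next,
            Step::Done { output, .. } => return Ok(output),
            Step::Failed(e) => return Err(e),
        }
    }
    Err("unexpected end of input".to_string())
}
```

Because `feed` takes the state by value and returns the next state, the driver loop is the only part that knows where chunks come from, so the same parser could sit behind a blocking reader or an async stream.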
Support for async input streams is provided in the downstream incpa-tokio crate.
The term "parser composition" emphasizes how sophisticated parsers can be defined by composing simpler parsers.
Example
use incpa::BaseParserError;
use incpa::primitive::remaining;
use incpa::Parser;

fn main() -> Result<(), BaseParserError> {
    let parser = define_my_parser();
    // Parse the complete input in a single call.
    let output = parser.parse_all("Hello World!")?;
    assert_eq!(output, ("Hello", " World!".to_string()));
    Ok(())
}

fn define_my_parser() -> impl Parser<str, Output=(&'static str, String), Error=BaseParserError> {
    // Match the literal "Hello", then capture whatever input remains.
    "Hello".then(remaining())
}
Trade-offs
There is a fundamental trade-off between streaming parsers, which this crate specializes in, and "zero-copy" parsers, whose parsed values refer back to the original input buffer. Zero-copy parsers reduce memory footprint and copying at the cost of requiring all input to be held in memory, whereas streaming parsers can parse very large inputs at the cost of internally copying input where necessary.
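The difference shows up directly in the types. The sketch below is a stand-alone illustration using only the standard library. `first_word_zero_copy` and `FirstWordStreaming` are hypothetical names and are not part of incpa.

```rust
/// Zero-copy: the returned word borrows from `input`, so the whole input
/// must stay in memory for as long as the output is in use.
fn first_word_zero_copy(input: &str) -> &str {
    input.split_whitespace().next().unwrap_or("")
}

/// Streaming: each chunk may be dropped after it is fed, so the parser
/// copies the characters it needs into an owned `String`.
#[derive(Default)]
struct FirstWordStreaming {
    word: String,
}

impl FirstWordStreaming {
    /// Returns `Some(word)` once whitespace terminates a non-empty word,
    /// or `None` if more input is needed.
    fn feed(&mut self, chunk: &str) -> Option<String> {
        for ch in chunk.chars() {
            if ch.is_whitespace() {
                if !self.word.is_empty() {
                    // Hand over the accumulated word and reset the buffer.
                    return Some(std::mem::take(&mut self.word));
                }
                // Leading whitespace is skipped.
            } else {
                self.word.push(ch);
            }
        }
        None
    }
}
```

The zero-copy version allocates nothing but ties the output's lifetime to the input buffer. The streaming version pays for one copy per character it keeps, yet never needs more than the current chunk in memory.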
Related Projects
This crate is inspired by chumsky, which is an excellent and mature parser composition crate. Another inspiration is parsec in Haskell-land.
Status
This crate is in the 0.0.x phase: an early proof of concept with unstable APIs.
0.1.0 Feature Goals
- A basic suite of general composition abstractions such as Parser::then and Parser::or with backtracking support.
- Support for both string parsers and slice parsers (including byte slices).
- Efficient streaming string parsing from byte-oriented I/O sources using UTF-8 decoding.
- Common generic primitive parsers, such as end-of-input, constants, and literals.
- Common primitive text parsers, such as number literal parsers, whitespace parsers, keyword parsers, etc.
- Common primitive byte-oriented parsers, such as integer types with different endianness, common variable-length integer encodings such as VLQ and LEB128, UTF-8 chars, fixed-size arrays, etc.
- Basic support for non-byte slice parsing with an example token slice parser.
- Location tracking in errors.
- Recursive parsers.
- Basic self benchmarks for comparison across revisions (but not necessarily comparison to alternative parser crates).
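To illustrate the streaming UTF-8 goal in the list above: a multi-byte character can be split across two I/O chunks, so a decoder has to hold back an incomplete trailing sequence until the next chunk arrives. The sketch below uses only `std::str::from_utf8` and `Utf8Error`. The `Utf8Chunks` type is a hypothetical helper, not incpa's implementation, and it favors clarity over efficiency by validating the prefix twice.

```rust
use std::str;

/// Hypothetical helper: decodes UTF-8 from byte chunks, buffering any
/// incomplete trailing sequence until more bytes arrive.
#[derive(Default)]
struct Utf8Chunks {
    pending: Vec<u8>,
}

impl Utf8Chunks {
    /// Appends the text decodable so far to `out`.
    /// Fails on bytes that can never become valid UTF-8.
    fn feed(&mut self, chunk: &[u8], out: &mut String) -> Result<(), str::Utf8Error> {
        self.pending.extend_from_slice(chunk);
        let valid = match str::from_utf8(&self.pending) {
            Ok(text) => text.len(),
            // `error_len() == None` means the input ended mid-character,
            // which is recoverable once the next chunk arrives.
            Err(e) if e.error_len().is_none() => e.valid_up_to(),
            Err(e) => return Err(e),
        };
        // Re-validating the prefix keeps the sketch free of `unsafe`.
        let text = str::from_utf8(&self.pending[..valid]).expect("prefix already validated");
        out.push_str(text);
        self.pending.drain(..valid);
        Ok(())
    }
}
```

For example, feeding the bytes `[b'h', 0xC3]` yields only `"h"`, and the following chunk `[0xA9, b'!']` completes the two-byte encoding of `é` before the `!` is appended.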
Dependencies
~2.5–8.5MB, ~67K SLoC