#lexer #tokenization #token #streaming #streaming-parser #parser #format

text-scanner

A UTF-8 char-oriented, zero-copy, text and code scanning library

3 releases

0.0.3 Jul 1, 2023
0.0.2 Jun 23, 2023
0.0.1 Jun 16, 2023

#1252 in Text processing


Used in 4 crates (via any-lexer)

Zlib license

195KB
3K SLoC

text-scanner


Warning: This library is experimental and may change drastically in 0.0.* versions.


This crate implements Scanner, a UTF-8 char-based text scanner. It provides methods for scanning a string slice and for backtracking, which together can be used to implement lexers that tokenize text or code.
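
The ideas named above (a cursor that always sits on a UTF-8 char boundary, tokens returned as borrowed slices, and backtracking by restoring a saved cursor) can be sketched in plain Rust using only the standard library. This is an illustrative sketch and not the text-scanner API: MiniScanner, take_while, accept_str, and scan_number are hypothetical names invented for this example. See the crate documentation for the real Scanner methods.

```rust
/// Hypothetical minimal scanner for illustration only.
/// This is NOT the `text-scanner` crate's `Scanner` type.
pub struct MiniScanner<'t> {
    text: &'t str,
    /// Byte offset into `text`, always on a UTF-8 char boundary.
    cursor: usize,
}

impl<'t> MiniScanner<'t> {
    pub fn new(text: &'t str) -> Self {
        Self { text, cursor: 0 }
    }

    /// The text that has not been scanned yet (borrowed, never copied).
    pub fn remaining(&self) -> &'t str {
        &self.text[self.cursor..]
    }

    /// Current byte offset; save it to backtrack later.
    pub fn cursor(&self) -> usize {
        self.cursor
    }

    /// Restores a previously saved cursor (backtracking).
    pub fn set_cursor(&mut self, cursor: usize) {
        assert!(self.text.is_char_boundary(cursor));
        self.cursor = cursor;
    }

    /// Consumes one char, advancing by its UTF-8 encoded length.
    pub fn next_char(&mut self) -> Option<char> {
        let c = self.remaining().chars().next()?;
        self.cursor += c.len_utf8();
        Some(c)
    }

    /// Consumes chars while `pred` holds and returns them as a slice.
    pub fn take_while(&mut self, pred: impl Fn(char) -> bool) -> &'t str {
        let rest = self.remaining();
        let len = rest.find(|c: char| !pred(c)).unwrap_or(rest.len());
        self.cursor += len;
        &rest[..len]
    }

    /// Consumes `expected` if it comes next; otherwise leaves the cursor alone.
    pub fn accept_str(&mut self, expected: &str) -> bool {
        let matched = self.remaining().starts_with(expected);
        if matched {
            self.cursor += expected.len();
        }
        matched
    }
}

/// Scans an integer or decimal number and returns it as a slice of the input.
/// If a '.' is not followed by a digit, the scanner backtracks to before the '.'.
pub fn scan_number<'t>(s: &mut MiniScanner<'t>) -> Option<&'t str> {
    let start = s.cursor();
    if s.take_while(|c| c.is_ascii_digit()).is_empty() {
        return None;
    }
    let after_int = s.cursor();
    if s.accept_str(".") && s.take_while(|c| c.is_ascii_digit()).is_empty() {
        // No fractional digits: undo consuming the '.'.
        s.set_cursor(after_int);
    }
    let end = s.cursor();
    Some(&s.text[start..end])
}
```

For example, calling scan_number on a scanner over "42.foo" consumes the '.', finds no digits after it, and backtracks, so only "42" is returned and ".foo" remains unscanned. The returned token is a slice of the original input, so nothing is copied.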

Scanning extensions have already been implemented for a number of existing languages and formats, including Rust, C, Python, CSS, SCSS, JSON, and JSON with Comments.

For examples of lexers implemented using Scanner, see the any-lexer crate, which implements lexers for these languages and formats, among others.

Dependencies