#parsing #tokenization #utils

tokenizer-lib

Tokenization utilities for building parsers in Rust

13 releases (8 stable)

1.5.0 Feb 9, 2023
1.4.0 Nov 26, 2022
1.3.0 Apr 5, 2022
1.2.1 Nov 16, 2021
0.4.1 Feb 22, 2021

#102 in Parser tooling

266 downloads per month
Used in 4 crates (2 directly)

MIT license

30KB
609 lines

Examples

Buffered token channel:

use tokenizer_lib::{BufferedTokenQueue, Token, TokenReader, TokenSender, TokenTrait};

#[derive(PartialEq, Debug)]
struct Span(pub u32, pub u32);

#[derive(PartialEq, Debug)]
struct N(pub u32);

impl TokenTrait for N {}

let mut btq = BufferedTokenQueue::new();
btq.push(Token(N(12), Span(0, 2)));
btq.push(Token(N(32), Span(2, 4)));
btq.push(Token(N(52), Span(4, 8)));
assert_eq!(btq.next().unwrap().0, N(12));
assert_eq!(btq.next().unwrap().0, N(32));
assert_eq!(btq.next().unwrap().0, N(52));
assert!(btq.next().is_none());

Parallel token queue (thread-safe):

use tokenizer_lib::{ParallelTokenQueue, Token, TokenReader, TokenSender, TokenTrait};

#[derive(PartialEq, Debug)]
struct Span(pub u32, pub u32);

#[derive(PartialEq, Debug)]
struct N(pub u32);

impl TokenTrait for N {}

let (mut sender, mut reader) = ParallelTokenQueue::new();
std::thread::spawn(move || {
    sender.push(Token(N(12), Span(0, 2)));
    sender.push(Token(N(32), Span(2, 4)));
    sender.push(Token(N(52), Span(4, 8)));
});

assert_eq!(reader.next().unwrap().0, N(12));
assert_eq!(reader.next().unwrap().0, N(32));
assert_eq!(reader.next().unwrap().0, N(52));
assert!(reader.next().is_none());
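
The final assertion above relies on the reader seeing the end of the stream once the sending thread is done. As a rough illustration of that pattern, here is a minimal sketch using only the standard library's `std::sync::mpsc` channel. It does not use tokenizer-lib and makes no claim about how `ParallelTokenQueue` is implemented; `collect_from_worker` is a hypothetical helper introduced for this example.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper (not part of tokenizer-lib): produce values on a
/// worker thread and collect them on the calling thread.
fn collect_from_worker(values: Vec<u32>) -> Vec<u32> {
    let (sender, receiver) = mpsc::channel();

    let worker = thread::spawn(move || {
        for v in values {
            // Sending fails only if the receiver has been dropped.
            sender.send(v).expect("receiver dropped");
        }
        // `sender` is dropped when the closure returns, closing the channel.
    });

    // `iter()` blocks for each value and stops once the channel is closed.
    let received: Vec<u32> = receiver.iter().collect();
    worker.join().expect("worker panicked");
    received
}

fn main() {
    assert_eq!(collect_from_worker(vec![12, 32, 52]), vec![12, 32, 52]);
}
```

Values arrive in the order they were sent, and the receiving side learns that the stream has ended without any explicit end-of-input token.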

Generator token queue:

use tokenizer_lib::{GeneratorTokenQueue, GeneratorTokenQueueBuffer, Token, TokenReader, TokenSender, TokenTrait};

#[derive(PartialEq, Debug)]
struct N(pub u32);

impl TokenTrait for N {}

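// Generator function: `state` persists between calls.
fn lexer(state: &mut u32, sender: &mut GeneratorTokenQueueBuffer<N, ()>) {
    *state += 1;
    match *state {
        // Emit one token on each of the first three calls
        1..=3 => {
            sender.push(Token(N(*state * 2), ()));
        }
        // Push nothing afterwards, so the reader yields `None`
        _ => {}
    }
}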

let mut reader = GeneratorTokenQueue::new(lexer, 0);

assert_eq!(reader.next().unwrap().0, N(2));
assert_eq!(reader.next().unwrap().0, N(4));
assert_eq!(reader.next().unwrap().0, N(6));
assert!(reader.next().is_none());

Lookahead is available through peek, peek_n and scan. There is also expect_next, which expects a specific token value, and conditional_next, which advances only if a predicate holds.
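
To make the intent of these helpers concrete, here is a minimal, self-contained sketch of the same ideas over a `VecDeque`. The `Lookahead` type and every signature in it are hypothetical and written for this illustration; they are not tokenizer-lib's actual API, so check the crate documentation for the real method signatures. In particular, leaving the queue untouched when `expect_next` fails is an assumption of this sketch.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a token reader; not tokenizer-lib's own type.
struct Lookahead<T> {
    queue: VecDeque<T>,
}

impl<T: PartialEq> Lookahead<T> {
    fn new(tokens: impl IntoIterator<Item = T>) -> Self {
        Self { queue: tokens.into_iter().collect() }
    }

    /// Look at the next token without consuming it.
    fn peek(&self) -> Option<&T> {
        self.queue.front()
    }

    /// Look `n` tokens ahead (0 is the next token) without consuming anything.
    fn peek_n(&self, n: usize) -> Option<&T> {
        self.queue.get(n)
    }

    /// Consume and return the next token.
    fn next(&mut self) -> Option<T> {
        self.queue.pop_front()
    }

    /// Consume the next token only if `pred` accepts it.
    fn conditional_next(&mut self, pred: impl FnOnce(&T) -> bool) -> Option<T> {
        if self.peek().map_or(false, pred) {
            self.next()
        } else {
            None
        }
    }

    /// Consume the next token if it equals `expected`; otherwise leave the
    /// queue untouched and return an error (an assumption of this sketch).
    fn expect_next(&mut self, expected: &T) -> Result<T, &'static str> {
        match self.conditional_next(|t| t == expected) {
            Some(token) => Ok(token),
            None => Err("unexpected token or end of input"),
        }
    }
}

fn main() {
    let mut reader = Lookahead::new(vec![1, 2, 3, 4]);

    assert_eq!(reader.peek(), Some(&1));
    assert_eq!(reader.peek_n(2), Some(&3));

    // 1 is odd, so the predicate rejects it and nothing is consumed.
    assert_eq!(reader.conditional_next(|t| t % 2 == 0), None);
    assert_eq!(reader.expect_next(&1), Ok(1));

    // 2 is even, so it is consumed.
    assert_eq!(reader.conditional_next(|t| t % 2 == 0), Some(2));

    // A failed expectation leaves 3 in place.
    assert!(reader.expect_next(&9).is_err());
    assert_eq!(reader.next(), Some(3));
}
```

Building `expect_next` on top of `conditional_next` keeps the "check, then consume" logic in one place, which is a common design for hand-written parsers.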

No runtime deps

Features

  • buffered
  • generator
  • parallel