pgn-reader

Fast non-allocating and streaming reader for chess games in PGN notation

25 breaking releases

0.26.0 Apr 1, 2024
0.25.0 Jun 10, 2023
0.24.0 May 4, 2023
0.22.0 Dec 22, 2022
0.3.0 Nov 4, 2017


A fast, non-allocating, streaming reader for chess games in PGN notation, as a Rust library.

Introduction

BufferedReader parses games and calls methods of a user-provided Visitor. Implementing custom visitors allows for maximum flexibility:

  • The reader itself does not allocate (besides a single fixed-size buffer). The visitor can decide if and how to represent games in memory.
  • The reader does not validate move legality. This allows implementing support for custom chess variants, or delaying move validation.
  • The visitor can signal to the reader that it does not care about a game or variation.
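
To make the division of labour concrete, here is a minimal, std-only sketch of the idea. It is not pgn-reader's implementation or API: `ToyVisitor`, `drive`, `Counter` and `count_moves` are hypothetical names invented for illustration, and the toy driver assumes movetext without headers or `{ ... }` comments. The driver hands borrowed slices to the visitor, never checks move legality, and honours the visitor's request to skip a variation.

```rust
/// Hypothetical stand-in for a visitor; this is NOT pgn-reader's `Visitor` trait.
trait ToyVisitor {
    /// Called for every token that looks like a move (starts with an ASCII letter).
    fn san(&mut self, token: &str);
    /// Called at each `(`. Return `true` to skip the whole variation.
    fn begin_variation(&mut self) -> bool;
}

/// Toy driver: walks the movetext once and passes borrowed slices to the visitor.
/// Assumes the input contains no headers and no `{ ... }` comments.
fn drive<V: ToyVisitor>(movetext: &str, visitor: &mut V) {
    let bytes = movetext.as_bytes();
    let mut depth = 0usize;
    // Nesting depth of the outermost variation currently being skipped, if any.
    let mut skip_at: Option<usize> = None;
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'(' => {
                depth += 1;
                // Only ask the visitor if we are not already skipping.
                if skip_at.is_none() && visitor.begin_variation() {
                    skip_at = Some(depth);
                }
                i += 1;
            }
            b')' => {
                if skip_at == Some(depth) {
                    skip_at = None;
                }
                depth = depth.saturating_sub(1);
                i += 1;
            }
            b if b.is_ascii_whitespace() => i += 1,
            _ => {
                // Scan to the next delimiter; all delimiters are ASCII,
                // so the slice boundaries are valid UTF-8 boundaries.
                let start = i;
                while i < bytes.len()
                    && !bytes[i].is_ascii_whitespace()
                    && bytes[i] != b'('
                    && bytes[i] != b')'
                {
                    i += 1;
                }
                let token = &movetext[start..i];
                // Move numbers ("2.", "2...") and results ("*", "1-0") do not
                // start with a letter. Legality is never checked.
                let is_move = token.as_bytes()[0].is_ascii_alphabetic();
                if skip_at.is_none() && is_move {
                    visitor.san(token);
                }
            }
        }
    }
}

/// Example visitor: counts moves and optionally skips all variations.
struct Counter {
    moves: usize,
    skip_variations: bool,
}

impl ToyVisitor for Counter {
    fn san(&mut self, _token: &str) {
        self.moves += 1;
    }
    fn begin_variation(&mut self) -> bool {
        self.skip_variations
    }
}

fn count_moves(movetext: &str, skip_variations: bool) -> usize {
    let mut counter = Counter { moves: 0, skip_variations };
    drive(movetext, &mut counter);
    counter.moves
}
```

Calling `count_moves` with `skip_variations` set to `true` counts only mainline moves. With `false`, moves inside variations are counted as well. The driver itself is the same in both cases, which is the point of leaving such decisions to the visitor.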

Example

A visitor that counts the number of syntactically valid moves in the mainline of each game.

use std::io;
use pgn_reader::{Visitor, Skip, BufferedReader, SanPlus};

struct MoveCounter {
    moves: usize,
}

impl MoveCounter {
    fn new() -> MoveCounter {
        MoveCounter { moves: 0 }
    }
}

impl Visitor for MoveCounter {
    type Result = usize;

    fn begin_game(&mut self) {
        self.moves = 0;
    }

    fn san(&mut self, _san_plus: SanPlus) {
        self.moves += 1;
    }

    fn begin_variation(&mut self) -> Skip {
        Skip(true) // stay in the mainline
    }

    fn end_game(&mut self) -> Self::Result {
        self.moves
    }
}

fn main() -> io::Result<()> {
    let pgn = b"1. e4 e5 2. Nf3 (2. f4)
                { game paused due to bad weather }
                2... Nf6 *";

    let mut reader = BufferedReader::new_cursor(&pgn[..]);

    let mut counter = MoveCounter::new();
    let moves = reader.read_game(&mut counter)?;

    assert_eq!(moves, Some(4));
    Ok(())
}

Documentation

Read the documentation

State of the library

The API could be cleaner, and performance may have regressed slightly compared to the mmap-based approach from old versions (#12). This needs some attention. Until I get around to it, I am doing only minimal maintenance, following shakmaty as required.

Nonetheless, it is probably still one of the fastest PGN parsers around.

Benchmarks (v0.12.0)

Run with lichess_db_standard_rated_2018-10.pgn (24,784,600 games, 52,750 MB uncompressed) on an SSD (Samsung 850), Intel i7-6850K CPU @ 3.60 GHz:

| Benchmark | Time | Throughput |
| --- | --- | --- |
| examples/stats.rs | 111.9s | 471.4 MB/s |
| examples/validate.rs | 237.1s | 222.5 MB/s |
| examples/parallel_validate.rs | 148.6s | 355.0 MB/s |
| scoutfish make | 269.2s | 196.0 MB/s |
| grep -F "[Event " -c | 39.2s | 1345.7 MB/s |
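
For orientation, the grep row only scans for a literal string, which gives a rough count of games without parsing anything. A comparable scan can be sketched in Rust with one fixed-size buffer. This is an illustrative sketch, not part of pgn-reader or its examples: `count_occurrences` and its `chunk` parameter are hypothetical, and it counts byte-level matches rather than matching lines as `grep -c` does. The two counts agree when the needle occurs at most once per line.

```rust
use std::io::{self, Read};

/// Counts occurrences of `needle` in `input` using a single buffer that is
/// allocated once. `chunk` is the number of fresh bytes requested per read.
fn count_occurrences<R: Read>(mut input: R, needle: &[u8], chunk: usize) -> io::Result<usize> {
    assert!(!needle.is_empty() && chunk > 0);
    // Bytes kept from the previous read so that matches spanning two reads
    // are still seen. A full match never fits inside the carried tail, so
    // nothing is counted twice.
    let keep = needle.len() - 1;
    let mut buf = vec![0u8; keep + chunk];
    let mut filled = 0; // carried-over bytes at the start of `buf`
    let mut count = 0;
    loop {
        let n = input.read(&mut buf[filled..])?;
        if n == 0 {
            break; // end of input
        }
        let end = filled + n;
        count += buf[..end]
            .windows(needle.len())
            .filter(|w| *w == needle)
            .count();
        // Move the tail to the front for the next iteration.
        let carry = keep.min(end);
        buf.copy_within(end - carry..end, 0);
        filled = carry;
    }
    Ok(count)
}
```

Any `Read` implementation works as input, for example a `std::fs::File` or an in-memory byte slice.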

examples/stats.rs with compressed files:

| Compression | File size | Time | Throughput |
| --- | --- | --- | --- |
| none | 52,750 MB | 111.9s | 471.4 MB/s |
| bz2 | 6,226 MB | 1263.1s | 4.9 MB/s |
| xz | 6,989 MB | 495.9s | 14.1 MB/s |
| gz | 10,627 MB | 335.7s | 31.7 MB/s |
| lz4 | 16,428 MB | 180.0s | 91.3 MB/s |
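
The throughput figures appear to be file size divided by wall-clock time. This is a consistency check on the published numbers, not a statement from the author. Two rows worked out:

```latex
\[
\text{throughput} = \frac{\text{file size}}{\text{time}}, \qquad
\frac{52{,}750\ \text{MB}}{111.9\ \text{s}} \approx 471.4\ \text{MB/s}, \qquad
\frac{6{,}226\ \text{MB}}{1263.1\ \text{s}} \approx 4.9\ \text{MB/s}
\]
```

If that reading is right, the rates for compressed inputs are relative to the compressed file size. Assuming each compressed file holds the same 52,750 MB of PGN, the bz2 run would correspond to roughly 52,750 / 1263.1 ≈ 41.8 MB/s of uncompressed data.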

License

pgn-reader is licensed under the GPL-3.0 (or any later version at your option). See the COPYING file for the full license text.
