6 releases
| 0.2.5 | Sep 14, 2025 |
|---|---|
| 0.2.1 | Sep 13, 2025 |
| 0.1.2 | Sep 12, 2025 |
#655 in Text processing
Used in gf-core
290KB
2.5K
SLoC
pgf2json.rs
Overview
A Rust library for parsing Portable Grammar Format (PGF) files and converting them to JSON. This crate provides an API to load and interpret grammars compiled in Portable Grammar Format (PGF), which is the final output format from the Grammatical Framework (GF) compiler.
The library enables embedding GF grammars in Rust programs. The current implementation is sensitive to PGF versioning; stable for 1.0 and experimental support for the yanked (upcoming?) version 2.1 .
Documentation
Tread carefully, here be dragons!
There seems to be a minor format difference between PGF 1.0 and 2.1 that needs addressing (see more below), but the core parser architecture is solid.
Health Status
- ✅ All 11/11 tests passing (100% success rate)
- ✅ Synthetic PGF creation and JSON conversion working
- ✅ Error handling tests passing
- ✅ Parse sentence functionality working
- ✅ UTF-8 decoding issues resolved
- ✅ All PGF parsing functionality working correctly
Features
pgf2json contains a complete PGF binary parser, covering all of the PGF v1.0 format specifications. The libraries strength include:
- PGF Parsing: Read binary PGF files into Rust data structures
- JSON Conversion: Convert PGF grammars to structured JSON format
- Grammar Analysis: Extract functions, categories, and language-specific concrete syntaxes
- Sentence Parsing: Parse sentences using loaded grammars
- Error Handling: Comprehensive error reporting for invalid files and parsing failures
- Testing - Around 75% of the crates functionality has been tested.
Usage
Pretty simple, just load a .pgf files and convert it to .json.
use pgf2json::{read_pgf, pgf_to_json, parse, language, types};
// Load a PGF file
let pgf = read_pgf("./grammars/Food.pgf")?;
// Convert to JSON
let json = pgf_to_json(&pgf)?;
// Parse a sentence
let lang = language::read_language("FoodEng").unwrap();
let typ = types::start_cat(&pgf);
let trees = parse(&pgf, &lang, &typ, "this pizza is delicious")?;
Testing
The test suite includes parsing real PGF files and validating JSON output structure.
cargo test
String Encoding Handling
The parser supports both PGF 1.0 and PGF 2.1 formats with their different string encoding approaches:
- PGF 1.0 uses variable-length string encoding
- PGF 2.1 uses fixed-length string encoding
The current implementation includes robust string parsing that:
- Handles both UTF-8 and Latin-1 encoded strings
- Provides fallback mechanisms for binary data
- Gracefully handles parsing errors to extract maximum information
- Uses version-aware parsing by propagating the
is_pgf_2_1flag
All tests including test_real_pgf_parsing now pass, confirming correct parsing of strings across all supported formats.
Dependencies
~1.9–3MB
~60K SLoC