2 releases
| Version | Released |
|---|---|
| 0.1.1 | Jun 15, 2024 |
| 0.1.0 | Jun 13, 2024 |
Parser for Python's "marshal" serialization format
This is a Rust port of the marshalparser project, which is written in Python.
It provides both a command-line interface and a library interface for parsing
data in Python's internal "marshal" serialization format, functionality for
pretty-printing the resulting data structures, and some basic data manipulation,
for example, removing unused reference flags in order to make pyc files more
reproducible.
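To illustrate what a "reference flag" is: in CPython's marshal format, each object starts with a one-byte type code, and the high bit (0x80) of that byte marks the object as stored in the reference table. The sketch below shows only that bit manipulation. The function names are hypothetical and are not part of this crate's API, and the real cleanup also has to work out which references are actually unused before clearing anything.

```rust
/// High bit of a marshal type byte; set when the object is added to the
/// reference table (called FLAG_REF in CPython's marshal.c).
const FLAG_REF: u8 = 0x80;

/// Hypothetical helper: split a raw type byte into the plain type code
/// and a boolean telling whether the reference flag was set.
fn split_type_byte(byte: u8) -> (u8, bool) {
    let type_code = byte & !FLAG_REF;
    let has_ref_flag = (byte & FLAG_REF) != 0;
    (type_code, has_ref_flag)
}

/// Hypothetical helper: clear the reference flag and keep the type code.
/// Only safe for objects that no later reference points back to.
fn clear_ref_flag(byte: u8) -> u8 {
    byte & !FLAG_REF
}
```

Because identical source can end up with different flag bits depending on reference counts at compile time, clearing flags that nothing refers to is one way to make the resulting bytes more stable.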
The default feature set is intentionally minimal. Dependencies that are only
required for building the command-line interface can be enabled with the cli
feature. Pretty-printing of byte strings can be enabled with the fancy feature.
This project supports parsing "marshal" data produced by CPython versions between 3.8 and 3.13.
lib.rs:
Parser for the "marshal" binary de/serialization format used by CPython
This crate implements a parser and some utilities for reading files in the "marshal" de/serialization format used internally in CPython. The exact format is not stable and can change between minor versions of CPython.
This crate supports parsing "marshal" dumps and pyc files that were
written by CPython versions >= 3.6 and < 3.14.
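Since the format changes between minor versions, it can be useful to reject unsupported versions before attempting to parse. The following is a minimal sketch that assumes the version range stated in the sentence above. The function `is_supported` is a hypothetical helper and not something this crate is known to export.

```rust
/// Hypothetical helper: returns true if a CPython (major, minor) version
/// falls inside the range >= 3.6 and < 3.14.
/// Tuples compare lexicographically, so (3, 11) < (3, 14) holds.
fn is_supported(version: (u16, u16)) -> bool {
    version >= (3, 6) && version < (3, 14)
}
```

With this, `is_supported((3, 11))` is true, while `is_supported((3, 5))` and `is_supported((3, 14))` are both false.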
There is a high-level and a low-level API, depending on how much access to
the underlying data structures is needed. The low-level API also provides
more flexibility since it does not require files, but can operate on plain
bytes (Vec<u8>).
Reading a pyc file from disk:
use marshal_parser::{MarshalFile, Object};
let pyc = MarshalFile::from_pyc_path("mod.cpython-310.pyc").unwrap();
let object: Object = pyc.into_inner();
Reading a "marshal" dump (i.e. a file without a pyc header):
use marshal_parser::{MarshalFile, Object};
let dump = MarshalFile::from_dump_path("dump.marshal", (3, 11)).unwrap();
let object: Object = dump.into_inner();
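The difference between the two cases above is the pyc header that precedes the marshal payload. As a rough sketch that is independent of this crate: the header is 16 bytes since CPython 3.7 (PEP 552) and 12 bytes in CPython 3.6. The helpers `pyc_header_len` and `split_pyc` are hypothetical names used only for illustration, and the crate's own header handling may differ.

```rust
/// Hypothetical helper: length of the pyc header in bytes.
/// Assumes only the versions this crate supports (3.6 and later):
/// 16 bytes from CPython 3.7 on, 12 bytes for CPython 3.6.
fn pyc_header_len(version: (u16, u16)) -> usize {
    if version >= (3, 7) {
        16
    } else {
        12
    }
}

/// Hypothetical helper: split the bytes of a pyc file into
/// (header, marshal payload). Returns None if the input is
/// shorter than the expected header.
fn split_pyc(data: &[u8], version: (u16, u16)) -> Option<(&[u8], &[u8])> {
    let len = pyc_header_len(version);
    if data.len() < len {
        return None;
    }
    Some(data.split_at(len))
}
```

The payload half is what a plain "marshal" dump contains. Since a dump carries no header, the CPython version has to be supplied separately, as in the `(3, 11)` argument above.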