#tree-sitter #reverse-engineering #parser

tree-sitter-sleigh

Tree-sitter parser for the Ghidra SLEIGH language

1 unstable release

0.1.0 Jul 22, 2024

#1151 in Data structures

Apache-2.0 and LGPL-3.0-only

150KB
3.5K SLoC

Tree-sitter parser for the Ghidra SLEIGH language (see the SLEIGH documentation in the Ghidra project for background).

This is a raw parser: it doesn't semantically interpret the SLA language in any way, it only parses it accurately into a machine-usable structure. The intent behind this project is to form the frontend of an SLA-to-Rust transpiler.

Example

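// Assumes an enclosing function that returns a Result, so that `?` can propagate errors.
let language_path = std::path::PathBuf::from("../Processors/x86/data/languages/x86-64.sla");
let language_contents = std::fs::read_to_string(&language_path)?;
let parsed = tree_sitter_sleigh::parse(&language_contents)?;
// `{:#?}` pretty-prints the parsed structure, one field per line.
println!("{:#?}", parsed);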

This will take a few minutes: the parser is not particularly fast and these files are quite large, so this project is not really appropriate for repeatedly loading SLA specifications. You'll eventually get output like:

Sleigh {
    _open: (),
    version: Some(
        3,
    ),
    bigendian: false,
    align: 1,
    uniqbase: 1097856,
    maxdelay: None,
    uniqmask: None,
    numsections: None,
    _close: (),
    sourcefiles: SourceFiles {
        _start: (),
        source_files: [
            SourceFile {
                _start: (),
                name: "ia.sinc",
                index: 0,
                _end: (),
            },
            SourceFile {
                _start: (),
                name: "lockable.sinc",
                index: 1,
                _end: (),
            },

...and so on.
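
To illustrate what "machine-usable structure" can mean in practice, here is a minimal sketch that walks a value shaped like the output above and lists its source files. The `Sleigh`, `SourceFiles` and `SourceFile` types below are hypothetical local mirrors reconstructed from the printed field names; they are not the crate's real definitions, the field types are guesses, and the marker fields (`_open`, `_start`, and so on) are left out. The `source_file_names` and `sample` helpers are likewise invented for this example.

```rust
// Hypothetical mirror of the fields visible in the debug output above.
// These are NOT the crate's actual types; the field types are assumptions.
#[derive(Debug)]
struct SourceFile {
    name: String,
    index: u32,
}

#[derive(Debug)]
struct SourceFiles {
    source_files: Vec<SourceFile>,
}

#[allow(dead_code)]
#[derive(Debug)]
struct Sleigh {
    version: Option<u32>,
    bigendian: bool,
    align: u32,
    uniqbase: u64,
    sourcefiles: SourceFiles,
}

/// Return the source file names, ordered by their `index` field.
fn source_file_names(spec: &Sleigh) -> Vec<&str> {
    let mut files: Vec<&SourceFile> = spec.sourcefiles.source_files.iter().collect();
    files.sort_by_key(|f| f.index);
    files.into_iter().map(|f| f.name.as_str()).collect()
}

/// Build a value holding the same data as the excerpt printed above.
fn sample() -> Sleigh {
    Sleigh {
        version: Some(3),
        bigendian: false,
        align: 1,
        uniqbase: 1097856,
        sourcefiles: SourceFiles {
            source_files: vec![
                SourceFile { name: "ia.sinc".to_string(), index: 0 },
                SourceFile { name: "lockable.sinc".to_string(), index: 1 },
            ],
        },
    }
}

fn main() {
    let spec = sample();
    println!(
        "version {:?}, bigendian {}, align {}, uniqbase {}",
        spec.version, spec.bigendian, spec.align, spec.uniqbase
    );
    for name in source_file_names(&spec) {
        println!("{name}");
    }
}
```

With the real crate, the same kind of traversal would presumably start from the value returned by `tree_sitter_sleigh::parse` rather than from a hand-built sample.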

Dependencies

~11MB
~196K SLoC