#parser #incremental #api-bindings

tree-sitter-c2rust

Rust bindings to the Tree-sitter parsing library, with a pure Rust runtime via c2rust

6 releases

new 0.22.6 Jul 19, 2024
0.22.5 Apr 19, 2024
0.20.11-pre.1 Apr 13, 2024
0.20.10 Apr 9, 2023
0.20.9 Feb 1, 2023

#247 in Parser implementations

7,347 downloads per month
Used in 8 crates (5 directly)

MIT license

1.5MB
35K SLoC

Rust 24K SLoC // 0.0% comments
C 11K SLoC // 0.1% comments

Rust Tree-sitter

Rust bindings to the Tree-sitter parsing library.

Basic Usage

First, create a parser:

use tree_sitter::{InputEdit, Language, Parser, Point};

let mut parser = Parser::new();

Add the cc crate to your Cargo.toml under [build-dependencies]:

[build-dependencies]
cc="*"

Then, add a language as a dependency:

[dependencies]
tree-sitter = "0.22"
tree-sitter-rust = "0.21"

To use a language, assign it to the parser:

parser.set_language(&tree_sitter_rust::language()).expect("Error loading Rust grammar");

Now you can parse source code:

let source_code = "fn test() {}";
let mut tree = parser.parse(source_code, None).unwrap();
let root_node = tree.root_node();

assert_eq!(root_node.kind(), "source_file");
assert_eq!(root_node.start_position().column, 0);
assert_eq!(root_node.end_position().column, 12);

Editing

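Once you have a syntax tree, you can update it when your source code changes. First describe the change to the old tree with edit, then parse the new source. Passing the previously edited tree to parse lets the parser reuse the unchanged parts and makes parsing much faster: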

let new_source_code = "fn test(a: u32) {}";

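// Describe the insertion of "a: u32" (6 bytes) at byte offset 8.
// Nothing was removed, so the old end equals the start.
tree.edit(&InputEdit {
  start_byte: 8,
  old_end_byte: 8,
  new_end_byte: 14,
  start_position: Point::new(0, 8),
  old_end_position: Point::new(0, 8),
  new_end_position: Point::new(0, 14),
});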

let new_tree = parser.parse(new_source_code, Some(&tree));
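
Working out the six InputEdit fields by hand is error-prone, so it can help to derive them from the old text and the inserted string. The following is a minimal sketch that uses only the standard library. Pos, EditSpan, point_at, and insertion_edit are hypothetical names and not part of the crate; Pos and EditSpan merely mirror the fields of Point and InputEdit used above, and columns are assumed to be counted in bytes, as in the example above.

```rust
/// Hypothetical stand-in for `tree_sitter::Point`: a zero-based row and a
/// column counted in bytes from the start of that row.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Pos {
    row: usize,
    column: usize,
}

/// Hypothetical stand-in mirroring the fields of `tree_sitter::InputEdit`.
#[derive(Debug, PartialEq, Eq)]
struct EditSpan {
    start_byte: usize,
    old_end_byte: usize,
    new_end_byte: usize,
    start_position: Pos,
    old_end_position: Pos,
    new_end_position: Pos,
}

/// Row and byte column of offset `byte` within `text`.
fn point_at(text: &str, byte: usize) -> Pos {
    let before = &text.as_bytes()[..byte];
    // The row is the number of newlines that precede the offset.
    let row = before.iter().filter(|&&b| b == b'\n').count();
    // The column is measured from the byte after the last newline.
    let line_start = before
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    Pos { row, column: byte - line_start }
}

/// Describe inserting `inserted` into `old` at byte offset `at`.
/// Returns the edited text together with the edit description.
/// Panics if `at` is not on a character boundary of `old`.
fn insertion_edit(old: &str, at: usize, inserted: &str) -> (String, EditSpan) {
    let mut new = String::with_capacity(old.len() + inserted.len());
    new.push_str(&old[..at]);
    new.push_str(inserted);
    new.push_str(&old[at..]);

    let new_end = at + inserted.len();
    let edit = EditSpan {
        start_byte: at,
        old_end_byte: at, // a pure insertion removes nothing
        new_end_byte: new_end,
        start_position: point_at(old, at),
        old_end_position: point_at(old, at),
        new_end_position: point_at(&new, new_end),
    };
    (new, edit)
}
```

Calling insertion_edit("fn test() {}", 8, "a: u32") yields the new source "fn test(a: u32) {}" and the same byte offsets and positions as the hand-written edit above; the resulting values could then be copied into a real InputEdit.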

Text Input

The source code to parse can be provided as a string, a slice, a vector, or a function that returns a slice. The text can be encoded as either UTF-8 or UTF-16:

// Store some source code in an array of lines.
let lines = &[
    "pub fn foo() {",
    "  1",
    "}",
];

// Parse the source code using a custom callback. The callback is called
// with both a byte offset and a row/column offset.
let tree = parser.parse_with(&mut |_byte: usize, position: Point| -> &[u8] {
    let row = position.row as usize;
    let column = position.column as usize;
    if row < lines.len() {
        if column < lines[row].as_bytes().len() {
            &lines[row].as_bytes()[column..]
        } else {
            b"\n"
        }
    } else {
        &[]
    }
}, None).unwrap();

assert_eq!(
  tree.root_node().to_sexp(),
  "(source_file (function_item (visibility_modifier) name: (identifier) parameters: (parameters) body: (block (integer_literal))))"
);
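
To see what text such a callback actually supplies, the lookup can be exercised without a parser. The sketch below uses only the standard library: chunk_at repeats the row/column logic of the callback above, and collect_text is a hypothetical driver that assumes the consumer always asks for the next chunk at the position just past the bytes it last received, stopping at the first empty chunk. Neither function is part of the crate.

```rust
/// Same lookup as the callback above: return the rest of the requested
/// line, a newline once the column reaches the end of the line, or an
/// empty slice when the row is past the last line.
fn chunk_at<'a>(lines: &[&'a str], row: usize, column: usize) -> &'a [u8] {
    match lines.get(row) {
        Some(line) if column < line.len() => &line.as_bytes()[column..],
        Some(_) => b"\n",
        None => &[],
    }
}

/// Hypothetical driver that imitates a consumer of the callback: it keeps
/// requesting the chunk at the current position, appends it, and advances
/// the row/column past the received bytes until an empty chunk arrives.
fn collect_text(lines: &[&str]) -> Vec<u8> {
    let mut text = Vec::new();
    let (mut row, mut column) = (0usize, 0usize);
    loop {
        let chunk = chunk_at(lines, row, column);
        if chunk.is_empty() {
            return text;
        }
        for &byte in chunk {
            if byte == b'\n' {
                // A newline moves to the start of the next row.
                row += 1;
                column = 0;
            } else {
                column += 1;
            }
        }
        text.extend_from_slice(chunk);
    }
}
```

Under that assumption, the three lines from the example are seen as the single text "pub fn foo() {\n  1\n}\n": the lines are stored without terminators, and the callback supplies one newline after each of them, including the last.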

Dependencies

~3–17MB
~220K SLoC