#tree-sitter #incremental #parser #api-bindings

sys no-std tree-sitter-c2rust

Rust bindings to the Tree-sitter parsing library, with a pure Rust runtime via c2rust

11 unstable releases (3 breaking)

new 0.25.2 Feb 21, 2025
0.24.7 Feb 20, 2025
0.24.3 Oct 10, 2024
0.22.6 Jul 19, 2024
0.20.9 Feb 1, 2023

#715 in Parser implementations

13,687 downloads per month
Used in 9 crates (6 directly)

MIT license

1.5MB
37K SLoC

Rust 25K SLoC // 0.0% comments
C 12K SLoC // 0.1% comments

Rust Tree-sitter

Rust bindings to the Tree-sitter parsing library.

Basic Usage

First, create a parser:

use tree_sitter::{InputEdit, Language, Parser, Point};

let mut parser = Parser::new();

Add the cc crate to your Cargo.toml under [build-dependencies]:

[build-dependencies]
cc = "*"

Then, add a language as a dependency:

[dependencies]
tree-sitter = "0.24"
tree-sitter-rust = "0.23"

To use a language, assign it to the parser:

parser.set_language(&tree_sitter_rust::LANGUAGE.into()).expect("Error loading Rust grammar");

Now you can parse source code:

let source_code = "fn test() {}";
let mut tree = parser.parse(source_code, None).unwrap();
let root_node = tree.root_node();

assert_eq!(root_node.kind(), "source_file");
assert_eq!(root_node.start_position().column, 0);
assert_eq!(root_node.end_position().column, 12);

Editing

Once you have a syntax tree, you can update it when your source code changes. First record the change on the old tree with edit, then pass that edited tree to parse. Reusing the previously edited tree makes parse run much more quickly:

let new_source_code = "fn test(a: u32) {}";

tree.edit(&InputEdit {
  start_byte: 8,
  old_end_byte: 8,
  new_end_byte: 14,
  start_position: Point::new(0, 8),
  old_end_position: Point::new(0, 8),
  new_end_position: Point::new(0, 14),
});

let new_tree = parser.parse(new_source_code, Some(&tree));
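The byte offsets and row/column positions in the edit above were written by hand. As a minimal sketch of how they could be derived for a plain insertion, the following uses only the standard library. `Pos`, `pos_at`, and `insertion_offsets` are hypothetical names introduced for illustration; `Pos` is a stand-in for `Point` and is not part of the tree-sitter API.

```rust
/// Hypothetical stand-in for tree-sitter's `Point`:
/// a zero-based row and a zero-based byte column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Pos {
    row: usize,
    column: usize,
}

/// Row/column of byte offset `byte` in `text`, found by counting
/// the newlines that precede it.
fn pos_at(text: &str, byte: usize) -> Pos {
    let prefix = &text.as_bytes()[..byte];
    let row = prefix.iter().filter(|&&b| b == b'\n').count();
    // The current line starts just after the last newline, or at 0.
    let line_start = prefix
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    Pos { row, column: byte - line_start }
}

/// (start_byte, old_end_byte, new_end_byte) for inserting `inserted`
/// at byte offset `at`. Nothing is removed, so the old end equals the start.
fn insertion_offsets(at: usize, inserted: &str) -> (usize, usize, usize) {
    (at, at, at + inserted.len())
}

fn main() {
    let old = "fn test() {}";
    let inserted = "a: u32";
    let at = old.find(')').expect("closing parenthesis");

    let mut new = String::from(old);
    new.insert_str(at, inserted);

    let (start, old_end, new_end) = insertion_offsets(at, inserted);
    println!("bytes: {start} {old_end} {new_end}");
    println!("start position:   {:?}", pos_at(old, start));
    println!("new end position: {:?}", pos_at(&new, new_end));
}
```

For the single-line example above this yields the same values as the hand-written edit: bytes 8, 8, and 14, all on row 0.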

Text Input

The source code to parse can be provided as a string, a slice, a vector, or a function that returns a slice. The text can be encoded as either UTF-8 or UTF-16:

// Store some source code in an array of lines.
let lines = &[
    "pub fn foo() {",
    "  1",
    "}",
];

// Parse the source code using a custom callback. The callback is called
// with both a byte offset and a row/column offset.
let tree = parser.parse_with(&mut |_byte: usize, position: Point| -> &[u8] {
    let row = position.row as usize;
    let column = position.column as usize;
    if row < lines.len() {
        if column < lines[row].as_bytes().len() {
            &lines[row].as_bytes()[column..]
        } else {
            b"\n"
        }
    } else {
        &[]
    }
}, None).unwrap();

assert_eq!(
  tree.root_node().to_sexp(),
  "(source_file (function_item (visibility_modifier) (identifier) (parameters) (block (number_literal))))"
);
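To see what the callback above hands back at each position, its logic can be exercised without a parser. This is a minimal sketch using only the standard library: `chunk_at` repeats the callback's branching, and `read_all` is a hypothetical, simplified reader that advances the row/column by the bytes it receives. It does not claim to reproduce how Tree-sitter itself schedules reads.

```rust
/// Same branching as the callback above: the rest of the line, a newline
/// once the column reaches the end of the line, or an empty slice past
/// the last line (which signals end of input).
fn chunk_at<'a>(lines: &[&'a str], row: usize, column: usize) -> &'a [u8] {
    if row >= lines.len() {
        return &[];
    }
    let line: &'a str = lines[row];
    if column < line.len() {
        &line.as_bytes()[column..]
    } else {
        b"\n"
    }
}

/// Hypothetical reader: requests chunks until an empty one arrives,
/// tracking the row and column as it consumes bytes.
fn read_all(lines: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    let (mut row, mut column) = (0usize, 0usize);
    loop {
        let chunk = chunk_at(lines, row, column);
        if chunk.is_empty() {
            break;
        }
        for &b in chunk {
            if b == b'\n' {
                row += 1;
                column = 0;
            } else {
                column += 1;
            }
        }
        out.extend_from_slice(chunk);
    }
    out
}

fn main() {
    let lines = ["pub fn foo() {", "  1", "}"];
    let text = read_all(&lines);
    print!("{}", String::from_utf8_lossy(&text));
}
```

The reassembled text has one newline after every line, including the last, because the callback returns a newline whenever the column is at or past the end of a line.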

Features

  • std - This feature is enabled by default and allows tree-sitter to use the standard library.
    • Error types implement the std::error::Error trait.
    • regex performance optimizations are enabled.
    • The DOT graph methods are enabled.
  • wasm - This feature allows tree-sitter to be built for Wasm targets using the wasmtime-c-api crate.
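Because std is a default feature, building without the standard library means opting out of default features in Cargo.toml. The fragment below is a sketch that reuses the version from the dependency example earlier in this README; whether a given target also needs other settings is not covered here.

```toml
[dependencies]
# Assumption: disabling default features turns off `std`,
# and with it the std-only items listed above.
tree-sitter = { version = "0.24", default-features = false }
```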

Dependencies

~4–17MB
~245K SLoC