#parser #html-parser

parse-html

A simple Rust project to parse HTML

5 releases (3 breaking)

Uses the Rust 2024 edition

0.4.1 Apr 1, 2025
0.4.0 Apr 1, 2025
0.3.0 Apr 1, 2025
0.2.0 Apr 1, 2025
0.1.0 Mar 31, 2025

#20 in #html-parser

320 downloads per month

MIT license

50KB
1.5K SLoC

Features

  • Tokenizes HTML input into structured tokens.
  • Parses tokens into a DOM tree.
  • Supports querying elements by ID, class, or tag name.
  • Supports chained queries: for example, you can query elements by ID and then filter the result by class or tag name.
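
To make the pipeline above concrete, here is a minimal, self-contained sketch of the same three stages: tokenize, build a tree, then query and chain. It is not the crate's implementation or API. The names `Token`, `Node`, `tokenize`, `parse`, `attr`, and `get_by_tag` are hypothetical and chosen only for illustration. The sketch assumes well-formed input with double-quoted attribute values that contain no spaces, and it ignores comments, self-closing tags, and entities.

```rust
// Illustrative sketch only: hypothetical types, NOT the parse-html API.

/// A lexical unit of the (simplified) HTML input.
#[derive(Debug, PartialEq)]
enum Token {
    Open { tag: String, attrs: Vec<(String, String)> },
    Close(String),
    Text(String),
}

/// A node of the tree built from the tokens.
#[derive(Debug, Default)]
struct Node {
    tag: String,
    attrs: Vec<(String, String)>,
    text: String,
    children: Vec<Node>,
}

/// Stage 1: split the input into open tags, close tags, and text runs.
fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut rest = input;
    while !rest.is_empty() {
        if let Some(after) = rest.strip_prefix('<') {
            // Stop on an unterminated tag instead of guessing.
            let end = match after.find('>') {
                Some(i) => i,
                None => break,
            };
            let inner = &after[..end];
            rest = &after[end + 1..];
            if let Some(name) = inner.strip_prefix('/') {
                tokens.push(Token::Close(name.trim().to_string()));
            } else {
                let mut parts = inner.split_whitespace();
                let tag = parts.next().unwrap_or("").to_string();
                // Assumption: attributes look like key="value" with no spaces.
                let attrs = parts
                    .filter_map(|p| p.split_once('='))
                    .map(|(k, v)| (k.to_string(), v.trim_matches('"').to_string()))
                    .collect();
                tokens.push(Token::Open { tag, attrs });
            }
        } else {
            let end = rest.find('<').unwrap_or(rest.len());
            let text = &rest[..end];
            if !text.trim().is_empty() {
                tokens.push(Token::Text(text.to_string()));
            }
            rest = &rest[end..];
        }
    }
    tokens
}

/// Stage 2: build a tree with a stack of currently open elements.
fn parse(tokens: Vec<Token>) -> Node {
    let mut stack = vec![Node { tag: "#root".to_string(), ..Node::default() }];
    for token in tokens {
        match token {
            Token::Open { tag, attrs } => stack.push(Node { tag, attrs, ..Node::default() }),
            Token::Text(t) => stack.last_mut().unwrap().text.push_str(&t),
            Token::Close(_) => {
                // Never pop the synthetic root.
                if stack.len() > 1 {
                    let done = stack.pop().unwrap();
                    stack.last_mut().unwrap().children.push(done);
                }
            }
        }
    }
    // Attach any elements left unclosed to their parents.
    while stack.len() > 1 {
        let done = stack.pop().unwrap();
        stack.last_mut().unwrap().children.push(done);
    }
    stack.pop().unwrap()
}

/// Stage 3: queries. Each returns nodes, so calls can be chained.
impl Node {
    fn attr(&self, name: &str) -> Option<&str> {
        self.attrs.iter().find(|(k, _)| k == name).map(|(_, v)| v.as_str())
    }

    /// Depth-first search for the first node with the given id.
    fn get_by_id(&self, id: &str) -> Option<&Node> {
        if self.attr("id") == Some(id) {
            return Some(self);
        }
        self.children.iter().find_map(|c| c.get_by_id(id))
    }

    /// All descendants with the given tag name, in document order.
    fn get_by_tag<'a>(&'a self, tag: &str) -> Vec<&'a Node> {
        let mut found = Vec::new();
        for child in &self.children {
            if child.tag == tag {
                found.push(child);
            }
            found.extend(child.get_by_tag(tag));
        }
        found
    }
}
```

With these hypothetical helpers, a chained query reads as `parse(tokenize(html)).get_by_id("main")` followed by `.get_by_tag("p")` on the returned node: first narrow by ID, then filter within that subtree by tag name.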

Usage

use parse_html::{dom::dom_tree::DomTree, lexer::tokenizer::Lexer, parser::ast::Parser};

fn main() {
    let html = r#"<div id="main"><p>Hello</p></div>"#;

    match DomTree::new::<Lexer, Parser>(html) {
        Ok(dom) => {
            if let Some(container) = dom.get_by_id("main") {
                println!("Node id='main' {:?}", container);
            } else {
                println!("Id not found");
            }
        }
        Err(e) => println!("Parsing error: {:?}", e),
    }
}

License

MIT

No runtime deps