2 unstable releases
0.7.0 | Feb 4, 2025 |
---|---|
0.6.0 | Jan 11, 2025 |
#1139 in Parser implementations
134 downloads per month
1MB
840 lines
lithtml
A lightweight and fast HTML/XHTML parser for Rust, designed to handle both full HTML documents and fragments. This parser uses Pest for parsing and is forked from html-parser.
Features
- Parse html & xhtml (not xml processing instructions)
- Parse html-documents
- Parse html-fragments
- Parse empty documents
- Parse with the same api for both documents and fragments
- Parse custom, non-standard elements: <cat/>, <Cat/> and <C4-t/>
- Removes comments
- Removes dangling elements
- Iterate over all nodes in the dom tree
- Return structured json or html
- Create a dom manually
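To make the "iterate over all nodes" feature concrete, here is a minimal, self-contained sketch of a depth-first walk over a DOM-like tree. The `SketchNode` type and the `walk`, `element_names` and `sample_tree` helpers are hypothetical stand-ins written for this illustration. They are not lithtml's own types or API, which may be shaped differently.

```rust
// Hypothetical stand-in type for illustration only; NOT lithtml's own `Node`.
#[derive(Debug, PartialEq)]
enum SketchNode {
    Text(String),
    Comment(String),
    Element { name: String, children: Vec<SketchNode> },
}

/// Push `node` and all of its descendants into `out`, depth-first,
/// in document order.
fn walk<'a>(node: &'a SketchNode, out: &mut Vec<&'a SketchNode>) {
    out.push(node);
    if let SketchNode::Element { children, .. } = node {
        for child in children {
            walk(child, out);
        }
    }
}

/// Collect the names of every element in the tree, in document order.
/// Text and comment nodes are skipped.
fn element_names(root: &SketchNode) -> Vec<String> {
    let mut nodes = Vec::new();
    walk(root, &mut nodes);
    nodes
        .into_iter()
        .filter_map(|n| match n {
            SketchNode::Element { name, .. } => Some(name.clone()),
            _ => None,
        })
        .collect()
}

/// Build a small sample tree equivalent to:
/// <div><h1>Hello</h1><!-- note --><p>World</p></div>
fn sample_tree() -> SketchNode {
    SketchNode::Element {
        name: "div".to_string(),
        children: vec![
            SketchNode::Element {
                name: "h1".to_string(),
                children: vec![SketchNode::Text("Hello".to_string())],
            },
            SketchNode::Comment("note".to_string()),
            SketchNode::Element {
                name: "p".to_string(),
                children: vec![SketchNode::Text("World".to_string())],
            },
        ],
    }
}
```

Calling `element_names(&sample_tree())` visits the `div` first, then each child subtree in turn. The same walk could be used to drop comment nodes or to search for an element by name.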
Examples
Parse html document and print as json & formatted dom
use lithtml::Dom;

fn main() {
    let html = r#"
        <!doctype html>
        <html lang="en">
            <head>
                <meta charset="utf-8">
                <title>Html parser</title>
            </head>
            <body>
                <h1 id="a" class="b c">Hello world</h1>
                </h1> <!-- comments & dangling elements are ignored -->
            </body>
        </html>"#;

    let dom = Dom::parse(html).unwrap();
    println!("{}", dom.to_json_pretty().unwrap());
    println!("{}", dom);
}
Parse html fragment and print as json & formatted fragment
use lithtml::Dom;

fn main() {
    let html = "<div id=cat />";
    let dom = Dom::parse(html).unwrap();
    println!("{}", dom.to_json_pretty().unwrap());
    println!("{}", dom);
}
Create a dom manually
use lithtml::{Dom, Node, Result};

fn main() -> Result<()> {
    let mut dom = Dom::new();
    dom.children.push(Node::new_comment("Welcome to the test"));
    dom.children.push(Node::parse_json(
        r#"{
            "name": "div",
            "variant": "normal",
            "children": [
                {
                    "name": "h1",
                    "variant": "normal",
                    "children": [
                        "Tjena världen!"
                    ]
                },
                {
                    "name": "p",
                    "variant": "normal",
                    "children": [
                        "Tänkte bara informera om att Sverige är bättre än Finland i ishockey."
                    ]
                }
            ]
        }"#
    )?);
    dom.children.append(&mut Node::parse(
        r#"<div>Testing</div><p>Multiple elements from node</p>"#,
    )?);
    println!("{}", dom);
    Ok(())
}
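The example above prints the hand-built dom with `println!("{}", dom)`, which relies on a `Display` implementation. As a rough illustration of how a manually assembled tree can be rendered back to HTML that way, here is a standalone sketch. The `Piece` type, its constructors and the `escape_text` helper are hypothetical names invented for this sketch. They do not reflect how lithtml implements its own formatting.

```rust
use std::fmt;

// Hypothetical minimal tree type for illustration; NOT lithtml's `Node` or `Dom`.
#[derive(Debug)]
enum Piece {
    Text(String),
    Element { name: String, children: Vec<Piece> },
}

impl Piece {
    /// Convenience constructor for an element with the given children.
    fn element(name: &str, children: Vec<Piece>) -> Self {
        Piece::Element { name: name.to_string(), children }
    }

    /// Convenience constructor for a text node.
    fn text(content: &str) -> Self {
        Piece::Text(content.to_string())
    }
}

/// Escape the characters that would otherwise be read as markup.
/// `&` is replaced first so the entities added afterwards are not re-escaped.
fn escape_text(raw: &str) -> String {
    raw.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

impl fmt::Display for Piece {
    /// Render the tree as compact HTML: an opening tag, every child
    /// rendered recursively, then the closing tag.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Piece::Text(content) => write!(f, "{}", escape_text(content)),
            Piece::Element { name, children } => {
                write!(f, "<{}>", name)?;
                for child in children {
                    write!(f, "{}", child)?;
                }
                write!(f, "</{}>", name)
            }
        }
    }
}
```

With this in place, `Piece::element("div", vec![Piece::text("Testing")]).to_string()` yields the markup as a `String`, and the same value can be passed directly to `println!("{}", ...)`. Attributes and pretty-printing are left out to keep the sketch short.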
Dependencies
~2.4–3.5MB
~71K SLoC