10 releases
0.0.11-alpha.1 | Jun 14, 2023 |
---|---|
0.0.10 | |
0.0.9 | May 16, 2023 |
#23 in #crawler
120 downloads per month
10KB
121 lines
jsdom
A fast JavaScript DOM parser for Rust, built for web scraping.
cargo add jsdom
use std::collections::HashSet;
use jsdom::extract::extract_links;
const SCRIPT: &str = r###"
var ele = document.createElement('a');
ele.href = 'https://a11ywatch.com';
"###;
#[test]
fn parse_links() {
    // TODO: build a tree with the elements created from the nodes
    let links: HashSet<String> = extract_links(SCRIPT);
    assert!(links.contains("https://a11ywatch.com"));
}
Features
This package will roll out the features that matter most for web scraping first.
hashbrown: Enable the hashbrown crate.
tokio: Enable tokio streaming utils.
Stage 0.1
The intro stage can handle elements created in statements and expressions.
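To make the Stage 0.1 scope concrete, the sketch below shows the kind of input it describes: an `href` set through a statement (`ele.href = ...`) or through an expression chained onto `document.createElement(...)`. It uses only the standard library and is not jsdom's implementation. `extract_href_literals` is a hypothetical helper that scans for string literals assigned to `.href`; it does not parse JavaScript.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of jsdom): collect every string literal
/// that is directly assigned to an `.href` property in a script.
fn extract_href_literals(script: &str) -> HashSet<String> {
    let mut links = HashSet::new();
    let mut rest = script;

    while let Some(pos) = rest.find(".href") {
        // Continue scanning from just after this `.href` occurrence.
        rest = &rest[pos + ".href".len()..];

        // Accept a plain assignment (`=`) but skip comparisons (`==`, `===`).
        let value = match rest.trim_start().strip_prefix('=') {
            Some(v) if !v.starts_with('=') => v.trim_start(),
            _ => continue,
        };

        // The assigned value must start with a single or double quote.
        let quote = match value.chars().next() {
            Some(q) if q == '\'' || q == '"' => q,
            _ => continue,
        };

        // Take everything up to the matching closing quote.
        let body = &value[1..];
        if let Some(end) = body.find(quote) {
            links.insert(body[..end].to_string());
        }
    }

    links
}
```

Because this is a plain text scan, it ignores escaped quotes, template literals, and values built at runtime. Handling those is the reason to use a real parser.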
Dependencies
~1.8–8MB
~64K SLoC