#html #css #selector #scraping #crawler

nipper-trunk

HTML manipulation with CSS selectors

1 unstable release

0.1.9 Mar 6, 2021

#232 in Web programming

44 downloads per month
Used in trunk

MIT/Apache

225KB
2K SLoC

Nipper

A crate for manipulating HTML with Rust.

NOTE WELL: this is a temporary fork & release of the upstream nipper crate, as the Trunk project was blocked on a few needed features. These changes are intended to be merged upstream ASAP.

Nipper is based on the HTML crate html5ever and the CSS selector crate selectors. Its jQuery-like syntax lets you query an HTML document quickly, and it can modify the document as well as query it.


Example

Extract story titles and links from a saved Hacker News page.

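use nipper::Document;

fn main() {
    // Load a saved copy of the Hacker News front page at compile time.
    let html = include_str!("../test-pages/hacker_news.html");
    let document = Document::from(html);

    // Each story is a table row with the class `athing`.
    document.select("tr.athing").iter().for_each(|athing| {
        let title = athing.select(".title a");
        let href = athing.select(".storylink");
        println!("{}", title.text());
        // Skip the link instead of panicking when a row has no `.storylink`.
        if let Some(url) = href.attr("href") {
            println!("{}", url);
        }
        println!();
    });
}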

Readability: see examples/readability.rs.

Related projects

License

Licensed under either of the MIT license or the Apache License, Version 2.0, at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Dependencies

~2.5MB
~58K SLoC