#soup #html5ever #html #querying #layer #python #top

soup-kuchiki

Inspired by the python library BeautifulSoup, this is a layer on top of html5ever that adds a different API for querying and manipulating HTML

1 unstable release

Uses old Rust 2015

0.5.0 Oct 26, 2022

#4 in #soup


Used in 3 crates (via fetcher-core)

GPL-3.0 license

50KB
867 lines

Soup

Inspired by the python library BeautifulSoup, this is a layer on top of html5ever that adds a different API for querying & manipulating HTML

This is a re-upload of the original soup crate that has been ported to kuchiki.

Documentation (latest release)

Documentation (master)

Installation

To use the crate, add the following to your Cargo.toml:

[dependencies]
soup = "0.5"

Usage

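// src/main.rs
extern crate reqwest;
extern crate soup;

use std::error::Error;

use soup::prelude::*;

fn main() -> Result<(), Box<dyn Error>> {
    // Blocking request; needs reqwest's `blocking` feature on reqwest 0.10+
    // (on reqwest 0.9, call `reqwest::get` directly instead).
    let response = reqwest::blocking::get("https://google.com")?;
    // Parsing reads the response body and can fail with an I/O error.
    let soup = Soup::from_reader(response)?;
    // Text of the first <p class="hidden">, or None if nothing matches.
    let some_text = soup.tag("p")
        .attr("class", "hidden")
        .find()
        .map(|p| p.text());
    println!("{:?}", some_text);
    Ok(())
}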

Dependencies

~6–12MB
~146K SLoC