#readability #port #content #extractor #fork #scrape #product

readability-fork

Temporary fork of 'readability' crate with updated dependencies

3 releases

0.2.2 Jan 6, 2021
0.2.1 May 25, 2020
0.2.0 May 25, 2020

#24 in #scrape

31 downloads per month

MIT license

26KB
651 lines

readability-rs


readability-rs is a library for extracting the primary readable content of a webpage. It is a Rust port of arc90's readability project, inspired by kingwkb/readability.

How to use

  • Add readability to the dependencies in Cargo.toml:
[dependencies]
readability = "^0"
  • Then use it as shown below:

use readability::extractor;

fn main() {
  match extractor::scrape("https://spincoaster.com/chromeo-juice") {
      Ok(product) => {
          println!("------- html ------");
          println!("{}", product.content);
          println!("---- plain text ---");
          println!("{}", product.text);
      },
      Err(_) => println!("error occurred"),
  }
}
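The dependency line above names the original readability crate. To build the same example against this fork, one option is Cargo's dependency renaming. The fragment below is a minimal sketch, assuming the fork is fetched from the registry under the name readability-fork and that the 0.2 series is wanted. The `package` key is a standard Cargo feature; the version requirement is an assumption based on the releases listed for this fork.

```toml
[dependencies]
# Assumption: fetch the fork (published as `readability-fork`) but expose it
# to the code under the name `readability`, so `use readability::extractor;`
# keeps working unchanged.
readability = { package = "readability-fork", version = "0.2" }
```

With this renaming in place, the usage example needs no source changes, since Cargo makes the crate available under the dependency key rather than the published package name.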

Demo

Visit demo page.

License

MIT

Dependencies

~8–21MB
~303K SLoC