#web-scraping #parser #file #loader #site

scr

The simplest site parser and file loader

9 releases (3 stable)

1.0.2 Aug 21, 2023
1.0.1 Aug 20, 2023
0.2.3 Aug 16, 2023
0.1.1 Aug 15, 2023

#47 in #site

MIT license

12KB
102 lines

scr

GitHub: 1kawdalg/scr

"Simplicity is prerequisite for reliability" — Edsger Dijkstra

What is "scr"?

scr is a simplified fork of two crates working together: reqwest = {version = "0.11", features = ["blocking"]} and scraper = "0.17.1". It also relies on the standard library's pub struct std::path::Path, pub struct std::fs::File, and pub fn std::fs::write.

"How do I use the latest stable version of scr in my app?"

# Cargo.toml
[dependencies]
scr = "1.0.2"

Examples

  • parse a site

use scr::Scraper;

fn main() {
    // Load the page; the URL is passed without a scheme.
    let scraper = Scraper::new("scrapeme.live/shop/").unwrap();
    // Look up an element by CSS selector.
    let element = scraper.get_el("main#main>ul>li.product>a>h2").unwrap();

    assert_eq!(element.inner_html(), "Bulbasaur");
}

  • parse a fragment of a site

use scr::Scraper;

fn main() {
    let scraper = Scraper::new("scrapeme.live/shop/").unwrap();
    let fragment = scraper.get_text_once("main#main>ul>li.product>a").unwrap();
    let new_scraper = Scraper::from_fragment(fragment.as_str()).unwrap();
    let element = new_scraper.get_el("a").unwrap();

    assert_eq!(element.inner_html(), "Bulbasaur")
}

  • download a file

use scr::FileLoader;

fn main() {
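    // First argument: the URL of the file; second: the local destination path.
    let file_loader = FileLoader::new(
        "scrapeme.live/wp-content/uploads/2018/08/011.png",
        "./data/some_png.png"
    ).unwrap();

    // The `file` field exposes the destination path, so its file name can be checked.
    assert_eq!(
        file_loader.file
            .file_name().unwrap()
            .to_str().unwrap(),
        "some_png.png"
    );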
}
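
For readers curious about the standard-library side mentioned above (std::path::Path and std::fs::write), here is a minimal sketch of how downloaded bytes could be saved to a path. It is not the crate's actual implementation: the helper name save_bytes, the temporary directory, and the placeholder bytes are all assumptions made for illustration, and no network request is made.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of scr): writes `bytes` to `dest`,
/// creating any missing parent directories first, and returns the path.
fn save_bytes(dest: &Path, bytes: &[u8]) -> io::Result<PathBuf> {
    // Create parent directories (such as "./data") if they do not exist yet.
    if let Some(parent) = dest.parent() {
        if !parent.as_os_str().is_empty() {
            fs::create_dir_all(parent)?;
        }
    }
    // std::fs::write creates the file, or replaces its contents if it exists.
    fs::write(dest, bytes)?;
    Ok(dest.to_path_buf())
}

fn main() -> io::Result<()> {
    // Placeholder bytes standing in for a downloaded response body.
    let dir = std::env::temp_dir().join("scr_sketch");
    let dest = dir.join("some_png.png");
    let saved = save_bytes(&dest, b"not a real png")?;

    // Same check as in the example above: the file name of the saved path.
    let name = saved.file_name().and_then(|n| n.to_str());
    assert_eq!(name, Some("some_png.png"));

    // Clean up the temporary directory created for this sketch.
    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

Creating the parent directory before writing avoids a "not found" error when a destination such as ./data/some_png.png points into a directory that does not exist yet; whether scr itself does this is not stated in this README.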

The crate was developed by:

  • version 1.0

onekg;
reloginn;
Black Soul

Dependencies

~7–19MB
~269K SLoC