2 releases

0.1.1 Apr 30, 2025
0.1.0 Apr 22, 2025

#1724 in Parser implementations

Download history 100/week @ 2025-04-22 136/week @ 2025-04-29 16/week @ 2025-05-06

252 downloads per month

MIT license

140KB
3K SLoC

Seedframe Webscraper

A web scraper integration for Seedframe.

This crate provides the WebScraper struct, which implements the seedframe::loader::Loader trait. It fetches HTML content from a URL at regular intervals (or once, if no interval is specified) and publishes the results to subscribers. The content can optionally be filtered using a CSS selector. Intervals are given in seconds. The interval and selector fields are optional.

WebScraper accepts the following configuration parameters, passed as JSON to the config attribute of the loader proc-macro:

- url: String - URL of the page to load content from
- interval: optional u64 - interval, in seconds, at which content is fetched from the page; the fetch happens only once if the value is None
- selector: optional String - CSS selector used to filter the content

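// Fetch https://example.com every 5 seconds, keeping only the content
// matched by the CSS selector "div.content".
#[loader(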
    external = "WebScraper",
    config = r#"{
        "url": "https://example.com",
        "interval": 5,
        "selector": "div.content"
    }"#
)]
struct OurLoader;
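
To make the "once versus repeated" rule concrete, here is a minimal sketch of how the three documented parameters could map onto a fetch schedule. The ScrapeConfig and Schedule types and the schedule method are hypothetical names invented for this illustration. They are not part of seedframe or this crate, whose internals may differ. The sketch uses only the standard library and performs no network requests.

```rust
use std::time::Duration;

// Hypothetical mirror of the documented config fields.
// This is NOT the crate's actual type, only an illustration.
#[derive(Debug, Clone, PartialEq)]
struct ScrapeConfig {
    url: String,              // page to load content from
    interval: Option<u64>,    // seconds between fetches; None means fetch once
    selector: Option<String>, // optional CSS selector to filter content
}

// Hypothetical description of when fetching happens.
#[derive(Debug, PartialEq)]
enum Schedule {
    Once,
    Every(Duration),
}

impl ScrapeConfig {
    // An absent interval means a single fetch; otherwise the value
    // is interpreted as a number of seconds between fetches.
    fn schedule(&self) -> Schedule {
        match self.interval {
            Some(secs) => Schedule::Every(Duration::from_secs(secs)),
            None => Schedule::Once,
        }
    }
}

fn main() {
    // Same values as the loader example above.
    let periodic = ScrapeConfig {
        url: "https://example.com".to_string(),
        interval: Some(5),
        selector: Some("div.content".to_string()),
    };
    // Only the required field is set: fetch once, no filtering.
    let one_shot = ScrapeConfig {
        url: "https://example.com".to_string(),
        interval: None,
        selector: None,
    };

    for cfg in [&periodic, &one_shot] {
        println!("{} {:?} -> {:?}", cfg.url, cfg.selector, cfg.schedule());
    }
}
```

Modelling the optional fields as Option values keeps the "fetch once" case explicit instead of relying on a sentinel such as 0.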

Dependencies

~13–25MB
~344K SLoC