1 unstable release
0.0.1 | May 30, 2024 |
---|
#13 in #stories
455KB
8K
SLoC
progscrape
progscrape.com is a scraper for Hacker News, Reddit, Lobste.rs, and Slashdot. It contains a naive ranking/tagging engine that tries to keep a good mix of interesting stories on the front page.
Rust
The Rust code is divided into three projects:
Documentation for each sub-project will be available at some point.
Running
To initialize the server index:
SERVER_LOG="debug,tantivy=info" cargo run -- initialize --persist-path target/index --root=.
To load from a set of backup scrapes:
SERVER_LOG="debug,tantivy=info" cargo run -- initialize --persist-path target/index --root=. backup/????-??.json
To run the server behind a CloudFlare Access tunnel:
SERVER_LOG="debug,tantivy=info" cargo run -- serve --auth-header 'cf-access-authenticated-user-email'
To run the server completely standlone on localhost
:
SERVER_LOG="debug,tantivy=info" cargo run -- serve --fixed-auth-value 'username@example.com'
Historical
The app was previously designed to run on Google's AppEngine, at a low-enough load to stay in the free tier. There was a fair bit of caching and we avoid interactivity where possible to make this possible. The last Python version currently runs on Python 2.7, only because Python 3.x was a big undertaking on AppEngine.
Android
There's also an open-source Android app.
Dependencies
~69MB
~1M SLoC