6 releases (breaking)
Version | Released |
---|---|
0.6.0 | Aug 10, 2021 |
0.5.0 | Apr 12, 2021 |
0.4.0 | Mar 9, 2021 |
0.3.0 | Jan 15, 2021 |
0.1.0 | Nov 12, 2020 |
#5 in #text-content
63 downloads per month
480KB
1K SLoC
Boilerpipe
This is a Rust port of the Golang port of the excellent Java library boilerpipe,
which strips boilerplate from HTML documents and extracts their text content.
This library implements only the Article Extractor and returns text content only (no images, links, etc.).
Dependencies
~8–14MB
~181K SLoC