5 releases

Version | Released |
---|---|
0.2.0 | Dec 15, 2023 |
0.1.0 | Apr 3, 2023 |
0.1.0-alpha.3 | Feb 20, 2023 |
0.1.0-alpha.2 | Feb 15, 2023 |
0.1.0-alpha.1 | Feb 14, 2023 |
#494 in Text processing
22 downloads per month
180KB, 3.5K SLoC
Calculate Wikipedia prose size
This crate is a rough port of the Wikipedia Prosesize script that allows for counting the bytes of prose on a page rather than the wikitext markup or generated HTML.
You will most likely fetch `ImmutableWikicode` using the `parsoid` crate.
The response from `prosesize()` provides the text-only prose size, the word count, and the text-only references size. Enabling the optional `serde-1` feature makes the size struct serializable and deserializable.
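To make the three reported values concrete, here is a minimal, standard-library-only sketch of how they could be computed once the prose and reference text have already been extracted from a page. It is not this crate's implementation or API: `SizeSketch`, its field names, and `measure` are hypothetical names chosen for illustration, and the real crate works on parsed page content rather than plain strings.

```rust
/// Hypothetical stand-in for the crate's size struct.
/// The name and fields are assumptions, not the crate's real API.
#[derive(Debug, PartialEq, Eq)]
struct SizeSketch {
    /// Size of the text-only prose, in bytes.
    prose_bytes: usize,
    /// Number of whitespace-separated words in the prose.
    word_count: usize,
    /// Size of the text-only references, in bytes.
    references_bytes: usize,
}

/// Measure text that has already been stripped of markup.
/// Assumption: "size" means UTF-8 byte length, so non-ASCII
/// characters count for more than one byte.
fn measure(prose: &str, references: &str) -> SizeSketch {
    SizeSketch {
        prose_bytes: prose.len(),
        word_count: prose.split_whitespace().count(),
        references_bytes: references.len(),
    }
}

fn main() {
    // "ü" takes two bytes in UTF-8, so the byte count exceeds the character count.
    let prose = "Zürich is a city in Switzerland.";
    let references = "Smith 2020, p. 4.";
    let size = measure(prose, references);
    println!("{:?}", size);
}
```

Counting bytes instead of characters is a deliberate choice in this sketch, since the description above speaks of "bytes of prose". Splitting on whitespace is only a rough approximation of a word count.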
Contributing
`wikipedia_prosesize` is part of the mwbot-rs project. We're always looking for new contributors. Please reach out if you're interested!
Dependencies
~5–12MB
~164K SLoC