7 releases

0.3.0-rc.2 Jun 13, 2024
0.3.0-rc.1 Jun 6, 2024
0.2.0 Dec 15, 2023
0.1.0 Apr 3, 2023
0.1.0-alpha.3 Feb 20, 2023

#421 in Text processing


347 downloads per month

GPL-3.0-or-later

210KB
4K SLoC

wikipedia_prosesize


Calculate Wikipedia prose size

This crate is a rough port of the Wikipedia Prosesize script. It counts the bytes of readable prose on a page rather than the bytes of wikitext markup or generated HTML.

The input is an ImmutableWikicode, which you will most likely fetch using the parsoid crate.

The response from prosesize() provides the text-only prose size, word count and text-only references size. Enabling the optional serde-1 feature makes the size struct serializable and deserializable.
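To make the three reported numbers concrete, here is a minimal, standard-library-only sketch of what they measure. It is not the crate's implementation or API: the `ProseStats` struct, its field names, and the `measure` function are hypothetical, and the sketch assumes the prose and reference text have already been extracted as plain strings, whereas the real crate works on parsed page content.

```rust
/// Hypothetical stand-in for the crate's result type.
/// The struct and field names are assumptions made for this illustration.
#[derive(Debug, PartialEq)]
struct ProseStats {
    /// Bytes of text-only prose (UTF-8 length, not character count).
    prose_bytes: usize,
    /// Whitespace-separated words in the prose.
    word_count: usize,
    /// Bytes of text-only reference content, counted separately from prose.
    references_bytes: usize,
}

/// Measure already-extracted plain text (hypothetical helper).
/// `prose` holds paragraph text; `references` holds footnote text.
fn measure(prose: &[&str], references: &[&str]) -> ProseStats {
    ProseStats {
        // `str::len` returns the length in bytes, which is what a byte size needs.
        prose_bytes: prose.iter().map(|p| p.len()).sum(),
        word_count: prose.iter().map(|p| p.split_whitespace().count()).sum(),
        references_bytes: references.iter().map(|r| r.len()).sum(),
    }
}

fn main() {
    let prose = ["Douglas Adams was an English author.", "He wrote novels."];
    let references = ["Adams 1979, p. 42."];
    let stats = measure(&prose, &references);
    println!("{stats:?}");
}
```

Because sizes are counted in bytes, non-ASCII text weighs more than its character count suggests: a word such as "café" is four characters but five bytes in UTF-8.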

Contributing

wikipedia_prosesize is part of the mwbot-rs project. We're always looking for new contributors; please reach out if you're interested!

License

This crate is released under GPL-3.0-or-later. See COPYING for details.

Dependencies

~7–13MB
~171K SLoC