11 releases

0.3.1 Jun 28, 2024
0.3.0 Jun 28, 2024
0.2.2 Jun 27, 2024
0.1.5 Jun 22, 2024

#68 in #productivity

Apache-2.0

34KB
642 lines

htmlproc


HTML processing utilities written in Rust. Each function is offered as a separate Cargo feature, so dependencies stay small. (omit_enclosure, which is used as a document outline formatter, is the exception.)

Install in a Rust project

# install crate
cargo add htmlproc

# install crate with specific features
cargo add htmlproc --features path_to_url

# uninstall
# cargo remove htmlproc

Functions (Features)

omit_attr

Remove specific tag attribute(s) from HTML text.

Usage

First, run cargo add htmlproc --features omit_attr. Then specify the attributes to omit. Three formats are available:

  • attr: remove every occurrence of the attribute, on all tags.
  • *.attr: same as above.
  • tag.attr: remove the attribute from a specific tag only, e.g. span.style.

use htmlproc::omit_attr::manipulate;

let html = "<div id=\"preserved\"><span style=\"want: remove;\" class=\"also: wanted;\" z-index=\"1\">Content</span></div>";
let omit_attrs = &["style", "*.class", "span.z-index"];
let result: String = manipulate(html, omit_attrs);
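
To make the three selector formats concrete, here is a minimal standalone sketch of how such selectors could be interpreted. It uses only the standard library and is not htmlproc's implementation: AttrSelector, parse_selector and should_omit are hypothetical names introduced for illustration, and splitting on the first '.' is an assumption.

```rust
/// Hypothetical parsed form of one selector (not a type from htmlproc).
#[derive(Debug, PartialEq)]
struct AttrSelector<'a> {
    /// `None` means "any tag"; `Some("span")` restricts the match to that tag.
    tag: Option<&'a str>,
    attr: &'a str,
}

/// Split a selector such as "style", "*.class" or "span.z-index".
/// Assumption: the first '.' separates the tag from the attribute name.
fn parse_selector(selector: &str) -> AttrSelector<'_> {
    match selector.split_once('.') {
        // "*.attr" applies to every tag, same as the bare "attr" form.
        Some(("*", attr)) => AttrSelector { tag: None, attr },
        // "tag.attr" applies to one tag only.
        Some((tag, attr)) => AttrSelector { tag: Some(tag), attr },
        // Bare "attr" applies to every tag.
        None => AttrSelector { tag: None, attr: selector },
    }
}

/// Report whether `attr` on `tag` is matched by any of the selectors.
fn should_omit(selectors: &[&str], tag: &str, attr: &str) -> bool {
    selectors.iter().any(|s| {
        let sel = parse_selector(s);
        sel.attr == attr && sel.tag.map_or(true, |t| t == tag)
    })
}
```

Under this reading, the selectors in the usage above ("style", "*.class", "span.z-index") match all three attributes on the span, while id on the div matches none of them and is left in place.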

omit_enclosure

Remove specific tag enclosure(s) from HTML text.

Usage

use htmlproc::omit_enclosure::manipulate;

let result: String = manipulate("<div>...<span>---</span>...</div>", &["span"]);

path_to_url

Convert paths to URLs.

Usage

use htmlproc::path_to_url::{convert, ConvertOptions};

let result: String = convert("<a href=\"/some/path\">link</a>", ConvertOptions::new("target.domain"));

In this case, the href value "/some/path" is converted to "https://target.domain/some/path". Options such as using the HTTP protocol, a port number, and a current directory are available.
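
The conversion described above can be sketched with the standard library alone. This is an illustration of the idea, not htmlproc's code: UrlBase and to_absolute_url are hypothetical names, and the handling of relative paths (joined onto the current directory, with no "./" or "../" resolution) is an assumption.

```rust
/// Hypothetical settings, loosely mirroring the options mentioned above.
/// This is not the crate's ConvertOptions type.
struct UrlBase<'a> {
    scheme: &'a str,      // "https" or "http"
    host: &'a str,        // e.g. "target.domain"
    port: Option<u16>,    // None omits the port from the URL
    current_dir: &'a str, // base directory for relative paths, e.g. "/docs"
}

/// Build an absolute URL from a root-relative or relative path.
fn to_absolute_url(base: &UrlBase, path: &str) -> String {
    // host[:port]
    let authority = match base.port {
        Some(p) => format!("{}:{}", base.host, p),
        None => base.host.to_string(),
    };
    // Root-relative paths are used as-is; others are joined to current_dir.
    let full_path = if path.starts_with('/') {
        path.to_string()
    } else {
        format!("{}/{}", base.current_dir.trim_end_matches('/'), path)
    };
    format!("{}://{}{}", base.scheme, authority, full_path)
}
```

With scheme "https", host "target.domain", no port and current directory "/", the path "/some/path" becomes "https://target.domain/some/path", matching the example above.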

Dependencies

~1.5–6.5MB
~35K SLoC