9 releases (stable)
Version | Released |
---|---|
1.0.5 | Mar 15, 2024 |
1.0.4 | Feb 19, 2024 |
1.0.3 | Jul 10, 2023 |
1.0.2 | Mar 4, 2023 |
0.1.0 | Dec 12, 2019 |
Encode and decode HTML entities according to the standard
If you only need to escape text for embedding into HTML then installing is as simple as running:
cargo add htmlize
If you want to unescape entities back into raw text, see Unescaping entities into text below.
Escaping text into entities
The `escape` functions should cover most cases where you need to safely embed a string in HTML. Generally, if the text goes in an attribute, use `escape_attribute()`, otherwise use `escape_text()`.
Function | & | < | > | " | ' |
---|---|---|---|---|---|
escape_text() | ✓ | ✓ | ✓ | | |
escape_attribute() | ✓ | ✓ | ✓ | ✓ | |
escape_all_quotes() | ✓ | ✓ | ✓ | ✓ | ✓ |
You should almost never need `escape_all_quotes()`, but it is included because sometimes it’s convenient to wrap attribute values in single quotes.
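The table above can be read as three levels of escaping. The following is a minimal std-only sketch of those levels, not htmlize's implementation: the `Quotes` enum, the `*_sketch` function names, and the exact entity spellings (`&quot;`, `&apos;`) are assumptions made for illustration.

```rust
/// Which quote characters the sketch replaces (hypothetical helper type).
#[derive(Clone, Copy)]
enum Quotes {
    Keep,   // leave both kinds of quote alone
    Double, // replace only "
    All,    // replace both " and '
}

/// Replace &, <, > always, and quotes according to `quotes`.
fn escape_sketch(input: &str, quotes: Quotes) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match (c, quotes) {
            ('&', _) => out.push_str("&amp;"),
            ('<', _) => out.push_str("&lt;"),
            ('>', _) => out.push_str("&gt;"),
            ('"', Quotes::Double | Quotes::All) => out.push_str("&quot;"),
            ('\'', Quotes::All) => out.push_str("&apos;"),
            _ => out.push(c),
        }
    }
    out
}

// One wrapper per row of the table.
fn escape_text_sketch(s: &str) -> String {
    escape_sketch(s, Quotes::Keep)
}

fn escape_attribute_sketch(s: &str) -> String {
    escape_sketch(s, Quotes::Double)
}

fn escape_all_quotes_sketch(s: &str) -> String {
    escape_sketch(s, Quotes::All)
}
```

In real code you would call the crate's own functions instead; the sketch only makes the difference between the three rows concrete.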
escape_text(string) -> string
Escape a string so that it can be embedded in the main text. This does not escape quotes at all.
Reference. See also `escape_text_bytes()`.
escape_attribute(string) -> string
Escape a string so that it can be embedded in an attribute. Always use double quotes around attributes.
Reference. See also `escape_attribute_bytes()`.
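To see why the double quotes matter, here is a small std-only sketch. It assumes an attribute escaper that replaces `&`, `<`, `>`, and `"` but leaves `'` alone, as the table describes; `escape_attr_sketch`, `title_double_quoted`, and `title_single_quoted` are hypothetical names, not part of htmlize.

```rust
/// Hypothetical stand-in for an attribute escaper: handles & < > " but not '.
fn escape_attr_sketch(value: &str) -> String {
    value
        .replace('&', "&amp;") // must run first so later entities are not re-escaped
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

/// Safe: the value cannot contain a raw double quote after escaping.
fn title_double_quoted(value: &str) -> String {
    format!("<p title=\"{}\">", escape_attr_sketch(value))
}

/// Unsafe with this escaper: a raw single quote in `value` ends the attribute early.
fn title_single_quoted(value: &str) -> String {
    format!("<p title='{}'>", escape_attr_sketch(value))
}
```

With the input `x' onclick='evil()`, the double-quoted form keeps the whole string inside `title`, while the single-quoted form lets the input close the attribute and start a new one.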
escape_all_quotes(string) -> string
Escape both single and double quotes in a string along with other standard characters. In general you should not need to use this.
Reference. See also `escape_all_quotes_bytes()`.
Unescaping entities into text
This requires the `unescape` or `unescape_fast` feature. (`unescape` builds much faster, so unless you really need the very fastest unescape, use it.) To configure it:
cargo add htmlize --features unescape
unescape(string) -> string
This follows the official WHATWG algorithm for expanding entities in general.
Strictly speaking, this does not correctly handle text from the value of attributes. It’s probably fine for most uses, but if you know that the input string came from the value of an attribute, use `unescape_attribute()` instead. See the `unescape_in()` reference documentation for more information.
unescape_attribute(string) -> string
This follows the official WHATWG algorithm for expanding entities found in the value of an attribute.
The only difference is in how this handles named entities without a trailing semicolon. See the `unescape_in()` reference documentation for more information.
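The semicolon rule can be sketched as follows. This is a simplified std-only illustration of the WHATWG rule, not htmlize's code: the local `Context` enum, the four-entry `LEGACY` table, and `unescape_sketch` are all assumptions, and the real algorithm does longest-match lookup over the full entity list and also handles numeric references.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Context {
    General,
    Attribute,
}

// Tiny hypothetical subset of named entities that may appear without `;`.
const LEGACY: &[(&str, &str)] = &[("amp", "&"), ("lt", "<"), ("gt", ">"), ("times", "×")];

fn unescape_sketch(input: &str, context: Context) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find('&') {
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        let hit = LEGACY.iter().copied().find(|(name, _)| after.starts_with(*name));
        match hit {
            Some((name, value)) => {
                let tail = &after[name.len()..];
                if let Some(tail) = tail.strip_prefix(';') {
                    // Properly terminated entity: always expanded.
                    out.push_str(value);
                    rest = tail;
                } else {
                    // No semicolon: in an attribute, `=` or an ASCII
                    // alphanumeric right after the name blocks expansion.
                    let next = tail.chars().next();
                    let blocked = context == Context::Attribute
                        && matches!(next, Some(c) if c == '=' || c.is_ascii_alphanumeric());
                    if blocked {
                        out.push('&');
                        out.push_str(name);
                    } else {
                        out.push_str(value);
                    }
                    rest = tail;
                }
            }
            None => {
                // Not an entity in the table: keep the ampersand literally.
                out.push('&');
                rest = after;
            }
        }
    }
    out.push_str(rest);
    out
}
```

The rule exists so that query strings in attribute values, such as `?a=1&times=2`, are not corrupted by entity expansion.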
unescape_in(string, htmlize::Context) -> string
This follows the official WHATWG algorithm for expanding entities based on the context where they are found. See the reference documentation for more information.
unescape_bytes_in([u8], htmlize::Context) -> [u8]
This is the same as `unescape_in()`, except that it works on bytes rather than strings. (Note that both functions actually take and return `Cow`s.)
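Taking and returning `Cow` lets a function hand back its input untouched when there is nothing to change, and allocate only when it must. A minimal sketch of that pattern using only the standard library is shown here; `escape_lt_sketch` is a hypothetical function that handles a single character, not something htmlize provides.

```rust
use std::borrow::Cow;

/// Replace `<` with `&lt;`, borrowing the input when no replacement is needed.
fn escape_lt_sketch<'a>(input: impl Into<Cow<'a, str>>) -> Cow<'a, str> {
    let input = input.into();
    if !input.contains('<') {
        // Nothing to escape: return the original value without copying.
        return input;
    }
    // At least one replacement: build a new owned string.
    Cow::Owned(input.replace('<', "&lt;"))
}
```

Calling `escape_lt_sketch("plain")` yields a `Cow::Borrowed`, while `escape_lt_sketch("a<b")` yields a `Cow::Owned` containing `a&lt;b`. Both `&str` and `String` arguments are accepted through `Into<Cow<str>>`.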
Features
The `escape` functions are all available with no features enabled.
- `unescape_fast`: provide fast version of `unescape()`. This does not enable the `entities` feature automatically.

  This takes perhaps 30 seconds longer to build than `unescape`, but the performance is significantly better in the worst cases. That said, the performance of the `unescape` version is already pretty good, so I don’t recommend enabling this unless you really need it.
- `unescape`: provide normal version of `unescape()`. This will automatically enable the `entities` feature.
- `entities`: build `ENTITIES` map. Enabling this will add a dependency on phf and may slow builds by a few seconds.
All other features are internal and should not be used when specifying a dependency. See the reference documentation.
Benchmarks
This has two suites of benchmarks. One is a typical multi-run benchmark using criterion. These can be run with `cargo bench` or `cargo criterion` if you have it installed.
To run benchmarks on the unescape functions, enable features `bench` and `unescape` or `unescape_fast` (or both).
Note: The internal `bench` feature is required to expose internal functions like `unescape_fast()` and `unescape_slow()` to the benchmarks. You must not enable this feature when specifying a dependency, since its behavior is not guaranteed to stay the same from point release to point release.
iai benchmarks
The other suite of benchmarks uses iai to count instructions and cache accesses, and to estimate cycles. It requires the internal `iai` feature to be enabled, and only really works well on Linux.
To run iai benchmarks locally:
cargo bench --features iai iai
You may want to use `--all-features` or `--features iai,bench,unescape` or `--features iai,bench,unescape_fast` to enable benchmarks of the `unescape()` functions.
To run in a Docker container, use the `docker.sh` script. It will build an image if necessary, then use that image for all future runs:
./docker.sh cargo bench --features iai iai
You can also start it in interactive mode and run the benchmark multiple times:
❯ ./docker.sh
root@d0a0db46770d:/work# cargo bench --features iai iai
Compiling htmlize [...]
Development status
This is stable. I have no features planned for the future, though I’m open to suggestions.
License
This project is dual-licensed under the Apache 2 and MIT licenses. You may choose to use either.
The entities.json file is copyright WHATWG, and is copied from https://html.spec.whatwg.org/entities.json. It is licensed under the BSD 3-Clause License.
Contributions
Unless you explicitly state otherwise, any contribution you submit as defined in the Apache 2.0 license shall be dual licensed as above, without any additional terms or conditions.