#escaping #text-formatting #quote #ellipsis #latex #rules #html

bin+lib crowbook-text-processing

Provides utility functions for escaping text (HTML/LaTeX) and formatting it according to typographic rules (smart quotes, ellipsis, French typographic rules)

18 releases (3 stable)

1.1.1 Aug 3, 2023
1.1.0 Jul 29, 2023
1.0.0 Feb 10, 2020
0.2.8 Nov 11, 2019
0.2.2 Oct 21, 2016

#315 in Text processing


256 downloads per month
Used in 3 crates

MPL-2.0 license

62KB
1K SLoC

See the full library documentation on Docs.rs.

Provides utility functions for escaping text (to HTML or LaTeX) and formatting it according to typographic rules (smart quotes, ellipsis, French rules for non-breaking spaces).

These functions were originally written for Crowbook, but have been published as a separate crate under a less restrictive license (MPL instead of LGPL) so that they can be used in other projects.
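To make the idea of escaping concrete, here is a minimal standard-library sketch of what escaping the same text for HTML and for LaTeX involves. It is not the crate's implementation: `escape_html_min` and `escape_tex_min` are hypothetical helpers written for illustration, and they cover only the most common special characters.

```rust
/// Hypothetical helper (not part of the crate): escape the characters
/// that are unsafe in HTML text content.
fn escape_html_min(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(c),
        }
    }
    out
}

/// Hypothetical helper (not part of the crate): escape the characters
/// that have a special meaning in LaTeX.
fn escape_tex_min(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            // These only need a leading backslash.
            '&' | '%' | '$' | '#' | '_' | '{' | '}' => {
                out.push('\\');
                out.push(c);
            }
            // These need a dedicated command.
            '\\' => out.push_str("\\textbackslash{}"),
            '~' => out.push_str("\\textasciitilde{}"),
            '^' => out.push_str("\\textasciicircum{}"),
            _ => out.push(c),
        }
    }
    out
}

fn main() {
    let s = "Tom & Jerry: 100% <fun>";
    println!("for HTML: {}", escape_html_min(s));
    println!("for LaTeX: {}", escape_tex_min(s));
}
```

The same input needs different treatment depending on the target: `%` is harmless in HTML but starts a comment in LaTeX, while `<` is harmless in LaTeX but opens a tag in HTML.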

Usage

Add this line to the dependencies section of your Cargo.toml file:

[dependencies]
crowbook-text-processing = "0.2"

Example

use crowbook_text_processing::{FrenchFormatter, clean, escape};

let s = " Some  string with  too much   whitespaces & around 1% \
         characters that might cause trouble to HTML or LaTeX.";
// Remove unnecessary whitespace (without trimming, since leading or trailing whitespace can be meaningful)
let new_s = clean::whitespaces(s);
// Escape for HTML
println!("for HTML: {}", escape::html(new_s.clone()));
// Escape for LaTeX
println!("for LaTeX: {}", escape::tex(new_s));

// Replace quotes with typographic quotation marks
let s = r#"Some "quoted string" and 'another one'."#;
let new_s = clean::quotes(s);
assert_eq!(&new_s, "Some “quoted string” and ‘another one’.");

// Replace three consecutive dots with ellipsis character
let s = clean::ellipsis("Foo...");
assert_eq!(&s, "Foo…");

// Format whitespace according to French typographic rules, using
// the appropriate non-breaking spaces where needed
let s = " Une chaîne en français ! On voudrait un résultat \
         « typographiquement correct ».";
let french = FrenchFormatter::new();
println!("for text: {}", french.format(s));
println!("for LaTeX: {}", escape::tex(french.format_tex(s)));
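For readers unfamiliar with French spacing, the sketch below shows the kind of substitution the last step performs, using only the standard library. It is an illustration under assumptions, not the code behind `FrenchFormatter`: `french_spaces` is a hypothetical helper, and it applies one common convention (a narrow non-breaking space before `!`, `?` and `;`, a regular non-breaking space before `:` and inside guillemets). The crate's actual rules and options may differ.

```rust
const NB: char = '\u{00A0}'; // non-breaking space
const NARROW_NB: char = '\u{202F}'; // narrow non-breaking space

/// Hypothetical helper (not part of the crate): replace ordinary spaces
/// next to French punctuation with non-breaking ones.
fn french_spaces(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len());
    for (i, &c) in chars.iter().enumerate() {
        if c != ' ' {
            out.push(c);
            continue;
        }
        // Look at the characters on both sides of this space.
        let prev = if i > 0 { Some(chars[i - 1]) } else { None };
        let next = chars.get(i + 1).copied();
        match (prev, next) {
            // Space before "high" punctuation: narrow non-breaking space.
            (_, Some('!' | '?' | ';')) => out.push(NARROW_NB),
            // Space before a colon or inside guillemets: non-breaking space.
            (_, Some(':' | '»')) | (Some('«'), _) => out.push(NB),
            // Any other space is left untouched.
            _ => out.push(c),
        }
    }
    out
}

fn main() {
    let s = "Bonjour ! « Ça va ? »";
    // The output looks the same on screen, but line breaks can no longer
    // separate the punctuation from the word it belongs to.
    println!("{}", french_spaces(s));
}
```

Because the replaced characters are invisible, comparing against explicit `\u{...}` escapes is the easiest way to check the result in a test.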

License

This is free software, published under the Mozilla Public License, version 2.0.

ChangeLog

See the ChangeLog file.

Dependencies

~2–3MB
~54K SLoC