1 unstable release

new 0.1.0 Feb 21, 2025

MIT license

Lesbar (ˈleːsbaːɐ̯ | laze-bahr) is a Rust library that provides strongly typed APIs for strings that represent legible text. Lesbar extends and is implemented with Mitsein.

Basic Usage

Allocating a TString (textual string) from a string literal:

use lesbar::prelude::*;

let text = TString::try_from("Servus!").unwrap();
let error = TString::try_from("\u{FEFF}").unwrap_err();

Constructing a TStr (textual string slice) with the tstr! macro:

let text = lesbar::tstr!("Macros sind der Hammer!");

// This does not build.
//
// let text = lesbar::tstr!("\u{200B}\u{200E}");

Removing text from a TString:

use lesbar::prelude::*;

let mut text = TString::from(lesbar::tstr!("Raus damit."));
let grapheme = text.pop_grapheme_or().none().unwrap();

assert_eq!(grapheme, ".");

Legibility

Legible string types encode at least some Unicode that either has a specified non-zero column width or consists of code points and grapheme clusters that specify a visual presentation (explicitly or otherwise). Note that blank but non-empty space is considered legible. This notion is based only on the Unicode specification and its interpretations; fonts, glyphs, and other rendering elements, for example, are not considered at all.

Some elements of Unicode are ambiguous regarding this notion of legibility, and Lesbar attempts a reasonable compromise that errs on the side of considering Unicode illegible in such cases.

Text rendering software has far more context when presenting text and can interpret Unicode arbitrarily. There is no guarantee that the contents of a legible string type in Lesbar will present as non-empty when rendered, though this is very likely.
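
To make the difference between non-empty and legible concrete, here is a rough sketch that uses only the standard library. It is not Lesbar's classification: the helper `is_roughly_legible` and the `INVISIBLE` list are hypothetical names, and the list is a small hand-picked set of zero-width code points chosen for illustration only.

```rust
/// Code points with no visible presentation that this sketch treats as
/// illegible. This hand-picked list is an assumption for illustration and
/// is far smaller than what a real classification would need.
const INVISIBLE: &[char] = &[
    '\u{200B}', // zero width space
    '\u{200C}', // zero width non-joiner
    '\u{200D}', // zero width joiner
    '\u{200E}', // left-to-right mark
    '\u{200F}', // right-to-left mark
    '\u{2060}', // word joiner
    '\u{FEFF}', // zero width no-break space (byte order mark)
];

/// Hypothetical helper: returns `true` if at least one code point is neither
/// a control character nor listed in `INVISIBLE`.
fn is_roughly_legible(text: &str) -> bool {
    text.chars()
        .any(|c| !c.is_control() && !INVISIBLE.contains(&c))
}
```

Under these assumptions, `is_roughly_legible("Servus!")` and `is_roughly_legible(" ")` return `true`, while the empty string, `"\u{FEFF}"`, and `"\u{200B}\u{200E}"` return `false`, even though the last two are non-empty. The sketch works on code points only and ignores column widths and grapheme clusters.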

Features and Comparisons

The mitsein crate provides the non-empty string types String1 and Str1. Similarly, the non-empty-string crate provides the NonEmptyString type. However, these types only guarantee that strings are composed of one or more Unicode code points or bytes of UTF-8. Lesbar implements types with stricter requirements: textual strings that must encode some amount of legible (visible) text.

Lesbar implements both textual strings and textual string slices (TString and TStr), which are analogous to standard Rust string types. These types also support conversions into textual container types like Box. The non-empty-string crate does not make this distinction, only implements owned string buffers, and does not preserve the non-empty property when converting into containers like Box.

Lesbar is implemented with the mitsein crate, which provides non-empty collections, slices, and iterators. Textual string types provide strongly typed APIs for slicing and iteration that reflect the non-empty and legible guarantees, with conversions into and from non-empty types. The non-empty-string crate, for example, provides no conversions or iteration mechanisms that consider this property.
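
The value of reflecting a guarantee in the type can be sketched with a plain newtype. The `NonEmptyStr` type below is hypothetical and uses only the standard library; it is not `Str1` from mitsein or `TStr` from Lesbar, and it only models non-emptiness, not legibility. It shows how checking an invariant once at construction lets later accessors return values directly instead of `Option`s.

```rust
/// Hypothetical non-empty string slice, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NonEmptyStr<'a>(&'a str);

impl<'a> NonEmptyStr<'a> {
    /// Checks the invariant once, at construction.
    fn new(text: &'a str) -> Option<Self> {
        if text.is_empty() {
            None
        } else {
            Some(NonEmptyStr(text))
        }
    }

    /// The first code point always exists, so no `Option` is needed.
    fn first(self) -> char {
        self.0.chars().next().expect("non-empty by construction")
    }

    /// Splits off the first code point; only the remainder may be empty.
    fn split_first(self) -> (char, &'a str) {
        let first = self.first();
        (first, &self.0[first.len_utf8()..])
    }
}
```

Callers that hold a `NonEmptyStr` never need to handle an empty case for `first` or `split_first`; only the remainder returned by `split_first` falls back to an ordinary `&str`.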

Lesbar is a no_std library and alloc is optional. Textual string slices can be used in contexts where OS features or allocation are not available.

Integrations and Cargo Features

Lesbar provides some optional features and integrations via the following Cargo features.

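| Feature | Default | Primary Dependency | Description                                      |
|---------|---------|--------------------|--------------------------------------------------|
| alloc   | Yes     | alloc              | Legible string buffers, like TString.            |
| serde   | No      | serde              | De/serialization of legible strings with serde.  |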
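
As a sketch of how these features might be selected in a Cargo.toml manifest, assuming the 0.1.0 release and the feature names listed above (only one of the three lines would be used at a time):

```toml
[dependencies]
# Default features, which include `alloc` and therefore TString.
lesbar = "0.1.0"

# Alternative without `alloc`, for contexts where allocation is unavailable:
# lesbar = { version = "0.1.0", default-features = false }

# Alternative that additionally enables the serde integration:
# lesbar = { version = "0.1.0", features = ["serde"] }
```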
