10 releases (4 breaking)

0.5.1 Feb 24, 2024
0.5.0 Feb 24, 2024
0.4.1 Dec 27, 2023
0.3.0 Oct 2, 2023
0.1.1 Aug 19, 2023

#106 in Text processing

62,198 downloads per month
Used in 2 crates

Apache-2.0

77KB
1.5K SLoC

byteyarn

byteyarn - Space-efficient byte strings 🧶🐈‍⬛

A Yarn is a highly optimized string type that provides a number of useful properties over String:

  • Always two pointers wide, so it is always passed into and out of functions in registers.
  • Small string optimization (SSO) up to 15 bytes on 64-bit architectures.
  • Can be either an owned buffer or a borrowed buffer (like Cow<str>).
  • Can be upcast to 'static lifetime if it was constructed from a known-static string.
  • Option<Yarn> has the same size and ABI as Yarn.
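For comparison, the sketch below measures the standard-library types that the size claims above are set against. It uses only `std`, not byteyarn, and the helper `words` is a name introduced here for illustration. The three-word size of `String` is what current Rust toolchains produce rather than a documented layout guarantee.

```rust
use std::mem::size_of;

/// Size of `T` measured in pointer-sized words (hypothetical helper).
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // A borrowed string slice is a (pointer, length) pair: two words.
    assert_eq!(words::<&str>(), 2);

    // `String` also tracks a capacity, which makes it three words wide.
    assert_eq!(words::<String>(), 3);

    // The non-null pointer gives `Option` a niche, so wrapping these
    // types in `Option` costs no extra space.
    assert_eq!(size_of::<Option<&str>>(), size_of::<&str>());
    assert_eq!(size_of::<Option<String>>(), size_of::<String>());

    println!("&str: {} words, String: {} words", words::<&str>(), words::<String>());
}
```

Dropping the capacity field is what lets a two-word string fit in a register pair, and it is also why appending becomes awkward.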

The main caveat is that Yarns cannot be easily appended to, since they do not track an internal capacity, and the slice returned by Yarn::as_slice() does not have the same pointer stability properties as String (these are rarely needed, though).


Yarns are useful for situations in which a copy-on-write string is necessary and most of the strings are relatively small. Although Yarn itself is not Copy, there is a separate YarnRef type that is. The two types have equivalent representations and can be cheaply converted into one another.
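As a reminder of the copy-on-write pattern being referred to, here is a minimal sketch using the standard library's `Cow<str>` rather than a yarn. The function `expand_tabs` is a hypothetical example, not part of byteyarn.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces each tab with four spaces,
/// allocating a new string only when the input contains a tab.
fn expand_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', "    "))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    // No tabs: the input is handed back without copying.
    let untouched = expand_tabs("no tabs here");
    assert!(matches!(untouched, Cow::Borrowed(_)));

    // A tab is present: a fresh owned buffer is produced.
    let changed = expand_tabs("a\tb");
    assert!(matches!(changed, Cow::Owned(_)));
    assert_eq!(changed, "a    b");
}
```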

The easiest way to create a yarn is with the yarn!() macro, which is similar to format!().

use byteyarn::{yarn, YarnRef};

// Create a new yarn via `fmt`ing.
let yarn = yarn!("Answer: {}", 42);

// Convert that yarn into a reference.
let ry: YarnRef<str> = yarn.as_ref();

// Try up-casting the yarn into an "immortal yarn" without copying.
let copy: YarnRef<'static, str> = ry.immortalize().unwrap();

assert_eq!(yarn, copy);

Yarns are intended for storing text, either as UTF-8 or as probably-UTF-8 bytes; Yarn<str> and Yarn<u8> serve these purposes, and each can be converted into the other. The Yarn::utf8_chunks() function can be used to iterate over definitely-valid-UTF-8 chunks within a string.
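To illustrate what iterating over valid chunks of probably-UTF-8 bytes means, the sketch below does it by hand with `std::str::from_utf8`. It does not use byteyarn and makes no claim about how Yarn::utf8_chunks() is implemented or what item type it yields. The helper `valid_chunks` is a hypothetical name.

```rust
use std::str;

/// Hypothetical helper: splits `bytes` into maximal runs of valid UTF-8,
/// skipping over any invalid byte sequences in between.
fn valid_chunks(mut bytes: &[u8]) -> Vec<&str> {
    let mut out = Vec::new();
    while !bytes.is_empty() {
        match str::from_utf8(bytes) {
            // Everything that remains is valid: emit it and stop.
            Ok(s) => {
                out.push(s);
                break;
            }
            Err(e) => {
                // Bytes before `valid_up_to()` are known to be valid UTF-8.
                let (valid, rest) = bytes.split_at(e.valid_up_to());
                if !valid.is_empty() {
                    out.push(str::from_utf8(valid).expect("prefix was validated"));
                }
                // `error_len()` is `None` when the input ends mid-sequence;
                // in that case the rest of the input is discarded.
                let skip = e.error_len().unwrap_or(rest.len());
                bytes = &rest[skip..];
            }
        }
    }
    out
}

fn main() {
    let chunks = valid_chunks(b"abc\xffdef");
    assert_eq!(chunks, vec!["abc", "def"]);
}
```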

Both kinds of yarns implement Debug and Display, and print as strings would. In particular, invalid UTF-8 is rendered as \xNN escapes by Debug and as replacement characters (U+FFFD) by Display.

use byteyarn::ByteYarn;

let invalid = ByteYarn::from_byte(0xff);
assert_eq!(format!("{invalid:?}"), r#""\xFF""#);
assert_eq!(format!("{invalid}"), "οΏ½");

Dependencies

~1MB
~12K SLoC