#codec #unicode #encode #ascii #lower-case #up #truncated

utf58

High-tech encoding of the Unicode space in one quibble and up to 3 bytes

6 releases

0.1.1 Oct 20, 2024
0.1.0 Oct 14, 2024
0.0.4 Oct 14, 2024

#919 in Text processing


300 downloads per month

MIT/Apache

8KB
153 lines

UTF-58

A UTF-58 encoder and decoder. UTF-58 (pronounced fifty-eight) is an encoding for arbitrary Unicode codepoints that uses an initial 5 bits (called a quibble), and then up to 3 bytes.

This is useful when you want to encode a Unicode codepoint while leaving 3 bits free for additional data.
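To make the "3 spare bits" idea concrete, here is a minimal sketch of sharing one byte between a 5-bit quibble and 3 bits of extra data. The `pack` and `unpack` helpers are hypothetical and are not part of the utf58 crate. Putting the quibble in the low bits is an assumption made for illustration; the specification may lay the bits out differently.

```rust
/// Hypothetical helper (not the utf58 crate API): packs a 5-bit quibble and
/// a 3-bit tag into a single byte.
/// Assumption: the quibble sits in the low 5 bits, the tag in the high 3 bits.
fn pack(quibble: u8, tag: u8) -> u8 {
    assert!(quibble < 32, "quibble must fit in 5 bits");
    assert!(tag < 8, "tag must fit in 3 bits");
    (tag << 5) | quibble
}

/// Splits a packed byte back into `(quibble, tag)` under the same assumed layout.
fn unpack(byte: u8) -> (u8, u8) {
    (byte & 0b0001_1111, byte >> 5)
}
```

Under this assumed layout, `pack(0b10110, 0b101)` gives `0b1011_0110`, and `unpack` returns the original pair.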

UTF-58 is kinda ASCII-compatible for lowercase a-z: for those letters, the first quibble represents the truncated ASCII value.
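As a rough illustration of what a truncated ASCII value could look like, the sketch below keeps only the low 5 bits of a lowercase letter. The function name is hypothetical and not part of the utf58 crate, and reading "truncated" as "low 5 bits" is an assumption; the official specification is the authority on the actual mapping.

```rust
/// Hypothetical helper (not the utf58 crate API): returns the low 5 bits of
/// an ASCII lowercase letter, or `None` for any other character.
/// Assumption: "truncated" means dropping the high bits of the ASCII code.
fn truncate_ascii_lowercase(c: char) -> Option<u8> {
    if c.is_ascii_lowercase() {
        // 'a' is 0x61 and 'z' is 0x7A, so masking with 0x1F maps a-z to 1-26.
        Some(c as u8 & 0b0001_1111)
    } else {
        None
    }
}
```

Under this reading, `'a'` maps to 1 and `'z'` to 26, so all 26 letters fit in a single 5-bit quibble.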

For more information, check the official specification.

No runtime deps