1 unstable release: 0.1.0 (Aug 20, 2024)
#1918 in Algorithms
34KB
657 lines
german-str
German strings are a string type with the following properties:
- They are immutable.
- `size_of::<GermanStr>() == 16`.
- They can't be longer than 2^32 bytes.
- Strings of 12 bytes or fewer are stored entirely inline in the struct, with no heap allocation.
- Comparisons depending only on the first 4 bytes are very fast.
They are described here. TL;DR: it's a 16-byte struct where:
- The first 4 bytes of the struct are a `u32` representing the length of the string.
- The first 4 bytes of the string are stored right after.
- If the rest of the string can fit in the remaining 8 bytes, it is directly stored there.
- Otherwise, the last 8 bytes are a pointer to the string buffer on the heap (which includes the 4-byte prefix).
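The layout described in the list above can be sketched as a `repr(C)` struct holding a union. This is only an illustration under stated assumptions: `Sketch`, `Last8`, `new_inline`, and `to_inline_string` are hypothetical names, not the crate's actual definitions, and only the inline case is implemented.

```rust
/// Longest string that fits in the struct itself: 4 prefix bytes + 8 trailing bytes.
const MAX_INLINE: usize = 12;

/// Hypothetical view of the last 8 bytes: either the rest of a short string
/// or a pointer to a heap buffer (the heap case is not exercised here).
#[repr(C)]
#[derive(Clone, Copy)]
union Last8 {
    inline: [u8; 8],
    #[allow(dead_code)]
    heap: *const u8,
}

/// Hypothetical stand-in for the 16-byte layout (assumes a 64-bit target).
#[repr(C)]
#[derive(Clone, Copy)]
struct Sketch {
    len: u32,        // bytes 0..4: length of the string
    prefix: [u8; 4], // bytes 4..8: first 4 bytes of the string, zero-padded
    last8: Last8,    // bytes 8..16: inline remainder or heap pointer
}

impl Sketch {
    /// Builds an inline value, or returns `None` when the string is longer
    /// than 12 bytes and would need a heap buffer.
    fn new_inline(s: &str) -> Option<Sketch> {
        let bytes = s.as_bytes();
        if bytes.len() > MAX_INLINE {
            return None;
        }
        let split = bytes.len().min(4);
        let mut prefix = [0u8; 4];
        let mut rest = [0u8; 8];
        prefix[..split].copy_from_slice(&bytes[..split]);
        rest[..bytes.len() - split].copy_from_slice(&bytes[split..]);
        Some(Sketch {
            len: bytes.len() as u32,
            prefix,
            last8: Last8 { inline: rest },
        })
    }

    /// Reassembles the string from the prefix and the inline remainder.
    fn to_inline_string(&self) -> Option<String> {
        let len = self.len as usize;
        if len > MAX_INLINE {
            return None;
        }
        let mut buf = [0u8; MAX_INLINE];
        buf[..4].copy_from_slice(&self.prefix);
        // SAFETY: `len <= 12` means the union was written through `inline`.
        let rest = unsafe { self.last8.inline };
        buf[4..].copy_from_slice(&rest);
        String::from_utf8(buf[..len].to_vec()).ok()
    }
}
```

On a 64-bit target, `std::mem::size_of::<Sketch>()` is 16: the `u32` and the 4-byte prefix fill the first 8 bytes, and the union fills the last 8.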
The implementation was heavily inspired by `SmolStr`.
The main downside of `GermanStr` compared to `SmolStr` is that heap buffers aren't shared between instances by default: this is enabled by calling `leaky_shared_clone`, which clones in O(1) time, but introduces the risks associated with manual memory management.
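To illustrate the trade-off in general terms, here is a minimal sketch of an O(1) clone that shares a heap buffer without reference counting. `RawBuf`, `shallow_copy`, and `free` are hypothetical names invented for this example; they are not the crate's API, and the signature of `leaky_shared_clone` is not assumed here.

```rust
/// Hypothetical owner of a heap buffer, standing in for a heap-backed string.
/// It has no `Drop` impl, so the buffer leaks unless `free` is called.
struct RawBuf {
    ptr: *mut u8,
    len: usize,
}

impl RawBuf {
    /// Copies the string into a fresh heap buffer (O(n)).
    fn new(s: &str) -> RawBuf {
        let boxed: Box<[u8]> = s.as_bytes().into();
        let len = boxed.len();
        RawBuf { ptr: Box::into_raw(boxed) as *mut u8, len }
    }

    /// O(1) clone: copies only the pointer and the length.
    ///
    /// # Safety
    /// Both values now alias the same buffer; the caller must make sure it
    /// is freed exactly once and never read after that.
    unsafe fn shallow_copy(&self) -> RawBuf {
        RawBuf { ptr: self.ptr, len: self.len }
    }

    fn as_str(&self) -> &str {
        // SAFETY: the bytes came from a `&str`; this is only valid while
        // the buffer has not been freed through any alias.
        unsafe {
            std::str::from_utf8_unchecked(std::slice::from_raw_parts(self.ptr, self.len))
        }
    }

    /// Releases the buffer.
    ///
    /// # Safety
    /// Must be called at most once across all aliases of the buffer.
    unsafe fn free(self) {
        let slice = std::ptr::slice_from_raw_parts_mut(self.ptr, self.len);
        // SAFETY: `ptr` and `len` came from `Box::into_raw` in `new`.
        unsafe { drop(Box::from_raw(slice)) };
    }
}
```

Forgetting to call `free` leaks the buffer, calling it on two aliases is a double free, and reading an alias after the buffer has been freed is a use-after-free. These are the kinds of risks that come with manual memory management.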
Requirements
- The crate requires a 64-bit target: `cfg(target_pointer_width = "64")`.
- The crate is compatible with `no_std`.
Benchmarks
The following plots are generated by the crate's benchmarks. In the first half of the rows, comparisons are made on random ASCII strings. As a result, the vast majority of comparisons only require comparing prefixes.
In the second half (worst cases), the strings compared are identical, so every pair of bytes has to be compared. Unless the strings are short enough to be inlined, performance is equivalent to comparing two regular `String`s.
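The difference between the two halves can be sketched with a prefix fast path for equality. This is a simplified illustration on plain byte slices, not the crate's implementation: `header` and `eq_with_fast_path` are hypothetical helpers, and the header is recomputed on each call, whereas a German string already stores its length and prefix in its first 8 bytes.

```rust
/// Packs the length and the zero-padded 4-byte prefix into a single `u64`.
/// Assumes the length fits in a `u32`.
fn header(s: &[u8]) -> u64 {
    let mut prefix = [0u8; 4];
    let n = s.len().min(4);
    prefix[..n].copy_from_slice(&s[..n]);
    ((s.len() as u32 as u64) << 32) | u32::from_be_bytes(prefix) as u64
}

/// Returns `(equal, looked_past_prefix)`.
/// The second flag is true only when bytes after the prefix had to be compared.
fn eq_with_fast_path(a: &[u8], b: &[u8]) -> (bool, bool) {
    // Fast path: one integer comparison settles most random inputs.
    if header(a) != header(b) {
        return (false, false);
    }
    // Equal headers imply equal lengths, so only the tails remain.
    let start = a.len().min(4);
    (a[start..] == b[start..], a.len() > 4)
}
```

Two random strings almost always differ in length or in their first 4 bytes, so the function returns after the header check. Identical strings always reach the tail comparison, which is the worst case measured in the second half of the rows.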
Dependencies
~210KB