
nightly no-std generic-str

Annoyed that Rust has two string types? Well, it doesn't any more.

4 releases

0.3.1 Jan 10, 2022
0.3.0 Jan 10, 2022
0.2.2 Nov 11, 2021
0.2.1 Nov 10, 2021


MIT license


generic-str


The one true string type in Rust!

This project is intended as a proof of concept for an idea I had a few months back. It uses a lot of unsafe code and requires a nightly toolchain. Tested on cargo 1.58.0-nightly (2e2a16e98 2021-11-08).

Explanation

Rust notoriously has a few different string types. The two main contenders are:

  • &str which is a 'string reference'. It's non-resizable and its mutability is limited.
  • String which is an 'owned string'. It's resizable and can be mutated simply.

It turns out that these two strings aren't too different. str is just a string that's backed by a [u8] byte slice. Similarly, String is just a string that's backed by a Vec<u8>.

So why are they really different types? Couldn't we theoretically have something like

type str = StringBase<[u8]>;
type String = StringBase<Vec<u8>>;

So that's what this is. It's mostly at feature parity with the standard library strings, and a lot of the standard trait implementations are there too.
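To make the idea concrete, here is a minimal sketch of one string type that is generic over its byte storage. It is not the crate's actual implementation: the names `Str`, `OwnedStr`, `wrap`, `from_std` and `as_std` are hypothetical, chosen to avoid clashing with the standard `str` and `String`, and it runs on stable Rust.

```rust
use std::ops::Deref;

/// Hypothetical sketch: a single string type, generic over its byte storage.
/// `repr(transparent)` guarantees the same layout as the storage itself.
#[repr(transparent)]
pub struct StringBase<S: ?Sized> {
    storage: S,
}

/// Borrowed, unsized form (plays the role of `str`).
pub type Str = StringBase<[u8]>;
/// Owned, growable form (plays the role of `String`).
pub type OwnedStr = StringBase<Vec<u8>>;

/// Reinterpret a byte slice as a `Str`. Callers must pass valid UTF-8.
fn wrap(bytes: &[u8]) -> &Str {
    // SAFETY: `Str` is `repr(transparent)` over `[u8]`, so both pointers
    // share the same layout and slice-length metadata.
    unsafe { &*(bytes as *const [u8] as *const Str) }
}

impl Str {
    pub fn from_std(s: &str) -> &Str {
        wrap(s.as_bytes())
    }

    pub fn as_std(&self) -> &str {
        // The storage only ever receives whole `&str` values, so this check
        // cannot fail; it is kept to avoid a second unsafe block.
        std::str::from_utf8(&self.storage).expect("storage is valid UTF-8")
    }

    pub fn len(&self) -> usize {
        self.storage.len()
    }
}

impl OwnedStr {
    pub fn new() -> Self {
        StringBase { storage: Vec::new() }
    }

    pub fn push_str(&mut self, s: &str) {
        self.storage.extend_from_slice(s.as_bytes());
    }
}

/// The owned form derefs to the borrowed form, just as `String` derefs to `str`.
impl Deref for OwnedStr {
    type Target = Str;

    fn deref(&self) -> &Str {
        wrap(&self.storage)
    }
}
```

With this in place, read-only methods such as `len` and `as_std` are written once on `Str` and are reachable from `OwnedStr` through `Deref`.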

generic-vec

There was some discussion about whether Allocator was the best abstraction for customising Vec storage. I was very intrigued by this concept, and in this project I made use of an implementation that RustyYato contributed in the thread.

So, now I have

use generic_vec::{GenericVec, raw::Heap};
pub type String<A = Global> = OwnedString<Box<[MaybeUninit<u8>], A>>;
pub type OwnedString<S> = StringBase<GenericVec<u8, S>>;

This might look more complicated, and it is. Implementation-wise, GenericVec<u8, Heap<u8, A>> is supposed to be identical to Vec<u8>, so it should be functionally the same as before.

But the added power of this storage-backed system allows for statically allocated but resizable† strings!

pub type ArrayString<const N: usize> = OwnedString<[MaybeUninit<u8>; N]>;

And I get to re-use all of the code written for String, because every string manipulation that needs resizability is implemented on the base OwnedString type.

†: Obviously, they cannot be resized beyond the pre-defined N value, and attempting to push past it will panic.
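The code reuse described above can be sketched on stable Rust as follows. This is an illustration under simplifying assumptions, not generic-vec's real API: the `Storage` trait and its `empty`, `ensure_capacity`, `bytes` and `bytes_mut` methods are hypothetical, and the buffers are zero-initialised `u8` rather than `MaybeUninit<u8>` so that no unsafe code is needed.

```rust
/// Hypothetical storage abstraction (not generic-vec's actual trait).
pub trait Storage {
    /// Create an empty backing buffer.
    fn empty() -> Self;
    /// Make room for at least `cap` bytes, panicking if that is impossible.
    fn ensure_capacity(&mut self, cap: usize);
    fn bytes(&self) -> &[u8];
    fn bytes_mut(&mut self) -> &mut [u8];
}

/// Heap-backed storage: grows on demand.
impl Storage for Vec<u8> {
    fn empty() -> Self {
        Vec::new()
    }
    fn ensure_capacity(&mut self, cap: usize) {
        if self.len() < cap {
            self.resize(cap, 0);
        }
    }
    fn bytes(&self) -> &[u8] {
        self
    }
    fn bytes_mut(&mut self) -> &mut [u8] {
        self
    }
}

/// Array-backed storage: fixed capacity `N`, no heap allocation.
impl<const N: usize> Storage for [u8; N] {
    fn empty() -> Self {
        [0; N]
    }
    fn ensure_capacity(&mut self, cap: usize) {
        assert!(cap <= N, "capacity of {} bytes exceeded", N);
    }
    fn bytes(&self) -> &[u8] {
        self
    }
    fn bytes_mut(&mut self) -> &mut [u8] {
        self
    }
}

/// One owned string type; `len` tracks how much of the storage is in use.
pub struct OwnedString<S> {
    storage: S,
    len: usize,
}

pub type HeapString = OwnedString<Vec<u8>>;
pub type ArrayString<const N: usize> = OwnedString<[u8; N]>;

impl<S: Storage> OwnedString<S> {
    pub fn new() -> Self {
        OwnedString { storage: S::empty(), len: 0 }
    }

    /// Written once, shared by every storage backend.
    pub fn push_str(&mut self, s: &str) {
        let end = self.len + s.len();
        self.storage.ensure_capacity(end);
        self.storage.bytes_mut()[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
    }

    pub fn as_str(&self) -> &str {
        // Only whole `&str` values are ever copied in, so this cannot fail.
        std::str::from_utf8(&self.storage.bytes()[..self.len])
            .expect("storage is valid UTF-8")
    }
}
```

Here `push_str` is implemented a single time on `OwnedString<S>`. A `HeapString` grows as needed, while an `ArrayString<N>` panics in `ensure_capacity` once a push would exceed `N` bytes.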
