#enums #unsized #dst #trait-object

unsized_enum

Unsized enum implementation

2 releases

0.0.2 Aug 7, 2020
0.0.1 Aug 6, 2020

MIT/Apache

Unsized enums

Rust does not support unsized (?Sized) variants in an enum. This crate provides an unsized enum with one unsized variant and one sized variant, returned boxed along with a common base structure. The enum may be read and modified, including switching variants, even through a trait object reference.

Documentation

See the crate documentation.

License

This project is licensed under either the Apache License version 2 or the MIT license, at your option. (See LICENSE-APACHE and LICENSE-MIT.)

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.


lib.rs:

Rust unsized enum implementation

In stable Rust as of mid-2020, support for ?Sized types (also called unsized or "dynamically sized" types, DSTs; trait objects are the most common example) is missing in several places:

  • Rust's built-in enums don't support ?Sized variants

  • Option doesn't support ?Sized (because it is an enum)

  • Rust's union type doesn't support ?Sized

  • MaybeUninit also doesn't support ?Sized (because it is currently built on top of union)

So if unsized types are required within an enum, some other approach must be taken. Currently this crate implements a single enum (UnsizedEnum) with one unsized variant and one sized variant; values of the enum are always returned boxed. (This approach can only support a single unsized variant, although it could be extended to provide additional sized variants.)
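The building block behind this kind of approach can be sketched in a few lines: stable Rust does allow a struct whose last field is ?Sized, and a Box of such a struct can be coerced to its unsized form. The names below (WithHeader, tag, boxed_unsized) are hypothetical and are not this crate's API; this is only an illustration of a header followed by an unsized tail.

```rust
use std::fmt::Debug;
use std::mem::size_of_val;

// Hypothetical header-plus-tail struct. The unsized field must come last.
#[repr(C)]
struct WithHeader<T: ?Sized> {
    tag: u8, // stand-in for a discriminant or other header data
    val: T,  // may be unsized, e.g. `dyn Debug` or `[u8]`
}

// Built with a sized tail, then coerced to the unsized form on return.
fn boxed_unsized() -> Box<WithHeader<dyn Debug>> {
    Box::new(WithHeader { tag: 0, val: [1u8, 2, 3] })
}

// The total size is only known at run time, through the fat pointer.
fn dynamic_size(b: &WithHeader<dyn Debug>) -> usize {
    size_of_val(b)
}
```

Because the size is carried in the pointer metadata rather than in the type, such a value can only live behind a pointer, which is consistent with the enum always being returned boxed.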

The enum may be read and modified, including switching variants, even through a trait object reference.

Safety and soundness discussion

This crate is intended to be sound. If unsoundness can be demonstrated, it will be fixed (if possible), or else the API will be marked as unsafe until a safe way can be found. However, right now the theoretical (rather than practical) soundness of the crate depends on aspects of Rust's safety contract which are as yet undecided. If that makes you nervous, don't use this crate for the time being.

The boxed memory is accessed with two structures which represent two views of that memory: UnsizedEnum for the header plus the V0 (unsized) variant, and UnsizedEnum_V1 for the header plus the V1 (sized) variant. #[repr(C)] is used to enforce the order of members and to ensure that the header part of the UnsizedEnum structure is compatible with UnsizedEnum_V1.
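As a rough illustration of why #[repr(C)] matters here, the sketch below defines two hypothetical view structs with the same leading fields and measures where a header field lands in each. ViewV0, ViewV1, is_v1 and extra are invented names for this example; the crate's real field layout may differ.

```rust
// Hypothetical stand-ins for the two views; field names are illustrative only.
#[repr(C)]
struct ViewV0<V0: ?Sized> {
    is_v1: bool,
    extra: u32,
    val: V0, // unsized payload
}

#[repr(C)]
struct ViewV1<V1> {
    is_v1: bool,
    extra: u32,
    val: V1, // sized payload
}

// Address of any value, sized or not, as an integer.
fn addr<T: ?Sized>(r: &T) -> usize {
    r as *const T as *const u8 as usize
}

// Offset of the `extra` header field within each view.
fn extra_offsets() -> (usize, usize) {
    let v0: Box<ViewV0<[u8]>> = Box::new(ViewV0 { is_v1: false, extra: 7, val: [0u8; 5] });
    let v1 = ViewV1 { is_v1: true, extra: 7, val: 0u64 };
    (addr(&v0.extra) - addr(&*v0), addr(&v1.extra) - addr(&v1))
}
```

With #[repr(C)] the header fields are laid out in declaration order, so both views place extra at the same offset. Without it, the compiler would be free to reorder fields differently in each struct. The payload offset is a separate matter, since it also depends on the alignment of the payload type.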

It's necessary to include the V0 instance directly in the UnsizedEnum structure, because its cleanup must be handled through the vtable. If no V0 value is included in UnsizedEnum, it seems that the Drop handler doesn't receive a fat pointer, and so has no access to the vtable. However, when a V1 variant is stored, the V0 value included in UnsizedEnum must not be dropped, because it would be invalid data for the V0 type. So the V0 value is wrapped in ManuallyDrop, which lets us skip dropping that invalid V0 data in the V1 variant case. (MaybeUninit would be better but it doesn't support ?Sized yet.)
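The conditional-drop idea can be sketched as follows. This is a simplified illustration, not the crate's code: Slot, live, Noisy and drops_when are hypothetical names, and the skipped case here simply leaks a valid value rather than holding a different variant's data in the same memory.

```rust
use std::any::Any;
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::rc::Rc;

// Hypothetical container: `live` says whether `val` should be dropped.
struct Slot<V0: ?Sized> {
    live: bool,
    val: ManuallyDrop<V0>,
}

impl<V0: ?Sized> Drop for Slot<V0> {
    fn drop(&mut self) {
        if self.live {
            // `self.val` is reached through a fat pointer, so this call
            // dispatches to the concrete type's destructor via the vtable.
            // Safety: `val` is valid when `live` is set and is never used again.
            unsafe { ManuallyDrop::drop(&mut self.val) };
        }
        // Otherwise the contents are deliberately left alone.
    }
}

// Test helper that counts how many times its destructor runs.
struct Noisy(Rc<Cell<u32>>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Drops a type-erased Slot and reports how many destructors ran.
fn drops_when(live: bool) -> u32 {
    let counter = Rc::new(Cell::new(0));
    let sized = Box::new(Slot { live, val: ManuallyDrop::new(Noisy(counter.clone())) });
    let erased: Box<Slot<dyn Any>> = sized; // unsizing coercion
    drop(erased);
    counter.get()
}
```

The destructor of the erased value runs only when the flag is set, even though the Drop implementation never names the concrete type.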

So strictly speaking in the case of storing the V1 variant, because the UnsizedEnum struct contains val: ManuallyDrop<V0>, we're working with references to an invalid UnsizedEnum (invalid in the val part). However we never "produce" an invalid UnsizedEnum value. V1 values are produced using UnsizedEnum_V1. The only code that is exposed to the entire invalid UnsizedEnum is the compiler-generated drop code. (Whether passing around a reference to invalid data is theoretically sound or not is undecided, but it seems like the consensus is leaning towards it being sound.)

It's important that the niche-filling optimisation doesn't try to use any unused bit-patterns in the V0 value to store data, because doing so could overwrite the value for the V1 variant. However, since this implementation is in total control of the structure and the structure is returned boxed, there is no way for a crate user to cause the ManuallyDrop<V0> value to be wrapped in an enum, so there should be no case where niche-filling would try to make use of the memory within the ManuallyDrop. The compiler-generated drop code should therefore have no reason to touch the V0 variant memory in the V1 case, and our Drop implementation is free to skip dropping the (invalid) V0 value and drop the overlaid V1 value instead.
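To make the niche-filling concern concrete, the sketch below compares the size of Option around a bare bool, a ManuallyDrop<bool> and a MaybeUninit<bool>. The function name option_sizes is made up for this example. ManuallyDrop keeps the same bit validity as the wrapped type, so an enclosing enum may still store its discriminant in the payload's unused bit-patterns, whereas MaybeUninit rules that out.

```rust
use std::mem::{size_of, ManuallyDrop, MaybeUninit};

// Sizes of Option around three wrappers of the same payload type.
fn option_sizes() -> (usize, usize, usize) {
    (
        size_of::<Option<bool>>(),               // niche in bool is used
        size_of::<Option<ManuallyDrop<bool>>>(), // niche is still visible
        size_of::<Option<MaybeUninit<bool>>>(),  // no niche, needs a tag byte
    )
}
```

The first two sizes match, showing that wrapping in ManuallyDrop alone does not hide the payload's niche from an enclosing enum. That is why it matters that users cannot wrap the ManuallyDrop<V0> field in an enum of their own.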

If Rust gains support for ?Sized in more places, especially MaybeUninit, this implementation will be improved to use those features. That would also resolve the question about the theoretical soundness of holding a reference to invalid data.

In addition, set_v0 needs to compare vtable pointers, which depends on the layout of fat pointers. This dependency is much lower-risk: if the layout changes in the compiler, it will fail immediately and very obviously in testing. Several other crates already depend on this layout as well.
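A minimal sketch of reading the vtable word out of a trait-object reference is shown below. It assumes the fat pointer is laid out as a data pointer followed by a vtable pointer, which the language does not guarantee. Shape, Square, raw_parts and same_vtable are hypothetical names, and this is not necessarily how set_v0 does it.

```rust
use std::mem;

trait Shape {
    fn area(&self) -> u32;
}

struct Square(u32);

impl Shape for Square {
    fn area(&self) -> u32 {
        self.0 * self.0
    }
}

// Splits a trait-object reference into (data address, vtable address).
// ASSUMPTION: the fat pointer is two words, data first and vtable second.
fn raw_parts(r: &dyn Shape) -> (usize, usize) {
    let words: [*const (); 2] = unsafe { mem::transmute(r) };
    (words[0] as usize, words[1] as usize)
}

// Compares only the vtable words of two trait-object references.
fn same_vtable(a: &dyn Shape, b: &dyn Shape) -> bool {
    raw_parts(a).1 == raw_parts(b).1
}
```

If the compiler ever changed the word order, a check that the first word equals the address of the underlying value would fail straight away, which is the kind of obvious test failure described above.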

No runtime deps