2 stable releases

4.0.1	Aug 18, 2023
4.0.0	Feb 16, 2023
3.0.1	~~Jul 3, 2021~~
2.0.0	~~Jun 21, 2021~~
1.0.0	~~Jun 20, 2021~~

Apache-2.0 WITH LLVM-exception

ointers

What do you call a pointer with the high bits stolen? An ointer!

Ointers is a library for representing pointers where some bits have been stolen so that they may be used by the programmer for something else. In effect, it's a small amount of free storage.

Fully supports no_std, dependency-free, <100loc.

Bit sources

Ointers supports a handful of bit sources. It's up to you to determine when it is safe to use them.

Alignment bits (const parameter A)

If we know that a pointer's address will always be aligned to A bytes where A > 1, we can steal log2(A) bits. For an 8-byte aligned value, this provides a modest 3 bits.
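
As a rough illustration of the idea, here is a minimal sketch in plain Rust. It does not use the ointers API: `Aligned8`, `stealable_bits`, `tag_ptr` and `untag_ptr` are hypothetical names invented for this example, and the sketch works on plain integers, ignoring pointer provenance.

```rust
use std::mem::align_of;

/// Hypothetical example type with a declared alignment of 8 bytes.
#[allow(dead_code)]
#[repr(align(8))]
struct Aligned8(u64);

/// Number of low address bits that are always zero for a pointer to `T`:
/// log2 of the alignment (alignments are always powers of two).
fn stealable_bits<T>() -> u32 {
    align_of::<T>().trailing_zeros()
}

/// Stores `tag` in the low alignment bits of `ptr`'s address.
/// Panics if the tag does not fit or the pointer is misaligned.
fn tag_ptr<T>(ptr: *const T, tag: usize) -> usize {
    let mask = align_of::<T>() - 1;
    assert!(tag <= mask, "tag does not fit in the alignment bits");
    let addr = ptr as usize;
    assert_eq!(addr & mask, 0, "pointer is not aligned");
    addr | tag
}

/// Splits a tagged value back into (address, tag).
fn untag_ptr<T>(tagged: usize) -> (usize, usize) {
    let mask = align_of::<T>() - 1;
    (tagged & !mask, tagged & mask)
}
```

With an 8-byte aligned type, `stealable_bits` reports 3, so tags from 0 to 7 fit alongside the address.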

If you have values aligned to some larger width, you could get even more. It's common in parallel programming to pad data to a cache line by increasing its alignment requirements in order to eliminate false sharing. Because a cache line on amd64 or aarch64 is effectively 128 bytes thanks to prefetching, you can reclaim a respectable 7 bits.

If your data is aligned wider still, you can steal even more: an incredible 24 bits if you have 16MB-aligned data! Remember that the only alignment Rust knows about is what is declared for the type, so you must create a newtype wrapper to take full advantage of large alignment sizes.
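
A newtype wrapper of that kind might look like the sketch below. `CachePadded` and `alignment_bits` are hypothetical names chosen for illustration, not part of ointers; the only language feature relied upon is `#[repr(align(N))]`.

```rust
use std::mem::align_of;

/// Hypothetical newtype that raises the declared alignment of its
/// contents to 128 bytes, whatever the alignment of `T` itself.
#[repr(align(128))]
struct CachePadded<T>(T);

/// Bits that the declared alignment alone guarantees to be zero
/// in the address of any `T`.
fn alignment_bits<T>() -> u32 {
    align_of::<T>().trailing_zeros()
}
```

Here `alignment_bits::<u8>()` is 0, while `alignment_bits::<CachePadded<u8>>()` is 7, because the compiler now knows about the wider alignment.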

Bits	Min alignment
1	2 B
2	4 B
3	8 B
4	16 B
5	32 B
6	64 B
7	128 B
8	256 B
9	512 B
10	1 KiB
11	2 KiB
12	4 KiB
13	8 KiB
14	16 KiB
15	32 KiB
16	64 KiB
17	128 KiB
18	256 KiB
19	512 KiB
20	1 MiB
21	2 MiB
22	4 MiB
23	8 MiB
24	16 MiB

Stealing bits from alignment is relatively innocuous, but as things stand in Rust today, we can only get the compiler to check it for you in dev mode.

Sign bit (parameter S)

The most commonly used operating systems arrange memory so that the high half of virtual memory space is reserved for the kernel and the low half is given to userspace.

Looked at as a signed integer, this makes the kernel half of address space negative and the userspace half positive.

Most programs do not deal with kernel addresses, thus giving us an extra bit to play with.

We can also get this extra bit in kernel mode if we know we will not be dealing with userspace addresses. We do this by taking a pointer to a value on the stack and stealing its sign bit.

If you know you will be dealing with userspace addresses in kernel space or kernel space addresses in userspace, or you are using or implementing a kernel which does not follow this convention, you must set S to false.
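
The sign-bit convention can be sketched in plain Rust as follows. This is an illustration only, not the crate's implementation: `SIGN_BIT`, `in_high_half`, `stack_in_high_half`, `set_flag` and `take_flag` are hypothetical names, and the sketch assumes a platform where userspace addresses have the top bit clear.

```rust
/// Mask selecting the most significant ("sign") bit of an address.
const SIGN_BIT: usize = 1 << (usize::BITS - 1);

/// True if the address lies in the "negative" half of the address space.
fn in_high_half(addr: usize) -> bool {
    (addr & SIGN_BIT) != 0
}

/// Samples the address of a local variable to see which half of the
/// address space the current stack lives in.
fn stack_in_high_half() -> bool {
    let probe = 0u8;
    in_high_half(&probe as *const u8 as usize)
}

/// Stores a one-bit flag in the sign bit of a low-half address.
fn set_flag(addr: usize, flag: bool) -> usize {
    assert!(!in_high_half(addr), "address already uses the sign bit");
    if flag { addr | SIGN_BIT } else { addr }
}

/// Recovers (address, flag), assuming the original address was in the low half.
fn take_flag(packed: usize) -> (usize, bool) {
    (packed & !SIGN_BIT, (packed & SIGN_BIT) != 0)
}
```

If `stack_in_high_half()` returns true on your platform, the low-half assumption in `set_flag` and `take_flag` does not hold and the flag would have to be stored inverted, or not at all.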

The S bit is currently only tested with userspace pointers in userspace. While we think it should work more generally, we currently haven't got a test rig for other scenarios so we can't promise it does.

Unused virtual address space (V)

In 64-bit mode, address space is plentiful: nobody has 64 bits' worth of RAM and even if they did, their processor is unable to address it all. V is required to be 0 unless compiling for a 64-bit target.

The number of bits that may be safely stolen by this method depends upon the microarchitecture in question and the page table depth.

For x86-64 and aarch64, the following sizes apply:

Bits	PT depth	Support
25	3	aarch64 only, uncommon, opt-in
16	4	most common default
7	5	some Intel only, uncommon, opt-in

If you are made of money and need more than 128TB virtual address space, you should limit yourself to 7 bits for V. Likewise if you know you'll be on 3-deep page tables, you can steal a whopping 25 bits. But you're probably limited to 16 bits in general.
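
The figures in the table can be reproduced with a little arithmetic. The sketch below assumes 4 KiB pages (12 offset bits) and 9 index bits per page-table level, which is the usual arrangement on x86-64 and a common one on aarch64; `unused_va_bits` and `low_half_bytes` are hypothetical helper names, not part of ointers.

```rust
/// Bits of a 64-bit virtual address left unused for a given page-table
/// depth, assuming 12 page-offset bits and 9 index bits per level.
fn unused_va_bits(pt_depth: u32) -> u32 {
    let used = 12 + 9 * pt_depth;
    64u32.saturating_sub(used)
}

/// Size in bytes of the low (userspace) half of the address space.
/// Only meaningful for depths up to 5 under the same assumptions.
fn low_half_bytes(pt_depth: u32) -> u64 {
    let used = 12 + 9 * pt_depth;
    1u64 << (used - 1)
}
```

Under these assumptions a depth of 4 gives 48 address bits, so 16 bits are unused and the low half spans 2^47 bytes, which is the 128TB mentioned above.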

Hacking

Contributions welcome, please be nice.

The test suite requires std at present (because of rand's ThreadRng). It takes about 36 seconds on my Ryzen 3900X. If you're on a slower machine, you might want to reduce the loop iterations. It should really only take one iteration to shake most bugs out and a handful to shake out the rest. The million iterations are overkill, just to be sure, and are conveniently enabled by the incredible ease of use of rayon.

The tests are extensive. The number of configurations I'm currently testing against is limited, however:

x86-64, linux with 4PT

We would like to support more. These should be easy:

x86, linux with 4PT should be possible through github actions.
aarch64, linux with 4PT shouldn't be too hard to find access to.
32-bit arm (3PT) should be doable as well.

These will likely be harder:

aarch64, linux with 3PT is probably not widely deployed
Intel's new 5PT is of incredibly niche interest - if you want us to test it, you'll have to sponsor it because I don't have access to the hardware.

Changelog

v4.0.1

Optional pointer provenance support with the sptr feature (thanks @GoldsteinE!)

v4.0.0

Security (please upgrade as soon as possible, I've yanked old versions):

Fix possible UB when stealing alignment bits with Ox/NotNull.

Features:

Add method new_stealing for all types, which creates a value while stealing data.

v3.0.1

Add i_know_what_im_doing feature to enable stealing from V when not building for a 64-bit target.

v3.0.0

Make Ointer<T> and NotNull<T> be Clone + Copy even if T is not.

v2.0.0

New APIs:

pack() - packs a pointer into the low bits of a usize.
unpack() - reverse of pack.
asv_mask() - calculates a mask with the stolen bits set, given a, s and v.
mask() - calculates a mask with the stolen bits set, given the total number of bits.
Ointer::raw() - returns the raw data in the ointer (stolen + ptr) as a usize.
NotNull::raw() - same, but for NotNull.

Changes:

Ointer now uses a usize internally.

Copyright and License

Licensed under Apache License, Version 2.0 (https://www.apache.org/licenses/LICENSE-2.0), with LLVM Exceptions (https://spdx.org/licenses/LLVM-exception.html).

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be licensed as above, without any additional terms or conditions.
