9 unstable releases (3 breaking)

0.4.4 Apr 2, 2023
0.4.3 Apr 2, 2023
0.4.0 Mar 29, 2023
0.3.1 Mar 18, 2023
0.1.1 Feb 18, 2023

#106 in Memory management

Download history 346/week @ 2024-08-06 236/week @ 2024-08-13 306/week @ 2024-08-20 287/week @ 2024-08-27 297/week @ 2024-09-03 389/week @ 2024-09-10 376/week @ 2024-09-17 371/week @ 2024-09-24 249/week @ 2024-10-01 384/week @ 2024-10-08 190/week @ 2024-10-15 287/week @ 2024-10-22 204/week @ 2024-10-29 281/week @ 2024-11-05 272/week @ 2024-11-12 507/week @ 2024-11-19

1,315 downloads per month

MIT/Apache

130KB
2.5K SLoC

Reference counted vectors.

Overview

This crate provides the following two types:

  • SharedVector<T, A>/AtomicSharedVector<T, A>, an immutable reference counted vector (with an atomically reference counted variant).
  • Vector<T, A>, an unique vector type with an API similar to std::Vec<T>.

Internally, shared vectors are a little different from the standard Vec<T>. SharedVector and AtomicSharedVector hold a single pointer to a buffer containing:

  • A header storing the length, capacity and allocator,
  • the contiguous sequence of items of type T.

Vector's representation is closer to Vec<T>: it stores the length and capacity information inline and only writes them into the header if/when converting into a shared vector. The allocated buffer does leave room for the header so that converting to and from SharedVector does not require reallocating.

This allows very cheap conversion between the two:

  • shared to unique: a new allocation is made only if there are multiple handles to the same buffer (the reference count is greather than one).
  • unique to shared: always fast since unique buffers are guaranteed to be sole owners of their buffer.

The generic parameter A is the allocator. This crate uses allocator-api2 to polyfill the unstable allocator_api feature. This makes it possible to use custom allocators in stable rust while the feature is still nightly-only.

Use cases

Arc<Vec<T>> without the indirection.

A vector can be be built using a Vec-style API, and then made immutable and reference counted for various use case (easy multi-threading or simply shared ownership).

Using the standard library one might be tempted to first build a Vec<T> and share it via Arc<Vec<T>>. This is a fine approach at the cost of an extra pointer indirection that could be avoided in principle. Another approach is to share it as an Arc<[T]> which removes the indirection at the cost of the need to reallocate and copy the contents.

Using this crate there is no extra indirection in the resulting shared vector nor any copy between the unique and shared versions.

use shared_vector::Vector;
let mut builder = Vector::new();
builder.push(1u32);
builder.push(2);
builder.push(3);
// Make it reference counted, no allocation.
let mut shared = builder.into_shared();
// We can now create new references
let shared_2 = shared.new_ref();
let shared_3 = shared.new_ref();

Immutable data structures

SharedVector and AtomicSharedVector behave like simple immutable data structures and are good building blocks for creating more complicated ones.

use shared_vector::{SharedVector, rc_vector};
let mut a = rc_vector![1u32, 2, 3];

// `new_ref` (you can also use `clone`) creates a second reference to the same buffer.
// future mutations of `a` won't affect `b`.
let mut b = a.new_ref();
// Because both a and b point to the same buffer, the next mutation allocates a new
// copy under the hood.
a.push(4);
// Now that a and b are unique pointers to their own buffers, no allocation happens
// when pushing to them.
a.push(5);
b.push(6);

assert_eq!(a.as_slice(), &[1, 2, 3, 4, 5]);
assert_eq!(b.as_slice(), &[1, 2, 3, 6]);

Note that SharedVector is not a RRB vector implementation.

ChunkVector

As a very light experiment towards making custom immutable data structures on top of shared vectors, there is a very simple chunked vector implementation in the examples folder.

Just like the vector types, "chunk vector" comes in three flavors: ChunkVector, SharedChunkVector, AtomicSharedChunkVector.

They are internally represented as a reference counted table of reference counted memory blocks (or "chunks"). In the illustration above, two chunked vectors point to the same table, while another points to a different table but still shares some of the storage chunks. In practice the chunked vector types are very little more than SharedVector<SharedVector<T>>

Limitiations

  • These vector types can hold at most u32::MAX elements.

License

Licensed under either of:

Contributions

See the contribution guidelines.

Dependencies

~265KB