17 releases
Version | Released |
---|---|
0.2.15 | Sep 11, 2023 |
0.2.14 | Mar 8, 2023 |
0.2.13 | Dec 27, 2022 |
0.2.11 | Nov 4, 2022 |
0.2.3 | Nov 18, 2020 |
#123 in Memory management
22,282 downloads per month
Used in 31 crates
(13 directly)
45KB
973 lines
Heap data estimator.
The datasize crate allows estimating the amount of heap memory used by a value. It does so by providing or deriving an implementation of the DataSize trait, which knows how to calculate the size for many std types and primitives.
The aim is to get a reasonable approximation of memory usage, especially with variably sized types like Vecs. While it is acceptable to be a few bytes off in some cases, any user should be able to easily tell whether their memory is growing linearly or logarithmically by glancing at the reported numbers.
The crate does not take alignment or memory layouts into account, nor unusual behavior or optimizations of allocators. It depends entirely on the data inside the type, hence the name of the crate.
General usage
For any type that implements DataSize, the data_size convenience function can be used to guess the size of its heap allocation:
use datasize::data_size;
let data: Vec<u64> = vec![1, 2, 3];
#[cfg(feature = "std")]
assert_eq!(data_size(&data), 24);
Types implementing the trait also provide two additional constants, IS_DYNAMIC and STATIC_HEAP_SIZE.
IS_DYNAMIC indicates whether a value's size can change over time:
use datasize::DataSize;
#[cfg(feature = "std")]
// A `Vec` of any kind may have elements added or removed, so it changes size.
assert!(Vec::<u64>::IS_DYNAMIC);
// The elements of type `u64` in it are not dynamic. This allows the implementation to
// simply estimate the size as number_of_elements * size_of::<u64>.
assert!(!u64::IS_DYNAMIC);
Additionally, STATIC_HEAP_SIZE indicates the amount of heap memory a type will always use. A good example is a Box<u64> -- it will always use 8 bytes of heap memory, but not change in size:
use datasize::DataSize;
#[cfg(feature = "std")]
assert_eq!(Box::<u64>::STATIC_HEAP_SIZE, 8);
#[cfg(feature = "std")]
assert!(!Box::<u64>::IS_DYNAMIC);
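To make the role of the two constants more concrete, here is a minimal sketch built on a hypothetical stand-in trait, HeapEstimate, that mirrors the shape described above. It is not the crate's actual implementation; the trait name and the impls shown are assumptions made purely for illustration.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a trait shaped like `DataSize`.
trait HeapEstimate {
    /// Whether the heap usage of a value can change over its lifetime.
    const IS_DYNAMIC: bool;
    /// Heap bytes that every value of this type always occupies.
    const STATIC_HEAP_SIZE: usize;
    /// Estimated heap bytes owned by this particular value.
    fn estimate_heap_size(&self) -> usize;
}

impl HeapEstimate for u64 {
    // A plain integer owns no heap memory and never will.
    const IS_DYNAMIC: bool = false;
    const STATIC_HEAP_SIZE: usize = 0;
    fn estimate_heap_size(&self) -> usize {
        0
    }
}

impl<T: HeapEstimate> HeapEstimate for Box<T> {
    // A box is only dynamic if its contents are.
    const IS_DYNAMIC: bool = T::IS_DYNAMIC;
    // The boxed value itself always lives on the heap.
    const STATIC_HEAP_SIZE: usize = size_of::<T>() + T::STATIC_HEAP_SIZE;
    fn estimate_heap_size(&self) -> usize {
        size_of::<T>() + (**self).estimate_heap_size()
    }
}
```

With these stand-in impls, a boxed u64 reports 8 bytes both as its static heap size and as its estimate, and is not dynamic, which matches the assertions shown for the real trait.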
Overriding derived data size calculation for single fields
On structs (but not enums!) the heap size calculation can be overridden for single fields by annotating them with #[data_size(with = ...)] and pointing to a Fn(&T) -> usize function. This is useful when dealing with third-party crates whose types do not implement DataSize:
use datasize::DataSize;
// Let's pretend this type is from a foreign crate.
struct ThirdPartyType;
fn estimate_third_party_type(value: &Vec<ThirdPartyType>) -> usize {
    // We assume every item is 512 bytes in heap size.
    value.len() * 512
}
#[cfg(feature = "std")]
#[derive(DataSize)]
struct MyStruct {
    items: Vec<u32>,
    #[data_size(with = estimate_third_party_type)]
    other_stuff: Vec<ThirdPartyType>,
}
This automatically marks the whole struct as always dynamic, so the custom estimation function is called every time MyStruct is sized.
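An estimation function does not have to rely on a fixed per-item guess. As a hedged sketch, assuming the foreign type exposes some way to inspect the buffers it owns (the ThirdPartyType below, with its public payload field, is hypothetical), a function taking a reference and returning usize could add up the element storage and each item's payload:

```rust
use std::mem::size_of;

/// Hypothetical foreign type; assume it owns a byte buffer we can inspect.
pub struct ThirdPartyType {
    pub payload: Vec<u8>,
}

/// Estimates the heap bytes owned by a vector of `ThirdPartyType`.
/// Uses `len()` rather than `capacity()`, so spare capacity is not counted.
pub fn estimate_third_party_items(items: &Vec<ThirdPartyType>) -> usize {
    // Storage for the elements themselves inside the vector's buffer.
    let element_storage = items.len() * size_of::<ThirdPartyType>();
    // Bytes owned by each element's payload.
    let payloads: usize = items.iter().map(|item| item.payload.len()).sum();
    element_storage + payloads
}
```

A function of this shape could then be named in #[data_size(with = ...)] in place of the fixed 512-byte guess. Switching to capacity() would track allocations more closely, at the cost of less predictable numbers.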
Implementing DataSize for custom types
The DataSize trait can be implemented for custom types manually:
use datasize::{data_size, DataSize};
struct MyType {
    items: Vec<i64>,
    flag: bool,
    counter: Box<u64>,
}
#[cfg(feature = "std")]
impl DataSize for MyType {
    // `MyType` contains a `Vec`, so `IS_DYNAMIC` is set to true.
    const IS_DYNAMIC: bool = true;
    // The only always present heap item is the `counter` value, which is 8 bytes.
    const STATIC_HEAP_SIZE: usize = 8;
    #[inline]
    fn estimate_heap_size(&self) -> usize {
        // We can be lazy here and delegate to all the existing implementations:
        data_size(&self.items) + data_size(&self.flag) + data_size(&self.counter)
    }
}
let my_data = MyType {
    items: vec![1, 2, 3],
    flag: true,
    counter: Box::new(42),
};
#[cfg(feature = "std")]
// Three i64 and one u64 on the heap sum up to 32 bytes:
assert_eq!(data_size(&my_data), 32);
Since implementing this for struct types is cumbersome and repetitive, the crate provides a DataSize macro for convenience:
use datasize::DataSize;
// Equivalent to the manual implementation above:
#[cfg(feature = "std")]
#[derive(DataSize)]
struct MyType {
    items: Vec<i64>,
    flag: bool,
    counter: Box<u64>,
}
See the DataSize macro documentation in the datasize_derive crate for details.
Performance considerations
Determining the full size of data can be quite expensive, especially if multiple nested levels of dynamic types are used. The crate uses IS_DYNAMIC and STATIC_HEAP_SIZE to optimize when it can, so in many cases not every element of a vector needs to be checked individually.
However, if the contained types are dynamic, every element must (and will) be checked, so keep this in mind when performance is an issue.
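The cost difference can be illustrated with two plain functions. This is only a sketch of the idea, not the crate's code; the function names are hypothetical and the second function returns a visit count purely to make the extra work visible.

```rust
use std::mem::size_of;

/// Estimate for a vector of fixed-size elements: a single multiplication,
/// no element is inspected.
fn estimate_flat(values: &[u64]) -> usize {
    values.len() * size_of::<u64>()
}

/// Estimate for a vector of vectors: the outer buffer plus one visit per
/// inner vector, since each may have a different length.
/// Returns (estimated bytes, number of inner vectors visited).
fn estimate_nested(rows: &[Vec<u64>]) -> (usize, usize) {
    let outer = rows.len() * size_of::<Vec<u64>>();
    let mut visited = 0;
    let mut inner = 0;
    for row in rows {
        visited += 1;
        inner += estimate_flat(row);
    }
    (outer + inner, visited)
}
```

The flat case is constant time regardless of length, while the nested case grows with the number of inner vectors, and each additional level of nesting multiplies the work again.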
Handling references, Arcs and similar types
Any reference will be counted as having a data size of 0, as it does not own the value. There are some special reference-like types like Arc, which are discussed below.
Arc and Rc
Currently Arcs are not supported. A planned development is to allow users to mark an instance of an Arc as "primary" and have its heap memory usage counted, but currently this is not implemented.
Any Arc will be estimated to have a heap size of 0, to avoid cycles resulting in infinite loops.
The Rc type is handled in the same manner.
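Until such support exists, one possible workaround is a custom estimation function that attributes a share of the shared allocation to each handle. The sketch below is an illustration under stated assumptions, not part of the crate: estimate_shared_bytes is a hypothetical helper, and dividing by Arc::strong_count is only one of several reasonable policies.

```rust
use std::sync::Arc;

/// Hypothetical estimator: attributes an equal share of the shared buffer
/// to every strong handle, so the shares sum to roughly the buffer length.
fn estimate_shared_bytes(shared: &Arc<Vec<u8>>) -> usize {
    // `strong_count` is at least 1 while `shared` is alive.
    shared.len() / Arc::strong_count(shared)
}
```

A function like this could in principle be attached to an Arc field through #[data_size(with = ...)]. Note that the strong count may change concurrently, so the result is a snapshot, and because the helper does not follow pointers inside the buffer it cannot loop on cycles.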
Additional types
Some additional types from external crates are available behind feature flags.
fake_clock-types: Support for the fake_instant::FakeClock type.
futures-types: Some types from the futures crate.
smallvec-types: Support for the smallvec::SmallVec type.
tokio-types: Some types from the tokio crate.
no_std support
Although slightly paradoxical due to the fact that without std or at least alloc there won't be any heap in most cases, the crate supports a no_std environment. Disabling the "std" feature (by disabling default features) will produce a version of the crate that does not rely on the standard library. This can be used to derive the DataSize trait for types without boilerplate, even though their heap size will usually be 0.
Arrays and const generics
By default, this crate requires at least Rust version 1.51.0, in order to implement DataSize for [T; N] arrays generically. This implementation is provided by the "const-generics" feature flag, which is enabled by default. In order to use an older Rust version, you can specify default-features = false and features = ["std"] for datasize in your Cargo.toml.
When the const-generics feature flag is disabled, a DataSize implementation will be provided for arrays of small sizes, and for some larger sizes related to powers of 2.
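To show why const generics matter here, the following sketch implements a hypothetical stand-in trait, HeapEstimate, for arrays of any length with a single impl. The trait and the Vec<u8> impl are assumptions for illustration and do not reproduce the crate's own code.

```rust
/// Hypothetical stand-in trait, reduced to a single method.
trait HeapEstimate {
    fn estimate_heap_size(&self) -> usize;
}

impl HeapEstimate for Vec<u8> {
    fn estimate_heap_size(&self) -> usize {
        // One byte per element; spare capacity is ignored in this sketch.
        self.len()
    }
}

/// One impl covers arrays of every length `N` (needs const generics,
/// stable since Rust 1.51).
impl<T: HeapEstimate, const N: usize> HeapEstimate for [T; N] {
    fn estimate_heap_size(&self) -> usize {
        // The array itself is stored inline; only its elements may own heap data.
        self.iter().map(|item| item.estimate_heap_size()).sum()
    }
}
```

Without const generics, an equivalent would need one impl per array length, typically generated by a macro, which is why only a fixed set of sizes can be covered in that configuration.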
Known issues
The derive macro currently does not support generic structs with inline type bounds, e.g.
struct Foo<T: Copy> { ... }
This can be worked around by using an equivalent where clause:
struct Foo<T>
where T: Copy
{ ... }
Dependencies
~1.3–2.6MB
~53K SLoC