10 releases

0.3.1	Nov 15, 2024
0.3.0	Oct 10, 2023
0.2.3	Jan 3, 2023
0.2.2	Dec 30, 2022
0.1.3	Dec 30, 2022

MIT/Apache

alloc-track

This project provides real-time memory profiling, broken down per thread and per backtrace.

Use Cases

Diagnosing memory fragmentation (in the form of volatile allocations)
Diagnosing memory leaks
Profiling memory consumption of individual components

Usage

Add the following dependency to your project: alloc-track = "0.2.3"

Set a global allocator wrapped by alloc_track::AllocTrack

Default Rust allocator:


use alloc_track::{AllocTrack, BacktraceMode};
use std::alloc::System;

#[global_allocator]
static GLOBAL_ALLOC: AllocTrack<System> = AllocTrack::new(System, BacktraceMode::Short);

Jemalloc allocator (via the jemallocator crate):


use alloc_track::{AllocTrack, BacktraceMode};
use jemallocator::Jemalloc;

#[global_allocator]
static GLOBAL_ALLOC: AllocTrack<Jemalloc> = AllocTrack::new(Jemalloc, BacktraceMode::Short);

Call alloc_track::thread_report() or alloc_track::backtrace_report() to generate a report. Note that backtrace_report requires both the backtrace feature and either BacktraceMode::Short or BacktraceMode::Full to be passed to AllocTrack::new.
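To illustrate what "wrapping" the global allocator means, here is a minimal std-only sketch. It is not alloc-track's implementation: the CountingAlloc type and its methods are hypothetical, and it only keeps process-wide byte counters rather than per-thread or per-backtrace data. The idea it shows is the same, though: every allocation passes through the wrapper, which records it and then delegates to the inner allocator.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper for illustration only; not part of alloc-track.
pub struct CountingAlloc<A> {
    inner: A,
    allocated: AtomicUsize,
    freed: AtomicUsize,
}

impl<A> CountingAlloc<A> {
    pub const fn new(inner: A) -> Self {
        Self {
            inner,
            allocated: AtomicUsize::new(0),
            freed: AtomicUsize::new(0),
        }
    }

    /// Total bytes handed out since process start.
    pub fn allocated(&self) -> usize {
        self.allocated.load(Ordering::Relaxed)
    }

    /// Total bytes returned since process start.
    pub fn freed(&self) -> usize {
        self.freed.load(Ordering::Relaxed)
    }
}

unsafe impl<A: GlobalAlloc> GlobalAlloc for CountingAlloc<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Delegate to the wrapped allocator, then record successful allocations.
        let ptr = unsafe { self.inner.alloc(layout) };
        if !ptr.is_null() {
            self.allocated.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { self.inner.dealloc(ptr, layout) };
        self.freed.fetch_add(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc<System> = CountingAlloc::new(System);

fn main() {
    let before = GLOBAL.allocated();
    // black_box keeps the optimizer from eliding the allocation.
    let buf = std::hint::black_box(vec![0u8; 4096]);
    let grown = GLOBAL.allocated() - before;
    println!("allocated at least {} bytes for a {}-byte Vec", grown, buf.len());
}
```

The counters use atomics rather than thread-local storage because code inside an allocator must avoid anything that could itself allocate.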

Performance

In BacktraceMode::None, or without the backtrace feature enabled, thread memory profiling is reasonably performant. It is still not something you would want to run in a production environment, so feature-gating is a good idea.
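One way to feature-gate the profiler is sketched below. It assumes a Cargo feature named profiling (a hypothetical name) that turns on an optional alloc-track dependency, and it assumes the value returned by thread_report() implements Display. Treat it as a starting point rather than the recommended setup.

```rust
// Compiled only when the (hypothetical) `profiling` feature is enabled,
// so release builds without the feature keep the plain system allocator.
#[cfg(feature = "profiling")]
mod profiling {
    use alloc_track::{AllocTrack, BacktraceMode};
    use std::alloc::System;

    #[global_allocator]
    static GLOBAL_ALLOC: AllocTrack<System> = AllocTrack::new(System, BacktraceMode::None);
}

/// Returns a formatted thread report when profiling is compiled in.
#[cfg(feature = "profiling")]
pub fn memory_report() -> Option<String> {
    // Assumption: the report type implements `Display`.
    Some(alloc_track::thread_report().to_string())
}

/// Without the feature there is nothing to report.
#[cfg(not(feature = "profiling"))]
pub fn memory_report() -> Option<String> {
    None
}

fn main() {
    match memory_report() {
        Some(report) => println!("THREADS\n{report}"),
        None => println!("profiling is not compiled in"),
    }
}
```

Callers use memory_report() unconditionally, so the rest of the code base does not need its own cfg attributes.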

When backtrace logging is enabled, performance degrades substantially, depending on the number of allocations and the stack depth. Symbol resolution is delayed, but a lot of allocations still means a lot of backtraces. backtrace_report takes a single argument, which is a filter for individual backtrace records. Filtering out uninteresting backtraces makes the report easier to read and substantially faster to generate, as symbol resolution can be skipped for the records that are filtered out. See examples/example.rs for an example.
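The reason filtering speeds up report generation can be sketched with std-only code. The Metric and Record types and the resolve and report functions below are hypothetical stand-ins, not alloc-track's API; the point is only that the filter runs on cheap counters first, and the expensive symbol resolution happens solely for records that pass.

```rust
/// Hypothetical per-backtrace counters; not alloc-track's real types.
#[derive(Debug, Clone, Copy)]
pub struct Metric {
    pub allocated: u64,
    pub freed: u64,
}

impl Metric {
    /// Bytes still live for this backtrace.
    pub fn in_use(&self) -> u64 {
        self.allocated.saturating_sub(self.freed)
    }
}

/// A captured backtrace as raw, unresolved frame addresses plus its counters.
pub struct Record {
    pub raw_frames: Vec<usize>,
    pub metric: Metric,
}

/// Stand-in for expensive symbol resolution.
fn resolve(frames: &[usize]) -> String {
    frames
        .iter()
        .map(|addr| format!("{addr:#x}"))
        .collect::<Vec<_>>()
        .join(" <- ")
}

/// Applies the filter to the counters first; only surviving records are resolved.
pub fn report(records: &[Record], filter: impl Fn(&Metric) -> bool) -> Vec<(String, Metric)> {
    records
        .iter()
        .filter(|r| filter(&r.metric))
        .map(|r| (resolve(&r.raw_frames), r.metric))
        .collect()
}

fn main() {
    let records = vec![
        Record { raw_frames: vec![0x10, 0x20], metric: Metric { allocated: 4096, freed: 0 } },
        Record { raw_frames: vec![0x30], metric: Metric { allocated: 64, freed: 64 } },
    ];
    // Keep only backtraces that still hold at least 1 KiB.
    for (trace, metric) in report(&records, |m| m.in_use() >= 1024) {
        println!("{} bytes in use at {}", metric.in_use(), trace);
    }
}
```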

Real World Example

At LeakSignal, we had extreme memory fragmentation in a high-bandwidth, high-concurrency gRPC service. We suspected a known hyper issue with high concurrency, but needed to confirm the cause and fix the issue ASAP. Existing tooling (bpftrace, valgrind) wasn't able to give us a concrete cause. I had created a prototype of this project back in 2019 or so, and its time had come to shine. In a staging environment, I added an HTTP endpoint to generate a thread and backtrace report. With it, I was able to identify a location where a large multi-allocation object was being cloned and dropped very often. A quick fix there solved our memory fragmentation issue.
