1 stable release: 1.0.0 (Jul 27, 2024)
alloc_cat
|\_/|
) (
=\___/=
/ \ alloc_cat
| |
_ ( ) ______ ______ _
\ / \_ _/\ / \ / \ /
| // | | |
| \\ | | |
| \) | | |
| | | |
Alloc_cat is a simple allocator for small-to-tiny Wasm projects in Rust, #[no_std] compatible.
It's meant as a substitute for the now-abandoned wee_alloc in situations where the Rust system allocator is too heavyweight. It's not as sophisticated as wee_alloc, focusing instead on simplicity and testability.
The allocation strategy is a simple bump pointer with free list, optimized for programs that rarely use more than 2MB or allocate chunks larger than 32KB, though it can handle any size of heap or allocation up to your system's limit. In addition, metrics can be collected to help analyze your program's memory use.
Quick start
Replace the Rust system allocator with the one from alloc_cat. It should be as simple as putting this in your top-level lib.rs:
extern crate alloc;
use alloc_cat::{ALLOCATOR, AllocCat};
#[global_allocator]
pub static GLOBAL_ALLOCATOR: &AllocCat = &ALLOCATOR;
Then all heap allocations (like Box) will use alloc_cat. You can verify at runtime by calling get_stats() (see below) and checking that the values aren't all zero.
Using it
There are three different ways to use alloc_cat:
- within Wasm (target_arch = "wasm32")
  This mode provides an allocator based on Wasm's memory model (described in a separate section below), and is the main purpose of this library.
- wrapping the system allocator
  This mode is activated by turning on the system_allocator feature (which is on by default) when running on a non-Wasm architecture. It provides a wrapper for the system allocator with no other intrinsic features except allocation metrics.
- using the alloc_cat algorithm outside of Wasm
  This mode is activated by turning off the system_allocator default feature when running on a non-Wasm architecture. It grabs a static pool of memory from the OS using mmap and provides the alloc_cat allocator to manage it. It uses mutexes to serialize access between threads, which may be slow. It's really only useful for testing the alloc_cat algorithms in a non-Wasm environment.
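To picture the "wrapping the system allocator" mode, here is a minimal sketch of a metrics-only wrapper. CountingAlloc and its methods are hypothetical names made up for this illustration, not alloc_cat's own types; only GlobalAlloc, System, and Layout come from the standard library. It tracks the two basic counters (bytes in use and their high-water mark) and forwards everything else.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper (not alloc_cat's type): forwards to the system
/// allocator and keeps two byte counters.
pub struct CountingAlloc {
    in_use: AtomicUsize,
    high_water: AtomicUsize,
}

impl CountingAlloc {
    pub const fn new() -> Self {
        CountingAlloc {
            in_use: AtomicUsize::new(0),
            high_water: AtomicUsize::new(0),
        }
    }

    /// Bytes currently allocated through this wrapper.
    pub fn in_use(&self) -> usize {
        self.in_use.load(Ordering::Relaxed)
    }

    /// Highest value `in_use` has reached so far.
    pub fn high_water(&self) -> usize {
        self.high_water.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            // fetch_add returns the old value, so add the size again.
            let now = self.in_use.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
            self.high_water.fetch_max(now, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.in_use.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}
```

A wrapper like this could be installed with #[global_allocator] in the same way as in the quick start. The counters use relaxed atomics, which is enough for statistics but gives no ordering guarantees between threads.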
Supported cargo "features":
- system_allocator (default)
  Use the Rust system allocator when running on any non-Wasm architecture. This is usually what you want for tests, unless you're testing alloc_cat itself. The alternative is an mmap-based pool, as described above.
- audit_sizes
  Turning this on makes alloc_cat track the count of allocations per size and alignment (in addition to its normal metrics). This makes the allocator a little slower and uses 2KB of additional memory, but can be useful for debugging leaks. It may also be interesting to the curious or detail-oriented.
When using the mmap allocator (target_arch is not Wasm and the system_allocator feature is turned off), you can set the size of the pool used for the heap with an environment variable:
- ALLOC_CAT_MMAP_HEAP_SIZE (number, in bytes, default = 2097152 = 2MB)
Metrics
To get basic statistics about the heap:
let stats = ALLOCATOR.get_stats();
The stats object will have various fields filled in, depending on which allocator is in use (unused fields will be zero).
- in_use: usize
- high_water: usize
  in_use is the number of bytes currently allocated by the application, and high_water is the highest value that in_use has ever had. The amount of memory in use by the heap will be larger, due to fragmentation and block size rounding, which are tracked in more detailed metrics below (if you aren't using the system allocator).
- size_table: [SizeInfo; 257] (only with feature = "audit_sizes")
  Each element in the array is one "bucket" of a scaled histogram of request sizes:
  - size: usize -- bytes
  - count: usize -- # of current/active allocations
  - high_water: usize -- highest value of count
  The first 48 buckets are sizes 1 to 48, and the remaining ones cover spans of 2 (50, 52, 54, ...), 4, 8, and up, exponentially, gradually reaching a span of 512 bytes, up to a size of about 22.5KB. The final bucket will cover any allocations larger than 22.5KB, and its size field will be usize::MAX.
  Allocation requests are assigned to the nearest bucket greater than or equal to the requested size, so with buckets of (..., 88, 92, 96, ...), the 92 bucket will count all allocation requests of 89, 90, 91, or 92.
- align_table: [usize; 16] (only with feature = "audit_sizes")
  Each element in the array is one bit-width alignment, from byte alignment (align_table[0] or 0 bits) through u32 alignment (align_table[2], 2 bits) and up. In practice, alignments beyond 3 bits (8 bytes) are rare to nonexistent.
  The values in the array are the counts of allocation requests per alignment, as a request count, not the number of current allocations.
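The bucket and alignment rules above can be sketched as two small lookups. This is only an illustration: bucket_index and align_index are hypothetical helper names, and the bucket table is passed in by the caller rather than reproducing the crate's real 257-entry table, whose exact boundaries aren't listed here.

```rust
/// Index of the first bucket whose size is >= `request`.
/// Assumes `buckets` is sorted ascending and ends with usize::MAX
/// as the catch-all bucket, as described for size_table.
pub fn bucket_index(buckets: &[usize], request: usize) -> usize {
    buckets.partition_point(|&size| size < request)
}

/// Index into an align_table-style array: the number of alignment bits,
/// so 1-byte alignment maps to 0, 4-byte to 2, 8-byte to 3.
pub fn align_index(align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    align.trailing_zeros() as usize
}
```

With a sample table of (84, 88, 92, 96, usize::MAX), requests of 89 through 92 all land in the bucket at index 2, matching the example in the text.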
The remaining fields are only relevant to the Wasm allocator. They'll all remain 0 for the system allocator. "Words" refer to 32-bit (4-byte) units, and "pages" refer to 64KB units.
- regions: usize -- number of 2MB regions being managed by the allocator
- total_words: usize -- number of (4-byte) words granted to the allocator by Wasm. This includes memory allocated, tracked in the free list, and not yet allocated by the bump pointer.
- allocated_words: usize -- number of words allocated or in the free list
- unused_words: usize -- number of words available (so far) to the bump pointer. This is just total_words - allocated_words.
- fragment_words: usize -- number of words in the free list, which is unallocated space wedged between active allocations
- fragments: usize -- number of separate segments that make up the free list
- total_large_pages: usize -- number of 64KB pages allocated for "large" (>= 32KB) requests
- unused_large_pages: usize -- number of 64KB large pages that have been freed and not (yet) reused
- allocated_large_pages_words: usize -- total number of words requested in all active large allocations
- heap_start: usize -- address of the start of the Wasm heap, for the curious. The Wasm heap is described in detail in its own section below.
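As a rough illustration of how the word counts relate to each other, the sketch below derives two figures from them. WordCounts is a hypothetical struct that mirrors three of the field names above; it is not the crate's stats type, and live_bytes is a made-up helper that follows from the field descriptions (allocated_words includes the free list, so subtracting fragment_words leaves the words held by active allocations).

```rust
pub const WORD_BYTES: usize = 4;

/// Hypothetical mirror of three of the fields above (not the crate's own type).
pub struct WordCounts {
    pub total_words: usize,
    pub allocated_words: usize,
    pub fragment_words: usize,
}

impl WordCounts {
    /// Words the bump pointer has not handed out yet:
    /// total_words - allocated_words.
    pub fn unused_words(&self) -> usize {
        self.total_words.saturating_sub(self.allocated_words)
    }

    /// Bytes held by active allocations, including rounding to whole words:
    /// (allocated_words - fragment_words) * 4.
    pub fn live_bytes(&self) -> usize {
        self.allocated_words.saturating_sub(self.fragment_words) * WORD_BYTES
    }
}
```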
How Wasm memory works
The Wasm engine provides a contiguous block of memory as a whole number of "pages", where a page is 64KB. Addresses start at 0 (0x0000_0000).
A typical program will reserve memory for global data and a task stack, then leave the rest available for a heap that can grow on demand. As the heap grows, it must ask Wasm for more pages.
|                           |
|                           +-- 192KB (3 pages)
|                           |
|                           |
|   ^   not yet         ^   |
|   |   allocated       |   |
| - - - - - - - - - - - - - +-- 128KB (2 pages)
|             ^             |
|    heap     |             |
+---------------------------+ (88KB)
| memory reserved before    |
| program start:            +-- 64KB (1 page)
|  - task stack             |
|  - data/bss               |
|                           |
|                           |
+---------------------------+-- 0
In this example, 128KB (or 2 pages) is allocated at startup. 88KB is reserved for global data and a stack, leaving 40KB of the 2nd page for the heap. The program can ask for another page, extending total memory to 192KB and extending the available heap from 40KB to 104KB (40KB + one new 64KB page).
We only get two "system calls" for managing memory:
- size: memory_size() -> usize
  which returns the number of pages currently allocated
- grow: memory_grow(count: usize) -> usize
  which allocates count new pages and returns the previous page count
In our example, memory_size() would return 2 initially. To extend the heap to 104KB, we'd call memory_grow(1) (which would return 2) and now memory_size() will return 3.
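The two calls can be modeled off-target with a small mock, which is handy for testing allocator logic on a non-Wasm machine. MockMemory and its max_pages limit are hypothetical, invented for this sketch. On a real wasm32 target, Rust exposes the equivalent intrinsics as core::arch::wasm32::memory_size and core::arch::wasm32::memory_grow; the mock copies their convention of returning usize::MAX when growth fails.

```rust
/// Size of one Wasm page in bytes (64KB).
pub const PAGE_SIZE: usize = 0x1_0000;

/// Hypothetical stand-in for Wasm linear memory: it only tracks page counts.
pub struct MockMemory {
    pages: usize,
    max_pages: usize,
}

impl MockMemory {
    pub fn new(pages: usize, max_pages: usize) -> Self {
        MockMemory { pages, max_pages }
    }

    /// Number of pages currently allocated.
    pub fn memory_size(&self) -> usize {
        self.pages
    }

    /// Adds `count` pages and returns the previous page count,
    /// or usize::MAX (leaving the size unchanged) if the limit would be exceeded.
    pub fn memory_grow(&mut self, count: usize) -> usize {
        match self.pages.checked_add(count) {
            Some(total) if total <= self.max_pages => {
                let previous = self.pages;
                self.pages = total;
                previous
            }
            _ => usize::MAX,
        }
    }
}
```

Note that there is no shrink method, mirroring the fact that Wasm memory only grows.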
There is no way to return memory to the system. Wasm's heap can grow, but never shrink.
We also get a symbol __heap_base which points to the end of reserved memory, at address 88KB (0x0001_6000) here.
Luckily, this is enough to implement a heap. Our memory starts at address __heap_base and ends at address memory_size() * 0x1_0000. Everything within this range is ours, for use in allocating and freeing blocks of memory for the program.
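The heap range works out as simple arithmetic. The sketch below takes the heap base and page count as plain numbers so it can run anywhere; heap_bytes is a hypothetical helper name, and on a real Wasm build the two inputs would come from the __heap_base symbol and memory_size().

```rust
/// Size of one Wasm page in bytes (64KB).
pub const PAGE_SIZE: usize = 0x1_0000;

/// Bytes available to the heap: from `heap_base` up to the end of
/// the last allocated page (`pages * PAGE_SIZE`).
pub fn heap_bytes(heap_base: usize, pages: usize) -> usize {
    (pages * PAGE_SIZE).saturating_sub(heap_base)
}
```

Using the numbers from the example, heap_bytes(0x1_6000, 2) gives 40KB, and after growing by one page heap_bytes(0x1_6000, 3) gives 104KB.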
If Wasm runs out of memory, alloc_cat will report failure, which will usually gracelessly end your program in a panic (just like any other allocator).
How alloc_cat works
Normal allocation requests (less than 32KB) use a basic algorithm like you'd find in any textbook since the 20th century, based on a bump pointer and a linked list of free blocks. At the start, the allocator sets the bump pointer to the start of free memory (__heap_base in Wasm), and sets the free list to an empty list.
Each request is rounded up to the nearest number of whole words (4 bytes). First, we walk the free list, and return the first available block which is big enough and has the right alignment. If the block is bigger than the requested size, we'll split it first. If nothing is available, we increment the bump pointer, extending the heap by another page if necessary.
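The allocation path described above can be sketched as a simplified model. This is not alloc_cat's code: Heap is a hypothetical type, the free list is a Vec of (start, length) pairs rather than an in-place linked list, offsets are word indexes instead of addresses, alignment beyond one word is ignored, and running out of space returns None where the real allocator would ask Wasm for another page.

```rust
pub const WORD: usize = 4;

/// Simplified model of the heap, measured in 4-byte words.
pub struct Heap {
    /// Free blocks as (start word, length in words), sorted by start.
    pub free: Vec<(usize, usize)>,
    /// First word not yet handed out by the bump pointer.
    pub bump: usize,
    /// Total words currently available to this heap.
    pub total_words: usize,
}

impl Heap {
    /// Returns the word offset of a block of at least `bytes` bytes.
    pub fn alloc(&mut self, bytes: usize) -> Option<usize> {
        // Round up to whole words (at least one).
        let words = ((bytes + WORD - 1) / WORD).max(1);

        // First fit: take the first free block that is big enough.
        if let Some(i) = self.free.iter().position(|&(_, len)| len >= words) {
            let (start, len) = self.free[i];
            if len == words {
                self.free.remove(i);
            } else {
                // Split: keep the tail of the block on the free list.
                self.free[i] = (start + words, len - words);
            }
            return Some(start);
        }

        // Nothing suitable on the free list: advance the bump pointer.
        let end = self.bump.checked_add(words)?;
        if end > self.total_words {
            return None; // the real allocator would try to grow the heap here
        }
        let start = self.bump;
        self.bump = end;
        Some(start)
    }
}
```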
By restricting the block size to 32KB and each region's size to 2MB, each "next" pointer in the free list can pack the size and offset into a single word:
31 24 19 16 8 0 <- bit
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| size, in words | offset to next block, in words |
| 0-8KW (0-32KB) | 0-512KW (0-2MB) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
free list pointer, packed into one word (32 bits)
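A minimal sketch of this packing, assuming the layout the diagram suggests: the size occupies the top 13 bits (enough for blocks under 8K words, or 32KB) and the offset the low 19 bits (enough for offsets under 512K words, or 2MB). The function and constant names are hypothetical, and the exact bit positions used by the crate may differ.

```rust
/// Assumed layout: high 13 bits = size in words, low 19 bits = offset in words.
pub const OFFSET_BITS: u32 = 19;
pub const OFFSET_MASK: u32 = (1 << OFFSET_BITS) - 1;
pub const MAX_SIZE_WORDS: u32 = (1 << (32 - OFFSET_BITS)) - 1;

/// Packs a block size and an offset to the next block into one 32-bit word.
/// Returns None if either value does not fit in its field.
pub fn pack(size_words: u32, offset_words: u32) -> Option<u32> {
    if size_words > MAX_SIZE_WORDS || offset_words > OFFSET_MASK {
        return None;
    }
    Some((size_words << OFFSET_BITS) | offset_words)
}

/// Splits a packed word back into (size in words, offset in words).
pub fn unpack(link: u32) -> (u32, u32) {
    (link >> OFFSET_BITS, link & OFFSET_MASK)
}
```

The two limits are what make the trick work: 13 + 19 = 32, so a block size and a region-relative offset share a single word with no bits to spare.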
The free list is kept sorted by address, so when a block of memory is freed, it can be merged with any adjacent free block. If this causes the last free block to meet the bump pointer, we move the bump pointer backwards instead.
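Freeing with merging can be sketched in the same simplified style. As before this is a model under stated assumptions, not the crate's implementation: FreeList is a hypothetical type, the list is a sorted Vec of (start, length) pairs in word units rather than packed links in the heap itself, and the caller supplies the block's size.

```rust
/// Simplified model: sorted free blocks plus the bump pointer, in words.
pub struct FreeList {
    /// Free blocks as (start word, length in words), sorted by start.
    pub free: Vec<(usize, usize)>,
    /// First word not yet handed out by the bump pointer.
    pub bump: usize,
}

impl FreeList {
    /// Returns the block at `start` (of `words` words) to the free list.
    pub fn release(&mut self, start: usize, words: usize) {
        // Insert at the position that keeps the list sorted by address.
        let i = self.free.partition_point(|&(s, _)| s < start);
        self.free.insert(i, (start, words));

        // Merge with the following block if they touch.
        if i + 1 < self.free.len() && self.free[i].0 + self.free[i].1 == self.free[i + 1].0 {
            self.free[i].1 += self.free[i + 1].1;
            self.free.remove(i + 1);
        }

        // Merge with the preceding block if they touch.
        if i > 0 && self.free[i - 1].0 + self.free[i - 1].1 == self.free[i].0 {
            self.free[i - 1].1 += self.free[i].1;
            self.free.remove(i);
        }

        // If the last free block now reaches the bump pointer,
        // move the bump pointer back instead of keeping the block.
        if let Some(&(s, len)) = self.free.last() {
            if s + len == self.bump {
                self.bump = s;
                self.free.pop();
            }
        }
    }
}
```

Keeping the list sorted is what makes both merges a constant-time check against the immediate neighbors.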
If we want to extend the heap, but it's reached 2MB (or the next Wasm page has already been used for some other purpose, like a large allocation), we start a new heap region and link it to the previous one, creating a linked list of regions, each up to 2MB.
Large allocations (32KB or larger, with no limit) are simpler. They're tracked in a linked list, with each link containing two words: a pointer to the start of the next block, and a count of how many pages are in that next block. This link is stored in the final two words at the end of the last page of each subsequent block. Freed blocks are reused when possible, and can be split and merged if they're next to each other, but only in units of a page (64KB).
License
This code is licensed under either the prosperity license (LICENSE-propserity.md) or the anti-capitalist license (LICENSE-anti-capitalist.txt) at your preference, for personal or non-commercial use.
Authors
- Robey Pointer robey@lag.net
The logo is strongly inspired by (based on) a series of pieces by prolific ASCII artist "jgs". You can find cool art of cats, including her work, here.