3 releases
0.1.2 | Mar 30, 2024 |
---|---|
0.1.1 | Mar 30, 2024 |
0.1.0 | Mar 30, 2024 |
#589 in Development tools
155 downloads per month
Used in hunt
36KB
546 lines
Profi
A simple profiler for single and multithreaded applications.
Record the time it takes for a scope to end and print the timings when the program exits.
Each measurement has an overhead of ~25ns-50ns, so it shouldn't impact benchmarks.
Run the benchmarks example to see what's the overhead on your machine.
Setup
profi
is controlled by the enable
feature, which is active by default.
When disabled, all macros and methods will become no-ops, resulting in zero impact on your code.
To disable it, add default-features = false
to the profi
dependency in your Cargo.toml
.
For convenience, you can also add a custom feature:
[dependencies]
profi = { version = "*", default-features = false }
[features]
prof = ["profi/enable"]
And run it with cargo run --release --features prof
If you use rayon
, enable the rayon
feature!
Usage
See the examples
for more usage cases.
Basic Usage
use profi::{prof, print_on_exit};
fn main() {
// Prints the timings to stdout when the program exits
// Always put at the top of the main function to ensure it's dropped last
//
// An implicit `main` guard is created to profile the whole application
print_on_exit!();
// Sleep for a bit to simulate some work
std::thread::sleep(std::time::Duration::from_millis(200));
}
┌──────────────┬────────────────────┬───────────┬──────────────┬───────┐
│ Name ┆ % Application Time ┆ Real Time ┆ Average time ┆ Calls │
╞══════════════╪════════════════════╪═══════════╪══════════════╪═══════╡
│ simple::main ┆ 100.00% ┆ 200.13ms ┆ - ┆ 1 │
└──────────────┴────────────────────┴───────────┴──────────────┴───────┘
Loops
use profi::{prof, print_on_exit};
fn main() {
print_on_exit!();
for _ in 0..100 {
prof!(iteration);
std::thread::sleep(std::time::Duration::from_millis(10));
}
}
┌────────────┬────────────────────┬───────────┬──────────────┬───────┐
│ Name ┆ % Application Time ┆ Real Time ┆ Average time ┆ Calls │
╞════════════╪════════════════════╪═══════════╪══════════════╪═══════╡
│ loop::main ┆ 100.00% ┆ 1.01s ┆ - ┆ 1 │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ iteration ┆ 99.99% ┆ 1.01s ┆ 10.10ms/call ┆ 100 │
└────────────┴────────────────────┴───────────┴──────────────┴───────┘
Multiple threads
use profi::{print_on_exit, prof_guard};
fn do_work(i: usize) {
// Need to bind it to a variable to ensure it isn't dropped before sleeping
let _guard = if i < 6 {
prof_guard!("6 first")
} else {
prof_guard!("4 last")
};
std::thread::sleep(std::time::Duration::from_millis(10));
// The guard goes out of scope here
}
fn main() {
print_on_exit!();
// Spawn 10 threads
std::thread::scope(|s| {
for i in 0..10 {
s.spawn(move || {
do_work(i);
});
}
});
}
┌───────────────┬────────────────────┬───────────┬────────────┬──────────┬──────────────┬───────┐
│ Name ┆ % Application Time ┆ Real Time ┆ % CPU Time ┆ CPU Time ┆ Average time ┆ Calls │
╞═══════════════╪════════════════════╪═══════════╪════════════╪══════════╪══════════════╪═══════╡
│ threads::main ┆ 100.00% ┆ 10.48ms ┆ 9.43% ┆ 10.48ms ┆ - ┆ 1 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 6 first ┆ 96.42% ┆ 10.11ms ┆ 54.38% ┆ 60.44ms ┆ 10.08ms/call ┆ 6 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 4 last ┆ 95.93% ┆ 10.06ms ┆ 36.19% ┆ 40.22ms ┆ 10.06ms/call ┆ 4 │
└───────────────┴────────────────────┴───────────┴────────────┴──────────┴──────────────┴───────┘
"CPU Time" is the combined time all threads have spent on that scope.
For example, "6 first" has a "CPU Time" of 60 milliseconds because each thread waits 10ms, and the program spawns six of them.
Attribute
Enable the attributes
feature to use the profile
attribute on functions.
This will add a guard at the start of the function.
use profi::profile;
#[profile]
fn anotated() { /* ... */ }
Features
Name | Description |
---|---|
enable |
Activates the profiling, if not active all macros become no-ops |
attributes |
Enables the #[prof] macro |
deep-hierarchy |
By default profi merges all uses of a function, use this feature to disable this behaviour.See the nested example for more information |
nightly |
Enables nightly-only optimizations (unused at the moment) |
rayon |
Necessary if using rayon |
Dependencies
~0–1MB
~18K SLoC