#low-overhead #tracing #circular-buffer

bin+lib tsc-trace

Low-overhead tracing of Rust code using the time stamp counter (x86 rdtsc)

3 releases (breaking)

0.6.0 Sep 10, 2023
0.5.0 Sep 6, 2023
0.4.0 Sep 4, 2023

#373 in Debugging

Download history: weekly downloads, 2023-12-19 to 2024-04-02

553 downloads per month

MIT license

13KB
208 lines

tsc-trace

Trace the number of cycles used by spans of code, via the x86 rdtsc instruction. This is only usable on x86 / x86_64 architectures. Results will probably be questionable unless you pin threads to cores, since a thread that migrates between cores mid-span may read its start and end counts from different counters.
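As a rough illustration of what measuring a span with rdtsc involves, here is a minimal sketch. The names read_tsc and cycles_for are hypothetical and are not part of tsc-trace's API; see main.rs for the crate's real usage. The non-x86_64 fallback exists only so the sketch compiles elsewhere, and it returns nanoseconds rather than cycles.

```rust
// Hypothetical sketch, not tsc-trace's API.

/// Reads the time stamp counter on x86_64.
#[cfg(target_arch = "x86_64")]
fn read_tsc() -> u64 {
    // rdtsc is available on every x86_64 CPU.
    unsafe { core::arch::x86_64::_rdtsc() }
}

/// Fallback for other targets: elapsed nanoseconds, NOT cycles.
#[cfg(not(target_arch = "x86_64"))]
fn read_tsc() -> u64 {
    use std::sync::OnceLock;
    use std::time::Instant;
    static START: OnceLock<Instant> = OnceLock::new();
    START.get_or_init(Instant::now).elapsed().as_nanos() as u64
}

/// Runs `f` and returns its result together with the counter delta.
fn cycles_for<T>(f: impl FnOnce() -> T) -> (T, u64) {
    let start = read_tsc();
    let out = f();
    let end = read_tsc();
    // wrapping_sub avoids a panic if the two reads are not monotonic,
    // e.g. when the thread moved to another core between them.
    (out, end.wrapping_sub(start))
}
```

A call such as cycles_for(|| expensive_work()) returns the closure's result and the raw counter difference. The difference is only meaningful if both reads came from the same core, which is why pinning matters.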

See main.rs for example usage.

The features "capacity_1_million" ... "capacity_64_million" set the capacity (in number of traces, not bytes) of the thread-local vec that stores traces. The default is 1 million. That vec is treated as a circular buffer, so it wraps around and overwrites the oldest traces rather than reallocating, OOMing or stopping collection. Each trace uses 24 bytes (u64 tag, u64 starting rdtsc count, u64 ending rdtsc count). So total memory overhead is:

(1 usize for index + (capacity * 24 bytes)) * number of threads.
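The circular buffer and the memory formula above can be sketched as follows. This is an illustration under assumptions, not the crate's implementation: Trace, TraceRing and memory_overhead_bytes are hypothetical names, and only the 24-byte record layout and the formula itself come from the description above.

```rust
// Hypothetical sketch of a fixed-capacity trace ring; not tsc-trace's code.

/// One trace record: three u64 fields, 24 bytes in total.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Trace {
    tag: u64,
    start: u64,
    end: u64,
}

/// A vec used as a circular buffer, plus the index of the next slot to write.
struct TraceRing {
    index: usize,
    traces: Vec<Trace>,
}

impl TraceRing {
    /// Allocates every slot up front so `push` never reallocates.
    fn with_capacity(capacity: usize) -> Self {
        Self { index: 0, traces: vec![Trace::default(); capacity] }
    }

    /// Writes into the next slot, wrapping around and overwriting old traces.
    fn push(&mut self, trace: Trace) {
        if self.traces.is_empty() {
            return; // capacity 0: nothing is stored
        }
        self.traces[self.index] = trace;
        self.index = (self.index + 1) % self.traces.len();
    }
}

/// (1 usize for index + (capacity * 24 bytes)) * number of threads.
fn memory_overhead_bytes(capacity: usize, threads: usize) -> usize {
    (std::mem::size_of::<usize>() + capacity * std::mem::size_of::<Trace>()) * threads
}
```

Assuming "1 million" means exactly 1,000,000 traces, a 64-bit target with 4 threads gives (8 + 24,000,000) * 4 = 96,000,032 bytes, roughly 96 MB.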

Alternatively, you can use the feature "off" to set capacity to 0 and statically disable collection of traces. This is useful if you want to leave timing markers in place for future use without paying any runtime overhead.
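One way a zero capacity can disable collection statically is to make the capacity a compile-time constant, so the check becomes a constant condition the optimizer can drop. The sketch below uses a const generic to stand in for the value a Cargo feature would select; Recorder and its methods are hypothetical, and the crate may implement "off" differently.

```rust
// Hypothetical sketch: CAPACITY stands in for a feature-selected constant.
struct Recorder<const CAPACITY: usize> {
    index: usize,
    traces: Vec<(u64, u64, u64)>, // (tag, start, end)
}

impl<const CAPACITY: usize> Recorder<CAPACITY> {
    fn new() -> Self {
        // With CAPACITY == 0 this vec holds no elements.
        Self { index: 0, traces: vec![(0, 0, 0); CAPACITY] }
    }

    #[inline(always)]
    fn record(&mut self, tag: u64, start: u64, end: u64) {
        // CAPACITY is known at compile time, so for Recorder<0> this
        // branch is always taken and the rest of the body is dead code.
        if CAPACITY == 0 {
            return;
        }
        self.traces[self.index] = (tag, start, end);
        self.index = (self.index + 1) % self.traces.len();
    }
}
```

With Recorder::<0>, calls to record stay in the source as markers but store nothing.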

The feature "const_array" will use a const array rather than a vec for the thread-local storage of traces.

The feature "lfence" will add an lfence instruction before and after each call to rdtsc.
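A fenced read might look like the sketch below, which places an lfence on each side of rdtsc using the core::arch::x86_64 intrinsics. The names fenced_rdtsc and fenced_cycles are hypothetical, not the crate's API. Fencing is commonly used to keep the counter read from being reordered with the surrounding instructions, at the cost of some extra overhead per read. The non-x86_64 fallback is only there so the sketch compiles elsewhere.

```rust
// Hypothetical sketch, not tsc-trace's API.

/// rdtsc with an lfence before and after it.
#[cfg(target_arch = "x86_64")]
fn fenced_rdtsc() -> u64 {
    use core::arch::x86_64::{_mm_lfence, _rdtsc};
    // lfence needs SSE2, which is part of the x86_64 baseline.
    unsafe {
        _mm_lfence();
        let count = _rdtsc();
        _mm_lfence();
        count
    }
}

/// Fallback for other targets: elapsed nanoseconds, NOT cycles, and no fences.
#[cfg(not(target_arch = "x86_64"))]
fn fenced_rdtsc() -> u64 {
    use std::sync::OnceLock;
    use std::time::Instant;
    static START: OnceLock<Instant> = OnceLock::new();
    START.get_or_init(Instant::now).elapsed().as_nanos() as u64
}

/// Measures `f` between two fenced counter reads.
fn fenced_cycles<T>(f: impl FnOnce() -> T) -> (T, u64) {
    let start = fenced_rdtsc();
    let out = f();
    let end = fenced_rdtsc();
    (out, end.wrapping_sub(start))
}
```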

Run e.g. cargo bench --features "tsc-trace/capacity_1_million" to compare the runtime overhead of using this library against calling rdtsc directly twice and subtracting.

Dependencies

~130KB