12 releases (breaking)
| Version | Released |
|---|---|
| 0.10.0 | Sep 1, 2025 |
| 0.8.0 | Aug 15, 2025 |
| 0.3.1 | Mar 2, 2025 |
| 0.2.2 | Oct 5, 2024 |
rstrace
rstrace is a Rust implementation of strace for x86 Linux. It allows the user to trace system calls of a process or command.
Unlike strace, it can also introspect NVIDIA CUDA ioctl calls.
Install
[!NOTE] Currently only x86 Linux is supported. aarch64 support is planned, but macOS support is out of scope.
Binary download
curl -LsSf https://rstrace.xyz/install.sh | sh
Cargo
cargo install rstrace
Usage
rstrace ls /tmp/
To get a quick overview, use rstrace --help
Usage: rstrace [OPTIONS] [ARGS]...
Arguments:
[ARGS]... Arguments for the program to trace. e.g. 'ls /tmp/'
Options:
-o, --output <OUTPUT> send trace output to FILE instead of stderr
-t, --timestamp... Print absolute timestamp. -tt includes microseconds, -ttt uses UNIX timestamps
-c, --summary-only Count time, calls, and errors for each syscall and report summary
-C, --summary like -c, but also print the regular output
-j, --summary-json Count time, calls, and errors for each syscall and report summary in JSON format
--tef Emit Trace Event Format (TEF) trace data as output
--verbose Output human readable information about CUDA ioctls.
--cuda Enable CUDA ioctl sniffing. [Requires 'cuda_sniff' feature]
--cuda-only Enable CUDA ioctl sniffing and disable all other output. [Requires 'cuda_sniff' feature]
-p, --attach <PID> Attach to the process with the process ID pid and begin tracing.
-f, --follow-forks Trace child processes as they are created by currently traced processes as a result of the fork(2), vfork(2) and clone(2) system calls.
--color Enable colored output (default)
-h, --help Print help
-V, --version Print version
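When rstrace is driven from a script, it can help to assemble the command line in one place. The following is a minimal sketch, not part of rstrace: the helper names `build_rstrace_cmd` and `run_rstrace` are hypothetical, and only flags listed in the help text above are used. Whether a `--` separator is needed before a traced program that takes its own flags is not covered by the help text, so the sketch does not add one.

```python
import shlex
import subprocess


def build_rstrace_cmd(program, output=None, follow_forks=False,
                      summary_json=False, tef=False):
    """Assemble an rstrace argv list (hypothetical helper).

    `program` is the traced command as a list, e.g. ["ls", "/tmp/"].
    Only flags shown in `rstrace --help` are emitted.
    """
    cmd = ["rstrace"]
    if output is not None:
        cmd += ["--output", output]      # write the trace to a file instead of stderr
    if follow_forks:
        cmd.append("--follow-forks")     # also trace child processes
    if summary_json:
        cmd.append("--summary-json")     # per-syscall summary as JSON
    if tef:
        cmd.append("--tef")              # Trace Event Format output
    cmd += list(program)                 # traced program and its arguments go last
    return cmd


def run_rstrace(program, **options):
    """Run rstrace (assumed to be on PATH) and return its exit code."""
    return subprocess.run(build_rstrace_cmd(program, **options)).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so this works without rstrace installed.
    print(shlex.join(build_rstrace_cmd(["ls", "/tmp/"],
                                       output="trace.txt", follow_forks=True)))
```

Building the argument list separately from running it keeps the flag handling easy to check without rstrace being installed.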
cuda_sniff extension
cuda_sniff is an extension to rstrace that allows the user to trace CUDA API calls. It is based on
https://github.com/geohot/cuda_ioctl_sniffer by George Hotz.
gVisor has an alternative implementation called ioctl_sniffer, which uses LD_PRELOAD to intercept calls,
unlike rstrace, which uses ptrace.
Trace Event Format (TEF)
rstrace can emit basic tracing data which can be loaded into the Google Perfetto trace viewer.
e.g. rstrace --output in-kernel.tef.json --tef python userspace.py
| Kernel | Userspace |
|---|---|
| (screenshot not available) | (screenshot not available) |
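A TEF file can also be inspected without a viewer. The sketch below is an illustration under assumptions, not a description of rstrace's exact output: it relies only on the general Trace Event Format conventions (a JSON array of events, or an object with a `traceEvents` key; complete events with `ph: "X"` and `dur`; begin/end pairs with `ph: "B"` and `"E"`). Which event types rstrace actually emits is not stated here, and `SAMPLE` is made-up data. The function names are hypothetical.

```python
import json
from collections import defaultdict

# Made-up events for illustration; not real rstrace output.
SAMPLE = """[
  {"name": "pwrite64", "ph": "X", "ts": 0,   "dur": 12, "pid": 1, "tid": 1},
  {"name": "fsync",    "ph": "B", "ts": 20,  "pid": 1, "tid": 1},
  {"name": "fsync",    "ph": "E", "ts": 520, "pid": 1, "tid": 1},
  {"name": "pwrite64", "ph": "X", "ts": 530, "dur": 8,  "pid": 1, "tid": 1}
]"""


def load_events(text):
    """Return the event list from either JSON layout the format allows."""
    data = json.loads(text)
    if isinstance(data, dict):
        return data.get("traceEvents", [])
    return data


def total_duration_by_name(events):
    """Sum event durations (in the trace's time unit) per event name.

    Complete events ("X") contribute their "dur"; begin/end pairs ("B"/"E")
    are matched per (pid, tid) in stack order. Other event types are ignored.
    """
    totals = defaultdict(int)
    open_events = defaultdict(list)  # (pid, tid) -> stack of (name, start_ts)
    for ev in sorted(events, key=lambda e: e.get("ts", 0)):
        ph = ev.get("ph")
        key = (ev.get("pid"), ev.get("tid"))
        if ph == "X":
            totals[ev.get("name", "?")] += ev.get("dur", 0)
        elif ph == "B":
            open_events[key].append((ev.get("name", "?"), ev["ts"]))
        elif ph == "E" and open_events[key]:
            name, start = open_events[key].pop()
            totals[name] += ev["ts"] - start
    return dict(totals)


if __name__ == "__main__":
    totals = total_duration_by_name(load_events(SAMPLE))
    for name, dur in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {dur}")
```

To try it on a real trace, replace `SAMPLE` with the contents of the file written by `--output`, for example `open("in-kernel.tef.json").read()`.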
Syscall heavy code snippet
import os, time
deadline = time.monotonic() + 5
fd = os.open("tmp.bin", os.O_CREAT | os.O_RDWR, 0o600)
buf = b"\0" * 4096
while time.monotonic() < deadline:
    os.pwrite(fd, buf, 0)
    os.fsync(fd)
os.close(fd)
Userspace code snippet
use std::time::{Duration, Instant};

fn main() {
    let start = Instant::now();
    let mut x: u64 = 0;
    while start.elapsed() < Duration::from_secs(5) {
        for _ in 0..1_000_000 {
            // LCG-style math to burn CPU; wrapping_* avoids overflow panics in debug builds
            x = x.wrapping_mul(1664525).wrapping_add(1013904223);
        }
    }
    println!("{}", x); // prevents optimizing away the loop
}
Alternatives
- https://github.com/bpftrace/bpftrace
  - Don't use rstrace/strace in production. Where feasible, use bpftrace, which is a more powerful tracing tool with much lower overhead.

