#tracing #applications #spans #parallelism #parallel #plot #cli-applications

bin+lib tracing-durations-export

Record and visualize parallelism of tracing spans

4 releases

0.2.0 Jan 30, 2024
0.1.2 Jan 27, 2024
0.1.1 Jan 7, 2024
0.1.0 Jan 6, 2024

#154 in Debugging

Download history 9/week @ 2024-01-05 252/week @ 2024-01-12 300/week @ 2024-01-19 122/week @ 2024-01-26 80/week @ 2024-02-02 391/week @ 2024-02-09 928/week @ 2024-02-16 663/week @ 2024-02-23 623/week @ 2024-03-01 817/week @ 2024-03-08 425/week @ 2024-03-15 475/week @ 2024-03-22 422/week @ 2024-03-29 380/week @ 2024-04-05

2,060 downloads per month

MIT/Apache

51KB
907 lines

tracing-durations-export

crates.io Documentation

A tracing layer to figure out which tasks are running in parallel and which are blocked on cpu, mainly for cli applications.

Each span from beginning to end is a blue stripe. An async span can either be active or yield and wait (details in the tracing docs), only in the sections in which a span is active, we plot an orange section above it. Sync spans are always active, so their blue and orange regions are identical. The darker the color the more spans of the same name are active at the same time.

The example plot below is generated by cached_network.rs has four sections. The first show sequentially making network requests and parsing the response, the second section shows the same logic but parallelized with buffer_unordered. You can see how the requests happen in parallel and the parsing starts as soon as a request is finished. Sections three and four are a more complex example where we first check a cache before emitting a network request.

Example plot

Open the svg in your browser and hover over the sections for detailed timings and field information.

The multi-lane option provides a more verbose view showing each individual span:

Example plot, multi lane

The plots are complementary to a cpu profiler such as perf or sample and looking at raw span durations. They don't give you and exact work-by-line breakdown, instead they tell you where cpu is blocking or delaying other work and when the cpu is idle waiting for more parallelism.

Usage

use std::fs::File;
use std::io::BufWriter;
use tracing_durations_layer::DurationLayer;
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::{registry::Registry, fmt};

fn setup_global_subscriber() -> DurationsLayerDropGuard {
    let fmt_layer = fmt::Layer::default();
    let (duration_layer, guard) = DurationsLayerBuilder::default()
        .durations_file("traces.ndjson")
        // Available with the `plot` feature
        // .plot_file("traces.svg")
        .build()
        .unwrap();
    let subscriber = Registry::default()
        .with(fmt_layer)
        .with(duration_layer);
    
    tracing::subscriber::set_global_default(subscriber).unwrap();

    guard
}

// your code here ...

You can either use the plot feature and set .plot_file() on the builder, or after running your application, run

cargo run --bin plot --features plot --features cli -- traces.ndjson

and open traces.svg.

For the plots at the beginning of the readme:

TRACING_DURATION_EXPORT=examples/cached_network.ndjson cargo run --example cached_network
cargo run --bin plot --features plot --features cli -- examples/cached_network.ndjson
cargo run --bin plot --features plot --features cli -- --multi-lane examples/cached_network.ndjson --output examples/cached_network_multi_lane.svg

The traces.ndjson output file will look something like below, where each section where a span is active is one line.

[...]
{"id":6,"name":"read_cache","start":{"secs":0,"nanos":122457871},"end":{"secs":0,"nanos":122463135},"parents":[5],"fields":{"id":"2"}}
{"id":5,"name":"cached_network_request","start":{"secs":0,"nanos":122433854},"end":{"secs":0,"nanos":122499689},"parents":[],"fields":{"id":"2","api":"https://example.net/cached"}}
{"id":9007474132647937,"name":"parse_cache","start":{"secs":0,"nanos":122625724},"end":{"secs":0,"nanos":125791908},"parents":[],"fields":{}}
{"id":5,"name":"cached_network_request","start":{"secs":0,"nanos":125973025},"end":{"secs":0,"nanos":126007737},"parents":[],"fields":{"id":"2","api":"https://example.net/cached"}}
{"id":5,"name":"cached_network_request","start":{"secs":0,"nanos":126061739},"end":{"secs":0,"nanos":126066912},"parents":[],"fields":{"id":"2","api":"https://example.net/cached"}}
{"id":2251799813685254,"name":"read_cache","start":{"secs":0,"nanos":126157156},"end":{"secs":0,"nanos":126193547},"parents":[2251799813685253],"fields":{"id":"3"}}
{"id":2251799813685253,"name":"cached_network_request","start":{"secs":0,"nanos":126144140},"end":{"secs":0,"nanos":126213181},"parents":[],"fields":{"api":"https://example.net/cached","id":"3"}}
{"id":27021597764222977,"name":"make_network_request","start":{"secs":0,"nanos":128343009},"end":{"secs":0,"nanos":128383121},"parents":[13510798882111491],"fields":{"api":"https://example.net/cached","id":"0"}}```
[...]

Note that 0 is the time of the first span, not the start of the process.

Dependencies

~1.8–3MB
~58K SLoC