2 stable releases

1.0.1 Oct 18, 2023
1.0.0 Oct 17, 2023

Apache-2.0 WITH LLVM-exception

Winliner

The WebAssembly indirect call inliner!

API Docs | Contributing

About

Winliner speculatively inlines indirect calls in WebAssembly, based on observed information from a previous profiling phase. This is a form of profile-guided optimization that we affectionately call winlining.

First, Winliner inserts instrumentation to observe the actual target callee of every indirect call site in your Wasm program. Next, you run the instrumented program for a while, building up a profile. Finally, you invoke Winliner again, this time providing it with the recorded profile, and it optimizes your Wasm program based on the behavior observed in that profile.

For example, if profiling shows that an indirect call always (or nearly always) goes to the 42nd entry in the funcrefs table, then Winliner will perform the following semantically transparent transformation:

;; Before:

call_indirect

;; After:

;; If the callee index is 42, execute the inlined body of
;; the associated function.
local.tee $temp
i32.const 42
i32.eq
if
  <inlined body of table[42] here>
else
  local.get $temp
  call_indirect
end
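The same guard-and-fallback shape can be modeled in plain Rust. This is only an illustrative sketch, not Winliner's implementation: the slice of function pointers stands in for the funcref table, and `Callee`, `double`, `negate`, `HOT_INDEX`, and `call_winlined` are hypothetical names made up for this example.

```rust
// Hypothetical stand-in for the entries of a Wasm funcref table.
type Callee = fn(i32) -> i32;

fn double(x: i32) -> i32 {
    x * 2
}

fn negate(x: i32) -> i32 {
    -x
}

/// Models an unoptimized `call_indirect`: always dispatch through the table.
fn call_indirect(table: &[Callee], index: usize, arg: i32) -> i32 {
    table[index](arg)
}

/// The table index that the (imaginary) profile says is almost always called.
const HOT_INDEX: usize = 1;

/// Models the winlined call site: guard on the index, run the inlined body
/// on a match, and otherwise fall back to the original indirect call.
fn call_winlined(table: &[Callee], index: usize, arg: i32) -> i32 {
    if index == HOT_INDEX {
        // Inlined body of table[HOT_INDEX], which is `double` in this sketch.
        arg * 2
    } else {
        call_indirect(table, index, arg)
    }
}

fn main() {
    let table: [Callee; 2] = [negate, double];

    // Both versions must agree for every index and argument.
    for index in 0..table.len() {
        for arg in [-3, 0, 7] {
            assert_eq!(
                call_indirect(&table, index, arg),
                call_winlined(&table, index, arg)
            );
        }
    }
    println!("winlined and original call sites agree");
}
```

Because the fallback branch is the untouched indirect call, the guarded version returns the same result whether or not the speculation hits.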

The speculative inlining by itself is generally not a huge performance win, since CPU indirect branch prediction is very powerful these days. (Although, depending on the Wasm engine, entering a new function may incur some cost and inlining does avoid that.) The primary benefit is that it allows the Wasm compiler to "see through" the indirect call and perform subsequent optimizations (like GVN and LICM) on the inlined callee's body, which can result in significant performance benefits.

This technique is similar to devirtualization, but it does not require the compiler to statically determine the callee, nor that the callee is a single, particular function 100% of the time. Unlike devirtualization, winlining can still optimize indirect calls that go a certain way 99% of the time and a different way 1% of the time, because it can always fall back to an unoptimized indirect call.
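To make the "99% one way, 1% another" idea concrete, here is a minimal sketch of how a per-call-site decision could be made from observed counts. It is an assumption-laden illustration, not Winliner's actual logic: `CallSiteCounts` and `pick_speculative_callee` are hypothetical, and the two thresholds merely borrow their names from the `min_total_calls` and `min_ratio` options shown in the library example.

```rust
use std::collections::HashMap;

/// Hypothetical per-call-site profile: table index -> observed call count.
type CallSiteCounts = HashMap<u32, u64>;

/// Returns the table index worth speculating on, if the call site was hit
/// often enough and one callee dominates the observed calls.
fn pick_speculative_callee(
    counts: &CallSiteCounts,
    min_total_calls: u64,
    min_ratio: f64,
) -> Option<u32> {
    let total: u64 = counts.values().sum();
    if total == 0 || total < min_total_calls {
        // Too little data to justify speculation.
        return None;
    }

    // Find the most frequently observed callee.
    let (&callee, &hits) = counts.iter().max_by_key(|&(_, &n)| n)?;

    if hits as f64 / total as f64 >= min_ratio {
        Some(callee)
    } else {
        None
    }
}

fn main() {
    let mut counts = CallSiteCounts::new();
    counts.insert(42, 99); // the dominant callee
    counts.insert(7, 1); // the rare callee, served by the fallback path

    let choice = pick_speculative_callee(&counts, 100, 0.8);
    println!("speculate on: {:?}", choice);
}
```

A call site whose calls are split more evenly than the ratio threshold allows is simply left as a plain indirect call.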

Install

You can install via cargo:

$ cargo install winliner --all-features

Example Usage

First, instrument your Wasm program:

$ winliner instrument my-program.wasm > my-program.instrumented.wasm

Next, run the instrumented program to build a profile. You can either do this in your Wasm environment of choice (e.g. the Web) with a little glue code to extract and shepherd out the profile, or run the program within Winliner itself, using the Wasmtime-based WASI environment that comes with it:

$ winliner profile my-program.instrumented.wasm > profile.json

Finally, tell Winliner to optimize the original program based on the call_indirect behavior observed in the given profile:

$ winliner optimize --profile profile.json my-program.wasm > my-program.winlined.wasm

Caveats

  • Winliner is not safe in the face of mutations to the funcref table, which are possible via the table.set instruction (and others) introduced as part of the reference-types proposal. You must either disable this proposal or manually uphold the invariant that the funcref table is never mutated. Breaking this invariant will likely lead to diverging behavior from the original program and very wonky bugs! Any exported funcref tables must additionally not be mutated by the host.

  • Winliner only optimizes call_indirect instructions; it cannot optimize call_ref instructions because WebAssembly function references are not comparable, so we can't insert the if actual_callee == speculative_callee check.

  • Winliner assumes support for the (widely implemented) multi-value proposal in its generated code.
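The first caveat can be illustrated with a small Rust model. This is a hedged sketch rather than real Wasm: the array of function pointers stands in for the funcref table, the assignment `table[0] = square` stands in for a `table.set`, and `Callee`, `add_one`, `square`, and `winlined_call` are hypothetical names used only here.

```rust
// Hypothetical stand-in for the entries of a Wasm funcref table.
type Callee = fn(i32) -> i32;

fn add_one(x: i32) -> i32 {
    x + 1
}

fn square(x: i32) -> i32 {
    x * x
}

/// Models a call site that was winlined while table[0] was `add_one`.
/// The guard only compares the index, so it cannot notice that the
/// table entry behind that index has since been replaced.
fn winlined_call(table: &[Callee], index: usize, arg: i32) -> i32 {
    if index == 0 {
        // Inlined body of the table[0] that was seen during profiling.
        arg + 1
    } else {
        table[index](arg)
    }
}

fn main() {
    let mut table: [Callee; 2] = [add_one, square];

    // While the table is unchanged, both paths agree.
    assert_eq!(winlined_call(&table, 0, 5), table[0](5));

    // Models a mutation of the funcref table after optimization.
    table[0] = square;

    let original = table[0](5); // what the unoptimized program would compute
    let winlined = winlined_call(&table, 0, 5); // still runs the stale body
    assert_ne!(original, winlined);
    println!("original = {}, winlined = {}", original, winlined);
}
```

The index check still passes after the mutation, so the stale inlined body runs, which is why the table must stay immutable for the transformation to remain transparent.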

Using Winliner as a Library

First, add a dependency on Winliner to your Cargo.toml:

[dependencies]
winliner = "1"

Then, use the library like so:

use winliner::{InstrumentationStrategy, Instrumenter, Optimizer, Profile, Result};

fn main() -> Result<()> {
    let original_wasm = std::fs::read("path/to/my.wasm")?;

    // Configure instrumentation.
    let mut instrumenter = Instrumenter::new();
    instrumenter.strategy(InstrumentationStrategy::ThreeGlobals);

    // Instrument our wasm.
    let instrumented_wasm = instrumenter.instrument(&original_wasm)?;

    // Get a profile for our Wasm program from somewhere. Read it from disk,
    // record it now in this process, etc...
    //
    // See the API docs for `Profile` for more details.
    let profile = Profile::default();

    // Configure optimization and thresholds for inlining.
    let mut optimizer = Optimizer::new();
    optimizer
        .min_total_calls(100)
        .min_ratio(0.8)?
        .max_inline_depth(3);

    // Run the optimizer with the given profile!
    let optimized_wasm = optimizer.optimize(&profile, &original_wasm)?;

    std::fs::write("path/to/optimized.wasm", optimized_wasm)?;
    Ok(())
}

Acknowledgements

The inspiration for this tool -- along with the low-overhead but imprecise "three globals" instrumentation strategy -- sprang from conversations with Chris Fallin and Luke Wagner.
