sherpa-transducers

A Rust wrapper around streaming-mode sherpa-onnx zipformer transducers.

performance characteristics

It's very quick: expect to keep pace with a realtime audio stream on 1-2 modest CPU cores.

For higher-throughput applications (many streams served from the same machine), continuous batching is fully supported and significantly improves per-stream compute utilization; a multi-stream sketch follows the example below.

installation / basic usage

Add the dep:

cargo add sherpa-transducers

And use it:

use sherpa_transducers::asr;

async fn my_stream_handler() -> anyhow::Result<()> {
    let t = asr::Model::from_pretrained("nytopop/zipformer-en-2023-06-21-320ms")
        .await?
        .num_threads(2)
        .build()?;

    let mut s = t.phased_stream(1)?;

    loop {
        // use the sample rate of _your_ audio, input will be resampled automatically
        let sample_rate = 24_000;
        let audio_samples = vec![0.; 512];

        // buffer some samples to be decoded
        s.accept_waveform(sample_rate, &audio_samples);

        // actually do the decode
        s.decode();

        // get the transcript since last reset
        let (_epoch, transcript) = s.result()?;

        if transcript.contains("DELETE THIS") {
            s.reset();
        }
    }
}
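
For the higher-throughput case mentioned earlier, here is a minimal multi-stream sketch. It is an illustration rather than the crate's definitive batching API: it assumes streams created from one shared Model can be driven independently, and serve_many is a made-up name.

use sherpa_transducers::asr;

async fn serve_many() -> anyhow::Result<()> {
    let t = asr::Model::from_pretrained("nytopop/zipformer-en-2023-06-21-320ms")
        .await?
        .num_threads(2)
        .build()?;

    // one incremental stream per client session, all sharing the same model
    let mut streams: Vec<_> = (0..8)
        .map(|_| t.phased_stream(1))
        .collect::<Result<_, _>>()?;

    loop {
        for s in streams.iter_mut() {
            // feed whatever samples arrived for this client (silence here)
            s.accept_waveform(16_000, &vec![0.0; 512]);
            s.decode();

            // forward the running transcript to the client
            let (_epoch, _transcript) = s.result()?;
        }
    }
}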

feature flags

Default features:

  • static: compile and link sherpa-onnx statically
  • download-models: enable loading pretrained transducers from Hugging Face

Features disabled by default:

  • cuda: enable CUDA compute provider support (requires CUDA 11.8; 12.x will not bring you joy and happiness)
  • directml: enable DirectML compute provider support (entirely untested, but theoretically works)
  • download-binaries: download prebuilt sherpa-onnx object files instead of building from source
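
As an example, a non-default feature can be enabled in Cargo.toml like any cargo feature (the version number is illustrative; pin whatever release you actually depend on):

[dependencies]
sherpa-transducers = { version = "0.5", features = ["cuda"] }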
