
bin+lib pipescribe

A tool for real-time transcription of audio streams using Whisper and PipeWire

1 unstable release

Uses new Rust 2024

new 0.1.0 May 1, 2025

#309 in Audio

GPL-3.0-or-later

10MB
811 lines

Pipescribe


Pipescribe is a real-time audio transcription tool that captures audio from PipeWire sources and transcribes it using the Whisper speech recognition model.

Features

  • Capture audio from any PipeWire source (automatic source detection, or manual selection via regex patterns)
  • Real-time speech-to-text transcription
  • Support for multiple languages

Requirements

  • Rust (cargo)
  • PipeWire
  • Whisper model files

Installation

  1. Clone the repository:
git clone https://github.com/mz2/pipescribe.git
cd pipescribe
  2. Download a Whisper model file:
mkdir -p models
./bin/download-ggml-models.sh medium.en

(The model download script is lifted from whisper.cpp: https://github.com/ggml-org/whisper.cpp/blob/master/models/download-ggml-model.sh)

  3. Build the project:
cargo build --release

Usage

Basic usage:

pipescribe --buffer-seconds 5 --model ./models/ggml-medium.en.bin

Or, to try out a local build:

cargo run --bin pipescribe -- --buffer-seconds 5 --model ./models/ggml-medium.en.bin
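
The `--buffer-seconds` flag suggests that audio is collected into fixed-length windows before being passed to Whisper. The following is a minimal sketch of that idea, not pipescribe's actual implementation: the `ChunkBuffer` type and its methods are hypothetical, and the 16 kHz mono sample rate is an assumption based on the input format Whisper models expect.

```rust
/// Hypothetical fixed-window sample buffer (illustrative only, not pipescribe's API).
struct ChunkBuffer {
    /// Samples received so far that do not yet fill a window.
    samples: Vec<f32>,
    /// Number of samples in one full window.
    capacity: usize,
}

impl ChunkBuffer {
    /// `sample_rate` is in Hz; 16_000 is the assumed rate for Whisper input.
    fn new(buffer_seconds: usize, sample_rate: usize) -> Self {
        let capacity = buffer_seconds * sample_rate;
        // A zero-length window would never drain, so reject it up front.
        assert!(capacity > 0, "window must hold at least one sample");
        Self {
            samples: Vec::with_capacity(capacity),
            capacity,
        }
    }

    /// Appends incoming samples and returns every full window now available.
    /// Leftover samples stay buffered for the next call.
    fn push(&mut self, incoming: &[f32]) -> Vec<Vec<f32>> {
        self.samples.extend_from_slice(incoming);
        let mut windows = Vec::new();
        while self.samples.len() >= self.capacity {
            // Keep the first `capacity` samples as a window, retain the rest.
            let rest = self.samples.split_off(self.capacity);
            windows.push(std::mem::replace(&mut self.samples, rest));
        }
        windows
    }
}
```

Under these assumptions, feeding twelve one-second packets of 16,000 samples into `ChunkBuffer::new(5, 16_000)` yields two full five-second windows, with two seconds of audio left pending. A larger buffer gives the model more context per window at the cost of higher latency before text appears.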
