app piper-mcp-server

Text-to-speech MCP server using Piper ONNX models

1 unstable release

0.1.0 Mar 7, 2026

MIT license

52KB
1K SLoC

piper-mcp-server

A standalone binary that exposes a synthesize tool over MCP (Model Context Protocol), speaking JSON-RPC 2.0 over either stdio or HTTP transport.

Features

  • Tool synthesize — accepts text, returns base64-encoded OGG/Opus audio
  • ONNX inference — runs Piper VITS models via ONNX Runtime
  • Phonemization — automatic IPA conversion via espeak-ng
  • Native audio encoding — PCM to OGG/Opus via statically linked libopus; no ffmpeg required
  • Multi-speaker — speaker selection for multi-speaker models via --speaker
  • Optional CUDA — GPU acceleration via ONNX Runtime CUDA provider
  • HTTP transport — MCP Streamable HTTP with Bearer token authentication and session management
  • Dual transport — stdio (default) or HTTP mode via --transport flag

Architecture

Single crate, five source modules:

Module              Responsibility
src/main.rs         CLI parsing (clap), config loading, ONNX session, entry point, transport selection
src/mcp.rs          JSON-RPC 2.0 dispatch, MCP tool result helpers, stdio read/write loop
src/http.rs         HTTP transport: axum server, Bearer auth, session management
src/synthesize.rs   Phonemization (espeak-ng), phoneme-to-ID mapping, ONNX inference, full pipeline
src/encode.rs       f32-to-i16, resampling, OGG/Opus encoding (opus + ogg crates)
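
As a rough illustration of the f32-to-i16 step listed for src/encode.rs, the sketch below scales normalized floating-point samples to 16-bit PCM. The function name f32_to_i16 and the clamp-then-round policy are assumptions made for this example; the crate's actual conversion may differ.

```rust
/// Convert normalized f32 samples (nominally in [-1.0, 1.0]) to 16-bit PCM.
/// Hypothetical helper: out-of-range values are clamped instead of wrapping.
fn f32_to_i16(samples: &[f32]) -> Vec<i16> {
    samples
        .iter()
        .map(|&s| {
            // Clamp first so that overshoot from the model cannot overflow.
            let clamped = s.clamp(-1.0, 1.0);
            // Scale symmetrically by i16::MAX, then round to the nearest integer.
            (clamped * i16::MAX as f32).round() as i16
        })
        .collect()
}
```

Scaling by i16::MAX on both sides keeps the mapping symmetric, at the cost of never producing -32768; that is a common trade-off, not something the source specifies.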

CLI Arguments

piper-mcp-server --model <PATH> [OPTIONS]

Options:
  --model <PATH>         Path to Piper ONNX model file (.onnx) [required]
  --config <PATH>        Path to Piper voice config (.onnx.json) [default: {model}.json]
  --device <DEVICE>      Device: "cpu" or "cuda" [default: cpu]
  --speaker <NAME>       Speaker name for multi-speaker models
  --espeak <PATH>        Path to espeak-ng binary [default: espeak-ng]
  --transport <MODE>     Transport mode: stdio or http [default: stdio]
  --host <HOST>          Host to bind HTTP server [default: 127.0.0.1]
  --port <PORT>          Port for HTTP server [default: 8080]
  --auth <TOKEN>         Bearer token for HTTP authentication (optional)
  --version              Print version and exit
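
The [default: {model}.json] note on --config can be read as "append .json to the model path", which matches the .onnx.json suffix used by Piper voice configs. A minimal sketch of that resolution, assuming this reading is correct (default_config_path and resolve_config are hypothetical names, not taken from the crate):

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Hypothetical helper: derive the default config path by appending ".json"
/// to the full model path, so "voice.onnx" becomes "voice.onnx.json".
fn default_config_path(model: &Path) -> PathBuf {
    let mut raw: OsString = model.as_os_str().to_os_string();
    raw.push(".json");
    PathBuf::from(raw)
}

/// Use the explicit --config value when given, otherwise fall back to the default.
fn resolve_config(model: &Path, config: Option<&Path>) -> PathBuf {
    match config {
        Some(explicit) => explicit.to_path_buf(),
        None => default_config_path(model),
    }
}
```

Appending to the OsString rather than calling set_extension keeps the .onnx part intact; set_extension("json") would replace it and yield voice.json instead.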

Build

Prerequisites

  • Rust toolchain (stable)
  • CMake (ONNX Runtime build dependency)
  • libclang (bindgen dependency)

Runtime Dependencies

  • espeak-ng — phonemization backend
  • Piper model + config — download from Piper voices

Build with Nix

nix develop    # sets up all build and runtime deps
cargo build --release

Build with Cargo

cargo build --release

Ensure cmake, libclang, and the OpenSSL development package are available on your system. libopus and ONNX Runtime are statically linked from vendored or prebuilt sources.

Rust Dependencies

Crate                         Purpose
ort                           ONNX Runtime bindings for Piper model inference
clap                          CLI argument parsing
serde, serde_json             JSON serialization for MCP protocol and Piper config
base64                        Encoding audio output as base64
opus                          Opus audio encoding (libopus bindings)
ogg                           OGG container encoding (pure Rust)
hound                         WAV encoding (tests only)
tracing, tracing-subscriber   Structured logging to stderr
axum                          HTTP server framework for MCP HTTP transport
tokio                         Async runtime for HTTP transport
uuid                          Session ID generation (UUID v4)

MCP Protocol

The server supports two transport modes:

  • stdio (default) — communicates over stdin/stdout, one JSON object per line
  • HTTP — MCP Streamable HTTP on POST /mcp and DELETE /mcp

HTTP Transport

Start the server in HTTP mode:

piper-mcp-server --model model.onnx --transport http --port 8080 --auth secret123

Authentication: when --auth is set, all requests must include Authorization: Bearer <token>. Without --auth, authentication is disabled.
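
The authentication rule above can be sketched as a small predicate over the Authorization header. This is an illustrative, framework-free version; is_authorized and constant_time_eq are hypothetical names, and the constant-time comparison is a design choice of this sketch, not a statement about how the server compares tokens.

```rust
/// Hypothetical check: `expected_token` is the --auth value (None when the
/// flag is absent), `auth_header` is the raw Authorization header, if any.
fn is_authorized(auth_header: Option<&str>, expected_token: Option<&str>) -> bool {
    // Without --auth, authentication is disabled and every request passes.
    let Some(expected) = expected_token else {
        return true;
    };
    // With --auth, the header must be present and carry "Bearer <token>".
    match auth_header.and_then(|h| h.strip_prefix("Bearer ")) {
        Some(presented) => constant_time_eq(presented.as_bytes(), expected.as_bytes()),
        None => false,
    }
}

/// Compare two byte strings without exiting early on the first mismatch.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}
```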

Sessions: the response to the initialize request carries an Mcp-Session-Id header. Every subsequent request must send that header back. A session is terminated with DELETE /mcp.
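
The session rules can be modeled as a set of active IDs that is consulted on every non-initialize request. The sketch below is an assumption about the shape of such bookkeeping, not the crate's code: SessionStore and SessionCheck are hypothetical names, and the IDs are passed in by the caller, where the real server would generate a UUID v4.

```rust
use std::collections::HashSet;

/// Outcome of validating the Mcp-Session-Id header for one request.
#[derive(Debug, PartialEq)]
enum SessionCheck {
    Allowed,
    MissingHeader,
    UnknownSession,
}

/// Hypothetical in-memory store of active session IDs.
#[derive(Default)]
struct SessionStore {
    active: HashSet<String>,
}

impl SessionStore {
    /// Register a new session (on initialize); returns false if the ID already exists.
    fn open(&mut self, id: &str) -> bool {
        self.active.insert(id.to_string())
    }

    /// Terminate a session (on DELETE /mcp); returns false if it was not active.
    fn close(&mut self, id: &str) -> bool {
        self.active.remove(id)
    }

    /// initialize needs no session; every other method needs a known ID.
    fn check(&self, method: &str, header: Option<&str>) -> SessionCheck {
        if method == "initialize" {
            return SessionCheck::Allowed;
        }
        match header {
            None => SessionCheck::MissingHeader,
            Some(id) if self.active.contains(id) => SessionCheck::Allowed,
            Some(_) => SessionCheck::UnknownSession,
        }
    }
}
```

In a concurrent HTTP server this store would need to sit behind a lock or similar synchronization; that part is left out here.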

Example session:

# Initialize (get session ID)
curl -s -D- -X POST http://127.0.0.1:8080/mcp \
  -H "Authorization: Bearer secret123" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'

# Use the Mcp-Session-Id from the response headers for subsequent requests
curl -s -X POST http://127.0.0.1:8080/mcp \
  -H "Authorization: Bearer secret123" \
  -H "Content-Type: application/json" \
  -H "Mcp-Session-Id: <session-id>" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"synthesize","arguments":{"text":"Hello"}}}'

# Terminate session
curl -s -X DELETE http://127.0.0.1:8080/mcp \
  -H "Authorization: Bearer secret123" \
  -H "Mcp-Session-Id: <session-id>"

Stdio Transport

Initialize

Request:

{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}

Response:

{"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2024-11-05","capabilities":{"tools":{}},"serverInfo":{"name":"piper-mcp-server","version":"<version>"}}}

List tools

Request:

{"jsonrpc":"2.0","id":2,"method":"tools/list"}

Response:

{"jsonrpc":"2.0","id":2,"result":{"tools":[{"name":"synthesize","description":"Synthesize speech from text using Piper TTS","inputSchema":{"type":"object","properties":{"text":{"type":"string","description":"Text to synthesize into speech"}},"required":["text"]}}]}}
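
The initialize and tools/list exchanges above suggest a dispatch on the method name that wraps a result in a JSON-RPC 2.0 envelope. The following is a minimal std-only sketch of that idea. It assumes the request has already been parsed into a numeric id and a method string; the function names are hypothetical, and the real server uses serde_json instead of string formatting. The error code -32601 is the standard JSON-RPC "Method not found" code.

```rust
/// Result payload copied from the tools/list response shown above.
const TOOLS_LIST_RESULT: &str = r#"{"tools":[{"name":"synthesize","description":"Synthesize speech from text using Piper TTS","inputSchema":{"type":"object","properties":{"text":{"type":"string","description":"Text to synthesize into speech"}},"required":["text"]}}]}"#;

/// Wrap an already-serialized result in a JSON-RPC 2.0 success envelope.
fn success(id: u64, result_json: &str) -> String {
    format!(r#"{{"jsonrpc":"2.0","id":{id},"result":{result_json}}}"#)
}

/// Build a JSON-RPC 2.0 error envelope.
/// Assumption: `message` contains no characters that need JSON escaping.
fn error(id: u64, code: i32, message: &str) -> String {
    format!(r#"{{"jsonrpc":"2.0","id":{id},"error":{{"code":{code},"message":"{message}"}}}}"#)
}

/// Hypothetical dispatcher covering the two read-only methods documented above.
fn dispatch(id: u64, method: &str, version: &str) -> String {
    match method {
        "initialize" => success(
            id,
            &format!(
                r#"{{"protocolVersion":"2024-11-05","capabilities":{{"tools":{{}}}},"serverInfo":{{"name":"piper-mcp-server","version":"{version}"}}}}"#
            ),
        ),
        "tools/list" => success(id, TOOLS_LIST_RESULT),
        _ => error(id, -32601, "Method not found"),
    }
}
```

JSON-RPC ids may also be strings or null; restricting them to u64 keeps the sketch short.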

Synthesize

Request:

{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"synthesize","arguments":{"text":"Hello, world!"}}}

Response:

{"jsonrpc":"2.0","id":3,"result":{"content":[{"type":"text","text":"<base64-encoded OGG/Opus audio>"}]}}
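
On the client side, the text field of the synthesize result has to be base64-decoded before it can be written to an .ogg file. The sketch below shows one way to do that with the standard library only, assuming the field has already been extracted with a JSON parser. The hand-rolled base64_decode is for illustration (the base64 crate would be the usual choice), and looks_like_ogg is a hypothetical sanity check based on the "OggS" capture pattern that starts every Ogg page.

```rust
/// Decode standard-alphabet base64. Lenient sketch: trailing '=' padding is
/// simply stripped, and any other invalid input yields None.
fn base64_decode(input: &str) -> Option<Vec<u8>> {
    fn value(c: u8) -> Option<u32> {
        match c {
            b'A'..=b'Z' => Some((c - b'A') as u32),
            b'a'..=b'z' => Some((c - b'a') as u32 + 26),
            b'0'..=b'9' => Some((c - b'0') as u32 + 52),
            b'+' => Some(62),
            b'/' => Some(63),
            _ => None,
        }
    }

    let bytes = input.trim_end_matches('=').as_bytes();
    let mut out = Vec::with_capacity(bytes.len() * 3 / 4);
    for chunk in bytes.chunks(4) {
        // A single leftover character cannot encode a whole byte.
        if chunk.len() == 1 {
            return None;
        }
        // Pack up to four 6-bit values into one 24-bit group.
        let mut acc = 0u32;
        for &c in chunk {
            acc = (acc << 6) | value(c)?;
        }
        // Left-align partial groups so the byte extraction below is uniform.
        acc <<= 6 * (4 - chunk.len()) as u32;
        let decoded = [(acc >> 16) as u8, (acc >> 8) as u8, acc as u8];
        out.extend_from_slice(&decoded[..chunk.len() - 1]);
    }
    Some(out)
}

/// Hypothetical sanity check: Ogg streams begin with the bytes "OggS".
fn looks_like_ogg(bytes: &[u8]) -> bool {
    bytes.starts_with(b"OggS")
}
```

The decoded bytes can then be written to disk with std::fs::write and played by any Opus-capable player.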

CI/CD

GitHub Actions workflows:

  • CI (ci.yml) — runs cargo fmt, cargo clippy, and cargo test on every push or pull request to main or develop
  • Release (release.yml) — builds binaries for 3 targets on tag push (v*), uploads as release assets

Release targets:

Artifact       Build method    Notes
linux-x86_64   nix (default)   glibc, CPU only
macos-x86_64   nix (default)   Intel Mac
macos-arm64    nix (default)   Apple Silicon
Release process:

  1. Create a git tag: git tag vX.Y.Z && git push --tags
  2. CI builds binaries for all targets via nix
  3. Create a GitHub release from the tag — CI attaches build artifacts automatically

To update cargoHash in flake.nix after changing dependencies:

./scripts/update-cargo-hash.sh

Usage

Claude Desktop (stdio)

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "piper": {
      "command": "/path/to/piper-mcp-server",
      "args": ["--model", "/path/to/en_US-lessac-medium.onnx"]
    }
  }
}

HTTP mode

piper-mcp-server --model /path/to/model.onnx --transport http --port 8080 --auth mytoken

Connect any HTTP-capable MCP client to http://127.0.0.1:8080/mcp. All requests require Authorization: Bearer mytoken and Content-Type: application/json. See HTTP Transport for protocol details.

Any MCP client (stdio)

The server reads JSON-RPC requests from stdin and writes responses to stdout. Logs go to stderr. Connect any MCP-compatible client using stdio transport.
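
The stdio contract described here (one JSON object per line in, one per line out, logs kept off stdout) can be sketched as a generic read/write loop. This is an illustrative outline, not the code in src/mcp.rs: serve_lines is a hypothetical name, and the handler is a stand-in for the real JSON-RPC dispatch. Returning None from the handler models a message that gets no reply, such as a JSON-RPC notification.

```rust
use std::io::{self, BufRead, Write};

/// Read newline-delimited requests from `input`, pass each to `handle`,
/// and write every response as a single line to `output`.
fn serve_lines<R, W, F>(input: R, mut output: W, handle: F) -> io::Result<()>
where
    R: BufRead,
    W: Write,
    F: Fn(&str) -> Option<String>,
{
    for line in input.lines() {
        let line = line?;
        let request = line.trim();
        // Skip blank lines instead of treating them as malformed requests.
        if request.is_empty() {
            continue;
        }
        if let Some(response) = handle(request) {
            writeln!(output, "{response}")?;
            // Flush per message so a waiting client sees the reply immediately.
            output.flush()?;
        }
    }
    Ok(())
}
```

In a real binary this would be called with io::stdin().lock() and io::stdout().lock(); taking generic readers and writers keeps the loop testable with in-memory buffers.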
