2 releases

0.1.12 Oct 27, 2023
0.1.0 Oct 26, 2023

MIT license

whisperd 🎙️

A simple HTTP server written in Rust for the OpenAI Whisper speech-to-text model.

Features ✨

  • 🎧 Transcribe audio files
  • 🔄 OpenAI API compatibility
  • 🌈 Models
    • tiny.en
    • tiny
    • base.en
    • base
    • small.en
    • small
    • medium.en
    • medium
    • large
    • large-v1
  • 🌎 Languages
    • 🇬🇧 English (en)
    • 🇨🇳 Chinese (zh)
    • 🇩🇪 German (de)
    • 🇪🇸 Spanish (es)
    • 🇷🇺 Russian (ru)

Quickstart 🚀

  1. Clone this repository:
git clone https://github.com/tiero/whisperd.git
  2. Navigate to the repository and build:
cd whisperd
cargo build --release
  3. Run the server:
./target/release/whisperd serve --model_path path_to_whisper_model

Now, the server is running at http://localhost:8000 and ready to transcribe!
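
Because the server advertises OpenAI API compatibility, a request could look like the sketch below. The `/v1/audio/transcriptions` path and the `file` and `model` form fields are borrowed from OpenAI's transcription API and are assumptions here; they are not confirmed for whisperd. The helper names and `sample.wav` are placeholders.

```shell
#!/bin/sh
# Sketch only: the endpoint path and form fields follow OpenAI's
# audio transcription API and are assumed, not confirmed, for whisperd.

# Build the endpoint URL from a host and port (defaults: localhost, 8000).
whisperd_url() {
    printf 'http://%s:%s/v1/audio/transcriptions\n' "${1:-localhost}" "${2:-8000}"
}

# Upload an audio file as multipart/form-data and print the response.
# "whisper-1" is the model name OpenAI's API expects; whisperd may
# ignore it or expect a different value.
whisperd_request() {
    curl -sS "$(whisperd_url)" \
        -F "file=@$1" \
        -F "model=${2:-whisper-1}"
}

# Usage: whisperd_request sample.wav
```

If the server was started with a different `--port`, pass the host and port to `whisperd_url` accordingly.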

Usage 🛠️

CLI Commands

  • Start the transcription server:
whisperd serve --port 5000 --model_path <path_to_model>
  • Transcribe a given audio file (the model is downloaded automatically from Hugging Face):
whisperd transcribe --audio <path_to_audio>

For more advanced options, use:

whisperd --help
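
To transcribe several files in one go, the `transcribe` command can be wrapped in a small loop. This is a minimal sketch that uses only the `--audio` flag shown above; the `transcribe_dir` helper, the `WHISPERD` variable, and the `.wav` extension are illustrative choices, not part of whisperd.

```shell
#!/bin/sh
# Sketch: run `whisperd transcribe` on every .wav file in a directory.
# WHISPERD can be overridden (e.g. WHISPERD=echo) for a dry run.
WHISPERD="${WHISPERD:-whisperd}"

transcribe_dir() {
    for f in "$1"/*.wav; do
        # Skip the unexpanded pattern when the directory has no .wav files.
        [ -e "$f" ] || continue
        "$WHISPERD" transcribe --audio "$f"
    done
}

# Usage: transcribe_dir ./recordings
```

Setting `WHISPERD=echo` before calling `transcribe_dir` prints the commands instead of running them, which is a cheap way to check the file list first.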

Contribution 🤝

Pull requests and issues are welcome!

License 📜

This project is licensed under the MIT License - see the LICENSE file for details.
