3 releases
Version | Released |
---|---|
0.1.5 | Sep 13, 2023 |
0.1.4 | May 31, 2023 |
0.1.0 | May 31, 2023 |
This project is an attempt to build a simple Whisper CLI in Rust as a replacement for the base Python one. It uses whisper.cpp under the hood, which makes it significantly faster on M1 systems.
Installation
You can download the binary corresponding to your OS from the latest release, or build it from scratch with cargo install whisper_cli.
Run from anywhere
Put the whisper binary in /usr/local/bin on Unix systems (macOS/Linux) and make sure it is executable (run chmod +x whisper in a terminal).
Close and reopen the terminal, then test it by typing whisper --help. It should output the following.
Usage
$ whisper --help
Generate a transcript of an audio file using the Whisper speech-to-text engine. The transcript will be saved as a .txt, .vtt, and .srt file in the same directory as the audio file.
Usage: whisper [OPTIONS] <AUDIO>
Arguments:
<AUDIO> Path to the audio file to transcribe
Options:
-m, --model <MODEL>
Name of the Whisper model to use
[default: medium]
[possible values: tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large, large-v1]
-l, --lang <LANG>
Language spoken in the audio. Attempts to auto-detect by default
[possible values: auto, en, zh, de, es, ru, ko, fr, ja, pt, tr, pl, ca, nl, ar, sv, it, id, hi, fi, vi, he, uk, el, ms, cs, ro, da, hu, ta, no, th, ur, hr, bg, lt, la, mi, ml, cy, sk, te, fa, lv, bn, sr, az, sl, kn, et, mk, br, eu, is, hy, ne, mn, bs, kk, sq, sw, gl, mr, pa, si, km, sn, yo, so, af, oc, ka, be, tg, sd, gu, am, yi, lo, uz, fo, ht, ps, tk, nn, mt, sa, lb, my, bo, tl, mg, as, tt, haw, ln, ha, ba, jw, su]
-t, --translate
Toggle translation
-k, --karaoke
Generate timestamps for each word
-h, --help
Print help information (use `-h` for a summary)
-V, --version
Print version information
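The help text says the transcript is saved as .txt, .vtt, and .srt files in the same directory as the audio file. A minimal sketch of how such output paths could be derived with the Rust standard library is shown here. The helper name transcript_paths and the sample file name are hypothetical, and the sketch assumes the audio extension is swapped for each transcript extension rather than appended; the project's actual naming may differ.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not taken from the project's source): returns the
/// .txt, .vtt, and .srt paths that sit next to the given audio file.
/// Assumption: the audio extension is replaced, not appended to.
fn transcript_paths(audio: &Path) -> Vec<PathBuf> {
    ["txt", "vtt", "srt"]
        .iter()
        .map(|ext| audio.with_extension(ext))
        .collect()
}

fn main() {
    // Sample path for illustration only.
    let audio = Path::new("recordings/interview.mp3");
    for path in transcript_paths(audio) {
        println!("{}", path.display());
    }
}
```

Because with_extension only replaces the final extension, the directory component is preserved, which matches the "same directory as the audio file" behavior described in the help text.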
Develop
Make sure you have the latest version of Rust installed (use rustup). Then you can build the project by running cargo build, and run it with cargo run.
License
This project is licensed under the MIT License - see the LICENSE file for details.