#voice #speech-quality

visqol-rs

The Visqol v3.1 algorithm for speech quality evaluation in Rust

4 releases

0.1.3 Dec 17, 2024
0.1.2 Aug 17, 2024
0.1.1 Dec 12, 2023
0.1.0 Jul 17, 2023

#578 in Audio

138 downloads per month
Used in visqol

Apache-2.0

195KB
5K SLoC

Visqol-RS

  • Implementation of the ViSQOL v3.1 algorithm for speech quality evaluation in Rust
  • Compute ViSQOL scores from within your Rust code. Compile in release mode for reasonable run times.

Audience

  • Researchers, engineers, and academics working in speech enhancement and perceptual audio evaluation.

Build instructions

  • You will need the stable Rust toolchain.
  • MSRV: 1.83
  • So far, the library builds successfully on macOS 10.15, Ubuntu under WSL2, and Windows.

Example

use visqol_rs::*;

fn main() {
    // Reference (clean) recording and the degraded recording to compare against it.
    let path_to_reference_file = "./test_data/clean_speech/reference_signal.wav";
    let path_to_degraded_file = "./test_data/clean_speech/degraded_signal.wav";

    // Build a manager from the speech-mode configuration.
    let config = visqol_config::VisqolConfig::get_speech_mode_config();
    let mut visqol = visqol_manager::VisqolManager::from_config(&config);

    // Run the comparison; `unwrap` panics if the run fails.
    let similarity_result = visqol.run(path_to_reference_file, path_to_degraded_file).unwrap();
    println!("Mean objective score for degraded file {}: {}", path_to_degraded_file, similarity_result.moslqo);
}
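
The example above calls `unwrap`, which panics if the run fails. The sketch below shows one way to report a failure instead. It is illustrative only: `report_score` is a hypothetical helper, not part of visqol-rs, and it assumes that `moslqo` is an `f64` and that the error type implements `Debug` (which the `unwrap` call already requires). Stand-in values are used so the snippet runs without the crate.

```rust
use std::fmt::Debug;

/// Hypothetical helper (not part of visqol-rs): turns the outcome of a
/// scoring run into a printable line instead of panicking on failure.
/// Assumes the score is an `f64` and the error type implements `Debug`.
fn report_score<E: Debug>(degraded: &str, outcome: Result<f64, E>) -> String {
    match outcome {
        Ok(moslqo) => format!("MOS-LQO for {degraded}: {moslqo:.3}"),
        Err(err) => format!("ViSQOL failed for {degraded}: {err:?}"),
    }
}

fn main() {
    // Stand-in outcomes; real code would pass the result of the run,
    // mapped to its score, in place of these values.
    let ok: Result<f64, String> = Ok(4.25);
    let failed: Result<f64, String> = Err("could not open file".to_string());

    println!("{}", report_score("degraded_signal.wav", ok));
    println!("{}", report_score("degraded_signal.wav", failed));
}
```

With the crate in scope, the call would presumably look like `report_score(path_to_degraded_file, visqol.run(path_to_reference_file, path_to_degraded_file).map(|r| r.moslqo))`.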

Notes

  • Compile this library in release mode for reasonable computation times. The gammatone filterbank and the spectrogram computed from it are computationally expensive, so ViSQOL tends to be rather slow in debug mode.
  • This is a spare-time project. Please expect delays in responses to issues and pull requests.
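
Since the build mode matters this much, a program can warn when it is run from a debug build and measure how long a run takes. The following is a minimal sketch using only the standard library; `debug_build_warning` and `timed` are hypothetical helpers, not part of visqol-rs. It uses `cfg!(debug_assertions)` as a proxy for "debug build", which holds for Cargo's default profiles but can be overridden in a custom profile.

```rust
use std::time::{Duration, Instant};

/// Returns a warning when debug assertions are enabled, which is the
/// default for `cargo build` and `cargo run` without `--release`.
fn debug_build_warning(debug_assertions_enabled: bool) -> Option<&'static str> {
    if debug_assertions_enabled {
        Some("debug build detected: ViSQOL will be slow; rebuild with `cargo run --release`")
    } else {
        None
    }
}

/// Runs `work` once and returns its result together with the elapsed
/// wall-clock time.
fn timed<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = work();
    (value, start.elapsed())
}

fn main() {
    if let Some(warning) = debug_build_warning(cfg!(debug_assertions)) {
        eprintln!("{warning}");
    }

    // Stand-in workload; replace the closure body with the ViSQOL call
    // being measured.
    let (sum, elapsed) = timed(|| (0..1_000_000u64).sum::<u64>());
    println!("stand-in workload result {sum} took {elapsed:?}");
}
```

Running the same binary once with `cargo run` and once with `cargo run --release` gives a quick way to compare the two build modes on your own machine.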

Papers

I highly encourage you to get familiar with ViSQOL by reading these papers:

Acknowledgement

  • This project was mainly an exercise for me to learn Rust; the algorithm itself is not my work. I'd like to thank Jan Skoglund, Michael Chinen and Andrew Hines for their tremendous effort and innovation in the field of perceptual audio evaluation.

Dependencies

~69MB
~1M SLoC