#asr #vk #webrtc #speech-to-text #speech-recognision

app wacr

The backend for WebRTC -> VK ASR (speech recognition technology) interaction

9 stable releases

2.2.1 Nov 12, 2022
2.1.0 Oct 25, 2022
1.6.1 Oct 10, 2022

#6 in #vk-api


252 downloads per month

MIT/Apache

52KB
1K SLoC

WACR

WACR is the backend for WebRTC -> VK ASR (speech recognition technology) interaction.

How it works

The client connects to the WACR backend over WebRTC. WACR saves the incoming audio stream to the file system and sends the saved audio to the VK ASR backend through its API. WACR then keeps the recognized text in memory and returns it to the client.
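Seen from the client, this flow maps onto three HTTP calls, which are documented in the API section of this README. The Python sketch below is illustrative only: the function names, the `opener` parameter, and the `BASE_URL` constant are assumptions introduced here, and the WebRTC audio streaming itself is left to the caller.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # assumed: the address used in this README's examples


def post_json(url, payload, opener=urllib.request.urlopen):
    """POST `payload` as JSON and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))


def with_token(url, access_token):
    """Append the access_token query parameter used by the session endpoints."""
    return url + "?" + urllib.parse.urlencode({"access_token": access_token})


def generate_token(query, base_url=BASE_URL, opener=urllib.request.urlopen):
    """Step 1: exchange the launch-parameters query for a JWT."""
    reply = post_json(f"{base_url}/token/generate", {"query": query}, opener)
    return reply["token"]


def create_session(access_token, offer, base_url=BASE_URL, opener=urllib.request.urlopen):
    """Step 2: send the local WebRTC offer; the reply carries session_id and offer."""
    url = with_token(f"{base_url}/session/create", access_token)
    return post_json(url, {"offer": offer}, opener)


def recognise(access_token, session_id, base_url=BASE_URL, opener=urllib.request.urlopen):
    """Step 3: once audio has been streamed over WebRTC, ask for the text."""
    url = with_token(f"{base_url}/session/asr", access_token)
    return post_json(url, {"session_id": session_id}, opener)["text"]
```

The caller is expected to complete the WebRTC handshake and stream audio between `create_session` and `recognise`. The `opener` argument exists only so that the network call can be replaced in tests.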

Usage

Install with Cargo and run the binary

cargo install wacr
RUST_LOG=debug VK_API_SERVICE_TOKEN=XXX VK_API_SERVICE_KEY=YYY wacr

Compile and run with Cargo

RUST_LOG=debug VK_API_SERVICE_TOKEN=XXX VK_API_SERVICE_KEY=YYY cargo run --package wacr --bin wacr

Compile with Cargo and run the binary

cargo build --package wacr --bin wacr --release
RUST_LOG=debug VK_API_SERVICE_TOKEN=XXX VK_API_SERVICE_KEY=YYY ./target/release/wacr

API

Get JWT Token

The query must be taken from the VK Mini Apps launch parameters.

Request

POST http://127.0.0.1:8080/token/generate
Content-Type: application/json

{
  "query": "XXX"
}

Response

{
  "token": "xxx",
  "expiration": 1664718489
}
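VK Mini Apps receive their launch parameters in the query string of the URL the app is opened with, and VK's documentation describes a `sign` parameter computed as an HMAC-SHA256 over the sorted `vk_*` parameters. The sketch below shows how a client might extract that query and how such a signature can be checked. It assumes that `query` is the raw query string and that the secret is the key passed as VK_API_SERVICE_KEY. Whether WACR validates the query in exactly this way is not confirmed by this README, and the function names are illustrative.

```python
import base64
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode, urlparse


def launch_query(launch_url):
    """Return the raw query string of the URL the mini app was opened with."""
    return urlparse(launch_url).query


def is_valid_sign(query, secret_key):
    """Check `sign` against an HMAC-SHA256 of the sorted vk_* parameters."""
    params = dict(parse_qsl(query, keep_blank_values=True))
    vk_params = sorted((k, v) for k, v in params.items() if k.startswith("vk_"))
    digest = hmac.new(
        secret_key.encode("utf-8"),
        urlencode(vk_params).encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # URL-safe base64 without the trailing padding character.
    expected = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    provided = params.get("sign", "")
    return hmac.compare_digest(expected.encode("utf-8"), provided.encode("utf-8"))
```

The string returned by `launch_query` is what would go into the `"query"` field of the request body under that assumption.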

Create session

Creates a WebRTC connection. The access_token must be obtained from the Get JWT Token API. The offer is the client's local WebRTC offer.

Request

POST http://127.0.0.1:8080/session/create?access_token=XXX
Content-Type: application/json

{
  "offer": {}
}

Response

{
  "session_id": "a3b26e68-7fda-4534-bbdd-92a98230a824",
  "offer": {}
}

Recognise the speech

Starts recognition of the speech received in a session made with Create session. The access_token must be obtained from the Get JWT Token API.

Request

POST http://127.0.0.1:8080/session/asr?access_token=XXX
Content-Type: application/json

{
  "session_id": "a3b26e68-7fda-4534-bbdd-92a98230a824"
}

Response

{
  "text": "Hello world!"
}

Listen to recorded audio

GET http://127.0.0.1:8080/session/listen/{session_id}?access_token=XXX
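A minimal Python sketch of calling this endpoint and saving the reply to disk. The function names, the `opener` parameter, and the default base URL are assumptions made for illustration. The README does not state the audio format, so the bytes are written as received.

```python
import shutil
import urllib.parse
import urllib.request


def listen_url(session_id, access_token, base_url="http://127.0.0.1:8080"):
    """Build the listen URL, escaping the path segment and the token."""
    path = urllib.parse.quote(session_id, safe="")
    query = urllib.parse.urlencode({"access_token": access_token})
    return f"{base_url}/session/listen/{path}?{query}"


def save_audio(session_id, access_token, target_path, opener=urllib.request.urlopen):
    """Stream the recorded audio to `target_path` in chunks."""
    url = listen_url(session_id, access_token)
    with opener(url) as response, open(target_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return target_path
```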

Possible errors

Base Error Response

{
  "error": "error occurred"
}
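A client can treat any JSON reply that contains an `error` field as a failure. The sketch below is one way to do that in Python. `WacrError` and `parse_reply` are hypothetical names, and the README does not say which HTTP status codes accompany an error body.

```python
import json


class WacrError(Exception):
    """Raised when a reply carries the base error shape {"error": "..."}."""


def parse_reply(body):
    """Decode a JSON reply and raise WacrError if it is an error response."""
    payload = json.loads(body)
    if isinstance(payload, dict) and "error" in payload:
        raise WacrError(payload["error"])
    return payload
```

With `urllib`, a non-2xx status raises `urllib.error.HTTPError`. Its body can be read with `.read()` and passed to `parse_reply` in the same way.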

Startup environment variables

Required

VK_API_SERVICE_TOKEN=XXX # Service token for requesting VK API endpoints
VK_API_SERVICE_KEY=YYY # Service key for validating query on token generation

Optional

LISTEN_ADDRESS=127.0.0.1:8080 # Listening address
JWT_EXPIRATION=3600 # Number of seconds the access token stays valid
GARBAGE_COLLECTOR_TTL=3600 # Number of seconds audio files and text results are kept
SESSION_KEEP_ALIVE_TIMEOUT=10 # Number of seconds a WebRTC session stays alive without incoming packets
SESSION_TOTAL_TIMEOUT=100 # Maximum number of seconds a WebRTC session stays alive
AUDIO_DIR=/tmp # Directory where audio files are saved
WEBRTC_PORT_MIN=0 # Lowest port available for WebRTC peer connections
WEBRTC_PORT_MAX=0 # Highest port available for WebRTC peer connections
WEBRTC_INTERFACES_ALLOWED= # Comma-separated list of allowed network interfaces. All interfaces are allowed by default
STATIC_DIR= # If set, the service serves static files from this directory under /static. Example: /static/index.html
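A launcher script might check the required variables and fill in the optional ones before starting the binary. The Python sketch below assumes that the values shown in the Optional list above are the defaults WACR itself applies, which the README implies but does not state outright. `resolve_settings`, `REQUIRED`, and `DEFAULTS` are illustrative names.

```python
import os

REQUIRED = ("VK_API_SERVICE_TOKEN", "VK_API_SERVICE_KEY")

# Assumed defaults: the values shown in the Optional list above.
DEFAULTS = {
    "LISTEN_ADDRESS": "127.0.0.1:8080",
    "JWT_EXPIRATION": "3600",
    "GARBAGE_COLLECTOR_TTL": "3600",
    "SESSION_KEEP_ALIVE_TIMEOUT": "10",
    "SESSION_TOTAL_TIMEOUT": "100",
    "AUDIO_DIR": "/tmp",
    "WEBRTC_PORT_MIN": "0",
    "WEBRTC_PORT_MAX": "0",
    "WEBRTC_INTERFACES_ALLOWED": "",
    "STATIC_DIR": "",
}


def resolve_settings(environ=None):
    """Return the effective settings, failing early if a required one is missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    settings = dict(DEFAULTS)
    for name in REQUIRED + tuple(DEFAULTS):
        if name in environ:
            settings[name] = environ[name]  # explicit values override defaults
    return settings
```

The result can be merged into the environment of a child process, for example `subprocess.run(["wacr"], env={**os.environ, **resolve_settings()})`.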

Dependencies

~43–60MB
~1M SLoC