#ctc #beam-search #algorithms

ctclib

A collection of utilities related to CTC, with the goal of being fast and highly flexible

1 unstable release

0.1.0 Feb 6, 2022

#851 in Math

MIT license

410KB
7K SLoC

  • C++ 6K SLoC // 0.1% comments
  • Rust 757 SLoC // 0.0% comments
  • Cython 52 SLoC // 0.3% comments
  • Python 18 SLoC // 0.1% comments
  • C 15 SLoC // 0.1% comments
  • Shell 7 SLoC

ctclib

NOTE: This is currently under development.

Features

  • CTC Decode
    • Greedy Decoder
    • Beam Search Decoder
    • Beam Search Decoder with KenLM
    • Beam Search Decoder with user-defined LM
  • Python bindings
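
To illustrate what the Greedy Decoder entry refers to, here is a minimal pure-Python sketch of CTC greedy (best-path) decoding: take the most likely token in each frame, merge consecutive repeats, then drop blanks. It is not ctclib's implementation; the function name `ctc_greedy_decode`, the nested-list input, and the use of "_" as the blank token are assumptions made for this illustration.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    log_probs: Sequence[Sequence[float]],
    vocab: Sequence[str],
    blank: str = "_",
) -> str:
    """Best-path decoding: top token per frame, merge repeats, drop blanks."""
    blank_id = vocab.index(blank)
    decoded: List[str] = []
    previous = None
    for frame in log_probs:
        # Index of the most likely token in this frame.
        best = max(range(len(frame)), key=frame.__getitem__)
        # Emit only when the token differs from the previous frame and is not blank.
        if best != previous and best != blank_id:
            decoded.append(vocab[best])
        previous = best
    return "".join(decoded)


def one_hot_frame(index: int, size: int = 4) -> List[float]:
    """Hypothetical helper: a frame that strongly favours one token."""
    return [0.0 if i == index else -10.0 for i in range(size)]


vocab = ["a", "b", "c", "_"]
# Frames favour: a, a, _, a, b, b, _
frames = [one_hot_frame(i) for i in (0, 0, 3, 0, 1, 1, 3)]
print(ctc_greedy_decode(frames, vocab))  # prints "aab"
```

The blank between the two runs of "a" is what lets the output keep a doubled letter; without it, the repeated frames would collapse into a single "a".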

Installation

ctclib depends on KenLM (kpu/kenlm), so you must first install the libraries that KenLM itself requires:

  • Boost
  • Eigen3

For example, on Ubuntu (or another Debian-based Linux distribution), you can install them by running the following command:

apt install libboost-all-dev libeigen3-dev

Use ctclib from Rust

ctclib is not yet available on crates.io, but you can use it as a git dependency.

[dependencies]
ctclib = { version = "*", git = "https://github.com/agatan/ctclib" }

Use ctclib from Python

ctclib provides Python bindings, named pyctclib. pyctclib is not yet available on PyPI, but you can install it as a git dependency. Make sure that cargo and libclang-dev are installed first.

pip install 'git+https://github.com/agatan/ctclib.git#egg=pyctclib&subdirectory=bindings/python'

Example

import pyctclib

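decoder = pyctclib.BeamSearchDecoderWithKenLM(
    pyctclib.BeamSearchDecoderOptions(
        beam_size=100,
        beam_size_token=1000,
        beam_threshold=1,
        lm_weight=0.5,
    ),
    "/path/to/model.arpa",  # path to a KenLM language model
    ["a", "b", "c", "_"],  # vocabulary
)
# log_probs is not defined in this snippet; it is assumed to hold your
# model's per-frame log-probabilities over the vocabulary above.
decoder.decode(log_probs)

# Alternatively, you can use a user-defined LM.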
# See pyctclib.LMProtocol
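
The example above passes a value named `log_probs` without showing where it comes from. As a minimal sketch, assuming your acoustic model emits raw scores (logits) per frame, the following pure-Python log-softmax turns them into log-probabilities. The `log_softmax` helper and the sample `logits` are hypothetical, and the container type that pyctclib expects (nested lists, a NumPy array, or something else) is not stated here, so check the bindings before passing the result to `decode`.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax for one frame of raw scores."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing.
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


# Hypothetical raw model outputs: 2 frames over the vocabulary ["a", "b", "c", "_"].
logits = [[2.0, 0.5, 0.1, 1.0], [0.2, 0.1, 0.3, 3.0]]
log_probs = [log_softmax(frame) for frame in logits]
```

Each resulting frame exponentiates to a distribution that sums to 1, which is the usual meaning of per-frame log-probabilities in CTC decoding.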

Dependencies

~1–7.5MB
~50K SLoC