Latest release: 0.2.0 (Feb 23, 2023), unstable
Used in ungoliant
ctclib
NOTE: This is currently under development.
A collection of utilities related to CTC (Connectionist Temporal Classification), with the goal of being fast and highly flexible.
Features
- CTC Decode
  - Greedy Decoder
  - Beam Search Decoder
  - Beam Search Decoder with KenLM
  - Beam Search Decoder with user-defined LM
- Python bindings
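To make the simplest of the decoders listed above concrete, here is a minimal pure-Python sketch of greedy CTC decoding: take the best token per frame, merge consecutive repeats, then drop the blank. It illustrates the general technique and is not ctclib's implementation; the function name `ctc_greedy_decode` and the choice of `"_"` as the blank token are assumptions made for this example.

```python
from itertools import groupby
from typing import Sequence


def ctc_greedy_decode(
    log_probs: Sequence[Sequence[float]],
    vocab: Sequence[str],
    blank: str = "_",  # assumption: "_" marks the CTC blank token
) -> str:
    """Pick the best token per frame, merge repeats, then drop blanks."""
    blank_id = vocab.index(blank)
    # Argmax over the vocabulary for every time step.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in log_probs
    ]
    # Collapse consecutive duplicates, then remove the blank token.
    collapsed = [token_id for token_id, _ in groupby(best_path)]
    return "".join(vocab[i] for i in collapsed if i != blank_id)
```

A blank between two identical tokens keeps them separate, which is how CTC can represent doubled letters. Beam search decoders refine this idea by keeping several candidate prefixes per frame instead of a single best path.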
Installation
ctclib depends on kpu/kenlm, so you must install the following KenLM dependencies:
- Boost
- Eigen3
For example, on Ubuntu (or another Debian-based distribution), you can install them by running the following command:
apt install libboost-all-dev libeigen3-dev
Use ctclib from Rust
Currently, ctclib isn't available on crates.io, but you can use it as a git dependency.
[dependencies]
ctclib = { version = "*", git = "https://github.com/agatan/ctclib" }
Use ctclib from Python
ctclib provides Python bindings, named pyctclib.
Currently, pyctclib isn't available on PyPI, but you can install it as a git dependency.
Make sure that cargo and libclang-dev are installed first.
pip install 'git+https://github.com/agatan/ctclib.git#egg=pyctclib&subdirectory=bindings/python'
Example
import pyctclib

decoder = pyctclib.BeamSearchDecoderWithKenLM(
    pyctclib.BeamSearchDecoderOptions(
        beam_size=100,
        beam_size_token=1000,
        beam_threshold=1,
        lm_weight=0.5,
    ),
    "/path/to/model.arpa",
    ["a", "b", "c", "_"],
)
# log_probs must be defined by the caller: the per-frame log-probabilities from your CTC model.
decoder.decode(log_probs)
# Alternatively, you can use a user-defined LM.
# See pyctclib.LMProtocol.
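The example above uses `log_probs` without defining it. As a rough sketch of what such an input could look like, the following pure-Python snippet turns raw per-frame scores into log-probabilities with a numerically stable log-softmax. The `log_softmax` helper and the `logits` values are hypothetical, and the exact array type that `decode` accepts is not stated here, so treat the nested-list layout as an assumption.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one frame of raw scores."""
    peak = max(logits)
    # Subtracting the peak before exponentiating avoids overflow.
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


# Hypothetical raw scores: 3 time steps over the vocabulary ["a", "b", "c", "_"].
logits = [
    [2.0, 0.5, 0.1, 0.3],
    [0.2, 0.1, 0.0, 3.0],
    [0.1, 2.5, 0.2, 0.4],
]
# One row per time step, one column per vocabulary entry.
log_probs = [log_softmax(frame) for frame in logits]
```

The decoder may expect a specific array type rather than nested lists, so check the pyctclib signatures before passing this in.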