leptess

Productive Rust binding for Tesseract and Leptonica

16 releases (7 breaking)

0.8.1 Nov 14, 2019
0.7.4 Oct 13, 2019
0.6.0 Jul 6, 2019

#50 in Images

445 downloads per month

MIT license

2MB
354 lines

Leptess

Productive and safe Rust bindings/wrappers for Tesseract and Leptonica.

Build dependencies

Make sure you have clang, Leptonica and Tesseract installed.

For Ubuntu users:

sudo apt-get install libleptonica-dev libtesseract-dev clang

You will also need to install the Tesseract language data for the languages you want to recognize:

sudo apt-get install tesseract-ocr-eng

For macOS users:

brew install tesseract leptonica
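
To confirm that the language data you need is actually installed, you can inspect the output of `tesseract --list-langs`. The following is a minimal sketch using only the Rust standard library, not part of leptess itself. It assumes the `tesseract` executable is on your PATH and that each installed language is printed on its own line; `has_language` is a hypothetical helper name introduced for illustration.

```rust
use std::process::Command;

/// Returns true if `lang` appears as its own line in the output of
/// `tesseract --list-langs`. Header lines such as
/// "List of available languages (2):" never equal a language code,
/// so they are ignored naturally.
fn has_language(list_output: &str, lang: &str) -> bool {
    list_output.lines().any(|line| line.trim() == lang)
}

fn main() {
    // Assumption: the `tesseract` executable is on PATH.
    match Command::new("tesseract").arg("--list-langs").output() {
        Ok(out) => {
            // Check both streams, since the stream used for the list
            // may differ between Tesseract versions.
            let mut text = String::from_utf8_lossy(&out.stdout).into_owned();
            text.push_str(&String::from_utf8_lossy(&out.stderr));
            if has_language(&text, "eng") {
                println!("eng language data found");
            } else {
                println!("eng language data not found; install it before running OCR");
            }
        }
        Err(e) => eprintln!("could not run tesseract: {}", e),
    }
}
```

Matching whole lines rather than substrings avoids treating a differently named data file (for example a hypothetical `eng_best`) as `eng`.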

Usage

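// Initialize with the default tessdata location (None) and the English language data.
let mut lt = leptess::LepTess::new(None, "eng").unwrap();
// Load the image to recognize.
lt.set_image("path/to/page.bmp");
// Run OCR and print the recognized text.
println!("{}", lt.get_utf8_text().unwrap());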

For more examples, see the docs and the examples directory.

To run the demos in the examples directory, try:

cargo run --example low_level_ocr_full_page

Development

To run the tests, you will need Tesseract 4.x to match the data in tests/tessdata/eng.traineddata. See the CircleCI config for how to replicate the setup.

Dependencies