Tesseract 5.0 OCR Engine Bringing Faster Performance With "Fast Floats"
The Tesseract 5.0 Alpha has been available since the end of last year, while this weekend marked the first beta of Tesseract 5.0. Earlier Tesseract 5.0 Alpha releases brought improved performance, support for Apple Silicon, build system improvements, an overhaul of the public API, and many code improvements.
Yesterday's Tesseract 5.0 Beta brought more code modernization work, improved ARM NEON usage, and more.
Arguably the most exciting change in the Tesseract 5.0 Beta is support for using floats for LSTM model training and text recognition. Traditionally the Tesseract OCR engine has relied upon doubles, but when the new "fast float" option is enabled at build time, floats can be used instead. The hope is that this will lead to faster training and OCR performance while also requiring less system memory than earlier versions of Tesseract or builds without fast float enabled.
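For those building from source, the fast float mode is selected at configure time. A minimal sketch of a CMake-based build is below; the `FAST_FLOAT` option name is our reading of the Tesseract build scripts and release notes, so check the project's own documentation for your version:

```shell
# Hypothetical build sketch: enable the fast-float mode at configure time.
# FAST_FLOAT is assumed to be the CMake option controlling float vs. double
# for the LSTM code; verify against your Tesseract checkout.
git clone https://github.com/tesseract-ocr/tesseract.git
cd tesseract
cmake -B build -DFAST_FLOAT=ON
cmake --build build
```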
Tests by Tesseract developers found that the fast float mode makes dot product operations about 50% faster, while other operations should also benefit from this new mode in Tesseract 5.0. More fast float optimizations are pending as well, including around AVX/AVX-512.
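The intuition behind the speedup is generic rather than Tesseract-specific: single-precision values are half the size of doubles, so twice as many fit in a SIMD register and in cache, at the cost of some precision. A small NumPy sketch (not Tesseract code) illustrates the memory halving and the typically small accuracy loss for a dot product:

```python
import numpy as np

# Build two large vectors in double precision, then downcast copies to float32.
rng = np.random.default_rng(0)
a64 = rng.standard_normal(1_000_000)
b64 = rng.standard_normal(1_000_000)
a32 = a64.astype(np.float32)
b32 = b64.astype(np.float32)

# float32 storage is exactly half the size of float64 storage.
assert a32.nbytes * 2 == a64.nbytes

# The dot products agree closely despite the reduced precision;
# the float32 path is what SIMD units can process two-wide per lane.
d64 = np.dot(a64, b64)
d32 = float(np.dot(a32, b32))
print(a64.nbytes, a32.nbytes, abs(d64 - d32))
```

Whether the halved width translates into Tesseract's reported ~50% dot-product speedup depends on the vectorized code paths (e.g. the NEON and AVX kernels mentioned above), but the memory saving is unconditional.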
More details on the Tesseract 5.0 Beta release are available via GitHub.