Google has announced on its BlogSpot Code Blog that the Tesseract OCR (Optical Character Recognition) engine, originally developed by Hewlett-Packard, is now open source. Alongside the announcement, Google is hiring new OCR engineers, apparently to develop the project further. The Tesseract OCR code can be found at SourceForge.