Installing Tesseract depends on several other packages, the main one being Leptonica. I was working on installing Tesseract OCR on CentOS 5.5, and these are the steps that let me set it up successfully on both CentOS 5.5 and openSUSE 11.3.
On openSUSE 11.3 you can use zypper instead of yum; the instructions and package names remain the same.
- Install the following packages using yum
- yum install libjpeg-devel libpng-devel libtiff-devel zlib-devel
- Compiling any source code on Linux requires gcc, gcc-c++, and the make utility; install them with yum as well
- yum install gcc gcc-c++ make
- Download the Leptonica 1.67 source from http://www.leptonica.com/source/leptonlib-1.67.tar.gz, extract it into a directory, and compile it using the following commands
- ./configure
- make
- make install
* If make fails in the step above with errors about functions such as sqrt, cos, sin, or sincos, you may have to append the -lm option to the makefile in the src folder of the Leptonica source code and run make again.
- Download the Tesseract source code from the location http://tesseract-ocr.googlecode.com/files/tesseract-3.00.tar.gz
- Extract the source code into a directory and compile it with the standard commands shown below
- ./configure
- make
- make install
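Both source builds above follow the same download, extract, configure, make, install pattern, so they can be sketched as one small shell function. This is only a sketch: `build_from_tarball` is a hypothetical helper name, and it assumes that wget is available and that each archive unpacks into a directory named after the tarball (for example `leptonlib-1.67/`).

```shell
#!/bin/sh
# build_from_tarball is a hypothetical helper, not part of Leptonica or Tesseract.
# Usage: build_from_tarball URL [extra ./configure arguments...]
build_from_tarball() {
    url="$1"
    shift                              # remaining arguments go to ./configure
    tarball="${url##*/}"               # file name: last path component of the URL
    srcdir="${tarball%.tar.gz}"        # assumption: the archive unpacks into <name>/
    # Download only if the tarball is not already in the current directory.
    [ -f "$tarball" ] || wget "$url" || return 1
    tar -xzf "$tarball" || return 1
    (
        cd "$srcdir" || exit 1
        ./configure "$@" && make && make install
    )
}

# Example calls (run as root, since "make install" normally writes under /usr/local):
# build_from_tarball http://www.leptonica.com/source/leptonlib-1.67.tar.gz
# build_from_tarball http://tesseract-ocr.googlecode.com/files/tesseract-3.00.tar.gz
```

If the sqrt/cos link errors mentioned above appear, one option that may avoid editing the makefile by hand is to pass `LIBS=-lm` as an extra argument, e.g. `build_from_tarball http://www.leptonica.com/source/leptonlib-1.67.tar.gz LIBS=-lm`. This relies on the configure script being autoconf-generated and has not been verified against this Leptonica release.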
Post-Installation Steps
Some environments may need the following environment variable to be set and exported
The English language training data can be downloaded from http://tesseract-ocr.googlecode.com/files/eng.traineddata.gz
After extracting the language bundle, copy it into the /usr/local/share/tessdata folder.
This completes the Tesseract installation, and you should now be able to run tesseract on Linux.
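The extract-and-copy step can be sketched as a short shell function. `install_traineddata` is a hypothetical helper name; it assumes the download is a plain gzip file (as the `.gz` extension suggests) and defaults to the tessdata folder named above.

```shell
#!/bin/sh
# install_traineddata is a hypothetical helper name.
# Usage: install_traineddata ARCHIVE.gz [DEST_DIR]
install_traineddata() {
    archive="$1"
    dest="${2:-/usr/local/share/tessdata}"   # default: the folder named above
    name="$(basename "$archive" .gz)"        # eng.traineddata.gz -> eng.traineddata
    mkdir -p "$dest" || return 1
    # Decompress to the destination without modifying the downloaded archive.
    gunzip -c "$archive" > "$dest/$name"
}

# Example (writing under /usr/local usually needs root):
# wget http://tesseract-ocr.googlecode.com/files/eng.traineddata.gz
# install_traineddata eng.traineddata.gz
```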
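A quick way to confirm the installation is to run tesseract on an image and print the result. The sketch below assumes the `tesseract IMAGE OUTPUT_BASE -l LANG` command form and that the English data is already in place; `ocr_smoke_test` is a hypothetical helper name and `sample.tif` stands in for any image you have at hand.

```shell
#!/bin/sh
# ocr_smoke_test is a hypothetical helper; sample.tif is a placeholder image name.
# Usage: ocr_smoke_test IMAGE [OUTPUT_BASE]
ocr_smoke_test() {
    image="$1"
    base="${2:-ocr_out}"               # tesseract appends .txt to this base name
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract not found on PATH" >&2
        return 127
    fi
    # Recognize the image with the English data, then print the recognized text.
    tesseract "$image" "$base" -l eng && cat "$base.txt"
}

# Example:
# ocr_smoke_test sample.tif
```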