The Best OCR Software Tool To Convert Scanned Handwritten, Typewritten or Printed Text Into Documents

Optical Character Recognition (OCR)

Optical character recognition is the process of converting scanned images of printed or handwritten text into machine-readable text. Early OCR systems had to be calibrated for a specific font: they were programmed with images of each character and worked on one font at a time. "Intelligent" systems with a high degree of recognition accuracy for most fonts are now common. Some systems can reproduce formatted output that closely approximates the original scanned page, including images, columns and other non-textual components.

OCR software works by analyzing a document and comparing it with fonts stored in its database, by noting features typical of characters, or both. Some OCR software also runs the result through a spell checker to “guess” unrecognized words. OCR tools have their limitations, and the quality of the result depends heavily on the scan's resolution and contrast and on the clarity of the fonts. For the average user, 100% accuracy is difficult to achieve, but a close approximation is what most software strives for.
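To make the idea concrete, here is a minimal sketch of both steps: matching a scanned glyph against a stored "font database" and guessing an unrecognized word from a dictionary. The 3x3 bitmaps, the function names and the word list are all made up for illustration; real OCR engines use far larger templates and much more sophisticated features.

```python
from difflib import get_close_matches

# Toy 3x3 "font database": each glyph is a tuple of 9 pixels (1 = ink).
# These bitmaps are invented for this example only.
FONT = {
    "I": (0, 1, 0,
          0, 1, 0,
          0, 1, 0),
    "L": (1, 0, 0,
          1, 0, 0,
          1, 1, 1),
    "T": (1, 1, 1,
          0, 1, 0,
          0, 1, 0),
    "O": (1, 1, 1,
          1, 0, 1,
          1, 1, 1),
}


def recognize_glyph(pixels):
    """Return the stored character whose bitmap differs in the fewest pixels."""
    def distance(template):
        return sum(a != b for a, b in zip(pixels, template))
    return min(FONT, key=lambda ch: distance(FONT[ch]))


def guess_word(word, dictionary):
    """Swap an unrecognized word for its closest dictionary entry, if any."""
    if word in dictionary:
        return word
    matches = get_close_matches(word, dictionary, n=1, cutoff=0.6)
    return matches[0] if matches else word


# A noisy scan of "T": one pixel in the bottom row was lost.
noisy_t = (1, 1, 1,
           0, 1, 0,
           0, 0, 0)
print(recognize_glyph(noisy_t))  # T

# "rn" misread in place of "m" is a classic OCR slip.
print(guess_word("docurnent", ["document", "scanner", "font"]))  # document
```

The glyph matcher simply picks the template with the fewest mismatched pixels, which is why a clean, high-contrast scan matters so much: every stray or missing pixel pushes the glyph closer to the wrong template.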

We will be looking at two OCR programs: Microsoft OneNote, which is often overlooked and is probably already installed on your system, and FreeOCR, which uses tesseract-ocr, considered one of the most accurate free OCR engines currently available.

Microsoft OneNote

For occasional, basic OCR jobs, Microsoft OneNote’s Optical Character Recognition feature is a time-saver. You might have missed it; it’s called "Copy Text from Picture".

Drag a scan or a saved picture into Microsoft OneNote. You can also use OneNote to clip part of the screen or an image into a note.

Right-click the inserted picture and select "Copy Text from Picture". The recognized text goes to the clipboard, and you can now paste it into any program, such as Microsoft Word or Notepad.

OneNote is simplicity personified. It’s not great with handwritten or fuzzy characters, but for a quick job, I am all for Microsoft OneNote’s clip and paste.

FreeOCR

This free OCR software uses tesseract-ocr, an OCR engine that was developed at HP Labs between 1985 and 1995 and is now developed at Google. The Tesseract OCR engine was one of the top three engines in the 1995 UNLV Accuracy Test. Little work was done on it between 1995 and 2006, but it is probably still one of the most accurate open source OCR engines available. The engine reads a binary, grey or color image and outputs text. A built-in TIFF reader handles uncompressed TIFF images, and libtiff can be added to read compressed ones.
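If you would rather drive the engine yourself than go through a graphical front end, the image-in, text-out flow can be sketched in a few lines. This assumes a recent Tesseract command-line build is installed and on your PATH; the special output name "stdout" is not available in very old releases, and the helper names and the file name scan.tif are hypothetical.

```python
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for: tesseract <image> stdout -l <lang>."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run Tesseract on one image and return the recognized text.

    Assumes the `tesseract` executable is installed and on PATH.
    """
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        check=True,  # raise CalledProcessError if Tesseract reports a failure
    )
    # Tesseract writes its text output as UTF-8.
    return result.stdout.decode("utf-8")


# Show the command that would be run for a hypothetical scan.
print(build_tesseract_command("scan.tif"))
```

Calling image_to_text("scan.tif") would then return whatever text the engine recognized on that page.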

FreeOCR is a simple Windows interface for that underlying engine. It supports most image files and multi-page TIFF files, can handle PDFs, and is compatible with TWAIN devices such as scanners. FreeOCR also has the familiar double-window interface with easy-to-understand settings. Before starting the one-click conversion process, you can adjust the image contrast for better readability.
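Why does a contrast adjustment help? A faded scan squeezes ink and paper into a narrow band of grey, which makes them hard to tell apart. The sketch below is not FreeOCR's actual algorithm, just an illustration of the general idea on a handful of greyscale values (0 is black, 255 is white); the function names and the threshold of 128 are assumptions.

```python
def stretch_contrast(pixels):
    """Linearly rescale greyscale values (0-255) so they span the full range."""
    lo, hi = min(pixels), max(pixels)
    if lo == hi:
        return list(pixels)  # flat image: nothing to stretch
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]


def binarize(pixels, threshold=128):
    """Map each pixel to pure black (0) or pure white (255)."""
    return [255 if p >= threshold else 0 for p in pixels]


# A faded scan: ink (100, 115) and paper (145, 160) are close together.
faded = [100, 115, 145, 160]

stretched = stretch_contrast(faded)
print(stretched)            # [0, 64, 191, 255]
print(binarize(stretched))  # [0, 0, 255, 255]
```

After stretching, ink and paper sit at opposite ends of the scale, so a simple cut-off separates them cleanly.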

FreeOCR is a complete scan-and-OCR program that bundles a Windows build of the free Tesseract OCR engine. It is small, simple and easy to use, and it comes with a Windows installer. Beyond multi-page TIFFs and TWAIN scanning, it supports fax documents and most image types, including compressed TIFFs, which the Tesseract engine cannot read on its own. Best of all, it is totally free!

FreeOCR has been totally rewritten for Microsoft's .NET Framework 2.0. This was mainly due to problems displaying Unicode text properly, which most older development environments sadly do not support. Unicode is important because the OCR engine supports different languages and outputs them in UTF-8 encoding.
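A small illustration of the display problem, using a made-up recognized word: the same UTF-8 bytes read correctly when decoded as UTF-8 but turn into gibberish when a program assumes a legacy Windows code page (cp1252 is used here as an example).

```python
# Hypothetical OCR output containing a non-ASCII character.
recognized = "café"

# The engine hands the text over as UTF-8 bytes.
utf8_bytes = recognized.encode("utf-8")
print(utf8_bytes)                   # b'caf\xc3\xa9'

# Decoded as UTF-8, the text survives intact.
print(utf8_bytes.decode("utf-8"))   # café

# Decoded with a legacy single-byte code page, the accent is mangled.
print(utf8_bytes.decode("cp1252"))  # cafÃ©
```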

Requirements:

  • Pentium Processor - 200MHz
  • 256 MB Memory (RAM)
  • 10MB Free Disk Space
  • SVGA Resolution Display
  • .NET Framework 2.0 or higher

Comments 1 comment

manojglobal 4 years ago from Kolkata

Nowadays, OCR conversion outsourcing is gaining popularity in countries around the globe. An Optical Character Recognition (OCR) system can read stored scanned images electronically and convert them into editable text in file formats such as MS Word, MS Excel, PDF, TXT, RTF, HTML and MS Access.

http://www.dataentryhelp.com/ocr-conversion-outsou...
