C#.net - Read Text from Image in C#.net
In this article i will show you how to read text from image by using OCR Components in C#.net.
What is OCR?
OCR (Optical Character Recognition) is the recognition of printed or written text characters by a computer. This involves photoscanning of the text character-by-character, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII, commonly used in data processing.
OCR translates images of text, such as scanned documents, into actual text characters. Also known as text recognition, OCR makes it possible to edit and reuse the text that is normally locked inside scanned images. OCR works using a form of artificial intelligence known as pattern recognition, to identify individual text characters on a page, including punctuation marks, spaces, and ends of lines.
First off, you need to have MS Office 2007 installed or later version. This is obviously a dependency if you develop an application to use the OCR capabilites in the field – it won’t work without Office installed. Furthermore, the OCR capability doesn’t install by default when you install Office, you need to add a component called ‘Microsoft Office Document Imaging’ (MODI).
Instructions on how to add the required MODI component.