ABCocr is a .NET Optical Character Recognition (OCR) product. You use ABCocr .NET to extract text from images. ABCocr .NET is built around industry-standard OCR software: at its heart is a custom version of the Tesseract 3 OCR engine.
The Tesseract OCR engine was originally developed by Hewlett-Packard UK. It was one of the top three engines in the 1995 UNLV Accuracy test, and it has since been extensively revised with sponsorship from Google. It is probably one of the most accurate open source OCR engines available.
Tesseract supports English, Spanish, German, French, Italian, Portuguese, Arabic, Bulgarian, Catalan, Chinese (simplified), Chinese (traditional), Croatian, Czech, Danish (Fraktur script), Danish (standard), Dutch, Finnish, Greek, Hebrew, Hungarian, Indonesian, Japanese, Korean, Latvian, Lithuanian, Norwegian, Polish, Romanian, Russian, Serbian, Slovak (Fraktur script), Slovak (standard), Slovenian, Swedish, Tagalog, Thai, Turkish, Ukrainian, and Vietnamese. Tesseract can be trained to work in other languages as well.
So why wouldn't I just use Tesseract? What does ABCocr .NET add?
- 100% Stable. The original Tesseract is based around a command-line process, which means that it does not matter if it occasionally terminates, crashes or leaks memory. If you are running a modern in-process application you absolutely cannot have this type of behavior. ABCocr resolves these issues and presents you with a 100% stable platform.
- 100% Performant. Because Tesseract was based around a command-line process, it cannot multithread. ABCocr adds multithreading support, so you can spread load over multiple CPUs or cores and use it safely from multithreaded APIs like ASP.NET.
- 100% Compatible. Tesseract is a 32-bit process and cannot be used in 64-bit applications. This is a significant issue when so many operating systems are now based around a 64-bit address space. ABCocr eliminates this restriction and allows you to run in either x86 or x64 mode completely automatically.
- 100% Consistent. Tesseract is somewhat idiosyncratic. If you've ever seen error messages telling you that your TIFF tags are in the wrong order you will know what we mean. ABCocr eliminates this idiosyncrasy and provides a simple and uniform way of dealing with OCR.
- 100% Simple. We only have one example. Why is this? Well, because it's so simple to use, we couldn't think of anything else that you would need.