OCR Explained

Eren Akbulut
2 min read · Jan 22, 2021

--

Hello everyone, today I’ll be talking about what OCR is and what its popular use cases are. This won’t be a tutorial-style post; I’ll just briefly explain some concepts and application fields.

What is OCR?

OCR stands for optical character recognition. It is a widely used technology for recognizing text in pictures, photos, or scanned documents. OCR is a genuinely hard problem that computer scientists have researched for many years. Thanks to that long effort, both developers and people with no involvement in software development can now use many OCR tools, sometimes without even knowing it.
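To make "recognizing text in a picture" a bit more concrete, here is a toy sketch in Python. It is not how any real engine works internally; it only assumes tiny hand-made 3x5 bitmaps (the `TEMPLATES` table and all function names are made up for this illustration) and picks the template that differs from the input in the fewest pixels.

```python
# Toy illustration only: real OCR engines use far more robust methods.
# Each glyph is a 3x5 bitmap; '#' is ink, '.' is background.
TEMPLATES = {
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "C": ["###", "#..", "#..", "#..", "###"],
    "R": ["##.", "#.#", "##.", "#.#", "#.#"],
}


def distance(a, b):
    """Count the pixels where two equally sized bitmaps differ."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(a, b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognize_glyph(bitmap):
    """Return the template character closest to the given bitmap."""
    return min(TEMPLATES, key=lambda ch: distance(TEMPLATES[ch], bitmap))


def recognize_line(glyphs):
    """Recognize a sequence of already separated glyph bitmaps."""
    return "".join(recognize_glyph(g) for g in glyphs)


# A slightly damaged "O", a clean "C" and a slightly damaged "R".
noisy_o = ["###", "#.#", "#.#", "#.#", "#.#"]
clean_c = ["###", "#..", "#..", "#..", "###"]
noisy_r = ["##.", "#.#", "##.", "#.#", "#.."]

print(recognize_line([noisy_o, clean_c, noisy_r]))  # OCR
```

Even this tiny version shows why the problem is hard: a real image first has to be cleaned up and split into lines and characters before anything like the matching step can run.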

For example, Tesseract is one of the most popular and widely used OCR engines. It was originally created at Hewlett-Packard, where development started in 1985, and Google took over its development in 2006. We can safely say that Google’s involvement from 2006 onward is what made the project so mainstream.

The Tesseract engine has wrappers and bindings for many different programming languages; in fact, practically every widely used programming language has some sort of port of Tesseract nowadays. You can check it here.
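As a rough idea of what using Tesseract from another language looks like, here is a minimal Python sketch that calls the `tesseract` command-line tool through `subprocess`. It assumes Tesseract is installed and on your PATH; the helper names and the file name `scan.png` are placeholders I made up for this example, and a dedicated wrapper library may be more convenient in practice.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for the Tesseract command-line tool.

    Passing 'stdout' as the output base makes Tesseract print the
    recognized text instead of writing a file; '-l' picks the language.
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run Tesseract on an image file and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("The tesseract executable was not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode the output as str
        check=True,           # raise CalledProcessError on failure
    )
    return result.stdout.strip()


# Example call (needs Tesseract and a real image file):
#     text = ocr_image("scan.png")
```

Keeping the command construction in its own function makes it easy to check or log the exact command before anything is executed.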

Besides Tesseract, many cloud providers offer OCR APIs, but in edge computing Tesseract and OpenCV are the ones that carry most of the load for OCR businesses. OpenCV is a much more widely used and more general-purpose library overall, so I might write another post about it to cover some of its most used features with examples.

Popular Applications

  • Creating digital copies of printed documents from scanned samples is one of the most popular uses of OCR. Before the progress I mentioned above, the only way to accomplish this was to type the document again manually.
  • Creating digital copies of similar documents on edge devices like mobile phones. Many new-generation mobile devices can now take photos clear enough to run OCR on, and they have more computing power than the devices of a few years ago.
  • OCR is flexible enough that many applications use it to build filters for various purposes. Extracting text from an image lets text-based machine learning models work on it. With such hybrid applications, people can create filters that catch harassment, abuse, offensive language, and so on, even when they are hidden in images.
  • Using OCR for translation is also quite mainstream at the moment. Many people use OCR technologies to build multi-language, image-based text translators on both mobile and web platforms.
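The filtering idea from the list above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the blocklist, the function names, and the file name `meme.png` are hypothetical, and a simple word list stands in for the text-based machine learning model. The OCR step is passed in as a function, so it can be Tesseract, a cloud API, or a stub; `tesseract_ocr` shows one possible choice and assumes the `pytesseract` and Pillow packages are installed.

```python
import re

# Hypothetical blocklist for illustration; a real filter would more likely
# use a trained text-classification model instead of fixed words.
BLOCKED_WORDS = {"idiot", "loser"}


def normalize(text):
    """Lowercase the OCR output and split it into plain words."""
    return re.findall(r"[a-z']+", text.lower())


def flag_image(image_path, ocr, blocked=BLOCKED_WORDS):
    """Return the sorted blocked words found in the text of an image.

    `ocr` is any callable that maps an image path to a string, so the
    OCR step can be swapped without touching the filter logic.
    """
    words = normalize(ocr(image_path))
    return sorted(set(words) & set(blocked))


def tesseract_ocr(image_path):
    """One possible OCR step, assuming pytesseract and Pillow are installed."""
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def fake_ocr(image_path):
    """Stub OCR so the sketch runs without any OCR engine installed."""
    return "You are such a LOSER,\nnobody likes you"


print(flag_image("meme.png", fake_ocr))  # ['loser']
```

A translation tool has the same shape: run OCR first, then hand the extracted text to a text-only component, here a translator instead of a filter.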

The general applications I described above cover many specific use cases in fields ranging from law to healthcare to insurance. I won’t cover all of them because that’s not the point of this post. I will, however, try to create OCR tutorials that apply to many of these fields.

I hope to see you in the next one, take care :)

Originally published at https://blog.akbuluteren.com.
