What is OCR? » Its definition and meaning

Anonim

OCR stands for optical character recognition (in Spanish, reconocimiento óptico de caracteres). OCR software recognizes text: it takes an image of the text, transforms it into a sequence of characters, and saves them in a format that text editing programs can use. In other words, thanks to this technology, any type of text or document, including PDF files, scanned papers, and even pictures taken with a digital camera, can be converted into data that can be edited.

This software works as follows. First, it analyzes the structure of the document image, dividing the page into elements such as tables, images, and text blocks. The lines are then split into words, and the words into characters. Once the characters have been isolated, the software compares each one with a set of pattern images. It proceeds by forming a series of hypotheses about what each character is and, based on these hypotheses, analyzes the different ways of breaking lines into words and words into characters. Only after analyzing and processing a large number of such hypotheses does the program present the recognized text, converted into a new format.
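The steps above can be illustrated with a toy sketch. It is only an illustration under strong assumptions, not how any particular OCR product works: the page is a tiny black-and-white bitmap written as strings (`#` for ink, `.` for background), the three 3x5 pattern images in `TEMPLATES` are invented for the example, and the "hypothesis" for each character is simply the pattern with the fewest mismatching pixels. All function names are hypothetical.

```python
# Hypothetical 3x5 pattern images; real OCR engines use far richer models.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
}

def ink_runs(flags):
    """Return (start, end) index pairs for each run of consecutive True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs

def split_lines(page):
    """Cut the page into text lines at rows that contain no ink."""
    has_ink = ["#" in row for row in page]
    return [page[a:b] for a, b in ink_runs(has_ink)]

def split_glyphs(line, word_gap=2):
    """Cut a line into words, and each word into character images."""
    width = len(line[0])
    has_ink = [any(row[x] == "#" for row in line) for x in range(width)]
    words, prev_end = [], None
    for a, b in ink_runs(has_ink):
        if prev_end is None or a - prev_end >= word_gap:
            words.append([])  # a wide blank gap starts a new word
        words[-1].append([row[a:b] for row in line])
        prev_end = b
    return words

def classify(glyph):
    """Pick the pattern with the fewest mismatching pixels."""
    def distance(template):
        if len(template) != len(glyph) or len(template[0]) != len(glyph[0]):
            return float("inf")  # sizes differ: this sketch does not rescale
        return sum(t != g
                   for t_row, g_row in zip(template, glyph)
                   for t, g in zip(t_row, g_row))
    return min(TEMPLATES, key=lambda ch: distance(TEMPLATES[ch]))

def recognize(page):
    """Run the full pipeline: lines, then words, then characters."""
    return "\n".join(
        " ".join("".join(classify(g) for g in word)
                 for word in split_glyphs(line))
        for line in split_lines(page)
    )

PAGE = [
    "###.###.#..",
    "#.#..#..#..",
    "#.#..#..#..",
    "#.#..#..#..",
    "###.###.###",
]

print(recognize(PAGE))  # prints OIL
```

Because `classify` chooses the closest pattern rather than requiring an exact match, a character with a stray pixel is still recognized. A character whose size matches no pattern falls back to the first entry in `TEMPLATES`, which is one of many shortcuts a real program would not take.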

It should be noted that several OCR-based programs are available on the market today, such as OmniPage, ABBYY FineReader, and Readiris. These programs can not only analyze and recognize the text itself but also its format and style, although with certain limitations. For that reason, the text must be reviewed after recognition and edited to make whatever adjustments are required.