• Title/Summary/Keyword: OCR(Optical Character Recognition) technology

Trends in Deep Learning-based Medical Optical Character Recognition (딥러닝 기반의 의료 OCR 기술 동향)

  • Sungyeon Yoon;Arin Choi;Chaewon Kim;Sumin Oh;Seoyoung Sohn;Jiyeon Kim;Hyunhee Lee;Myeongeun Han;Minseo Park
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.2
    • /
    • pp.453-458
    • /
    • 2024
  • Optical Character Recognition (OCR) is a technology that recognizes text in images and converts it into digital format. Because of its high recognition performance, deep learning-based OCR is used in many industries that hold large quantities of recorded data, and the medical industry has actively introduced it to improve medical services. In this paper, we discuss trends in OCR engines and medical OCR and provide a roadmap for the development of medical OCR. Current medical OCR has improved its recognition performance by applying natural language processing to the detected text. However, recognition performance remains limited, especially for non-standard handwriting and modified text. Developing advanced medical OCR requires building databases of medical data, image pre-processing, and natural language processing.

Development an Android based OCR Application for Hangul Food Menu (한글 음식 메뉴 인식을 위한 OCR 기반 어플리케이션 개발)

  • Lee, Gyu-Cheol;Yoo, Jisang
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.5
    • /
    • pp.951-959
    • /
    • 2017
  • In this paper, we design and implement an Android-based Hangul food menu recognition application that recognizes characters in images captured by a smartphone. Optical Character Recognition (OCR) is divided into preprocessing, recognition, and post-processing. In the preprocessing stage, characters are extracted using Maximally Stable Extremal Regions (MSER). In the recognition stage, Tesseract-OCR, a free OCR engine, is used to recognize the characters. In the post-processing stage, incorrect results are corrected using a dictionary DB of food menu items. To evaluate the proposed method, experiments compared recognition performance using actual menu boards as the DB. Recognition rates were measured against OCR Instantly Free, Text Scanner, and Text Fairy, which are character recognition applications available in the Google Play Store. The experimental results show that the proposed method achieves an average recognition rate 14.1% higher than the other applications.
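The dictionary-based post-processing step described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the authors match OCR output against their dictionary DB, so the fuzzy matching via the standard-library `difflib` module, the `MENU_DICTIONARY` entries, the `correct_with_dictionary` name, and the `cutoff` value are all assumptions made for illustration.

```python
import difflib

# Hypothetical menu dictionary; the paper's actual dictionary DB is not shown.
MENU_DICTIONARY = ["김치찌개", "된장찌개", "비빔밥", "불고기", "냉면"]


def correct_with_dictionary(ocr_text, dictionary=MENU_DICTIONARY, cutoff=0.5):
    """Return the closest dictionary entry, or the raw OCR text if none is close.

    cutoff is an assumed similarity threshold between 0 and 1.
    """
    matches = difflib.get_close_matches(ocr_text, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_text


if __name__ == "__main__":
    # One misrecognized syllable is snapped back to the dictionary entry.
    print(correct_with_dictionary("김치찌게"))
    # Text that resembles no menu item is returned unchanged.
    print(correct_with_dictionary("xyz"))
```

A lower `cutoff` corrects more aggressively but risks replacing a correctly recognized item that is simply missing from the dictionary.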

An Implementation of a System for Video Translation on Window Platform Using OCR (윈도우 기반의 광학문자인식을 이용한 영상 번역 시스템 구현)

  • Hwang, Sun-Myung;Yeom, Hee-Gyun
    • Journal of Internet of Things and Convergence
    • /
    • v.5 no.2
    • /
    • pp.15-20
    • /
    • 2019
  • As machine learning research has advanced, translation and image analysis tasks such as optical character recognition have made great progress. However, video translation, which combines the two, has progressed more slowly. In this paper, we develop an image translator that combines existing OCR technology with translation technology and verify its effectiveness. Before development, we identified the functions needed to implement the system and how to implement them, and then tested their performance. With the application developed in this paper, users can access translation more conveniently, and that convenience can be provided in any environment.

Study on Performance Evaluation of Automatic license plate recognition program using Emgu CV (Emgu CV를 이용한 자동차 번호판 자동 인식 프로그램의 성능 평가에 관한 연구)

  • Kim, Nam-Woo;Hur, Chang-Wu
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.6
    • /
    • pp.1209-1214
    • /
    • 2016
  • License plate recognition (LPR) is one of the most widely used surveillance technologies and applies optical character recognition to video and still images. LPR involves several processes: localization of the license plate; normalization of plate size, spacing, contrast, and brightness; character segmentation to obtain the individual characters for optical character recognition; character recognition; and syntactic analysis of shape, size, and position, which compares results against a database of license plate formats that differ by year and region. This paper describes the performance of license plate recognition software that uses EmguCV to locate the plate and Tesseract OCR, a well-known open-source optical character recognition engine, to read the characters, evaluated with respect to the capture angle of the plate, image size, and brightness.

Development of a Low-cost Industrial OCR System with an End-to-end Deep Learning Technology

  • Subedi, Bharat;Yunusov, Jahongir;Gaybulayev, Abdulaziz;Kim, Tae-Hyong
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.15 no.2
    • /
    • pp.51-60
    • /
    • 2020
  • Optical character recognition (OCR) has been studied for decades because it is useful in a wide variety of settings. OCR performance has recently improved significantly thanks to deep learning, and demand for commercial-grade but affordable OCR systems is increasing. We have developed a low-cost, high-performance OCR system for industry using the cheapest embedded developer kit that supports GPU acceleration. To achieve high accuracy for industrial use on limited computing resources, we chose as a baseline a state-of-the-art text recognition algorithm that uses an end-to-end deep learning network. The model was then improved by replacing the feature extraction network with the one best suited to our conditions. Among the candidate networks, EfficientNet-B3 showed the best performance: excellent recognition accuracy with relatively low memory consumption. In addition, we optimized the model, written with TensorFlow's Python API, using TensorFlow-TensorRT integration and TensorFlow's C++ API, respectively.

Recognition of Characters Printed on PCB Components Using Deep Neural Networks (심층신경망을 이용한 PCB 부품의 인쇄문자 인식)

  • Cho, Tai-Hoon
    • Journal of the Semiconductor & Display Technology
    • /
    • v.20 no.3
    • /
    • pp.6-10
    • /
    • 2021
  • Recognizing characters printed or marked on PCB components in camera images is an important task in PCB component inspection systems. Previous optical character recognition (OCR) of PCB components typically consists of two stages: character segmentation and classification of each segmented character. However, character segmentation often fails due to corrupted characters, low image contrast, and similar problems. OCR without character segmentation is therefore desirable and is increasingly implemented with deep neural networks. A typical segmentation-free implementation uses a convolutional neural network followed by a recurrent neural network (RNN). One disadvantage of this approach is slow execution due to the RNN layers. LPRNet is a segmentation-free character recognition network whose excellent accuracy has been proven in license plate recognition. LPRNet uses a wide convolution instead of an RNN, enabling fast inference. In this paper, LPRNet was adapted to recognize characters printed on PCB components with fast execution and high accuracy. Initial training on synthetic images followed by fine-tuning on real text images yielded accurate recognition. The network can be further optimized for Intel CPUs using the OpenVINO toolkit. The optimized version of the network runs in real time, even faster than on a GPU.

Recognition of Bill Form using Feature Pyramid Network (FPN(Feature Pyramid Network)을 이용한 고지서 양식 인식)

  • Kim, Dae-Jin;Hwang, Chi-Gon;Yoon, Chang-Pyo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.4
    • /
    • pp.523-529
    • /
    • 2021
  • In the era of the Fourth Industrial Revolution, technological changes are being applied in various fields. Automation, digitization, and data management are also reaching the field of bills. Tens of thousands of bill forms circulate in society, and bill recognition is essential for automation, digitization, and data management. Currently, OCR technology is used for character recognition in order to manage these various bills. Accuracy can be increased by first recognizing the form of the bill and then recognizing its contents. In this paper, a logo that can serve as an index for classifying the form of the bill was recognized as an object. Because the logo is small relative to the entire bill, FPN was used for small object detection among deep learning technologies. As a result, the proposed algorithm reduced resource waste and increased the accuracy of OCR recognition.

A Study on Construction of Technical Reports Management System Using Optical Technology (광기술을 이용한 연구보고서 관리시스템 구축)

  • 이상헌;김익철
    • Journal of the Korean Society for information Management
    • /
    • v.9 no.1
    • /
    • pp.131-164
    • /
    • 1992
  • In this study, a technical report management system using optical technology is described in detail. This management system is designed for both bibliographic (character) and full-text (image) information. Several optical filing systems already on the Korean market are scrutinized and compared against standard functions in order to build a more efficient management system for technical reports that can be easily integrated into the existing KRISS library automation system. For that purpose, up-to-date technologies (e.g., digital image processing (DIP), MARC standards, and optical character recognition (OCR)) are applied to this system.

Optical Character Recognition for Hindi Language Using a Neural-network Approach

  • Yadav, Divakar;Sanchez-Cuadrado, Sonia;Morato, Jorge
    • Journal of Information Processing Systems
    • /
    • v.9 no.1
    • /
    • pp.117-140
    • /
    • 2013
  • Hindi is the most widely spoken language in India, with more than 300 million speakers. Because the characters of texts written in Hindi are not separated as they are in English, Optical Character Recognition (OCR) systems developed for the Hindi language have a very poor recognition rate. In this paper we propose an OCR for printed Hindi text in Devanagari script, using an Artificial Neural Network (ANN), which improves its efficiency. One of the major reasons for the poor recognition rate is error in character segmentation. The presence of touching characters in scanned documents further complicates the segmentation process, creating a major problem when designing an effective character segmentation technique. Preprocessing, character segmentation, feature extraction, and finally, classification and recognition are the major steps followed by a general OCR. The preprocessing tasks considered in the paper are conversion of grayscale images to binary images, image rectification, and segmentation of the document's textual contents into paragraphs, lines, words, and then basic symbols. The basic symbols, obtained as the fundamental unit from the segmentation process, are recognized by the neural classifier. In this work, three feature extraction techniques have been used to improve the recognition rate: histogram of projection based on mean distance, histogram of projection based on pixel value, and vertical zero crossing. These feature extraction techniques are powerful enough to extract features of even distorted characters/symbols. For development of the neural classifier, a back-propagation neural network with two hidden layers is used. The classifier is trained and tested on printed Hindi texts. A correct recognition rate of approximately 90% is achieved.
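Two of the feature types named in this abstract, a projection histogram and vertical zero crossings, can be sketched in plain Python on a small binary image. The abstract does not define the features precisely, so the definitions below (row-wise foreground counts and per-column counts of 0/1 transitions) are generic assumptions rather than the authors' formulations, and the function names and the `glyph` image are hypothetical.

```python
def horizontal_projection(image):
    """Count foreground pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def vertical_zero_crossings(image):
    """For each column, count top-to-bottom changes between 0 and 1."""
    height = len(image)
    width = len(image[0]) if height else 0
    crossings = []
    for x in range(width):
        count = 0
        for y in range(1, height):
            # A crossing is any change of pixel value between adjacent rows.
            if image[y][x] != image[y - 1][x]:
                count += 1
        crossings.append(count)
    return crossings


# Hypothetical 4x3 binary symbol (1 = ink, 0 = background).
glyph = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]

if __name__ == "__main__":
    print(horizontal_projection(glyph))    # [1, 3, 1, 1]
    print(vertical_zero_crossings(glyph))  # [2, 0, 2]
```

Concatenating such vectors gives a fixed-length input for a classifier, provided every symbol image is first scaled to the same size.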

Study on OCR Enhancement of Homomorphic Filtering with Adaptive Gamma Value

  • Heeyeon Jo;Jeongwoo Lee;Hongrae Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.2
    • /
    • pp.101-108
    • /
    • 2024
  • AI-OCR (Artificial Intelligence Optical Character Recognition) combines OCR technology with Artificial Intelligence to overcome limitations that required human intervention. To enhance the performance of AI-OCR, training on diverse data sets is essential. However, the recognition rate declines when image colors have similar brightness levels. To solve this issue, this study employs Homomorphic filtering as a preprocessing step to clearly differentiate color levels, thereby increasing text recognition rates. While Homomorphic filtering is ideal for text extraction because of its ability to adjust the high and low frequency components of an image separately using a gamma value, it has the downside of requiring manual adjustments to the gamma value. This research proposes a range for gamma threshold values based on tests involving image contrast, brightness, and entropy. Experimental results using the proposed range of gamma values in Homomorphic filtering suggest a high likelihood for effective AI-OCR performance.
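For reference, the role of the gamma values in homomorphic filtering can be written out as follows. The abstract does not give the filter used in the study, so this is the common textbook Gaussian form, stated as an assumption; the symbols $\gamma_L$, $\gamma_H$, $c$, and $D_0$ are the conventional names, not values or notation taken from the paper.

```latex
% Homomorphic filtering of an image f(x,y): log, transform, filter, invert, exponentiate.
\begin{align}
  z(x,y) &= \ln f(x,y), \\
  Z(u,v) &= \mathcal{F}\{ z(x,y) \}, \\
  S(u,v) &= H(u,v)\, Z(u,v), \\
  g(x,y) &= \exp\!\left( \mathcal{F}^{-1}\{ S(u,v) \} \right), \\
  % Assumed Gaussian high-frequency-emphasis transfer function:
  H(u,v) &= (\gamma_H - \gamma_L)
            \left[ 1 - \exp\!\left( -c\, \frac{D^2(u,v)}{D_0^2} \right) \right]
            + \gamma_L .
\end{align}
```

Here $D(u,v)$ is the distance from the origin of the centered frequency plane, $D_0$ is the cutoff distance, and $c$ controls the sharpness of the transition. Choosing $\gamma_L < 1$ attenuates low frequencies (mostly illumination) and $\gamma_H > 1$ amplifies high frequencies (mostly edges and text strokes), which is why the choice of gamma values affects how well text separates from a background of similar brightness.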