Integrated Search | Korea Science

Recent Trends in Deep Learning-Based Optical Character Recognition

민기현;이아람;김거식;김정은;강현서;이길행
- 전자통신동향분석
- Vol. 37, No. 5
- pp.22-32
- 2022
Optical character recognition (OCR) is a core technology in many fields, including the digitization of archival documents, industrial automation, autonomous driving, video analytics, medicine, and finance. It originated in 1928 with pattern matching and, with the advent of artificial intelligence, has since evolved into a high-performance character recognition technology. Recent work studies methods for detecting curved text and characters set against complicated backgrounds. Deep learning models are also being developed to recognize text in various orientations and resolutions, text affected by perspective distortion, illumination reflection, or partial occlusion, and text in complex fonts, special characters, and artistic styles. This report reviews recent deep learning-based text detection and recognition methods and their applications.
https://doi.org/10.22648/ETRI.2022.J.370503

Example-based Super Resolution Text Image Reconstruction Using Image Observation Model

박규로;김인중
- 정보처리학회논문지B
- Vol. 17B, No. 4
- pp.295-302
- 2010
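Example-based super-resolution (EBSR) reconstructs a high-resolution image by learning patch-wise correspondences between high-resolution and low-resolution images, and has the advantage that it can reconstruct a high-resolution image from a single low-resolution image. However, it produces considerable noise when applied to text images whose font type or size differs from that of the training images. The reason is that, in the matching step of reconstruction, input patches can be inappropriately matched with high-resolution patches in the dictionary. This paper proposes a new patch matching method to overcome this problem. The proposed method uses an image observation model to preserve the correlation between the input and output images, effectively suppressing noise caused by mismatched patches. This not only improves output image quality but also makes it possible to use a large patch dictionary covering fonts of various types and sizes, greatly improving adaptability to variations in font type and size. In experiments, the proposed method showed better reconstruction performance than existing methods on images with varied fonts and sizes. Recognition accuracy also improved from 88.58% to 93.54%, confirming that the method is effective for improving recognition performance as well.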
https://doi.org/10.3745/KIPSTB.2010.17B.4.295
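
The abstract above does not say which observation model is used or how it enters the matching step, so the following is only a minimal sketch under stated assumptions: the observation model is taken to be block averaging (`observe`), and a candidate high-resolution patch is considered consistent when its observed (downsampled) version is close to the low-resolution input patch in the least-squares sense. The names `observe`, `sq_error`, and `pick_consistent_patch` are hypothetical and do not come from the paper.

```python
from typing import List

Patch = List[List[float]]  # small grayscale patch, row-major


def observe(high: Patch, factor: int = 2) -> Patch:
    """Assumed observation model: average each factor x factor block."""
    rows = len(high) // factor
    cols = len(high[0]) // factor
    return [
        [
            sum(
                high[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            )
            / (factor * factor)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def sq_error(a: Patch, b: Patch) -> float:
    """Sum of squared differences between two equally sized patches."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def pick_consistent_patch(low: Patch, candidates: List[Patch], factor: int = 2) -> Patch:
    """Choose the candidate whose observed version best explains the low-res input."""
    return min(candidates, key=lambda high: sq_error(observe(high, factor), low))
```

Under these assumptions, a dictionary patch that looks plausible on its own but does not reproduce the input when passed through the observation model is rejected, which is one way a larger, more varied dictionary could be used without adding mismatch noise.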

Development of character recognition system for the mixed font style in the steel processing material

Lee, Jong-Hak;Park, Sang-Gug;Park, Soo-Young
- 제어로봇시스템학회:학술대회논문집
- 제어로봇시스템학회 ICCAS 2005
- pp.1431-1434
- 2005
In a steel production line, molten metal from the furnace is cast into billets, which then move to the heating furnace of the hot rolling mill. This paper describes the development of a system for recognizing characters marked on billets in a steel plant, either with a template-marking plate or by hand. For template-marked characters, we propose the PSVM algorithm. For handwritten characters, we propose a combination of the CCD algorithm and the PSVM algorithm. The PSVM algorithm needs somewhat more time than the conventional KLT or SVM algorithms. The CCD algorithm has a shorter classification time than PSVM and is well suited to classifying the closed-curve characters among the Arabic numerals. To validate the algorithms, we compared them with conventional methods, namely the KLT classifier and one-to-one SVM. The recognition rate on the tested billet characters was 97% for template-marked characters using the proposed PSVM algorithm and 95.5% for handwritten characters using the combined CCD and PSVM algorithm. The experimental results show that the proposed method achieves a higher recognition rate than the conventional methods for both template-marked and handwritten characters. Using our algorithm, we installed a real-time character recognition system on the billet processing line of a steel plant.

A Comparative Study on OCR using Super-Resolution for Small Fonts

Cho, Wooyeong;Kwon, Juwon;Kwon, Soonchu;Yoo, Jisang
- International journal of advanced smart convergence
- Vol. 8, No. 3
- pp.95-101
- 2019
Recently, a number of issues with text recognition using Tesseract have been reported. One is that recognition accuracy is significantly lower for small fonts. Tesseract extracts text by creating directed outlines in the image. It then searches the Tesseract database and uses template matching against characters with similar feature points to select the character with the lowest error. Poor text extraction therefore lowers recognition accuracy. In this paper, we compare text recognition accuracy after applying various super-resolution methods to small text images, and examine how recognition accuracy varies with image size. To recognize small Korean text images, we use super-resolution algorithms based on deep learning models: SRCNN, ESRCNN, DSRCNN, and DCSCN. The training and test dataset consists of scanned Korean-language images. The images, set in 12 pt font, were resized by factors from 0.5 to 0.8. The experiment was performed on the x0.5 resized images, and the results showed that DCSCN super-resolution was the most effective method, reducing the precision error rate by 7.8% and the recall error rate by 8.4%. These results demonstrate that text recognition accuracy for small Korean fonts can be improved by adding super-resolution to the OCR preprocessing module.
https://doi.org/10.7236/IJASC.2019.8.3.95

A Rating Recognition System of Broadcast Program using Template Matching

황선주;조대제
- 한국콘텐츠학회논문지
- Vol. 4, No. 1
- pp.24-31
- 2004
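This paper implements a rating recognition system that takes broadcast video frames showing a rating mark as input. Because the rating symbols of broadcast programs have a standardized form, template matching is used. In the experiments, characteristic patterns were extracted from the standard digits of the typeface used by broadcasters; from these, the unique patterns possessed only by the digit of a given rating were extracted; and the unique patterns were then compared against the input image for matching. With 3×3 patterns the recognition rate was 88.6%, but the rate approached 100% as the pattern size approached that of the rating symbol's template.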
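
Template matching on binary images can be sketched as follows. This is a generic sliding-window illustration, not the system from the paper: the agreement score, the function names `match_score`, `best_match`, and `recognize`, and the label dictionary are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]  # binary image: 1 = foreground, 0 = background


def match_score(image: Grid, template: Grid, top: int, left: int) -> float:
    """Fraction of template pixels that agree with the image window at (top, left)."""
    rows, cols = len(template), len(template[0])
    agree = sum(
        1
        for r in range(rows)
        for c in range(cols)
        if image[top + r][left + c] == template[r][c]
    )
    return agree / (rows * cols)


def best_match(image: Grid, template: Grid) -> Tuple[float, int, int]:
    """Slide the template over the image; return (score, top, left) of the best window."""
    rows, cols = len(template), len(template[0])
    best = (-1.0, 0, 0)
    for top in range(len(image) - rows + 1):
        for left in range(len(image[0]) - cols + 1):
            score = match_score(image, template, top, left)
            if score > best[0]:  # keep the first window with the highest score
                best = (score, top, left)
    return best


def recognize(image: Grid, templates: Dict[str, Grid]) -> str:
    """Return the label whose template has the highest best-window score."""
    return max(templates, key=lambda label: best_match(image, templates[label])[0])
```

A small template such as 3×3 can agree with many unrelated windows, which is consistent with the abstract's observation that recognition improves as the pattern grows toward the full symbol.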

Deriving TrueType Features for Letter Recognition in Word Images

SeongAh CHIN
- 한국시뮬레이션학회논문지
- Vol. 11, No. 3
- pp.35-48
- 2002
In this work, we describe a method for extracting TrueType features to support letter recognition. Although many document processing techniques exist, few can recognize a letter together with its TrueType features without OCR, an approach that speeds up processing for image text retrieval. By reviewing the mechanism that generates digital fonts and the origin of TrueType, we observe that each TrueType glyph is drawn from the contours in its glyph table. Hence, for a letter in a specific TrueType font, we can derive the segment density, defined by the number of occurrences over a segment width. Certain occurrence counts appear frequently because of the fixed segment width. We perform letter recognition by comparing the TrueType feature library of a letter with the features from input word images. Experiments were carried out to assess the robustness of the proposed method and showed acceptable results.

VOICE CONTROL SYSTEM FOR TELEVISION SET USING MASKING MODEL AS A FRONT-END OF SPEECH RECOGNIZER

Usagawa, Tsuyoshi;Iwata, Makoto;Ebata, Masanao
- 한국음향학회:학술대회논문집
- 한국음향학회 1994 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA
- pp.991-996
- 1994
Surrounding noise often degrades the performance of a speech recognition system used in an office or home. The situation is especially serious when colored, nonstationary noise, such as sound from a television or other audio equipment, is present. The authors previously proposed a voice control system for a television set using an adaptive noise canceler, which works well even when the television sound is at a level comparable to that of the speech. In this paper, a new speech recognition front-end is introduced for the voice control system. This front-end uses a simplified masking model to reduce the effect of residual noise. According to experimental results, 90% correct recognition is achieved even when the level of the television sound is almost 15 dB higher than that of the speech.

Statistical Model-Based Noise Reduction Approach for Car Interior Applications to Speech Recognition

Lee, Sung-Joo;Kang, Byung-Ok;Jung, Ho-Young;Lee, Yun-Keun;Kim, Hyung-Soon
- ETRI Journal
- Vol. 32, No. 5
- pp.801-809
- 2010
This paper presents a statistical model-based noise suppression approach for voice recognition in a car environment. To alleviate the spectral whitening and signal distortion problems of the traditional decision-directed Wiener filter, we combine a decision-directed method with an original spectrum reconstruction method and develop a new two-stage noise reduction filter estimation scheme. When the tradeoff between performance and computational efficiency on resource-constrained automotive devices is considered, the ETSI standard advanced distributed speech recognition front-end (ETSI-AFE) can be an effective solution, and ETSI-AFE is also based on the decision-directed Wiener filter. Thus, a series of voice recognition and computational complexity tests are conducted by comparing the proposed approach with ETSI-AFE. The experimental results show that the proposed approach is superior to the conventional method in terms of speech recognition accuracy, while the computational cost and frame latency are significantly reduced.
https://doi.org/10.4218/etrij.10.1510.0024
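
For orientation, the traditional decision-directed Wiener filter that the abstract above takes as its baseline can be sketched per frequency bin as below. This is the textbook form only, not the proposed two-stage scheme and not ETSI-AFE; the function names, the smoothing factor `alpha`, and the floor `xi_min` are illustrative assumptions.

```python
from typing import List


def wiener_gains(
    noisy_power: List[float],
    noise_power: List[float],
    prev_clean_power: List[float],
    alpha: float = 0.98,
    xi_min: float = 1e-3,
) -> List[float]:
    """Per-bin Wiener gains using the decision-directed a priori SNR estimate."""
    gains = []
    for y2, n2, s2_prev in zip(noisy_power, noise_power, prev_clean_power):
        gamma = y2 / n2  # a posteriori SNR of the current frame
        # Blend the previous frame's clean-speech estimate with the current evidence.
        xi = alpha * (s2_prev / n2) + (1.0 - alpha) * max(gamma - 1.0, 0.0)
        xi = max(xi, xi_min)  # floor the a priori SNR to avoid a zero gain
        gains.append(xi / (1.0 + xi))  # Wiener gain
    return gains


def enhance(
    frames: List[List[float]], noise_power: List[float], alpha: float = 0.98
) -> List[List[float]]:
    """Apply the filter frame by frame to power spectra; returns clean-power estimates."""
    prev_clean = [0.0] * len(noise_power)
    enhanced = []
    for frame in frames:
        gains = wiener_gains(frame, noise_power, prev_clean, alpha)
        clean = [g * g * y2 for g, y2 in zip(gains, frame)]
        enhanced.append(clean)
        prev_clean = clean  # feeds the next frame's a priori SNR estimate
    return enhanced
```

The recursion through `prev_clean` is what makes the estimate decision-directed: each frame's gain depends on the previous frame's enhanced output.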

Fontface Recognition Using the Font Density Function

진성아;주문원
- 한국멀티미디어학회:학술대회논문집
- 한국멀티미디어학회 2001 Spring Conference Proceedings
- pp.189-191
- 2001
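Fonts are a basic element for describing text information, and each typeface carries its own distinctive affective information. As a preprocessing stage for a system that automatically extracts affective information from the distribution of English fonts appearing in a document, this study introduces a module that recognizes specific fonts in a document. The shape of most fonts created by font designers is determined by 2D boundary coordinates called glyph data. Each font can be identified from a linear combination of the font density function defined from this data and the general probability with which each character occurs.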

Text Region Extraction Using Pattern Histogram of Character-Edge Map in Natural Images

박종천;황동국;이우람;전병민
- 한국산학기술학회논문지
- Vol. 7, No. 6
- pp.1167-1174
- 2006
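Extracting text regions from natural images is useful in many applications, such as license plate recognition. This paper therefore proposes a method for extracting text regions using the pattern histogram of a character-edge map. Sixteen types of edge maps are generated and combined to extract eight types of character-edge map features that exhibit character properties. Candidate text regions are extracted using the character-edge map features and verified using the pattern histogram of the character-edge map and structural features of text regions. Experimental results show that the proposed method effectively extracts text regions from natural images with complex backgrounds, various fonts, and various text colors.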