Weighted Disassemble-based Correction Method to Improve Recognition Rates of Korean Text in Signboard Images

Weight-Based Phoneme-Unit Segmentation Correction for Improving Hangul Recognition Performance in Signboard Images

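  • 이명훈 (Konan Technology, Media Technology Support)
  • 양형정 (Chonnam National University, School of Electronics and Computer Engineering)
  • 김수형 (Chonnam National University, School of Electronics and Computer Engineering)
  • 이귀상 (Chonnam National University, School of Electronics and Computer Engineering)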
  • 김선희
  • Received : 2012.01.20
  • Accepted : 2012.02.02
  • Published : 2012.02.28

Abstract

In this paper, we propose a correction method for misrecognized Korean text in signboard images. The method, weighted Disassemble Levenshtein Distance (wDLD), segments each recognized string into phoneme units, computes its distance to the entries of a signboard text database, and selects the best-matching entry. To verify the effectiveness of the proposed method, we built a database dictionary of 1.3 million signboard names collected nationwide, with duplicates removed. We compared the proposed method with Levenshtein Distance and Disassemble Levenshtein Distance, two representative string comparison algorithms. The proposed method improved the recognition rate by an average of 29.85% over Levenshtein Distance and 6% over Disassemble Levenshtein Distance.

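This paper proposes weighted Disassemble Levenshtein Distance (wDLD), a method for correcting misrecognized results after Korean characters in signboard images captured with a mobile phone camera have been recognized. The method segments recognition candidates into phoneme units and applies operation weights. It segments the recognized string into phoneme units, computes distance values for the input, and retrieves the most similar business name from the database. To verify the efficiency of the proposed method, a database dictionary was built from 1.3 million business names nationwide, with duplicate names removed. We then compared and analyzed the correction rates of three methods: Levenshtein Distance, a representative string comparison algorithm; Disassemble Levenshtein Distance, which applies it to segmented phonemes; and the proposed weighted Disassemble Levenshtein Distance, which combines phoneme-unit segmentation of recognition candidates with operation weights. As a result, the proposed wDLD improved the recognition rate by an average of 29.85% and 6% compared with Levenshtein Distance and Disassemble Levenshtein Distance, respectively.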
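The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`decompose`, `weighted_distance`, `correct`), the use of Unicode NFD normalization to split syllables into phoneme units, the default weights of 1.0, and the tiny example dictionary are all assumptions made for illustration. The abstract does not state the actual operation weights.

```python
import unicodedata


def decompose(text):
    """Split Hangul syllables into phoneme units (jamo).

    Assumption: Unicode NFD normalization stands in for the paper's
    phoneme segmentation step.
    """
    return list(unicodedata.normalize("NFD", text))


def weighted_distance(a, b, w_ins=1.0, w_del=1.0, w_sub=1.0):
    """Levenshtein distance over two sequences with per-operation weights.

    The weights are placeholders; the paper's values are not given here.
    """
    # prev[j] holds the cost of turning the processed prefix of a into b[:j].
    prev = [j * w_ins for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        cur = [i * w_del]
        for j, cb in enumerate(b, 1):
            sub_cost = 0.0 if ca == cb else w_sub
            cur.append(min(prev[j] + w_del,          # delete ca
                           cur[j - 1] + w_ins,       # insert cb
                           prev[j - 1] + sub_cost))  # match or substitute
        prev = cur
    return prev[-1]


def correct(recognized, dictionary, **weights):
    """Return the dictionary entry closest to the recognized string."""
    query = decompose(recognized)
    return min(dictionary,
               key=lambda name: weighted_distance(query, decompose(name), **weights))


# Hypothetical example: "감남약국" is a misrecognition of "강남약국".
candidates = ["성남약국", "강남약국"]
best = correct("감남약국", candidates)
```

In this example, a syllable-level Levenshtein distance cannot separate the two candidates, since each differs from the input by one syllable. At the phoneme level, "감" and "강" differ in a single unit while "감" and "성" differ in all three, so the phoneme-level distance prefers "강남약국".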
