Image-based Retrieval of Printed Korean Words using Wavelets

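  • Hye-Geum Kim (김혜금), Department of Computer Science and Statistics, Chonbuk National University
  • Jin-Ho Yang (양진호), Department of Computer Science and Statistics, Chonbuk National University
  • Jin-Seok Lee (이진석), Division of Information, Communication and Computer Engineering, Woosuk University
  • Il-Seok Oh (오일석), Information Retrieval System Research Center; Department of Computer Science, Chonbuk National University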
  • Published : 2001.02.01

Abstract

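The need for content-based document retrieval is growing rapidly. Because conventional OCR-based text conversion has clear limitations, image-based matching is gaining popularity as an alternative. A new matching method must meet two requirements: high speed and good retrieval performance. This paper proposes an image-based matching algorithm for Korean (Hangul) words that builds on the favorable properties of wavelets. Experiments were carried out on both high-quality and low-quality word images, and the results confirm that the proposed algorithm performs well in both retrieval accuracy and speed.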
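
The abstract does not spell out the matching procedure, so the following is only a minimal sketch of one common way to do wavelet-based image matching, not the authors' algorithm. It assumes word images are binarized and size-normalized to power-of-two dimensions, uses an unnormalized Haar decomposition, and compares compact signatures made of the positions and signs of the largest coefficients. All names (`haar_1d`, `haar_2d`, `signature`, `similarity`, `rank`), the `keep` parameter, and the toy 4x4 images are hypothetical and chosen for illustration.

```python
def haar_1d(values):
    """Full 1-D Haar decomposition by repeated averaging and differencing.

    The length of `values` is assumed to be a power of two.
    """
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        avgs = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avgs + diffs
        n = half
    return out


def haar_2d(image):
    """Standard 2-D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def signature(image, keep=2):
    """Positions and signs of the `keep` largest non-zero detail coefficients.

    The (0, 0) entry is the overall average, so it is excluded here.
    """
    coeffs = haar_2d(image)
    cells = [
        (abs(v), r, c, v > 0)
        for r, row in enumerate(coeffs)
        for c, v in enumerate(row)
        if (r, c) != (0, 0) and v != 0
    ]
    # Largest magnitude first; position breaks ties deterministically.
    cells.sort(key=lambda t: (-t[0], t[1], t[2]))
    return {(r, c, positive) for _, r, c, positive in cells[:keep]}


def similarity(sig_a, sig_b):
    """Jaccard overlap of two signatures (1.0 means identical)."""
    if not sig_a and not sig_b:
        return 1.0
    return len(sig_a & sig_b) / len(sig_a | sig_b)


def rank(query, database, keep=2):
    """Return (name, score) pairs sorted from best to worst match."""
    q = signature(query, keep)
    scores = [(name, similarity(q, signature(img, keep)))
              for name, img in database.items()]
    return sorted(scores, key=lambda pair: -pair[1])


# Toy "word images" (hypothetical data): two clean patterns and a noisy copy.
word_a = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
word_b = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
noisy_a = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]

if __name__ == "__main__":
    print(rank(noisy_a, {"a": word_a, "b": word_b}))
```

In this toy setup, the noisy copy with one flipped pixel still ranks closest to its original, because the largest coefficients capture coarse shape rather than individual pixels. Comparing a handful of coefficient positions instead of full images is also what makes this style of matching fast, which speaks to the two requirements named in the abstract.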
