Word Separation in Handwritten Legal Amounts on Bank Check by Measuring Gap Distance Between Connected Components

  • Kim, In-Cheol (Centre for Pattern Recognition and Machine Intelligence (CENPARMI) Concordia University)
  • Published: 2004.02.01

Abstract

We propose an efficient method for separating words in handwritten legal amounts on bank checks, based on the spatial gaps between connected components. Existing measures of the distance between adjacent connected components are prone to overestimation or underestimation, which leads to word separation errors. To reduce this problem, we developed a modified version of each distance measure. We also propose a new combining method, based on 4-class clustering, that integrates three different types of distance measures so that the weaknesses of each individual measure are offset by the others and overall word separation performance improves further. In word separation experiments on legal amounts, each modified distance measure improved the word separation rate by about 2-3% over its corresponding original measure. The combining method based on 4-class clustering improved performance further by effectively reducing not only the errors specific to each individual measure but also the errors common to two of the three measures.
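
As a rough illustration of the gap-based idea described in the abstract, the sketch below measures the horizontal bounding-box gap between adjacent connected components and splits the gaps into "within-word" and "between-word" groups with a simple one-dimensional 2-means clustering. This is an assumption-laden stand-in, not the paper's method: the modified distance measures and the 4-class clustering that combines three measures are not specified in the abstract, and all names here (`bounding_box_gaps`, `two_means_threshold`, `split_words`) are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: each connected component is reduced to the
# horizontal extent (x_left, x_right) of its bounding box.
Box = Tuple[int, int]


def bounding_box_gaps(boxes: List[Box]) -> List[int]:
    """Horizontal gap between adjacent components (0 when they overlap)."""
    ordered = sorted(boxes)
    if len(ordered) < 2:
        return []
    gaps = []
    right_edge = ordered[0][1]
    for left, right in ordered[1:]:
        gaps.append(max(0, left - right_edge))
        # Track the furthest right edge seen so far, so a component nested
        # inside an earlier one does not create a spurious gap.
        right_edge = max(right_edge, right)
    return gaps


def two_means_threshold(gaps: List[int], iterations: int = 20) -> float:
    """Cluster gaps into 'small' and 'large' (1-D 2-means); return the midpoint."""
    lo, hi = float(min(gaps)), float(max(gaps))
    for _ in range(iterations):
        small = [g for g in gaps if abs(g - lo) <= abs(g - hi)]
        large = [g for g in gaps if abs(g - lo) > abs(g - hi)]
        if not large:  # all gaps look alike; no second cluster
            break
        new_lo = sum(small) / len(small)
        new_hi = sum(large) / len(large)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def split_words(boxes: List[Box]) -> List[List[Box]]:
    """Group components into words, breaking where the gap exceeds the threshold."""
    ordered = sorted(boxes)
    if len(ordered) < 2:
        return [ordered] if ordered else []
    gaps = bounding_box_gaps(ordered)
    threshold = two_means_threshold(gaps)
    words = [[ordered[0]]]
    for box, gap in zip(ordered[1:], gaps):
        if gap > threshold:
            words.append([box])
        else:
            words[-1].append(box)
    return words


if __name__ == "__main__":
    components = [(0, 10), (12, 20), (22, 30), (45, 55), (57, 65)]
    for word in split_words(components):
        print(word)
```

A plain bounding-box gap is the kind of measure that can overestimate or underestimate the true spacing, for example with slanted handwriting, which is the weakness the abstract says the modified measures are meant to reduce.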
