Purchase Information Extraction Model From Scanned Invoice Document Image By Classification Of Invoice Table Header Texts

Korean title (translated): A Purchase Information Extraction Model Based on Classification of Table Header Text in Invoice Document Images

  • Hyunkyung Shin (Department of Mathematics and Information, Gachon University)
  • Received : 2012.11.12
  • Accepted : 2012.12.10
  • Published : 2012.12.31

Abstract

Developing an automated document management system for scanned invoice images is difficult because the extraction of monetary data must meet rigorous accuracy requirements, which necessitate automatic validation of the extracted values against a generative invoice table model. A typical implementation relies on internal constraints such as "amount = unit price × quantity". In this paper, we propose a novel invoice information extraction model with an improved auto-validation method that utilizes table header detection and column classification.
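The internal constraint mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `validate_row`, the dictionary keys, the sample values, and the rounding tolerance are all assumptions chosen for illustration. It only shows how one row of extracted values might be checked against "amount = unit price × quantity" once the columns have been identified.

```python
from decimal import Decimal


def validate_row(quantity, unit_price, amount, tolerance=Decimal("0.01")):
    """Check the constraint amount = unit price * quantity for one table row.

    All names and the tolerance are hypothetical; values are passed as
    strings so that Decimal avoids binary floating-point rounding.
    """
    expected = Decimal(quantity) * Decimal(unit_price)
    return abs(expected - Decimal(amount)) <= tolerance


# Hypothetical rows, as they might come out of OCR after column classification.
rows = [
    {"quantity": "3", "unit_price": "19.99", "amount": "59.97"},
    {"quantity": "2", "unit_price": "5.50", "amount": "17.00"},  # amount misread
]

# One flag per row: True if the row satisfies the constraint.
flags = [validate_row(r["quantity"], r["unit_price"], r["amount"]) for r in rows]
```

A row that fails the check is a candidate for re-recognition or manual review. The check presupposes that each extracted value has been assigned to the correct column, which is where header detection and column classification come in.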

Keywords

machine learning; OCR; text line segmentation; text classification; document image processing

Acknowledgement

Supported by: Gachon University