Research on the Table Vectorization in the Document Image

Kim, U-Seong; Sim, Jin-Bo; Park, Yong-Beom; Mun, Gyeong-Ae; Ji, Su-Yeong

The Transactions of the Korea Information Processing Society (한국정보처리학회논문지)

Volume 3, Issue 5 / Pages 1147-1159 / 1996 / 1226-9190 (pISSN)

Korea Information Processing Society (한국정보처리학회)

문서 영상 내의 테이블 벡터화 연구

Kim, U-Seong (김우성; Dept. of Computer Engineering, Hoseo University);
Sim, Jin-Bo (심진보; Dept. of Computer Engineering, Hoseo University);
Park, Yong-Beom (박용범; Dept. of Computer Science, Dankook University);
Mun, Gyeong-Ae (문경애; Systems Engineering Research Institute);
Ji, Su-Yeong (지수영; Systems Engineering Research Institute)

Published : 1996.09.01

Abstract

In this paper, we develop an efficient algorithm that vectorizes table input for a mixed document recognition system. To recognize the characters in a table, the characters and the lines must first be separated. Recognizing a table requires both recognizing the characters enclosed by the table lines and an efficient vectorization method for the lines themselves. We develop several methods for vectorizing tables. The first extracts the table line parts using 8-direction chain codes. The second extracts horizontal and vertical lines using line histograms. The third extracts the diagonal lines of a table using the cross points of the horizontal and vertical lines. Finally, we develop a table vectorization method that finds the regularity of the horizontal and vertical lines composing a table. We propose this regularity-based method for efficient table vectorization.
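The histogram and cross-point steps can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a binary image stored as a list of rows of 0/1 values (1 = black pixel), and the function names and the `min_fraction` threshold are hypothetical choices made for this example.

```python
def find_lines_by_histogram(image, min_fraction=0.8):
    """Return (rows, cols) whose black-pixel counts suggest table lines.

    image: list of equal-length rows of 0/1 values (1 = black pixel).
    min_fraction: assumed threshold; a row or column is treated as a line
    when at least this fraction of its pixels is black.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    # Projection histograms: black-pixel count per row and per column.
    row_hist = [sum(row) for row in image]
    col_hist = [sum(image[y][x] for y in range(height)) for x in range(width)]
    rows = [y for y, count in enumerate(row_hist) if count >= min_fraction * width]
    cols = [x for x, count in enumerate(col_hist) if count >= min_fraction * height]
    return rows, cols


def cross_points(rows, cols):
    """Pair every detected horizontal line with every detected vertical line."""
    return [(y, x) for y in rows for x in cols]


# A 5x7 sample: a two-cell table with lines on rows 0 and 4, columns 0, 3 and 6.
SAMPLE = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
rows, cols = find_lines_by_histogram(SAMPLE)
points = cross_points(rows, cols)
```

On real scans a line is usually several pixels thick and slightly skewed, so adjacent histogram peaks would need to be merged before the cross points are computed. That step is omitted here.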

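This paper studies an efficient method for handling table input, one of the preprocessing algorithms that underlies accurate document recognition and strongly affects recognition results in a document recognition system. To recognize the characters in a table, the border lines and the character regions must first be separated, because recognizing a table requires recognizing the characters enclosed in the blocks formed by the border lines and also requires an efficient way to vectorize those lines. Methods for vectorizing a table include extracting the table's line components with 8-direction chain codes; extracting the horizontal and vertical components with histograms and using the resulting intersection points to find the diagonal components; and extracting the horizontal and vertical line components from pixel run-lengths and using the resulting intersection points to find the diagonal components. In addition, the regularity-based table extraction method finds the regularity of the vertical and horizontal line components that make up a table and uses it to vectorize the table. This paper proposes the regularity-based method as an efficient way to vectorize tables in document images.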
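The run-length approach to extracting horizontal and vertical line components can be sketched as follows. This is an illustrative assumption about how such a step might look, not the authors' code: the image is again a list of rows of 0/1 values, and `horizontal_runs`, `vertical_runs` and `min_length` are hypothetical names.

```python
def horizontal_runs(image, min_length):
    """Return (row, start_col, end_col) for each black run of at least min_length."""
    segments = []
    for y, row in enumerate(image):
        start = None
        # A trailing 0 acts as a sentinel so a run touching the right edge is closed.
        for x, pixel in enumerate(list(row) + [0]):
            if pixel and start is None:
                start = x
            elif not pixel and start is not None:
                if x - start >= min_length:
                    segments.append((y, start, x - 1))
                start = None
    return segments


def vertical_runs(image, min_length):
    """Return (col, start_row, end_row) by running the same scan on the transpose."""
    transposed = [list(column) for column in zip(*image)]
    return horizontal_runs(transposed, min_length)


# Same layout as a small two-cell table: lines on rows 0 and 4, columns 0, 3 and 6.
GRID = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
h_segments = horizontal_runs(GRID, min_length=4)
v_segments = vertical_runs(GRID, min_length=4)
```

Each segment is already a vector (two endpoints), which is the form a vectorization step needs. Character strokes tend to produce short runs, so choosing `min_length` longer than the tallest or widest expected character is one plausible way to keep text from being mistaken for table lines.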
