Bilingual document analysis and character segmentation using connected components


The Journal of Korean Institute of Communications and Information Sciences (한국통신학회논문지)

Volume 22, Issue 3 / Pages 410-422 / 1997 / 1226-4717 (pISSN) / 2287-3880 (eISSN)

The Korean Institute of Communications and Information Sciences (한국통신학회)


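Korean title: 연결요소를 이용한 한.영 혼용문서의 구조분석 및 낱자분리 (Structure analysis and character segmentation of mixed Korean-English documents using connected components)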

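Min-Ki Kim (김민기; Department of Computer Engineering, Chung-Ang University) ;
Young-Bin Kwon (권영빈; Department of Computer Engineering, Chung-Ang University) ;
Sang-Yong Han (한상용; Department of Computer Engineering, Chung-Ang University)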

Published : 1997.03.01


Abstract

In this paper, we describe a bottom-up method for analyzing the structure of bilingual Korean-English documents. We propose a character segmentation method based on the layout information of the connected components of each character. Many previous studies divide a document only into text blocks and graphics. We divide a document into four parts: text, tables, graphics, and separators. Text is recursively subdivided into text blocks, text lines, words, and characters. To extract characters from bilingual text, we propose a new method for separating Korean words from English words. Furthermore, on the Korean word blocks, we apply a character merging and segmentation method that follows the properties of Hangul. Experimental results on various documents show that the proposed method works effectively for both document structure analysis and character segmentation.
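
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of bottom-up analysis with connected components, not the authors' method. It assumes a binary image stored as a list of rows of 0/1, labels 8-connected foreground regions, and then applies a hypothetical merging rule (`merge_by_horizontal_overlap`, a name introduced here for illustration) that joins components whose column ranges overlap. Such a rule loosely reflects the fact that one Hangul character often consists of several disconnected strokes.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected
    foreground regions in a binary image given as a list of rows of 0/1."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def merge_by_horizontal_overlap(boxes):
    """Hypothetical merging rule (an assumption, not taken from the paper):
    combine boxes whose column ranges overlap, so that vertically stacked
    strokes end up in a single character box."""
    merged = []
    for top, left, bottom, right in sorted(boxes, key=lambda b: b[1]):
        if merged and left <= merged[-1][3]:
            pt, pl, pb, pr = merged[-1]
            merged[-1] = (min(pt, top), pl, max(pb, bottom), max(pr, right))
        else:
            merged.append((top, left, bottom, right))
    return merged


# Toy image: two stacked horizontal strokes on the left, one vertical
# stroke on the right.
image = [
    [1, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 0],
]
components = connected_components(image)
print(components)                               # three separate components
print(merge_by_horizontal_overlap(components))  # [(0, 0, 2, 2), (0, 4, 2, 4)]
```

In the toy image the two stacked strokes are merged into one box while the vertical stroke stays separate. A practical system would need further rules, for example size and spacing thresholds, to decide when components belong to the same character and when touching characters must be split.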


