• Title, Summary, Keyword: character segmentation

Character Segmentation and Recognition Algorithm for Various Text Region Images (다양한 문자열영상의 개별문자분리 및 인식 알고리즘)

  • Koo, Keun-Hwi;Choi, Sung-Hoo;Yun, Jong-Pil;Choi, Jong-Hyun;Kim, Sang-Woo
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.4 / pp.806-816 / 2009
  • A character recognition system consists of four steps: text localization, text segmentation, character segmentation, and recognition. Character segmentation is both important and difficult because of noise, illumination, and similar degradations, and a high recognition rate depends on a character segmentation algorithm that performs well. Many character segmentation algorithms have been developed, and much recent research addresses the segmentation of touching or overlapping characters. Most of these algorithms cannot be applied to the text regions of management numbers marked on slabs in steel images, because those regions are irregular: characters touch because of strong illumination or nozzle trouble in the marking machine, and characters are lost. A high success rate is difficult to achieve across such varied cases. This paper describes a new character segmentation algorithm for recognizing the slab management number marked on the slab in a steel image. The pre-processing step must convert the gray image to a binary image without losing characters or producing touching characters. In the binary image, non-touching characters are separated simply by using the vertical projection profile. To separate touching characters, a combined profile is used to find candidate boundary points, and the actual character boundary is then decided with a recognition-based method. In the recognition step, noise is removed from the character images and each character image is then recognized. The proposed algorithm is effective for character segmentation and recognition of various text regions on slabs in steel images.
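
The vertical projection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `min_count` threshold, and the toy image are assumptions made for illustration, and the binary image is assumed to be a list of rows with 1 for ink and 0 for background.

```python
def vertical_projection(binary):
    """Count foreground (1) pixels in each column of a binary image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def split_columns(binary, min_count=1):
    """Return (start, end) column ranges, end exclusive, whose profile is >= min_count.

    Runs of columns below min_count are treated as gaps between characters.
    """
    profile = vertical_projection(binary)
    segments = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_count and start is None:
            start = x                      # a character run begins
        elif count < min_count and start is not None:
            segments.append((start, x))    # the run ended at the previous column
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


# Hypothetical 3x8 image holding three non-touching glyphs.
image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
print(split_columns(image))  # [(0, 2), (4, 5), (6, 8)]
```

A split like this only works when an empty column separates neighbouring characters, which is why the abstract treats touching characters with a separate, recognition-based step.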

Character Segmentation using Side Profile Pattern (측면윤곽 패턴을 이용한 접합 문자 분할 연구)

  • Jung Minchul
    • Journal of Intelligence and Information Systems / v.10 no.3 / pp.1-10 / 2004
  • In this paper, a new character segmentation algorithm for machine-printed character recognition is proposed. The proposed approach overcomes the weak points of both feature-based and recognition-based approaches to character segmentation. The paper defines side profiles of touching characters. Using these side profiles, the algorithm segments touching characters and decides on a candidate single character without any help from a character recognizer. The paper also defines a cutting cost, which lets the algorithm find an optimal segmenting path. The algorithm was evaluated in a real envelope reader system, which recognizes addresses on U.S. mail pieces and sorts the mail pieces. 3359 mail pieces were tested. The proposed character segmentation improved performance from 68.92% to 80.08%.

Character Segmentation and Recognition Algorithm for Steel Manufacturing Process Automation (슬라브 제품 정보 인식을 위한 문자 분리 및 문자 인식 알고리즘 개발)

  • Choi, Sung-Hoo;Yun, Jong-Pil;Park, Young-Su;Park, Jee-Hoon;Koo, Keun-Hwi;Kim, Sang-Woo
    • Proceedings of the KIEE Conference / pp.389-391 / 2007
  • This paper describes a printed character segmentation and recognition system for slabs in the steel manufacturing process. To increase the recognition rate, it is important to improve the success rate of character segmentation. Because the front surface of a slab is not uniform and its temperature is very high, marked characters are both damaged and noisy. In addition, because almost all marked characters are very thick and the space between characters is only about 10 to 15 mm, there are many touching characters. An appropriate character image preprocessing and segmentation algorithm is therefore needed. In this paper we propose a multi-local thresholding method for restoring damaged characters and a modified touching-character segmentation algorithm for marked characters. Finally, an effective multi-class SVM is used to recognize the segmented characters.

Character Segmentation Using Side Profile Pattern (측면 윤곽 패턴을 이용한 접합 문자 분할법)

  • 정민철
    • Journal of the Korea Academia-Industrial cooperation Society / v.4 no.3 / pp.248-251 / 2003
  • In this paper, a new segmentation method for machine-printed character strings of arbitrary length is proposed. Character recognition requires character segmentation as a preceding step. However, character segmentation itself requires character recognition capability in order to segment with fewer errors. Both problems must therefore be attacked simultaneously. A new recognition-based segmentation method is proposed that recognizes a character within touching characters with the help of defined side profiles. Matching the side profiles of touching characters against the side profiles of prototypes gives single-character candidates within the touching characters. The method then segments the touching characters according to cutting costs.

Character Segmentation in Chinese Handwritten Text Based on Gap and Character Construction Estimation

  • Zhang, Cheng Dong;Lee, Guee-Sang
    • International Journal of Contents / v.8 no.1 / pp.39-46 / 2012
  • Character segmentation is a preprocessing step in many offline handwriting recognition systems. In this paper, Chinese characters are categorized into seven different structures. For each structure, the character size and its range of variation are estimated from typical handwritten samples. Component removal and merge criteria are presented to remove punctuation symbols or to merge small components that are part of a character. Finally, criteria are proposed for segmenting adjacent characters that are connected to each other or overlap.

Development of an Algorithm for Korean Letter Recognition using Letter Component Analysis (조합형 문자구성을 이용한 문서 인식 알고리즘)

  • 김영재;이호재;김희식
    • Proceedings of the Korean Society of Precision Engineering Conference / pp.427-430 / 1995
  • This paper proposes a new image processing algorithm for recognizing Korean documents. It extracts the syllable region from the input character image and then recognizes the consonant and the vowel in the character. Precise segmentation is very important for recognizing the input character. The input image has 8-bit grayscale resolution. Not only the shape but also the vertical and horizontal line dispersion graphs are used for segmentation. The results show higher character segmentation accuracy.

An Internal Segmentation Method for the On-line Recognition of Run-on Characters (온라인 연속 필기 한글의 인식을 위한 내부 문자 분할에 관한 연구)

  • 정진영;전병환;김우성;김재희
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.9 / pp.1231-1238 / 1995
  • In on-line character recognition, segmenting the input characters is important. This paper proposes an internal character segmentation algorithm. The algorithm produces candidate words by considering possible combinations of Korean alphabet letters. In this process, projections of strokes onto the horizontal axis are used to remove ambiguities among candidate words. In experiments, the internal segmentation algorithm performs better than the external segmentation algorithm as the gap between sample characters becomes smaller.
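
To make "projections of strokes onto the horizontal axis" concrete, here is a small sketch under stated assumptions. The abstract does not say how the projections are compared, so the gap measure below, the function names, and the sample strokes are all hypothetical; each stroke is assumed to be a list of (x, y) pen points.

```python
def project_stroke(stroke):
    """Project a stroke (a list of (x, y) points) onto the horizontal axis."""
    xs = [x for x, _ in stroke]
    return (min(xs), max(xs))


def horizontal_gaps(strokes):
    """Gaps between consecutive stroke projections, ordered by left edge.

    A positive value is empty space between projections; zero or a negative
    value means the projections touch or overlap.
    """
    intervals = sorted(project_stroke(s) for s in strokes)
    if not intervals:
        return []
    gaps = []
    right = intervals[0][1]
    for left, end in intervals[1:]:
        gaps.append(left - right)
        right = max(right, end)  # keep the furthest right edge seen so far
    return gaps


# Hypothetical pen strokes: the first two overlap horizontally, the third stands apart.
strokes = [
    [(0, 0), (4, 5)],
    [(3, 1), (6, 2)],
    [(9, 0), (12, 4)],
]
print(horizontal_gaps(strokes))  # [-1, 3]
```

When every gap is small or negative, a purely gap-based (external) split has little to work with, which is the situation in which the abstract reports that internal segmentation performs better.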

A Study on the Preprocessing Method Using Construction of Watershed for Character Image Segmentation

  • Nam Sang Yep;Choi Young Kyoo;Kwon Yun Jung;Lee Sung Chang
    • Proceedings of the IEEK Conference / pp.814-818 / 2004
  • Off-line handwritten character recognition suffers from incomplete preprocessing because it lacks dynamic and timing information and must cope with varied handwriting, extreme overlap of consonants and vowels, and many erroneous stroke images. It therefore requires study of preprocessing methods such as binarization and thinning. This paper considers the running time of the watershed algorithm and the quality of the resulting image as preprocessing for off-line handwritten Korean character recognition. It proposes applying an effective watershed algorithm to segment the character region from the background region in a gray-level character image, together with a segmentation function for binarization based on the extracted watershed image. It also proposes a thinning method that extracts the skeleton effectively through a conditional test mask, taking running time and skeleton quality into account, and compares the existing methods with the proposed methods in terms of running time and quality. The watershed image conversion uses the Prewitt operator to produce the gradient image and extracts local minima by considering the 8-neighborhood of each pixel. A method based on the difference of mean values is used in the region merging step. The watershed image converted in this way, when applied to the segmentation function, separates the character region from the background region effectively. The average execution time was 2.16 seconds for the previous method and 1.72 seconds for the proposed method. The results show that, compared with the previous method, the proposed method removes noise effectively in the presence of overlapping strokes.
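
The gradient and local-minimum steps named in this abstract can be sketched as follows. This is an illustration only, not the paper's code: the border handling (border pixels left at 0), the non-strict definition of a local minimum, the function names, and the sample image are assumptions.

```python
import math

# Standard 3x3 Prewitt kernels for horizontal and vertical intensity change.
PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
PREWITT_Y = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]


def prewitt_gradient(gray):
    """Gradient magnitude of a grayscale image given as a list of rows.

    Only interior pixels are computed; border pixels stay 0.0 (an assumption).
    """
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    v = gray[y + dy][x + dx]
                    gx += PREWITT_X[dy + 1][dx + 1] * v
                    gy += PREWITT_Y[dy + 1][dx + 1] * v
            out[y][x] = math.hypot(gx, gy)
    return out


def is_local_minimum(img, y, x):
    """True if no in-bounds 8-neighbour of (y, x) has a smaller value."""
    h, w = len(img), len(img[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if (dy or dx) and 0 <= ny < h and 0 <= nx < w and img[ny][nx] < img[y][x]:
                return False
    return True


# Hypothetical image with a vertical step edge between columns 1 and 2.
gray = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
gradient = prewitt_gradient(gray)
print(gradient[1])                        # [0.0, 30.0, 30.0, 0.0]
print(is_local_minimum(gradient, 0, 0))   # True
print(is_local_minimum(gradient, 1, 1))   # False
```

In a watershed pipeline the local minima of the gradient image typically serve as the seeds from which regions are grown; the region merging by difference of mean values described in the abstract is not shown here.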

Bilingual document analysis and character segmentation using connected components (연결요소를 이용한 한.영 혼용문서의 구조분석 및 낱자분리)

  • 김민기;권영빈;한상용
    • The Journal of Korean Institute of Communications and Information Sciences / v.22 no.3 / pp.410-422 / 1997
  • In this paper, we describe a bottom-up document structure analysis method for bilingual Korean-English documents. We propose a character segmentation method based on the layout information of the connected components of each character. In much previous research, a document has been analyzed into text blocks and graphics. We analyze a document into four parts: text, table, graphic, and separator. Text is recursively subdivided into text blocks, text lines, words, and characters. To extract the characters in bilingual text, we propose a new method of word separation for Korean or English. Furthermore, we apply a character merging and segmentation method to the Korean word blocks in accordance with the properties of Hangul. Experimental results on various documents show that the proposed method works very effectively for document structure analysis and character segmentation.
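
Connected components, the building block this abstract relies on, can be extracted with a short breadth-first search. The sketch below is a generic illustration rather than the authors' method: 8-connectivity, the bounding-box output format, and the sample page are assumptions.

```python
from collections import deque


def connected_components(binary):
    """Bounding boxes (left, top, right, bottom), inclusive, of 8-connected regions of 1s."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            left = right = x
            top = bottom = y
            while queue:
                cy, cx = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical binary page fragment with two separate blobs.
page = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 1],
]
print(connected_components(page))  # [(0, 0, 1, 1), (3, 1, 4, 2)]
```

Bounding boxes like these carry the layout information (position, size, spacing) that a bottom-up analysis can group into characters, words, and lines; a Hangul syllable often spans several components, which is why the abstract adds a merging step.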

A Novel Character Segmentation Method for Text Images Captured by Cameras

  • Lue, Hsin-Te;Wen, Ming-Gang;Cheng, Hsu-Yung;Fan, Kuo-Chin;Lin, Chih-Wei;Yu, Chih-Chang
    • ETRI Journal / v.32 no.5 / pp.729-739 / 2010
  • Due to the rapid development of mobile devices equipped with cameras, instant translation of any text seen in any context is possible. Mobile devices can serve as a translation tool by recognizing the text present in captured scenes. Images captured by cameras contain more external or unwanted effects, which need not be considered in traditional optical character recognition (OCR). In this paper, we segment a text image captured by a mobile device into individual characters to facilitate OCR kernel processing. Text detection and text line construction must be performed before character segmentation. A novel character segmentation method that integrates touched character filters is applied to text images captured by cameras. In addition, periphery features are extracted from the segmented images of touched characters and fed as inputs to support vector machines to calculate confidence values. In our experiment, the accuracy rate of the proposed character segmentation system is 94.90%, which demonstrates the effectiveness of the proposed method.