• Title/Summary/Keyword: Character Extraction

Major Character Extraction using Character-Net (Character-Net을 이용한 주요배역 추출)

  • Park, Seung-Bo;Kim, Yoo-Won;Jo, Geun-Sik
    • Journal of Internet Computing and Services / v.11 no.1 / pp.85-102 / 2010
  • In this paper, we propose Character-Net, a novel method for analyzing video and representing the relationships among characters based on their contexts in the video sequences. Because a huge amount of video content is generated every day, technologies for searching and summarizing this content have become an important issue. Accordingly, a number of studies have addressed the extraction of semantic information from videos or scenes. In general, the stories of videos such as TV series or commercial movies progress through their characters, so the relationships between the characters and their contexts must be identified in order to summarize a video. To address these issues, we propose Character-Net, which supports the extraction of major characters in a video. We first identify the characters that appear in a group of video shots and then extract the speaker and listeners in those shots. Finally, the characters are represented as a network whose graphs present the relationships among them. We present empirical experiments that demonstrate Character-Net and evaluate its performance in extracting major characters.
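
The speaker/listener network described in this abstract can be illustrated with a minimal sketch. It assumes the shots have already been reduced to (speaker, listeners) pairs, and it ranks characters by weighted degree, which is an assumption made here for illustration and not necessarily the measure used in the paper. The character names and function names are hypothetical.

```python
from collections import defaultdict

def build_character_net(dialogues):
    """Build a directed, weighted graph from (speaker, listeners) pairs.

    edges[speaker][listener] counts how often speaker addressed listener.
    """
    edges = defaultdict(lambda: defaultdict(int))
    for speaker, listeners in dialogues:
        for listener in listeners:
            if listener != speaker:
                edges[speaker][listener] += 1
    return edges

def rank_characters(edges):
    """Rank characters by weighted degree (assumed measure: out-weight plus in-weight)."""
    score = defaultdict(int)
    for speaker, targets in edges.items():
        for listener, weight in targets.items():
            score[speaker] += weight
            score[listener] += weight
    # Sort by descending score, then by name so ties have a fixed order.
    return sorted(score.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical shot annotations: who speaks, and who listens.
dialogues = [
    ("Alice", ["Bob"]),
    ("Bob", ["Alice", "Carol"]),
    ("Alice", ["Bob", "Carol"]),
    ("Carol", ["Alice"]),
]
ranking = rank_characters(build_character_net(dialogues))
```

Characters at the top of `ranking` would be the candidates for major characters under this assumed scoring.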

A Study on Stroke Extraction for Handwritten Korean Character Recognition (필기체 한글 문자 인식을 위한 획 추출에 관한 연구)

  • Choi, Young-Kyoo;Rhee, Sang-Burm
    • The KIPS Transactions:PartB / v.9B no.3 / pp.375-382 / 2002
  • Handwritten character recognition is classified into on-line and off-line handwritten character recognition. On-line recognition has achieved remarkable results compared to off-line recognition, because it can acquire dynamic writing information, such as the writing order and the position of a stroke, by means of a pen-based electronic input device such as a tablet. In contrast, no dynamic information can be acquired in off-line handwritten character recognition, and the extreme overlap between consonants and vowels and the heavy noise between strokes make recognition performance depend on the result of the preprocessing. This paper proposes a method that effectively extracts strokes, including the dynamic information of characters, for off-line Korean handwritten character recognition. First, as preprocessing, the method enhances and binarizes the input handwritten character image using the watershed algorithm. Next, the skeleton is extracted using a modified Lu and Wang thinning algorithm, and segment pixel arrays are extracted by abstracting the feature points of the characters. Vectorization is then performed with a maximum permissible error method. When several strokes are bound in one segment, the segment pixel array is divided into two or more segment vectors. To reconstruct the extracted segment vectors into complete strokes, the directional component of each vector is modified using a right-hand writing coordinate system. By combining adjacent segment vectors that can be merged, complete strokes suitable for character recognition are reconstructed. Experiments verify that the proposed method is suitable for handwritten Korean character recognition.
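
The abstract does not spell out its maximum permissible error method, so the sketch below substitutes a standard technique with the same goal, recursive polyline splitting in the style of Ramer-Douglas-Peucker: a chain of skeleton pixels is replaced by line segments such that no pixel lies farther than `max_error` from its segment. The function names and the sample chain are hypothetical.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length

def vectorize(pixels, max_error):
    """Approximate an ordered pixel chain by segments within max_error."""
    if len(pixels) < 3:
        return list(pixels)
    first, last = pixels[0], pixels[-1]
    # Find the pixel farthest from the chord joining the two endpoints.
    index, worst = 0, 0.0
    for i in range(1, len(pixels) - 1):
        d = point_line_distance(pixels[i], first, last)
        if d > worst:
            index, worst = i, d
    if worst <= max_error:
        return [first, last]
    # Split at the farthest pixel and vectorize both halves.
    left = vectorize(pixels[: index + 1], max_error)
    right = vectorize(pixels[index:], max_error)
    return left[:-1] + right

# An L-shaped skeleton: a horizontal run followed by a vertical run.
chain = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
vertices = vectorize(chain, max_error=0.5)
```

Here the corner pixel becomes a vertex, so the chain is split into two segment vectors, which loosely mirrors the case where several strokes are bound in one segment.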

Text Region Extraction Using Pattern Histogram of Character-Edge Map in Natural Images (문자-에지 맵의 패턴 히스토그램을 이용한 자연이미지에세 텍스트 영역 추출)

  • Park, Jong-Cheon;Hwang, Dong-Guk;Lee, Woo-Ram;Jun, Byoung-Min
    • Journal of the Korea Academia-Industrial cooperation Society / v.7 no.6 / pp.1167-1174 / 2006
  • Text region detection in natural scenes is useful in many applications, such as vehicle license plate recognition. In this paper, we propose a text region extraction method that uses a pattern histogram of character-edge maps. We create 16 kinds of edge maps from the extracted edges and then combine them into 8 kinds of edge maps that carry character features. We extract candidate text regions using these 8 kinds of character-edge maps. The candidate text regions are verified using the pattern histogram of the character-edge maps and the structural features of text regions. Experimental results show that the proposed method effectively extracts text regions with complex backgrounds and various font sizes and font colors.

Optical Character Recognition for Hindi Language Using a Neural-network Approach

  • Yadav, Divakar;Sanchez-Cuadrado, Sonia;Morato, Jorge
    • Journal of Information Processing Systems / v.9 no.1 / pp.117-140 / 2013
  • Hindi is the most widely spoken language in India, with more than 300 million speakers. Because the characters of texts written in Hindi are not separated as they are in English, the Optical Character Recognition (OCR) systems developed for the Hindi language have a very poor recognition rate. In this paper we propose an OCR system for printed Hindi text in Devanagari script, using an Artificial Neural Network (ANN), which improves its efficiency. One of the major reasons for the poor recognition rate is error in character segmentation. The presence of touching characters in the scanned documents further complicates the segmentation process, creating a major problem when designing an effective character segmentation technique. Preprocessing, character segmentation, feature extraction, and finally, classification and recognition are the major steps followed by a general OCR system. The preprocessing tasks considered in the paper are conversion of grayscale images to binary images, image rectification, and segmentation of the document's textual contents into paragraphs, lines, words, and then basic symbols. The basic symbols, obtained as the fundamental unit from the segmentation process, are recognized by the neural classifier. In this work, three feature extraction techniques (histogram of projection based on mean distance, histogram of projection based on pixel value, and vertical zero crossing) are used to improve the recognition rate. These feature extraction techniques are powerful enough to extract features of even distorted characters/symbols. For the neural classifier, a back-propagation neural network with two hidden layers is used. The classifier is trained and tested on printed Hindi texts. A correct recognition rate of approximately 90% is achieved.
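
Two of the feature types named in this abstract can be sketched on a small binary glyph. The exact definitions used in the paper are not given here, so this sketch assumes a projection histogram that counts foreground pixels per row and column, and a vertical crossing count that tallies background-to-foreground transitions down each column. The function names and the sample glyph are hypothetical.

```python
def projection_histogram(image):
    """Count foreground pixels (1s) in each row and each column."""
    rows = [sum(row) for row in image]
    cols = [sum(col) for col in zip(*image)]
    return rows, cols

def vertical_crossings(image):
    """For each column, count 0-to-1 transitions from top to bottom."""
    counts = []
    for col in zip(*image):
        previous, crossings = 0, 0
        for pixel in col:
            if previous == 0 and pixel == 1:
                crossings += 1
            previous = pixel
        counts.append(crossings)
    return counts

# A 5x3 binary glyph (1 = ink, 0 = background).
glyph = [
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 1, 1],
]
row_hist, col_hist = projection_histogram(glyph)
crossings = vertical_crossings(glyph)
# Concatenate into one feature vector that a classifier could consume.
features = row_hist + col_hist + crossings
```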

Character Recognition using Regional Structure

  • Yoo, Suk Won
    • International Journal of Advanced Culture Technology / v.7 no.1 / pp.64-69 / 2019
  • With the advent of the fourth industrial revolution, the need for office automation with automatic character recognition capabilities is increasing day by day. In this paper, we therefore study a character recognition algorithm that uses learning data characters to recognize a new experimental data character effectively. The proposed algorithm computes the degree to which the structural regions of the learning data characters match the corresponding regions of the experimental data character. We confirm that satisfactory results are obtained by selecting the learning data character with the highest degree of similarity in the matching process as the final recognition result for a given experimental data character.

Robust Stroke Extraction Method for Handwritten Korean Characters

  • Park, Young-Kyoo;Rhee, Sang-Burm
    • Proceedings of the IEEK Conference / 2000.07b / pp.819-822 / 2000
  • The merit of the stroke extraction algorithm is the ease of abstracting features from the skeleton of a character. However, extracting strokes from Korean characters involves two major problems. One is extracting primitive strokes, and the other is merging or splitting the strokes using their dynamic information. In this paper, a method is proposed to extract strokes from an off-line handwritten Korean character. We have developed stroke segmentation rules based on splitting, merging, and directional analysis. Using these techniques, we can extract and trace the strokes in an off-line handwritten Korean character accurately and efficiently.

On Character Region Extraction by Cost Minimization Method (코스트 최소화법에 의한 문자영역의 추출)

  • Kim, Seok-Tae
    • The Transactions of the Korea Information Processing Society / v.3 no.2 / pp.348-358 / 1996
  • A general-purpose method of character region extraction must rely on the common features that all target images share. This paper proposes treating these common features as conditions on the region to be extracted, within a cost minimization framework. The method uses simulated annealing to minimize a cost function that quantitatively estimates the extent to which character regions satisfy these features. The method is distinctive in how it defines the cost function. Experimental results verify the usefulness of this cost minimization approach to character region extraction.

A method for Character Segmentation using Frequence Characteristics and Back Propagation Neural Network (주파수 특성과 역전파 신경망 알고리즘을 이용한 문자 영역 분할 방법)

  • Chun Byung-Tae;Song Chee-Yang
    • Journal of the Korea Society of Computer and Information / v.11 no.4 s.42 / pp.55-60 / 2006
  • The proposed method uses the FFT (Fast Fourier Transform) and neural networks to extract text in real time. In general, text areas are found in the higher frequency domain and can therefore be characterized using the FFT. The neural network is trained on character regions (high frequency) and non-character regions (low frequency). Candidate text areas are found by applying the high-frequency characteristics to the neural network, and the final text area is extracted by verifying the candidate areas. Experimental results show a perfect candidate extraction rate and a text extraction rate of about 95%. The strengths of the proposed algorithm are its simplicity and its real-time processing, achieved by not processing the entire image.
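
The observation that text areas concentrate energy at higher frequencies can be illustrated with a minimal sketch on a single scanline. It uses a naive DFT from the standard library in place of an FFT, and the quarter-of-sampling-rate cutoff and the function names are assumptions for illustration only; the neural network stage of the paper is not modeled.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform (O(n^2)), standing in for an FFT."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def high_frequency_ratio(signal):
    """Share of non-DC spectral energy above a quarter of the sampling rate."""
    n = len(signal)
    spectrum = dft(signal)
    # Bins 1..n//2 cover the positive frequencies of a real-valued signal.
    energy = [abs(spectrum[k]) ** 2 for k in range(1, n // 2 + 1)]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    # energy[i] belongs to bin i + 1, so this slice keeps bins above n // 4.
    return sum(energy[n // 4:]) / total

# Rapidly alternating intensities, as across thin character strokes.
text_like = [0, 1, 0, 1, 0, 1, 0, 1]
# A single slow transition, as across a smooth background edge.
background_like = [0, 0, 0, 0, 1, 1, 1, 1]
text_ratio = high_frequency_ratio(text_like)
background_ratio = high_frequency_ratio(background_like)
```

A value such as `text_ratio` could serve as one input feature to a classifier that separates character regions from non-character regions.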

The WALSH - HADAMARD Transfore and Characteristic Extraction for HANGEUL Character Recognition (한글문자인식을 위한 WALSH-HADAMARD 변환과 그 특징추출)

  • 박기웅;신승호;진용옥
    • Proceedings of the Korean Institute of Communication Sciences Conference / 1984.10a / pp.1-4 / 1984
  • This paper discusses the preparation of reference data, as a basic study for Hangeul character recognition, and the extraction of features from 2-D transformed Korean character images. The 1959 Hangeul characters are organized into a total of 170 patterns in 17 formats, classified by initial sound, middle sound, and terminal sound, and the 2-D Korean character images are processed. The superposition theorem is applied to the recognition algorithm. For 50 Hangeul characters, the recognition efficiency is calculated by computer simulation.
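
For reference, the Walsh-Hadamard transform named in the title can be sketched as follows. This is a generic fast transform in natural (Hadamard) order without normalization, applied to rows and then columns for the 2-D case; it is not taken from the paper, and the function names are hypothetical.

```python
def walsh_hadamard(values):
    """Fast Walsh-Hadamard transform (natural order, unnormalized)."""
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = data[i], data[i + h]
                data[i], data[i + h] = a + b, a - b
        h *= 2
    return data

def walsh_hadamard_2d(image):
    """Apply the 1-D transform to every row, then to every column."""
    rows = [walsh_hadamard(row) for row in image]
    cols = [walsh_hadamard(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

coefficients = walsh_hadamard([1, 0, 1, 0])
block = walsh_hadamard_2d([[1, 1], [1, 1]])
```

Because the transform is linear, the transform of a sum of images equals the sum of their transforms, which is the superposition property the abstract refers to.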

Combining Different Distance Measurements Methods with Dempster-Shafer-Theory for Recognition of Urdu Character Script

  • Khan, Yunus;Nagar, Chetan;Kaushal, Devendra S.
    • International Journal of Ocean System Engineering / v.2 no.1 / pp.16-23 / 2012
  • In this paper we discuss a new methodology for an Urdu character recognition system using Dempster-Shafer theory, which can effectively estimate the similarity ratings between a recognized character and the sample characters in the character database. Characters are recognized by five probability calculation methods (similarity, Hamming, linear correlation, cross-correlation, and nearest neighbor) together with the Dempster-Shafer theory of belief functions. The main objective of this paper is to recognize Urdu letters and numerals by using these five similarity and dissimilarity algorithms to measure the similarity between a given image and the standard template in the character recognition system. We develop a method that combines the results of the different distance measurement methods using Dempster-Shafer theory. This idea enables us to obtain a single precise result. We observed that combining these results ultimately enhanced the success rate.
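
Combining the outputs of several distance measures with Dempster-Shafer theory rests on Dempster's rule of combination, sketched below for two sources. The mass values and the two-character frame ("alif", "bay") are made-up illustrations, not data from the paper, and the function name is hypothetical.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its belief mass.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to contradictory hypotheses.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to one.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}

ALIF = frozenset({"alif"})
BAY = frozenset({"bay"})
EITHER = ALIF | BAY  # ignorance: the character could be either one

# Hypothetical evidence from two distance measures.
from_correlation = {ALIF: 0.6, EITHER: 0.4}
from_hamming = {ALIF: 0.5, BAY: 0.3, EITHER: 0.2}
fused = combine(from_correlation, from_hamming)
best = max(fused, key=fused.get)
```

With more than two measures, the rule can be applied repeatedly, folding each new mass function into the running result.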