• Title/Summary/Keyword: recognition-rate

Comparison of ICA Methods for the Recognition of Corrupted Korean Speech (잡음 섞인 한국어 인식을 위한 ICA 비교 연구)

  • Kim, Seon-Il
    • 전자공학회논문지 IE / v.45 no.3 / pp.20-26 / 2008
  • Two independent component analysis (ICA) algorithms were applied to the recognition of speech signals corrupted by car engine noise. Speech recognition was performed with a hidden Markov model (HMM) on the estimated signals, and the recognition rates were compared with those of the original, uncorrupted speech signals. Two different ICA methods were used to estimate the speech signals: the FastICA algorithm, which maximizes negentropy, and the information-maximization approach, which maximizes the mutual information between inputs and outputs to give maximum independence among the outputs. The word recognition rate for Korean news sentences spoken by a male anchor is 87.85%, while, averaged over various signal-to-noise ratios (SNR), performance drops by 1.65% for the speech signals estimated by FastICA and by 2.02% for those estimated by information maximization. There is little difference between the two methods.

A STUDY ON THE IMPLEMENTATION OF ARTIFICIAL NEURAL NET MODELS WITH FEATURE SET INPUT FOR RECOGNITION OF KOREAN PLOSIVE CONSONANTS (한국어 파열음 인식을 위한 피쳐 셉 입력 인공 신경망 모델에 관한 연구)

  • Kim, Ki-Seok;Kim, In-Bum;Hwang, Hee-Yeung
    • Proceedings of the KIEE Conference / 1990.07a / pp.535-538 / 1990
  • The main problem in speech recognition is the enormous variability in acoustic signals due to complex but predictable contextual effects. In plosive consonants especially, it is very difficult to find invariant cues because of these contextual effects, yet humans use them as helpful information in plosive consonant recognition. In this paper we experimented with three artificial neural net models for the recognition of plosive consonants. Model I used a "Multi-layer Perceptron", Model II used a variation of the "Self-organizing Feature Map Model", and Model III used an "Interactive and Competitive Model" to examine contextual effects. The recognition experiment was performed on 9 Korean plosive consonants. We used VCV speech chains for the experiment on contextual effects. The speech chains consist of the Korean plosive consonants /g, d, b, K, T, P, k, t, p/ (/ㄱ, ㄷ, ㅂ, ㄲ, ㄸ, ㅃ, ㅋ, ㅌ, ㅍ/) and eight Korean monophthongs. The inputs to the neural net models were several temporal cues (duration of the silence, duration of the transition, and voice onset time (VOT)) together with the extent of the VC formant transitions, the presence of voicing energy during closure, burst intensity, the presence of aspiration, the amount of low-frequency energy present at voicing onset, and the extent of the CV formant transitions, all taken from the acoustic signals. Model I showed a recognition rate of about 55 to 67%, Model II about 60%, and Model III about 67%.

Music Recognition by Partial Template Matching (부분적 템플릿 매칭을 활용한 악보인식)

  • Yoo, Jae-Myeong;Kim, Gi-Hong;Lee, Guee-Sang
    • The Journal of the Korea Contents Association / v.8 no.11 / pp.85-93 / 2008
  • For music score recognition, several approaches have been proposed, including shape matching, statistical methods, neural network based methods, and structural methods. In this paper, we deal with recognition of low-resolution images captured by the digital camera of a mobile phone. These low-resolution images contain considerable distortion, so existing technology runs into many problems. Captured images are not stable, in the sense that they contain many distortions and non-uniform illumination changes. As a result, notes and symbols in the music score are damaged and the recognition process becomes difficult. This paper presents a recognition technique to overcome these problems. First, each musical note is separated into head, stem, and tail parts. Then template matching is applied to the head part of the note and to the remaining parts. Experimental results show a recognition rate of nearly 100% for music scores with single musical notes.

Recognition of Handwritten Numerals using SVM Classifiers (SVM 분류기를 이용한 필기체 숫자인식)

  • Park, Joong-Jo;Kim, Kyoung-Min
    • Journal of the Institute of Convergence Signal Processing / v.8 no.3 / pp.136-142 / 2007
  • Recent research on recognition systems has shown that SVM (Support Vector Machine) classifiers often achieve superior recognition rates compared with other classifiers. In this paper, we present a handwritten numeral recognition algorithm using SVM classifiers. The numeral features used in our algorithm are mesh features, directional features obtained with Kirsch operators, and concavity features, where the first two represent the foreground information of numerals and the last represents the background information. These features complement each other. Since an SVM is basically a binary classifier, several binary SVMs must be constructed and combined to obtain a multi-class classifier. We use two strategies for implementing multi-class SVM classifiers, "one against one" and "one against the rest", and examine their performance on the features used. The efficiency of our method is tested on the CENPARMI handwritten numeral database, and a recognition rate of 98.45% is achieved.
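The two multi-class strategies named in this abstract can be sketched in a few lines. The sketch below is only an illustration of how binary decisions are combined, not the authors' implementation: `binary_decide` and `score` are hypothetical stand-ins for trained binary SVMs, and the toy "nearest digit" rules are assumptions made so the example runs without any training data.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_pairs(classes):
    """Every unordered pair of classes gets its own binary classifier."""
    return list(combinations(classes, 2))


def predict_one_vs_one(x, classes, binary_decide):
    """Majority vote over all pairwise classifiers.

    binary_decide(x, a, b) is a hypothetical stand-in for a trained
    binary SVM separating class a from class b; it returns a or b.
    """
    votes = Counter(binary_decide(x, a, b) for a, b in one_vs_one_pairs(classes))
    return votes.most_common(1)[0][0]


def predict_one_vs_rest(x, classes, score):
    """Pick the class whose 'class vs. rest' classifier is most confident.

    score(x, c) is a hypothetical stand-in for the decision value of the
    binary SVM trained to separate class c from all other classes.
    """
    return max(classes, key=lambda c: score(x, c))


# Toy stand-ins (not SVMs): treat the input as a number and prefer the nearest digit.
digits = list(range(10))


def toy_decide(x, a, b):
    return a if abs(x - a) <= abs(x - b) else b


def toy_score(x, c):
    return -abs(x - c)


n_ovo = len(one_vs_one_pairs(digits))  # 10 * 9 / 2 pairwise classifiers
n_ovr = len(digits)                    # one classifier per class
pred_ovo = predict_one_vs_one(6.8, digits, toy_decide)
pred_ovr = predict_one_vs_rest(6.8, digits, toy_score)
```

For ten digit classes, one-against-one needs 45 binary classifiers and one-against-the-rest needs 10, which is the main practical trade-off between the two strategies.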

Iris Image Enhancement for the Recognition of Non-ideal Iris Images

  • Sajjad, Mazhar;Ahn, Chang-Won;Jung, Jin-Woo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.4 / pp.1904-1926 / 2016
  • Iris recognition for biometric personal identification has gained much interest owing to the increasing concern with security today. Image quality plays a major role in the performance of iris recognition systems. When an iris image is captured under uncontrolled conditions from non-cooperative people, the chance of getting non-ideal images is very high owing to poor focus, off-angle gaze, noise, motion blur, occlusion by eyelashes and eyelids, and glasses. To improve the accuracy of iris recognition on non-ideal iris images, we propose a novel algorithm that improves the quality of degraded iris images. First, the iris is localized to obtain accurate iris boundary detection, and the iris image is then normalized to a fixed size. Second, the valid region (the iris region) is extracted from the segmented iris image. Third, to get a well-distributed texture image, bilinear interpolation is applied to the segmented valid iris gray image. Contrast-limited adaptive histogram equalization (CLAHE) is then used to enhance the low contrast of the resulting interpolated image. The results of CLAHE are further improved by stretching the minimum and maximum values to 0-255 with a histogram-stretching technique. The gray texture information is extracted by 1D Gabor filters, and the Hamming distance is chosen as the metric for recognition. The NICE-II training dataset taken from UBIRIS.v2 was used for the experiment. The proposed method outperformed other methods in terms of equal error rate (EER).
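Two of the steps named in this abstract, histogram stretching to 0-255 and Hamming-distance matching, are simple enough to sketch. This is a minimal illustration under assumptions: the function names are hypothetical, images are flat lists of gray levels, iris codes are bit strings, and the distance is the common normalized form (fraction of differing bits) with no occlusion masks, which the abstract does not describe.

```python
def stretch_histogram(pixels, lo=0, hi=255):
    """Linearly map the darkest pixel to lo and the brightest to hi."""
    p_min, p_max = min(pixels), max(pixels)
    if p_max == p_min:
        # A flat image has no contrast to stretch.
        return [lo for _ in pixels]
    span = p_max - p_min
    return [round(lo + (p - p_min) * (hi - lo) / span) for p in pixels]


def hamming_distance(code_a, code_b):
    """Fraction of positions at which two equal-length bit strings differ."""
    if len(code_a) != len(code_b):
        raise ValueError("iris codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)


stretched = stretch_histogram([60, 100, 160])
distance = hamming_distance("10110010", "10011010")
```

A smaller distance means the two codes are more likely to come from the same iris; the decision threshold is a tuning choice that this sketch leaves open.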

Emotion Recognition and Expression System of User using Multi-Modal Sensor Fusion Algorithm (다중 센서 융합 알고리즘을 이용한 사용자의 감정 인식 및 표현 시스템)

  • Yeom, Hong-Gi;Joo, Jong-Tae;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.1 / pp.20-26 / 2008
  • As robots and computers become more and more intelligent, interaction between intelligent robots (or computers) and humans is becoming more important, and emotion recognition and expression are indispensable for that interaction. In this paper, we first extract emotional features from speech signals and facial images. Second, we apply both BL (Bayesian Learning) and PCA (Principal Component Analysis). Finally, we classify five emotion patterns (normal, happy, anger, surprise, and sad), and we experiment with decision fusion and feature fusion to enhance the emotion recognition rate. The decision fusion method applies a fuzzy membership function to the result values of each recognition system. The feature fusion method selects superior features through the SFS (Sequential Forward Selection) method and applies them to a neural network based on an MLP (Multi-Layer Perceptron) to classify the five emotion patterns. The recognized result is then applied to a 2D facial shape to express the emotion.
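The SFS (Sequential Forward Selection) step mentioned in this abstract follows a generic greedy scheme, sketched below. This is not the authors' code: the feature names, the `usefulness` weights, and the additive `toy_evaluate` criterion are all hypothetical placeholders; in practice `evaluate` would be something like the validation recognition rate of a classifier trained on the candidate subset.

```python
def sequential_forward_selection(features, evaluate, k):
    """Greedily grow a feature subset, one feature at a time.

    At each step, add the remaining feature that gives the best score
    when combined with the features already selected.
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: evaluate(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical per-feature usefulness; a real criterion would come from a classifier.
usefulness = {"pitch": 0.5, "energy": 0.3, "mouth_width": 0.4, "eye_open": 0.1}


def toy_evaluate(subset):
    return sum(usefulness[f] for f in subset)


chosen = sequential_forward_selection(list(usefulness), toy_evaluate, k=3)
```

Because the search is greedy, SFS evaluates far fewer subsets than an exhaustive search, at the cost of possibly missing the best combination.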

Real-time Recognition System of Facial Expressions Using Principal Component of Gabor-wavelet Features (표정별 가버 웨이블릿 주성분특징을 이용한 실시간 표정 인식 시스템)

  • Yoon, Hyun-Sup;Han, Young-Joon;Hahn, Hern-Soo
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.6 / pp.821-827 / 2009
  • Human emotions are reflected in facial expressions, so recognizing facial expressions is a good way to understand people's emotions. Conventional facial expression recognition systems select interest points and then simply extract features without analyzing their physical meaning. They take a long time to find the interest points, and it is hard to estimate the accurate positions of these feature points. Moreover, implementing a facial expression recognition system on a real-time embedded system requires simplifying the algorithm and reducing the resources used. In this paper, we propose a real-time facial expression recognition algorithm that projects grid points onto an expression space based on Gabor wavelet features. A facial expression is described simply by feature vectors in the expression space and is classified by a neural network with dramatically reduced resources. The proposed system deals with 5 expressions: anger, happiness, neutral, sadness, and surprise. In experiments, the average execution time is 10.251 ms and the measured recognition rate is 87~93%.

Dynamic Gesture Recognition for the Remote Camera Robot Control (원격 카메라 로봇 제어를 위한 동적 제스처 인식)

  • Lee Ju-Won;Lee Byung-Ro
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.7 / pp.1480-1487 / 2004
  • This study proposes a novel gesture recognition method for remote camera robot control. To recognize dynamic gestures, the preprocessing step is image segmentation. Conventional methods for effective object segmentation need a lot of color information about the object (hand) image, and in the recognition step they need many features for each object. To overcome these problems, this study proposes a novel method for recognizing dynamic hand gestures that uses the MMS (Max-Min Search) method to segment the object image, the MSM (Mean Space Mapping) and COG (Center Of Gravity) methods to extract image features, and an MLPNN (Multi-Layer Perceptron Neural Network) structure to recognize the dynamic gestures. In the experiments, the recognition rate of the proposed method exceeded 90%, which shows that it can serve as an HCI (Human Computer Interface) device for remote robot control.

Handwritten Numeral Recognition using Composite Features and SVM classifier (복합특징과 SVM 분류기를 이용한 필기체 숫자인식)

  • Park, Joong-Jo;Kim, Tae-Woong;Kim, Kyoung-Min
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.12 / pp.2761-2768 / 2010
  • In this paper, we studied the use of foreground and background features with an SVM classifier to improve the accuracy of offline handwritten numeral recognition. The foreground features are two directional features: a directional gradient feature obtained with Kirsch operators and a directional stroke feature obtained from projection run-lengths. The background feature is a concavity feature extracted from the convex hull of the numeral, which complements the directional features. During classification of a numeral, these three features are combined to obtain good discrimination power. The efficiency of our feature sets was tested by recognition experiments on the CENPARMI handwritten numeral database, where we used an SVM with an RBF kernel as the classifier. The experimental results showed that every combination of two or three features performed better than a single feature. This means that each single feature has a different discriminating power and cooperates with the other features to enhance recognition accuracy. By using the composite of the three features, we achieved a recognition rate of 98.90%.

A study on the new hybrid recurrent TDNN-HMM architecture for speech recognition (음성인식을 위한 새로운 혼성 recurrent TDNN-HMM 구조에 관한 연구)

  • Jang, Chun-Seo
    • The KIPS Transactions:PartB / v.8B no.6 / pp.699-704 / 2001
  • In this paper, a new hybrid modular recurrent TDNN (time-delay neural network)-HMM (hidden Markov model) architecture for speech recognition has been studied. In a TDNN, the recognition rate can be increased if the signal window is extended. To obtain this effect in the neural network, a high-level memory generated through feedback within the first hidden layer of the neural network unit has been used. To increase the ability to deal with the temporal structure of phonemic features, the input layer of the network has been divided into multiple states in time sequence, with a feature detector for each state. To expand the network from a small recognition task to a full speech recognition system, a modular construction method has also been used. Furthermore, the neural network and the HMM are integrated by feeding output vectors from the neural network to the HMM, and a new parameter smoothing method that can be applied to this hybrid system has been suggested.
