• Title/Summary/Keyword: recognition-rate


Performance Evaluation of Nonkeyword Modeling and Postprocessing for Vocabulary-independent Keyword Spotting (가변어휘 핵심어 검출을 위한 비핵심어 모델링 및 후처리 성능평가)

  • Kim, Hyung-Soon;Kim, Young-Kuk;Shin, Young-Wook
    • Speech Sciences / v.10 no.3 / pp.225-239 / 2003
  • In this paper, we develop a keyword spotting system using a vocabulary-independent speech recognition technique and investigate several non-keyword modeling and post-processing methods to improve its performance. To model non-keyword speech segments, we consider monophone clustering and the Gaussian mixture model (GMM). For post-processing, we use likelihood ratio scoring to verify the recognition results, with filler models, anti-subword models, and N-best decoding results considered as the alternative hypothesis. We also examine different methods of constructing anti-subword models. We evaluate the system on an automatic telephone exchange service task. The results show that GMM-based non-keyword modeling performs better than monophone clustering. In the post-processing experiments, the anti-keyword model based on Kullback-Leibler distance and the N-best decoding method outperform the other methods, reducing keyword recognition errors by more than 50% at a keyword rejection rate of 5%.

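The likelihood ratio scoring mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the per-frame normalization, and the threshold value are assumptions, and the log-likelihoods are taken as already computed by a keyword model and an alternative model (filler, anti-subword, or N-best).

```python
def log_likelihood_ratio(keyword_loglik, alt_loglik, num_frames):
    """Frame-normalized log-likelihood ratio (normalization is an assumption)."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return (keyword_loglik - alt_loglik) / num_frames


def verify_keyword(keyword_loglik, alt_loglik, num_frames, threshold=0.0):
    """Accept the keyword hypothesis when the ratio reaches the threshold.

    The threshold is hypothetical; in practice it would be tuned to reach
    a target keyword rejection rate.
    """
    score = log_likelihood_ratio(keyword_loglik, alt_loglik, num_frames)
    return score >= threshold


# Keyword model scores higher than the alternative model: accepted.
accepted = verify_keyword(-100.0, -120.0, num_frames=10)
# Alternative model scores higher: rejected.
rejected = verify_keyword(-130.0, -120.0, num_frames=10)
```

Raising the threshold rejects more keyword hypotheses, which trades missed keywords against false alarms.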

Improved $(2D)^2$ DLDA for Face Recognition (얼굴 인식을 위한 개선된 $(2D)^2$ DLDA 알고리즘)

  • Cho, Dong-Uk;Chang, Un-Dong;Kim, Young-Gil;Kim, Kwan-Dong;Ahn, Jae-Hyeong;Kim, Bong-Hyun;Lee, Se-Hwan
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.10C / pp.942-947 / 2006
  • In this paper, a new feature representation technique called Improved 2-directional 2-dimensional direct linear discriminant analysis (Improved $(2D)^2$ DLDA) is proposed. Face recognition often suffers from the small sample size problem and the need for many coefficients. To solve these problems, the proposed method uses direct LDA and a 2-directional image scatter matrix. A feature vector selection method and a similarity measure are also proposed. The ORL face database is used to evaluate the performance of the proposed method. The experimental results show that the proposed method achieves a better recognition rate and requires less memory than direct LDA.

A car number retrieving system using speech recognition for PDA (PDA상에서 음성인식을 이용한 차량번호 조회시스템)

  • 김우성;김동환;윤재선;홍광석
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2001.06a / pp.281-284 / 2001
  • In this paper, we present a car number retrieval system for PDAs that uses speech recognition and speech synthesis. The system consists of 4-digit number and command speech recognition, as well as speech synthesis. In speaker-independent experiments, the recognition rate was 97% for 4-digit numbers and 99% for commands.


Isolated-Word Recognition Using Adaptively Partitioned Multisection Codebooks (음성적응(音聲適應) 구간분할(區間分割) 멀티섹션 코드북을 이용(利用)한 고립단어인식(孤立單語認識))

  • Ha, Kyeong-Min;Jo, Jeong-Ho;Hong, Jae-Kuen;Kim, Soo-Joong
    • Proceedings of the KIEE Conference / 1988.07a / pp.10-13 / 1988
  • An isolated-word recognition method using adaptively partitioned multisection codebooks is proposed. Each training utterance is divided into several sections according to its pattern, which is extracted by a labeling technique. For each pattern, reference codebooks are generated by clustering the training vectors of the same section. In the recognition procedure, the input speech is divided into sections by the same method used in codebook generation and is recognized as the reference word whose codebooks yield the smallest average distortion. The proposed method was tested on 100 Korean words and attained a recognition rate of about 96 percent.

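The minimum-average-distortion decision described in this abstract can be sketched as below. This is an illustration under stated assumptions: the equal-length split stands in for the paper's adaptive, label-based partitioning, squared Euclidean distance stands in for the unspecified distortion measure, and all function names and the toy codebooks are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_into_sections(frames, num_sections):
    """Equal-length split (assumption; the paper partitions adaptively)."""
    n = len(frames)
    bounds = [(i * n) // num_sections for i in range(num_sections + 1)]
    return [frames[bounds[i]:bounds[i + 1]] for i in range(num_sections)]


def average_distortion(frames, section_codebooks):
    """Mean nearest-codevector distortion, using one codebook per section."""
    if not frames:
        raise ValueError("frames must not be empty")
    sections = split_into_sections(frames, len(section_codebooks))
    total = 0.0
    for section, codebook in zip(sections, section_codebooks):
        for frame in section:
            total += min(squared_distance(frame, code) for code in codebook)
    return total / len(frames)


def recognize_word(frames, codebooks):
    """Return the word whose section codebooks give the smallest distortion."""
    return min(codebooks, key=lambda w: average_distortion(frames, codebooks[w]))


# Toy data: two words, two sections each, one codevector per section.
codebooks = {
    "up": [[(0.0, 0.0)], [(1.0, 1.0)]],
    "down": [[(1.0, 1.0)], [(0.0, 0.0)]],
}
utterance = [(0.1, 0.0), (0.0, 0.1), (0.9, 1.0), (1.0, 1.1)]
best = recognize_word(utterance, codebooks)
```

Keeping a separate codebook per section preserves coarse temporal order, which a single codebook per word would discard.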

A Study on Korean isolated word recognition using LPC cepstrum and clustering (LPC Cepstrum과 집단화를 이용한 한국어 고립단어 인식에 관한 연구)

  • Kim, Jin-Yeong
    • The Journal of the Acoustical Society of Korea / v.6 no.4 / pp.44-54 / 1987
  • In this paper, the problem of the LP model and its solution by liftering in the cepstrum domain are investigated for speaker-independent isolated-word recognition. Clustering techniques for obtaining the reference templates are also discussed. The KMA (K-means iteration with average) method, derived from the UWA method and the K-iteration method, is proposed, and the methods are compared with each other for clustering. Recognition experiments show a maximum recognition rate of 95% when the raised-sine lifter and KMA clustering are applied.

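Cepstral liftering of the kind this abstract refers to can be illustrated with a raised-sine lifter. The abstract does not give the window formula, so the sketch below assumes the commonly used form w(n) = 1 + (L/2) sin(πn/L) for n = 1..L; the function names are hypothetical.

```python
import math


def raised_sine_lifter(num_coeffs):
    """Weights w(n) = 1 + (L/2) * sin(pi * n / L) for n = 1..L (assumed form)."""
    L = num_coeffs
    return [1.0 + (L / 2.0) * math.sin(math.pi * n / L) for n in range(1, L + 1)]


def lifter_cepstrum(cepstrum):
    """Weight cepstral coefficients c_1..c_L (c_0 excluded) by the lifter."""
    weights = raised_sine_lifter(len(cepstrum))
    return [w * c for w, c in zip(weights, cepstrum)]


weights = raised_sine_lifter(12)
liftered = lifter_cepstrum([1.0] * 12)
```

The window peaks at the middle coefficient and falls back toward 1 at the highest index, which de-emphasizes both the lowest and the highest cepstral coefficients relative to the mid-range ones.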

A Korean Speech Recognition Using Fuzzy Rule Base (Fuzzy Rule Base를 이용한 한국어 연속 음성인식)

  • Song, Jeong-Young
    • The Journal of Engineering Research / v.2 no.1 / pp.13-21 / 1997
  • This paper describes how to represent variations of feature parameters to improve the recognition of continuous speech. Feature parameters such as formant frequencies, pitch, logarithmic energy, and zero-crossing rate are generally used for speech recognition. However, their values and variations depend on the speaker, for example on differences between men and women and on age, so it is difficult to determine the width of the variation a priori. We therefore represent this variation by introducing fuzziness and recognize continuous speech by fuzzy inference using fuzzy production rules.


A Study on the Voice-Controlled Wheelchair using Spatio-Temporal Pattern Recognition Neural Network (Spatio-Temporal Pattern Recognition Neural Network를 이용한 전동 휠체어의 음성 제어에 관한 연구)

  • Baek, S.W.;Kim, S.B.;Kwon, J.W.;Lee, E.H.;Hong, S.H.
    • Proceedings of the KOSOMBE Conference / v.1993 no.05 / pp.90-93 / 1993
  • In this study, Korean speech was recognized using a spatio-temporal pattern recognition neural network. The vocabulary consists of the spoken digits zero to nine and basic commands for a motorized wheelchair developed in our laboratory. Rabiner and Sambur's speech detection method was used to determine the endpoints of speech, and speech parameters were extracted using 16th-order LPC. The recognition rate was over 90%.

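The LPC parameter extraction mentioned in this abstract can be sketched with the autocorrelation method and the Levinson-Durbin recursion. The abstract states only the order (16); the choice of the autocorrelation method and all function names here are assumptions, and windowing and pre-emphasis are omitted.

```python
def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] and the residual energy.

    Convention: x[n] is predicted as sum_j a[j] * x[n - j].
    """
    if r[0] <= 0.0:
        raise ValueError("frame has no energy")
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # signal is already perfectly predicted
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error


def lpc(frame, order=16):
    """LPC analysis of one frame; order 16 follows the abstract."""
    return levinson_durbin(autocorrelation(frame, order), order)


# Autocorrelation of a first-order process with coefficient 0.5.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this toy autocorrelation sequence the recursion recovers a first coefficient of 0.5 and a second coefficient of 0, as expected for a first-order process.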

Features Detection in Face Based on The Model (모델 기반 얼굴에서 특징점 추출)

  • 석경휴;김용수;김동국;배철수;나상동
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2002.05a / pp.134-138 / 2002
  • Unlike other general objects, human faces do not have distinct features. In general, the features defined are those of the eyes, nose, and mouth, which are the first to be recognized when a person looks at a face. These features have different characteristics from face to face. In this paper, we propose a face recognition algorithm using the hidden Markov model (HMM). In the preprocessing stage, we find the edges of a face using a locally adaptive threshold scheme, extract features based on generic knowledge of a face, and then construct a database of the extracted features. In the training stage, we generate HMM parameters for each person using the forward-backward algorithm. In the recognition stage, we apply the probability values calculated by the HMM to the input data. The input face is then recognized using the Euclidean distance between face feature vectors and the cross-correlation between the input image and the database image. Computer simulation shows that the proposed HMM algorithm gives a higher recognition rate than conventional face recognition algorithms.

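The HMM scoring step in this abstract can be illustrated with the forward algorithm for a discrete-observation HMM. This sketch covers only the per-person likelihood computation. It assumes quantized (integer) observations and toy parameters, omits the Euclidean distance and cross-correlation stages, and uses hypothetical function names.

```python
def forward_likelihood(observations, initial, transition, emission):
    """Return P(observations | model) for a discrete HMM via the forward pass.

    initial[s]        probability of starting in state s
    transition[p][s]  probability of moving from state p to state s
    emission[s][o]    probability of emitting symbol o in state s
    """
    states = range(len(initial))
    alpha = [initial[s] * emission[s][observations[0]] for s in states]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in states) * emission[s][obs]
            for s in states
        ]
    return sum(alpha)


def recognize_face(observations, models):
    """Pick the person whose HMM assigns the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(observations, *models[name]))


# Toy two-state models for two hypothetical people.
transition = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "person_a": ([1.0, 0.0], transition, [[0.9, 0.1], [0.2, 0.8]]),
    "person_b": ([1.0, 0.0], transition, [[0.1, 0.9], [0.8, 0.2]]),
}
likelihood_a = forward_likelihood([0, 1], *models["person_a"])
winner = recognize_face([0, 1], models)
```

For long observation sequences the raw probabilities underflow, so a practical version would scale alpha at each step or work in the log domain.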

A Development of Unicode-based Multi-lingual Namecard Recognizer (Unicode 기반 다국어 명함인식기 개발)

  • Jang, Dong-Hyeub;Lee, Jae-Hong
    • The KIPS Transactions:PartB / v.16B no.2 / pp.117-122 / 2009
  • We developed a multilingual namecard recognizer for building a global client management system. First, we created a Unicode-based character image database for multilingual character recognition and training, and applied a variety of color image processing techniques to obtain more accurate data from namecard images acquired by various input devices. By applying a multi-layer perceptron neural network, individual character recognition adapted to each language type, and post-processing with keyword databases built for each language, we increased the recognition rate for multilingual namecards.

A study on EMG pattern recognition based on parallel radial basis function network (병렬 Radial Basis Function 회로망을 이용한 근전도 신호의 패턴 인식에 관한 연구)

  • Kim, Se-Hoon;Lee, Seung-Chul;Kim, Ji-Un;Park, Sang-Hui
    • Proceedings of the KIEE Conference / 1998.07g / pp.2448-2450 / 1998
  • For accurate classification of arm motion, this paper proposes an EMG pattern recognition method using a neural network. Autoregressive coefficients, linear cepstrum coefficients, and adaptive cepstrum coefficients are selected as the feature parameters and extracted from the time-series EMG signal. To recognize functions from the feature parameters, a radial basis function network, a type of neural network, is designed. To improve the recognition rate, several radial basis function networks are combined in parallel, and the result is compared with a backpropagation neural network, an existing method.
