• Title/Summary/Keyword: Speaker Adaptation


A Study on Phoneme Recognition using Neural Networks and Fuzzy logic (신경망과 퍼지논리를 이용한 음소인식에 관한 연구)

  • Han, Jung-Hyun;Choi, Doo-Il
    • Proceedings of the KIEE Conference / 1998.07g / pp.2265-2267 / 1998
  • This paper studies fast speaker-adaptive speech recognition. To analyze the speech signal efficiently in the time domain and the time-frequency domain, it uses SCONN [1] together with speech signal processing suited to fast speaker adaptation. It then examines how well the system adapts when new speech data are input after a speaker-dependent recognition test.

  • PDF

Speaker Adaptation Using Neural Network in Continuous Speech Recognition (연속 음성에서의 신경회로망을 이용한 화자 적응)

  • 김선일
    • The Journal of the Acoustical Society of Korea / v.19 no.1 / pp.11-15 / 2000
  • This paper describes speaker-adaptive continuous speech recognition on the RM speech corpus. The hidden Markov models for the reference speaker are trained on the RM training data, and the RM evaluation data are used for evaluation. A separate part of the RM training data is used for speaker adaptation. After the other speaker's data are aligned to the reference data by dynamic time warping, an error back-propagation neural network transforms the spectrum of the speaker to be recognized into that of the reference speaker. Experimental results on tuning the neural network for the best adaptation are described. In the best case, adaptation increases the word recognition rate by a factor of 2.1 and the word accuracy by a factor of 4.7.

  • PDF
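The alignment step mentioned in the abstract above (dynamic time warping of another speaker's frames onto the reference speaker's frames) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name `dtw_align`, the Euclidean frame distance, and the three allowed step directions are all assumptions.

```python
import math

def dtw_align(ref, other):
    """Align frame sequence `other` to `ref` with basic DTW.

    Returns (total_cost, path), where path is a list of
    (ref_index, other_index) pairs. Hypothetical helper for illustration.
    """
    n, m = len(ref), len(other)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning ref[:i] with other[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], other[j - 1])  # assumed local distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the frame-to-frame pairing.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path
```

The aligned frame pairs in `path` are the kind of input/target pairs a spectral-mapping network could then be trained on; the network itself is not sketched here.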

Speaker Adaptation Performance Evaluation in Keyword Spotting System (500단어급 핵심어 검출기에서 화자적응 성능 평가)

  • Seo Hyun-Chul;Lee Kyong-Rok;Kim Jin-Young;Choi Seung-Ho
    • MALSORI / no.43 / pp.151-161 / 2002
  • This study analyzes the performance of speaker adaptation in a keyword spotting system. We implemented the MLLR (Maximum Likelihood Linear Regression) method on our medium-vocabulary keyword spotting system, which was developed for directory services of universities and colleges. The experimental results show that speaker adaptation reduces the false alarm rate to one third while preserving the missed-detection rate. This improvement is achieved when speaker adaptation is applied not only to the keyword models but also to the non-keyword models.

  • PDF
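As a rough illustration of what MLLR mean adaptation does once a transform is available, the sketch below applies one shared affine transform (mean' = A * mean + b) to the Gaussian means of both keyword and non-keyword models, as the abstract above recommends. Estimating A and b from adaptation data is omitted; the function name, the model dictionary layout, and the toy numbers are assumptions made for this example only.

```python
def apply_mllr_mean_transform(means, A, b):
    """Return adapted means A @ mu + b for every mean vector in `means`.

    `A` is a square matrix (list of rows) and `b` a bias vector; both are
    assumed to have been estimated elsewhere by maximum likelihood.
    """
    adapted = []
    for mu in means:
        adapted.append([
            sum(a_rc * mu_c for a_rc, mu_c in zip(row, mu)) + b_r
            for row, b_r in zip(A, b)
        ])
    return adapted

# Hypothetical toy models: one Gaussian mean per model set.
models = {
    "keyword": [[1.0, 2.0]],
    "non_keyword": [[0.0, -1.0]],
}
A = [[1.0, 0.0], [0.0, 2.0]]
b = [0.5, -1.0]

# Adapt keyword and non-keyword (filler) models with the same transform.
adapted_models = {
    name: apply_mllr_mean_transform(means, A, b)
    for name, means in models.items()
}
```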

A Study on Realization of Continuous Speech Recognition System of Speaker Adaptation (화자적응화 연속음성 인식 시스템의 구현에 관한 연구)

  • 김상범;김수훈;허강인;고시영
    • The Journal of the Acoustical Society of Korea / v.18 no.3 / pp.10-16 / 1999
  • In this paper, we study a speaker-adaptive continuous speech recognition system using MAPE (Maximum A Posteriori Probability Estimation), which can adapt with even a small amount of adaptation speech data. Speaker adaptation is performed by MAPE after two steps: concatenation training, which builds sentence-unit HMMs by linking syllable-unit HMMs, and Viterbi segmentation, which automatically divides the adaptation speech data into syllable-unit segments without hand labelling. For car-control speech, the recognition rate of the adapted HMM was 77.18%, approximately 6% higher than that of the unadapted HMM (in the case of O(n)DP).

  • PDF
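The reason MAP estimation copes with small amounts of adaptation data can be seen in the textbook MAP update of a single Gaussian mean, sketched below. This is a simplified illustration under stated assumptions (one Gaussian, frames already assigned to it, a hypothetical prior weight `tau`), not the system described in the abstract.

```python
def map_update_mean(prior_mean, frames, tau):
    """MAP estimate of a Gaussian mean: (tau * prior + sum(frames)) / (tau + n).

    `tau` (assumed, > 0) controls how strongly the speaker-independent
    prior mean is trusted relative to the adaptation frames.
    """
    n = len(frames)
    dim = len(prior_mean)
    return [
        (tau * prior_mean[d] + sum(f[d] for f in frames)) / (tau + n)
        for d in range(dim)
    ]

# With few frames the estimate stays close to the prior; with more frames
# it moves toward the sample mean of the new speaker's data.
few = map_update_mean([0.0], [[2.0], [4.0]], tau=2.0)
```

Because the result has the same form as the prior, the update can be repeated as more adaptation data arrive, provided the effective prior weight is increased by the number of frames already absorbed.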

Improvement of MLLR Algorithm for Rapid Speaker Adaptation and Reduction of Computation (빠른 화자 적응과 연산량 감소를 위한 MLLR알고리즘 개선)

  • Kim, Ji-Un;Chung, Jae-Ho
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.1C / pp.65-71 / 2004
  • We improved the MLLR speaker adaptation algorithm by reducing the order of the HMM parameters using PCA (Principal Component Analysis) or ICA (Independent Component Analysis). To find a smaller set of variables with less redundancy, we adopt PCA and ICA, which give as good a representation as possible, minimize the correlations between data elements, and remove the axes with less covariance or higher-order statistical independence. The ordinary MLLR algorithm needs more than 30 seconds of adaptation data for the SD (speaker-dependent) models to reach a higher word recognition rate than the SI (speaker-independent) models, whereas the proposed algorithm needs only more than 10 seconds. With 10 components, ICA and PCA perform similarly to the ordinary MLLR framework with 36 components. Compared with the ordinary MLLR algorithm, the total computation required for speaker adaptation in the proposed algorithm is therefore reduced to about 1/167.

Fast Speaker Adaptation in Noisy Environment using Environment Clustering (잡음 환경하에서 환경 군집화를 이용한 고속화자 적응)

  • Kim, Young-Kuk;Song, Hwa-Jeon;Kim, Hyung-Soon
    • Proceedings of the KSPS conference / 2007.05a / pp.33-36 / 2007
  • In this paper, we investigate a fast speaker adaptation method based on eigenvoices in several noisy environments. To overcome its weakness against noise, we propose a noisy-environment clustering method that divides the noisy adaptation utterances into groups with similar environments by vector-quantization-based clustering, using the cepstral mean as the feature vector. Each utterance group is then used for adaptation to build an environment-dependent model. In our experiments, we obtained a 19-37% relative improvement in error rate compared with the simultaneous speaker adaptation and environmental compensation method.

  • PDF
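The clustering idea in the abstract above (one cepstral-mean vector per utterance, then VQ-style grouping) might look roughly like the sketch below. Plain k-means stands in for the vector-quantization clustering, and the function names, the naive initialization from the first `k` utterances, and the fixed iteration count are all assumptions for illustration.

```python
import math

def cepstral_mean(frames):
    """Average a list of equal-length cepstral vectors into one vector."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]

def cluster_environments(utterances, k, iterations=10):
    """Group utterances by environment using k-means on cepstral means.

    Returns one cluster label per utterance. Assumes len(utterances) >= k.
    """
    feats = [cepstral_mean(u) for u in utterances]
    centroids = [list(f) for f in feats[:k]]  # assumed naive initialization
    labels = [0] * len(feats)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(f, centroids[c]))
            for f in feats
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(feats, labels) if lab == c]
            if members:
                centroids[c] = cepstral_mean(members)
    return labels
```

Each resulting group of utterances would then be used to adapt its own environment-dependent model; that adaptation step is not shown.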

Selective Adaptation of Speaker Characteristics within a Subcluster Neural Network

  • Haskey, S.J.;Datta, S.
    • Proceedings of the KSPS conference / 1996.10a / pp.464-467 / 1996
  • This paper aims to exploit inter/intra-speaker phoneme sub-class variations as criteria for adaptation in a phoneme recognition system based on a novel neural network architecture. Using a subcluster neural network design based on One-Class-in-One-Network (OCON) feed-forward subnets, similar to those proposed by Kung (2) and Jou (1) and joined by a common front-end layer, the idea is to adapt only the neurons within the common front-end layer of the network. The adaptation can consequently be concentrated primarily on the speaker's vocal characteristics. Since the adaptation occurs in an area common to all classes, convergence on a single class will improve the recognition of the remaining classes in the network. Results show that adaptation towards one phoneme in the vowel sub-class, for speakers MDABO and MWBTO, improves the recognition of the remaining vowel sub-class phonemes from the same speaker.

  • PDF

A Study on the Speaker Adaptation of a Continuous Speech Recognition using HMM (HMM을 이용한 연속 음성 인식의 화자적응화에 관한 연구)

  • Kim, Sang-Bum;Lee, Young-Jae;Koh, Si-Young;Hur, Kang-In
    • The Journal of the Acoustical Society of Korea / v.15 no.4 / pp.5-11 / 1996
  • In this study, a method of speaker adaptation for uttered sentences using syllable-unit HMMs is proposed. Sentences are segmented into syllable units automatically by concatenating syllable-unit HMMs and applying Viterbi segmentation. Speaker adaptation is performed using MAPE (Maximum A Posteriori Probability Estimation), which can adapt with even a small amount of adaptation speech data and can incorporate additional data sequentially. For continuous speech from newspaper editorials, the recognition rate of the adapted HMM was 71.8%, an improvement of approximately 37% over that of the unadapted HMM.

  • PDF

A Comparative Study of Speaker Adaptation Methods for HMM-Based Speech Recognition (HMM 음성인식 시스템을 위한 화자적응 방법들의 성능비교)

  • Koo, Myoung-Wan;Un, Chong-Kwan;Lee, Hwang-Soo
    • The Journal of the Acoustical Society of Korea / v.10 no.3 / pp.37-43 / 1991
  • In this paper, we compare the performance of speaker adaptation methods, each consisting of two processing stages, for an HMM-based speech recognition system. We compare three kinds of VQ adaptation methods that may be used in the first stage to reduce the distortion error for a new speaker: label prototype adaptation, adaptation with a codebook built from the adaptation speech itself, and adaptation with a mapped codebook. We then compare four kinds of HMM parameter adaptation methods that may be used in the second stage to transform the HMM parameters for a new speaker: adaptation by the Viterbi algorithm, by the DTW algorithm, by the iterative alignment algorithm, and by the fuzzy histogram algorithm. The results show that adaptation based on the fuzzy histogram algorithm yields the highest accuracy in an HMM-based speech recognition system.

  • PDF

Development of Voice Activated Universal Remote Control System using the Speaker Adaptation (화자적응을 이용한 음성인식 제어시스템 개발)

  • Kim Yong-Pyo;Yoon Dong-Han;Choi Un-Ha
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.4 / pp.739-743 / 2006
  • This paper describes the development of a voice-activated universal remote control using neural networks. A speaker-dependent system is developed to operate for a single speaker. Such systems are usually easier to develop, cheaper to buy, and more accurate, but not as flexible as speaker-adaptive or speaker-independent systems. A speaker-independent system is developed to operate for any speaker of a particular type (e.g. American English). These systems are the most difficult and most expensive to develop, and their accuracy is lower than that of speaker-dependent systems; however, they are more flexible. A speaker-adaptive system is developed to adapt its operation to the characteristics of new speakers. Its difficulty lies somewhere between that of speaker-independent and speaker-dependent systems. This paper develops speaker adaptation using neural networks.