Improvement of MLLR Algorithm for Rapid Speaker Adaptation and Reduction of Computation

  • 김지운 (DSP Lab., Department of Electronic Engineering, Inha University);
  • 정재호 (DSP Lab., Department of Electronic Engineering, Inha University)
  • Published: 2004.01.01

Abstract

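This paper improves the MLLR (Maximum Likelihood Linear Regression) speaker adaptation algorithm by reducing the dimensionality of the HMM (Hidden Markov Model) parameters with principal component analysis (PCA) or independent component analysis (ICA). PCA and ICA, which capture the characteristic structure of the data, are used to reduce the correlation among the model's mixture components and to discard the axes along which the data are distributed relatively little, which lowers the number of adaptation parameters that must be estimated. The conventional MLLR algorithm required at least 30 seconds of adaptation data to outperform the SI (Speaker Independent) model, whereas the proposed algorithm, with fewer adaptation parameters, required at least 10 seconds. In addition, replacing the 36-dimensional HMM parameters with 10 principal or independent components, which give recognition performance similar to the conventional MLLR algorithm, reduced the computation required to estimate the adaptation parameters in MLLR to 1/167.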
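The dimensionality reduction step described above can be sketched as follows. This is a minimal illustration of plain PCA with NumPy, not the authors' implementation: the synthetic `means` array stands in for 36-dimensional Gaussian mean vectors, and the function name `pca_reduce`, the sample count, and the noise level are all assumptions made for the example. The ICA variant is not shown.

```python
import numpy as np

def pca_reduce(vectors, num_components):
    """Project row vectors onto their top principal axes."""
    center = vectors.mean(axis=0)
    centered = vectors - center
    covariance = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:num_components]
    basis = eigenvectors[:, order]  # shape: (dim, num_components)
    return centered @ basis, basis, center, eigenvalues[order]

# Hypothetical data: 500 correlated 36-dimensional vectors (assumption,
# standing in for HMM mixture-component means).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 10))
mixing = rng.normal(size=(10, 36))
means = latent @ mixing + 0.05 * rng.normal(size=(500, 36))

reduced, basis, center, variances = pca_reduce(means, 10)

# Fraction of the total variance kept by the 10 retained axes
retained = variances.sum() / np.var(means, axis=0, ddof=1).sum()
print(reduced.shape, float(retained))
```

Adaptation would then operate on the 10-dimensional `reduced` vectors; mapping back to the original space uses `reduced @ basis.T + center`.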

We improve the MLLR speaker adaptation algorithm by reducing the order of the HMM parameters using PCA (Principal Component Analysis) or ICA (Independent Component Analysis). To find a smaller set of variables with less redundancy, we apply PCA and ICA, which give as good a representation of the data as possible, minimize the correlations between data elements, and remove the axes that contribute least in terms of covariance (PCA) or higher-order statistical independence (ICA). The ordinary MLLR algorithm needs more than 30 seconds of adaptation data for the adapted SD (Speaker Dependent) models to reach a higher word recognition rate than the SI (Speaker Independent) models, whereas the proposed algorithm needs only a little more than 10 seconds. With 10 components, ICA and PCA perform similarly to the ordinary MLLR framework with 36 components. As a result, compared with the ordinary MLLR algorithm, the total computation required for speaker adaptation is reduced to about 1/167 in the proposed algorithm.
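The abstract does not say how the 1/167 figure was derived. One reading that matches it, assuming the row-by-row estimation of a full MLLR mean transform costs on the order of n^4 operations (n rows, each requiring the inversion of an (n+1) x (n+1) matrix), is worked through below. The function names and the fourth-power exponent are assumptions for illustration only.

```python
def mllr_transform_params(dim):
    """Number of entries in a full MLLR mean transform W of shape dim x (dim + 1)."""
    return dim * (dim + 1)

def relative_cost(dim, exponent=4):
    """Assumed cost model: estimation cost grows as dim ** exponent."""
    return dim ** exponent

full_dim, reduced_dim = 36, 10

# Adaptation parameters per transform before and after the reduction
print(mllr_transform_params(full_dim), mllr_transform_params(reduced_dim))  # 1332 110

# Cost ratio under the assumed fourth-power scaling
ratio = relative_cost(full_dim) / relative_cost(reduced_dim)
print(round(ratio, 2))  # 167.96
```

Under this assumption the ratio is (36/10)^4 ≈ 168, in line with the reported reduction to about 1/167.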
