Robust Speech Recognition with Car Noise based on the Wavelet Filter Banks

Isolated-Word Speech Recognition Robust to Car Noise Using Wavelet Filter Banks

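  • 이대종 (School of Electrical, Electronic and Computer Engineering, Research Institute for Computer and Information Communication, Chungbuk National University) ;
  • 곽근창 (School of Electrical, Electronic and Computer Engineering, Research Institute for Computer and Information Communication, Chungbuk National University) ;
  • 유정웅 (School of Electrical, Electronic and Computer Engineering, Research Institute for Computer and Information Communication, Chungbuk National University) ;
  • 전명근 (School of Electrical, Electronic and Computer Engineering, Research Institute for Computer and Information Communication, Chungbuk National University)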
  • Published : 2002.04.01

Abstract

This paper proposes a robust speech recognition algorithm based on wavelet filter banks. Because noise severely degrades the performance of speech recognition systems, the proposed algorithm adopts a multiple-band decision-making scheme to remain robust in noisy conditions. To evaluate the proposed scheme, we compared it with a conventional speech recognizer based on vector quantization (VQ) on 10 isolated Korean digits corrupted by car noise. The proposed method improved the recognition rate by 9~27% over the conventional VQ algorithm across various car noise environments.

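This paper proposes an isolated-word speech recognition algorithm that is robust to external noise, built on a multiple decision-making scheme that uses wavelet subband filtering. External noise is regarded as a main cause of degraded recognition rates, so developing noise-robust recognition algorithms is essential to improving recognizer performance. To validate the proposed algorithm, we examined how the recognition rate for 10 isolated Korean digit words varies under various car noise conditions. The proposed algorithm improved the recognition rate by 9~25% compared with applying only the vector quantization algorithm, which is widely used in current speech recognition.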
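
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of subband VQ with a multiple-band decision, not the authors' method. It assumes a Haar wavelet filter bank, per-frame log energy as a toy feature, codebooks taken directly from clean training features, and a majority vote across bands with summed distortion as the tie-breaker. All function names (`split_bands`, `build_codebooks`, `recognize`, and so on) are hypothetical.

```python
import math

def haar_step(x):
    """One Haar analysis level: returns (approximation, detail), each half length."""
    s = 1.0 / math.sqrt(2.0)
    n = len(x) - len(x) % 2
    approx = [(x[i] + x[i + 1]) * s for i in range(0, n, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, n, 2)]
    return approx, detail

def split_bands(signal, levels=2):
    """Split a signal into levels+1 subbands: details (high to low), then the final approximation."""
    bands, approx = [], list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

def frame_log_energy(band, frame_len=4):
    """Toy feature (assumption): log energy of each non-overlapping frame."""
    feats = []
    for start in range(0, len(band) - frame_len + 1, frame_len):
        energy = sum(v * v for v in band[start:start + frame_len])
        feats.append(math.log(energy + 1e-9))
    return feats

def vq_distortion(feats, codebook):
    """Average squared distance from each feature to its nearest codeword."""
    return sum(min((f - c) ** 2 for c in codebook) for f in feats) / len(feats)

def build_codebooks(examples, levels=2):
    """Assumption: use each word's clean features as its per-band codebook."""
    return {word: [frame_log_energy(b) for b in split_bands(sig, levels)]
            for word, sig in examples.items()}

def recognize(signal, codebooks, levels=2):
    """Each band votes for its lowest-distortion word; ties fall back to total distortion."""
    votes = {w: 0 for w in codebooks}
    total = {w: 0.0 for w in codebooks}
    for b, band in enumerate(split_bands(signal, levels)):
        feats = frame_log_energy(band)
        dists = {w: vq_distortion(feats, cbs[b]) for w, cbs in codebooks.items()}
        votes[min(dists, key=dists.get)] += 1
        for w, d in dists.items():
            total[w] += d
    return max(codebooks, key=lambda w: (votes[w], -total[w]))
```

A real system would replace the toy feature with spectral or cepstral vectors and train the codebooks with a clustering method such as k-means. The voting step is what lets a band that noise has corrupted be outvoted by cleaner bands.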
