A Merging Algorithm with the Discrete Wavelet Transform to Extract Valid Speech-Sounds

Korean title (translated): A Merging Algorithm for Extracting Valid Speech Using the Discrete Wavelet Transform

  • Kim, Jin-Ok (Dept. of Electric Electronics Computer Engineering, Sungkyunkwan University) ;
  • Hwang, Dae-Jun (Dept. of Electric Electronics Computer Engineering, Sungkyunkwan University) ;
  • Paek, Han-Wook (Researcher, American-Panel Corporation) ;
  • Chung, Chin-Hyun (Dept. of Information Control Instrumentation Engineering, Kwangwoon University)
  • Published : 2002.06.01

Abstract

A valid speech-sound block can be classified to provide important information for speech recognition. The classification relies on the multi-resolution analysis (MRA) property of the discrete wavelet transform (DWT), which is used to reduce the computation time of pre-processing for speech recognition. A merging algorithm is proposed to extract valid speech-sounds in terms of position and frequency range. It requires some numerical methods for an adaptive DWT implementation, and it performs unvoiced/voiced classification and denoising. Because the merging algorithm can determine its processing parameters from the voice alone and is independent of system noise, it is useful for extracting valid speech-sounds. The merging algorithm adapts to arbitrary system noise and achieves an excellent denoising signal-to-noise ratio (SNR).

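Extracting valid speech data from recorded data is important in the field of speech recognition. The speech extraction technique in this paper uses the discrete wavelet transform, which allows fast computation and is well suited to the pre-processing of speech. A merging algorithm that exploits the multi-resolution analysis property of the discrete wavelet transform extracts valid speech and removes noise at the same time. Because the merging algorithm can determine its processing parameters from the speech alone and is independent of system noise, it is very effective for extracting valid speech. The merging algorithm also adapts to system noise and has excellent noise separation characteristics.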
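
The abstract names two building blocks, DWT multi-resolution analysis and denoising, without giving details. The sketch below illustrates only those generic building blocks; it is not the authors' merging algorithm. The Haar wavelet, the function names (`haar_decompose`, `haar_reconstruct`, `soft_threshold`, `denoise`), and the fixed threshold value are all assumptions chosen for illustration, since the abstract does not state which wavelet or thresholding rule the paper uses.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_step(signal):
    """One Haar DWT level: split into approximation and detail halves."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar DWT level."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def haar_decompose(signal, levels):
    """Multi-resolution analysis: returns [approx_n, detail_n, ..., detail_1]."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def haar_reconstruct(coeffs):
    """Rebuild the signal from the output of haar_decompose."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_inverse_step(approx, detail)
    return approx


def soft_threshold(values, t):
    """Shrink each coefficient toward zero by t (soft thresholding)."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in values]


def denoise(signal, levels, threshold):
    """Soft-threshold the detail bands only, then reconstruct.

    The threshold is a hypothetical fixed value supplied by the caller.
    """
    coeffs = haar_decompose(signal, levels)
    shrunk = [coeffs[0]] + [soft_threshold(d, threshold) for d in coeffs[1:]]
    return haar_reconstruct(shrunk)
```

Calling `denoise(samples, levels=2, threshold=0.1)` on a list whose length is a multiple of 4 returns a list of the same length. With `threshold=0` the function reduces to decomposition followed by exact reconstruction, which is a convenient sanity check.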
