Robust Entropy Based Voice Activity Detection Using Parameter Reconstruction in Noisy Environment

  • Han, Hag-Yong (Department of Electronic Engineering, Dong-A University) ;
  • Lee, Kwang-Seok (School of Electronic Information and Communication Engineering, Kyungil University) ;
  • Koh, Si-Young (School of Electronic Information and Communication Engineering, Kyungil University) ;
  • Hur, Kang-In (Department of Electronic Engineering, Dong-A University)
  • Published : 2003.12.01

Abstract

Voice activity detection is an important problem in speech recognition and speech communication. This paper introduces a new feature parameter, reconstructed from the spectral entropy of information theory, for robust voice activity detection in noisy environments, and analyzes and compares its detection performance with that of the energy-based method. In experiments, we confirmed that spectral entropy and its reconstructed parameter are superior to the energy method for robust voice activity detection in various noise environments.
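
As an illustration of the underlying feature, the following is a minimal sketch of plain spectral entropy for a single frame, assuming the common definition as the Shannon entropy of the normalized power spectrum. It is not the reconstructed parameter proposed in the paper, which the abstract does not define. The function names, the frame length of 256 samples, and the small constant `eps` are assumptions made for this example only.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (in bits) of the normalized power spectrum of one frame."""
    power = power_spectrum(frame)
    total = sum(power) + eps  # eps guards against an all-zero frame
    probs = [p / total for p in power]
    return -sum(p * math.log2(p + eps) for p in probs)


# Hypothetical test signals: a pure tone (energy in one bin) and white noise.
random.seed(0)
n = 256
tone = [math.sin(2 * math.pi * 16 * t / n) for t in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]

h_tone = spectral_entropy(tone)
h_noise = spectral_entropy(noise)
print(h_tone < h_noise)
```

A spectrum concentrated in a few bins gives a low entropy, while a flat, noise-like spectrum gives a value close to the maximum of log2 of the number of bins. Unlike frame energy, this measure does not change when the frame is scaled by a constant gain.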

Keywords

References

  1. Sadaoki Furui : 'Digital Speech Processing, Synthesis, and Recognition', Marcel Dekker, Inc., 2001, pp. 248-249
  2. Xuedong Huang, Alex Acero, Hsiao-Wuen Hon : 'Spoken Language Processing', Prentice Hall, 2001, pp. 120-130
  3. Nikos Doukas, Patrick Naylor and Tania Stathaki : 'Voice Activity Detection Using Source Separation Techniques', Signal Processing Section, Proc. Eurospeech '97
  4. L.R.Rabiner, R.W.Schafer : 'Digital Processing of Speech Signals', PRENTICE HALL
  5. J.L. Shen, J. Hung, L.S. Lee : 'Robust Entropy-based Endpoint Detection for Speech Recognition in Noisy Environments', Proceedings of ICSLP-98, 1998
  6. S. McClellan and J.D. Gibson: 'Variable-rate celp based on subband flatness', in IEEE Transactions on Speech and Audio-Processing, vol. 5, pp. 120-130, 1997 https://doi.org/10.1109/89.554774
  7. J.Sohn and W.Sung : 'A Voice Activity Detector Employing Soft Decision Based Noise Spectrum Adaptation', in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 356-368, 1998
  8. J.D. Hoyt and H. Wechsler : 'Detection of human speech in structured noise' in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 237-240, 1994
  9. http://spib.rice.edu/spib/select_noise.html
  10. Jianping Zhang, Wayne Ward, Bryan Pellom, 'Phone Based Voice Activity Detection Using Online Bayesian Adaptation with Conjugate Normal Distributions', ICASSP 2002, Orlando, Florida, May 2002
  11. E. Nemer, R. Goubran and S. Mahmoud, 'Robust Voice Activity Detection Using Higher-Order Statistics in the LPC Residual Domain', IEEE Trans. Speech and Audio Proc., 9(3), pp. 217-231, 2001 https://doi.org/10.1109/89.905996
  12. R. Sarikaya and J. Hansen, 'Robust Speech Activity Detection in the Presence of Noise', ICSLP, Sydney, 1998
  13. J.Sohn, N. Kim and W. Sung, 'A Statistical Model-Based Voice Activity Detection', IEEE Signal Proc. Lett., 6(1), pp. 1-3, 1999 https://doi.org/10.1109/97.736233
  14. S. Tanyer and H. Ozer, 'Voice Activity Detection in Nonstationary Noise', IEEE Trans. Speech and Audio Proc.,8(4), pp. 478-482, 2000 https://doi.org/10.1109/89.848229
  15. Mak, B., Junqua, J.-C., and Reaves, B., 'A robust speech/non speech detection algorithm using time and frequency-based features', IEEE ICASSP, vol. 1, pp. 269-272, 1992