Voice Activity Detection Using Global Speech Absence Probability Based on Teager Energy in Noisy Environments

Park, Yun-Sik; Lee, Sang-Min

Journal of the Institute of Electronics Engineers of Korea SP (대한전자공학회논문지SP)

Vol. 49, No. 1, pp. 97-103, 2012. pISSN 1229-6384

The Institute of Electronics and Information Engineers (대한전자공학회)

Korean title: 잡음환경에서 Teager Energy 기반의 전역 음성부재확률을 이용하는 음성검출

Park, Yun-Sik (박윤식), Department of Electronic Engineering, Inha University
Lee, Sang-Min (이상민), Department of Electronic Engineering, Inha University

Submitted: 2011.07.05
Reviewed: 2011.09.14
Published: 2012.01.25

Abstract

This paper proposes a voice activity detection (VAD) algorithm that distinguishes speech from non-speech in noisy environments. The global speech absence probability (GSAP), derived from the likelihood ratio (LR) of a statistical model, is widely used as a feature parameter for VAD. At low signal-to-noise ratios (SNRs), however, the GSAP is difficult to estimate accurately, so the conventional GSAP feature does not separate speech from noise reliably. To make the decision more robust to noise, the proposed method uses a GSAP based on Teager energy (TE) as the feature parameter. In objective tests under various background noise conditions, the proposed algorithm performed better than the conventional methods.
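As a rough illustration of the two quantities named in the abstract, the sketch below computes the discrete Teager energy of a signal and a GSAP from per-bin likelihood ratios. It assumes the standard discrete Teager-Kaiser operator and a common complex-Gaussian statistical model for the LR. The abstract does not say how the paper combines TE with the GSAP, so the two are shown separately rather than as the authors' method. The function names, the prior `prior_absence`, and the example signal are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy, psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returned for interior samples only, so the output has len(x) - 2 values.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def log_likelihood_ratio(gamma, xi):
    """Log LR of one frequency bin under a complex-Gaussian model (assumed).

    gamma: a posteriori SNR of the bin, xi: a priori SNR of the bin.
    Lambda = exp(gamma * xi / (1 + xi)) / (1 + xi)
    """
    return gamma * xi / (1.0 + xi) - math.log1p(xi)


def gsap(gammas, xis, prior_absence=0.5):
    """Global speech absence probability of one frame.

    P(H0 | frame) = 1 / (1 + q * prod_k Lambda_k), with q = P(H1) / P(H0).
    prior_absence is a hypothetical tuning value, not taken from the paper.
    """
    q = (1.0 - prior_absence) / prior_absence
    log_term = math.log(q) + sum(
        log_likelihood_ratio(g, x) for g, x in zip(gammas, xis)
    )
    # Evaluate 1 / (1 + exp(log_term)) without overflowing for large |log_term|.
    if log_term > 0:
        e = math.exp(-log_term)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(log_term))


# Example: for a unit-amplitude tone cos(w * n), the Teager energy is sin(w)^2.
tone = [math.cos(0.3 * n) for n in range(50)]
tone_energy = teager_energy(tone)
```

A detector built on this would typically label a frame as speech when the GSAP falls below a threshold. That threshold is a tuning choice and is not given in the abstract.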

