An Improved VAD Algorithm Employing Speech Enhancement Preprocessing and Threshold Updating

The Journal of Korean Institute of Communications and Information Sciences (한국통신학회논문지)

Volume 28, Issue 11C / Pages 1161-1168 / 2003 / 1226-4717 (pISSN) / 2287-3880 (eISSN)

The Korean Institute of Communications and Information Sciences (한국통신학회)

음성 향상 전처리와 문턱값 갱신을 적용한 향상된 음성검출 방법

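Yoon-Chang Lee (Signal Processing Laboratory, Department of Electronics and Information Engineering, Korea University);
Sang-Sik Ahn (Division of Electronics and Information Engineering, Korea University)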

Published : 2003.11.01

Abstract

In this paper, we propose an improved statistical model-based voice activity detection (VAD) algorithm and a threshold update method. We first improve the signal-to-noise ratio (SNR) with a speech enhancement preprocessing algorithm that combines the power subtraction method with a matched filter, and then apply the enhanced signal to the optimum decision rule based on the log-likelihood ratio (LLR) test, which improves performance even in low-SNR conditions. We also propose an adaptive threshold update method, which has not been addressed in previous papers. Extensive computer simulations under various background noise environments demonstrate the performance improvement of the proposed VAD algorithm employing the proposed speech enhancement preprocessing and adaptive threshold update method. Finally, we verify our results by comparing them with those of ITU-T G.729 Annex B.

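In this paper, to improve voice activity detection performance, we propose an improved detection method based on a statistical model: the SNR is first improved through speech enhancement preprocessing using a matched filter, and the optimum decision rule based on the LLR (log-likelihood ratio) test is then applied. We also propose a threshold update algorithm that has not been presented in existing voice activity detection methods; with it, voice activity can be detected even in low-SNR environments where existing methods performed poorly. Finally, through computer simulations, we verify the superior performance of the proposed method by comparing it with the voice activity detection results of G.729B (ITU-T G.729 Annex B), which is already commercialized and widely used, and show that the method is applicable to real environments.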
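
The abstract names power subtraction and an LLR test but gives no formulas, so the following Python fragment is only a minimal sketch of how such a decision could be computed for one frame. It assumes the complex-Gaussian model of the statistical VAD literature cited in the references, uses the power-subtraction output as the speech-variance estimate, and omits the matched-filter stage entirely. The function names, the spectral floor, and the fixed threshold argument are hypothetical; the paper's adaptive threshold update rule is not described here and is therefore left as a plain parameter.

```python
import math

def power_subtract(power_spectrum, noise_power, floor=1e-3):
    """Per-bin power subtraction with a small spectral floor (floor value is an assumption)."""
    return [max(p - n, floor * n) for p, n in zip(power_spectrum, noise_power)]

def frame_llr(power_spectrum, noise_power):
    """Mean per-bin log-likelihood ratio of 'speech present' vs 'noise only'.

    Assumes every entry of noise_power is positive.
    """
    speech_power = power_subtract(power_spectrum, noise_power)
    total = 0.0
    for p, s, n in zip(power_spectrum, speech_power, noise_power):
        gamma = p / n  # a posteriori SNR of this bin
        xi = s / n     # a priori SNR estimated from the subtracted power
        total += gamma * xi / (1.0 + xi) - math.log(1.0 + xi)
    return total / len(power_spectrum)

def is_speech(power_spectrum, noise_power, threshold):
    """Declare speech when the frame LLR exceeds the (externally supplied) threshold."""
    return frame_llr(power_spectrum, noise_power) > threshold

# Illustrative frames with a flat noise estimate (made-up numbers, not data from the paper).
noise = [1.0, 1.0, 1.0, 1.0]
print(is_speech([10.0, 10.0, 10.0, 10.0], noise, threshold=1.0))
print(is_speech([1.0, 1.0, 1.0, 1.0], noise, threshold=1.0))
```

In this sketch a frame whose power matches the noise estimate yields an LLR close to zero, so the choice of threshold decides how much excess power is needed before a frame counts as speech; that sensitivity is what an adaptive threshold update would aim to control.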

References

Jongseo Sohn, Wonyong Sung, 'A statistical model-based voice activity detection,' IEEE Signal Processing Letters, Vol. 6, No. 1, 1999
Yong Duk Cho, Ahmet M. Kondoz, 'Analysis and improvement of a statistical model-based voice activity detector,' IEEE Signal Processing Letters, Vol. 8, Issue 10, pp. 276-278, Oct. 2001
ITU-T Recommendation, 'G.729 Annex B: A silence compression scheme for G.729 optimized for terminals conforming to Recommendation V.70,' Nov. 1996
Jin Yang, 'Frequency domain noise suppression approaches in mobile telephone system,' ICASSP-93, Vol. 2, pp. 363-366, 1993
H. G. Hirsch, C. Ehrlicher, 'Noise estimation techniques for robust speech recognition,' ICASSP-95, pp. 153-156, May 1995
Yoon-Chang Lee, Sang-Sik Ahn, 'An improved voice activity detection algorithm employing speech enhancement preprocessing,' IEICE Transactions on Fundamentals, Vol. E84-A, No. 6, Jun. 2001
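Yoon-Chang Lee, Sang-Sik Ahn, 'A voice activity detection method using a matched filter,' Abstracts of the Korean Institute of Communications and Information Sciences Summer Conference, Vol. 25, pp. 6, Aug. 2002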
Ahmet M. Kondoz, 'Digital speech coding for low bit rate communications systems,' John Wiley & Sons, pp. 337-341
http://www.sipro.com, Sipro Lab Telecom Inc.
http://spib.rice.edu, Rice Univ., DSP group
