Reference Channel Input-Based Speech Enhancement for Noise-Robust Recognition in Intelligent TV Applications

Jeong, Sangbae

doi:10.6109/jkiice.2013.17.2.280

Journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회논문지)

Volume 17, Issue 2, Pages 280-286, 2013, 2234-4772 (pISSN), 2288-4165 (eISSN)

The Korea Institute of Information and Communication Engineering (한국정보통신학회)

Korean title: 지능형 TV의 음성인식을 위한 참조 잡음 기반 음성개선 (Reference Noise-Based Speech Enhancement for Speech Recognition in Intelligent TVs)

Jeong, Sangbae (Gyeongsang National University)

Received : 2012.07.30
Accepted : 2012.09.05
Published : 2013.02.28

https://doi.org/10.6109/jkiice.2013.17.2.280

Abstract

This paper proposes a noise reduction system for the speech interface in intelligent TV applications. TV speaker sound is a severe noise source that degrades recognition performance, so the implemented noise reduction algorithm uses the direct TV sound as its reference noise input. The algorithm first estimates transfer functions to compensate for the difference between the direct TV sound and the TV sound recorded by the microphone installed on the TV frame. It then calculates the noise power spectrum of the received signal and performs Wiener filter-based noise cancellation. A postprocessing step further reduces the residual noise. In experiments, the proposed algorithm achieves an 88% recognition rate for isolated Korean words at an input SNR of 5 dB.

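Korean abstract (translated): This paper proposes a noise reduction system for the speech interface of intelligent TVs. To remove TV sound, which severely degrades speech recognition performance, a noise reduction algorithm that uses the TV sound itself as the reference noise is implemented. The proposed algorithm estimates the transfer functions between the TV speaker and the multichannel equipment. The noise power spectrum is then estimated to drive the Wiener filter. In addition, a postprocessing step is applied to remove the residual noise. In experiments, the proposed algorithm achieved a speech recognition rate of 88% at an input SNR of 5 dB.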
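
The processing chain summarized in the abstract (transfer function estimation, noise power spectrum calculation, Wiener filtering) can be illustrated with a minimal per-frequency-bin sketch. This is not the paper's implementation: the function names, the least-squares transfer function estimate from noise-only frames, the spectral-subtraction form of the Wiener gain, and the `gain_floor` value are all assumptions made for illustration, and the postprocessing step is omitted. Each frame is assumed to be a list of complex short-time spectrum values, one per frequency bin.

```python
def estimate_transfer(ref_frames, mic_frames, eps=1e-12):
    """Least-squares estimate of H(k) per bin from frames that contain only TV sound.

    H(k) = sum_t Y_t(k) * conj(R_t(k)) / sum_t |R_t(k)|^2
    where R is the direct TV (reference) spectrum and Y the microphone spectrum.
    (Hypothetical helper; the paper's estimator may differ.)
    """
    n_bins = len(ref_frames[0])
    cross = [0j] * n_bins
    power = [0.0] * n_bins
    for r, y in zip(ref_frames, mic_frames):
        for k in range(n_bins):
            cross[k] += y[k] * r[k].conjugate()
            power[k] += abs(r[k]) ** 2
    return [cross[k] / (power[k] + eps) for k in range(n_bins)]


def wiener_enhance(ref_frame, mic_frame, transfer, gain_floor=0.05, eps=1e-12):
    """Apply a Wiener-type gain to one microphone frame.

    The noise power spectrum is predicted as |H(k) R(k)|^2, and the gain is
    G(k) = max(1 - noise_psd / observed_psd, gain_floor).
    (gain_floor is an assumed value that limits over-suppression.)
    """
    enhanced = []
    for r, y, h in zip(ref_frame, mic_frame, transfer):
        noise_psd = abs(h * r) ** 2
        observed_psd = abs(y) ** 2
        gain = max(1.0 - noise_psd / (observed_psd + eps), gain_floor)
        enhanced.append(gain * y)
    return enhanced
```

In this sketch, bins dominated by TV sound are attenuated down to the gain floor, while bins where the microphone power clearly exceeds the predicted TV noise power pass through almost unchanged.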
