Speech Enhancement System Using a Model of Auditory Mechanism

청각기강의 모델을 이용한 음성강조 시스템

  • 최재승 (Department of Information and Communication Engineering, Osaka City University, Japan)
  • Published : 2004.11.01

Abstract

In speech processing, the handling of noise remains an important research problem. In particular, background noise has long been known to cause a marked drop in speech recognition rates. Examples include the various non-stationary noises found in real environments, such as the driving noise of automobiles on the road or the operating noise of a printer. Noise of this kind cannot be removed adequately by conventional linear methods such as the Wiener filter, and it calls for more sophisticated suppression techniques. As one such attempt, this paper presents a speech enhancement algorithm that uses a model of mutual inhibition, a noise-suppression function of the human auditory system, to reduce noise in speech contaminated by white noise or by the non-stationary background noise described above. Evaluation by the spectral distortion (SD) measure confirms that the proposed algorithm is effective for speech degraded by colored noise as well as by white noise.
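The abstract does not give the equations of the mutual inhibition model or of the SD measure, so the following is only an illustrative sketch under stated assumptions. It uses a generic center-surround (lateral) inhibition over the bins of a magnitude spectrum, and it computes spectral distortion as the RMS difference of log-magnitude spectra in dB, which is one common definition. The function names, the `surround` and `weight` parameters, and the toy spectrum are hypothetical and are not taken from the paper.

```python
import math


def lateral_inhibition(spectrum, surround=2, weight=0.5):
    """Sharpen a magnitude spectrum with a center-surround rule.

    Assumption: each bin is inhibited by `weight` times the mean of its
    neighbours within `surround` bins, and negative results are clipped
    to zero. This is a generic sketch, not the paper's model.
    """
    n = len(spectrum)
    out = []
    for k in range(n):
        lo = max(0, k - surround)
        hi = min(n, k + surround + 1)
        neighbours = [spectrum[j] for j in range(lo, hi) if j != k]
        if neighbours:
            inhibition = weight * sum(neighbours) / len(neighbours)
        else:
            inhibition = 0.0
        out.append(max(0.0, spectrum[k] - inhibition))
    return out


def spectral_distortion_db(reference, estimate, floor=1e-10):
    """RMS difference between two log-magnitude spectra, in dB.

    Assumption: this is one common form of the SD measure; the paper
    may define it differently. `floor` avoids taking log of zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("spectra must have the same length")
    diffs = [
        20.0 * math.log10(max(r, floor)) - 20.0 * math.log10(max(e, floor))
        for r, e in zip(reference, estimate)
    ]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


if __name__ == "__main__":
    # Toy spectrum: one spectral peak sitting on a flat noise floor.
    noisy = [1.0, 1.0, 3.0, 1.0, 1.0]
    print(lateral_inhibition(noisy, surround=1, weight=0.5))
    # -> [0.5, 0.0, 2.5, 0.0, 0.5]
```

In this toy case the bins next to the peak are suppressed to zero while the peak is only slightly reduced, which is the contrast-sharpening behaviour that inhibition models are meant to capture. In a real system the spectrum would come from a short-time analysis of the noisy speech, and SD would be computed between the clean and the enhanced spectra frame by frame.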
