Improved Global Soft Decision Incorporating Second-Order Conditional MAP for Speech Enhancement

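  • Jong-Mo Kum (DSP Laboratory, Department of Electronic Engineering, Inha University)
  • Joon-Hyuk Chang (Department of Electronic Engineering, Inha University)
  • Published: 2009.06.30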


This paper proposes a speech enhancement technique that improves the conventional global soft decision method by applying a second-order conditional maximum a posteriori (CMAP) criterion to the fixed parameter used in computing the speech absence probability. The conventional global soft decision method fixes this parameter according to the hypotheses assumed when deriving the speech absence probability, which makes the method sensitive to changes in the noise environment. The proposed algorithm instead conditions the parameter on the speech presence or absence decisions made in the previous two frames. It thereby accounts for the inter-frame correlation of speech activity and estimates the speech absence probability of the current frame more flexibly, exploiting both the current observation and those earlier decisions. Performance was evaluated with the ITU-T P.862 perceptual evaluation of speech quality (PESQ) measure, and the proposed global soft decision method based on second-order CMAP yielded better results than the conventional global soft decision method.
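The core idea, a speech absence probability whose prior term depends on the previous two frame decisions rather than on a fixed constant, can be illustrated with a minimal Python sketch. It assumes the usual global soft decision form p(H0 | Y) = 1 / (1 + r · ∏ Λ_k), where Λ_k is the per-bin likelihood ratio and r is the prior ratio P(H1) / P(H0). The table `PRIOR_RATIO`, its values, and the function names `global_sap` and `decide` are hypothetical choices for illustration and are not taken from the paper.

```python
import math

# Hypothetical prior ratios P(H1 | d1, d2) / P(H0 | d1, d2), indexed by the
# speech activity decisions of frames t-1 and t-2 (1 = speech present).
# Illustrative values only; the paper's actual parameters are not given here.
PRIOR_RATIO = {
    (0, 0): 0.2,
    (0, 1): 0.8,
    (1, 0): 1.5,
    (1, 1): 5.0,
}


def global_sap(log_likelihood_ratios, prev1, prev2):
    """Global speech absence probability p(H0 | Y) for one frame.

    log_likelihood_ratios: per-bin values log(p(Y_k | H1) / p(Y_k | H0)),
        with bins assumed statistically independent.
    prev1, prev2: speech activity decisions (0 or 1) for frames t-1 and t-2.
    """
    # Work in the log domain so the product over bins cannot overflow.
    log_ratio = math.log(PRIOR_RATIO[(prev1, prev2)]) + sum(log_likelihood_ratios)
    # p(H0 | Y) = 1 / (1 + exp(log_ratio)), evaluated in a stable form.
    if log_ratio > 0:
        e = math.exp(-log_ratio)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(log_ratio))


def decide(sap, threshold=0.5):
    """Return 1 (speech present) when the absence probability is below threshold."""
    return 1 if sap < threshold else 0


if __name__ == "__main__":
    # Same observation, different decision histories.
    llrs = [0.1, -0.2, 0.3]
    for history in [(0, 0), (1, 1)]:
        sap = global_sap(llrs, *history)
        print(history, round(sap, 3), decide(sap))
```

With identical observations, the frame that follows two speech frames receives a lower speech absence probability than the frame that follows two noise frames, which is the adaptive behavior a fixed prior ratio cannot provide.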


