Speech Enhancement based on Smoothed Global Soft Decision

  • Jo, Q-Haing (조규행; School of Electronic and Electrical Engineering, Inha University)
  • Park, Yun-Sik (박윤식; School of Electronic and Electrical Engineering, Inha University)
  • Chang, Joon-Hyuk (장준혁; School of Electronic and Electrical Engineering, Inha University)
  • Published: 2007.11.25

Abstract

In this paper, we propose an improved global soft decision (GSD) method for speech enhancement in noisy environments. An examination of statistical model-based speech enhancement shows that GSD has a fundamental drawback in the offset (tail) regions of speech signals. To overcome this drawback, we apply a new speech enhancement method based on a smoothed global likelihood ratio (SGLR) to GSD. The proposed method is evaluated with subjective mean opinion score (MOS) tests under various noise environments and yields better results than the previously reported speech enhancement method.
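
The abstract does not give the equations, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes the per-bin likelihood ratio of the standard complex Gaussian statistical model, a global log-likelihood ratio formed by summing over frequency bins, and a first-order recursive smoothing in the log domain. The function names, the smoothing factor `kappa`, and the prior ratio `prior_ratio` are hypothetical and chosen for illustration.

```python
import math


def log_likelihood_ratio(xi, gamma):
    """Per-bin log likelihood ratio under a complex Gaussian model.

    xi is the a priori SNR and gamma the a posteriori SNR of one bin.
    """
    return gamma * xi / (1.0 + xi) - math.log1p(xi)


def global_log_lr(xis, gammas):
    """Global log likelihood ratio of one frame: sum over frequency bins
    (assumes the bins are statistically independent)."""
    return sum(log_likelihood_ratio(x, g) for x, g in zip(xis, gammas))


def smooth_global_log_lr(frame_log_lrs, kappa=0.9):
    """First-order recursive smoothing across frames (assumed form).

    kappa is a hypothetical smoothing factor in [0, 1); larger values
    make the ratio decay more slowly after speech ends.
    """
    smoothed = []
    prev = 0.0
    for i, llr in enumerate(frame_log_lrs):
        prev = llr if i == 0 else kappa * prev + (1.0 - kappa) * llr
        smoothed.append(prev)
    return smoothed


def speech_absence_probability(log_lr, prior_ratio=1.0):
    """p(H0 | Y) = 1 / (1 + q * exp(log_lr)), with q = P(H1) / P(H0).

    Evaluated in a numerically stable way for large |log_lr|.
    """
    z = math.log(prior_ratio) + log_lr
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


# Illustrative frame sequence: two strong speech frames, then a weak offset.
raw = [20.0, 20.0, -2.0, -2.0]
smoothed = smooth_global_log_lr(raw, kappa=0.9)
absence_raw = [speech_absence_probability(v) for v in raw]
absence_smoothed = [speech_absence_probability(v) for v in smoothed]
```

In this toy sequence the raw ratio drops below zero as soon as the speech energy falls, so the offset frames would be treated as mostly noise. The smoothed ratio decays gradually, which keeps the speech absence probability low through the tail of the utterance.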
