Improvement of Speech Reconstructed from MFCC Using GMM

GMM을 이용한 MFCC로부터 복원된 음성의 개선

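  • 최원영 (Speech Communication Laboratory, Department of Electronics Engineering, Pusan National University) ;
  • 최무열 (Speech Communication Laboratory, Department of Electronics Engineering, Pusan National University) ;
  • 김형순 (Speech Communication Laboratory, Department of Electronics Engineering, Pusan National University)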
  • Published: 2005.03.01

Abstract

The goal of this research is to improve the quality of speech reconstructed in a Distributed Speech Recognition (DSR) system. For the extended DSR, we estimate a variable Maximum Voiced Frequency (MVF) from Mel-Frequency Cepstral Coefficients (MFCCs) using a Gaussian Mixture Model (GMM), in order to implement a realistic harmonic-plus-noise model for the excitation signal. For the standard DSR, where pitch information is not available, we also make the voiced/unvoiced decision from the MFCCs using a GMM. A perceptual test shows that listeners prefer speech reconstructed by the proposed method to speech reconstructed by conventional methods.
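
As a rough illustration of the GMM-based voiced/unvoiced decision mentioned above, the sketch below scores an MFCC vector against two diagonal-covariance GMMs and picks the class with the higher log-likelihood. The abstract does not state the feature dimension, the number of mixture components, or the decision rule, so everything here is an assumption: the function names (`log_gauss_diag`, `gmm_log_likelihood`, `is_voiced`), the two-dimensional toy vectors, and the hand-picked parameters in `VOICED_GMM` and `UNVOICED_GMM` are hypothetical stand-ins for models that would normally be trained on labeled frames.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    # Log-sum-exp keeps the sum stable when the densities are very small.
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def is_voiced(mfcc, voiced_gmm, unvoiced_gmm):
    """Assumed decision rule: choose the class whose GMM scores higher."""
    return gmm_log_likelihood(mfcc, voiced_gmm) > gmm_log_likelihood(mfcc, unvoiced_gmm)


# Hypothetical toy models (not trained, not taken from the paper).
VOICED_GMM = [
    (0.6, [2.0, 1.0], [1.0, 1.0]),
    (0.4, [3.0, 0.0], [1.0, 1.0]),
]
UNVOICED_GMM = [
    (0.5, [-2.0, -1.0], [1.0, 1.0]),
    (0.5, [-3.0, 0.5], [1.0, 1.0]),
]

if __name__ == "__main__":
    for frame in ([2.5, 0.5], [-2.5, 0.0]):
        label = "voiced" if is_voiced(frame, VOICED_GMM, UNVOICED_GMM) else "unvoiced"
        print(frame, label)
```

In practice the two models would be fitted to MFCC frames labeled with a reference pitch tracker, and the same likelihood machinery could in principle serve other MFCC-driven estimates, though how the paper maps MFCCs to the MVF is not described in the abstract.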

Keywords