Emotion recognition from speech using Gammatone auditory filterbank

  • Le, Ba-Vui (Department of Computer Engineering, Kyung Hee University)
  • Lee, Young-Koo (Department of Computer Engineering, Kyung Hee University)
  • Lee, Sung-Young (Department of Computer Engineering, Kyung Hee University)
  • Published : 2011.06.29

Abstract

This paper describes an application of the Gammatone auditory filterbank to emotion recognition from speech. The filterbank is a bank of Gammatone filters used as a preprocessing stage before feature extraction, so that the features most relevant to emotion recognition can be obtained. In the feature extraction step, the energy of each filter's output signal is computed, and the energies of all filters are combined into a feature vector for the learning step. Each feature vector is estimated over a short time period of the input speech signal in order to exploit its temporal dependence. Finally, in the learning step, a Hidden Markov Model (HMM) is trained for each emotion class and used to recognize the emotion of an input utterance. In the experiments, features based on the Gammatone filterbank (GTF) outperform features based on Mel-Frequency Cepstral Coefficients (MFCC), a well-known feature extraction method for speech recognition as well as emotion recognition from speech.
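
The feature extraction step described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the abstract does not give the filter order, bandwidths, number of filters, or frame sizes, so the code assumes 4th-order Gammatone filters with bandwidth 1.019 x ERB (Glasberg and Moore's ERB formula), center frequencies spaced evenly on the ERB-rate scale, 16 filters, and 25 ms frames with a 10 ms hop. Each filter is approximated by convolution with a truncated impulse response, and the function names (erb_space, gammatone_ir, gtf_energy_features) are hypothetical.

```python
import numpy as np

def erb(f_hz):
    # Equivalent rectangular bandwidth in Hz (Glasberg & Moore formula).
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_space(f_lo, f_hi, n):
    # n center frequencies spaced evenly on the ERB-rate scale (assumed spacing).
    e_lo = 21.4 * np.log10(4.37 * f_lo / 1000.0 + 1.0)
    e_hi = 21.4 * np.log10(4.37 * f_hi / 1000.0 + 1.0)
    e = np.linspace(e_lo, e_hi, n)
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone_ir(fc, sr, duration=0.025, order=4):
    # Truncated Gammatone impulse response:
    # g(t) = t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), with b = 1.019 * ERB(fc).
    t = np.arange(int(round(duration * sr))) / sr
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))  # normalize to unit energy

def gtf_energy_features(signal, sr, n_filters=16, frame_len=0.025, hop=0.010,
                        f_lo=100.0, f_hi=None):
    # Returns (centers, feats); feats[i, k] is the energy of filter k in frame i.
    # All default parameter values are assumptions, not values from the paper.
    if f_hi is None:
        f_hi = 0.45 * sr
    centers = erb_space(f_lo, f_hi, n_filters)
    flen = int(round(frame_len * sr))
    fhop = int(round(hop * sr))
    n_frames = 1 + (len(signal) - flen) // fhop
    feats = np.empty((n_frames, n_filters))
    for k, fc in enumerate(centers):
        # Filter the whole signal with the k-th Gammatone filter.
        y = np.convolve(signal, gammatone_ir(fc, sr), mode="same")
        for i in range(n_frames):
            seg = y[i * fhop : i * fhop + flen]
            feats[i, k] = np.sum(seg ** 2)  # short-time energy of this band
    return centers, feats
```

Under these assumptions, each row of feats is one feature vector for one short frame, so an utterance becomes a sequence of vectors that a per-class HMM could then score; recognition would pick the emotion class whose model gives the highest likelihood.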

Acknowledgement

Supported by: Ministry of Knowledge Economy, National IT Industry Promotion Agency