Emotion recognition in speech using hidden Markov model

은닉 마르코프 모델을 이용한 음성에서의 감정인식

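  • Sung-Ill Kim (김성일; Center for Speech Technology, Tsinghua University, China);
  • Hyun-Yeol Chung (정현열; Department of Information and Communication Engineering, College of Engineering, Yeungnam University)
  • Published: 2002.07.01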

Abstract

This paper presents a new approach to identifying human emotional states from speech, such as anger, happiness, a neutral (normal) state, sadness, and surprise. The approach uses discrete-duration continuous hidden Markov models (DDCHMMs). Emotional feature parameters are first defined from the input speech signal. In this study we use prosodic parameters, namely pitch, energy, and the derivative of each, which are then used to train the HMMs for recognition. For speaker adaptation, speaker-adapted emotion models based on maximum a posteriori (MAP) estimation are also considered. Simulation results show that the recognition rate for vocal emotion increases gradually as the number of adaptation samples increases.
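To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes per-frame pitch and energy tracks are already available, uses a simple finite difference as the "derivative", and shows the textbook MAP update of a single Gaussian mean, mu = (tau * mu0 + sum of x) / (tau + n). The function names, the prior weight `tau`, and the hard assignment of every sample to one Gaussian (instead of HMM state-occupancy weights) are all illustrative assumptions.

```python
from typing import List, Sequence


def delta(track: Sequence[float]) -> List[float]:
    """Finite-difference derivative of a per-frame track.

    Central difference for interior frames, one-sided at both ends.
    (Assumption: the paper does not specify its derivative formula here.)
    """
    n = len(track)
    if n < 2:
        return [0.0] * n
    out = [track[1] - track[0]]
    for t in range(1, n - 1):
        out.append((track[t + 1] - track[t - 1]) / 2.0)
    out.append(track[-1] - track[-2])
    return out


def prosodic_features(pitch: Sequence[float],
                      energy: Sequence[float]) -> List[List[float]]:
    """Build one [pitch, energy, d_pitch, d_energy] vector per frame."""
    if len(pitch) != len(energy):
        raise ValueError("pitch and energy must have the same number of frames")
    return [list(frame)
            for frame in zip(pitch, energy, delta(pitch), delta(energy))]


def map_adapt_mean(prior_mean: Sequence[float],
                   samples: Sequence[Sequence[float]],
                   tau: float = 10.0) -> List[float]:
    """MAP estimate of a Gaussian mean with a prior centred on prior_mean.

    tau is a hypothetical prior weight: with no samples the prior mean is
    returned, and with many samples the estimate approaches the sample mean.
    """
    n = len(samples)
    dim = len(prior_mean)
    totals = [sum(x[d] for x in samples) for d in range(dim)]
    return [(tau * prior_mean[d] + totals[d]) / (tau + n) for d in range(dim)]


if __name__ == "__main__":
    feats = prosodic_features([100.0, 110.0, 130.0, 120.0],
                              [0.5, 0.7, 0.6, 0.4])
    print(feats[1])
    # The adapted mean moves from the prior (0.0) toward the speaker data (10.0)
    # as the number of adaptation samples grows.
    for n in (0, 10, 90):
        print(n, map_adapt_mean([0.0], [[10.0]] * n, tau=10.0))
```

Under these assumptions, the adapted mean shifts steadily from the speaker-independent prior toward the new speaker's data as adaptation samples accumulate, which is consistent with the reported trend that recognition rates rise with the number of adaptation samples.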
