Emotion Recognition Algorithm Based on Minimum Classification Error incorporating Multi-modal System

  • Lee, Kye-Hwan (Department of Electronics Engineering, Inha University) ;
  • Chang, Joon-Hyuk (Department of Electronics Engineering, Inha University)
  • Published: 2009.07.25

Abstract

We propose an effective emotion recognition algorithm based on the minimum classification error (MCE) criterion incorporating a multi-modal system. Emotion recognition is performed with Gaussian mixture models (GMMs), applying the MCE method to the resulting log-likelihoods. In particular, the proposed technique fuses feature vectors derived from the voice signal and from the galvanic skin response (GSR) measured by a wearable body sensor, and discriminative weight training is used to apply optimized weights to the feature vectors when identifying the principal emotions, thereby improving performance. The experimental results indicate that the proposed MCE-based approach incorporating the multi-modal system outperforms the conventional approach.
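To make the recognition step concrete, here is a minimal sketch, assuming one GMM per emotion class and per modality (voice and GSR), with classification by a weighted sum of the two log-likelihood scores. The emotion labels, helper names (`train_gmms`, `classify`), mixture order, and feature arrays are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: per-class GMMs for each modality, score-level fusion.
import numpy as np
from sklearn.mixture import GaussianMixture

EMOTIONS = ["neutral", "joy", "sadness", "anger"]   # hypothetical label set

def train_gmms(feats_by_class, n_components=4):
    """Fit one diagonal-covariance GMM per emotion class for one modality.
    feats_by_class maps an emotion label to an (n_samples, n_dims) array."""
    return {c: GaussianMixture(n_components=n_components,
                               covariance_type="diag",
                               random_state=0).fit(X)
            for c, X in feats_by_class.items()}

def classify(voice_gmms, gsr_gmms, x_voice, x_gsr, w=(0.5, 0.5)):
    """Pick the emotion maximizing the fused log-likelihood
    g_c = w0 * log p(x_voice | c) + w1 * log p(x_gsr | c)."""
    scores = {c: w[0] * voice_gmms[c].score_samples(x_voice[None, :])[0]
                 + w[1] * gsr_gmms[c].score_samples(x_gsr[None, :])[0]
              for c in EMOTIONS}
    return max(scores, key=scores.get)
```

In this reading, the fusion weights `w` act on the per-modality log-likelihood scores; equal weights correspond to plain score-level fusion, and the discriminatively trained weights replace them at test time.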
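The discriminative weight training itself can be sketched as generalized probabilistic descent (GPD) on the sigmoid-smoothed MCE loss of Juang and Katagiri, the standard formulation of MCE training. The smoothing parameters (`gamma`, `eta`), learning rate, epoch count, and the non-negativity projection below are assumptions for illustration, not values from the paper.

```python
import numpy as np

def mce_train_weights(voice_ll, gsr_ll, labels, epochs=100, lr=1e-2,
                      gamma=1.0, eta=5.0, seed=0):
    """Learn the two fusion weights by GPD on the sigmoid-smoothed MCE loss.
    voice_ll and gsr_ll are (N, M) arrays of per-utterance, per-class GMM
    log-likelihoods; labels holds the true class index of each utterance."""
    rng = np.random.default_rng(seed)
    N, M = voice_ll.shape
    w = np.array([0.5, 0.5])                        # start from equal weights
    for _ in range(epochs):
        for i in rng.permutation(N):
            L = np.stack([voice_ll[i], gsr_ll[i]])  # (2, M) modality scores
            g = w @ L                               # fused discriminants g_c(x; w)
            k = labels[i]
            others = [c for c in range(M) if c != k]
            z = eta * g[others]
            zmax = z.max()
            p = np.exp(z - zmax)
            G = (zmax + np.log(p.sum()) - np.log(M - 1)) / eta  # soft-max of rivals
            p /= p.sum()                            # softmax over competing classes
            d = -g[k] + G                           # misclassification measure d_k
            ell = 1.0 / (1.0 + np.exp(-gamma * d))  # sigmoid MCE loss
            grad = gamma * ell * (1.0 - ell) * (-L[:, k] + L[:, others] @ p)
            w = np.maximum(w - lr * grad, 0.0)      # GPD step; keep weights >= 0
    return w / w.sum()                              # normalized fusion weights
```

Because the fused discriminant is linear in `w`, the gradient reduces to a softmax-weighted combination of the competing classes' modality scores; the returned weights would replace the equal weights in `classify` above.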
