Implementation of the Timbre-based Emotion Recognition Algorithm for a Healthcare Robot Application

  • Kong, Jung-Shik (Department of Mechanical Design, Induk University) ;
  • Kwon, Oh-Sang (Department of Automation and Robotics, Kyonggi Institute of Technology) ;
  • Lee, Eung-Hyuk (Department of Electronic Engineering, Korea Polytechnic University)
  • Received : 2009.12.12
  • Published : 2009.12.30

Abstract

Speech signals carry not only information unique to the speaker and to the surrounding acoustic environment, but also information such as emotion and fatigue. Accordingly, research on estimating a speaker's emotional state from speech signals has been ongoing. In this paper, we analyze the Selectable Mode Vocoder (SMV), one of the 3GPP2 standard speech codecs, in order to recognize the speaker's emotion. Based on this analysis, we propose features that are effective for emotion recognition. Using the selected feature vectors, we then develop an emotion recognition algorithm based on the Gaussian Mixture Model (GMM) and verify its performance while varying the number of mixture components.
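
As a rough illustration of the classification stage described above, the sketch below fits one Gaussian mixture model per emotion class and labels an utterance with the class whose model yields the highest log-likelihood, repeating the experiment for several mixture component counts. It uses scikit-learn's GaussianMixture with random placeholder arrays standing in for real SMV-derived feature vectors; the emotion labels, feature dimension, and component counts are assumptions for demonstration, not the authors' actual setup.

```python
# Minimal sketch of GMM-based emotion classification, assuming per-utterance
# feature vectors (e.g., pitch/energy-related features from an SMV front end)
# have already been extracted. SMV feature extraction itself is not shown;
# the arrays below are hypothetical placeholders.
import numpy as np
from sklearn.mixture import GaussianMixture

EMOTIONS = ["neutral", "happy", "sad", "angry"]  # assumed emotion classes


def train_emotion_gmms(train_data, n_components=4):
    """Fit one GMM per emotion class on its training feature vectors."""
    models = {}
    for emotion, features in train_data.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag",
                              random_state=0)
        gmm.fit(features)  # features: array of shape (n_frames, n_dims)
        models[emotion] = gmm
    return models


def classify(models, features):
    """Pick the emotion whose GMM gives the highest total log-likelihood."""
    scores = {e: m.score_samples(features).sum() for e, m in models.items()}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 12  # hypothetical feature dimension
    # Placeholder training data: random vectors in place of real features.
    train = {e: rng.normal(loc=i, size=(200, dim))
             for i, e in enumerate(EMOTIONS)}
    test_utterance = rng.normal(loc=2, size=(50, dim))  # resembles "sad"

    # Vary the number of mixture components, as in the paper's evaluation.
    for k in (2, 4, 8, 16):
        models = train_emotion_gmms(train, n_components=k)
        print(k, classify(models, test_utterance))
```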
