Emotion Recognition using Pitch Parameters of Speech

음성의 피치 파라메터를 사용한 감정 인식

  • Received : 2013.11.19
  • Accepted : 2015.04.03
  • Published : 2015.06.25


This paper studied various parameter extraction methods using pitch information of speech for the development of an emotion recognition system. For this purpose, pitch parameters were extracted from a Korean speech database containing various emotions, using statistical information and numerical analysis techniques. A GMM-based emotion recognition system was used to compare the performance of the pitch parameters, and a sequential feature selection method was used to select the parameters showing the best emotion recognition performance. Experiments on recognizing four emotions showed a 63.5% recognition rate using a combination of 15 out of 56 pitch parameters. Experiments on detecting the presence of emotion showed an 80.3% recognition rate using a combination of 14 parameters.
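The pipeline described above (per-class GMMs scored on pitch features, with sequential forward selection of the feature subset) can be sketched as follows. This is a hypothetical illustration, not the authors' code: the toy data, feature count, and function names (`fit_gmms`, `forward_select`) are assumptions, and selection is scored on training accuracy for brevity.

```python
# Hypothetical sketch: per-class GMM classifier with greedy sequential
# forward feature selection, loosely following the paper's pipeline.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

def fit_gmms(X, y, n_components=2):
    """Fit one GMM per emotion class on the selected feature columns."""
    return {c: GaussianMixture(n_components, covariance_type="diag",
                               random_state=0).fit(X[y == c])
            for c in np.unique(y)}

def predict(gmms, X):
    """Assign each utterance to the class whose GMM gives the highest log-likelihood."""
    classes = sorted(gmms)
    ll = np.column_stack([gmms[c].score_samples(X) for c in classes])
    return np.array(classes)[ll.argmax(axis=1)]

def forward_select(X, y, max_feats):
    """Sequential forward selection: greedily add the feature that most
    improves recognition accuracy until max_feats are chosen."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < max_feats:
        scores = []
        for f in remaining:
            cols = selected + [f]
            gmms = fit_gmms(X[:, cols], y)
            scores.append((np.mean(predict(gmms, X[:, cols]) == y), f))
        _, best_f = max(scores)
        selected.append(best_f)
        remaining.remove(best_f)
    return selected

# Toy stand-in for pitch statistics: 3 informative features out of 6,
# two "emotion" classes (the paper uses 56 pitch parameters and 4 emotions).
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 6))
X[:, :3] += y[:, None] * 2.0   # shift the informative features by class

feats = forward_select(X, y, max_feats=3)
print(feats)
```

In the paper this procedure selects 15 of the 56 pitch parameters for four-emotion recognition and 14 for emotion-presence detection.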


Emotion Recognition; Speech Parameter; Pitch



Cited by

  1. A Low Bit Rate Speech Coder Based on the Inflection Point Detection, vol. 15, no. 4, 2015.
  2. A Fixed Rate Speech Coder Based on the Filter Bank Method and the Inflection Point Detection, vol. 16, no. 4, 2016.


Supported by: Kunsan National University (군산대학교)