
Emotion Recognition of Low Resource (Sindhi) Language Using Machine Learning

  • Received : 2021.08.05
  • Published : 2021.08.30

Abstract

One of the most active areas of research in affective computing and signal processing is emotion recognition. This paper proposes emotion recognition for the low-resource Sindhi language. The uniqueness of this work is that it examines the emotions of a language for which no publicly accessible dataset currently exists. The proposed effort provides a dataset named MAVDESS (Mehran Audio-Visual Database of Emotional Speech in Sindhi) to the academic community for Sindhi, a significant language spoken mainly in Pakistan for which little generic machine-learning data is available. Furthermore, the various emotions of the Sindhi language in MAVDESS have been analyzed and annotated using features such as pitch, volume, and base, together with toolkits such as openSMILE and scikit-learn, and classification schemes such as logistic regression (LR), support vector classification (SVC), decision tree (DT), and k-nearest neighbors (KNN), implemented in Python to train the models. The dataset can be accessed at https://doi.org/10.5281/zenodo.5213073.
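The abstract describes the feature-extraction and classification pipeline only at a high level, so a brief illustration may help. The following is a minimal sketch, not the paper's actual implementation: it assumes the MAVDESS recordings are available locally as WAV files whose names encode the emotion label (the `mavdess/` folder and the `<emotion>_*.wav` naming scheme are hypothetical), extracts eGeMAPS functionals, which include pitch- and loudness-related descriptors, with the openSMILE Python wrapper, and cross-validates the four classifiers named above (LR, SVC, DT, KNN) with scikit-learn.

```python
# Sketch: openSMILE feature extraction + scikit-learn classification for
# MAVDESS-style labelled audio. File layout and label parsing are assumptions.
import glob
import os

import opensmile
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# eGeMAPS functionals cover pitch (F0), loudness/volume and related descriptors.
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)

X, y = [], []
for wav in glob.glob("mavdess/*.wav"):                 # hypothetical dataset folder
    X.append(smile.process_file(wav).iloc[0].values)   # one feature vector per clip
    y.append(os.path.basename(wav).split("_")[0])      # assumes "<emotion>_xxx.wav"

classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "DT": DecisionTreeClassifier(),
    "KNN": KNeighborsClassifier(),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```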
