Attention-based CNN-BiGRU for Bengali Music Emotion Classification

  • Subhasish Ghosh (Department of Computer Science and Engineering, BGC Trust University Bangladesh)
  • Omar Faruk Riad (Department of Computer Science and Engineering, BGC Trust University Bangladesh)
  • Received: 2023.09.05
  • Published: 2023.09.30

Abstract

Deep learning models, particularly CNNs and RNNs, are frequently used for Bengali music emotion classification, but previous research has suffered from low accuracy and overfitting. In this research, we propose an attention-based Conv1D and BiGRU model for classifying emotions in our Bengali music dataset, and comparative experiments show that it classifies emotions more accurately. Preprocessing of the WAV audio uses MFCCs (mel-frequency cepstral coefficients). Two Conv1D layers extract contextual features and reduce the dimensionality of the feature space, and dropout is used to counter overfitting. Two bidirectional GRU (BiGRU) layers then update the past and future emotion representations of the output from the Conv1D layers. The two BiGRU layers are connected to an attention mechanism that assigns different attention weights to the MFCC feature vectors; this attention mechanism increased the accuracy of the proposed classification model. Finally, a dense, fully connected layer with softmax activation classifies the resulting vector into four emotion classes: Angry, Happy, Relax, and Sad. The proposed Conv1D+BiGRU+Attention model classifies emotions in the Bengali music dataset more efficiently than baseline methods, reaching a performance of 95% on our dataset.
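The last two stages described in the abstract, attention over the recurrent outputs followed by a softmax layer over four emotion classes, can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The scoring function (a dot product with a single `query` vector), the function names, and all numeric values are assumptions made for illustration; the abstract does not state which attention formulation or layer sizes were used.

```python
import math

EMOTIONS = ["Angry", "Happy", "Relax", "Sad"]  # class names from the abstract


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_pool(frames, query):
    """Weight each time step by softmax(frame . query), then sum the frames.

    `frames` stands in for the BiGRU outputs (one vector per time step).
    `query` is a hypothetical learned vector, not taken from the paper.
    """
    weights = softmax([dot(f, query) for f in frames])
    dim = len(frames[0])
    context = [sum(w * f[i] for w, f in zip(weights, frames)) for i in range(dim)]
    return context, weights


def classify(context, class_weights, class_biases):
    """Dense layer with softmax activation over the four emotion classes."""
    logits = [dot(row, context) + b for row, b in zip(class_weights, class_biases)]
    probs = softmax(logits)
    return EMOTIONS[probs.index(max(probs))], probs


# Toy inputs, made up for illustration only (no trained weights are implied).
frames = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.4]]
query = [1.0, -1.0]
class_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
class_biases = [0.0, 0.0, 0.0, 0.0]

context, weights = attention_pool(frames, query)
label, probs = classify(context, class_weights, class_biases)
print(label, [round(w, 3) for w in weights])
```

The attention weights sum to one, so the pooled `context` vector is a weighted average of the time steps; steps that score higher against `query` contribute more to the final classification. In a trained model the query and dense-layer weights would be learned jointly with the Conv1D and BiGRU layers.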
