Machine Learning Algorithm Accuracy for Code-Switching Analytics in Detecting Mood

  • Latib, Latifah Abd (Faculty of Communication, Visual Art and Computing, Universiti Selangor) ;
  • Subramaniam, Hema (Department of Software Engineering, Faculty of Computer Science and Information Technology, Universiti Malaya) ;
  • Ramli, Siti Khadijah (Faculty of Communication, Visual Art and Computing, Universiti Selangor) ;
  • Ali, Affezah (School of Liberal Arts & Sciences, Taylor's University) ;
  • Yulia, Astri (Department of Language Education, Faculty of Education and Social Sciences, Universiti Selangor) ;
  • Shahdan, Tengku Shahrom Tengku (School of Education & Human Sciences, Albukhary International University) ;
  • Zulkefly, Nor Sheereen (Faculty of Medicine and Health Sciences, Universiti Putra Malaysia)
  • Received: 2022.09.05
  • Published: 2022.09.30

Abstract

Most social media users now write their online postings in more than one language. Social media analytics therefore needs to be reviewed as code-switching analytics rather than traditional analytics. This paper aims to present comparative evidence on the accuracy of code-switching analytics techniques in analysing the mood state of social media users. We conducted a systematic literature review (SLR) of social media analytics studies that examined the effectiveness of code-switching analytics techniques, guided by one primary question and three sub-questions. The study investigates the computational models used to detect and measure emotional well-being. It focuses primarily on the text of online postings, including extended text analysis, analysis and prediction based on past experiences, and classification of mood following analysis. We used thirty-two (32) papers for our evidence synthesis and identified four main task classifications that could potentially be used in code-switching analytics: determining analytics algorithms, classification techniques, mood classes, and analytics flow. The results showed that CNN-BiLSTM was the machine learning algorithm with the greatest effect on code-switching analytics accuracy, at 83.21%. In addition, analytics accuracy could improve by about 20% when a code-mixing emotion corpus is used, compared with analysis performed on one language. Our meta-analyses showed that a code-mixing emotion corpus was effective in improving mood analytics accuracy. The SLR results point to two apparent gaps in the research field: i) a lack of studies that focus on Malay-English code-mixing analytics and ii) a lack of studies investigating various mood classes via the code-mixing approach.
