Arabic Stock News Sentiments Using the Bidirectional Encoder Representations from Transformers Model

  • Eman Alasmari (The Faculty of Computing and Information Technology, King Abdulaziz University)
  • Mohamed Hamdy (The Faculty of Computing and Information Technology, King Abdulaziz University)
  • Khaled H. Alyoubi (The Faculty of Computing and Information Technology, King Abdulaziz University)
  • Fahd Saleh Alotaibi (The Faculty of Computing and Information Technology, King Abdulaziz University)
  • Received : 2024.02.05
  • Published : 2024.02.29

Abstract

Sentiment analysis (SA) of stock market news aims to identify the attitude that news published on official platforms expresses toward companies' stocks. It supports investment decisions and analysts' evaluations. However, research on Arabic SA is limited compared with research on English SA because of the complexity of the Arabic language and the scarcity of Arabic corpora. This paper develops a sentiment classification model that predicts the polarity of Arabic stock news in microblogs. It also extracts the reasons behind each polarity label, expressed as the main economic causes or aspects, on the basis of semantic units. To this end, the paper presents an Arabic SA approach based on the logistic regression model and the Bidirectional Encoder Representations from Transformers (BERT) model. The proposed model classifies articles as positive, negative, or neutral. It was trained on data collected from an official Saudi stock market article platform, which were then preprocessed and labeled. In addition, the economic reasons for each article's polarity were investigated on the basis of semantic units and grouped into seven economic aspects. The supervised BERT model achieved 88% accuracy in article sentiment classification, and the unsupervised mean Word2Vec encoder achieved 80% accuracy in economic-aspect clustering. Predicting the polarity of Arabic stock market news together with its economic reasons would benefit the stock SA field.
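
The idea behind a mean word-vector encoder can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors, the aspect names, and the nearest-centroid assignment by cosine similarity are all assumptions made for illustration, and only three aspects are shown instead of the seven used in the paper. A real system would load trained Word2Vec embeddings and would derive its clusters from the data.

```python
from math import sqrt

# Hypothetical toy word vectors (assumption); real Word2Vec vectors
# are learned from a corpus and have far more dimensions.
TOY_VECTORS = {
    "profit": [0.9, 0.1, 0.0],
    "dividend": [0.8, 0.2, 0.1],
    "loss": [0.1, 0.9, 0.0],
    "debt": [0.2, 0.8, 0.1],
    "merger": [0.0, 0.1, 0.9],
}


def mean_encode(tokens, vectors):
    """Average the vectors of known tokens; return None if none is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_aspect(doc_vec, centroids):
    """Return the name of the centroid most similar to doc_vec."""
    return max(centroids, key=lambda name: cosine(doc_vec, centroids[name]))


# Hypothetical aspect centroids (assumption), built from seed words.
ASPECT_CENTROIDS = {
    "earnings": mean_encode(["profit", "dividend"], TOY_VECTORS),
    "liabilities": mean_encode(["loss", "debt"], TOY_VECTORS),
    "corporate_actions": mean_encode(["merger"], TOY_VECTORS),
}

if __name__ == "__main__":
    doc = mean_encode(["profit", "rises", "dividend"], TOY_VECTORS)
    print(nearest_aspect(doc, ASPECT_CENTROIDS))
```

Tokens missing from the vocabulary are skipped instead of being mapped to a zero vector, so that unknown words do not pull the document vector toward the origin.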
