A novel classification approach based on Naïve Bayes for Twitter sentiment analysis

  • Received : 2016.11.25
  • Accepted : 2017.03.13
  • Published : 2017.06.30

Abstract

With the rapid growth of web technology and the spread of smart devices, social networking services (SNS) are widely used. As a result, a huge amount of data is generated by SNS such as Twitter, and sentiment analysis of SNS data is important for various applications and services. In existing sentiment analysis based on the Naïve Bayes algorithm, the same number of attributes is usually employed to estimate the weight of each class. Moreover, countless meaningless attributes are included. This decreases the accuracy of sentiment analysis. In this paper, two methods are proposed to resolve these issues: one reflects the difference between the numbers of positive and negative words when calculating the weights, and the other eliminates insignificant words in the feature selection step using the Multinomial Naïve Bayes (MNB) algorithm. A performance comparison demonstrates that the proposed scheme significantly increases accuracy compared to the existing Multivariate Bernoulli Naïve Bayes (BNB) algorithm and the MNB scheme.
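
For orientation, the following is a minimal sketch of a generic Multinomial Naïve Bayes sentiment classifier with a simple feature selection step. It is not the scheme proposed in the paper: the function names (`train_mnb`, `predict`), the `min_count` frequency threshold, the Laplace smoothing, and the toy tweets are all assumptions chosen for illustration. Note that the smoothing denominator uses one shared vocabulary size for both classes, which is the kind of uniform treatment the abstract describes in existing approaches.

```python
import math
from collections import Counter


def train_mnb(docs, labels, min_count=2):
    """Train a multinomial Naive Bayes model with Laplace smoothing.

    Words seen fewer than `min_count` times in the whole corpus are dropped.
    This threshold is a hypothetical stand-in for a feature selection step.
    """
    total = Counter(w for d in docs for w in d.split())
    vocab = {w for w, c in total.items() if c >= min_count}
    classes = sorted(set(labels))
    doc_counts = Counter(labels)
    word_counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        word_counts[y].update(w for w in d.split() if w in vocab)

    model = {"vocab": vocab, "classes": classes, "log_prior": {}, "log_lik": {}}
    for c in classes:
        model["log_prior"][c] = math.log(doc_counts[c] / len(docs))
        # Same vocabulary size in the denominator for every class.
        denom = sum(word_counts[c].values()) + len(vocab)
        model["log_lik"][c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest log posterior for `text`."""
    scores = {}
    for c in model["classes"]:
        score = model["log_prior"][c]
        for w in text.split():
            if w in model["vocab"]:  # words outside the vocabulary are ignored
                score += model["log_lik"][c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Toy training data (invented for this sketch).
docs = ["good great happy", "great good love", "bad awful sad", "awful bad hate"]
labels = ["pos", "pos", "neg", "neg"]
model = train_mnb(docs, labels, min_count=2)
```

With this toy data, words that occur only once (such as `happy` or `hate`) are filtered out before the class likelihoods are estimated, so only `good`, `great`, `bad`, and `awful` contribute to `predict(model, "good great")`.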
