A novel classification approach based on Naïve Bayes for Twitter sentiment analysis

Song, Junseok; Kim, Kyung Tae; Lee, Byungjun; Kim, Sangyoung; Youn, Hee Yong

doi:10.3837/tiis.2017.06.011

KSII Transactions on Internet and Information Systems (TIIS)

Volume 11 Issue 6 / Pages 2996-3011 / 2017 / 1976-7277(pISSN) / 1976-7277(eISSN)

Korean Society for Internet Information (한국인터넷정보학회)

Song, Junseok (College of Software, Sungkyunkwan University) ;
Kim, Kyung Tae (College of Software, Sungkyunkwan University) ;
Lee, Byungjun (College of Software, Sungkyunkwan University) ;
Kim, Sangyoung (College of Software, Sungkyunkwan University) ;
Youn, Hee Yong (College of Software, Sungkyunkwan University)

Received : 2016.11.25
Accepted : 2017.03.13
Published : 2017.06.30

https://doi.org/10.3837/tiis.2017.06.011

Abstract

With the rapid growth of web technology and the spread of smart devices, social networking services (SNS) are widely used. As a result, huge amounts of data are generated from SNS such as Twitter, and sentiment analysis of SNS data is very important for various applications and services. In existing sentiment analysis based on the Naïve Bayes algorithm, the same number of attributes is usually employed to estimate the weight of each class. Moreover, uncountable and meaningless attributes are included. This decreases the accuracy of sentiment analysis. In this paper, two methods are proposed to resolve these issues; they reflect the difference between the numbers of positive and negative words in calculating the weights, and eliminate insignificant words in the feature selection step, using the Multinomial Naïve Bayes (MNB) algorithm. Performance comparison demonstrates that the proposed scheme significantly increases accuracy compared with the existing Multivariate Bernoulli Naïve Bayes (BNB) algorithm and the MNB scheme.
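
For orientation, the sketch below shows a plain multinomial Naïve Bayes sentiment classifier with Laplace smoothing, i.e. the kind of MNB baseline the abstract refers to. It is not the authors' proposed scheme: the class-dependent weighting is not reproduced, and the `min_count` threshold is only a hypothetical stand-in for a feature selection step. The function names and the toy tweets are likewise assumptions made for illustration.

```python
import math
from collections import Counter


def train_mnb(docs, labels, min_count=1):
    """Train a multinomial Naive Bayes model with Laplace (add-one) smoothing.

    Words seen fewer than `min_count` times in the whole corpus are dropped.
    This is a crude, hypothetical feature filter, not the paper's criterion.
    """
    total = Counter(w for d in docs for w in d.split())
    vocab = {w for w, c in total.items() if c >= min_count}
    classes = sorted(set(labels))
    doc_counts = Counter(labels)
    word_counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        word_counts[y].update(w for w in d.split() if w in vocab)

    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for c in classes:
        # Class prior estimated from document frequencies.
        model["log_prior"][c] = math.log(doc_counts[c] / len(docs))
        # Smoothed per-class word likelihoods.
        denom = sum(word_counts[c].values()) + len(vocab)
        model["log_lik"][c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior for `doc`."""
    scores = {}
    for c, prior in model["log_prior"].items():
        scores[c] = prior + sum(
            model["log_lik"][c][w] for w in doc.split() if w in model["vocab"]
        )
    return max(scores, key=scores.get)


# Toy, made-up training data for illustration only.
docs = ["good great fun", "great happy good", "bad awful sad", "sad bad boring"]
labels = ["pos", "pos", "neg", "neg"]
model = train_mnb(docs, labels)
```

Calling `predict(model, "good fun")` scores the tweet against each class and returns the label with the larger log posterior; raising `min_count` shrinks the vocabulary before the likelihoods are estimated.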

Cited by

Fuzzy Ontology and LSTM-Based Text Mining: A Transportation Network Monitoring System for Assisting Travel, vol.19, no.2, 2019, https://doi.org/10.3390/s19020234
SAEP: A Surrounding-Aware Individual Emotion Prediction Model Combined with T-LSTM and Memory Attention Mechanism, vol.11, no.23, 2021, https://doi.org/10.3390/app112311111
