Exploring the Sentiment Analysis of Electric Vehicles Social Media Data by Using Feature Selection Methods

Korean title: A Study on the Sentiment Analysis of Electric Vehicle Social Media Data Using Feature Selection Methods

  • Costello, Francis Joseph (SKK Business School, Sungkyunkwan University)
  • Lee, Kun Chang (Global Business Administration/Dept of Health Sciences & Technology, SAIHST (Samsung Advanced Institute for Health Sciences & Technology) Sungkyunkwan University)
  • Received : 2019.01.02
  • Accepted : 2020.02.20
  • Published : 2020.02.28

Abstract

This study presents a recently collected social media data set on electric vehicles (EVs) and applies sentiment analysis (SA) to it in order to gain insights into public opinion. We use two methods to analyze the public's sentiment on EVs. First, we use an SA tool to extract the sentiment of comments. Next, we label the data with the extracted sentiments and classify them. During classification we encountered the problem of high dimensionality, so we explored feature selection (FS) models to reduce the data set's dimensionality. Three FS models (Chi-Squared, Information Gain, and ReliefF) showed the most promising results when used alongside logistic regression and support vector machine classification algorithms. The contribution of this paper is a real-world example of social media text analytics that can be adopted in many other areas of research and business. Moving forward, researchers can use the methodological approach in this paper to refine and improve their own use cases in text analytics.
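As an illustration of the feature selection step, the following is a minimal sketch of chi-squared scoring, one of the three FS methods named in the abstract. It is not the authors' implementation: the abstract does not state which tools were used, and the toy comments, their labels, and the function names `chi2_score` and `select_top_k` are hypothetical. Each term is scored by how strongly its presence is associated with the sentiment label, and only the highest-scoring terms are kept as features.

```python
# Hypothetical toy comments with hypothetical sentiment labels
# (illustrative only; not the data set used in the study).
DOCS = [
    ("love quiet ride fast charging", "pos"),
    ("great range love design", "pos"),
    ("charging network great", "pos"),
    ("battery died charging slow", "neg"),
    ("slow charging poor range", "neg"),
    ("poor battery expensive", "neg"),
]


def chi2_score(term, docs, target="pos"):
    """Chi-squared statistic of the 2x2 table (term present?) x (label == target)."""
    a = b = c = d = 0
    for text, label in docs:
        present = term in text.split()
        if present and label == target:
            a += 1  # term present, target class
        elif present:
            b += 1  # term present, other class
        elif label == target:
            c += 1  # term absent, target class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_top_k(docs, k):
    """Return the k terms with the highest chi-squared score (ties broken alphabetically)."""
    vocab = sorted({word for text, _ in docs for word in text.split()})
    ranked = sorted(vocab, key=lambda w: (-chi2_score(w, docs), w))
    return ranked[:k]


if __name__ == "__main__":
    for term in select_top_k(DOCS, 5):
        print(term, round(chi2_score(term, DOCS), 2))
```

In this toy example a term such as "charging", which appears equally often in positive and negative comments, scores zero and is dropped, while terms that occur in only one class are retained. The reduced vocabulary would then be passed to a classifier such as logistic regression or a support vector machine.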
