Multi-Label Classification Approach to Effective Aspect Mining

  • Jong Yoon Won (SKK Business School, Sungkyunkwan University)
  • Kun Chang Lee (Global Business Administration/Department of Health Sciences & Technology, SAIHST (Samsung Advanced Institute for Health Sciences & Technology), Sungkyunkwan University)
  • Received : 2020.01.24
  • Reviewed : 2020.04.10
  • Published : 2020.08.31

Abstract

Most recent sentiment classification studies rely on single-label classification, in which there is only one output variable, and many of them look for a single polarity value (positive or negative). However, a single sentence often carries several meanings at once, and this is especially true of emotions and opinions: a review comment by one person usually covers several topics, or aspects, so it is better to classify sentiment for each aspect separately. This paper has two purposes. First, based on the fact that one sentence can contain various aspects, we perform aspect mining to classify the sentiment of the sentence for each aspect. Second, we apply a multi-label classification method, which analyzes two or more dependent variables (output values) at once. In doing so, we aim to improve on prior sentiment classification research that relied on single-label classifiers only. To validate the proposed approach, we collected online review comments about musical performances from a domestic online platform and applied the multi-label classification approach to the dataset. The analysis extracted the sentiment for each aspect within a sentence and produced significant results, and the potential of the proposed approach is discussed.
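The contrast between single-label and multi-label classification can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the algorithm, so the sketch assumes binary relevance (one independent binary classifier per label) with a simple perceptron, and the toy English reviews, the label names such as `actor_pos`, and all function names are hypothetical. Each sentence is mapped to a set of aspect-level sentiment labels rather than to one polarity value.

```python
from collections import defaultdict

# Hypothetical toy data: each review carries one sentiment label per aspect.
TRAIN = [
    ("great actor bad music", {"actor_pos", "music_neg"}),
    ("bad actor great music", {"actor_neg", "music_pos"}),
    ("great actor great music", {"actor_pos", "music_pos"}),
    ("bad actor bad music", {"actor_neg", "music_neg"}),
]
LABELS = ["actor_pos", "actor_neg", "music_pos", "music_neg"]


def features(sentence):
    """Return binary unigram and bigram features plus a bias feature."""
    tokens = sentence.lower().split()
    feats = set(tokens)
    feats.update(f"{a} {b}" for a, b in zip(tokens, tokens[1:]))
    feats.add("__bias__")
    return feats


def train_perceptron(examples, epochs=30):
    """Train one binary perceptron on (feature_set, is_positive) pairs."""
    weights = defaultdict(float)
    for _ in range(epochs):
        mistakes = 0
        for feats, is_positive in examples:
            target = 1 if is_positive else -1
            score = sum(weights[f] for f in feats)
            if target * score <= 0:  # misclassified (or undecided): update
                for f in feats:
                    weights[f] += target
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights


def train_binary_relevance(data, labels):
    """Binary relevance: fit one independent classifier per label."""
    models = {}
    for label in labels:
        examples = [(features(text), label in gold) for text, gold in data]
        models[label] = train_perceptron(examples)
    return models


def predict(models, sentence):
    """Return the set of labels whose classifier scores the sentence above 0."""
    feats = features(sentence)
    return {
        label
        for label, weights in models.items()
        if sum(weights.get(f, 0.0) for f in feats) > 0
    }


MODELS = train_binary_relevance(TRAIN, LABELS)
print(sorted(predict(MODELS, "great actor bad music")))
```

A single-label classifier would have to reduce the first review to one overall polarity, whereas the multi-label output keeps a positive label for the actor and a negative label for the music. Binary relevance ignores correlations between labels; other multi-label methods model them, at the cost of more complexity.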
