Sentiment Analysis of Foot-and-Mouth Disease Using Tweet Text-Mining Technique

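  • 채희찬 (Department of Computer and Information Science, Korea University) ;
  • 이종욱 (Department of Computer and Information Science, Korea University) ;
  • 최윤아 (Department of Computer and Information Science, Korea University) ;
  • 박대희 (Department of Computer and Information Science, Korea University) ;
  • 정용화 (Department of Computer and Information Science, Korea University)
  • Received : 2018.07.06
  • Reviewed : 2018.08.03
  • Published : 2018.11.30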

Abstract

Foot-and-mouth disease (FMD) inflicts enormous damage on the domestic livestock industry and related industries every year. Although various academic studies on FMD are under way, engineering analyses of the social repercussions of FMD outbreaks remain very limited. In this study, we propose a systematic methodology that uses text mining to analyze the general public's emotional responses to FMD. The proposed system first collects FMD-related tweets posted on Twitter and classifies their polarity using a deep-learning technique. Second, it extracts keywords from the tweets using LDA, a representative topic-modeling technique, and constructs a co-occurrence keyword network for each polarity from the extracted keywords. Third, it analyzes the social repercussions of FMD for each crisis-stage period through the keyword networks. As a case study, we analyzed changes in public sentiment about the FMD outbreaks that occurred in Korea from July 2010 to December 2011.
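The keyword-network step described above can be illustrated with a minimal sketch. It assumes that each tweet already carries a polarity label and a list of extracted keywords; the deep-learning classifier and the LDA step are not reproduced here. The function names (`build_cooccurrence_network`, `networks_by_polarity`), the `min_weight` threshold, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(tweets_keywords, min_weight=1):
    """Count how often each pair of keywords appears in the same tweet.

    tweets_keywords: iterable of keyword lists, one list per tweet.
    Returns a dict mapping (keyword_a, keyword_b) -> co-occurrence count.
    Each pair is stored in sorted order so (a, b) and (b, a) are merged.
    """
    edges = Counter()
    for keywords in tweets_keywords:
        unique = sorted(set(keywords))  # ignore repeats within one tweet
        for pair in combinations(unique, 2):
            edges[pair] += 1
    # Keep only edges that reach the (assumed) minimum weight.
    return {pair: w for pair, w in edges.items() if w >= min_weight}


def networks_by_polarity(labeled_tweets, min_weight=1):
    """Group tweets by polarity label and build one network per label."""
    grouped = {}
    for polarity, keywords in labeled_tweets:
        grouped.setdefault(polarity, []).append(keywords)
    return {
        polarity: build_cooccurrence_network(kws, min_weight)
        for polarity, kws in grouped.items()
    }


# Hypothetical toy data: (polarity, keywords) pairs, not from the paper.
sample = [
    ("negative", ["FMD", "culling", "farm"]),
    ("negative", ["FMD", "culling", "vaccine"]),
    ("positive", ["FMD", "vaccine", "recovery"]),
]
networks = networks_by_polarity(sample)
print(networks["negative"][("FMD", "culling")])  # 2
```

Edge weights here are raw co-occurrence counts within a single tweet. Other weightings are equally plausible, and the abstract does not say which one the authors used.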

Fig. 1. System Architecture

Fig. 2. LSTM-CNN Model

Fig. 3. Trend of FMD and FMD-Tweet

Fig. 4. Trend of FMD-Tweet Polarity

Fig. 5. Keyword Network in the Early Period of FMD

Fig. 6. Keyword Network in the Serious Period of FMD

Fig. 7. Keyword Network in the Termination Period of FMD

Table 1. Word Filtering and Converting Rules

Table 2. Example of Synonyms Substitution

Table 3. Training Parameters
