Tweets analysis using a Dynamic Topic Modeling : Focusing on the 2019 Koreas-US DMZ Summit

트윗의 타임 시퀀스를 활용한 DTM 분석 : 2019 남북미정상회동 이벤트를 중심으로

  • Ko, EunJi (Division of Digital Media, Ewha Womans University) ;
  • Choi, SunYoung (Graduate School of Communication & Arts, YonSei University)
  • Received : 2021.01.08
  • Accepted : 2021.01.16
  • Published : 2021.02.28

Abstract

This study collected tweets about the 2019 Koreas-US DMZ Summit together with their timestamps and analyzed them with Dynamic Topic Modeling (DTM), a sequential topic modeling method. On microblogging services such as Twitter, a single event generates large volumes of unstructured data in which news and opinion are mixed, and information and reactions are produced in the same message format. To grasp a topic trend and its contextual meaning, the analysis must therefore reflect the sequential nature of the data. Latent Dirichlet Allocation (LDA) was first evaluated using the topic coherence score, and DTM was then computed. This yielded 30 topics related to news reports and opinions, with the occurrence probability of each topic and its keywords evolving dynamically over time. The study concludes that DTM is a suitable model for analyzing, over time, the integrated topic trends that a specific event produces across society.
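The two preparatory steps named in the abstract, ordering tweets by time and scoring topics for coherence, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy tweets are invented, the daily slice width is arbitrary, and UMass coherence stands in for whichever coherence measure the study actually used, which the abstract does not specify. The helper names `time_slices` and `umass_coherence` are hypothetical.

```python
import math
from collections import Counter
from datetime import datetime
from itertools import combinations

# Invented toy data: (timestamp, tokens). Not taken from the study's corpus.
tweets = [
    ("2019-06-30 15:45", ["dmz", "summit", "meeting", "leaders"]),
    ("2019-06-30 16:10", ["leaders", "meeting", "handshake"]),
    ("2019-07-01 09:00", ["summit", "news", "report"]),
    ("2019-07-01 10:30", ["dmz", "summit", "peace"]),
    ("2019-07-02 08:15", ["peace", "talks", "opinion"]),
]


def time_slices(timestamps, fmt="%Y-%m-%d %H:%M"):
    """Return the number of documents per calendar day, in chronological order.

    Sequential topic models typically need this list so they know where
    one time slice ends and the next begins (daily slices are an assumption).
    """
    counts = Counter(datetime.strptime(t, fmt).date() for t in timestamps)
    return [counts[day] for day in sorted(counts)]


def umass_coherence(top_words, docs):
    """UMass coherence of one topic's ranked top words.

    Sums log((D(w_i, w_j) + 1) / D(w_j)) over all pairs where w_j is ranked
    above w_i; D counts documents containing all given words.
    Assumes every word in top_words occurs in at least one document.
    """
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for j, i in combinations(range(len(top_words)), 2):  # j is the higher rank
        co_occurrences = doc_freq(top_words[i], top_words[j])
        score += math.log((co_occurrences + 1) / doc_freq(top_words[j]))
    return score


docs = [tokens for _, tokens in tweets]
slices = time_slices([stamp for stamp, _ in tweets])
related = umass_coherence(["summit", "dmz"], docs)
unrelated = umass_coherence(["summit", "opinion"], docs)

print("documents per time slice:", slices)
print("coherence (co-occurring words):", related)
print("coherence (never co-occurring words):", unrelated)
```

Words that appear together in the same tweets score higher than words that never co-occur. Comparing such scores across LDA models with different topic counts is one common way to settle on a number of topics before fitting a sequential model. Libraries such as gensim expose a `time_slice` argument on `LdaSeqModel` that takes a per-slice document count like the list built here. Whether the study used that library is not stated.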
