Analysis of English abstracts in Journal of the Korean Data & Information Science Society using topic models and social network analysis

Kim, Gyuha (김규하); Park, Cheolyong (박철용)

  • Received: 2014.12.16
  • Reviewed: 2015.01.10
  • Published: 2015.01.31

Abstract

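This paper uses text mining techniques to analyze the English abstracts of papers published in the Journal of the Korean Data & Information Science Society. First, term-document matrices are generated by several methods and visualized through social network analysis. Latent Dirichlet allocation (LDA) and the correlated topic model (CTM) are then used to extract topics. Finally, the performance of the topic extraction models is compared by entropy across the number of topics and the method used to generate the term-document matrix.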
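
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions: the documents are toy stand-ins for abstracts, tokenization is plain whitespace splitting, the network is a simple term co-occurrence graph, and the entropy criterion is taken to be the mean Shannon entropy of per-document topic proportions (one common choice; the abstract does not specify the exact definition). The function names are hypothetical.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (terms, matrix), where matrix[i][j] counts term i in document j."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for c in counts] for t in terms]
    return terms, matrix


def cooccurrence(matrix):
    """Term-by-term adjacency: number of documents containing both terms."""
    binary = [[1 if x > 0 else 0 for x in row] for row in matrix]
    n = len(binary)
    return [
        [
            sum(a * b for a, b in zip(binary[i], binary[j])) if i != j else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


def mean_topic_entropy(doc_topic):
    """Mean Shannon entropy (natural log) of per-document topic proportions.

    Assumed criterion: lower values mean each document's topic
    distribution is more concentrated on a few topics.
    """
    def entropy(p):
        return -sum(x * math.log(x) for x in p if x > 0)

    return sum(entropy(p) for p in doc_topic) / len(doc_topic)


# Toy documents standing in for abstracts (hypothetical data).
docs = [
    "bootstrap confidence interval",
    "kernel estimating equation",
    "bootstrap kernel estimate",
]
terms, tdm = term_document_matrix(docs)
adj = cooccurrence(tdm)  # edge weights for a term network
```

In practice the topic proportions passed to `mean_topic_entropy` would come from a fitted LDA or CTM model rather than being written by hand, and the adjacency matrix would be handed to a graph library for drawing.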

Keywords

social network analysis; text mining; topic model; Journal of the Korean Data & Information Science Society

References

  1. Blei, D. M. and Lafferty, J. D. (2006). Dynamic topic models. Proceedings of the 23rd International Conference on Machine Learning, 113-120.
  2. Blei, D. M. and Lafferty, J. D. (2007). A correlated topic model of science. The Annals of Applied Statistics, 1, 17-35. https://doi.org/10.1214/07-AOAS114
  3. Blei, D. M. and Lafferty, J. D. (2009). Topic models. In Text Mining: Classification, Clustering, and Applications, edited by A. N. Srivastava and M. Sahami, Chapman and Hall/CRC, Boca Raton, 71-94.
  4. Blei, D. M., Ng, A. Y. and Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022.
  5. Chung, H. and Han, C. (2013). Conditional bootstrap confidence intervals for classification error rate when a block of observations is missing. Journal of the Korean Data & Information Science Society, 24, 189-200. https://doi.org/10.7465/jkdi.2013.24.1.189
  6. Hornik, K. and Grün, B. (2011). topicmodels: An R package for fitting topic models. Journal of Statistical Software, 40, 1-30.
  7. Huang, J. and Malisiewicz, T. (2006). Correlated topic model details, Technical Report, Carnegie Mellon University, Pittsburgh, PA.
  8. Shim, J., Kim, Y. and Hwang, C. (2013). Generalized kernel estimating equation for panel estimation of small area unemployment rates. Journal of the Korean Data & Information Science Society, 24, 1199-1210. https://doi.org/10.7465/jkdi.2013.24.6.1199

Cited by

  1. Performance analysis of volleyball games using the social network and text mining techniques vol.26, pp.3, 2015, https://doi.org/10.7465/jkdi.2015.26.1.151
  2. Research of Topic Analysis for Extracting the Relationship between Science Data vol.21, pp.1, 2016, https://doi.org/10.7465/jkdi.2015.26.1.151
  3. A study on fractal dimensions of art works vol.27, pp.2, 2016, https://doi.org/10.7465/jkdi.2015.26.1.151
  4. Research Topics in Industrial Engineering 2001~2015 vol.42, pp.6, 2016, https://doi.org/10.7465/jkdi.2015.26.1.151