Trend Analysis of Data Mining Research Using Topic Network Analysis

  • Kim, Hyon Hee (Dept. of Statistics & Information Science, Dongduk Women's University) ;
  • Rhee, Hey Young (Dept. of Library & Information Science, Dongduk Women's University)
  • Received : 2016.04.11
  • Accepted : 2016.05.09
  • Published : 2016.05.31

Abstract

In this paper, we propose a topic network analysis approach that integrates topic modeling and social network analysis. We collected 2,039 scientific papers published from 1996 to 2015 in five top journals in the field of data mining, and analyzed them with the proposed approach. To identify topic trends, we performed a time-series analysis of the topic network over four intervals. Our experimental results show that centralization of the topic network is highest from 1996 to 2000, decreases over the next five years, and then increases again. In the last five years, centralization of degree centrality increases, while centralization of betweenness centrality and closeness centrality decreases again. Clustering is also identified as the topic most interrelated with other topics. The topic with the highest degree centrality changes over time: clustering, then web applications, then clustering again, and finally dimensionality reduction. Our approach extracts the interrelationships among topics, which cannot be detected with conventional topic modeling approaches, and reveals topical trends in data mining research.
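The two network measures named in the abstract, degree centrality and centralization of degree centrality, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a topic network in which two topics are linked whenever they are assigned to the same paper, and it uses Freeman's standard formula for degree centralization. The function names and the sample topic assignments are hypothetical.

```python
from itertools import combinations


def build_topic_network(doc_topics):
    """Link two topics whenever they appear together in at least one document.

    Assumption: co-occurrence in a document defines an undirected, unweighted edge.
    """
    adjacency = {}
    for topics in doc_topics:
        unique = sorted(set(topics))
        for topic in unique:
            adjacency.setdefault(topic, set())
        for a, b in combinations(unique, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Normalized degree: number of neighbours divided by (n - 1)."""
    n = len(adjacency)
    if n < 2:
        raise ValueError("need at least two topics")
    return {topic: len(nbrs) / (n - 1) for topic, nbrs in adjacency.items()}


def degree_centralization(adjacency):
    """Freeman degree centralization: 1.0 for a star network, 0.0 when all degrees are equal."""
    n = len(adjacency)
    if n < 3:
        raise ValueError("need at least three topics")
    degrees = [len(nbrs) for nbrs in adjacency.values()]
    max_degree = max(degrees)
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical topic assignments for one time interval (not data from the paper).
example_docs = [
    {"clustering", "classification"},
    {"clustering", "web applications"},
    {"clustering", "dimensionality reduction"},
]
network = build_topic_network(example_docs)
print(degree_centrality(network))
print(degree_centralization(network))
```

In this toy input every paper involves clustering, so the network is a star and the centralization reaches its maximum of 1.0. Running the same functions on the papers of each interval separately would give one centralization score per interval, which is the kind of series the abstract compares.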
