An Ontology-Based Labeling of Influential Topics Using Topic Network Analysis

  • Kim, Hyon Hee (Dept. of Statistics and Information Science, Dongduk Women's University)
  • Rhee, Hey Young (Dept. of Library and Information Science, Dongduk Women's University)
  • Received : 2018.05.29
  • Accepted : 2018.07.06
  • Published : 2019.10.31

Abstract

In this paper, we present an ontology-based approach to labeling influential topics in scientific articles. First, to identify influential topics, topic modeling is performed on the articles, and social network analysis is then applied to the selected topic models. The data set consists of abstracts of data mining research papers published over the 20 years from 1995 to 2015. Second, to interpret and explain the selected influential topics, the UniDM ontology is constructed from Wikipedia and serves as a concept hierarchy for the topic models. Our experimental results show that the subjects of data management and queries belong to the topic most interrelated with other topics, followed by the subjects of recommender systems and text mining. In addition, the subjects of recommender systems and context-aware systems belong to the most influential topic, and the subject of the k-nearest neighbor classifier belongs to the topic closest to the other topics. The proposed framework provides a general model for interpreting topics in topic models, which helps overcome ambiguous and arbitrary interpretation of topics in topic modeling.
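
The topic network step described above can be illustrated with a minimal sketch. The abstract does not state which network measures were used, so this example assumes that "most interrelated" corresponds to degree centrality and "closest" to closeness centrality. The document-topic weights, the threshold, and all function names are hypothetical and chosen only for illustration; they are not the authors' data or implementation.

```python
from collections import deque
from itertools import combinations

# Hypothetical document-topic weights (rows: documents, columns: topics 0..3).
# In practice these would come from a fitted topic model such as LDA.
doc_topic = [
    [0.60, 0.30, 0.10, 0.00],
    [0.50, 0.00, 0.40, 0.10],
    [0.10, 0.50, 0.40, 0.00],
    [0.45, 0.00, 0.05, 0.50],
]
THRESHOLD = 0.25  # assumed cutoff: a topic counts as present at or above this weight


def build_topic_network(doc_topic, threshold):
    """Link two topics whenever both are present in the same document."""
    n_topics = len(doc_topic[0])
    adjacency = {t: set() for t in range(n_topics)}
    for row in doc_topic:
        present = [t for t, w in enumerate(row) if w >= threshold]
        for a, b in combinations(present, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other topics each topic is directly linked to."""
    n = len(adjacency)
    return {t: len(nbrs) / (n - 1) for t, nbrs in adjacency.items()}


def closeness_centrality(adjacency):
    """Reachable topics divided by the sum of shortest-path lengths (BFS)."""
    scores = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


network = build_topic_network(doc_topic, THRESHOLD)
degree = degree_centrality(network)
closeness = closeness_centrality(network)

# Rank topics from most to least interrelated under the degree assumption.
for topic in sorted(degree, key=degree.get, reverse=True):
    print(f"topic {topic}: degree={degree[topic]:.2f}, closeness={closeness[topic]:.2f}")
```

In this toy network, topic 0 co-occurs with every other topic, so it ranks first on both measures. Labeling such a topic would then be a separate step that maps its top terms onto a concept hierarchy such as the UniDM ontology.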

Keywords

Data Mining Ontology; Labeling of Topic Models; Ontology-based Interpretation of Topics; Topic Network Analysis

Acknowledgement

Supported by: Dongduk Women's University
