Disambiguation of Author Names Using Co-citation

  • Kang, In-Su (Computer Science and Engineering, Kyungsung University)
  • Received: 2011.04.04
  • Reviewed: 2011.07.14
  • Published: 2011.07.30


Co-citation occurs when two or more studies are cited together by a later study. This paper examines the relationship between co-citation and author disambiguation, the task of clustering same-name author instances that appear in documents into real-world individuals. Co-citation can affect both the procedure and the performance of author disambiguation, because related works by the same person may be co-cited in later studies, whether by others or by the author. This paper describes an automated procedure for gathering co-citation information from Google Scholar and proposes a new clustering algorithm that efficiently combines co-citation information with existing author disambiguation features. Experiments confirmed that co-citation improves the performance of author disambiguation.
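
The idea of using co-citation as evidence that two same-name records belong to one person can be illustrated with a minimal sketch. This is not the clustering algorithm proposed in the paper, which the abstract does not specify; it is a plain single-link grouping under stated assumptions. The function names, the `min_cocitations` threshold, and the toy record and citer identifiers are all hypothetical, and the citing sets are assumed to have been collected already.

```python
from itertools import combinations


def cocitation_count(citers_a, citers_b):
    """Number of later papers that cite both records."""
    return len(set(citers_a) & set(citers_b))


def cluster_by_cocitation(citers, min_cocitations=1):
    """Group same-name records that are co-cited often enough.

    `citers` maps a record id to the set of papers citing it.
    Two records are linked when their co-citation count reaches
    `min_cocitations` (an assumed threshold); linked records are
    merged transitively with a small union-find structure.
    """
    parent = {rec: rec for rec in citers}

    def find(x):
        # Follow parent links to the root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(citers, 2):
        if cocitation_count(citers[a], citers[b]) >= min_cocitations:
            parent[find(a)] = find(b)

    groups = {}
    for rec in citers:
        groups.setdefault(find(rec), set()).add(rec)
    # Sort clusters by their members for a stable output order.
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical toy data: three same-name records and their citing papers.
citers = {
    "p1": {"c1", "c2"},
    "p2": {"c2", "c3"},
    "p3": {"c9"},
}
clusters = cluster_by_cocitation(citers)
print(clusters)
```

In this toy input, `p1` and `p2` share one citing paper and end up in the same group, while `p3` stays alone. A real system would combine this signal with other features such as co-authors or titles rather than rely on it alone.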



Research project host institution: Kyungsung University

