Disambiguation of Author Names Using Co-citation

  • Kang, In-Su (Computer Science and Engineering, Kyungsung University)
  • Received : 2011.04.04
  • Accepted : 2011.07.14
  • Published : 2011.07.30

Abstract

Co-citation occurs when two or more studies are cited together by a later study. This paper examines the relationship between co-citation and author disambiguation, the task of clustering same-name author instances that appear in the literature into real-world individuals. Co-citation can inform author disambiguation because two or more related works by the same person may be co-cited in later studies, whether by that author or by others. This article describes an automated procedure for gathering co-citation information from Google Scholar and proposes a new clustering algorithm that effectively integrates co-citation information with existing author disambiguation features. Experiments showed that co-citation improves the performance of author disambiguation.
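To make the idea concrete, the following is a minimal Python sketch of how co-citation counts could be derived from the reference lists of citing papers and used as evidence for grouping same-name papers. It is not the clustering algorithm proposed in the paper, which the abstract does not specify. The function names, the `min_cocitations` threshold, the simple union-based merging, and the sample identifiers are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_to_cited):
    """Count, for each unordered pair of papers, how many later papers cite both."""
    counts = Counter()
    for cited in citing_to_cited.values():
        for a, b in combinations(sorted(set(cited)), 2):
            counts[(a, b)] += 1
    return counts


def cluster_by_cocitation(papers, counts, min_cocitations=1):
    """Merge same-name papers whose co-citation count reaches the threshold.

    Illustrative stand-in only: a plain union-find merge, not the paper's method.
    """
    parent = {p: p for p in papers}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for (a, b), n in counts.items():
        if n >= min_cocitations and a in parent and b in parent:
            parent[find(a)] = find(b)

    clusters = {}
    for p in papers:
        clusters.setdefault(find(p), set()).add(p)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical data: p1..p4 each list an author with the same name,
# and c1..c3 are later papers with their reference lists.
citing_to_cited = {
    "c1": ["p1", "p2"],
    "c2": ["p1", "p2", "p3"],
    "c3": ["p4"],
}
counts = cocitation_counts(citing_to_cited)
clusters = cluster_by_cocitation(["p1", "p2", "p3", "p4"], counts, min_cocitations=2)
```

In this toy input, only `p1` and `p2` are co-cited twice, so with a threshold of 2 they are the only papers grouped together. A realistic system would combine such counts with other features (for example co-authors or titles) rather than rely on co-citation alone, as the abstract indicates.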
