A Study on Duplication Verification of Public Library Catalog Data: Focusing on the Case of G Library in Busan

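  • 송민건 (Department of Library and Information Science, Pusan National University) ;
  • 이수상 (Department of Library and Information Science, Pusan National University)
  • Received : 2024.02.26
  • Reviewed : 2024.03.20
  • Published : 2024.03.30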

Abstract

This study applies a duplicate verification algorithm to the item-based catalog data of a public library in order to derive a plan for consolidating its bibliographic records. To this end, G Library, a relatively recently opened library in Busan, was selected. OPAC data from G Library were collected through web crawling, multipart monographs in Korean literature (KDC 800) were selected, and the KERIS duplicate verification algorithm was applied. After two rounds of data correction based on the verification results, the duplicate verification rate rose from 95.53% to 98.27%, an increase of 2.74 percentage points. The 24 volumes still judged similar or mismatched after correction were found to be different editions published under separate ISBNs, such as revised and hardcover editions. These results confirm that correcting catalog data can improve the duplicate verification rate, and they indicate that the KERIS duplicate verification algorithm could serve as a tool for converting duplicated item records in public libraries into manifestation records.
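To make the distinction between item records and manifestation records concrete, the following is a minimal sketch that groups item-level records under one key per normalized ISBN. It is not the KERIS duplicate verification algorithm, whose matching rules are not described here. The function names, field names, and sample records are all hypothetical, and the ISBNs are placeholders rather than values from the G Library data.

```python
from collections import defaultdict


def normalize_isbn(isbn: str) -> str:
    """Strip hyphens and spaces so the same ISBN compares equal."""
    return isbn.replace("-", "").replace(" ", "").upper()


def group_items_by_isbn(items):
    """Group item-level records under one key per normalized ISBN.

    Assumption: one ISBN stands in for one manifestation. This is a
    simplification for illustration, not the KERIS matching logic.
    """
    groups = defaultdict(list)
    for item in items:
        groups[normalize_isbn(item["isbn"])].append(item["item_id"])
    return dict(groups)


# Hypothetical item records with placeholder ISBNs.
items = [
    {"item_id": "A001", "isbn": "978-89-0000-001-1"},
    {"item_id": "A002", "isbn": "9788900000011"},
    {"item_id": "A003", "isbn": "978-89-0000-002-8"},
]

manifestations = group_items_by_isbn(items)
for isbn, item_ids in manifestations.items():
    print(isbn, item_ids)
```

In this sketch the first two items collapse into a single group because their ISBNs differ only in hyphenation, while the third stays separate. Under the same assumption, editions issued with their own ISBNs, such as revised or hardcover editions, would likewise remain distinct groups.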
