Building the Outlier Candidate Discrimination Training Data based on Inventory for Automatic Classification of Transferred Records

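  • 정지혜 (Department of Records Management, Graduate School, Jeonbuk National University) ;
  • 이젬마 (Digital Innovation Division, National Archives of Korea) ;
  • 왕호성 (Digital Innovation Division, National Archives of Korea) ;
  • 오효정 (Department of Library and Information Science, Jeonbuk National University; Culture Convergence Archiving Research Institute)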
  • Received : 2022.02.09
  • Accepted : 2022.02.07
  • Published : 2022.02.28

Abstract

Electronically produced public records are filed and assigned a retention period at the time of production and, after a set period, are transferred to a permanent records management institution (archive) for preservation. At transfer, records managers should verify the classification information and keep its quality at a consistent level. In practice, however, the classification of transferred records is organized as part of arrangement and description work, most of which is done manually, so it is difficult to process the volume of records due in a given year. This study therefore proposes a way to make the classification of transferred records more efficient and to maintain consistent standards. To this end, we first analyzed the current records classification work process at the National Archives of Korea and gathered requirements for improvement. To minimize manual classification work, we then derived and systematized a process for identifying classification outlier candidates based on the filing information of the transferred records, that is, their inventory. We further applied the proposed outlier discrimination process to records actually transferred to the National Archives of Korea and standardized the results into a training data format that can be used for machine learning in the future. As a preliminary step toward an intelligent electronic records management environment, the ultimate aim of this study is to identify the types of records management problems to which machine learning techniques can be applied and to explore ways of automating them.

Acknowledgement

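This paper was carried out with research funding from the '2021 National Records Management Utilization Technology Research and Development (R&D) Project'. This research was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea in 2019 (grant number: NRF-2019S1A5B8099507).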
