• Title/Summary/Keyword: Data Repository Registry

Global Data Repository Status and Analysis: Based on Korea, China and Japan Data in re3data.org

  • Kim, Suntae
    • International Journal of Knowledge Content Development & Technology / v.8 no.1 / pp.79-89 / 2018
  • We collected and analyzed data from re3data.org, a global registry of data repository services. We compared the data profiles of three leading Asian economies (Korea, China, and Japan) against the reference data for other participating countries. In particular, we examined how individual countries contribute to the repository, organizational type, versioning and product quality management, and subject tagging. We conclude that all three Asian countries still fall short in terms of involvement. As for participating institutions, there are 7 from Korea, 64 from China, and 120 from Japan. Among Chinese organizations, 3 are for-profit and 61 are non-profit, and 37 organizations (1.8%) are involved in repository building. In Japan, 1 organization is commercial and 119 are non-profit, of which 57 (3.0%) are involved in repository building. All 7 organizations from Korea are non-profit, and 6 of them (0.3%) are involved in repository building. As regards versioning and product quality management, Korea, China, and Japan are on par with other countries. Subject analysis reveals that Korea contributes more to geosciences and Japan to physics and geosciences, while China, unlike Korea and Japan, is more active in life sciences. It is hoped that this study will help in planning domestic infrastructure for research data repositories with proper consideration for specific research domains and national characteristics.

Registry Metadata Quality Assessment by the Example of re3data.org Schema

  • Kim, Suntae;Choi, Myung-Seok
    • International Journal of Knowledge Content Development & Technology / v.7 no.2 / pp.41-51 / 2017
  • Research data repositories (RDR) have become increasingly widespread all over the world. To expand repository services and build up an inbound linking strategy, organizations list their repositories with so-called global registries. Accordingly, such registries should be carefully described by the related data. In this study, I explore the metadata schema of re3data.org. I collect and analyze descriptions from the listed repositories and offer suggestions for possible improvements to the metadata schema. To accomplish this, I develop a crawler program that collects the necessary data from re3data.org. Based on the analysis results, I identified two issues in which required elements are missing, one issue in which a required element value is missing when the corresponding property is applied, five inconsistency issues with the re3data controlled vocabulary, six issues with undescribed optional elements, and two inconsistency issues between elements and attributes that do not pair with each other. I believe this discussion can facilitate improvements to the existing re3data.org schema and further help researchers who analyze data repository trends.
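
The kinds of checks described in this abstract (missing required elements, values outside a controlled vocabulary) might be sketched roughly as follows. This is only an illustration: the element names, the vocabulary, and the `check_record` helper are hypothetical and are not taken from the re3data.org schema or from the authors' crawler.

```python
from typing import Any

# Hypothetical rules for illustration; NOT the real re3data.org schema.
REQUIRED_ELEMENTS = ("name", "url", "subject")
CONTROLLED_VOCABULARY = {"provider_type": {"dataProvider", "serviceProvider"}}


def check_record(record: dict[str, Any]) -> list[str]:
    """Return a list of quality issues found in one repository description."""
    issues = []
    # A required element counts as missing if it is absent or empty.
    for element in REQUIRED_ELEMENTS:
        if not record.get(element):
            issues.append(f"missing required element: {element}")
    # An optional element, when present, must use an allowed value.
    for element, allowed in CONTROLLED_VOCABULARY.items():
        value = record.get(element)
        if value is not None and value not in allowed:
            issues.append(
                f"value outside controlled vocabulary: {element}={value!r}"
            )
    return issues
```

Running such a function over every collected record and counting the issue types would give a tally comparable in spirit to the one reported above.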

Analysis of the Current Status of Data Repositories in the Field of Ecological Research

  • Kim, Suntae
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.2 no.2 / pp.139-143 / 2021
  • In this study, data repository information registered in re3data (re3data.org), a research data registry, was collected. Based on collected data, the current status was analyzed for 354 repositories (approximately 14% of total repositories) in the field using keywords in the ecological field suggested by two experts. Major metadata formats used to describe data in ecological research data repositories include Federal Geographic Data Committee Content Standard for Digital Geospatial Metadata (FGDC/CSDGM), Dublin Core, ISO 19115, Ecological Metadata Language (EML), Directory Interchange Format (DIF), Darwin Core, Data Documentation Initiative (DDI), and DataCite Metadata Schema. The number of ecological repositories according to country is 102 in the US, 34 in Germany, 31 in Canada, and one in Korea. A total of 771 non-profit organizations and 12 for-profit organizations are involved in the construction of the ecological field research data repository. Data version control ratio of the ecological field research data repositories registered in re3data was analyzed to be somewhat higher (86.6%) than the total ratio (83.9%). Results of this study can be used to establish policies to build and operate a research data repository in the ecological field.

The Design of Data Grid Wrapper for Integrated Retrieve based on XMDR (XMDR 기반의 통합 검색을 위한 데이터 그리드 Wrapper 설계)

  • Hwang, Chi-Gon;Jung, Kye-Dong;Choi, Young-Keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.5 / pp.921-929 / 2008
  • Recently, much research has been conducted on resolving data heterogeneity as a means of data integration. The system that we suggest consists of an XMDR wrapper and an XMDR Repository. The XMDR wrapper resolves the heterogeneity of the existing systems by creating an interface based on the standard information of XMDR and by converting between global XMDR queries and local queries using mapping data on standard information and local schemas. The XMDR Repository is composed of the XMDR, which manages the mapping data on standard information and local schemas, and a Proxy DB, which stores the results obtained. With the XMDR wrapper and XMDR Repository, users can use the same interface and need not repeat queries, since the XMDR wrapper not only resolves schema heterogeneity using the meta-semantic ontology of XMDR but also accounts for heterogeneity in the meaning of values through the instance semantic ontology. Therefore, in this paper we suggest a grid wrapper for resolving data heterogeneity and for efficient data integration.
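
The conversion between a global query and local queries using mapping data can be pictured with a minimal sketch. Everything here is assumed for illustration: the site names, table and column names, and the `to_local_query` helper are hypothetical and do not reproduce the XMDR wrapper or its ontologies.

```python
# Hypothetical mapping data: standard (global) field names to local schemas.
MAPPINGS = {
    "site_a": {
        "table": "cust",
        "fields": {"customer_name": "cname", "phone": "tel"},
    },
    "site_b": {
        "table": "clients",
        "fields": {"customer_name": "full_name", "phone": "phone_no"},
    },
}


def to_local_query(site: str, standard_fields: list[str]) -> str:
    """Rewrite a request for standard fields as SQL against one local schema."""
    mapping = MAPPINGS[site]
    # Alias each local column back to its standard name so that results
    # from different sites share the same column names.
    columns = ", ".join(
        f'{mapping["fields"][field]} AS {field}' for field in standard_fields
    )
    return f'SELECT {columns} FROM {mapping["table"]}'
```

The same request for `customer_name` and `phone` then yields a different local query per site, while callers only ever see the standard names.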

Design and Implementation of A Distributed Information Integration System based on Metadata Registry (메타데이터 레지스트리 기반의 분산 정보 통합 시스템 설계 및 구현)

  • Kim, Jong-Hwan;Park, Hea-Sook;Moon, Chang-Joo;Baik, Doo-Kwon
    • The KIPS Transactions:PartD / v.10D no.2 / pp.233-246 / 2003
  • Mediator-based systems integrate heterogeneous information systems in a flexible manner. However, they pay little attention to query optimization, especially query reuse, and they do not use standardized metadata for schema matching. To address these two issues, we propose a mediator-based Distributed Information Integration System (DIIS) that uses query caching for performance and an ISO/IEC 11179 metadata registry for standardization. The DIIS is designed to provide decision-making support by logically integrating distributed heterogeneous business information systems in a Web environment. We designed the system as a three-layer expression formula architecture using the layered pattern to improve system reusability and facilitate system maintenance. The functionality and flow of the core components of the three-layer architecture are expressed as process line diagrams and assembly line diagrams of the Eriksson Penker Extension Model (EPEM), an extension of UML. For the implementation, we used the Supply Chain Management (SCM) domain and a Web-based environment for the user interface. The DIIS supports query caching and query reuse through the Query Function Manager (QFM) and Query Function Repository (QFR), improving query processing speed and query reusability by caching frequently used queries and optimizing query cost. The DIIS resolves diverse heterogeneity problems by mapping the MetaData Registry (MDR), based on ISO/IEC 11179, and the Schema Repository (SCR).
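
Query caching of the general kind mentioned in this abstract can be illustrated with a small least-recently-used cache placed in front of a query executor. This is a generic sketch, not the DIIS design: the `QueryCache` class, its whitespace-only normalization, and the capacity policy are assumptions made for the example and do not describe the QFM or QFR.

```python
from collections import OrderedDict
from typing import Callable


class QueryCache:
    """Hypothetical LRU cache in front of a (possibly slow) query executor."""

    def __init__(self, execute: Callable[[str], list], capacity: int = 128):
        self._execute = execute
        self._capacity = capacity
        self._results: OrderedDict = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _normalize(query: str) -> str:
        # Collapse whitespace so trivially different spellings share an entry.
        return " ".join(query.split())

    def run(self, query: str) -> list:
        key = self._normalize(query)
        if key in self._results:
            self.hits += 1
            self._results.move_to_end(key)  # mark as most recently used
            return self._results[key]
        self.misses += 1
        result = self._execute(query)
        self._results[key] = result
        if len(self._results) > self._capacity:
            self._results.popitem(last=False)  # evict least recently used
        return result
```

A repeated query is then answered from memory, and only the first occurrence reaches the underlying information systems.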

The Design of XMDR Data Hub for Efficient Business Process Operation (효율적인 비즈니스 프로세스 운용을 위한 XMDR 데이터 허브 설계)

  • Hwang, Chi-Gon;Jung, Gye-Dong;Choi, Young-Keun
    • The KIPS Transactions:PartD / v.18D no.3 / pp.149-156 / 2011
  • Recently, enterprise systems have needed integration for data sharing and cooperation. As integration methodologies, Service-Oriented Architecture for service integration and Master Data for integrating the data used by services have emerged. This paper suggests a method for operating BP (Business Process) efficiently. We build XMDR (eXtended Meta Data Registry) as a knowledge repository to support the BP and construct data hubs to operate it. XMDR manages MDM (Master Data Management) to integrate the data, resolves heterogeneity between the data, and provides relationships to the business efficiently. It is composed of MDR (Meta Data Registry), ontology, and BR (Business Relations). MDR describes relationships between metadata to resolve structural heterogeneity. The ontology describes semantic heterogeneity and relationships between data. BR describes relationships between tasks. The XMDR data hub effectively supports the management of master data and the interaction of different processes.

Analysis of Ecological Data Repository Operation Status and EcoBank Service Proposal (생태 분야 데이터 리포지터리 운영 현황 분석 및 EcoBank 서비스 제안)

  • Juseop Kim;Hyosuk Kang;Suntae Kim
    • Journal of the Korean Society for Library and Information Science / v.57 no.4 / pp.289-310 / 2023
  • Sharing and reusing data has become essential, and data repositories are a key tool for doing so. The purpose of this study is to propose services for EcoBank, which is being built and operated by the National Institute of Ecology. To this end, 10 of the 123 foreign data repositories in the field of ecology registered on re3data.org were selected, investigated, and analyzed. The analysis identified three services that they have in common: first, a research data policy; second, research data quality review; and third, research data management training and workshops. In addition, for EcoBank to share data globally, it needs to register with a data repository registry such as re3data.org, and it is proposed that certification be pursued to ensure the reliability and quality of the repository.

A New Method of Registering the XML-based Clinical Document Architecture Supporting Pseudonymization in Clinical Document Registry Framework (익명화 방법을 적용한 임상진료문서 등록 기법 연구)

  • Kim, Il-Kwang;Lee, Jae-Young;Kim, Il-Kon;Kwak, Yun-Sik
    • Journal of KIISE:Software and Applications / v.34 no.10 / pp.918-928 / 2007
  • The goal of this paper is to propose a new way to register CDA documents in the CDR (Clinical Document Repository) proposed earlier by the authors. The first method uses manifest archiving for seamless references and visualization of CDA-related files. The second method enhances the CDA security level by supporting pseudonymization of CDA. The former is useful for bundled registration of CDA-related files as a set. It can also provide a seamless presentation view to end users, once downloaded, without a separate HTTP connection for each file. The latter is a new method of CDA registration that supports de-identification of a patient. Usually, the CDA header contains patient identification information, and the CDA body contains diagnosis or treatment data. If the two are detached from each other, privacy protection improves considerably: even if someone succeeds in obtaining a separated CDA body, they cannot know whose clinical data it is, and conversely, even if someone succeeds in obtaining a separated CDA header, they do not know what kind of treatment was given. Privacy is thus protected by disconnecting related pieces of information and reducing the possibility of leaking private information. To achieve this goal, the method we propose separates a CDA document into two parts and stores them in different repositories.
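
The idea of separating a document into an identifying header and a clinical body and storing the two parts in different repositories can be sketched as below. This is a simplified illustration under stated assumptions: documents are plain dictionaries rather than XML, the repositories are in-memory dictionaries, and the random link identifier and the `register_split` helper are hypothetical rather than the authors' registration method.

```python
import secrets


def register_split(document: dict, header_repo: dict, body_repo: dict) -> str:
    """Store header and body in separate repositories under a random link ID.

    Assumption for this sketch: the link ID is the only connection between
    the two parts, so holding one repository alone does not reveal the other.
    """
    link_id = secrets.token_hex(16)  # 32 hex characters, not derived from patient data
    header_repo[link_id] = document["header"]  # identification data only
    body_repo[link_id] = document["body"]      # diagnosis/treatment data only
    return link_id
```

In practice the two repositories and the link identifiers would need separate access controls; that part is outside the scope of this sketch.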

Knowledge Based Search System In the ebXML Environment (ebXML 환경에서의 지식기반 검색 시스템)

  • 최형림;김현수;최현덕
    • The Journal of Society for e-Business Studies / v.7 no.3 / pp.75-91 / 2002
  • As B2B (business-to-business) commerce develops rapidly, both at home and in other advanced countries, plans to promote electronic business are being made and carried out at the national level. To build, advance, and activate a B2B framework, however, it is essential to search efficiently through data that is structured differently from B2C data and thereby find the optimal business partner for one's own business. To this end, in August 2001 the government also referred to ebXML, the XML-based exchange model for electronic business data, as a proposed B2B framework. The purpose of this study is to develop a search system for the efficient selection of business partners, which will play an important role in data processing and in strengthening the competitiveness of small and medium enterprises. The system is built using the systemic characteristics registered in the ebXML Registry/Repository and a ‘question-expanding’ search method based on the particulars of business profiles, for both objectivity and maximum efficiency of search results.

Service Discovery Using Broadcasting Data Channel

  • Hasan, Md. Kamrul;Rubaiyeat, Husne Ara;Lee, Sungyoung;Lee, Young-Koo
    • Proceedings of the Korea Information Processing Society Conference / 2010.04a / pp.312-313 / 2010
  • Traditional service discovery mechanisms have so far required a centralized registry containing all the service descriptions. Though centralized service registration is intuitive, it does not support users in their usual ways of doing things. Moreover, a centralized repository does not scale to high query rates. We propose that service descriptions be broadcast through advertising data channels so that computers can parse and queue the service descriptions of interest to users. Current technologies such as Digital Media Broadcast (DMB), car navigation systems, and wireless broadband can bring our idea to reality.