• Title/Summary/Keyword: Schema.org

Registry Metadata Quality Assessment by the Example of re3data.org Schema

  • Kim, Suntae;Choi, Myung-Seok
    • International Journal of Knowledge Content Development & Technology / v.7 no.2 / pp.41-51 / 2017
  • Research data repositories (RDR) are becoming increasingly widespread around the world. To expand repository services and build an inbound linking strategy, organizations list their repositories in so-called global registries. Accordingly, entries in such registries should be carefully described with the related metadata. In this study, I explore the metadata schema of re3data.org. I collect and analyze descriptions of the listed repositories and suggest possible improvements to the metadata schema. To accomplish this, I develop a crawler program that collects the necessary data from re3data.org. Based on the analysis results, I identify two issues in which required elements are missing, one issue in which a required element value is missing when the corresponding property is applied, five inconsistency issues with the re3data controlled vocabulary, six issues with undescribed optional elements, and two inconsistency issues between elements and their attributes that do not pair with each other. I believe this discussion can facilitate improvements to the existing re3data.org schema and further help researchers who analyze data repository trends.
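
The kind of check described in this abstract (finding required elements that are absent or empty) can be illustrated with a minimal Python sketch. It is not the authors' crawler: the element names in `REQUIRED_ELEMENTS` and the sample record are hypothetical, and the real re3data.org schema defines its own required elements and XML namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical required element names, for illustration only;
# the real re3data.org schema defines its own list and namespace.
REQUIRED_ELEMENTS = ["repositoryName", "repositoryURL", "description"]


def missing_required(record_xml, required=REQUIRED_ELEMENTS):
    """Return the required elements that are absent or have empty text."""
    root = ET.fromstring(record_xml)
    missing = []
    for name in required:
        node = root.find(name)  # direct child lookup
        if node is None or not (node.text or "").strip():
            missing.append(name)
    return missing


# Made-up record: the URL element is present but empty,
# and the description element is absent.
sample = """
<repository>
  <repositoryName>Example Repository</repositoryName>
  <repositoryURL></repositoryURL>
</repository>
"""

print(missing_required(sample))
```

Running the same function over every harvested record and counting the results per element would give a rough picture of which required elements are most often left undescribed.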

Biotea-2-Bioschemas, facilitating structured markup for semantically annotated scholarly publications

  • Garcia, Leyla;Giraldo, Olga;Garcia, Alexander;Rebholz-Schuhmann, Dietrich
    • Genomics & Informatics / v.17 no.2 / pp.14.1-14.6 / 2019
  • The total number of scholarly publications grows day by day, making it necessary to explore and use simple yet effective ways to expose their metadata. Schema.org supports adding structured metadata to web pages via markup, which helps data providers as well as search engines in delivering the right search results. Bioschemas is based on the standards of schema.org and provides new types, properties, and guidelines for metadata, i.e., metadata profiles tailored to the Life Sciences domain. Here we present our proposed contribution to Bioschemas (from the project "Biotea"), which supports metadata contributions for scholarly publications via profiles and web components. Biotea comprises a semantic model to represent publications together with annotated elements recognized from the scientific text; our Biotea model has been mapped to schema.org following Bioschemas standards.
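
As a rough illustration of what schema.org markup for a publication looks like, the sketch below builds a JSON-LD description with the schema.org types `ScholarlyArticle`, `Person`, and `Periodical` and wraps it in the script element used to embed it in a web page. The helper names are hypothetical, and this is neither the Biotea model nor a Bioschemas profile, only a plain schema.org example.

```python
import json


def scholarly_article_jsonld(title, authors, journal, year):
    """Build a minimal schema.org ScholarlyArticle description as a dict."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        "author": [{"@type": "Person", "name": name} for name in authors],
        "isPartOf": {"@type": "Periodical", "name": journal},
        "datePublished": str(year),
    }


def as_script_tag(doc):
    """Wrap a JSON-LD document in the script element used to embed it in HTML."""
    body = json.dumps(doc, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Metadata taken from the record above.
article = scholarly_article_jsonld(
    title="Biotea-2-Bioschemas, facilitating structured markup for "
          "semantically annotated scholarly publications",
    authors=["Garcia, Leyla", "Giraldo, Olga", "Garcia, Alexander",
             "Rebholz-Schuhmann, Dietrich"],
    journal="Genomics & Informatics",
    year=2019,
)
print(as_script_tag(article))
```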

Transformation Method for Publishing DCAT based Metadata in Data Repository on Web (DCAT 기반 메타데이터의 웹 출판을 위한 변환 기법)

  • Park, Jinhyo;Kim, Kihun;Kim, Sung-Hee;Youn, Joosang
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.491-493 / 2021
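  • With the recent growth of the data industry, a variety of data repositories and data exchanges that allow data to be stored, shared, and traded are increasing. Most data repositories and exchanges organize DCAT-based metadata for data search and sharing. However, DCAT-based metadata has the problem that it is not found well by web search engines. This is because the data modeling approach used for publishing resources on the web is the Schema.org approach. To solve this problem, this paper proposes a new technique that converts DCAT-based metadata into Schema.org form. The proposed conversion technique supports a web publishing function that makes datasets in data repositories and exchanges easy to find on the web.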
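
A conversion of this kind can be sketched as a property-renaming step from DCAT terms to schema.org terms. The sketch below is not the authors' technique: the mapping tables are an illustrative assumption that loosely follows the schema.org alignment published with the W3C DCAT vocabulary, the input is assumed to be a plain dictionary keyed by prefixed DCAT and Dublin Core terms, and the sample dataset is made up.

```python
# Illustrative property mapping (an assumption, loosely following the
# schema.org alignment published alongside the W3C DCAT vocabulary).
DATASET_MAP = {
    "dct:title": "name",
    "dct:description": "description",
    "dcat:keyword": "keywords",
    "dct:license": "license",
    "dct:issued": "datePublished",
}
DISTRIBUTION_MAP = {
    "dcat:downloadURL": "contentUrl",
    "dcat:mediaType": "encodingFormat",
}


def _rename(record, mapping):
    """Copy the mapped properties of a record under their schema.org names."""
    return {new: record[old] for old, new in mapping.items() if old in record}


def dcat_to_schema_org(dcat_dataset):
    """Convert a DCAT dataset description (as a dict) to schema.org JSON-LD."""
    result = {"@context": "https://schema.org", "@type": "Dataset"}
    result.update(_rename(dcat_dataset, DATASET_MAP))
    distributions = dcat_dataset.get("dcat:distribution", [])
    if distributions:
        result["distribution"] = [
            {"@type": "DataDownload", **_rename(dist, DISTRIBUTION_MAP)}
            for dist in distributions
        ]
    return result


# Made-up input record for illustration.
dcat_record = {
    "dct:title": "Air quality measurements",
    "dcat:keyword": ["air", "sensor"],
    "dcat:distribution": [
        {"dcat:downloadURL": "https://example.org/air.csv",
         "dcat:mediaType": "text/csv"}
    ],
}
converted = dcat_to_schema_org(dcat_record)
print(converted)
```

Embedding the converted dictionary in a dataset landing page as JSON-LD is what would make the dataset visible to search engines that read schema.org markup.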

A Study on Recent Trends in Building Linked Data for Overseas Libraries: Focusing on Published Datasets, Reused Vocabulary, and Interlinked External Datasets (해외 도서관 링크드 데이터 구축의 최근 동향 연구 - 발행 데이터세트, 재사용 어휘집, 인터링킹 외부 데이터세트를 중심으로 -)

  • Sung-Sook Lee
    • Journal of the Korean Society for Library and Information Science / v.56 no.4 / pp.5-28 / 2022
  • This study analyzed linked data (LD) construction cases of overseas libraries, focusing on published datasets, reused vocabularies, and interlinked external datasets, and used the analysis results to obtain basic data for the LD construction plans of domestic libraries. The analysis of 21 library cases shows that overseas libraries have established faithful authority LD and launched new services using published LD. To this end, overseas libraries, under the leadership of the library, collaborated with other libraries and cultural institutions within the region, within the country, and nationally, and published specialized datasets based on this cooperation. Overseas libraries used Schema.org to increase the visibility of published LD, and used BIBFRAME for finer-grained description, defining various entities and building LD on the defined entities. They used these entities to support linking related information, displaying results, browsing, and bulk downloading. Overseas libraries were interested in keeping interlinked external datasets continuously up to date, and directly used external data to reinforce catalog information. Based on the derived implications, this study proposes points for domestic libraries to consider when publishing LD. The research results can be used as basic data when domestic libraries plan LD services or upgrade existing services.

Standard-based Integration of Heterogeneous Large-scale DNA Microarray Data for Improving Reusability

  • Jung, Yong;Seo, Hwa-Jeong;Park, Yu-Rang;Kim, Ji-Hun;Bien, Sang Jay;Kim, Ju-Han
    • Genomics & Informatics / v.9 no.1 / pp.19-27 / 2011
  • Gene Expression Omnibus (GEO) holds the largest collection of gene-expression microarray data, which has grown exponentially. Microarray data in GEO have been generated in many different formats and often lack standardized annotation and documentation. It is hard to know whether preprocessing has been applied to a dataset and, if so, in what way. Standard-based integration of heterogeneous data formats and metadata is necessary for comprehensive data query, analysis, and mining. We attempted to integrate the heterogeneous microarray data in GEO based on the Minimum Information About a Microarray Experiment (MIAME) standard. We unified the data fields of the GEO Data table and mapped the attributes of GEO metadata to MIAME elements. We also distinguished non-preprocessed raw datasets from processed and other datasets using a two-step classification method. Most of the procedures were developed as semi-automated algorithms that make some use of text mining techniques. We localized 2,967 Platforms, 4,867 Series, and 103,590 Samples covering 279 organisms, integrated them into a standard-based relational schema, and developed a comprehensive query interface to extract them. Our tool, GEOQuest, is available at http://www.snubi.org/software/GEOQuest/.

Analysis of the Current Status of Data Repositories in the Field of Ecological Research

  • Kim, Suntae
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.2 no.2 / pp.139-143 / 2021
  • In this study, data repository information registered in re3data (re3data.org), a research data registry, was collected. Based on the collected data, the current status was analyzed for 354 repositories in the field (approximately 14% of all repositories), identified using ecological keywords suggested by two experts. Major metadata formats used to describe data in ecological research data repositories include Federal Geographic Data Committee Content Standard for Digital Geospatial Metadata (FGDC/CSDGM), Dublin Core, ISO 19115, Ecological Metadata Language (EML), Directory Interchange Format (DIF), Darwin Core, Data Documentation Initiative (DDI), and DataCite Metadata Schema. By country, the number of ecological repositories is 102 in the US, 34 in Germany, 31 in Canada, and one in Korea. A total of 771 non-profit organizations and 12 for-profit organizations are involved in building ecological research data repositories. The data version control ratio of the ecological research data repositories registered in re3data (86.6%) was found to be somewhat higher than the ratio across all repositories (83.9%). The results of this study can be used to establish policies for building and operating research data repositories in the ecological field.

Comparison of Cognitive Loads between Koreans and Foreigners in the Reading Process

  • Im, Jung Nam;Min, Seung Nam;Cho, Sung Moon
    • Journal of the Ergonomics Society of Korea / v.35 no.4 / pp.293-305 / 2016
  • Objective: This study aims to measure cognitive load by analyzing the EEG of Koreans and foreigners as they carefully read Korean texts selected by level of grammar and vocabulary, and to compare the cognitive load levels quantitatively. The results can serve as basic data for a more scientific approach when Korean texts or books are developed and when an evaluation method is built for foreigners who encounter such texts for learning or assignments. Background: As of 2014, 84,801 foreign students were studying in Korea, and the number increases annually. Most of them are from Asia, and they come to Korea to enter a university or graduate school. Because these students aim to study at Korean universities, they receive Korean language education from the time they prepare to study in Korea. To enter a university in Korea, they must attain grade 4 or higher in the Test of Proficiency in Korean (TOPIK) or complete a certain educational program at a university-affiliated language institution. In such programs, learners receive Korean language education based on texts, except in the speaking domain, and text comprehension can determine their academic achievement after they enter their desired schools (Jeon, 2004). However, many foreigners who finish a short-term language course and need to start university study cannot keep up with university classes requiring expertise, given the vocabulary and grammar levels learned during the language course. Therefore, reading education centered on strategies for understanding university textbooks, which are top-level reading texts for foreigners, is necessary (Kim and Shin, 2015). 
This study carried out an experiment from the perspective that quantitative data on readers, the main agents of reading education, and on teaching materials need to be secured to support the need for reading education for learners preparing for university study and to approach educational design scientifically. Namely, this study assessed reading difficulty by measuring the cognitive load shown during the reading of each text, dividing the difficulty of the teaching material (book) into eight levels and the readers into Koreans and foreigners. Method: To identify the cognitive load shown when Koreans and foreigners carefully read Korean texts, this study recruited 16 participants (eight Koreans and eight foreigners). The foreigners were limited to language course students taking the intermediate-level Korean course at university-affiliated language institutions within the Seoul Metropolitan Area. To identify cognitive load as participants read texts selected by level from the Korean books (difficulty: eight levels) published by King Sejong Institute (Sejonghakdang.org), EEG sensors were attached over the frontal lobe (Fz) and occipital lobe (Oz). After the experiment, this study carried out a questionnaire survey to obtain subjective evaluations and identified comprehension and the perceived difficulty of grammar and words. To find out the effects of schema that may affect text comprehension, this study controlled the Korean texts and measured EEG and subjective satisfaction. Results: To identify the brain's cognitive load, the beta band was extracted. Interactions between reader group (Koreans and foreigners) and text difficulty were revealed (Fz: p = 0.48; Oz: p = 0.00). The cognitive loads of Koreans, whose mother tongue is Korean, were lower in reading Korean texts than those of the foreigners, and the foreigners' cognitive loads rose gradually with the difficulty of the texts. 
Starting from text four, which is of intermediate difficulty, remarkable differences between the Koreans and the foreigners began to appear compared with the beginner-level texts. In the subjective evaluation, interactions between reader group and text difficulty were revealed (p = 0.00), and satisfaction was lower as text difficulty increased. Conclusion: When readers had background knowledge, namely when a schema was formed, comprehension of and satisfaction with the texts were higher, even when the texts included vocabulary and grammar above the readers' level. For texts whose grammar was rated as highly difficult in the subjective evaluation, the foreigners' cognitive loads were also high, showing that load increases in proportion to difficulty. This means that the grammar factor functions as a stress factor in the foreigners' reading comprehension. Application: This study quantitatively evaluated the cognitive loads of Koreans and foreigners through EEG, based on readers and text difficulty, when they read Korean texts. The results can be used for developing Korean teaching materials and for selecting content and topics in Korean education for foreigners. If the research scope is expanded to the reading process using an eye-tracker, reading education programs and evaluation methods for foreigners can be developed on the basis of quantitative values.

A Knowledge Graph on Japanese "Comfort Women": Interlinking Fragmented Digital Archival Resources (일본군 '위안부' 지식그래프: 파편화된 디지털 기록의 연결)

  • Park, Haram;Kim, Haklae
    • Journal of Korean Society of Archives and Records Management / v.21 no.3 / pp.61-78 / 2021
  • Records on Japanese "Comfort Women" have been individually managed by private sectors or institutions, and some are provided as digital archives on the Internet. However, records of digital archives differ in the composition and representation of metadata by individual institutions. Meanwhile, there is a lack of a consistent structure to describe the relationships between and among these records, leading to their fragmentation and disconnectedness. This paper proposes a knowledge model for interlinking the digital archival resources and builds a knowledge graph by integrating the records from distributed digital archives. It derives common elements by analyzing metadata from the diverse digital archives and expresses them in standard vocabularies to semantically describe multiple entities and relationships of the digital archival resources. In particular, the study includes the refinement of collected data to search and thread dispersed records and the enrichment of external data to provide significant contextual information of records. An evaluation of the knowledge graph is performed via a query measuring the (dis)connectivity between the distributed records. As a result, the knowledge graph is capable of interlinking and retrieving fragmented records, providing substantial contextual information on the records with external data enrichment, and searching accurately to match the user's intentions through semantic-based queries.
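
One simple way to think about the (dis)connectivity of records in a graph is to count connected components over its triples: records from different archives that share an entity end up in the same component. The sketch below illustrates that idea in plain Python. It is not the query used in the paper, and the record, predicate, and entity identifiers are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(triples):
    """Group the nodes of (subject, predicate, object) triples into components."""
    neighbours = defaultdict(set)
    for subject, _predicate, obj in triples:
        # Treat every triple as an undirected link between its two nodes.
        neighbours[subject].add(obj)
        neighbours[obj].add(subject)

    seen = set()
    components = []
    for start in list(neighbours):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:  # breadth-first search from the start node
            node = queue.popleft()
            component.add(node)
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        components.append(component)
    return components


# Hypothetical triples: two archives mention the same person,
# a third record is linked only to a place.
triples = [
    ("archiveA:record1", "mentions", "person:X"),
    ("archiveB:record7", "mentions", "person:X"),
    ("archiveC:record3", "mentions", "place:Y"),
]
components = connected_components(triples)
print(len(components))
```

A lower component count after integration and enrichment would indicate that previously isolated records have become reachable from one another.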