• Title/Summary/Keyword: RDF Data

An Architecture for Efficient RDF Data Management Using Structure Index with Relation-Based Data Partitioning Approach

  • Nguyen, Duc;Oh, Sang-yoon
    • International Journal of Internet, Broadcasting and Communication / v.5 no.1 / pp.14-17 / 2013
  • RDF is widely used for exchanging data on today's web and is a key enabler of the Semantic Web, which creates the need to store and retrieve such data efficiently and effectively. Recently, the structure index, seen from a graph-based perspective, has been considered a promising approach to the problems posed by complex query graphs. However, although much research is based on structure indexing, a whole-architecture approach can serve better than addressing the issue piecemeal. In this research, we propose an architecture for storing, query processing, and retrieving RDF data efficiently using structure indexing. Our work builds on results from iStore and two relation-based approaches, and we focus on improving query processing to reduce data-loading time and I/O cost.
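The relation-based partitioning this abstract builds on can be pictured with a short, self-contained sketch. The predicate names and triples below are invented for illustration; the paper's actual architecture layers a structure index (following iStore) on top of such partitions.

```python
# Minimal sketch of relation-based (per-predicate) partitioning of RDF
# triples. The data is hypothetical; real systems partition at the
# storage layer rather than in memory.
from collections import defaultdict

triples = [
    ("ex:alice", "ex:knows",   "ex:bob"),
    ("ex:alice", "ex:worksAt", "ex:acme"),
    ("ex:bob",   "ex:knows",   "ex:carol"),
]

# Partition: one (subject, object) table per predicate, so a query that
# touches only ex:knows never loads ex:worksAt data, cutting I/O cost.
partitions = defaultdict(list)
for s, p, o in triples:
    partitions[p].append((s, o))

# A query on a single relation now scans only its own partition.
print(partitions["ex:knows"])  # [('ex:alice', 'ex:bob'), ('ex:bob', 'ex:carol')]
```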

Development of a Linked Data Creation System for Ordinary People and Application (일반인을 위한 링크드 데이터 생성 시스템 개발 및 활용)

  • Jung, Hyo-Sook;Kim, Hee-Jin;Park, Seong-Bin
    • The Journal of Korean Association of Computer Education / v.14 no.2 / pp.47-59 / 2011
  • Linked Data is about using the web to link related data that was not linked previously. To publish linked data, people should be able to represent, share, and link pieces of data, information, and knowledge using URIs and RDF. However, building linked data is not easy for ordinary users who lack the knowledge and skills needed to work with URIs and RDF. In this paper, we present a system with which ordinary users can create linked data by connecting data originating from different RDF documents. Users build linked data by adding new links between RDF data saved on their computers or found through Swoogle. The proposed system can be applied to creating educational content; for example, teachers can develop various learning materials by building linked data that connects data suited to the learning level of their students.
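A hedged sketch of the kind of link creation the system automates, using the rdflib Python library; the URIs and source snippets are hypothetical, not taken from the paper.

```python
# Merge two RDF sources and add a new link between their resources.
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

src_a = """@prefix ex: <http://example.org/a/> .
ex:GyeongbokPalace ex:locatedIn ex:Seoul ."""
src_b = """@prefix ex2: <http://example.org/b/> .
ex2:Gyeongbokgung ex2:builtIn "1395" ."""

linked = Graph()
linked.parse(data=src_a, format="turtle")  # e.g. a file on the user's computer
linked.parse(data=src_b, format="turtle")  # e.g. a document found via Swoogle

# The "new link": assert that the two resources describe the same thing.
linked.add((URIRef("http://example.org/a/GyeongbokPalace"),
            OWL.sameAs,
            URIRef("http://example.org/b/Gyeongbokgung")))

print(linked.serialize(format="turtle"))
```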

Comparison of Storage Structures for RDF Data in Semantic Web (시맨틱 웹에서 RDF 데이터 저장구조들의 성능비교)

  • Kim, KyungHo;Back, WooHyoun;Son, JiEun;Kim, KyungChang
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.881-884 / 2013
  • RDF (Resource Description Framework) is the foundation of the Semantic Web and a standard that gives web users more accurate and efficient access to information. The need to store and access RDF data efficiently is growing by the day. The basic storage structure for storing and retrieving RDF data uses a relational database. Recently, as RDF data has grown enormously, the column-oriented database, which is optimized for the (simple lookup) queries typical of very large databases, has been proposed as an alternative. In this paper, we compare and analyze relational databases and column-oriented databases as storage structures for RDF data. A performance analysis using the Berlin SPARQL Benchmark demonstrates the efficiency of the column-oriented database as a storage structure for RDF data.
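For context, a minimal sketch of the relational baseline the comparison starts from: a single triples table in a row store, here SQLite with toy data rather than the Berlin SPARQL Benchmark the paper actually uses.

```python
# A "triple table" in a relational row store.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
con.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:p1", "rdf:type", "ex:Product"),
    ("ex:p1", "ex:label", "Widget"),
    ("ex:p2", "rdf:type", "ex:Product"),
])
con.execute("CREATE INDEX spo ON triples (s, p, o)")  # typical row-store index

# A simple lookup query; a column store answers the same query by scanning
# only the p and s columns, which is the advantage the paper measures.
for row in con.execute("SELECT s FROM triples WHERE p = 'rdf:type'"):
    print(row[0])
```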

Analysis of Access Authorization Conflict for Partial Information Hiding of RDF Web Document (RDF 웹 문서의 부분적인 정보 은닉과 관련한 접근 권한 충돌 문제의 분석)

  • Kim, Jae-Hoon;Park, Seog
    • Journal of the Korea Institute of Information Security & Cryptology / v.18 no.2 / pp.49-63 / 2008
  • RDF is the base ontology model used in the Semantic Web defined by W3C. OWL expands the RDF base model by providing various vocabularies for defining many more ontology relationships. Recently, Jain and Farkas suggested an RDF access control model based on RDF triples. The main point of their research is the authorization conflict problem caused by RDF inference, which must be considered for RDF ontology data. Because of this problem, the XML access control model cannot simply be adopted for RDF, even though RDF can be represented in XML. However, Jain and Farkas did not define how authorization propagates over upper and lower ontology concepts when an RDF authorization is specified. The authorization specification must be defined clearly because an authorization conflict is, in the end, a clash between the propagation performed when specifying an authorization and the propagation performed when inferring authorizations. In this article, we first define an RDF access authorization specification based on RDF triples in detail. Based on this definition, we then analyze the authorization conflict problem caused by RDF inference in detail. Next, we briefly introduce a method that can quickly find authorization conflicts using graph labeling techniques; this method is especially relevant to subsumption-based inference. Finally, we present a comparative analysis with Jain and Farkas' study and experimental results showing the efficiency of the suggested conflict detection method.
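A toy sketch of the conflict being analyzed, assuming a much-simplified model (a single-superclass hierarchy and plain allow/deny labels) rather than the paper's formal specification; all names are invented.

```python
# An explicit authorization on a concept clashes with one propagated
# through subsumption (rdfs:subClassOf) inference.
subclass_of = {"ex:Student": "ex:Person"}  # ex:Student is a subclass of ex:Person

# Triple-based authorizations: concept -> allow/deny for some user.
explicit_auth = {"ex:Person": "deny", "ex:Student": "allow"}

def effective_auths(concept):
    """Collect the explicit auth plus auths inherited from superclasses."""
    auths = []
    while concept is not None:
        if concept in explicit_auth:
            auths.append((concept, explicit_auth[concept]))
        concept = subclass_of.get(concept)
    return auths

auths = effective_auths("ex:Student")
decisions = {d for _, d in auths}
if len(decisions) > 1:
    # ex:Student is allowed explicitly but denied via ex:Person: a conflict
    # that a policy (e.g. most-specific-wins) or a detection pass must resolve.
    print("authorization conflict:", auths)
```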

Automatic Construction of SHACL Schemas for RDF Knowledge Graphs Generated by R2RML Mappings

  • Choi, Ji-Woong
    • Journal of the Korea Society of Computer and Information / v.25 no.8 / pp.9-21 / 2020
  • With the proliferation of RDF knowledge graphs (KGs), a standardized schema representation of the graph model became necessary for effective data interchangeability and interoperability. This need led to the W3C SHACL specification for describing and validating the structure of RDF graphs. Relational databases (RDBs) are one of the major sources of structured knowledge, and the W3C standard for automatically generating RDF KGs from RDBs is R2RML. Since R2RML is designed to generate only RDF data graphs from RDBs, additional manual work is required to create schemas for those graphs. In this paper, we propose an approach that automatically generates SHACL schemas for RDF KGs populated by R2RML mappings. The key point of our approach is that the SHACL schemas are built solely from the R2RML documents. We describe an implementation of our approach and then show its validity using the R2RML test cases designed by W3C.
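A hedged sketch of the core idea, with the R2RML TriplesMap reduced to a plain dictionary instead of a parsed Turtle document; the class and property names are hypothetical, not from the paper.

```python
# Derive a SHACL node shape from a (simplified) R2RML TriplesMap.
triples_map = {                   # hypothetical R2RML fragment
    "class": "ex:Employee",       # from rr:subjectMap / rr:class
    "predicates": {               # from rr:predicateObjectMap entries
        "ex:name": "xsd:string",  # rr:column mapped to a literal
        "ex:dept": None,          # rr:template mapped to an IRI
    },
}

def to_shacl(tm, shape_name="ex:EmployeeShape"):
    """Emit a SHACL node shape (Turtle body; prefixes omitted)."""
    lines = [f"{shape_name} a sh:NodeShape ;",
             f"    sh:targetClass {tm['class']} ;"]
    for pred, dtype in tm["predicates"].items():
        kind = f"sh:datatype {dtype}" if dtype else "sh:nodeKind sh:IRI"
        lines.append(f"    sh:property [ sh:path {pred} ; {kind} ] ;")
    lines[-1] = lines[-1].rstrip(" ;") + " ."
    return "\n".join(lines)

print(to_shacl(triples_map))
```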

Conversion of Large RDF Data using Hash-based ID Mapping Tables with MapReduce Jobs (맵리듀스 잡을 사용한 해시 ID 매핑 테이블 기반 대량 RDF 데이터 변환 방법)

  • Kim, InA;Lee, Kyu-Chul
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.236-239 / 2021
  • With the growth of AI technology, knowledge graphs continue to expand in scale. Knowledge graphs are mainly expressed in RDF, as sets of connected triples. Many RDF stores compress RDF triples by transforming them into condensed IDs, but converting a large volume of triples incurs high processing time and memory overhead because it requires searching a large ID mapping table. In this paper, we propose a method for converting RDF triples using hash-based ID mapping tables with MapReduce, a software framework for parallel, distributed processing. Our proposed method not only transforms RDF triples into integer-based IDs but also improves conversion speed and memory overhead. In an experiment on LUBM, the proposed method reduced the dataset size by about 3.8 times and completed the conversion in about 106 seconds.
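A sketch of hash-based ID encoding in a MapReduce-like flow; the hash function and the simulated map/reduce phases below are illustrative stand-ins, not the paper's exact scheme (which runs as distributed jobs over LUBM-scale data).

```python
# Hash-based integer IDs remove the need to consult a shared mapping table.
import hashlib

def term_id(term: str) -> int:
    # Deterministic hash -> integer ID: any worker can compute the ID
    # locally, without probing a large, memory-hungry lookup table.
    return int.from_bytes(hashlib.md5(term.encode()).digest()[:8], "big")

triples = [
    ("ex:student1", "ub:takesCourse", "ex:course7"),
    ("ex:student1", "rdf:type",       "ub:GraduateStudent"),
]

# "Map" phase: encode each triple independently (embarrassingly parallel).
encoded = [tuple(term_id(t) for t in triple) for triple in triples]

# "Reduce" phase: emit the ID -> term dictionary needed to decode results.
dictionary = {term_id(t): t for triple in triples for t in triple}

print(encoded[0], "->", [dictionary[i] for i in encoded[0]])
```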

TripleDiff: an Incremental Update Algorithm on RDF Documents in Triple Stores (TripleDiff: 트리플 저장소에서 RDF 문서에 대한 점진적 갱신 알고리즘)

  • Lee, Tae-Whi;Kim, Ki-Sung;Yoo, Sang-Won;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.33 no.5 / pp.476-485 / 2006
  • The Resource Description Framework (RDF), which emerged with the Semantic Web, is becoming established as a standard for representing information about resources on the World Wide Web. Accordingly, much research has been done on storing and querying RDF documents, and several RDF storage systems, such as Sesame and Jena, have been developed. Research on updating RDF documents, however, is still insufficient. When an RDF document changes, the data in the RDF triple store must also be updated, but current triple stores do not support incremental updates, so an update can be performed only by deleting the old version and then storing the new document. This method is very inefficient because RDF documents are updated continually, and it becomes worse when several RDF documents are stored in the same database. In this paper, we propose an incremental update algorithm for RDF documents in triple stores. We use a text matching technique on two versions of an RDF document and compensate for the text matching result to find the correct target triples to update. Experiments with real-life RDF datasets show that our approach updates RDF documents efficiently.
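A much-simplified sketch of the incremental idea: compute the triple-level delta between two versions and apply only that. TripleDiff itself diffs the documents' text and then compensates for matching errors; the set-based version below shows only the resulting delete/insert outcome, on invented data.

```python
# Incremental update as a triple-set delta between two document versions.
old_version = {
    ("ex:doc", "dc:title",   "Draft"),
    ("ex:doc", "dc:creator", "Lee"),
}
new_version = {
    ("ex:doc", "dc:title",   "Final"),
    ("ex:doc", "dc:creator", "Lee"),
}

to_delete = old_version - new_version  # triples no longer present
to_insert = new_version - old_version  # newly added triples

# Instead of dropping and reloading the whole document, the store applies
# only the delta -- here 1 delete + 1 insert instead of 2 deletes + 2 inserts.
print("DELETE:", to_delete)
print("INSERT:", to_insert)
```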

Provenance Compression Scheme Considering RDF Graph Patterns (RDF 그래프 패턴을 고려한 프로버넌스 압축 기법)

  • Bok, Kyoungsoo;Han, Jieun;Noh, Yeonwoo;Yook, Misun;Lim, Jongtae;Lee, Seok-Hee;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.16 no.2 / pp.374-386 / 2016
  • Provenance is metadata that represents the history or lineage of a piece of data in collaborative storage environments. Because provenance accrues over time, it can grow to several tens of times the size of the original data, so schemes for efficiently compressing large amounts of provenance are required. In this paper, we propose a provenance compression scheme that considers RDF graph patterns. The proposed scheme represents provenance based on the standard PROV model and encodes it as numeric data through text encoding, and we compress both the provenance and the RDF data using graph patterns. Unlike conventional provenance compression techniques, ours compresses provenance while taking the RDF documents of the Semantic Web into account. To show the superiority of the proposed scheme, we compare it with an existing scheme in terms of compression ratio and processing time.
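A toy sketch of the two steps the abstract names, text encoding followed by pattern substitution; the PROV terms are real vocabulary, but the data and the extracted pattern are invented for illustration.

```python
# Step 1: dictionary (text) encoding; step 2: replace a recurring graph
# pattern with one pattern definition plus per-occurrence bindings.
prov = [
    ("ex:doc1", "prov:wasGeneratedBy",    "ex:act1"),
    ("ex:act1", "prov:wasAssociatedWith", "ex:alice"),
    ("ex:doc2", "prov:wasGeneratedBy",    "ex:act2"),
    ("ex:act2", "prov:wasAssociatedWith", "ex:bob"),
]

# Step 1: text encoding -- every distinct term becomes a small integer.
dictionary = {}
encoded = [tuple(dictionary.setdefault(t, len(dictionary)) for t in tr)
           for tr in prov]

# Step 2: the "generated-by + associated-with" shape repeats, so store the
# pattern once and keep only the varying terms per occurrence.
PATTERN_1 = (("?e", "prov:wasGeneratedBy", "?a"),
             ("?a", "prov:wasAssociatedWith", "?ag"))
occurrences = [("ex:doc1", "ex:act1", "ex:alice"),
               ("ex:doc2", "ex:act2", "ex:bob")]

print(len(encoded), "encoded triples ->", len(occurrences), "pattern rows")
```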

Integration of Heterogeneous Protein Databases Based on RDF(S) Models (RDF(S) 모델에 기반한 다양한 형태의 단백질 데이타베이스 통합)

  • Lee, Kang-Pyo;Yoo, Sang-Won;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.35 no.2 / pp.132-142 / 2008
  • In the biological domain, there exists a variety of protein analysis databases, each with its own interpretation of the same protein. If we integrate these scattered, heterogeneous data efficiently, we can obtain useful information that cannot be found in any single source. Reflecting the characteristics of biological data, each source has its own syntax and semantics. If we describe these data with RDF(S) models, one of the Semantic Web standards, we can achieve not only syntactic but also semantic integration. In this paper, we propose a new concept of an integration layer based on a unified RDF schema. As the conceptual model, we construct a unified schema focused on protein information; as the representational model, we propose a technique in which wrappers aggregate the necessary information from the relevant sources and dynamically generate RDF instances. Two example queries show that our integration layer succeeds in processing integrated user requests and displaying the appropriate results.
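A hedged sketch of the wrapper idea using rdflib: one wrapper maps a source's native record format onto a unified schema and dynamically generates RDF instances. The up: vocabulary and the record layout are hypothetical, not the paper's schema.

```python
# One wrapper per heterogeneous source, all emitting the unified vocabulary.
from rdflib import Graph, Literal, Namespace, RDF

UP = Namespace("http://example.org/unified-protein/")  # hypothetical schema

def wrap_flatfile_record(record: dict) -> Graph:
    """Dynamically generate RDF instances for one source record."""
    g = Graph()
    protein = UP[record["id"]]
    g.add((protein, RDF.type, UP.Protein))
    g.add((protein, UP.name, Literal(record["name"])))
    g.add((protein, UP.sequence, Literal(record["seq"])))
    return g

# A record as it might arrive from one source's native format.
g = wrap_flatfile_record({"id": "P69905",
                          "name": "Hemoglobin subunit alpha",
                          "seq": "MVLSPADK"})
print(g.serialize(format="turtle"))
```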

R2RML Based ShEx Schema

  • Choi, Ji-Woong
    • Journal of the Korea Society of Computer and Information / v.23 no.10 / pp.45-55 / 2018
  • R2RML is a W3C standard language that defines how to expose relational data as RDF triples. The output of an R2RML mapping is only an RDF dataset, which by definition has no schema. This lack of a schema makes datasets in linked data portals impractical to integrate and analyze. To address this issue, we propose an approach for automatically generating schemas for RDF graphs populated by R2RML mappings. More precisely, we represent the schema using ShEx, a language for validating and describing RDF, and our approach generates ShEx schemas as well as RDF datasets from R2RML mappings. The resulting ShEx schema benefits both data providers and ordinary users: providers can verify and guarantee the structural integrity of a dataset against the schema, and users can write SPARQL queries efficiently by referring to it. In this paper, we describe the data structures and algorithms of the system that derives ShEx documents from R2RML documents and present a brief demonstration of its proper use.
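A hedged sketch of the ShEx side, analogous to the SHACL sketch under the 2020 entry above: emit a shape expression from a TriplesMap modeled as a plain dictionary. The real system derives this from actual rr:logicalTable and rr:predicateObjectMap structures; the names here are hypothetical.

```python
# Derive a ShEx shape (compact syntax) from a simplified R2RML TriplesMap.
mapping = {
    "class": "ex:Employee",
    "predicates": {"ex:name": "xsd:string", "ex:salary": "xsd:integer"},
}

def to_shex(tm, label="<EmployeeShape>"):
    body = [f"  a [{tm['class']}] ;"]                      # rdf:type value set
    body += [f"  {p} {dt} ;" for p, dt in tm["predicates"].items()]
    body[-1] = body[-1].rstrip(" ;")                       # no trailing ';'
    return label + " {\n" + "\n".join(body) + "\n}"

print(to_shex(mapping))
# <EmployeeShape> {
#   a [ex:Employee] ;
#   ex:name xsd:string ;
#   ex:salary xsd:integer
# }
```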