• Title/Summary/Keyword: XML Data Index

Storage and Retrieval of XML Documents Without Redundant Path Information (경로정보의 중복을 제거한 XML 문서의 저장 및 질의처리 기법)

  • Lee Hiye-Ja;Jeong Byeong-Soo;Kim Dae-Ho;Lee Young-Koo
    • The KIPS Transactions:PartD / v.12D no.5 s.101 / pp.663-672 / 2005
  • This paper proposes an approach that removes redundant path information and uses an inverted index to store large volumes of XML documents efficiently and to retrieve the desired information from them. An XML document is decomposed into nodes based on its tree structure, and the nodes are stored in relational tables according to node type, together with the path from the root to each node. Existing path-based methods store data for all element paths, which degrades retrieval performance as data volume grows. Our approach stores data only for leaf element paths, excluding internal element paths. Because the inverted index is built from leaf element paths only, it has fewer keyword posting lists than the indexes of existing methods. To store and retrieve XML data, our approach requires neither the XML schema information of the documents nor any extension of the relational database. Within the scope of our experiments, we demonstrate that our approach performs better than the existing approaches.
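
The leaf-path idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `leaf_paths` and `build_inverted_index`, the in-memory row list standing in for relational tables, and the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def leaf_paths(xml_text):
    """Return (root-to-leaf path, text) rows for leaf elements only."""
    rows = []

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        children = list(elem)
        if not children:
            # Only leaf elements produce a row; internal paths are skipped.
            rows.append((path, (elem.text or "").strip()))
        for child in children:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return rows


def build_inverted_index(rows):
    """Map each lowercased keyword to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, (_path, text) in enumerate(rows):
        for word in text.lower().split():
            index[word].add(row_id)
    return index


# Hypothetical sample document.
doc = "<book><title>XML Indexing</title><author><name>Lee</name></author></book>"
rows = leaf_paths(doc)
index = build_inverted_index(rows)
```

Here `rows` holds one entry per leaf (`/book/title` and `/book/author/name`), and no entry for the internal paths `/book` or `/book/author`, so the posting lists cover leaf rows only.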

RDB Storage Model of XML Instance based on the Edge-Labeled Graph (Edge-Labeled Graph에 기반 한 XML 인스턴스의 RDB 저장 모델)

  • 김정희;김정필;곽호영
    • Proceedings of the Korean Information Science Society Conference / 2003.04a / pp.545-547 / 2003
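  • This paper proposes and implements a model for storing XML instances in a relational database (RDB) based on the Edge-Labeled Graph. The XML instances to be stored are represented as a Data Graph based on the Edge-Labeled Graph. From this graph, the values defined in the data path, element, attribute, and table index tables are extracted, after which a Mapper defines the database schema and stores the extracted values. To support queries, the RDB storage model provides a translator that converts XQL, a query language that follows XPath, into SQL, and it also includes a DBtoXML processor that restores the stored XML instances. The implementation showed that the storage relationship between XML instances and the RDB structure can be expressed using graph-based paths, and that information on specific elements or attributes can be searched easily.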

XML View Indexing Using an RDBMS based XML Storage System (관계 DBMS 기반 XML 저장시스템 상에서의 XML 뷰 인덱싱)

  • Park Dae-Sung;Kim Young-Sung;Kang Hyunchul
    • Journal of Internet Computing and Services / v.6 no.4 / pp.59-73 / 2005
  • Caching query results and reusing them in the processing of subsequent queries is an important query optimization technique. Materialized views and view indexing are representative examples of such a technique. The two schemes have received much attention for relational databases, and have been investigated for XML data since XML emerged as the standard for data exchange on the Web. In XML view indexing, an XML view xv, which is the result of an XML query, is represented as an XML view index (XVI), a structure containing the identifiers of xv's underlying XML elements as well as information on xv. Since the XVI for xv stores just the identifiers of the XML elements, not the elements themselves, its XVI must be materialized against xv's underlying XML documents whenever xv is requested. In this paper, we address the problem of integrating an XML view index management system with an RDBMS-based XML storage system. The proposed system was implemented in Java on Windows 2000 Server with each of two different commercial RDBMSs, and was used to evaluate the performance improvement obtained through XML view indexing as well as its overheads. The experimental results revealed that XML view indexing was very effective with an RDBMS-based XML storage system, while its overhead was negligible.
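
The difference between storing identifiers and storing elements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: document-order positions stand in for element identifiers, the dictionary layout of the index and the names `build_view_index` and `materialize` are hypothetical, and the paper's system runs on an RDBMS rather than on an in-memory tree.

```python
import xml.etree.ElementTree as ET


def build_view_index(root, query):
    """Record which elements answer the query, by document-order position."""
    matched = {id(e) for e in root.findall(query)}
    element_ids = [i for i, e in enumerate(root.iter()) if id(e) in matched]
    # The index keeps the view definition and identifiers, not the elements.
    return {"query": query, "element_ids": element_ids}


def materialize(root, xvi):
    """Fetch the identified elements from the underlying document on request."""
    elements = list(root.iter())
    return [ET.tostring(elements[i], encoding="unicode") for i in xvi["element_ids"]]


# Hypothetical underlying document and view query.
root = ET.fromstring(
    '<shop><item id="a"><name>pen</name></item><item id="b"><name>ink</name></item></shop>'
)
xvi = build_view_index(root, ".//item")
view = materialize(root, xvi)
```

The index itself stays small (two integers here), and the cost of producing the view is paid in `materialize`, which is the step the abstract refers to.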

A XML Instance Repository Model based on the Edge-Labeled Graph (Edge-Labeled 그래프 기반의 XML 인스턴스 저장 모델)

  • Kim Jeong-Hee;Kwak Ho-Young
    • Journal of Internet Computing and Services / v.4 no.6 / pp.33-42 / 2003
  • An XML instance repository model based on the Edge-Labeled Graph is proposed for storing XML instances in relational databases. The repository model represents an XML instance as a data graph based on the Edge-Labeled Graph, extracts the values defined by the structure of the data path, element, attribute, and table index tables that make up the database schema, and stores these values using the Mapper module. To support queries, the repository model offers a module that translates XQL, a query language based on XPath, into SQL, and it has a DBtoXML generator module that restores the stored XML instances. As a result, the storage relationship between the XML instances and the proposed repository model can be represented in terms of graph-based paths, and the model shows that specific element and attribute information can be searched easily.
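
A simplified sketch of storing an XML instance as labeled edges in a relational table, and of answering a path query with SQL, might look as follows. It uses a single hypothetical `edge` table rather than the data path, element, attribute, and table index tables named in the abstract, and the helpers `store_edges` and `query_path` are assumptions, not the paper's Mapper or XQL translator.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_edges(conn, xml_text):
    """Store each element and attribute as one labeled edge (parent -> node)."""
    conn.execute(
        "CREATE TABLE edge (source INTEGER, target INTEGER, label TEXT, value TEXT)"
    )
    counter = [0]  # node ids; 0 is reserved for the virtual document root

    def insert(source, label, value):
        counter[0] += 1
        conn.execute(
            "INSERT INTO edge VALUES (?, ?, ?, ?)", (source, counter[0], label, value)
        )
        return counter[0]

    def walk(elem, parent_id):
        node_id = insert(parent_id, elem.tag, (elem.text or "").strip() or None)
        for name, val in elem.attrib.items():
            insert(node_id, "@" + name, val)  # attributes get an "@" prefix
        for child in elem:
            walk(child, node_id)

    walk(ET.fromstring(xml_text), 0)


def query_path(conn, labels):
    """Translate a simple child-axis path into a chain of self-joins on edge."""
    tables = ", ".join(f"edge e{i}" for i in range(len(labels)))
    conds = ["e0.source = 0"]
    for i in range(len(labels)):
        if i > 0:
            conds.append(f"e{i}.source = e{i - 1}.target")
        conds.append(f"e{i}.label = ?")
    last = len(labels) - 1
    sql = (
        f"SELECT e{last}.value FROM {tables} "
        f"WHERE {' AND '.join(conds)} ORDER BY e{last}.target"
    )
    return [row[0] for row in conn.execute(sql, labels)]


# Hypothetical sample instance.
conn = sqlite3.connect(":memory:")
store_edges(
    conn,
    '<book year="2003"><title>Edge Graphs</title>'
    "<author><name>Kim</name></author><author><name>Kwak</name></author></book>",
)
names = query_path(conn, ["book", "author", "name"])
year = query_path(conn, ["book", "@year"])
```

Each path step becomes one more self-join, which is why a path such as `/book/author/name` needs three aliases of the same table in this layout.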

XML-Centered Logistics Data Indexing and Retrieval for Efficient Logistics Information Services (효율적인 물류정보 서비스를 위한 XML 중심의 물류데이터 색인 및 검색)

  • Baek, Dae-Won;Jo, Lee-Hyeon;Baek, Eok-Jong;Gwon, Hyeok-Cheol
    • Proceedings of the Korea Intelligent Information System Society Conference / 2005.11a / pp.264-270 / 2005
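  • Web-service-based information systems that manage diverse XML data in an integrated way and supply information to multiple applications require systematic and effective storage and retrieval of XML data. In the logistics domain in particular, an information system must store and manage information on a wide variety of logistics objects, and must be able to answer logistics information requests from multiple applications with intelligent XML data retrieval. XML is used in many fields to represent data structurally and to convey information systematically. XML data consists of tags, which define the structural format of the data, and their corresponding values. Indexing XML data is very important for integrated management and retrieval services over diverse logistics data, each with its own data structure. This paper proposes an XML data indexing technique for providing efficient information retrieval services in a web-service-based logistics information system. It also proposes applying an ontology for the efficient integrated management and retrieval of diverse logistics data.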

An Assignment Method of Multidimensional Type Inheritance Indexes for XML Query Processing (XML 질의처리를 위한 다차원 타입상속 색인구조의 할당기법)

  • Lee, Jong-Hak
    • Journal of Korea Multimedia Society / v.12 no.1 / pp.1-15 / 2009
  • This paper presents an assignment method for multidimensional type inheritance indexes (MD-TIXs) to support the processing of XML queries in XML databases. MD-TIX uses a multidimensional index structure to efficiently support nested predicates that involve both nested element hierarchies and type inheritance hierarchies. In this paper, we analyze the query processing strategy that uses MD-TIXs, and present an assignment method for MD-TIXs in the framework of complex queries containing conjunctions of nested predicates, each involving an XPath with target type or domain type substitution. We first consider the MD-TIX operations caused by updates to XML databases, and the use of MD-TIXs for a query containing a single nested predicate. We then consider the assignment of MD-TIXs in the framework of more general queries containing nested predicates over overlapping paths that have common subpaths.

k-Bitmap Clustering Method for XML Data based on Relational DBMS (관계형 DBMS 기반의 XML 데이터를 위한 k-비트맵 클러스터링 기법)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD / v.16D no.6 / pp.845-850 / 2009
  • The use of XML data has increased with the growth of the Web 2.0 environment. XML's advantages are recognized through its use as the underlying technology of RSS and Atom for transferring information from blogs and news feeds. Bitmap clustering is a method that keeps an index in main memory on top of a relational DBMS, and it performed better than the other XML indexing methods in evaluation. The existing method generates too many clusters, which degrades the quality of search results. This paper proposes a k-Bitmap clustering method that generates a user-defined number k of clusters to solve this problem. The proposed method also keeps an additional inverted index for searching terms that are excluded from the representative bits of the k-Bitmap. Our evaluation shows that users can control the number of clusters. Our method also has high recall in single-term search and, by keeping two indices, guarantees that the search result includes all documents related to the query.

Techniques of XML Fragment Stream Organization for Efficient XML Query Processing in Mobile Clients (이동 클라이언트에서 효율적인 XML 질의 처리를 위한 XML 조각 스트림 구성 기법)

  • Ryu, Jeong-Hoon;Kang, Hyun-Chul
    • The Journal of Society for e-Business Studies / v.14 no.4 / pp.75-94 / 2009
  • Since XML emerged as a standard for data exchange on the web, it has been established as a core component of e-Commerce, and efficient query processing over XML data in ubiquitous computing environments has also been receiving much attention. Recently, techniques have been proposed whereby an XML document is split into XML fragments to be streamed, and mobile clients receive the stream while processing queries over it. In processing queries over an XML fragment stream, the average access time depends significantly on the order of the fragments in the stream. Query performance therefore requires an efficient organization of the XML fragment stream, as well as indexing that makes query processing energy-efficient by reducing tuning time. In this paper, we propose a technique for organizing an XML fragment stream based on query frequencies, fragment sizes, and fragment access frequencies, together with an active XML-based indexing scheme. Through implementation and performance experiments, our techniques were shown to be efficient compared with conventional XML fragment stream organizations.
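
Why fragment order affects average access time can be seen in a toy model. The sketch below is not the paper's technique: it assumes a client that starts listening at the beginning of the stream, measures time in units of fragment size, and orders fragments by access frequency per unit size, which is a standard greedy rule swapped in for illustration. The fragment names, sizes, and frequencies are made up.

```python
def average_access_time(order, size, freq):
    """Frequency-weighted mean time until each requested fragment has arrived."""
    elapsed = 0
    weighted = 0
    for frag in order:
        elapsed += size[frag]  # the fragment is usable once fully received
        weighted += freq[frag] * elapsed
    return weighted / sum(freq[frag] for frag in order)


def order_by_density(size, freq):
    """Place fragments with high access frequency per unit size first."""
    return sorted(size, key=lambda frag: freq[frag] / size[frag], reverse=True)


# Hypothetical fragments: sizes in arbitrary units, access counts per fragment.
size = {"a": 4, "b": 1, "c": 2}
freq = {"a": 1, "b": 5, "c": 2}

document_order = ["a", "b", "c"]
tuned_order = order_by_density(size, freq)

baseline = average_access_time(document_order, size, freq)
tuned = average_access_time(tuned_order, size, freq)
```

In this model the large, rarely requested fragment `a` delays every other request when it is streamed first, so moving it to the end lowers the weighted average.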

A Minimum Sequence Matching Scheme for Efficient XPath Processing

  • Seo, Dong-Min;Yeo, Myung-Ho;Kim, Myoung-Ho;Yoo, Jae-Soo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.3 no.5 / pp.492-506 / 2009
  • Index structures based on sequence matching for XPath processing, such as ViST, PRIX and LCS-TRIM, have recently been proposed to reduce the search time of XML documents. However, ViST can cause a lot of unnecessary computation and I/O when processing structural join queries because its numbering scheme is not optimized. PRIX and LCS-TRIM require much processing time for matching XML data trees and queries. In this paper, we propose a novel index structure that solves the problems of ViST and improves on the performance of PRIX and LCS-TRIM. Our index structure provides a minimum sequence matching scheme to process structural queries efficiently. Finally, to verify the superiority of the proposed index structure with the minimum sequence matching scheme, we compare it with ViST, PRIX and LCS-TRIM in terms of query processing for a single path and for a branching path including wildcards ('*' and '//').

Development of a Korea SCI System for Efficient Citation Analysis (효율적인 인용분석을 위한 한국 SCI 시스템의 개발)

  • 이계준;조현양;최재황;윤희준
    • Journal of KIISE:Databases / v.31 no.2 / pp.174-182 / 2004
  • To produce information, an author usually references other authors' work. A citation index leads users to papers through citations, and citations lead users to the desired information. In this paper, KSCI (Korea Science Citation Index), which defines the relationships between citing documents and cited documents, has been constructed. The KSCI system aims to solve the problems of recursive retrieval in ISI's SCI (Science Citation Index), and a Path Encoding Indexing technique was used to solve them. Analysis of the data shows that the system is about 8.98% more efficient in terms of data storage. In terms of retrieval, efficiency improved between citing documents and cited documents, with an improvement of over 40% in the retrieval of cited documents. We conclude that the proposed KSCI system provides efficient storage and retrieval.