• Title/Summary/Keyword: XML index (XML 인덱스)


A Minimization Technique of XML Path Comparison Based on Signature (시그니쳐를 이용한 XML 경로 비교의 최소화 기법)

  • Jang, Kyung-Hoon;Hwang, Byung-Yeon
    • The Journal of Society for e-Business Studies / v.17 no.3 / pp.61-72 / 2012
  • Because XML lets users define arbitrary tags, XML documents with widely varying structures have been created. Accordingly, many studies have addressed clustering and searching XML documents by path similarity in order to manage them efficiently. To retrieve XML documents with similar structures, the three-dimensional bitmap indexing technique uses a path as the unit when it builds an index. If a path's structure changes, the technique treats it as a new path, so another technique was proposed to measure the similarity of paths. To compute the similarity between two paths, that technique compares every node of the paths, which causes unnecessary comparisons of nodes that the two paths do not share. In this paper, we propose a new technique that minimizes these comparisons using signatures and present its performance evaluation results. The comparison speed of the proposed technique was 20 percent faster than that of the existing technique.
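
The signature idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm: the 64-bit width, the CRC32 hash, and the placeholder similarity measure (`node_similarity`) are all assumptions chosen for illustration. The sketch only shows how a bitwise AND of two signatures can rule out node-by-node comparison when two paths share no node name.

```python
import zlib

SIGNATURE_BITS = 64  # assumed signature width, not taken from the paper


def signature(path):
    """Superimpose one hashed bit per node name into an integer bitmask."""
    sig = 0
    for name in path:
        sig |= 1 << (zlib.crc32(name.encode("utf-8")) % SIGNATURE_BITS)
    return sig


def node_similarity(p, q):
    """Hypothetical measure: distinct shared node names over the longer path length."""
    common = len(set(p) & set(q))
    return common / max(len(p), len(q))


def similarity(p, q):
    # A shared node name always sets the same bit in both signatures, so a
    # zero AND proves there is no common node and the comparison is skipped.
    if signature(p) & signature(q) == 0:
        return 0.0
    return node_similarity(p, q)


if __name__ == "__main__":
    print(similarity(["book", "title"], ["book", "author"]))
```

Hash collisions can make two unrelated paths pass the prefilter, but they never cause a shared node to be missed, so the prefilter only removes work and does not change the result.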

Spatial Database Schema Modeling Using GML Application Schemas (GML 응용스키마를 이용한 공간데이터베이스 스키마 모델링)

  • 정호영;이민우;전우제;박수홍
    • Proceedings of the Korean Association of Geographic Information Studies Conference / 2003.11a / pp.30-39 / 2003
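  • GML data combines the characteristics of GIS data, which carries spatial and non-spatial information together, with the nature of structured XML data, so it is difficult to store in a general-purpose DBMS. Databases that can store XML lack spatial data processing capabilities, while spatial databases have difficulty storing XML data. This study proposes a method for modeling a spatial database schema from a GML application schema so that GML data can be stored in a spatial database. To this end, we define mapping rules for GML schemas that use composite attributes and abstract data types (ADTs), which are features of object-relational databases. The mapping rules store the spatial and non-spatial information of GML data separately so as to conform to the OGC SQL schema. As a result, a variety of spatial and non-spatial queries can be executed quickly on the stored data through the spatial operators, functions, and indexes that the spatial database provides.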


Multimedia XML Database System supporting Content-based Retrieval (내용 기반 검색을 지원하는 멀티미디어 XML 데이터베이스 시스템)

  • 김연희;신판섭;김병곤;이재호;임해철
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.76-78 / 2001
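  • With the spread of web-service-based search systems, not only simple text information but also multimedia information such as image data has become commonplace, and the volume exchanged has grown considerably. Accordingly, developing systems that support effective retrieval of multimedia information alongside text retrieval is becoming important. However, existing systems generally use multimedia data as supplementary information in search results and cannot treat it as the primary target of a query. In this paper, we therefore design a multimedia retrieval system that builds a large-scale image database on the web and supports effective retrieval on top of it. The proposed system provides two retrieval structures. First, for conventional text-based retrieval, it represents the semantic information of images in XML and stores and manages it in a relational database under a DTD-independent schema, supporting systematic and structured services. Second, for content-based image retrieval, it builds an image database, automatically extracts color histogram features from the image data, and maintains an index built from them; on this basis we design the content-based retrieval structure and the user query interface.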


A Way to Speed up Evaluation of Path-oriented Queries using An Abbreviation-paths and An Extendible Hashing Technique (단축-경로와 확장성 해싱 기법을 이용한 경로-지향 질의의 평가속도 개선 방법)

  • Park Hee-Sook;Cho Woo-Hyun
    • The KIPS Transactions:PartD / v.11D no.7 s.96 / pp.1409-1416 / 2004
  • Recently, with the popularity and explosive growth of the Internet, information exchange over the Internet has increased dramatically. XML is also becoming a standard and a major tool for data exchange on the Internet, so speeding up the evaluation of path-oriented queries when retrieving XML documents is a main issue. In this paper, we propose a new indexing technique to improve the search performance of path-oriented queries in document databases. The technique generates an abbreviation-path file for performing path-oriented queries efficiently, whose hash-code values can be used as index keys. The technique can be further enhanced by combining the Extendible Hashing technique with the abbreviation-path file to speed up retrieval.

Accelerating Keyword Search Processing over XML Documents using Document-level Ranking (문서 단위 순위화를 통한 XML 문서에 대한 키워드 검색 성능 향상)

  • Lee, Hyung-Dong;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.33 no.5 / pp.538-550 / 2006
  • XML keyword search lets users obtain information easily without knowing the structure of documents, and it returns specific, useful document fragments instead of whole documents. Element-level query processing makes this possible, but its computational complexity significantly increases overhead costs as the number of documents grows. In this paper, we present a document-level ranking scheme over XML documents that predicts the results of element-level processing in order to reduce processing cost. To do this, we propose the notion of 'keyword proximity' for the document ranking process: the correlation of keywords in a document that affects the results of element-level query processing, computed from the path information of occurrence nodes and their resemblances. With this document-centric view, processing time can be reduced by using a ranked document list or by filtering out low-scored documents. Our experimental evaluation shows that the document-level processing technique using a ranked document list is effective and improves performance through early termination for top-k queries.

An Efficient Path Expression Join Algorithm Using XML Structure Context (XML 구조 문맥을 사용한 효율적인 경로 표현식 조인 알고리즘)

  • Kim, Hak-Soo;Shin, Young-Jae;Hwang, Jin-Ho;Lee, Seung-Mi;Son, Jin-Hyun
    • The KIPS Transactions:PartD / v.14D no.6 / pp.605-614 / 2007
  • XQuery and XPath were proposed by the W3C as standard query languages for searching XML data. As XQuery and XPath have come into wide use, recent research has focused on developing query processing algorithms and data structures for efficiently processing XML queries over very large XML database systems. Recently, the structural join, which determines the structural relationship between XML elements (e.g., ancestor-descendant or parent-child), has become one of the dominant mechanisms for processing XML path expressions. However, structural joins, which occur frequently in XPath query processing, are costly. In this paper, we propose a new structural join algorithm, called SISJ, based on our structured index, called SI, in order to process XPath queries efficiently. Experimental results show that our algorithm performs marginally better than previous ones. For highly recursive documents, however, it performed more than 30% better owing to the pruning feature of the proposed method.
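
For readers unfamiliar with structural joins, the following is a minimal sketch of a generic stack-based ancestor-descendant join. It is not SISJ and does not use the SI index. It assumes each element is labelled with a hypothetical `(start, end)` region from a document-order numbering, so that one element is an ancestor of another exactly when its interval contains the other's.

```python
def structural_join(ancestors, descendants):
    """Return (a, d) pairs in which region a contains region d.

    Both inputs are lists of (start, end) tuples sorted by start, taken from
    one well-formed document, so any two regions are nested or disjoint.
    """
    result = []
    stack = []  # chain of nested candidate ancestors, outermost at the bottom
    i = 0
    for d in descendants:
        # Push every candidate ancestor that starts before d.
        while i < len(ancestors) and ancestors[i][0] < d[0]:
            a = ancestors[i]
            # Drop candidates that ended before a starts.
            while stack and stack[-1][1] < a[0]:
                stack.pop()
            stack.append(a)
            i += 1
        # Drop candidates that ended before d starts.
        while stack and stack[-1][1] < d[0]:
            stack.pop()
        # Everything left on the stack contains d.
        for a in stack:
            result.append((a, d))
    return result


if __name__ == "__main__":
    print(structural_join([(1, 10), (2, 5)], [(3, 4), (7, 8)]))
```

Each input list is scanned once, which is why this family of joins avoids comparing every candidate ancestor with every candidate descendant.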

XML View Indexing Using an RDBMS based XML Storage System (관계 DBMS 기반 XML 저장시스템 상에서의 XML 뷰 인덱싱)

  • Park Dae-Sung;Kim Young-Sung;Kang Hyunchul
    • Journal of Internet Computing and Services / v.6 no.4 / pp.59-73 / 2005
  • Caching query results and reusing them when processing subsequent queries is an important query optimization technique. Materialized views and view indexing are representative examples of this technique. The two schemes have received much attention for relational databases, and they have been investigated for XML data since XML emerged as the standard for data exchange on the Web. In XML view indexing, an XML view xv, which is the result of an XML query, is represented as an XML view index (XVI), a structure containing the identifiers of xv's underlying XML elements as well as information on xv. Since the XVI for xv stores just the identifiers of the XML elements and not the elements themselves, when xv is requested, its XVI must be materialized against xv's underlying XML documents. In this paper, we address the problem of integrating an XML view index management system with an RDBMS-based XML storage system. The proposed system was implemented in Java on Windows 2000 Server with each of two different commercial RDBMSs, and it was used to evaluate the performance improvement from XML view indexing as well as its overheads. The experimental results revealed that XML view indexing was very effective with an RDBMS-based XML storage system, while its overhead was negligible.


XML Document Filtering based on Segments (세그먼트 기반의 XML 문서 필터링)

  • Kwon, Joon-Ho;Rao, Praveen;Moon, Bong-Ki;Lee, Suk-Ho
    • Journal of KIISE:Databases / v.35 no.4 / pp.368-378 / 2008
  • In recent years, publish-subscribe (pub-sub) systems based on XML document filtering have received much attention. In a typical pub-sub system, subscribed users specify their interests in profiles expressed in the XPath language, and each new piece of content is matched against the user profiles so that it is delivered only to the interested subscribers. As the number of subscribed users and their profiles can grow very large, the scalability of the system is critical to the success of pub-sub services. In this paper, we propose a fast and scalable XML filtering system called SFiST, which is an extension of the FiST system. Sharable segments are extracted from twig patterns and stored in the hash-based Segment Table of the SFiST system. Segments are used to represent user profiles as Terse Sequences, which are stored in the Compact Segment Index during filtering. Our experimental study shows that the SFiST system performs better than the FiST system in terms of filtering time and memory usage.

A Design of Index/XML Sequence Relation Information System for Product Abstraction and Classification (산출물 추출 및 분류를 위한 Index/XML순서관계 시스템 설계)

  • Sun Su-Kyun
    • The KIPS Transactions:PartD / v.12D no.1 s.97 / pp.111-120 / 2005
  • Software development creates many products, such as class components, class diagrams, forms, objects, and design patterns. This paper therefore proposes an Index/XML Sequence Relation information system for product abstraction and classification: a system for abstracting the sequence relations of design products that can store and reuse design patterns in a meta-modeling database together with pattern relation information. The Index/XML Sequence Relation system makes it easy to change the various relation information of products for product abstraction and classification. The system is designed to extract and classify design patterns efficiently, and it applies functional indexing, sequence-based indexing for standard patterns, code indexing to turn patterns into code, grouping by Index-ID code, and role information through structural extraction and a design pattern indexing process. It manages various products, including class items, diagrams, forms, components, and design patterns.

A Tree-Based Indexing Method for Mobile Data Broadcasting (모바일 데이터 브로드캐스팅을 위한 트리 기반의 인덱싱 방법)

  • Park, Mee-Hwa;Lee, Yong-Kyu
    • Journal of the Korea Society of Computer and Information / v.13 no.4 / pp.141-150 / 2008
  • In the mobile computing environment, data broadcasting is widely used to address the limited power and bandwidth of mobile equipment. Most previous broadcast indexing methods concentrate on flat data. However, with the growing popularity of XML, an increasing amount of information is being stored and exchanged in the XML format. We propose a novel indexing method, called TOP tree (Tree Ordering based Path summary tree), for indexing XML documents in mobile broadcast environments. TOP tree is a path summary tree that provides a concise structure summary at the group level using global IDs and element information at the local level using local IDs. Based on the TOP tree representation, we suggest a broadcast stream generation and query processing method that efficiently handles not only simple path queries but also multiple path queries. We have compared our indexing method with other indexing methods. Evaluation results show that our approach can effectively improve the access time and tune-in time in a wireless broadcasting environment.
