• Title/Summary/Keyword: Query optimization

An XML Query Optimization Technique by Signature based Block Traversing (시그니처 기반 블록 탐색을 통한 XML 질의 최적화 기법)

  • Park, Sang-Won;Park, Dong-Ju;Jeong, Tae-Seon;Kim, Hyeong-Ju
    • Journal of KIISE:Databases / v.29 no.1 / pp.79-88 / 2002
  • Data on the Internet are usually represented and transferred as XML. XML data are represented as a tree, so object repositories, with their modeling power, are well suited to storing and querying them. XML queries are expressed as regular path expressions and evaluated by traversing the objects of the tree in the object repository. Several indexes have been proposed to evaluate regular path expressions quickly. However, in some cases they may not cover all possible paths, because doing so would require a great amount of disk space. To evaluate queries efficiently in such cases, we propose an optimized traversal that combines the signature method and block traversing. The signature approach shrinks the search space by using the signature information attached to each object, which hints at the existence of a certain label in its sub-tree. Block traversing reduces disk I/O by evaluating the reachable objects in a page early. We conducted diverse experiments to show that the hybrid approach performs better than the naive ones.
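
The signature idea in this abstract can be illustrated with a minimal sketch. It assumes each node stores a bitmask that is the OR of hashed label bits in its sub-tree, so a sub-tree can be skipped when the bit for the wanted label is absent. The `Node` class, the `label_bit` hash, and the 64-bit width are hypothetical choices for illustration, not the paper's design, and block traversing is not modeled.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    signature: int = 0  # OR of the label bits found in this sub-tree


def label_bit(label, width=64):
    """Map a label to one bit of a fixed-width signature (collisions are possible)."""
    return 1 << (zlib.crc32(label.encode()) % width)


def build_signatures(node):
    """Compute signatures bottom-up and return the signature of `node`."""
    sig = label_bit(node.label)
    for child in node.children:
        sig |= build_signatures(child)
    node.signature = sig
    return sig


def find(node, label, visited):
    """Collect nodes named `label`, skipping sub-trees whose signature rules it out."""
    if not node.signature & label_bit(label):
        return []  # the label cannot occur below this node
    visited.append(node.label)
    hits = [node] if node.label == label else []
    for child in node.children:
        hits.extend(find(child, label, visited))
    return hits
```

Because different labels can hash to the same bit, a set bit is only a hint: the traversal may still enter a sub-tree that does not contain the label, but it never skips one that does.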

Entropy-based Dynamic Histogram for Spatio-temporal Databases (시공간 데이타베이스의 엔트로피 기반 동적 히스토그램)

  • 박현규;손진현;김명호
    • Journal of KIISE:Databases / v.30 no.2 / pp.176-183 / 2003
  • Various techniques, including histograms, sampling, and parametric techniques, have been proposed to estimate query result sizes for query optimization. Histogram-based techniques are the most widely used for selectivity estimation in relational database systems. However, in spatio-temporal databases for moving objects, continual changes in the data distribution hinder the direct use of state-of-the-art histogram techniques. For future queries in particular, we need another methodology that takes updated information into account and preserves the accuracy of the result. In this paper we propose a novel approach based on duality and the marginal distribution that constructs a histogram in very little time, since a spatio-temporal histogram requires the data distribution defined by query predicates. We use a data synopsis method in the dual space to construct spatio-temporal histograms. Our method is robust to changing data distributions over a certain period of time, as long as the objects keep moving linearly. In addition, our approach supports incremental dynamic updates and maintains the accuracy of the estimated result.

Attribute-based Approach for Multiple Continuous Queries over Data Streams (데이터 스트림 상에서 다중 연속 질의 처리를 위한 속성기반 접근 기법)

  • Lee, Hyun-Ho;Lee, Won-Suk
    • The KIPS Transactions:PartD / v.14D no.5 / pp.459-470 / 2007
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Query processing for such a data stream should also be continuous and rapid, which imposes strict time and space constraints. In most DSMSs (Data Stream Management Systems), the selection predicates of continuous queries are grouped or indexed to meet these constraints. This paper proposes a new scheme called an ASC (Attribute Selection Construct) that collectively evaluates selection predicates containing the same attribute in multiple continuous queries. An ASC contains valuable information, such as attribute usage status, partially precalculated matching results, and selectivity statistics for its multiple selection predicates. The processing order of the ASCs that correspond to the attributes of a base data stream can significantly influence the overall performance of multiple query evaluation. Consequently, a method of establishing an efficient evaluation order for multiple ASCs is also proposed. Finally, the performance of the proposed method is analyzed through a series of experiments to identify its various characteristics.
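
A rough sketch of the attribute-grouping idea follows. It assumes every query is a conjunction of selection predicates, groups the predicates by attribute, and evaluates the most selective group first so that failing queries drop out early. The triple layout `(attribute, selectivity, {query_id: predicate})` and the function name are hypothetical; the actual ASC structure, with its usage status and precalculated results, is not reproduced here.

```python
def matching_queries(record, groups):
    """Return the ids of queries whose predicates all hold for `record`.

    groups: list of (attribute, selectivity, {query_id: predicate}) triples,
    where a lower selectivity means fewer records are expected to pass.
    """
    alive = set().union(*(preds.keys() for _, _, preds in groups))
    # Visit the most selective attribute group first.
    for attribute, _, preds in sorted(groups, key=lambda g: g[1]):
        value = record[attribute]
        failed = {q for q, pred in preds.items() if q in alive and not pred(value)}
        alive -= failed
        if not alive:
            break  # no query can match any more; skip the remaining groups
    return alive
```

For a record such as `{"temp": 35, "room": "A"}`, a query with predicates on both attributes survives only if every group that mentions it lets it pass.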

Optimization of Post-Processing for Subsequence Matching in Time-Series Databases (시계열 데이터베이스에서 서브시퀀스 매칭을 위한 후처리 과정의 최적화)

  • Kim, Sang-Uk
    • The KIPS Transactions:PartD / v.9D no.4 / pp.555-560 / 2002
  • Subsequence matching, which consists of index searching and post-processing steps, is an operation that finds the subsequences in a time-series database whose changing patterns are similar to that of a given query sequence. This paper discusses the optimization of post-processing for subsequence matching. The common problem in the post-processing of previous methods is that each candidate subsequence is compared with the query sequence, in order to discard false alarms, as soon as it appears during index searching. As a result, a sequence containing candidate subsequences is read from disk multiple times, and a candidate subsequence is compared with the query sequence multiple times. These redundancies seriously degrade the performance of subsequence matching. In this paper, we propose a new optimal method for resolving the problem. The proposed method stores all the candidate subsequences returned by index searching in a binary search tree and performs post-processing in a batch fashion after index searching has finished. With this method, we are able to completely eliminate the redundancies mentioned above. To verify the performance improvement of the proposed method, we perform extensive experiments using a real-life stock data set. The results reveal that the proposed method achieves a speedup of 55 to 156 times over the previous methods.
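
The batch post-processing described here can be sketched as follows. The sketch assumes candidates arrive as `(sequence_id, offset)` pairs, possibly with duplicates, and that similarity is Euclidean distance within a tolerance `eps`. A sorted, de-duplicated list stands in for the binary search tree of the abstract, and `load_sequence` is a hypothetical callback representing one disk access.

```python
import math
from itertools import groupby


def batch_post_process(candidates, load_sequence, query, eps):
    """Verify all candidates after index searching has finished.

    Each data sequence is loaded once and each distinct candidate
    is compared with the query once.
    """
    matches = []
    ordered = sorted(set(candidates))  # de-duplicate and order by (sequence, offset)
    for seq_id, group in groupby(ordered, key=lambda c: c[0]):
        sequence = load_sequence(seq_id)  # single access per sequence
        for _, offset in group:
            window = sequence[offset:offset + len(query)]
            if len(window) == len(query) and math.dist(window, query) <= eps:
                matches.append((seq_id, offset))  # true match, not a false alarm
    return matches
```

Ordering the candidates by sequence is what lets all candidates of one sequence be checked against a single loaded copy.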

Enabling Efficient Verification of Dynamic Data Possession and Batch Updating in Cloud Storage

  • Qi, Yining;Tang, Xin;Huang, Yongfeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.6 / pp.2429-2449 / 2018
  • Dynamic data possession verification is a common requirement in cloud storage systems. After the client outsources its data to the cloud, it needs not only to check the integrity of its data but also to verify that updates are executed correctly. Previous research has proposed various schemes based on the Merkle Hash Tree (MHT) and implemented some initial improvements to prevent tree imbalance. This paper tries to go one step further and asks whether there are still problems left to optimize. We study how to raise the efficiency of data dynamics by improving query and rebalancing, using a new data structure called the Rank-Based Merkle AVL Tree (RB-MAT). Furthermore, we fill the gap of verifying multiple update operations at the same time with a novel batch updating scheme. The experimental results show that our scheme is more efficient than existing methods.
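
For readers unfamiliar with the Merkle Hash Tree that these schemes build on, here is a minimal sketch of a plain MHT with membership proofs. It is not the RB-MAT of the paper: ranks, AVL rebalancing, and batch updates are not modeled. The choice of SHA-256, the duplication of the last hash on odd levels, and all function names are assumptions made for illustration.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def padded(level):
    """Duplicate the last hash when a level has an odd number of entries."""
    return level + [level[-1]] if len(level) % 2 else level


def build_levels(blocks):
    """Return all tree levels, from leaf hashes up to the single root hash."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = padded(level)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        level = padded(level)
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        index //= 2
    return path


def verify(block, path, root):
    """Recompute the root from a block and its proof, then compare."""
    digest = h(block)
    for sibling, sibling_is_left in path:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root
```

The verifier needs only the trusted root, the block, and a proof whose length is logarithmic in the number of blocks, which is why keeping the tree balanced matters for dynamic data.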

Design and Implementation of a Framework for Context-Aware Preference Queries

  • Roocks, Patrick;Endres, Markus;Huhn, Alfons;Kießling, Werner;Mandl, Stefan
    • Journal of Computing Science and Engineering / v.6 no.4 / pp.243-256 / 2012
  • In this paper we present a framework for a novel kind of context-aware preference query composition, whereby queries for the Preference SQL system are created. We choose a commercial e-business platform for outdoor activities as a use case and develop a context model for this domain within our framework. The suggested model considers explicit user input, domain-specific knowledge, contextual knowledge, and location-based sensor data in a comprehensive approach. Aside from the theoretical background of preferences, the optimization of preference queries, and our novel generator-based model, we give special attention to implementation aspects and practical experience. We provide a sketch of the implementation and summarize our user studies, which were carried out in a joint project with an industrial partner.

A Study on Boolean Query Optimization in Information Retrieval (불리언 질의 최적화에 관한 연구)

  • Joo, Won-Kyun;Lee, Min-Ho;Kang, Moo-Young
    • Proceedings of the Korea Information Processing Society Conference / 2002.11c / pp.1879-1882 / 2002
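  • In this paper, we propose three methods for efficiently evaluating Boolean queries entered by users in an information retrieval system that supports the Boolean model. First, we use Boolean algebra to remove structurally unnecessary nodes. Second, using index-term occurrence frequencies, we decide whether to exclude from evaluation any node with frequency 0 and the nodes that contain it, and we reorder operands and operators so that those that take less time to evaluate come first. Third, when a Boolean query contains a compound noun, we expand the query using combinations of its constituent nouns and operators. The first two methods aim to improve retrieval speed, and the third aims to improve accuracy.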
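
The second method (frequency-based pruning and reordering) can be sketched as below. The sketch assumes a query is a nested tuple such as `("AND", "db", ("OR", "xml", "index"))` and that `freq` maps each index term to its occurrence count. The tuple encoding, the `EMPTY` marker, and the size estimate (minimum for AND, sum for OR) are hypothetical simplifications, not the paper's cost model.

```python
EMPTY = ("EMPTY",)  # marks a sub-query that cannot match any document


def estimate(node, freq):
    """Upper bound on the result size of a (sub-)query."""
    if node == EMPTY:
        return 0
    if isinstance(node, str):
        return freq.get(node, 0)
    op, *kids = node
    sizes = [estimate(k, freq) for k in kids]
    return min(sizes) if op == "AND" else sum(sizes)


def optimize(node, freq):
    """Prune zero-frequency nodes and put the cheapest operands first."""
    if node == EMPTY:
        return EMPTY
    if isinstance(node, str):
        return node if freq.get(node, 0) > 0 else EMPTY
    op, *kids = node
    kids = [optimize(k, freq) for k in kids]
    if op == "AND":
        if EMPTY in kids:
            return EMPTY  # one empty operand empties the whole conjunction
    else:
        kids = [k for k in kids if k != EMPTY]  # empty operands add nothing to OR
        if not kids:
            return EMPTY
    kids.sort(key=lambda k: estimate(k, freq))
    return kids[0] if len(kids) == 1 else (op, *kids)
```

Placing the smallest operand first keeps intermediate results of an AND small, which is the usual motivation for this kind of reordering.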

Estimation of Data Distribution Using Multidimensional Dynamic File Organization (다차원 동적 화일 구조를 이용한 데이타 분포의 추정)

  • Kim, Sang-Wook
    • Journal of Industrial Technology / v.15 / pp.41-50 / 1995
  • This paper presents a technique for estimating the distribution of data stored in a database. This technique is very useful for accurate selectivity estimation, which is essential in query optimization and physical database design. To maintain the data distribution, we employ the directory of the multilevel grid file, a multidimensional dynamic file organization. The major advantage of our technique is that data distribution information is maintained dynamically in the multilevel grid file. In contrast, static methods such as the histogram method use static data structures, which require periodic restructuring. Furthermore, we propose a method for keeping abstract information about the data distribution in main memory. This is advantageous when the size of main memory is not sufficient. Finally, we also suggest formulas for calculating the selectivities of various queries based on our data distribution information.

User Feedback adapting Fuzzy Technique in Reuse Environment (재사용 환경에서 퍼지 기법을 적용한 사용자 피드백)

  • 김귀정
    • Proceedings of the Korea Contents Association Conference / 2004.05a / pp.401-405 / 2004
  • This paper describes a technique for building a reuse environment that polls user feedback about selected reuse components in order to enhance system effectiveness. To do so, we use a fuzzification function that applies a fuzzy technique. The function is built from the user profile and is modified according to the results of the user's continued choice of components. The method aims to enhance the system as a whole rather than to optimize a single query.

The Design of Spatial Query Optimization Technique using Horizontal Splitting of CNF (CNF의 수평적 분리를 이용한 공간 질의 최적화 기법의 제안)

  • 이환재;정보흥;조숙경;이순조;배해영
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.229-231 / 2001
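  • During query processing in a spatial database system, when query rewriting converts a multi-block query into a single block, a complex CNF is produced in which spatial and non-spatial predicates are connected by OR and AND. Because the refinement step of a spatial operation costs considerably more than non-spatial operations, the spatial predicates in the CNF need an optimization technique different from that used for non-spatial predicates. This paper proposes a technique that horizontally splits a complex CNF containing spatial predicates, rewrites the query, and reorders its execution. The proposed technique splits the original CNF into a pre-processing-stage CNF with a relatively low execution cost and a post-processing-stage CNF with a higher execution cost, rewrites the query, and then optimizes the execution tree according to a cost model. The advantage of the technique is that the query optimization stage can generate an efficient execution plan that takes the stepwise execution characteristics of spatial operations into account.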
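
One simple reading of the split into a cheap stage and a costly stage is sketched below. It assumes the CNF is a list of clauses (each clause a list of OR-ed predicates) and treats any clause containing a spatial predicate as belonging to the costly stage. This clause-level rule, the `Predicate` class, and the function names are assumptions for illustration; the paper's actual splitting rules and cost model are not given in the abstract.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Predicate:
    name: str
    test: Callable[[dict], bool]
    spatial: bool = False  # spatial predicates need the costly refinement step


def split_cnf(cnf):
    """Split a CNF into clauses without and with spatial predicates."""
    cheap = [c for c in cnf if not any(p.spatial for p in c)]
    costly = [c for c in cnf if any(p.spatial for p in c)]
    return cheap, costly


def holds(cnf, row):
    """A CNF holds when every clause has at least one true predicate."""
    return all(any(p.test(row) for p in clause) for clause in cnf)


def run(rows, cnf):
    """Filter with the cheap stage first, then apply the costly stage to survivors."""
    cheap, costly = split_cnf(cnf)
    survivors = [r for r in rows if holds(cheap, r)]
    return [r for r in survivors if holds(costly, r)]
```

Since a CNF is a conjunction of clauses, evaluating the two groups in sequence returns the same rows as evaluating the original CNF, while the spatial tests run only on rows that passed the cheap clauses.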
