• Title/Summary/Keyword: multidimensional indexes


An SVD-Based Approach for Generating High-Dimensional Data and Query Sets (SVD를 기반으로 한 고차원 데이터 및 질의 집합의 생성)

  • 김상욱
    • The Journal of Information Technology and Database / v.8 no.2 / pp.91-101 / 2001
  • Previous performance evaluations of multidimensional indexes have typically used synthetic data sets distributed uniformly or normally over the multidimensional space. However, recent research has shown that these kinds of data sets hardly reflect the characteristics of multimedia database applications. In this paper, we discuss issues in generating high-dimensional data and query sets that resolve this problem. We first identify the features of data and query sets that are appropriate for fairly evaluating the performance of multidimensional indexes, and then propose HDDQ_Gen (High-Dimensional Data and Query Generator), which satisfies these features. HDDQ_Gen supports the following features: (1) clustered distributions, (2) various object distributions in each cluster, (3) various cluster distributions, (4) various correlations among different dimensions, and (5) query distributions depending on data distributions. Using these features, users are able to control the distribution characteristics of data and query sets. Our contribution is important in that HDDQ_Gen provides a benchmark environment for evaluating multidimensional indexes correctly.
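A minimal sketch of two of the listed features, clustered distributions (1) and query distributions that depend on the data (5). It is not HDDQ_Gen itself: the function names, the Gaussian spread around each center, and the jitter parameter are assumptions made for illustration, and correlations among dimensions are not modeled.

```python
import random

def generate_clustered_data(n_points, n_clusters, dim, spread=0.05, seed=0):
    """Return (points, centers): points scattered around random cluster centers."""
    rng = random.Random(seed)
    # Assumption: cluster centers are placed uniformly in the unit hypercube.
    centers = [[rng.random() for _ in range(dim)] for _ in range(n_clusters)]
    points = []
    for _ in range(n_points):
        center = rng.choice(centers)
        # Assumption: objects inside a cluster follow a Gaussian around its center.
        points.append([rng.gauss(c, spread) for c in center])
    return points, centers

def generate_queries(points, n_queries, jitter=0.01, seed=1):
    """Draw query points near existing data points so queries follow the data."""
    rng = random.Random(seed)
    queries = []
    for _ in range(n_queries):
        base = rng.choice(points)
        queries.append([rng.gauss(x, jitter) for x in base])
    return queries

points, centers = generate_clustered_data(n_points=1000, n_clusters=5, dim=16)
queries = generate_queries(points, n_queries=20)
```

Because each query is sampled next to an actual data point, dense clusters receive proportionally more queries, which is the kind of data-dependent query distribution the abstract describes.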


A Bitmap Index for Chunk-Based MOLAP Cubes (청크 기반 MOLAP 큐브를 위한 비트맵 인덱스)

  • Lim, Yoon-Sun;Kim, Myung
    • Journal of KIISE:Databases / v.30 no.3 / pp.225-236 / 2003
  • MOLAP systems store data in a multidimensional array called a 'cube' and access them using array indexes. When a cube is stored on disk, it can be partitioned into a set of chunks of the same side length. Such a storage scheme is called the chunk-based MOLAP cube storage scheme. It gives a data clustering effect so that all dimensions are guaranteed a fair chance in terms of query processing speed. To achieve high space utilization, sparse chunks are further compressed. Because of this compression, the relative position of a chunk cannot be obtained in constant time without an index. In this paper, we propose a bitmap index for chunk-based MOLAP cubes. The index can be constructed while the corresponding cube is being generated. The relative position of chunks is retained in the index so that chunk retrieval can be done in constant time. We place as many chunks as possible in an index block so that the number of index searches is minimized for OLAP operations such as range queries. We show that the proposed index is efficient by comparing it with multidimensional indexes such as the UB-tree and the grid file in terms of time and space.
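The idea of recovering a chunk's relative position from a bitmap can be sketched as a rank query over the bits. This is a generic illustration, not the paper's index layout: the class name, the row-major chunk numbering, and the precomputed prefix counts are all assumptions.

```python
class ChunkBitmapIndex:
    """Bitmap over chunk slots; bit i is 1 when chunk i is non-empty and stored."""

    def __init__(self, non_empty_flags):
        self.bits = list(non_empty_flags)
        # prefix[i] = number of stored chunks among slots 0..i-1,
        # so a lookup needs no scan over the bitmap.
        self.prefix = [0]
        for bit in self.bits:
            self.prefix.append(self.prefix[-1] + bit)

    def stored_position(self, chunk_no):
        """Return the position of chunk_no among stored chunks, or None if empty."""
        if not self.bits[chunk_no]:
            return None
        return self.prefix[chunk_no]

def chunk_number(coords, chunks_per_dim):
    """Row-major linearization of chunk coordinates (an assumed numbering)."""
    number = 0
    for c, n in zip(coords, chunks_per_dim):
        number = number * n + c
    return number

# A 2 x 3 grid of chunks where slots 1 and 4 are empty and therefore not stored.
index = ChunkBitmapIndex([1, 0, 1, 1, 0, 1])
pos = index.stored_position(chunk_number((1, 0), (2, 3)))
missing = index.stored_position(chunk_number((0, 1), (2, 3)))
```

Here chunk (1, 0) has number 3 and is the third stored chunk, so `pos` is 2, while chunk (0, 1) is empty and `missing` is `None`.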

Optimal Configurations of Multidimensional Path Indexes for the Efficient Execution of Object-Oriented Queries (객체지향 질의의 효율적 처리를 위한 다차원 경로 색인구조의 최적 구성방법)

  • Lee, Jong-Hak
    • Journal of Korea Multimedia Society / v.7 no.7 / pp.859-876 / 2004
  • This paper presents optimal configurations of multidimensional path indexes (MPIs) for the efficient execution of object-oriented queries in object databases. An MPI uses a multidimensional index structure to efficiently support nested predicates that involve both nested attributes and class hierarchies, which are not supported by nested attribute indexes built on a one-dimensional index structure such as the $B^+$-tree. In this paper, we analyze MPIs in the framework of complex queries containing conjunctions of nested predicates, each involving a path expression with target class and domain class substitution. We first consider the MPI operations caused by updates to object databases and the use of an MPI for a query containing a single nested predicate. We then consider the use of MPIs in the framework of more general queries containing nested predicates over both overlapping and non-overlapping paths. The former are paths that share common subpaths, while the latter have no common subpaths.


Efficient Indoor Location Estimation using Multidimensional Indexes in Wireless Networks

  • Jun, Bong-Gi
    • International Journal of Contents / v.5 no.2 / pp.59-63 / 2009
  • Since it is hard to use GPS for tracking mobile users in indoor environments, much research has focused on techniques that use the existing wireless local area network infrastructure. The signal strength received at a fixed location is not constant, so the fingerprinting approach, which uses pattern matching, is popular. However, this approach incurs additional cost when determining the user's location. This paper proposes a new approach that finds the user's location efficiently using an index scheme. After analyzing the characteristics of RF signals, the paper suggests a data processing method for recording the signal strength values of each access point in a radio map. To reduce the computational cost of the location determination phase, multidimensional indexes are built over the radio map using the order of the strongest access points as key information.
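The use of the order of the strongest access points to narrow the search can be sketched as follows. This is only an illustration of the idea: a plain dictionary stands in for the multidimensional index, and the function names, the choice of the top two access points, the squared-difference distance, and the sample RSSI values (in dBm) are all assumptions.

```python
from collections import defaultdict

def strongest_order(rssi, k=2):
    """Return the ids of the k strongest access points, strongest first."""
    ranked = sorted(rssi, key=lambda ap: rssi[ap], reverse=True)
    return tuple(ranked[:k])

def build_radio_map_index(radio_map, k=2):
    """Group fingerprint locations by the order of their k strongest access points."""
    index = defaultdict(list)
    for location, rssi in radio_map.items():
        index[strongest_order(rssi, k)].append(location)
    return index

def locate(index, radio_map, observed, k=2):
    """Match the observation only against fingerprints with the same AP order."""
    candidates = index.get(strongest_order(observed, k), [])
    if not candidates:
        return None

    def distance(location):
        ref = radio_map[location]
        return sum((observed[ap] - ref[ap]) ** 2 for ap in observed if ap in ref)

    return min(candidates, key=distance)

# Hypothetical radio map: location -> signal strength per access point.
radio_map = {
    "room_a": {"ap1": -40, "ap2": -60, "ap3": -80},
    "room_b": {"ap1": -70, "ap2": -45, "ap3": -65},
    "hall": {"ap1": -42, "ap2": -58, "ap3": -75},
}
index = build_radio_map_index(radio_map)
estimate = locate(index, radio_map, {"ap1": -41, "ap2": -59, "ap3": -79})
```

In this example the observation shares its strongest-AP order with `room_a` and `hall` only, so `room_b` is never compared, and the closer fingerprint `room_a` is returned.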

A High-Dimensional Index Structure Based on Singular Value Decomposition (Singular Value Decomposition 기반 고차원 인덱스 구조)

  • Kim, Sang-Wook;Aggarwal, Charu;Yu, Philip S.
    • Journal of Industrial Technology / v.20 no.B / pp.213-218 / 2000
  • The nearest neighbor query is an important operation widely used in multimedia databases for finding the object that is most similar to a given query object. Most techniques for processing nearest neighbor queries employ multidimensional indexes for effective indexing of objects. However, the performance of previous multidimensional indexes, which use N-dimensional rectangles or spheres to represent the capsule of an object cluster, deteriorates seriously as the number of dimensions increases. To resolve this problem, this paper proposes a new index structure based on singular value decomposition, together with a query processing method that uses it. We also verify the superiority of our approach through extensive performance experiments.


Top-down Hierarchical Clustering using Multidimensional Indexes (다차원 색인을 이용한 하향식 계층 클러스터링)

  • Hwang, Jae-Jun;Mun, Yang-Se;Hwang, Gyu-Yeong
    • Journal of KIISE:Databases / v.29 no.5 / pp.367-380 / 2002
  • Due to the recent increase in applications that require huge amounts of data, such as spatial data analysis and image analysis, clustering on large databases has been actively studied. In a hierarchical clustering method, a tree representing a hierarchical decomposition of the database is first created and then used for efficient clustering. Existing hierarchical clustering methods have mainly adopted the bottom-up approach, which creates the tree from the bottom to the topmost level of the hierarchy. These bottom-up methods require at least one scan over the entire database to build the tree and need to search most nodes of the tree, since the clustering algorithm starts from the leaf level. In this paper, we propose a novel top-down hierarchical clustering method that uses the multidimensional indexes already maintained in most database applications. Generally, multidimensional indexes have the clustering property of storing similar objects in the same (or adjacent) data pages. Using this property, we can find adjacent objects without calculating the distances among them. We first formally define a cluster based on the density of objects. For this definition, we propose the concept of the region contrast partition, which is based on the density of a region. To speed up the clustering algorithm, we use a branch-and-bound algorithm. We propose the bounds and formally prove their correctness. Experimental results show that the proposed method is at least as effective in clustering quality as BIRCH, a bottom-up hierarchical clustering method, while reducing the number of page accesses by a factor of 26 to 187, depending on the size of the database. As a result, we believe that the proposed method significantly improves clustering performance in large databases and is practically usable in various database applications.

An Assignment Method of Multidimensional Type Inheritance Indexes for XML Query Processing (XML 질의처리를 위한 다차원 타입상속 색인구조의 할당기법)

  • Lee, Jong-Hak
    • Journal of Korea Multimedia Society / v.12 no.1 / pp.1-15 / 2009
  • This paper presents an assignment method for multidimensional type inheritance indexes (MD-TIXs) to support the processing of XML queries in XML databases. An MD-TIX uses a multidimensional index structure to efficiently support nested predicates that involve both nested elements and type inheritance hierarchies. In this paper, we analyze the query processing strategy that uses MD-TIXs and present an assignment method for MD-TIXs in the framework of complex queries containing conjunctions of nested predicates, each involving an XPath with target type or domain type substitution. We first consider the MD-TIX operations caused by updates to XML databases and the use of an MD-TIX for a query containing a single nested predicate. We then consider the assignment of MD-TIXs in the framework of more general queries containing nested predicates over overlapping paths that share common subpaths.


A High-Dimensional Indexing Method for Nearest Neighbor Queries (최근접 질의를 위한 고차원 인덱싱 방법)

  • Kim, Sang-Uk;Aggarwal, Charu;Yu, Philip
    • Journal of KIISE:Databases / v.28 no.4 / pp.632-642 / 2001
  • The nearest neighbor query is an important operation widely used in multimedia databases for finding the object that is most similar to a given object. Most techniques for processing nearest neighbor queries employ multidimensional indexes for effective indexing of objects. However, the performance of previous multidimensional indexes, which use N-dimensional rectangles or spheres to represent the capsule of an object cluster, deteriorates seriously as the number of dimensions increases. In this paper, we first point out that the simple representation of capsules causes performance degradation in processing nearest neighbor queries. To alleviate this problem, we propose (1) adopting new axis systems appropriate to a given cluster, (2) representing various shapes of capsules by combining rectangles and spheres, and (3) maintaining outliers separately. We also verify the superiority of our approach through extensive performance experiments.


Optimal Design Method of Multidimensional Nested Attribute Indexes for Object-Oriented Query Processing (객체지향 질의처리를 위한 다차원 중포 속성 색인구조의 최적 설계기법)

  • Yoon, Dong-Ha;Lee, Jong-Hak
    • Proceedings of the Korea Information Processing Society Conference / 2002.11c / pp.1863-1866 / 2002
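  • This paper presents an optimal design method for the multidimensional nested attribute index (MD-NAI), which uses a multidimensional index structure as the index for nested attributes in object-oriented database systems. MD-NAI efficiently supports queries with nested predicates that involve class substitution in a class hierarchy, which nested attribute indexes built on one-dimensional index structures cannot support. However, the index search performance of MD-NAI can degrade severely depending on the form of the user queries. To improve the performance of MD-NAI for a given query form, we first determine the optimal shape of the MD-NAI index page regions from the query information on nested predicates, and then apply a region splitting strategy that makes the index page regions take this optimal shape. According to the performance evaluation, an optimal MD-NAI could be configured for a given query pattern, and for a three-dimensional MD-NAI the performance improved by up to 5.5 times depending on the query form.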


Physical Database Design for DFT-Based Multidimensional Indexes in Time-Series Databases (시계열 데이터베이스에서 DFT-기반 다차원 인덱스를 위한 물리적 데이터베이스 설계)

  • Kim, Sang-Wook;Kim, Jin-Ho;Han, Byung-Il
    • Journal of Korea Multimedia Society / v.7 no.11 / pp.1505-1514 / 2004
  • Sequence matching in time-series databases is an operation that finds the data sequences whose changing patterns are similar to that of a query sequence. Typically, sequence matching employs a multi-dimensional index for efficient processing. To alleviate the dimensionality curse of the multi-dimensional index in high-dimensional cases, previous methods for sequence matching apply the Discrete Fourier Transform (DFT) to data sequences and take only the first two or three DFT coefficients as the organizing attributes of the multi-dimensional index. This paper first points out the problems of such simple methods that take the first two or three coefficients, and proposes a novel solution for constructing the optimal multi-dimensional index. The proposed method analyzes the characteristics of a target database and, based on the analysis, identifies the organizing attributes with the best discrimination power. It also determines the optimal number of organizing attributes for efficient sequence matching by using a cost model. To show the effectiveness of the proposed method, we perform a series of experiments. The results show that the proposed method significantly outperforms the previous ones.
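Choosing organizing attributes by analyzing the database, rather than always taking the first few DFT coefficients, can be sketched as below. The abstract does not define its discrimination measure or cost model, so this sketch assumes the variance of coefficient magnitudes across sequences as a stand-in for discrimination power and takes the number of attributes `m` as a given parameter; the function names are hypothetical.

```python
import cmath

def dft(sequence):
    """Naive O(n^2) discrete Fourier transform returning complex coefficients."""
    n = len(sequence)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(sequence))
        for k in range(n)
    ]

def select_coefficients(sequences, m):
    """Pick the m coefficient positions whose magnitudes vary most across sequences."""
    spectra = [[abs(c) for c in dft(s)] for s in sequences]
    n = len(spectra[0])

    def variance(k):
        values = [spec[k] for spec in spectra]
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    # For real-valued input, coefficients k and n - k have equal magnitude,
    # so only positions 0..n//2 are considered as candidates.
    candidates = range(n // 2 + 1)
    return sorted(sorted(candidates, key=variance, reverse=True)[:m])

sequences = [
    [1, 2, 3, 4, 4, 3, 2, 1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [0, 1, 0, -1, 0, 1, 0, -1],
]
chosen = select_coefficients(sequences, m=2)
```

For this toy database the selected positions are not simply the first two: the alternating sequence puts its energy in a high-frequency coefficient, which therefore separates the sequences better than a low-frequency one.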
