Search | Korea Science

Performance Study of the Index-based Parallel Join

Jeong, Byeong-Soo;Edward Omiecinski
- The Journal of Information Technology and Database
- /
- v.2 no.2
- /
- pp.87-109
- /
- 1995
Index files have long been used to access database records effectively. The join operation in a relational database system requires a large execution time, especially when handling large tables. If indexes are available on the joining attributes of both relations involved in the join and the join selectivity is relatively small, we can improve the execution time of the join operation. In this paper, we investigate the performance trade-offs of parallel index-based join algorithms where different indexing schemes are used. We also present a comparison of our index-based parallel join algorithms with the hash-based parallel join algorithm.
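To illustrate why indexes on both joining attributes help, here is a minimal single-process sketch in Python. It is not the parallel algorithm studied in the paper: the sorted list of `(key, row position)` pairs only stands in for the leaf level of an index, and the names `build_index` and `index_merge_join` are hypothetical.

```python
def build_index(relation, key):
    """Return sorted (key value, row position) pairs, a stand-in for index leaves."""
    return sorted((row[key], pos) for pos, row in enumerate(relation))


def index_merge_join(r, r_index, s, s_index):
    """Join r and s by merging their indexes; rows are fetched only for matching keys."""
    result = []
    i = j = 0
    while i < len(r_index) and j < len(s_index):
        rk, sk = r_index[i][0], s_index[j][0]
        if rk < sk:
            i += 1
        elif rk > sk:
            j += 1
        else:
            # Find the run of entries with this key on each side.
            i_end = i
            while i_end < len(r_index) and r_index[i_end][0] == rk:
                i_end += 1
            j_end = j
            while j_end < len(s_index) and s_index[j_end][0] == rk:
                j_end += 1
            # Only now touch the base records.
            for _, rp in r_index[i:i_end]:
                for _, sp in s_index[j:j_end]:
                    result.append((r[rp], s[sp]))
            i, j = i_end, j_end
    return result
```

When few keys match (small join selectivity), most base records are never read, which is the situation the abstract describes as favorable.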

An Efficient Spatial Join Method Using DOT Index (DOT 색인을 이용한 효율적인 공간 조인 기법)

Back, Hyun;Yoon, Jee-Hee;Won, Jung-Im;Park, Sang-Hyun
- Journal of KIISE:Databases
- /
- v.34 no.5
- /
- pp.420-436
- /
- 2007
The choice of an effective indexing method is crucial to guarantee the performance of the spatial join operator, which is heavily used in geographical information systems. The $R^*$-tree based method is renowned as one of the most representative indexing methods. In this paper, we propose an efficient spatial join technique based on the DOT (Double Transformation) index, and compare it with the spatial join technique based on the $R^*$-tree index. The DOT index transforms the MBR of a spatial object into a single numeric value using a space-filling curve, and builds a $B^+$-tree from the set of numeric values obtained in this way. The DOT index can be employed as a primary index for spatial objects. The proposed spatial join technique exploits the regularities in the moving patterns of space-filling curves to divide a query region into a set of maximal sub-regions within which the space-filling curve traverses without interruption. This division reduces the number of spatial transformations required to perform the spatial join and thus improves the performance of join processing. Experiments with data sets of various distributions and sizes revealed that the proposed join technique is up to three times faster than the spatial join method based on the $R^*$-tree index.
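The idea of turning an MBR into one sortable number can be sketched in Python as follows. This is an illustration under assumptions, not the DOT index itself: the abstract does not say which space-filling curve is used, so a Z-order (Morton) curve is swapped in for simplicity, coordinates are assumed to be non-negative integers, and a sorted list stands in for the $B^+$-tree. The function names are hypothetical.

```python
def morton_encode(coords, bits=16):
    """Interleave the low `bits` bits of each coordinate into one integer (Z-order)."""
    code = 0
    ndims = len(coords)
    for b in range(bits):
        for i, c in enumerate(coords):
            code |= ((c >> b) & 1) << (b * ndims + i)
    return code


def mbr_key(mbr, bits=16):
    """Map a 2D MBR ((xlo, ylo), (xhi, yhi)) to one number via a 4D point."""
    (xlo, ylo), (xhi, yhi) = mbr
    return morton_encode((xlo, ylo, xhi, yhi), bits)


def build_sorted_keys(mbrs, bits=16):
    """Sorted (key, object position) pairs, a stand-in for B+-tree leaves."""
    return sorted((mbr_key(m, bits), pos) for pos, m in enumerate(mbrs))
```

Because the keys are plain integers, any ordered one-dimensional index can store them, which is what makes a primary index on spatial objects possible.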

An Efficient Block Index Scheme with Segmentation for Spatio-Textual Similarity Join

Xiang, Yiming;Zhuang, Yi;Jiang, Nan
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.11 no.7
- /
- pp.3578-3593
- /
- 2017
Given two collections of objects that carry both spatial and textual information in the form of tags, a Spatio-Textual-based object Similarity JOIN (ST-SJOIN) retrieves the pairs of objects that are textually similar and spatially close. In this paper, we have proposed a block index-based approach called BIST-JOIN to facilitate the efficient ST-SJOIN processing. In this approach, a dual-feature distance plane (DFDP) is first partitioned into some blocks based on four segmentation schemes, and the ST-SJOIN is then transformed into searching the object pairs falling in some affected blocks in the DFDP. Extensive experiments on real and synthetic datasets demonstrate that our proposed join method outperforms the state-of-the-art solutions.
https://doi.org/10.3837/tiis.2017.07.015
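The join condition itself ("textually similar and spatially close") can be made concrete with a brute-force Python baseline. This is only a definition-level sketch, not BIST-JOIN: the abstract does not name the similarity or distance measures, so Jaccard similarity on tag sets and Euclidean distance are assumptions, as are the threshold names `eps` and `tau`.

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two tag sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def st_sjoin_bruteforce(r, s, eps, tau):
    """Return index pairs (i, j) with distance <= eps and tag similarity >= tau.

    Each object is ((x, y), set_of_tags). Every pair is compared, which is
    the quadratic cost an index-based method tries to avoid.
    """
    pairs = []
    for i, ((x1, y1), tags1) in enumerate(r):
        for j, ((x2, y2), tags2) in enumerate(s):
            close = math.hypot(x1 - x2, y1 - y2) <= eps
            if close and jaccard(tags1, tags2) >= tau:
                pairs.append((i, j))
    return pairs
```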

A Data Mining Approach for Selecting Bitmap Join Indices

Bellatreche, Ladjel;Missaoui, Rokia;Necir, Hamid;Drias, Habiba
- Journal of Computing Science and Engineering
- /
- v.1 no.2
- /
- pp.177-194
- /
- 2007
Index selection is one of the most important decisions to take in the physical design of relational data warehouses. Indices significantly reduce the cost of processing complex OLAP queries, but they require storage and induce maintenance overhead. Two main types of indices are available: mono-attribute indices (e.g., B-tree, bitmap, hash, etc.) and multi-attribute indices (join indices, bitmap join indices). Bitmap join indices are well adapted to optimizing star join queries, which are characterized by joins between a large fact table and multiple dimension tables and by selections on the dimension tables. They require less storage due to their binary representation. However, selecting these indices is a difficult task due to the exponential number of candidate attributes to be indexed. Most approaches to index selection follow two main steps: (1) pruning the search space (i.e., reducing the number of candidate attributes) and (2) selecting indices using the pruned search space. In this paper, we first propose a data mining driven approach to prune the search space of the bitmap join index selection problem. As opposed to an existing technique that uses only the frequency of attributes in queries as a pruning metric, our technique uses not only frequencies, but also other parameters such as the size of the dimension tables involved in the indexing process, the size of each dimension tuple, and the page size on disk. We then define a greedy algorithm to select bitmap join indices that minimize processing cost and satisfy the storage constraint. Finally, in order to evaluate the efficiency of our approach, we compare it with some existing techniques.
https://doi.org/10.5626/JCSE.2007.1.2.177
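A greedy selection step under a storage constraint can be sketched in Python as follows. This is a generic benefit-per-size heuristic, not the paper's algorithm or cost model: the `benefit` and `size` numbers are assumed to come from some external cost function, and the name `greedy_select` is hypothetical.

```python
def greedy_select(candidates, storage_budget):
    """Pick candidate indices by benefit per unit of storage until the budget is used.

    candidates: list of (name, benefit, size) with size > 0 (assumed inputs).
    Returns (chosen names in pick order, total storage used).
    """
    chosen, used = [], 0
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    for name, _benefit, size in ranked:
        if used + size <= storage_budget:
            chosen.append(name)
            used += size
    return chosen, used
```

Pruning the candidate list first, as the abstract describes, matters because the ranking step is only as good as the candidates it is given.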

Semijoin-Based Spatial Join Processing in Multiple Sensor Networks

Kim, Min-Soo;Kim, Ju-Wan;Kim, Myoung-Ho
- ETRI Journal
- /
- v.30 no.6
- /
- pp.853-855
- /
- 2008
This paper presents an energy-efficient spatial join algorithm for multiple sensor networks employing a spatial semijoin strategy. For optimization of the algorithm, we propose a GR-tree index and a grid-ID-based spatial approximation method, which are unique to sensor networks. The GR-tree is a distributed spatial index over the sensor nodes, which efficiently prunes away the nodes that will not participate in a spatial join result. The grid-ID-based approximation provides great reduction in communication cost by approximating many spatial objects in simpler forms. Our experiments demonstrate that the algorithm outperforms existing methods in reducing energy consumption at the nodes.

A Study on Selecting Bitmap Join Index to Speed up Complex Queries in Relational Data Warehouses (관계형 데이터 웨어하우스의 복잡한 질의의 처리 효율 향상을 위한 비트맵 조인 인덱스 선택에 관한 연구)

An, Hyoung-Geun;Koh, Jae-Jin
- The KIPS Transactions:PartD
- /
- v.19D no.1
- /
- pp.1-14
- /
- 2012
Because data warehouses are large, the choice of indices strongly affects the efficiency of query processing. Indices lower the query processing cost, but they occupy large storage areas and incur maintenance costs whenever the database is updated. Bitmap join indices are well suited to optimizing star join queries, which join a fact table with many dimension tables and apply selections on the dimension tables. Although bitmap join indices have a low storage cost because of their binary representation, selecting the attributes to index from the huge number of candidate attributes is difficult. Index selection proceeds in two steps: reducing the number of candidate attributes to be indexed, and then selecting the indexing attributes. In this paper, we reduce the number of candidate attributes for the bitmap join index selection problem by using data mining techniques. Whereas existing techniques reduce the candidate attributes using only attribute frequencies, we consider attribute frequencies together with the size of the dimension tables, the size of the dimension tuples, and the disk page size. We use frequent itemset mining to reduce the large number of candidate attributes. We then apply cost functions to the bitmap join indices of the candidate attributes to build the indices with the lowest cost and the smallest storage area that satisfy the storage constraints. Finally, we compare our technique with existing techniques and analyze the results to evaluate its efficiency.
https://doi.org/10.3745/KIPSTD.2012.19D.1.001
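For readers unfamiliar with the structure being selected, a bitmap join index can be sketched in Python as follows. This is a toy illustration, not the paper's implementation: Python integers are used as bitmaps, every fact row is assumed to reference an existing dimension row, and the table and column names in the usage are made up.

```python
def build_bitmap_join_index(fact, dim, fk, dim_key, attr):
    """Map each value of the dimension attribute to a bitmap over fact rows.

    Bit `pos` is set when fact row `pos` joins to a dimension row whose
    `attr` equals that value, so the join is precomputed in the index.
    """
    key_to_value = {d[dim_key]: d[attr] for d in dim}
    bitmaps = {}
    for pos, row in enumerate(fact):
        value = key_to_value[row[fk]]
        bitmaps[value] = bitmaps.get(value, 0) | (1 << pos)
    return bitmaps


def rows_from_bitmap(bitmap):
    """Positions of the set bits, i.e. the qualifying fact rows."""
    return [pos for pos in range(bitmap.bit_length()) if (bitmap >> pos) & 1]
```

Selections on several dimension tables then combine with a bitwise AND of the corresponding bitmaps, without touching the dimension tables at query time.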

Spatial Join based on the Transform-Space View (변환공간 뷰를 기반으로한 공간 조인)

이민재;한욱신;황규영
- Journal of KIISE:Databases
- /
- v.30 no.5
- /
- pp.438-450
- /
- 2003
Spatial joins find pairs of objects that overlap with each other. In spatial joins using indexes, original-space indexes such as the R-tree are widely used. An original-space index is one that indexes objects as represented in the original space. Since original-space indexes deal with the sizes of objects, it is difficult to develop a formal algorithm without relying on heuristics. On the other hand, transform-space indexes, which transform objects in the original space into points in the transform space and index them, deal only with points and not with sizes. Thus, spatial join algorithms using these indexes are relatively simple and can be formally developed. However, the disadvantage of transform-space join algorithms is that they cannot be applied to original-space indexes such as the R-tree, which contain original-space objects. In this paper, we present a novel mechanism for achieving the best of these two types of algorithms. Specifically, we propose a new notion of the transform-space view and present the transform-space view join algorithm (TSVJ). A transform-space view is a virtual transform-space index based on an original-space index. It allows us to interpret a pre-built original-space index on the fly as a transform-space index without incurring any overhead and without actually modifying the structure of the original-space index or changing the object representation. The experimental results show that, compared to existing spatial join algorithms that use R-trees in the original space, the TSVJ improves the number of disk accesses by up to 43.1%. The most important contribution of this paper is to show that we can use original-space indexes, such as the R-tree, in the transform space by interpreting them through the notion of the transform-space view. We believe that this new notion provides a framework for developing various new spatial query processing algorithms in the transform space.
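The statement that transform-space methods "deal only with points" can be illustrated with a small Python sketch. It uses a common corner-style transformation, which is an assumption here since the abstract does not define the transformation; it is not the TSVJ algorithm, and the nested-loop join and function names are for illustration only.

```python
def to_transform_point(mbr):
    """Map a 2D MBR ((xlo, ylo), (xhi, yhi)) to the 4D point (xlo, xhi, ylo, yhi)."""
    (xlo, ylo), (xhi, yhi) = mbr
    return (xlo, xhi, ylo, yhi)


def overlaps(p, q):
    """Overlap test on transformed points: a pure comparison of coordinates."""
    pxlo, pxhi, pylo, pyhi = p
    qxlo, qxhi, qylo, qyhi = q
    return pxlo <= qxhi and qxlo <= pxhi and pylo <= qyhi and qylo <= pyhi


def spatial_join(r, s):
    """Nested-loop join over transformed points; returns overlapping index pairs."""
    rp = [to_transform_point(m) for m in r]
    sp = [to_transform_point(m) for m in s]
    return [(i, j) for i, p in enumerate(rp) for j, q in enumerate(sp) if overlaps(p, q)]
```

After the transformation, object extent no longer appears as a separate notion: overlap becomes a range condition on point coordinates, which is why such algorithms are easier to state formally.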

Evaluating Join Performance on Relational Database Systems

Ordonez, Carlos;Garcia-Garcia, Javier
- Journal of Computing Science and Engineering
- /
- v.4 no.4
- /
- pp.276-290
- /
- 2010
The join operator is fundamental in relational database systems. Evaluating join queries on large tables is challenging because records need to be efficiently matched based on a given key. In this work, we analyze join queries in SQL with large tables in which a foreign key may be null, invalid or valid, given a referential integrity constraint. We conduct an extensive join performance evaluation on three DBMSs. Specifically, we study join queries varying table sizes, row size and key probabilistic distribution, inserting null, invalid or valid foreign key values. We also benchmark three well-known query optimizations: view materialization, secondary index and join reordering. Our experiments show certain optimizations perform well across DBMSs, whereas other optimizations depend on the DBMS architecture.
https://doi.org/10.5626/JCSE.2010.4.4.276

A Multi-level Inverted Index Technique for Structural Document Search (구조화 문서 검색을 위한 다단계 역색인 기법)

Kim, Jong-Ik
- The KIPS Transactions:PartB
- /
- v.15B no.4
- /
- pp.355-364
- /
- 2008
In general, we can use an inverted index for retrieving element lists from structured documents. An inverted index can retrieve a list of elements that have the same tag name. In this approach, however, the cost of query processing is linear in the length of a path query because all the structural relationships (parent-child and ancestor-descendant) must be resolved by structural join operations. In this paper, we propose an inverted index technique and a novel structural join technique for accelerating XML path query evaluation. Our inverted index can retrieve element lists for path segments in a parent-child relationship. Our structural join technique can handle lists of element pairs, whereas existing techniques handle lists of elements. We show through experiments that integrating these two techniques accelerates the evaluation of XML path queries.
https://doi.org/10.3745/KIPSTB.2008.15-B.4.355
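The baseline that the abstract improves on, a tag-name inverted index plus one structural join per query step, can be sketched in Python. This is not the paper's multi-level index or its pair-based join: the `(start, end, level)` region encoding is a widely used labeling scheme assumed here for illustration, and the sample index contents are made up.

```python
def structural_join(ancestors, descendants, parent_child=False):
    """Pair up elements where the first contains the second.

    Elements are (start, end, level) region labels. Containment means
    a.start < d.start and d.end < a.end; parent-child also needs level + 1.
    A simple nested loop is used for clarity, not efficiency.
    """
    out = []
    for a in ancestors:
        for d in descendants:
            contains = a[0] < d[0] and d[1] < a[1]
            if contains and (not parent_child or d[2] == a[2] + 1):
                out.append((a, d))
    return out


# Hypothetical inverted index: tag name -> list of region labels.
inverted_index = {
    "book": [(1, 10, 1)],
    "title": [(2, 3, 2), (12, 13, 2)],
    "chapter": [(4, 9, 2)],
    "para": [(5, 6, 3)],
}
```

A path with k steps needs k - 1 such joins over per-tag lists, which is the linear cost the abstract refers to.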

An Optimal Way to Index Searching of Duality-Based Time-Series Subsequence Matching (이원성 기반 시계열 서브시퀀스 매칭의 인덱스 검색을 위한 최적의 기법)

Kim, Sang-Wook;Park, Dae-Hyun;Lee, Heon-Gil
- The KIPS Transactions:PartD
- /
- v.11D no.5
- /
- pp.1003-1010
- /
- 2004
In this paper, we address efficient processing of subsequence matching in time-series databases. We first point out the performance problems occurring in the index searching of a prior method for subsequence matching. Then, we propose a new method that resolves these problems. Our method starts by viewing the index searching of subsequence matching from a new angle, regarding it as a kind of spatial join called a window-join. To speed up the window-join, our method builds an R*-tree in main memory for the query sequence at the start of subsequence matching. Our method also includes a novel algorithm for effectively joining one R*-tree on disk, which is for data sequences, with another R*-tree in main memory, which is for a query sequence. This algorithm accesses each R*-tree page built on data sequences exactly once without incurring any index-level false alarms. Therefore, in terms of the number of disk accesses, the proposed algorithm proves to be optimal. Also, performance evaluation through extensive experiments shows the superiority of our method quantitatively.
https://doi.org/10.3745/KIPSTD.2004.11D.5.1003

