Search | Korea Science

MLPPI Wizard: An Automated Multi-level Partitioning Tool on Analytical Workloads

Suh, Young-Kyoon;Crolotte, Alain;Kostamaa, Pekka
- KSII Transactions on Internet and Information Systems (TIIS), v.12 no.4, pp.1693-1713, 2018
An important technique used by database administrators (DBAs) to improve performance in decision-support workloads associated with a star schema is multi-level partitioning. Queries then benefit from partition elimination, thanks to the constraints that the queries express on the dimension tables. Because multi-level partitioning can be overwhelming for a DBA, we propose a wizard that facilitates the task by calculating a partitioning scheme for a particular workload. The system resides completely on a client and interacts with the cost estimation subsystem of the query optimizer via an API over the network, thereby eliminating any need to change the optimizer. In addition, since only cost estimates are needed, the wizard's overhead is very low. By using a greedy algorithm for search space enumeration over the query predicates in the workload, the wizard is efficient, with worst-case polynomial complexity. The proposed technology can be applied to any clustering or partitioning scheme in any database management system that provides an interface to the query optimizer. Applied to the Teradata database, the technology provides recommendations that outperform a human expert's solution as measured by the total execution time of the workload. We also demonstrate the scalability of our approach as the fact table (and workload) size increases.
https://doi.org/10.3837/tiis.2018.04.016
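
The greedy enumeration described in the abstract above can be illustrated with a minimal sketch. This is not the paper's algorithm: the candidate names, the `toy_cost` function, and its numbers are hypothetical stand-ins for the optimizer's cost-estimation API, chosen only to show why a greedy search needs a polynomial number of cost calls (at most n + (n-1) + ... + 1 for n candidates).

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def greedy_partitioning(
    candidates: Iterable[str],
    estimate_cost: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedily add the candidate that lowers the estimated cost the most."""
    chosen: FrozenSet[str] = frozenset()
    best_cost = estimate_cost(chosen)
    remaining = set(candidates)
    while remaining:
        # Ask the (stand-in) optimizer for the cost of each one-step extension.
        scored = [(estimate_cost(chosen | {c}), c) for c in sorted(remaining)]
        cost, pick = min(scored)
        if cost >= best_cost:
            break  # no extension improves the workload cost
        chosen, best_cost = chosen | {pick}, cost
        remaining.remove(pick)
    return chosen, best_cost


def toy_cost(scheme: FrozenSet[str]) -> float:
    """Hypothetical cost: each level saves scan work but adds fixed overhead."""
    savings = {"d_date.year": 40.0, "d_store.region": 25.0, "d_item.category": 1.0}
    return 100.0 - sum(savings[c] for c in scheme) + 3.0 * len(scheme)


scheme, cost = greedy_partitioning(
    ["d_date.year", "d_store.region", "d_item.category"], toy_cost
)
```

With these made-up numbers the search keeps the two levels whose savings exceed the per-level overhead and stops before adding the third.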

A Selectivity Estimation Scheme for Spatial Topological Predicate Using Multi-Dimensional Histogram (다차원 히스토그램을 이용한 공간 위상 술어의 선택도 추정 기법)

Kim, Hong-Yeon;Bae, Hae-Yeong
- The Transactions of the Korea Information Processing Society, v.6 no.4, pp.841-850, 1999
Many commercial database systems maintain histograms to summarize the contents of relations and to permit efficient estimation of query result sizes and access plan costs. In spatial database systems, most query predicates consist of topological relationships between spatial objects, and it is very important for the spatial query optimizer to estimate the selectivity of those predicates. In this paper, we propose a selectivity estimation scheme for spatial topological predicates based on a multi-dimensional histogram and a transformation scheme. The proposed scheme applies two partitioning strategies to the transformed object space to generate a spatial histogram, and estimates the selectivity of topological predicates based on the topological characteristics of the transformed space. The proposed scheme lets the spatial query optimizer estimate selectivity without excessive memory usage or additional I/Os.

Cost Models of Energy-based Query Optimization for Flash-aware Embedded DBMS (플래시 기반 임베디드 DBMS의 전력기반 질의 최적화를 위한 비용 모델)

Kim, Do-Yun;Park, Sang-Won
- Journal of the Institute of Electronics Engineers of Korea CI, v.45 no.3, pp.75-85, 2008
DBMSs are widely used in embedded systems, where flash memory serves as the storage device. The optimizers of existing database systems assume that the storage device is a disk. Unlike a disk, flash memory incurs overhead on overwrite: a block must be erased before it can be written. For this reason, a disk-based query optimization model is not adequate for a flash-aware database. In particular, an embedded system should minimize energy consumption, yet it consumes more energy because of excessive erase operations. This paper proposes a new energy-based cost model for embedded databases and compares the disk-based cost model with the energy-based cost model.
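
The contrast between a disk-based and an energy-based cost model in the abstract above can be sketched as follows. The per-operation energy values, the block size, and the two plans are hypothetical illustrations, not figures from the paper; the only idea taken from the abstract is that writes on flash also trigger block erases, which a pure I/O count does not see.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlashEnergy:
    # Hypothetical per-operation energy in microjoules (assumed values).
    read_uj: float = 1.0
    write_uj: float = 10.0
    erase_uj: float = 50.0
    pages_per_block: int = 64


@dataclass(frozen=True)
class Plan:
    name: str
    page_reads: int
    page_writes: int


def io_count_cost(plan: Plan) -> int:
    """Disk-style cost: every page I/O counts the same."""
    return plan.page_reads + plan.page_writes


def energy_cost(plan: Plan, e: FlashEnergy = FlashEnergy()) -> float:
    """Energy-style cost: writes are dearer and also force block erases."""
    erases = -(-plan.page_writes // e.pages_per_block)  # ceiling division
    return (
        plan.page_reads * e.read_uj
        + plan.page_writes * e.write_uj
        + erases * e.erase_uj
    )


plans = [
    Plan("sort_with_temp_files", page_reads=1000, page_writes=600),
    Plan("rescan_without_writes", page_reads=2500, page_writes=0),
]
best_by_io = min(plans, key=io_count_cost)
best_by_energy = min(plans, key=energy_cost)
```

Under these assumed numbers the two models disagree: the I/O count favors the plan that writes temporary pages, while the energy model favors the read-only plan.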

Block Histogram Compression Method for Selectivity Estimation in High-dimensions (고차원에서 선택율 추정을 위한 블록 히스토그램 압축방법)

Lee, Ju-Hong;Jeon, Seok-Ju;Park, Seon
- The KIPS Transactions:PartD, v.10D no.6, pp.927-934, 2003
A database query optimizer estimates the selectivity of a query to find the most efficient access plan. A multi-dimensional selectivity estimation technique is required for a query with multiple attributes because the attributes are not independent of each other. Histograms are used in most commercial database products because they approximate data distributions with small overhead and small error rates. However, a histogram is inadequate for a query with multiple attributes because it incurs high storage overhead and high error rates. In this paper, we propose a novel method for multi-dimensional selectivity estimation. Compressed information from a large number of small-sized histogram buckets is maintained using the discrete cosine transform. This enables low error rates and low storage overhead even in high dimensions. Extensive experimental results show the advantages of the proposed approach.
https://doi.org/10.3745/KIPSTD.2003.10D.6.927
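
The idea of keeping histogram information as discrete cosine transform coefficients, as described in the abstract above, can be sketched in one dimension. This is a simplification under stated assumptions: the paper targets multi-dimensional histograms, whereas the sketch uses a 1-D histogram, keeps only the first few (low-frequency) coefficients, and uses hypothetical function names and bucket counts.

```python
import math
from typing import List, Sequence


def dct2(values: Sequence[float]) -> List[float]:
    """Unnormalised DCT-II of a 1-D sequence of bucket counts."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi / n * (i + 0.5) * k) for i, v in enumerate(values))
        for k in range(n)
    ]


def idct2(coeffs: Sequence[float], n: int) -> List[float]:
    """Inverse of dct2; coefficients beyond len(coeffs) are treated as zero."""
    return [
        coeffs[0] / n
        + (2.0 / n)
        * sum(
            c * math.cos(math.pi / n * (i + 0.5) * k)
            for k, c in enumerate(coeffs)
            if k > 0
        )
        for i in range(n)
    ]


def compress(bucket_counts: Sequence[float], keep: int) -> List[float]:
    """Store only the first `keep` low-frequency coefficients."""
    return dct2(bucket_counts)[:keep]


def estimate_selectivity(coeffs: Sequence[float], n_buckets: int, lo: int, hi: int) -> float:
    """Estimate the fraction of rows falling in buckets lo..hi-1."""
    approx = idct2(coeffs, n_buckets)
    return sum(approx[lo:hi]) / sum(approx)


counts = [40, 38, 35, 30, 22, 15, 8, 4]   # hypothetical bucket counts
coeffs = compress(counts, keep=3)          # 3 numbers instead of 8
estimate = estimate_selectivity(coeffs, len(counts), 0, 4)
exact = sum(counts[0:4]) / sum(counts)
```

Because the first coefficient is the sum of all buckets, the total row count survives truncation; only the shape of the distribution is approximated.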

Cost Model for Parallel Spatial Joins using Fixed Grids (고정 그리드를 이용한 병렬 공간 조인을 위한 비용 모델)

Kim, Jin-Deog;Hong, Bong-Hee
- Journal of KIISE:Databases, v.28 no.4, pp.665-676, 2001
The most expensive spatial operation in a spatial database is the spatial join, which computes a combined table whose tuples each consist of two tuples, one from each of the two tables, that satisfy a spatial predicate. Although the execution time of sequential spatial join processing has improved considerably, the response time is still not tolerable because it does not meet the requirements of interactive users. Parallel processing is usually appropriate for improving the performance of spatial join processing. In spatial databases, fixed grids consisting of regularly partitioned cells can be employed, but previous work on spatial joins has not studied parallel processing of spatial joins using fixed grids. This paper presents an analytical cost model that estimates the comparative performance of a parallel spatial join algorithm based on fixed grids in terms of the number of MBR comparisons, disk accesses, and message passing. Several experiments on synthetic and real datasets show that the proposed analytical model is very accurate. This cost model is also expected to be used for implementing a very important DBMS component, the query processing optimizer.
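
One ingredient of the cost model described in the abstract above, the number of MBR comparisons on a fixed grid, can be sketched as follows. This is an illustrative assumption, not the paper's model: it ignores disk accesses and message passing, assigns grid cells to workers round-robin, and uses hypothetical function names and rectangles given as `(xmin, ymin, xmax, ymax)`.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple

MBR = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Cell = Tuple[int, int]


def cells_for(mbr: MBR, cell_size: float) -> Iterator[Cell]:
    """Yield every fixed-grid cell that the rectangle overlaps."""
    xmin, ymin, xmax, ymax = mbr
    for cx in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for cy in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            yield (cx, cy)


def comparisons_per_cell(r: Sequence[MBR], s: Sequence[MBR], cell_size: float) -> Dict[Cell, int]:
    """Count candidate MBR comparisons (|R in cell| * |S in cell|) per cell."""
    r_count: Dict[Cell, int] = defaultdict(int)
    s_count: Dict[Cell, int] = defaultdict(int)
    for mbr in r:
        for cell in cells_for(mbr, cell_size):
            r_count[cell] += 1
    for mbr in s:
        for cell in cells_for(mbr, cell_size):
            s_count[cell] += 1
    return {c: r_count[c] * s_count[c] for c in r_count if c in s_count}


def parallel_cost(per_cell: Dict[Cell, int], workers: int) -> int:
    """Estimated parallel cost: the busiest worker under round-robin assignment."""
    loads: List[int] = [0] * workers
    for i, cell in enumerate(sorted(per_cell)):
        loads[i % workers] += per_cell[cell]
    return max(loads)


R = [(0, 0, 1, 1), (5, 5, 6, 6), (3, 3, 5, 5)]
S = [(0.5, 0.5, 1.5, 1.5), (9, 9, 9.5, 9.5), (4.5, 4.5, 5.5, 5.5)]
per_cell = comparisons_per_cell(R, S, cell_size=4)
cost = parallel_cost(per_cell, workers=2)
```

In this toy input the grid reduces the candidate comparisons from the 9 of a nested loop to 4, split evenly across two workers.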
