• Title/Summary/Keyword: theta join

Using a Greedy Algorithm for the Improvement of a MapReduce, Theta join, M-Bucket-I Heuristic (그리디 알고리즘을 이용한 맵리듀스 세타조인 M-Bucket-I 휴리스틱의 개선)

  • Kim, Wooyeol;Shim, Kyuseok
    • Journal of KIISE, v.43 no.2, pp.229-236, 2016
  • Theta join is one of the essential query types in database systems. As the amount of data to be processed increases, processing theta joins on a single machine becomes impractical, so theta-join algorithms built on distributed computing frameworks have been widely studied. One of the state-of-the-art theta-join algorithms uses the M-Bucket-I heuristic, but it is hard to use in practice because the running time of the heuristic, which computes a mapping from each record to a reducer (i.e., the reducer mapping), is O(n), where n is the size of the input data. In this paper, we propose the MBI-I algorithm, which reduces the running time of the M-Bucket-I heuristic to $O(r_{max} \log n)$ and produces the same result as the M-Bucket-I heuristic. We also conducted several experiments to evaluate our algorithm and confirmed that it can improve the performance of a theta join by 10%.
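
For readers unfamiliar with the operation, a theta join pairs records from two inputs whenever an arbitrary predicate holds, not only equality. The following is a minimal single-machine sketch of that definition, not the paper's MapReduce algorithm or the M-Bucket-I heuristic; the function name `theta_join` and the sample data are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

R = TypeVar("R")
S = TypeVar("S")


def theta_join(
    left: Sequence[R],
    right: Sequence[S],
    theta: Callable[[R, S], bool],
) -> List[Tuple[R, S]]:
    """Naive nested-loop theta join: keep every pair that satisfies theta."""
    return [(r, s) for r in left for s in right if theta(r, s)]


# Hypothetical sample data: (id, amount) and (label, threshold).
orders = [("o1", 5), ("o2", 12)]
limits = [("low", 10), ("high", 20)]

# Inequality predicate: the order amount is below the threshold.
pairs = theta_join(orders, limits, lambda o, l: o[1] < l[1])
```

Because every pair of records may need to be compared, the work grows with the product of the input sizes, which is why distributing the comparison across reducers matters for large inputs.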

A Flexible Query Processing System for XML Regular Path Expressions (XML 정규 경로식을 위한 유연한 질의 처리 시스템)

  • 김대일;김기창;김유성
    • Journal of KIISE:Databases, v.30 no.6, pp.641-650, 2003
  • The eXtensible Markup Language (XML) is emerging as a standard format for data representation and exchange on the Internet. There has been research on storing and retrieving XML documents in relational databases, which offer mature techniques for large-scale data processing, recovery, concurrency control, and so on. Because previous systems use the same structure information and the same fundamental operations to process all kinds of XML queries, they can process only some specific queries efficiently, not all query types. In this paper, we propose a flexible query processing system. To process queries efficiently, the proposed system analyzes regular path expression queries and uses a $\theta$-join on region numbering values to check ancestor-descendant relationships and an equi-join on the parent's region start value to check parent-child relationships. The proposed system thus processes XML regular path expressions efficiently. Experimental results show that the proposed XML query processing system is more efficient than previous systems.
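
The two checks described in this abstract can be illustrated with a small sketch. It assumes a common region numbering scheme in which each element stores a `(start, end)` interval that encloses the intervals of all its descendants, plus the start value of its parent; the abstract does not spell out the exact scheme, and the `Node` class, function names, and sample document are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    tag: str
    start: int                    # region start value (assumed scheme)
    end: int                      # region end value (assumed scheme)
    parent_start: Optional[int]   # region start of the parent, None for the root


# Hypothetical document: <book><chapter><title/></chapter><author/></book>
nodes = [
    Node("book", 1, 8, None),
    Node("chapter", 2, 5, 1),
    Node("title", 3, 4, 2),
    Node("author", 6, 7, 1),
]


def descendants(ns: List[Node], anc: str, desc: str) -> List[Tuple[Node, Node]]:
    """Theta join: the descendant's region lies strictly inside the ancestor's."""
    return [
        (a, d)
        for a in ns if a.tag == anc
        for d in ns if d.tag == desc and a.start < d.start and d.end < a.end
    ]


def children(ns: List[Node], parent: str, child: str) -> List[Tuple[Node, Node]]:
    """Equi-join: the child's stored parent start equals the parent's start."""
    return [
        (p, c)
        for p in ns if p.tag == parent
        for c in ns if c.tag == child and c.parent_start == p.start
    ]


book_desc_title = descendants(nodes, "book", "title")   # like book//title
book_child_title = children(nodes, "book", "title")     # like book/title
book_child_author = children(nodes, "book", "author")   # like book/author
```

In this sketch `title` is found as a descendant of `book` through the inequality predicate, but not as a child, because its stored parent start refers to `chapter`.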

System Size and Service Size Distributions of a Batch Service Queue

  • Lee, Soon-Seok;Lee, Ho-Woo;Yoon, Seung-Hyun;Nadrajan, R.
    • Journal of the Korean Operations Research and Management Science Society, v.18 no.3, pp.179-186, 1993
  • We derive the arbitrary time point system size distribution of the $M/G^{B}/1$ queue in which late arrivals are not allowed to join the ongoing service. The distribution is given by $P(z) = P_{4}(z)\,S^{*}(\lambda - \lambda z)$, where $P_{4}(z)$ is the probability generating function of the queue size and $S^{*}(\theta)$ is the Laplace-Stieltjes transform of the service time distribution function. We also derive the distribution of the service size at an arbitrary point in time.

Query Reorganization Scheme supporting Parallel Query Processing of Theta Join and Nested SQL on Distributed CUBRID (분산 CUBRID 상에서 세타 조인 및 중첩 SQL 병렬 질의처리를 지원하는 질의 재구성 기법)

  • Yang, Hyeon-Sik;Kim, Hyeong-Jin;Chang, Jae-Woo
    • Proceedings of the Korea Contents Association Conference, 2014.11a, pp.37-38, 2014
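  • With the recent growth of SNS, the amount of data has increased rapidly, and research on distributed-DBMS-based query processing for big data is accordingly being actively conducted. To this end, CUBRID provides the CUBRID Shard service, which horizontally partitions a database into shards and stores the data across different physical nodes. However, because CUBRID Shard manages the data of each shard independently, it cannot process queries that need to reference tables on multiple servers, such as theta joins and nested queries. In this paper, we therefore propose a query reorganization scheme that supports theta joins and nested SQL on distributed CUBRID.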
