• Title/Summary/Keyword: aggregation query

Transformation of Continuous Aggregation Join Queries over Data Streams

  • Tran, Tri Minh;Lee, Byung-Suk
    • Journal of Computing Science and Engineering / v.3 no.1 / pp.27-58 / 2009
  • Aggregation join queries are an important class of queries over data streams. These queries involve both join and aggregation operations, with window-based joins followed by an aggregation on the join output. All existing research addresses join query optimization and aggregation query optimization as separate problems. We observe that, by putting them within the same scope of query optimization, more efficient query execution plans are possible through more versatile query transformations. The enabling idea is to perform aggregation before join so that the join execution time may be reduced. Some research has been done on such query transformations in relational databases, but none in data streams. Doing it in data streams brings new challenges due to the incremental and continuous arrival of tuples. These challenges are addressed in this paper. Specifically, we first present a query processing model geared to facilitate query transformations and propose a query transformation rule specialized to work with streams. The rule is simple and yet covers all possible cases of transformation. Then we present a generic query processing algorithm that works with all alternative query execution plans possible with the transformation, and develop the cost formulas of the query execution plans. Based on the processing algorithm, we validate the rule theoretically by proving the equivalence of query execution plans. Finally, through extensive experiments, we validate the cost formulas and study the performance of alternative query execution plans.
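The "aggregation before join" idea in this abstract can be illustrated with a small sketch. It is not the paper's stream algorithm or its transformation rule: it assumes static in-memory inputs, an equi-join on a single key, and a SUM aggregate grouped by that key. The function names and sample tuples are hypothetical.

```python
from collections import defaultdict

def join_then_aggregate(left, right):
    """Baseline plan: join on the key, then SUM(left value) per key."""
    totals = defaultdict(int)
    for lk, lv in left:
        for rk, _ in right:
            if lk == rk:
                totals[lk] += lv
    return dict(totals)

def aggregate_then_join(left, right):
    """Transformed plan: pre-aggregate each input per key, then join the partials."""
    left_sum = defaultdict(int)      # SUM of left values per key
    for lk, lv in left:
        left_sum[lk] += lv
    right_count = defaultdict(int)   # number of right tuples per key
    for rk, _ in right:
        right_count[rk] += 1
    # Every left tuple with key k matches right_count[k] right tuples,
    # so the joined SUM is the pre-aggregated sum scaled by that count.
    return {k: s * right_count[k] for k, s in left_sum.items() if k in right_count}

# Hypothetical sample data: (key, value) and (key, payload) tuples.
left = [("a", 1), ("a", 2), ("b", 5), ("c", 7)]
right = [("a", "x"), ("a", "y"), ("b", "z")]
print(join_then_aggregate(left, right))
print(aggregate_then_join(left, right))
```

Both plans return the same result on this data, but the second one joins one partial tuple per key instead of every raw tuple, which is where the potential saving in join time comes from.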

An Efficient Indexing Structure for Multidimensional Categorical Range Aggregation Query

  • Yang, Jian;Zhao, Chongchong;Li, Chao;Xing, Chunxiao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.597-618 / 2019
  • Categorical range aggregation, which is conceptually equivalent to running a range aggregation query separately on multiple datasets, returns the query result on each dataset. The challenge is that when the number of datasets is as large as hundreds or thousands, the query takes a lot of computation time and I/O. Previous work has addressed range restrictions on only a single dimension, whereas in practice more and more applications compute statistics under range restrictions on multiple dimensions. We propose MCRI-Tree, an index structure designed to solve multi-dimensional categorical range aggregation (CRA) queries, which can utilize main memory to maximize the efficiency of CRA queries. Specifically, the MCRI-Tree answers any query in $O(nk^{n-1})$ I/Os (where $n$ is the number of dimensions, and $k$ denotes the maximum number of pages covered in one dimension among all the $n$ dimensions during a query). The practical efficiency of our technique is demonstrated with extensive experiments.

Efficient Authentication of Aggregation Queries for Outsourced Databases (아웃소싱 데이터베이스에서 집계 질의를 위한 효율적인 인증 기법)

  • Shin, Jongmin;Shim, Kyuseok
    • Journal of KIISE / v.44 no.7 / pp.703-709 / 2017
  • Database outsourcing offloads storage and computationally intensive tasks to a third-party server. Data owners can therefore manage big data and handle queries from clients without building a costly infrastructure. However, because network systems are insecure, the third-party server may be untrusted, and the query results it returns may have been tampered with. This problem has motivated significant research efforts on authenticating various queries such as range queries, kNN queries, and function queries. Although aggregation queries play a key role in analyzing big data, authenticating aggregation queries has not been extensively studied, and previous works are not efficient for data with high dimensionality or a large number of distinct values. In this paper, we propose the AMR-tree, a data structure for authenticating aggregation queries. We also propose an efficient proof construction method and a verification method based on the AMR-tree. Furthermore, we validate the performance of the proposed algorithm through experiments that vary parameters such as the number of distinct values, the number of records, and the dimension of the data.

Routing Techniques for Data Aggregation in Sensor Networks

  • Kim, Jeong-Joon
    • Journal of Information Processing Systems / v.14 no.2 / pp.396-417 / 2018
  • GR-tree and query aggregation techniques have been proposed as conventional spatial query processing techniques for wireless sensor networks. Although these techniques consider spatial query optimization, they do not take temporal query optimization into consideration. Moreover, the index reorganization cost and the communication cost for the parent sensor nodes increase energy consumption, which works against efficient operation of the wireless sensor nodes. This paper proposes the itinerary-based R-tree (IR-tree) for more efficient spatial-temporal query processing in wireless sensor networks. It compares the performance of IR-tree with that of the conventional spatial query processing techniques from previous studies, with regard to the accuracy of the query results, energy consumption, and query processing time, using wireless sensor data with Uniform, Gauss, and Skew distributions. The results demonstrate the superiority of the proposed IR-tree-based space-time indexing.

A FRAMEWORK FOR QUERY PROCESSING OVER HETEROGENEOUS LARGE SCALE SENSOR NETWORKS

  • Lee, Chung-Ho;Kim, Min-Soo;Lee, Yong-Joon
    • Proceedings of the KSRS Conference / 2007.10a / pp.101-104 / 2007
  • Efficient query processing and optimization are critical for reducing network traffic and decreasing query latency when accessing and manipulating sensor data in large-scale sensor networks. These issues have been studied in sensor database projects. These works have mainly focused on in-network query processing for sensor networks and assume homogeneous sensor networks, where each sensor network has the same hardware and software configuration. In this paper, we present a framework for efficient query processing over heterogeneous sensor networks. Our proposed framework introduces a query processing paradigm that considers two heterogeneous characteristics of sensor networks: (1) the data dissemination approach, such as push, pull, or hybrid; (2) the query processing capability of the sensor networks, that is, whether they support in-network aggregation, spatial, periodic, and conditional operators. Additionally, we propose multi-query optimization strategies supporting cross-translation between data acquisition queries and data stream queries to minimize the total cost of multiple queries. The framework has been implemented in COSMOS, a WSN middleware developed by ETRI.


Efficient Processing of MAX-of-SUM Queries in OLAP (OLAP에서 MAX-of-SUM 질의의 효율적인 처리 기법)

  • Cheong, Hee-Jeong;Kim, Dong-Wook;Kim, Jong-Soo;Lee, Yoon-Joon;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.27 no.2 / pp.165-174 / 2000
  • Recent research on range queries in OLAP is concerned only with applying an aggregation operator over a certain region. However, data analysts in the real world need not only the simple range query pattern but also an extended range query pattern that finds ranges satisfying a special condition specified with several aggregation operators. In this work, we define the general form of the extended range query and propose an efficient processing method for the 'MAX-of-SUM' query, which is the representative form of the extended range query pattern. The MAX-of-SUM query finds the range that has the maximum range sum value in a data cube, where the size of the range is given. The proposed query processing method is based on predicting the scope of the range sum values. That is, the search space of the query processing can be reduced by using the result of the prediction, and hence the query response time is also reduced.
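To make the MAX-of-SUM query concrete, here is a minimal brute-force sketch on a two-dimensional cube. It does not implement the prediction-based pruning the abstract describes; it only evaluates every candidate range of the given size, using a prefix-sum array so each range sum costs constant time. The function names and the sample grid are hypothetical.

```python
def prefix_sums(grid):
    """Return P with P[i][j] = sum of grid[0:i][0:j] (one extra row and column of zeros)."""
    rows, cols = len(grid), len(grid[0])
    p = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            p[i + 1][j + 1] = grid[i][j] + p[i][j + 1] + p[i + 1][j] - p[i][j]
    return p

def range_sum(p, r0, c0, r1, c1):
    """Sum over rows r0..r1-1 and columns c0..c1-1 (half-open bounds)."""
    return p[r1][c1] - p[r0][c1] - p[r1][c0] + p[r0][c0]

def max_of_sum(grid, h, w):
    """Return (best_sum, (row, col)) of the h-by-w range with the largest sum."""
    p = prefix_sums(grid)
    best = None
    for r in range(len(grid) - h + 1):
        for c in range(len(grid[0]) - w + 1):
            s = range_sum(p, r, c, r + h, c + w)
            if best is None or s > best[0]:
                best = (s, (r, c))
    return best

grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
print(max_of_sum(grid, 2, 2))  # (28, (1, 1))
```

A pruning method such as the one the abstract proposes would aim to skip most of the candidate positions that this baseline visits.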


Extension of Aggregate Functions for Spatiotemporal Data Analysis (데이타 분석을 위한 시공간 집계 함수의 확장)

  • Chi Jeong Hee;Shin Hyun Ho;Kim Sang Ho;Ryu Keun Ho
    • Journal of KIISE:Databases / v.32 no.1 / pp.43-55 / 2005
  • Spatiotemporal databases support recording and querying spatiotemporal data by offering both spatial management and historical information on various types of objects in the real world. They allow us to answer real-world queries such as: 'What is the average volume of pesticide sprayed on each farm land from April to August 2001, within some query window?' Such aggregation queries have both temporal and spatial constraints. However, previous work on aggregation addresses only temporal aggregation or only spatial aggregation, and is therefore difficult to apply directly to spatiotemporal data, which have both spatial and temporal constraints. Therefore, in this paper, we propose spatiotemporal aggregate functions for the analysis of spatiotemporal data, such as stCOUNT, stSUM, stAVG, stMAX, and stMIN. By applying them to an estate management system, we also show that our proposal makes queries in application systems more convenient and facilitates analysis of spatiotemporal data that the previous temporal or spatial aggregate functions cannot analyze. Then, we show the validity of our algorithm's performance through an evaluation of the spatiotemporal aggregate functions.
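The example query in this abstract combines a spatial window with a time interval. The sketch below shows one way such an average could be computed over point records. It is an illustration only, not the paper's definition of stAVG: the `Spray` record type, the `st_avg` function, and the sample data are all hypothetical, and the window is assumed to be an axis-aligned rectangle with inclusive bounds.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

@dataclass
class Spray:
    farm: str      # farm land identifier
    x: float       # location of the spraying event
    y: float
    day: date      # date of the spraying event
    volume: float  # volume of pesticide sprayed

def st_avg(records, window, start, end):
    """Average volume per farm over records inside the window and the [start, end] interval."""
    xmin, ymin, xmax, ymax = window
    totals = defaultdict(lambda: [0.0, 0])  # farm -> [sum of volumes, record count]
    for r in records:
        in_space = xmin <= r.x <= xmax and ymin <= r.y <= ymax
        in_time = start <= r.day <= end
        if in_space and in_time:
            totals[r.farm][0] += r.volume
            totals[r.farm][1] += 1
    return {farm: s / n for farm, (s, n) in totals.items()}

records = [
    Spray("A", 1, 1, date(2001, 5, 1), 10.0),
    Spray("A", 1, 1, date(2001, 6, 1), 20.0),
    Spray("A", 1, 1, date(2001, 9, 1), 100.0),  # outside the time interval
    Spray("B", 9, 9, date(2001, 5, 1), 5.0),    # outside the query window
]
print(st_avg(records, (0, 0, 5, 5), date(2001, 4, 1), date(2001, 8, 31)))  # {'A': 15.0}
```

A purely temporal or purely spatial aggregate would apply only one of the two filters, which is the limitation the abstract points out.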

Development of an Event Stream Processing System for the Vehicle Telematics Environment

  • Kim, Jong-Ik;Kwon, Oh-Cheon;Kim, Hyun-Suk
    • ETRI Journal / v.31 no.4 / pp.463-465 / 2009
  • In this letter, we present an event stream processing system that can evaluate a pattern query for a data sequence with predicates. We propose a pattern query language and develop a pattern query processing system. In our system, we propose novel techniques for run-time aggregation and negation processing and apply our system to stream data generated from vehicles to monitor unusual driving patterns.

An Algorithm for Computing Range-Groupby Queries (영역-그룹화 질의 계산 알고리즘)

  • Lee, Yeong-Gu;Mun, Yang-Se;Hwang, Gyu-Yeong
    • Journal of KIISE:Databases / v.29 no.4 / pp.247-261 / 2002
  • Aggregation is an important operation that affects the performance of OLAP systems. In this paper we define a new class of aggregation queries, called range-groupby queries, and present a method for processing them. A range-groupby query is defined as a query that, for an arbitrarily specified region of an n-dimensional cube, computes aggregations for each combination of values of the grouping attributes. Range-groupby queries are used very frequently in analyzing information in MOLAP since they allow us to summarize various trends in an arbitrarily specified subregion of the domain space. In MOLAP applications, in order to improve the performance of query processing, a method of maintaining precomputed aggregation results, called the prefix-sum array, is widely used. For the case of range-groupby queries, however, maintaining precomputed aggregation results for each combination of the grouping attributes incurs enormous storage overhead. Here, we propose a fast algorithm that can compute range-groupby queries with minimal storage overhead. Our algorithm maintains only one prefix-sum array and still effectively processes range-groupby queries for all possible combinations of the grouping attributes. Compared with the method that maintains a prefix-sum array for each combination of the grouping attributes in an n-dimensional cube, our algorithm reduces the space overhead by (equation omitted), while accessing a similar number of cells.
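As a rough illustration of answering a range-groupby query from a single prefix-sum array, the sketch below groups a two-dimensional range by its first dimension. This is an assumed, simplified reading and not the algorithm from the paper: each group is treated as a one-row-wide slice of the queried range, and the names `build_prefix` and `range_groupby` are hypothetical.

```python
def build_prefix(cube):
    """Return P with P[i][j] = sum of cube[0:i][0:j] for a 2-D cube."""
    rows, cols = len(cube), len(cube[0])
    p = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            p[i + 1][j + 1] = cube[i][j] + p[i][j + 1] + p[i + 1][j] - p[i][j]
    return p

def range_groupby(p, r0, r1, c0, c1):
    """SUM over rows r0..r1-1 and columns c0..c1-1, grouped by row index.

    Each group is a one-row slice of the range, so it costs four lookups
    in the same prefix-sum array that serves ungrouped range sums.
    """
    return {r: p[r + 1][c1] - p[r][c1] - p[r + 1][c0] + p[r][c0]
            for r in range(r0, r1)}

cube = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
p = build_prefix(cube)
print(range_groupby(p, 0, 2, 1, 3))  # {0: 5, 1: 11}
```

Grouping by the second dimension instead would slice the same array column by column, so no additional precomputed array is needed in this simplified setting.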

Adaptive Range Aggregation Index Method for Efficient Spatial Range Query in Ubiquitous Sensor Networks (USN환경에서 효율적인 공간영역질의를 위한 적응형 영역 집계 인덱스 기법)

  • Li, Yan;Eo, Sang-Hun;Cho, Sook-Kyoung;Lee, Soon-Jo;Bae, Hae-Yeong
    • Journal of Korea Spatial Information System Society / v.9 no.2 / pp.93-107 / 2007
  • In this paper, an adaptive range aggregation spatial index method is proposed for spatial range queries in ubiquitous sensor networks. As ubiquitous sensor networks are a new information-oriented paradigm, many energy-efficient spatial range query methods for such environments are being actively studied. In sensor networks, users can monitor environmental scalar data such as temperature and humidity over user-defined time and spatial ranges. To execute spatial range queries efficiently, rectangle-based index methods such as SPIX have been proposed. However, they define the return path as the reverse of the query transmission path. As a result, even when the sensor nodes in a queried range are close to each other, they cannot aggregate the sensed values within the queried range because their query transmission paths differ. The previous methods therefore waste energy by aggregating sensing data outside the queried range. In this paper, an adaptive aggregation index method is proposed that can adaptively aggregate values within a user-defined range by using neighbor information. A performance evaluation shows that the proposed method saves sensor power efficiently.
