• Title/Summary/Keyword: Query optimization

Developing a Dynamic Materialized View Index for Efficiently Discovering Usable Views for Progressive Queries

  • Zhu, Chao;Zhu, Qiang;Zuzarte, Calisto;Ma, Wenbin
    • Journal of Information Processing Systems / v.9 no.4 / pp.511-537 / 2013
  • Numerous data intensive applications demand the efficient processing of a new type of query, which is called a progressive query (PQ). A PQ consists of a set of unpredictable but inter-related step-queries (SQ) that are specified by its user in a sequence of steps. A conventional DBMS was not designed to efficiently process such PQs. In our earlier work, we introduced a materialized view based approach for efficiently processing PQs, where the focus was on selecting promising views for materialization. The problem of how to efficiently find usable views from the materialized set in order to answer the SQs for a PQ remains open. In this paper, we present a new index technique, called the Dynamic Materialized View Index (DMVI), to rapidly discover usable views for answering a given SQ. The structure of the proposed index is a special ordered tree where the SQ domain tables are used as search keys and some bitmaps are kept at the leaf nodes for refined filtering. A two-level priority rule is adopted to order domain tables in the tree, which facilitates the efficient maintenance of the tree by taking into account the dynamic characteristics of various types of materialized views for PQs. The bitmap encoding methods and the strategies/algorithms to construct, search, and maintain the DMVI are suggested. The extensive experimental results demonstrate that our index technique is quite promising in improving the performance of the materialized view based query processing approach for PQs.

Efficient Processing of Multiple Group-by Queries in MapReduce for Big Data Analysis (맵리듀스에서 빅데이터 분석을 위한 다중 Group-by 질의의 효율적인 처리 기법)

  • Park, Eunju;Park, Sojeong;Oh, Sohyun;Choi, Hyejin;Lee, Ki Yong;Shim, Junho
    • KIISE Transactions on Computing Practices / v.21 no.5 / pp.387-392 / 2015
  • MapReduce is a framework used to process large data sets in parallel on a large cluster. A group-by query is a query that partitions the input data into groups based on the values of the specified attributes, and then evaluates the value of the specified aggregate function for each group. In this paper, we propose an efficient method for processing multiple group-by queries using MapReduce. Instead of computing each group-by query independently, the proposed method computes multiple group-by queries in stages with one or more MapReduce jobs in order to reduce the total execution cost. We compared the performance of this method with the performance of a less sophisticated method that computes each group-by query independently. This comparison showed that the proposed method offers better performance in terms of execution time.
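The staged idea in this abstract can be illustrated in plain Python, without MapReduce. This is only a minimal sketch of one general way to share work between group-by queries (answering coarser groupings from a finer one), not the authors' algorithm; it assumes a distributive aggregate such as SUM, and the `group_sum` helper and the sample rows are hypothetical.

```python
from collections import defaultdict

def group_sum(rows, key_idx, val_idx):
    """Sum the value column for each distinct combination of key columns."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_idx)
        totals[key] += row[val_idx]
    return dict(totals)

# Hypothetical input rows: (region, product, sales)
rows = [
    ("east", "pen", 3),
    ("east", "ink", 5),
    ("west", "pen", 2),
    ("west", "pen", 4),
]

# Stage 1: a single pass over the raw data for the finest grouping.
by_region_product = group_sum(rows, key_idx=(0, 1), val_idx=2)

# Stage 2: answer the coarser queries from the stage-1 result
# instead of scanning the raw rows again.
stage1_rows = [(r, p, total) for (r, p), total in by_region_product.items()]
by_region = group_sum(stage1_rows, key_idx=(0,), val_idx=2)
by_product = group_sum(stage1_rows, key_idx=(1,), val_idx=2)
```

The second stage reads three partial sums rather than four raw rows; on real data the intermediate result is usually far smaller than the input, which is where the saving would come from.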

Design of Solving Similarity Recognition for Cloth Products Based on Fuzzy Logic and Particle Swarm Optimization Algorithm

  • Chang, Bae-Muu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.10 / pp.4987-5005 / 2017
  • This paper introduces a new method for similarity recognition of cloth products, based on fuzzy logic (FL) and the particle swarm optimization (PSO) algorithm; for convenience, it is called the SRCPFP method hereafter. First, it establishes three features for each cloth product: length, thickness, and temperature resistance. These three features are then used to construct a Fuzzy Inference System (FIS) that determines the similarity between a query cloth and each sample cloth in the cloth database D. At the same time, the FIS integrated with the PSO algorithm can effectively search for near-optimal parameters of the membership functions in the eight fuzzy rules of the FIS for these similarities. Finally, experimental results show that the SRCPFP method achieves satisfactory recognition performance and outperforms the other well-known similarity recognition methods considered here.

Spatial Partitioning for Query Result Size Estimation in Spatial Databases (공간 데이터베이스에서 질의 결과 크기 추정을 위한 공간 분할)

  • 황환규
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.2 / pp.23-32 / 2004
  • An important task of the query optimizer when a query is invoked is to estimate the fraction of records in the database that satisfy the given query condition. Query result size estimation in spatial databases, as in relational databases, partitions the whole input into a small number of subsets called “buckets” and then estimates the fraction of the input in each bucket. The accuracy of the estimate is determined by the difference between the real data counts and the approximations in the buckets, and therefore depends on how the buckets are partitioned. Existing techniques for spatial databases are the equi-area and equi-count techniques, which are analogous, respectively, to the relational equi-width histogram, which divides the input value range into buckets of equal size, and the equi-depth histogram, which places an equal number of records in each bucket. In this paper we propose a new partitioning technique that determines buckets according to the maximal difference of area, where area is defined as the product of the data ranges and the frequencies of the input. The new technique considers both the data values and the frequencies of the input data simultaneously, and thus achieves substantial improvements in accuracy over existing approaches. We present a detailed experimental study of the accuracy of query result size estimation, comparing the proposed technique with the existing techniques on synthetic as well as real-life datasets. The experiments confirm that the proposed technique offers better accuracy than the existing techniques across varying spatial query sizes, bucket counts, data counts, and data sizes.
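The two relational bucketing styles that this abstract uses as a baseline (buckets spanning equal value ranges versus buckets holding equal numbers of records) can be contrasted on one-dimensional data. The sketch below is only illustrative: it does not implement the paper's maximal-difference-of-area technique, and the function names and sample values are hypothetical.

```python
def equal_range_edges(values, n_buckets):
    """Bucket boundaries that split the value range into equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_buckets
    return [lo + i * width for i in range(n_buckets + 1)]

def equal_count_edges(values, n_buckets):
    """Bucket boundaries chosen so each bucket holds about the same number of records."""
    ordered = sorted(values)
    n = len(ordered)
    edges = [ordered[(i * n) // n_buckets] for i in range(n_buckets)]
    edges.append(ordered[-1])
    return edges

def bucket_counts(values, edges):
    """Count records per bucket; buckets are half-open except the last, which is closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for b in range(len(counts)):
            is_last = b == len(counts) - 1
            if edges[b] <= v < edges[b + 1] or (is_last and v == edges[-1]):
                counts[b] += 1
                break
    return counts

# Hypothetical skewed data: seven small values and one outlier.
values = [1, 2, 3, 4, 5, 6, 7, 100]

range_counts = bucket_counts(values, equal_range_edges(values, 2))
count_counts = bucket_counts(values, equal_count_edges(values, 2))
```

On this skewed input the equal-range split puts seven of the eight records in one bucket, while the equal-count split puts four in each, which shows why the choice of partitioning affects estimation accuracy.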

RDF Query Optimization Technique based on Program Analysis (프로그램 분석을 통한 RDF 질의 최적화 기법)

  • Choi, Nak-Min;Cho, Eun-Sun
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.4 / pp.54-62 / 2010
  • Semantic Web programming is still an immature area: it relies on API calls and provides neither high productivity at compile time nor sufficient efficiency at runtime. To overcome this limitation, some effort has been devoted to dedicated programming languages for the Semantic Web. In this paper, we introduce a sophisticated caching technique to enhance the runtime efficiency of RDF (Resource Description Framework) processing programs with SPARQL queries. We apply static program analysis to those programs to determine what to cache, so as to decrease the cache miss ratio. Our method is implemented for programs in the 'Jey' language, one of the programming languages devised for RDF data processing.

The Query Optimization Techniques for XML Data using DTDs (DTD를 이용한 XML 데이타에 대한 질의 최적화 기법)

  • Chung, Tae-Sun;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.28 no.4 / pp.723-731 / 2001
  • As XML has become an emerging standard for information exchange on the World Wide Web, it has gained attention in database communities for extracting information from XML viewed as a database model. Data in XML can be mapped to a semistructured data model based on an edge-labeled graph, and queries can be processed against it. Here we propose new query optimization techniques using DTDs (Document Type Definitions), which hold the schema information about XML data. Our techniques reduce traditional index techniques. Also, as they preserve the source database structure, they can process many kinds of complex queries. We implemented our techniques and provide preliminary performance results.

Query Optimization with Metadata Routing Tables on Nano-Q+ Sensor Network with Multiple Heterogeneous Sensors (다중 이기종 센서를 보유한 Nano-Q+ 기반 센서네트워크에서 메타데이타 라우팅 테이블을 이용한 질의 최적화)

  • Nam, Young-Kwang;Choe, Gui-Ja;Lee, Byoung-Dai;Kwak, Kwang-Woong;Lee, Kwang-Yong;Mah, Pyoung-Soo
    • Journal of KIISE:Computing Practices and Letters / v.14 no.1 / pp.13-21 / 2008
  • In general, data communication among sensor nodes requires more energy than internal processing or sensing activities. In this paper, we propose a novel technique that uses context-aware routing tables to reduce the number of packet transmissions needed to send queries to, and receive results from, neighboring nodes. The key information maintained in a context-aware routing table is which physical properties can be measured by the descendant nodes reachable from the current node. Based on this information, a node can eliminate unnecessary packet transmissions by filtering out child nodes during query dissemination or result relaying. The simulation results show that performance gains of up to 80% can be achieved with our technique.
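The filtering step described in this abstract can be pictured with a small Python sketch. It assumes, purely for illustration, that each node's routing table maps a child id to the set of physical properties measurable somewhere in that child's subtree; the table layout, the node names, and the `forward_targets` function are hypothetical and are not taken from the paper or from Nano-Q+.

```python
def forward_targets(routing_table, wanted_property):
    """Return the children whose subtrees can measure the wanted property."""
    return [
        child
        for child, properties in routing_table.items()
        if wanted_property in properties
    ]

# Hypothetical routing table held by one node:
# child id -> properties measurable by that child or its descendants.
routing_table = {
    "child_a": {"temperature", "humidity"},
    "child_b": {"light"},
    "child_c": {"temperature"},
}

# A temperature query is forwarded only where it can be answered.
targets = forward_targets(routing_table, "temperature")
```

Here the query skips `child_b` entirely, so no packets are spent on a subtree that cannot contribute a result.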

Replica Update Propagation Method for Cost Optimization of Request Forwarding in the Grid Database (그리드 데이터베이스에서 전송비용 최적화를 위한 복제본 갱신 전파 기법)

  • Jang, Yong-Il;Baek, Sung-Ha;Bae, Hae-Young
    • Journal of Korea Multimedia Society / v.9 no.11 / pp.1410-1420 / 2006
  • In this paper, a replica update propagation method for cost optimization of request forwarding in the Grid database is proposed. In the Grid database, data is replicated for performance and availability. When data is updated, the update information is forwarded to neighboring nodes to synchronize the other replicas. There are two kinds of update propagation methods, the query-based scheme and the log-based scheme, and commonly only one of them is used. However, because the properties of update queries and the processing conditions change dynamically, a strategy that uses a single propagation method increases the transmission cost in a dynamic environment. In the proposed method, three classes are defined from the two cost models of the query-based and log-based schemes, and cost functions and an update propagation method are designed to select the optimal update propagation scheme from these three classes. This paper shows that the proposed method achieves optimized performance through minimum transmission cost in a dynamic processing environment.

Design and Implementation of a System for Recommending Related Content Using NoSQL (NoSQL 기반 연관 콘텐츠 추천 시스템의 설계 및 구현)

  • Ko, Eun-Jeong;Kim, Ho-Jun;Park, Hyo-Ju;Jeon, Young-Ho;Lee, Ki-Hoon;Shin, Saim
    • Journal of Korea Multimedia Society / v.20 no.9 / pp.1541-1550 / 2017
  • The increasing amount of multimedia content offered to users creates a demand for content recommendation. In this paper, we propose a system for recommending content related to the content that the user is watching. In the proposed system, relationship information between content items is generated from relationship information between their representative keywords. Relationship information between keywords is generated by analyzing keyword collocation frequencies in an Internet news corpus. In order to handle big corpus data, we design an architecture that consists of a distributed search engine and a distributed data processing engine. Furthermore, we store the keyword-to-keyword and keyword-to-content relationship information in NoSQL to handle big relationship data. Because the query optimizer of NoSQL is not as well developed as that of an RDBMS, we propose query optimization techniques to efficiently process complex queries for recommendation. Experimental results show that the proposed techniques improve performance by up to 69 times, especially when the number of requested related keywords is small.

Mobile Client-Server System for Realtime Continuous Query of Moving Objects (이동 객체의 실시간 연속 질의를 위한 모바일 클라이언트-서버 시스템)

  • Joo, Hae-Jong;Park, Young-Bae;Choi, Chang-Hoon
    • Journal of the Korea Society of Computer and Information / v.11 no.6 s.44 / pp.289-298 / 2006
  • Much research is under way on the issues and problems of mobile database systems that are caused by the weak connectivity of wireless networks and by the mobility and portability of mobile clients. Mobile computing satisfies users' demands for convenience and performance by making information available at any time and in any place, but it leaves many data management problems to be solved. The purpose of our study is to design a Mobile Continuous Query Processing System (MCQPS) that solves problems related to database hoarding, the maintenance of shared data consistency, and the optimization of logging, which are caused by the weak connectivity and disconnection of wireless networks inherent in mobile database systems under mobile client-server environments. We demonstrated the superiority of the proposed MCQPS by comparing its performance to that of the CIS (Client-Intercept-Server) model. In addition, we evaluated the proposed index structure and methodology in various ways.
