• Title/Summary/Keyword: query clustering

Two-Dimensional Grouping Index for Efficient Processing of XML Filtering Queries (XML 필터링 질의의 효율적 처리를 위한 이차원 그룹핑 색인기법)

  • Yeo, Dae-Hwi;Lee, Jong-Hak
    • Journal of Information Technology and Architecture / v.10 no.1 / pp.123-135 / 2013
  • This paper presents a two-dimensional grouping index (2DG-index) for efficient processing of XML filtering queries. Recently, many index techniques have been suggested for efficiently processing structural relationships among the elements in an XML database, such as ancestor-descendant and parent-child relationships. However, these index techniques focus on simple path queries and do not consider path queries that include a condition value for filtering. The 2DG-index is an index structure that addresses the problem of clustering index entries in a two-dimensional domain space consisting of an XML path identifier domain and a filtering data value domain. For performance evaluation, we compared the proposed 2DG-index with conventional one-dimensional index structures such as the data grouping index (DG-index) and the path grouping index (PG-index). The evaluation verified that the proposed 2DG-index can efficiently support query processing in XML databases across the query types.
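
The idea of clustering index entries over a (path identifier, data value) space can be illustrated with a simple grid of cells. This is only a sketch under stated assumptions, not the 2DG-index itself: the class `GridIndex`, integer path identifiers, fixed-width value buckets, and the element names are all hypothetical and do not come from the paper.

```python
from collections import defaultdict


class GridIndex:
    """Toy two-dimensional grid: one axis is the path id, the other a value bucket."""

    def __init__(self, bucket_width):
        self.bucket_width = bucket_width      # assumed fixed-width value buckets
        self.cells = defaultdict(list)        # (path_id, bucket) -> [(value, element_id)]

    def _bucket(self, value):
        return int(value // self.bucket_width)

    def insert(self, path_id, value, element_id):
        self.cells[(path_id, self._bucket(value))].append((value, element_id))

    def query(self, path_id, low, high):
        """Return element ids on `path_id` whose value lies in [low, high]."""
        hits = []
        # Only cells for this path and the overlapping value buckets are visited.
        for b in range(self._bucket(low), self._bucket(high) + 1):
            for value, element_id in self.cells.get((path_id, b), []):
                if low <= value <= high:
                    hits.append(element_id)
        return hits


index = GridIndex(bucket_width=10)
index.insert(1, 5, "e1")    # path 1 could stand for something like /book/price
index.insert(1, 25, "e2")
index.insert(2, 25, "e3")   # same value, different path
index.insert(1, 29, "e4")
result = index.query(1, 20, 30)
```

A filtering query such as "path 1 with value between 20 and 30" touches only the cells for that path and value range, which is the kind of locality a two-dimensional grouping aims for.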

Cluster Property based Data Transfer for Efficient Energy Consumption in IoT (사물인터넷의 에너지 효율을 위한 클러스터 속성 기반 데이터 교환)

  • Lee, Chungsan;Jeon, Soobin;Jung, Inbum
    • Journal of KIISE / v.44 no.9 / pp.966-975 / 2017
  • In the Internet of Things (IoT), the aim of the nodes (called 'Things') is to exchange information with each other: they gather and share information through autonomous decision-making. Therefore, existing aggregation algorithms for wireless sensor networks, which aim to transmit information only to a sink node or a central server, cannot be applied directly to the IoT environment. In addition, since existing algorithms aggregate information from all sensor nodes, problems can arise, including an increasing number of transmissions, longer transmission delay, and higher energy consumption. In this paper, we propose a clustering- and property-based data exchange method for energy-efficient information sharing. First, the proposed method assigns properties to each node, including its sensing data and unique resources. The properties determine whether the node can respond to a query requested by another node. Second, a cluster network is constructed considering location and energy consumption. Finally, the nodes communicate with each other efficiently using the properties. For the performance evaluation, TOSSIM was used to measure the network lifetime and average energy consumption.

Uncertainty for Privacy and 2-Dimensional Range Query Distortion

  • Sioutas, Spyros;Magkos, Emmanouil;Karydis, Ioannis;Verykios, Vassilios S.
    • Journal of Computing Science and Engineering / v.5 no.3 / pp.210-222 / 2011
  • In this work, we study the problem of privacy-preserving data publishing in moving objects databases. In particular, the trajectory of a mobile user in a plane is no longer a polyline in a two-dimensional space; instead it is a two-dimensional surface of fixed width $2A_{min}$, where $A_{min}$ defines the semi-diameter of the minimum spatial circular extent that must replace the real location of the mobile user on the XY-plane in the anonymized (kNN) request. The desired anonymity is not achieved and the entire system becomes vulnerable, since a malicious attacker can observe that, over time, many of the neighbors' ids change, except for a small number of users. Thus, we reinforce the privacy model by clustering the mobile users according to their motion patterns in the $(u, \theta)$ plane, where $u$ and $\theta$ define the velocity measure and the motion direction (angle) respectively. In this case, the anonymized (kNN) request looks up neighbors who belong to the same cluster as the mobile requester in $(u, \theta)$ space. Thus, we know that the trajectory of the k-anonymous mobile user is within this surface, but we do not know exactly where. We transform the surface's boundary polylines to dual points and focus on the information distortion introduced by this space translation. We develop a set of efficient spatiotemporal access methods and experimentally measure the impact of information distortion by comparing the performance results of the same spatiotemporal range queries executed on the original database and on the anonymized one.

Policies of Trajectory Clustering in Index based on R-trees for Moving Objects (이동체를 위한 R-트리 기반 색인에서의 궤적 클러스터링 정책)

  • Ban ChaeHoon;Kim JinGon;Jun BongGi;Hong BongHee
    • The KIPS Transactions:PartD / v.12D no.4 s.100 / pp.507-520 / 2005
  • R-trees are commonly used to index trajectories in moving-objects databases. However, they need to access many nodes to trace a single trajectory because they consider only spatial proximity. Overlaps and dead space should be minimized to enhance the performance of range queries in moving-objects indexes, and the trajectories of moving objects should be preserved to enhance the performance of trajectory queries. In this paper, we propose the TP3DR-tree (Trajectory Preserved 3DR-tree), which uses clusters of trajectories for range and trajectory queries. The TP3DR-tree uses two split policies: one is a spatial split that keeps the same trajectory together by clustering, and the other is a time split that increases space utilization. In addition, we use connecting information in non-leaf nodes to enhance the performance of combined queries. Our experiments show that the new index outperforms the others in processing queries on various datasets.

Contents-based Image Retrieval Using Color & Edge Information (칼라와 에지 정보를 이용한 내용기반 영상 검색)

  • Park, Dong-Won;An, Syungog;Ma, Ming;Singh, Kulwinder
    • The Journal of Korean Association of Computer Education / v.8 no.1 / pp.81-91 / 2005
  • In this paper we present a novel approach for image retrieval using color and edge information. We use the HSI (Hue, Saturation and Intensity) color space instead of RGB space, since it places more emphasis on visual perception. In our system, the colors in an image are clustered into a small number of representative colors. The color feature descriptor consists of the representative colors and their percentages in the image. An improved cumulative color histogram distance measure is defined for this descriptor. We have also developed an efficient edge detection technique as an optional feature of our retrieval system to compensate for the weakness of the color feature. During query processing, the two features (color and edge information) can be combined in a specified proportion or used on their own. Experimental results and precision-recall analysis show that the content-based retrieval system is effective in terms of retrieval quality and scalability.
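
A descriptor made of color percentages and compared through cumulative histograms can be sketched as follows. This is an illustration under assumptions, not the paper's method: uniform hue binning stands in for the color clustering step, a plain L1 distance between cumulative histograms stands in for the "improved" measure (whose definition the abstract does not give), and the function names `color_descriptor` and `cumulative_distance` are hypothetical.

```python
from itertools import accumulate


def color_descriptor(hues, n_bins=6):
    """Fraction of pixels per hue bin; `hues` are hue angles in degrees."""
    counts = [0] * n_bins
    for h in hues:
        # Clamp the bin index to guard against floating-point edge cases near 360.
        b = min(int((h % 360.0) / 360.0 * n_bins), n_bins - 1)
        counts[b] += 1
    total = len(hues)
    return [c / total for c in counts]


def cumulative_distance(p, q):
    """L1 distance between the cumulative sums of two descriptors."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Three tiny "images", each given as a list of pixel hues (made-up data).
image_a = color_descriptor([10, 20, 200, 210])
image_b = color_descriptor([70, 80, 200, 210])
image_c = color_descriptor([310, 320, 200, 210])

d_ab = cumulative_distance(image_a, image_b)
d_ac = cumulative_distance(image_a, image_c)
```

Because the cumulative form accumulates differences across bins, a shift to a neighboring bin (a to b) costs less than a shift to a distant bin (a to c). Note that this simple version treats hue as a linear axis and ignores its wrap-around at 360 degrees.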

Bagged Auto-Associative Kernel Regression-Based Fault Detection and Identification Approach for Steam Boilers in Thermal Power Plants

  • Yu, Jungwon;Jang, Jaeyel;Yoo, Jaeyeong;Park, June Ho;Kim, Sungshin
    • Journal of Electrical Engineering and Technology / v.12 no.4 / pp.1406-1416 / 2017
  • In complex and large-scale industries, properly designed fault detection and identification (FDI) systems considerably improve the safety, reliability and availability of target processes. In thermal power plants (TPPs), generating units operate under very dangerous conditions; system failures can cause severe loss of life and property. In this paper, we propose a bagged auto-associative kernel regression (AAKR)-based FDI approach for steam boilers in TPPs. AAKR estimates new query vectors by online local modeling, and is suitable for TPPs operating under various load levels. By incorporating bagging, more stable and reliable estimates can be achieved, since ensemble averaging reduces the effects of random fluctuations. To validate performance, the proposed method and comparison methods (i.e., a clustering-based method and principal component analysis) are applied to failure data from water wall tube leakage gathered at a 250 MW coal-fired TPP. Experimental results show that the proposed method maintains reasonable false alarm rates and, at the same time, achieves better fault detection performance than the comparison methods. After fault detection, contribution analysis is carried out to identify fault variables; this helps operators confirm the types of faults and take preventive actions efficiently.
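
The estimation step described above can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' implementation: it uses a Gaussian kernel on Euclidean distance, plain bootstrap resampling of the memory vectors, and made-up two-sensor data; the function names, `bandwidth`, and `n_models` are hypothetical.

```python
import math
import random


def aakr_estimate(memory, query, bandwidth):
    """Kernel-weighted average of stored normal-operation vectors near `query`."""
    weights = []
    for x in memory:
        d2 = sum((xi - qi) ** 2 for xi, qi in zip(x, query))
        weights.append(math.exp(-d2 / (2.0 * bandwidth ** 2)))  # assumed Gaussian kernel
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from every memory vector for this bandwidth")
    dim = len(query)
    return [sum(w * x[j] for w, x in zip(weights, memory)) / total for j in range(dim)]


def bagged_aakr_estimate(memory, query, bandwidth, n_models=25, seed=0):
    """Average the AAKR estimates obtained from bootstrap resamples of the memory."""
    rng = random.Random(seed)
    dim = len(query)
    sums = [0.0] * dim
    for _ in range(n_models):
        sample = [rng.choice(memory) for _ in memory]  # resample with replacement
        estimate = aakr_estimate(sample, query, bandwidth)
        for j in range(dim):
            sums[j] += estimate[j]
    return [s / n_models for s in sums]


# Made-up normal data: sensor 2 always reads twice sensor 1.
memory = [(i / 10, 2 * i / 10) for i in range(11)]
query = (0.5, 1.6)  # sensor 2 deviates from the normal relationship
estimate = bagged_aakr_estimate(memory, query, bandwidth=0.3)
residual = [q - e for q, e in zip(query, estimate)]
```

The estimate is pulled back onto the relationship seen in normal data, so a nonzero residual signals a deviation. In this toy case the residual is spread over both sensors, which is why a separate contribution analysis is needed to identify the faulty variable.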

A Study of Designing the Intelligent Information Retrieval System by Automatic Classification Algorithm (자동분류 알고리즘을 이용한 지능형 정보검색시스템 구축에 관한 연구)

  • Seo, Whee
    • Journal of Korean Library and Information Science Society / v.39 no.4 / pp.283-304 / 2008
  • This study aims to develop an intelligent retrieval system that uses a learning function to automatically present category terms for an initial query (association terms connected with the knowledge structure of the relevant terminology), and that automatically reformulates the search and runs it with those association terms. To that end, this theoretical study of an intelligent automatic indexing system extracts experts' index terms through learning and a clustering algorithm for automatic classification, text mining (categorization), and document category representation. The system also performs well in terms of cost, time, recall ratio, and precision ratio.

An Experimental Study on Selecting Association Terms Using Text Mining Techniques (텍스트 마이닝 기법을 이용한 연관용어 선정에 관한 실험적 연구)

  • Kim, Su-Yeon;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.23 no.3 s.61 / pp.147-165 / 2006
  • In this study, experiments on the selection of association terms were conducted in order to find the best method for selecting additional terms related to an initial query term. Association term sets were generated using the support, confidence, and lift measures of the Apriori algorithm, and also using similarity measures such as GSS, Jaccard coefficient, cosine coefficient, Sokal & Sneath 5, and mutual information. To evaluate the term selection methods, the precision of the association terms and the overlap ratio between the association terms and the relevant documents' indexing terms were used. The Apriori algorithm and GSS were found to achieve the highest performance.
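
Several of the measures named above have standard definitions over term co-occurrence counts, sketched here for a single term pair. This is an illustration only: the toy document set and the function name `association_measures` are hypothetical, and GSS, Sokal & Sneath 5, and mutual information are left out.

```python
from math import sqrt


def association_measures(transactions, a, b):
    """Score the association between terms `a` and `b` over a list of term sets."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if a in t)
    n_b = sum(1 for t in transactions if b in t)
    n_ab = sum(1 for t in transactions if a in t and b in t)

    support = n_ab / n                                  # P(a and b)
    confidence = n_ab / n_a if n_a else 0.0             # P(b | a)
    lift = confidence / (n_b / n) if n_b else 0.0       # P(b | a) / P(b)
    union = n_a + n_b - n_ab
    jaccard = n_ab / union if union else 0.0
    cosine = n_ab / sqrt(n_a * n_b) if n_a and n_b else 0.0
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "jaccard": jaccard,
        "cosine": cosine,
    }


# Each set holds the index terms of one document (made-up data).
docs = [
    {"query", "clustering", "index"},
    {"query", "clustering"},
    {"query", "retrieval"},
    {"index", "retrieval"},
]
scores = association_measures(docs, "query", "clustering")
```

Ranking candidate terms by any one of these scores against an initial query term gives one association term set per measure, which is the kind of comparison the study reports.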

Design Pattern Based Component Classification and Retrieval using E-SARM (설계 패턴 기반 컴포넌트 분류와 E-SARM을 이용한 검색)

  • Kim, Gui-Jung;Han, Jung-Soo;Song, Young-Jae
    • The KIPS Transactions:PartD / v.11D no.5 / pp.1133-1142 / 2004
  • This paper proposes a method to classify and retrieve components in a repository using the idea of domain orientation for the successful reuse of components. A design pattern was applied to existing systems, and a component classification method is suggested that compares the structural similarity between each component in the relevant domain and criterion patterns. Classifying reusable components by their functionality and then depicting their structures with a diagram can increase component reusability and portability between platforms. The efficiency of component reuse can be raised because the E-SARM algorithm presents the component that best matches a query first, followed by similar candidate components.

Mining of Subspace Contrasting Sample Groups in Microarray Data (마이크로어레이 데이터의 부공간 대조 샘플집단 마이닝)

  • Lee, Kyung-Mi;Lee, Keon-Myung
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.5 / pp.569-574 / 2011
  • In this paper, we introduce the subspace contrasting group identification problem and propose an algorithm to solve it. To identify contrasting groups, the algorithm first determines two groups whose attribute values fall in one of the contrasting ranges specified by the analyst, and then searches for contrasting groups while increasing the dimension of the subspaces using an association rule mining strategy. Because the dimension of microarray data is likely to be in the tens of thousands, it is burdensome to find all contrasting groups over all possible subspaces by query generation. The proposed method is useful because it finds those contrasting groups without the analyst's involvement.