• Title/Summary/Keyword: tree data structure

Search Result 600, Processing Time 0.027 seconds

H*-tree: An Improved Data Cube Structure for Multi-dimensional Analysis of Data Streams (H*-tree: 데이터 스트림의 다차원 분석을 위한 개선된 데이터 큐브 구조)

  • XiangRui Chen;YuXiang Cheng;Yan Li;Song-Sun Shin;Dong-Wook Lee;Hae-Young Bae
    • Proceedings of the Korea Information Processing Society Conference / 2008.11a / pp.332-335 / 2008
  • This paper analyzes the H-tree, which was proposed as the basic data cube structure for multi-dimensional data stream analysis. We find that the H-tree contains many redundant nodes and that its tree-building method can be improved to save both memory and tuple insertion time. To support faster analysis of larger volumes of stream data, which is important for stream research, we design and develop the H*-tree. A performance study comparing the proposed H*-tree with the H-tree shows that the H*-tree uses less memory and less time when inserting data stream tuples.

H*-tree/H*-cubing: Improved Data Cube Structure and Cubing Method for OLAP on Data Stream (H*-tree/H*-cubing: 데이터 스트림의 OLAP를 위한 향상된 데이터 큐브 구조 및 큐빙 기법)

  • Chen, Xiangrui;Li, Yan;Lee, Dong-Wook;Kim, Gyoung-Bae;Bae, Hae-Young
    • The KIPS Transactions:PartD / v.16D no.4 / pp.475-486 / 2009
  • Data cubes play an important role in multi-dimensional, multi-level data analysis. To meet the on-line analysis requirements of data streams, several cube structures have been proposed for OLAP on data streams, such as stream cube, flowcube, and S-cube. Because constructing a data cube and executing ad-hoc OLAP queries are costly, more research is needed on efficient data structures, query methods, and algorithms. Stream cube uses H-cubing to compute selected cuboids and stores the computed cells in an H-tree, which forms the cuboids along the popular path. However, the H-tree layout is disorderly, and the H-cubing method relies too heavily on the popular path. In this paper, we first propose the $H^*$-tree, an improved data structure that makes retrieval within the tree more efficient. Second, we propose an improved cubing method, $H^*$-cubing, for computing the cuboids that cannot be retrieved along the popular path when an ad-hoc OLAP query is executed. The $H^*$-tree construction and $H^*$-cubing algorithms are given. A performance study shows that, during construction, the $H^*$-tree outperforms the H-tree with a more desirable trade-off between time and memory usage, and that $H^*$-cubing is better suited to ad-hoc OLAP queries in terms of time and memory space.
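
As a rough illustration of the kind of structure this abstract builds on, the sketch below keeps a prefix tree over dimension values in one fixed order and accumulates a count at every node, so cells along that order (standing in here for the popular path) can be read straight from the tree, while a group-by on any other dimension needs a traversal. It is not the $H^*$-tree or $H^*$-cubing algorithm from the paper; the class name `PrefixCubeTree`, the dimension order, and the sample tuples are all assumptions made for illustration.

```python
from collections import defaultdict


class Node:
    """One tree node: an aggregated count plus children keyed by dimension value."""

    def __init__(self):
        self.count = 0
        self.children = {}


class PrefixCubeTree:
    """Hypothetical prefix tree over tuples whose dimensions follow a fixed order."""

    def __init__(self, dims):
        self.dims = list(dims)  # assumed dimension order
        self.root = Node()

    def insert(self, tup):
        # Walk one level per dimension, creating nodes on demand and
        # incrementing the count of every prefix of the tuple.
        node = self.root
        node.count += 1
        for d in self.dims:
            node = node.children.setdefault(tup[d], Node())
            node.count += 1

    def prefix_count(self, values):
        # Number of tuples whose leading dimensions equal `values`.
        node = self.root
        for v in values:
            node = node.children.get(v)
            if node is None:
                return 0
        return node.count

    def group_by(self, dim):
        # Aggregating on a dimension that is not a prefix of the order
        # requires visiting every node on that dimension's level.
        level = self.dims.index(dim)
        totals = defaultdict(int)
        stack = [(self.root, 0)]
        while stack:
            node, depth = stack.pop()
            for value, child in node.children.items():
                if depth == level:
                    totals[value] += child.count
                else:
                    stack.append((child, depth + 1))
        return dict(totals)


# Made-up stream tuples, purely for demonstration.
tree = PrefixCubeTree(["region", "product", "hour"])
for row in [
    {"region": "east", "product": "a", "hour": 1},
    {"region": "east", "product": "b", "hour": 1},
    {"region": "west", "product": "a", "hour": 2},
]:
    tree.insert(row)
```

Here `tree.prefix_count(["east"])` follows a single root-to-node path, whereas `tree.group_by("product")` has to scan a whole level, which is the kind of off-path cost the abstract is concerned with.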

An Index Structure for Efficiently Handling Dynamic User Preferences and Multidimensional Data (다차원 데이터 및 동적 이용자 선호도를 위한 색인 구조의 연구)

  • Choi, Jong-Hyeok;Yoo, Kwan-Hee;Nasridinov, Aziz
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.7 / pp.925-934 / 2017
  • The R-tree is an index structure that is frequently used for handling spatial data. However, when the number of dimensions increases, or when only some of the dimensions are used to search the data according to user preference, indexing time increases greatly and the efficiency of the resulting R-tree drops sharply. Hence, it is not suitable for multidimensional data in which the number of dimensions keeps growing. In this paper, we propose a multidimensional hash index, a new multidimensional index structure based on a hash index. The multidimensional hash index classifies data into buckets of Euclidean space through a hash function and then, when a search is actually requested, generates a hash search tree for effective searching. The generated hash search tree can handle user preferences in the selected dimensional space. Experimental results show that the proposed method has better indexing performance than the R-tree while maintaining similar search performance.
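
The idea of hashing points into buckets of Euclidean space and then searching on only the dimensions a user cares about can be sketched as below. This is a minimal illustration that assumes a uniform grid as the hash function and scans the buckets directly; it does not reproduce the paper's hash function or its hash search tree. The names `cell_of`, `build_index`, and `range_query`, the cell size, and the sample points are all hypothetical.

```python
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Map a point to the integer grid cell that contains it."""
    return tuple(math.floor(x / cell_size) for x in point)


def build_index(points, cell_size):
    """Hash every point into the bucket of its grid cell."""
    buckets = defaultdict(list)
    for p in points:
        buckets[cell_of(p, cell_size)].append(p)
    return buckets


def range_query(buckets, cell_size, bounds):
    """Return points inside `bounds`, a dict {dimension index: (low, high)}.

    Dimensions missing from `bounds` are unconstrained, which models a
    preference that involves only some of the dimensions.
    """
    # Translate the value range of each selected dimension into a cell range.
    cell_bounds = {
        d: (math.floor(lo / cell_size), math.floor(hi / cell_size))
        for d, (lo, hi) in bounds.items()
    }
    hits = []
    for cell, members in buckets.items():
        # Prune buckets whose cell lies outside the requested cell range.
        if all(lo_c <= cell[d] <= hi_c for d, (lo_c, hi_c) in cell_bounds.items()):
            hits.extend(
                p for p in members
                if all(lo <= p[d] <= hi for d, (lo, hi) in bounds.items())
            )
    return sorted(hits)


# Made-up 3-dimensional points; the query constrains dimensions 0 and 2 only.
points = [(1.0, 9.0, 3.0), (4.5, 2.0, 7.5), (5.5, 8.0, 0.5), (9.0, 1.0, 6.0)]
index = build_index(points, cell_size=2.0)
result = range_query(index, 2.0, {0: (4.0, 10.0), 2: (5.0, 8.0)})
```

Building the index is a single pass with no tree rebalancing, which is consistent with the indexing advantage the abstract reports; the search side here is deliberately naive.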

A Hierarchical Binary-search Tree for the High-Capacity and Asymmetric Performance of NVM (비대칭적 성능의 고용량 비휘발성 메모리를 위한 계층적 구조의 이진 탐색 트리)

  • Jeong, Minseong;Lee, Mijeong;Lee, Eunji
    • IEMEK Journal of Embedded Systems and Applications / v.14 no.2 / pp.79-86 / 2019
  • For decades, in-memory data structures have been designed for DRAM-based main memory, which provides symmetric read/write performance and has no write endurance limit. However, such data structures perform sub-optimally on NVM because its characteristics differ from those of DRAM. With this motivation, we rethink the conventional red-black tree in terms of its efficacy under NVM settings. The original red-black tree constantly rebalances sub-trees to provide fast access to the dataset, but this inevitably increases write traffic, which hurts performance on NVM with its long write latency and limited endurance. To resolve this problem, we present a variant of the red-black tree called a hierarchical balanced binary search tree. The proposed structure maintains multiple keys in a single node so as to amortize the rebalancing cost. The performance study shows that the proposed hierarchical binary search tree effectively reduces write traffic by exploiting the high capacity of NVM.

Classification and Regression Tree Analysis for Molecular Descriptor Selection and Binding Affinities Prediction of Imidazobenzodiazepines in Quantitative Structure-Activity Relationship Studies

  • Atabati, Morteza;Zarei, Kobra;Abdinasab, Esmaeil
    • Bulletin of the Korean Chemical Society / v.30 no.11 / pp.2717-2722 / 2009
  • The use of the classification and regression tree (CART) methodology was studied in a quantitative structure-activity relationship (QSAR) context on a data set consisting of the binding affinities of 39 imidazobenzodiazepines for the α1 benzodiazepine receptor. The 3-D structures of these compounds were optimized using HyperChem software with the semiempirical AM1 method. After optimization, a set of 1481 zero- to three-dimensional descriptors was calculated for each molecule in the data set. The response (dependent variable) in the tree model was the binding affinity of each drug. Three descriptors (two topological descriptors and one 3D-MoRSE descriptor) were used in the final tree structure to describe the binding affinities. The mean relative error percent for the data set is 3.20%, compared with 6.63% for a previous model. To evaluate the predictive power of CART, cross-validation was also performed.
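
For readers unfamiliar with the error metric quoted above, the sketch below computes a mean relative error percent under the usual definition, the average of |predicted − observed| / |observed| expressed as a percentage. The abstract does not state the exact formula the authors used, so this definition is an assumption, and the function name and sample values are hypothetical.

```python
def mean_relative_error_percent(observed, predicted):
    """Average of |predicted - observed| / |observed|, as a percentage."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


# Made-up affinities, not values from the paper.
example = mean_relative_error_percent([8.0, 10.0], [8.4, 9.5])
```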

An Integration Algorithm of X-tree and kd-tree for Efficient Retrieval of Spatial Database (공간 데이터베이스의 효율적인 검색을 위한 X-트리와 kd-트리의 병합 알고리즘)

  • Yoo, Jang-Woo;Shin, Young-Jin;Jung, Soon-Key
    • The Transactions of the Korea Information Processing Society / v.6 no.12 / pp.3469-3476 / 1999
  • In spatial databases built on spatial data structures, a new indexing structure that reflects the multi-dimensional features of spatial objects is required in place of one-dimensional indexing structures. To meet this requirement, we propose a new indexing structure for efficient retrieval in spatial databases, based on a feature analysis of conventional multi-dimensional indexing structures. To improve on the sequential search of supernodes in the conventional X-tree and to reduce retrieval time when a huge supernode is generated, we propose an indexing structure that integrates the kd-tree, a point index structure, into the X-tree. We implemented the proposed indexing structure and analyzed its retrieval time with respect to the dimension and distribution of the experimental data.

Bayesian-based seismic margin assessment approach: Application to research reactor

  • Kwag, Shinyoung;Oh, Jinho;Lee, Jong-Min;Ryu, Jeong-Soo
    • Earthquakes and Structures / v.12 no.6 / pp.653-663 / 2017
  • A seismic margin assessment evaluates how much margin a system has under beyond-design-basis earthquake events. Specifically, the seismic margin of the entire system is evaluated through a systems analysis based on sub-system and component seismic fragility data. Each seismic fragility curve is obtained from empirical, experimental, and/or numerical simulation data. The systems analysis is generally performed with a fault tree analysis. However, current practice has clear limitations: it cannot deal with the uncertainties of basic components or accommodate newly observed data. Therefore, in this paper, we present a Bayesian-based seismic margin assessment that uses seismic fragility data and fault tree analysis together with Bayesian inference. The proposed approach is first applied to a pool-type nuclear research reactor system for a quantitative evaluation of the seismic margin. The results show that the approach allows updating with newly available data or information at any level of the fault tree and can identify critical scenarios that change as a result of the new information. Given seismic hazard information, the approach is further extended to real-time risk evaluation. Thus, the proposed approach is expected to overcome the fundamental limitations of the current method.
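
The combination of fault tree analysis and Bayesian updating described above can be illustrated with a toy calculation. The sketch assumes independent basic events with fixed failure probabilities, a Beta prior updated by binomial evidence, and a made-up three-component tree; it plugs the posterior mean into the tree as a point estimate, which is a simplification and not the inference procedure or the fragility-curve treatment used in the paper. All names and numbers are hypothetical.

```python
def or_gate(probs):
    """Failure probability of an OR gate over independent inputs."""
    survive = 1.0
    for p in probs:
        survive *= 1.0 - p
    return 1.0 - survive


def and_gate(probs):
    """Failure probability of an AND gate over independent inputs."""
    fail = 1.0
    for p in probs:
        fail *= p
    return fail


def beta_posterior_mean(alpha, beta, failures, trials):
    """Posterior mean of a Beta(alpha, beta) prior after binomial evidence."""
    return (alpha + failures) / (alpha + beta + trials)


def top_event(pump_a, pump_b, pipe):
    # Hypothetical tree: the system fails if the pipe fails
    # OR both redundant pumps fail.
    return or_gate([pipe, and_gate([pump_a, pump_b])])


# Prior belief about pump A: Beta(1, 9), i.e. mean failure probability 0.1.
prior_a = beta_posterior_mean(1, 9, failures=0, trials=0)
prior_top = top_event(prior_a, 0.1, 0.01)

# Newly observed (made-up) data for pump A: 3 failures in 10 demands.
posterior_a = beta_posterior_mean(1, 9, failures=3, trials=10)
posterior_top = top_event(posterior_a, 0.1, 0.01)
```

Updating a single basic event changes the top-event probability without touching the rest of the tree, which mirrors the abstract's point that new information can be absorbed at any level.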

Efficient Spatial Index Structure for GIS and VLSI Design (GIS와 VLSI Design을 위한 효율적인 공간 색인구조)

  • Bang Kapsan
    • Proceedings of the Korea Information Processing Society Conference / 2004.11a / pp.129-132 / 2004
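  • A spatial index structure is a tool for managing spatial data efficiently and is an important factor in determining the performance of spatial databases such as GIS. In most applications, a spatial database must process large volumes of spatial data held on secondary storage, so reducing the number of disk accesses is key to improving overall database performance. This paper examines the applicability of a spatial index structure called the SMR-tree to various application domains by comparing it with existing index structures. The SMR-tree belongs to the R-tree family and has the same node format as the other structures in that family, but by distributing data objects over multiple data spaces it eliminates the redundant space inside the leaf nodes of the $R^{+}$-tree while disallowing overlap between index nodes, which is a drawback of the R-tree. The performance of the SMR-tree is compared with that of the R-tree, $R^{+}$-tree, and $R^{\ast}$-tree using several kinds of test data (VLSI layout data, Tiger/Line file data). With its high space utilization and faster query performance than the other index structures, the SMR-tree is expected to serve as an efficient index structure for spatial databases such as GIS.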

Bit-Vector-Based Space Partitioning Indexing Scheme for Improving Node Utilization and Information Retrieval (노드 이용률과 검색 속도 개선을 위한 비트 벡터 기반 공간 분할 색인 기법)

  • Yeo, Myung-Ho;Seong, Dong-Ook;Yoo, Jae-Soo
    • Journal of KIISE:Computing Practices and Letters / v.16 no.7 / pp.799-803 / 2010
  • The KDB-tree is a traditional indexing scheme for retrieving multidimensional data. Research on the KDB-tree family frequently identifies low storage utilization and insufficient retrieval performance as its two bottlenecks. These bottlenecks arise from a number of unnecessary splits caused by data insertion order and data skew. In this paper, we propose a novel index structure, called the $KDB_{CS}^+$-tree, to process skewed data efficiently and improve retrieval performance. The $KDB_{CS}^+$-tree increases fan-out by using bit-vectors to represent splitting information and by eliminating pointers. It also improves storage utilization by representing entries as a hierarchical structure in each internal node.

A Distributed High Dimensional Indexing Structure for Content-based Retrieval of Large Scale Data (대용량 데이터의 내용 기반 검색을 위한 분산 고차원 색인 구조)

  • Cho, Hyun-Hwa;Lee, Mi-Young;Kim, Young-Chang;Chang, Jae-Woo;Lee, Kyu-Chul
    • Journal of KIISE:Databases / v.37 no.5 / pp.228-237 / 2010
  • Although conventional index structures provide various nearest-neighbor search algorithms for high-dimensional data, large-scale data imposes additional requirements: higher search performance and index scalability. To meet these requirements, we propose a distributed high-dimensional indexing structure based on cluster systems, called the Distributed Vector Approximation-tree (DVA-tree), a two-level structure consisting of a hybrid spill-tree and VA-files. We also describe the algorithms used for constructing the DVA-tree over multiple machines and for performing distributed k-nearest neighbor (k-NN) searches. To evaluate the performance of the DVA-tree, we conduct an experimental study using both real and synthetic datasets. The results show that the proposed method achieves significant performance advantages over existing index structures on different kinds of datasets.