• Title/Summary/Keyword: Main-memory index structure

Making Cache-Conscious CCMR-trees for Main Memory Indexing (주기억 데이타베이스 인덱싱을 위한 CCMR-트리)

  • 윤석우;김경창
    • Journal of KIISE:Databases / v.30 no.6 / pp.651-665 / 2003
  • Reducing cache misses has become the most important issue for main memory databases, since CPU speeds have been increasing at 60% per year while memory speeds have been increasing at only 10% per year. Recent research has demonstrated that cache-conscious index structures such as the CR-tree outperform the R-tree variants. However, the search performance of the CR-tree can be poorer than that of the original R-tree, since it uses a lossy compression scheme. In this paper, we propose an alternative cache-conscious version of the R-tree, which we call the MR-tree. The MR-tree propagates node splits upward only if one of the internal nodes on the insertion path has empty room. Thus, the internal nodes of the MR-tree are almost 100% full. If there is no empty room on the insertion path, a newly created leaf simply becomes a child of the split leaf. The height of the MR-tree therefore depends on the order in which objects are inserted, so the HeightBalance algorithm is executed when unbalanced heights of child nodes are detected. We also propose the CCMR-tree in order to build a more cache-conscious MR-tree. Our experimental and analytical study shows that the two-dimensional MR-tree performs searches up to 2.4 times faster than the ordinary R-tree while maintaining slightly better update performance and using similar memory space.

An Efficient Indexing Structure for Multidimensional Categorical Range Aggregation Query

  • Yang, Jian;Zhao, Chongchong;Li, Chao;Xing, Chunxiao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.597-618 / 2019
  • Categorical range aggregation, which is conceptually equivalent to running a range aggregation query separately on multiple datasets, returns the query result on each dataset. The challenge is that when the number of datasets is as large as hundreds or thousands, the query takes a lot of computation time and I/O. Previous work has addressed range restrictions on only a single dimension, whereas in practice more and more applications need statistics computed under multiple range restrictions. We propose the MCRI-Tree, an index structure designed to solve multi-dimensional categorical range aggregation (CRA) queries, which can utilize main memory to maximize the efficiency of CRA queries. Specifically, the MCRI-Tree answers any query in $O(nk^{n-1})$ I/Os, where $n$ is the number of dimensions and $k$ denotes the maximum number of pages covered in one dimension among all $n$ dimensions during a query. The practical efficiency of our technique is demonstrated with extensive experiments.

Design of Efficient Storage Structure and Indexing Mechanism for XML Documents (XML을 위한 효율적인 저장구조 및 인덱싱 기법설계)

  • 신판섭
    • Journal of the Korea Computer Industry Society / v.5 no.1 / pp.87-100 / 2004
  • XML has recently been considered a new standard for data presentation and exchange on the web, and much research is ongoing to develop applications and index mechanisms that store and retrieve XML documents efficiently. In this paper, we design a main-memory-based XML storage system for efficient management of XML documents. We also propose a structured retrieval method that reduces traversal of the XML document tree by using the element type information included in user queries. The proposed indexing mechanism is flexible with respect to dynamic data updates. Finally, to process queries on XML documents that include link information, we design a table-type index structure for link information that follows the XLink standard.

Dynamic Management of Equi-Join Results for Multi-Keyword Searches (다중 키워드 검색에 적합한 동등조인 연산 결과의 동적 관리 기법)

  • Lim, Sung-Chae
    • The KIPS Transactions:PartA / v.17A no.5 / pp.229-236 / 2010
  • With an increasing number of documents on the Internet and in enterprises, it becomes crucial to efficiently support users' queries on those documents. In that situation, the full-text search technique is generally accepted, because it can answer uncontrolled ad hoc queries by automatically indexing all the keywords found in the documents. The size of the index files built for full-text searches grows with the number of indexed documents, and thus the disk cost may become too large to process multi-keyword queries against those enlarged index files. To solve this problem, we propose both an index file structure and a management scheme suited to processing multi-keyword queries against a large volume of index files. For this, we adopt the structure of inverted files, which are widely used in multi-keyword searches, as the basic index structure and modify it into a hierarchical structure for the join and ranking operations performed during query processing. To save disk costs with that index structure, we dynamically store in main memory the results of join operations between two keywords if they are highly likely to be entered together in users' queries. We also compare performance using a disk cost model to show the performance advantage of the proposed scheme.
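
The idea of keeping keyword-pair join results in main memory can be illustrated with a minimal sketch. It is not the paper's hierarchical inverted-file structure or its management policy: the class `CachedInvertedIndex`, its methods, and the use of in-memory sets for postings are all assumptions made for illustration.

```python
from itertools import combinations


class CachedInvertedIndex:
    """Toy inverted index that keeps selected keyword-pair joins in memory.

    Hypothetical illustration only; names and layout are assumptions.
    """

    def __init__(self, documents):
        # postings[keyword] is the set of ids of documents containing it.
        self.postings = {}
        for doc_id, text in documents.items():
            for word in set(text.lower().split()):
                self.postings.setdefault(word, set()).add(doc_id)
        # pair_cache[(kw1, kw2)] holds a precomputed join (set intersection).
        self.pair_cache = {}

    def cache_pair(self, kw1, kw2):
        """Precompute the join for a pair expected to be queried together."""
        key = tuple(sorted((kw1.lower(), kw2.lower())))
        self.pair_cache[key] = (
            self.postings.get(key[0], set()) & self.postings.get(key[1], set())
        )

    def search(self, keywords):
        """Return the ids of documents that contain every keyword."""
        remaining = sorted({k.lower() for k in keywords})
        result = None
        # Start from a cached pair join if one covers two of the keywords.
        for pair in combinations(remaining, 2):
            if pair in self.pair_cache:
                result = set(self.pair_cache[pair])
                remaining = [k for k in remaining if k not in pair]
                break
        # Intersect with the postings of the keywords not yet covered.
        for kw in remaining:
            posting = self.postings.get(kw, set())
            result = set(posting) if result is None else result & posting
        return result if result is not None else set()
```

A query such as `search(["main", "memory", "index"])` starts from the cached `("main", "memory")` result when that pair has been cached and only intersects it with the postings for `"index"`, which is the saving the abstract aims at for disk-resident index files.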

A Construction of Pointer-based Model for Main Memory Database Systems (주기억장치 데이터베이스를 위한 포인터 기반 모델의 구축)

  • Bae, Myung-Nam;Choi, Wan
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.4B / pp.323-338 / 2003
  • The main memory database system (MMDBMS) efficiently supports various database applications that require high performance, since it uses main memory rather than disk as primary storage. Recently, there has been a growing need for fast data processing, for efficient modeling of applications that require complicated structures, and for conformity to applications that need strict data consistency. In an MMDBMS, because all the data is located in main memory, usable methods of expressing data that satisfy these needs can be supported without performance overhead. Such a method includes the operations that manipulate the data and constraints such as referential integrity in more detail. The data model made up of these methods is an essential component that determines the expressive power of a DBMS. In this paper, we discuss various requirements for providing communication services and propose a data model that supports them. The main issues discussed are 1) definition of the relationship between tables using pointers, 2) navigation of the data using the relationship, 3) support of referential integrity for pointers, 4) support of uniform processing time for joins, 5) support of object-oriented concepts, and 6) sharing of an index across multiple tables. We discuss a pointer-based data model designed to address these issues and to efficiently support complicated environments.

Design of a Multi-dimensional Index Structure based on Main Memory (주기억장치 상주형 다차원 색인 구조 설계)

  • 심정민;송석일;유재수
    • Proceedings of the Korean Information Science Society Conference / 2003.10b / pp.1-3 / 2003
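  • Recently, cache-conscious index structures have been proposed to overcome the performance degradation caused by the bottleneck between the CPU and main memory. The ultimate goal of these index structures is to increase fan-out by reducing entry size and to improve system performance by minimizing cache misses. Existing index structures can be divided into two groups according to the technique used to reduce entry size. One compresses MBR keys by quantizing coordinate values into a fixed number of bits. The other stores only those coordinate values of each MBR that differ from the coordinate values of its parent MBR. In this paper, we propose a new index structure that suitably combines the characteristics of the two techniques, and we evaluate, in various environments, the performance of main-memory-resident multidimensional index structures that follow the two existing approaches. We also show the superiority of the proposed index structure through comparison with existing index structures.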
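
The first technique, quantizing MBR coordinates into a fixed number of bits, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the 8-bit default, and the choice to quantize relative to the parent MBR are hypothetical and are not taken from the paper. Lower bounds are rounded down and upper bounds are rounded up so that the compressed box still covers the original one, which is why such a scheme is lossy but safe for search.

```python
import math


def quantize_mbr(child, parent, bits=8):
    """Quantize a child MBR relative to its parent MBR (hypothetical helper).

    MBRs are (xmin, ymin, xmax, ymax). Lower bounds are rounded down and
    upper bounds are rounded up, so the quantized box covers the child
    (up to floating-point rounding at exact grid boundaries).
    """
    levels = (1 << bits) - 1  # largest code, e.g. 255 for 8 bits
    pxmin, pymin, pxmax, pymax = parent

    def scale(value, lo, hi, round_fn):
        if hi == lo:  # degenerate parent extent: a single code suffices
            return 0
        code = round_fn((value - lo) / (hi - lo) * levels)
        return min(max(int(code), 0), levels)

    cxmin, cymin, cxmax, cymax = child
    return (
        scale(cxmin, pxmin, pxmax, math.floor),
        scale(cymin, pymin, pymax, math.floor),
        scale(cxmax, pxmin, pxmax, math.ceil),
        scale(cymax, pymin, pymax, math.ceil),
    )


def dequantize_mbr(codes, parent, bits=8):
    """Map quantized codes back to coordinates inside the parent MBR."""
    levels = (1 << bits) - 1
    pxmin, pymin, pxmax, pymax = parent
    qxmin, qymin, qxmax, qymax = codes
    return (
        pxmin + qxmin / levels * (pxmax - pxmin),
        pymin + qymin / levels * (pymax - pymin),
        pxmin + qxmax / levels * (pxmax - pxmin),
        pymin + qymax / levels * (pymax - pymin),
    )
```

With 8 bits per coordinate, a two-dimensional entry key shrinks from four floating-point values to four bytes, which is how this kind of compression raises fan-out per cache line; the price is that the enlarged boxes may cause extra false-positive overlaps during search.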

Performance Evaluation of an Index Structure for Dynamic Main Memory Database (동적 주기억 데이터베이스를 위한 색인 구조의 성능 평가)

  • 박정규;전흥석;노삼혁
    • Proceedings of the Korean Information Science Society Conference / 2000.10a / pp.213-215 / 2000
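  • The T-tree is one of the index structures proposed for efficient performance in main memory databases. In dynamic main memory databases with many insertions and deletions, however, this index structure is inefficient because of the overhead of frequent node creation and deletion. The T2-tree was proposed to overcome this problem. The T2-tree resolves the inefficiency of range queries, which is a weakness of the T-tree, and, for dynamic main memory database systems with frequent insertions and deletions, it combines a restrained node creation and deletion technique with the characteristics of a threaded binary tree. This paper presents the results of an experiment comparing the two index structures, in which the T-tree index structure used in FastDB, a main memory database program, was modified into the newly proposed T2-tree on Linux. The experimental results show that the T2-tree is a more efficient structure than the T-tree in dynamic main memory database systems.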

$T^2$-Tree: An Efficient Index Structure for Dynamic Main Memory Database ($T^2$-트리: 동적 주기억 데이터베이스를 위한 효율적 색인 구조)

  • 김태진;전홍석;이재호;노삼혁
    • Proceedings of the Korean Information Science Society Conference / 1999.10a / pp.258-260 / 1999
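  • Index structures for main memory databases involve different considerations from the index structures of conventional disk-based databases. The representative index structures studied so far are the T-tree and the T*-tree. Although the T*-tree resolves the inefficiency of range queries, which is a weakness of the T-tree, in systems with many insertions and deletions it incurs overhead for balancing the tree and performing rotation operations, as well as additional overhead for successor pointers. Therefore, this paper proposes the T2-tree, a more efficient index structure for dynamic main memory databases with frequent insertions and deletions, which uses a restrained node creation and deletion technique and the properties of a threaded binary tree.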

AS B-tree: A study on the enhancement of the insertion performance of B-tree on SSD (AS B-트리: SSD를 사용한 B-트리에서 삽입 성능 향상에 관한 연구)

  • Kim, Sung-Ho;Roh, Hong-Chan;Lee, Dae-Wook;Park, Sang-Hyun
    • The KIPS Transactions:PartD / v.18D no.3 / pp.157-168 / 2011
  • Recently, flash memory has been used as the main storage device in mobile devices, and flashSSDs are gaining popularity as a major storage device in laptop and desktop computers, and even in enterprise-level server machines. Unlike on HDDs, an overwrite operation on flash memory cannot be performed unless it is preceded by an erase operation on the same block. To address this, an FTL (Flash memory Translation Layer) is employed on flash memory. Even when a modified data block is overwritten at the same logical address, the FTL writes the updated data block to a physical address different from the previous one and maps the logical address to the new physical address. This enables flash memory to avoid the high block-erase cost. A flashSSD has an array of NAND flash memory packages, so it can access multiple flash memory packages in parallel. To take advantage of the internal parallelism of flashSSDs, it is beneficial for DBMSs to request I/O operations on sequential logical addresses. However, the B-tree structure, which is a representative index scheme of current relational DBMSs, produces excessive I/O operations in random order when its node structures are updated. Therefore, the original B-tree is not well suited to SSDs. In this paper, we propose the AS (Always Sequential) B-tree, which, for every update operation, writes the updated node at the logical address contiguous to the previously written node. In the experiments, the AS B-tree improved on the B-tree's insertion performance by 21%.
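
The out-of-place update behavior attributed to the FTL above can be sketched in a few lines. This is a toy page-mapping model, not a real FTL and not the AS B-tree itself: the class `ToyFTL`, its fields, and the absence of garbage collection are assumptions made for illustration.

```python
class ToyFTL:
    """Toy page-mapping FTL: every write goes to a fresh physical page.

    Hypothetical sketch; real FTLs also handle garbage collection,
    wear leveling, and block-level erase, none of which is modeled here.
    """

    def __init__(self, num_physical_pages):
        self.num_physical_pages = num_physical_pages
        self.mapping = {}      # logical address -> physical address
        self.flash = {}        # physical address -> stored data
        self.invalid = set()   # stale physical pages awaiting erase
        self.next_free = 0     # next unwritten physical page

    def write(self, logical, data):
        """Write data for a logical address without overwriting in place."""
        if self.next_free >= self.num_physical_pages:
            raise RuntimeError("no free pages; a real FTL would garbage-collect")
        old = self.mapping.get(logical)
        if old is not None:
            # The previous copy is only marked stale, not erased now.
            self.invalid.add(old)
        physical = self.next_free
        self.next_free += 1
        self.flash[physical] = data
        self.mapping[logical] = physical  # remap logical -> new physical
        return physical

    def read(self, logical):
        """Return the most recently written data for a logical address."""
        return self.flash[self.mapping[logical]]
```

Writing the same logical address twice returns two different physical page numbers, and the first page ends up in `invalid`, mirroring the remapping described in the abstract.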

Suffix Tree Constructing Algorithm for Large DNA Sequences Analysis (대용량 DNA서열 처리를 위한 서픽스 트리 생성 알고리즘의 개발)

  • Choi, Hae-Won
    • Journal of Korea Society of Industrial Information Systems / v.15 no.1 / pp.37-46 / 2010
  • A suffix tree is an efficient data structure that exposes the internal structure of a string and allows efficient solutions to a wide range of complex string problems, particularly in the area of computational biology. However, as the volume of biological information explodes, it becomes impossible to construct suffix trees in main memory, so an efficient technique is needed to construct the trees in secondary storage. In this paper, we present a method for constructing a suffix tree on disk for a large set of DNA strings using a new index scheme. We also show a typical application example that uses a suffix tree on disk.