Building Hierarchical Bitmap Indices in Space Constrained Environments

저장 공간이 제약된 환경에서 계층적 비트맵 인덱스 생성에 관한 연구

  • Received : 2015.01.02
  • Accepted : 2015.02.12
  • Published : 2015.02.28

Abstract

Bitmap indices are frequently used in data warehouses because they are effective for OLAP queries over low-cardinality data columns. In many data warehouse applications, the domain of a column is hierarchical, as in categorical and geographical data. When the domain of a column is hierarchical, a hierarchical bitmap index can significantly improve the performance of queries with conditions on that column. This strategy has a limitation, however: when a large-scale hierarchy is used, building a bitmap for each distinct node leads to a large space overhead. In this paper, we therefore introduce a way to build a hierarchical bitmap index in space-constrained environments on an attribute whose domain is organized into a large-scale hierarchy. In particular, to address the space overhead of hierarchical bitmap indices, we propose a cut-selection strategy that divides the entire hierarchy into two mutually exclusive regions.

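Bitmap indices perform very well for OLAP queries over low-cardinality columns, which makes them one of the most widely used indexing techniques in data warehouses. In many applications built on data warehouses, column values form a hierarchy. When column values can be represented hierarchically, a hierarchical bitmap index is known to process queries faster than a conventional bitmap index. A hierarchical bitmap index, however, can incur storage overhead when the hierarchy it uses is large. This paper therefore proposes a method for effectively building a hierarchical bitmap index, based on the query workload, when column values form a very large hierarchy in a storage-constrained environment. In particular, it proposes a cut selection method that divides the given hierarchy into two mutually exclusive regions, thereby addressing the storage overhead of hierarchical bitmap indices.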
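
The space/time trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the toy hierarchy, the function names (`leaves_under`, `build_leaf_bitmaps`, `materialize`, `query`), and the assumption that leaf bitmaps are always stored while internal-node bitmaps are optional are all hypothetical choices made for illustration. Bitmaps are plain Python integers, with bit `i` set when row `i` matches.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy hierarchy (parent -> children); not taken from the paper.
HIERARCHY: Dict[str, List[str]] = {
    "World": ["Asia", "Europe"],
    "Asia": ["Korea", "Japan"],
    "Europe": ["France", "Germany"],
}


def leaves_under(node: str) -> List[str]:
    """Return all leaf values in the subtree rooted at `node`."""
    children = HIERARCHY.get(node)
    if not children:
        return [node]
    result: List[str] = []
    for child in children:
        result.extend(leaves_under(child))
    return result


def build_leaf_bitmaps(column: List[str]) -> Dict[str, int]:
    """Build one bitmap per distinct leaf value in the column."""
    bitmaps: Dict[str, int] = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def materialize(nodes: Set[str], leaf_bitmaps: Dict[str, int]) -> Dict[str, int]:
    """Precompute bitmaps for the chosen internal nodes (assumed selection)."""
    stored: Dict[str, int] = {}
    for node in nodes:
        bitmap = 0
        for leaf in leaves_under(node):
            bitmap |= leaf_bitmaps.get(leaf, 0)
        stored[node] = bitmap
    return stored


def query(node: str, stored: Dict[str, int],
          leaf_bitmaps: Dict[str, int]) -> Tuple[int, int]:
    """Return (result bitmap, number of stored bitmaps read)."""
    if node in stored:
        return stored[node], 1
    children = HIERARCHY.get(node)
    if not children:
        return leaf_bitmaps.get(node, 0), 1
    bitmap, reads = 0, 0
    for child in children:
        child_bitmap, child_reads = query(child, stored, leaf_bitmaps)
        bitmap |= child_bitmap
        reads += child_reads
    return bitmap, reads


column = ["Korea", "France", "Japan", "Korea", "Germany"]
leaf_bitmaps = build_leaf_bitmaps(column)
stored = materialize({"Asia"}, leaf_bitmaps)  # spend space on one extra bitmap
print(query("World", {}, leaf_bitmaps))       # no internal bitmaps stored
print(query("World", stored, leaf_bitmaps))   # fewer reads, more space
```

Each materialized internal node saves bitmap reads for queries at or above it, at the price of one more stored bitmap. Choosing which nodes to materialize under a space budget is the problem the cut-selection strategy targets.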
