An Approximate Query Answering Method using a Knowledge Representation Approach

  • Lee, Sun-Young (Department of Computer Education, Chungbuk National University) ;
  • Lee, Jong-Yun (Department of Computer Education, Chungbuk National University)
  • Received : 2011.06.14
  • Accepted : 2011.08.11
  • Published : 2011.08.31

Abstract

In decision support systems, knowledge workers run aggregation operations over large volumes of data and are more interested in trend analysis than in exact, point-level results. It is therefore necessary to provide fast approximate answers rather than exact ones, which motivates research on approximate query answering techniques. In this paper, we propose a new approximate query answering method, intended to address the shortcomings of existing approaches and improve the accuracy of approximate answers, that is based on Fuzzy C-Means (FCM) clustering and the Adaptive Neuro-Fuzzy Inference System (ANFIS). By building a knowledge representation (KR) model of the multidimensional data cube, the proposed FCM-ANFIS method can compute aggregate queries without directly accessing the massive cube itself. Comparative experiments show that the KR-model-based method yields more accurate approximate answers than the existing NMF method.
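The clustering step named in the abstract can be illustrated with a minimal sketch of standard fuzzy c-means. This is not the authors' implementation: the function name `fuzzy_c_means`, its parameters (`c`, `m`, `iters`, `tol`, `seed`), and the toy data are all assumptions made for illustration, and the ANFIS training stage that would follow in the proposed method is not shown.

```python
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Hypothetical helper for illustration only; returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))
        # Membership update from relative Euclidean distances to the centers.
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - ctr[d]) ** 2 for d in range(dim)) ** 0.5,
                    1e-12)  # clamp to avoid division by zero
                for ctr in centers
            ]
            new_row = [
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                          for k in range(c))
                for j in range(c)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # stop once memberships have stabilized
            break
    return centers, u

# Toy one-dimensional "measure" values forming two obvious groups (made-up data).
points = [(1.0,), (1.2,), (0.8,), (10.0,), (10.3,), (9.7,)]
centers, memberships = fuzzy_c_means(points, c=2)
print(sorted(ctr[0] for ctr in centers))
```

Each point receives a degree of membership in every cluster rather than a single hard label, which is the property that makes FCM a natural front end for a fuzzy inference system such as ANFIS.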
