Cost-based Optimization of Extended Boolean Queries

Journal of the Korean Society for Information Management (정보관리학회지)

Volume 18 Issue 3 / Pages 29-40 / 2001 / 1013-0799 (pISSN) / 2586-2073 (eISSN)

Korean Society for Information Management (한국정보관리학회)

확장 불리언 질의에 대한 비용 기반 최적화

박병권 (Division of Management Information Science, Dong-A University)

Published : 2001.09.01

Abstract

In this paper, we propose a query optimization algorithm that selects the optimal method for processing an extended Boolean query on inverted files. An extended Boolean query can be processed in many ways, depending on the order in which the keywords contained in the query are processed. In this sense, the problem of optimizing an extended Boolean query is essentially that of optimizing the keyword sequence in the query. We show that this problem is analogous to the problem of finding the optimal join order in database query optimization, and we apply ideas from that area to solve it. We establish a cost model for processing an extended Boolean query and develop an algorithm that finds the optimal keyword-processing sequence based on the concept of keyword rank, which is computed from the keyword selectivity and the access cost of the inverted file. We prove that the method selected by the optimization algorithm is optimal, and we show through experiments that it outperforms the other processing methods. We believe that the proposed optimization algorithm will contribute significantly to improving information retrieval performance.

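This paper presents a query optimization algorithm that finds the minimum-cost method for processing an extended Boolean query using inverted index files. Because an extended Boolean query can be processed in many ways depending on the order in which its keywords are processed, the optimization problem reduces to finding the optimal keyword processing order. We show that this problem is structurally similar to finding the optimal join order in database query optimization, and we solve it using results from that field. Specifically, we establish a cost model for extended Boolean query processing, introduce the concept of keyword rank based on keyword selectivity and inverted file access cost, and derive an algorithm that uses this rank to find the optimal keyword processing order. We prove the optimality of the derived algorithm, show through experiments that it does find the minimum-cost processing method, and show that its performance is far superior to processing without query optimization. We believe the proposed algorithm will contribute greatly to improving the query processing performance of information retrieval systems.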
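The abstract does not give the cost model or the rank formula, so the following is only a minimal sketch of the general idea, borrowed from the join-ordering literature the abstract refers to. It assumes a simplified conjunctive model that is not taken from the paper: each keyword k has a hypothetical access cost cost[k] and a selectivity selectivity[k] (the fraction of candidate documents that survive it), and the cost of a sequence is the sum of each keyword's cost weighted by the fraction of candidates still alive when it is processed. Under that assumption, sorting keywords by ascending rank (selectivity - 1) / cost gives a minimum-cost sequence. All names and numbers below are illustrative.

```python
from itertools import permutations


def sequence_cost(order, cost, selectivity):
    """Expected cost of processing keywords in the given order.

    Assumed model (not the paper's): a keyword's access cost is paid in
    proportion to the fraction of candidates that survived earlier keywords.
    """
    total, surviving = 0.0, 1.0
    for k in order:
        total += surviving * cost[k]
        surviving *= selectivity[k]
    return total


def rank(k, cost, selectivity):
    """Rank of a keyword: filtering power per unit of access cost."""
    return (selectivity[k] - 1.0) / cost[k]


def rank_order(keywords, cost, selectivity):
    """Order keywords by ascending rank (most negative rank first)."""
    return sorted(keywords, key=lambda k: rank(k, cost, selectivity))


def brute_force_best(keywords, cost, selectivity):
    """Cheapest cost over all permutations, used only as a cross-check."""
    return min(sequence_cost(p, cost, selectivity) for p in permutations(keywords))


# Hypothetical statistics for three keywords.
cost = {"query": 4.0, "optimization": 2.0, "boolean": 1.0}
selectivity = {"query": 0.5, "optimization": 0.2, "boolean": 0.9}

order = rank_order(list(cost), cost, selectivity)
print(order, sequence_cost(order, cost, selectivity))
```

With these made-up numbers the rank ordering puts "optimization" first, because it discards the most candidates per unit of access cost, and its cost matches the brute-force minimum over all six permutations. A real cost model for inverted files would also need to account for details such as posting-list lengths and the OR operators of extended Boolean queries, which this sketch leaves out.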

