An Experimental Study on the Performance of Element-based XML Document Retrieval

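  • 윤소영 (Data and Information Office, National Institute of Korean History);
  • 문성빈 (Department of Library and Information Science, Yonsei University)
  • Published: 2006.03.01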

Abstract

This experimental study proposes an element-based XML document retrieval method that surfaces highly relevant elements. The models compared are the divergence method, the smoothing method, and the hierarchical language model. The hierarchical language model proved the most effective for element-based XML document retrieval, improving exhaustivity, although specificity was harmed.

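To identify the most suitable technique for element-based XML document retrieval, this study ran experiments evaluating the retrieval performance of three language-model retrieval approaches: the divergence method, the smoothing method, and the hierarchical language model. Based on the results, hierarchical language model retrieval, which applies the structural information of the document, is proposed as the most efficient retrieval approach. In particular, the hierarchical language model performed especially well at the top ranks of the result list, which matter most in actual retrieval.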
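To make the idea of a hierarchical language model concrete, the following is a minimal sketch, not the authors' implementation. It assumes that an element is scored by query likelihood under a unigram model that linearly interpolates (Jelinek-Mercer style) the element's own term distribution with those of its containing document and of the whole collection. The function names, the interpolation weights, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mle(tokens):
    """Maximum-likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def score_element(query, element, document, collection, weights=(0.5, 0.3, 0.2)):
    """Log query likelihood under an interpolated element/document/collection model.

    The weights (element, document, collection) are illustrative assumptions,
    not values reported in the study.
    """
    w_e, w_d, w_c = weights
    p_e, p_d, p_c = mle(element), mle(document), mle(collection)
    log_p = 0.0
    for term in query:
        p = (w_e * p_e.get(term, 0.0)
             + w_d * p_d.get(term, 0.0)
             + w_c * p_c.get(term, 0.0))
        if p == 0.0:
            # The term occurs nowhere in the collection, so the query has zero likelihood.
            return float("-inf")
        log_p += math.log(p)
    return log_p


# Toy data (hypothetical): one document with a title element and a paragraph element.
title = ["xml", "retrieval"]
para = ["language", "model", "smoothing"]
document = title + para
collection = document + ["image", "search", "engine"]

query = ["xml", "model"]
title_score = score_element(query, title, document, collection)
para_score = score_element(query, para, document, collection)
```

Because the document and collection terms back off every element model, an element can still receive a finite score for a query term it does not contain itself, as long as the term appears elsewhere in its document or in the collection.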
