Cluster-based Information Retrieval with Tolerance Rough Set Model

  • Ho, Tu-Bao (Japan Advanced Institute of Science and Technology, Tatsunokuchi)
  • Kawasaki, Saori (Japan Advanced Institute of Science and Technology, Tatsunokuchi)
  • Nguyen, Ngoc-Binh (Hanoi University of Technology, DaiCoViet Road, Hanoi, Vietnam)
  • Published : 2002.03.01

Abstract

The objectives of this paper are twofold. The first is to introduce the tolerance rough set model (TRSM), a model for representing documents with semantic relatedness that uses rough sets with tolerance relations instead of equivalence relations. The second is to introduce two document clustering algorithms based on this model, one hierarchical and one nonhierarchical, together with TRSM cluster-based information retrieval using these two algorithms. The experimental results show that TRSM offers an alternative approach to text clustering and information retrieval.
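
To make the idea of a tolerance relation concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that two terms are related when they co-occur in at least `theta` documents; the abstract does not state how the relation is defined, so this criterion, the threshold `theta`, and the function names `tolerance_classes`, `upper_approximation`, and `lower_approximation` are all illustrative assumptions. Unlike an equivalence relation, the resulting relation is reflexive and symmetric but need not be transitive, so the classes may overlap.

```python
from collections import defaultdict
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to its tolerance class.

    Assumption: two terms are related when they co-occur in at least
    `theta` documents. Each document is an iterable of terms.
    """
    cooc = defaultdict(int)
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            cooc[(a, b)] += 1  # count documents containing both terms
    terms = set().union(*map(set, docs))
    classes = {t: {t} for t in terms}  # reflexive: a term is related to itself
    for (a, b), n in cooc.items():
        if n >= theta:
            classes[a].add(b)  # symmetric: record the pair in both directions
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """Terms whose tolerance class overlaps the document's term set."""
    doc_terms = set(doc)
    return {t for t, cls in classes.items() if cls & doc_terms}


def lower_approximation(doc, classes):
    """Terms whose tolerance class lies entirely inside the document's term set."""
    doc_terms = set(doc)
    return {t for t, cls in classes.items() if cls <= doc_terms}
```

Under these assumptions, the upper approximation enriches a document with terms related to the ones it contains, which is one way two documents with few shared terms can still be compared.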
