Fuzzy Clustering Algorithm for Web-mining

Original Korean title: 웹마이닝을 위한 퍼지 클러스터링 알고리즘

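  • 임영희 (Daejeon University, School of Computer and Information Communication Engineering)
  • 송지영 (Korea University, Department of Computer and Information Science)
  • 박대희 (Korea University, Department of Computer and Information Science)
  • Published: 2002.06.01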


Post-clustering algorithms, which cluster the results returned by a Web search engine, have requirements that differ from those of conventional clustering algorithms. In this paper, we propose a new post-clustering algorithm that satisfies as many of these requirements as possible. The proposed algorithm, fuzzy Concept ART, combines the concept vector, which has several advantages in document clustering, with fuzzy ART, a real-time clustering algorithm based on fuzzy set theory. We also show that it is applicable to general-purpose clustering as well as post-clustering.
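To make the fuzzy ART component concrete, the following is a minimal sketch of the standard fuzzy ART procedure (complement coding, category choice, vigilance test, weight update). It is not the proposed fuzzy Concept ART: the abstract does not specify how the concept vector is integrated, so that step is left out. The function names `complement_code` and `fuzzy_art`, the parameter names `rho`, `alpha`, and `beta`, and their default values are assumptions chosen for illustration.

```python
import numpy as np


def complement_code(x):
    """Complement coding: map x in [0, 1]^d to the 2d-vector (x, 1 - x)."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, 1.0 - x])


def fuzzy_art(data, rho=0.75, alpha=0.001, beta=1.0):
    """Cluster rows of `data` (features scaled to [0, 1]) with standard fuzzy ART.

    rho   : vigilance in [0, 1]; higher values give more, tighter clusters.
    alpha : small positive choice parameter.
    beta  : learning rate in (0, 1]; 1.0 is "fast learning".
    Returns (labels, weights). All names here are illustrative assumptions.
    """
    weights = []  # one weight vector per committed category
    labels = []
    for x in data:
        i = complement_code(x)

        # Choice function T_j = |I ^ w_j| / (alpha + |w_j|), where ^ is the
        # element-wise minimum (fuzzy AND) and |.| is the L1 norm.
        def choice(j):
            return np.minimum(i, weights[j]).sum() / (alpha + weights[j].sum())

        chosen = None
        # Try categories from highest to lowest choice value.
        for j in sorted(range(len(weights)), key=choice, reverse=True):
            # Vigilance test: |I ^ w_j| / |I| >= rho.
            match = np.minimum(i, weights[j]).sum() / i.sum()
            if match >= rho:
                # Resonance: move the weight toward I ^ w_j.
                weights[j] = beta * np.minimum(i, weights[j]) + (1.0 - beta) * weights[j]
                chosen = j
                break

        if chosen is None:
            # No existing category passed vigilance: commit a new one.
            weights.append(i.copy())
            chosen = len(weights) - 1
        labels.append(chosen)
    return labels, weights
```

Because each input is processed once and a new category is created whenever no existing one passes the vigilance test, the number of clusters does not have to be fixed in advance. For example, calling `fuzzy_art([[0.1, 0.1], [0.12, 0.09], [0.9, 0.9], [0.88, 0.92]])` with the default parameters places the first two points in one category and the last two in another.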

