Implementation and Performance Analysis of Web Robot for URL Analysis


  • Kim Won (Department of Electronic Engineering, Kyung Hee University) ;
  • Kim Hee-Cheol (School of Information and Communication, Daegu University) ;
  • Jin Yong-Ok (Digital Communication Laboratory, Department of Radio Engineering, Kyung Hee University)
  • Published : 2002.03.01

Abstract

This paper proposes a multi-agent web robot in which the work of collecting web pages is divided among agents so that their mutual dependency is minimized. Through a performance analysis of the implemented system, the paper lays a foundation for producing reliable statistics on domestic web pages, such as the composition ratio of text and multimedia files. On the same resource environment, a web robot that collects pages with sequential processing soon reaches the limit of its collection performance. In particular, web pages contain "dead-link" URLs caused by temporary host failures and unstable network resources, and when many such URLs are present the robot spends a great deal of time trying to collect their HTML. The proposed system maximizes collection performance by handing dead-link URLs off to a dedicated Inactive URL Scanner Agent.
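The core idea, probing URLs for liveness separately so the page collector never blocks on unreachable hosts, can be sketched in a few lines. The following Python sketch is an illustration only, not the authors' implementation (which predates this API): the names is_alive and scan_urls, the timeout value, and the use of urllib are all assumptions made for the example.

```python
# Minimal sketch of an "inactive URL scanner" step: probe each URL with a
# short-timeout HEAD request, and hand only the reachable ones to the
# page-collecting step. Not the paper's implementation; names and the
# timeout value are illustrative.
import socket
import urllib.error
import urllib.request

DEAD_LINK_TIMEOUT = 3.0  # seconds; hypothetical value, tune for the network


def is_alive(url: str, timeout: float = DEAD_LINK_TIMEOUT) -> bool:
    """Return True if the host answers a HEAD request within the timeout."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, socket.timeout, ValueError):
        # Host down, network unreachable, HTTP error, or malformed URL:
        # treat it as a dead link so the collector never waits on it.
        return False


def scan_urls(urls):
    """Split a URL list into live URLs (to collect) and dead links (to skip)."""
    live, dead = [], []
    for url in urls:
        (live if is_alive(url) else dead).append(url)
    return live, dead


if __name__ == "__main__":
    live, dead = scan_urls(["http://example.com/", "http://no-such-host.invalid/"])
    print("collectable:", live)
    print("dead links :", dead)
```

The sketch runs the probe sequentially for clarity; in a multi-agent arrangement like the one the paper describes, this scanning would run as its own agent alongside the collector, which is where the collection-time improvement would come from.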
