• Title/Summary/Keyword: Web Retrieval

Implementation of a Web Robot and Statistics on the Korean Web (웹 로봇 구현 및 한국 웹 통계보고)

  • Kim, Sung-Jin;Lee, Sang-Ho
    • The KIPS Transactions:PartC / v.10C no.4 / pp.509-518 / 2003
  • A web robot is a program that downloads and stores web pages. Implementation issues for developing web robots have been studied widely and various web statistics are reported in the literature. First, this paper describes the overall architecture of our robot and implementation decisions on several important issues. Second, we show empirical statistics on approximately 74 million Korean web pages. Third, we monitored 1,424 Korean web sites to observe the changes of web pages. We identify what factors of web pages could affect the changes. The factors may be used for the selection of web pages to be updated incrementally.

A Study on the Relevance Improvement of Enterprise Search using Tag Information (TAG 정보를 활용한 기업검색의 적합성 향상 기법에 관한 연구)

  • Shon, Tae-Shik;Park, Byoung-Seob;Choi, Hyo-Hyun
    • Journal of the Korea Society of Computer and Information / v.15 no.12 / pp.101-108 / 2010
  • How quickly and accurately a company can deliver its exponentially growing information to users is the most important factor in corporate competitiveness. Improving retrieval relevance has therefore become an important element of competitiveness, and a good-quality search service must go beyond simple retrieval. This paper proposes a scheme that improves retrieval relevance by using registered tag information. The proposed scheme overcomes the relevance limitations of conventional search engines. We compare the proposed scheme with an existing web retrieval service in terms of retrieval relevance evaluation and related search keywords.

A Exploratory Study on the Expansion of Academic Information Services Based on Automatic Semantic Linking Between Academic Web Resources and Information Services (웹 정보의 자동 의미연계를 통한 학술정보서비스의 확대 방안 연구)

  • Jeong, Do-Heon;Yu, So-Young;Kim, Hwan-Min;Kim, Hye-Sun;Kim, Yong-Kwang;Han, Hee-Jun
    • Journal of Information Management / v.40 no.1 / pp.133-156 / 2009
  • In this study, we link informal Web resources to KISTI NDSL's collections using automatic semantic indexing and tagging, in order to examine the feasibility of a service that recommends related documents based on the similarity between KISTI's formal information resources and informal Web resources. We collect and index Web resources and link them semantically and automatically, through STEAK, with KISTI's collections for NDSL retrieval. The macro precision, which measures retrieval precision per subject category, is 62.6%, and the micro precision, which measures retrieval precision per query, is 66.9%. The experts' evaluation score is 76.7. This study shows that NDSL retrieval results can be semantically linked with Web information resources and that the coverage of information services can be expanded to informal information resources.
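
The abstract reports precision averaged two ways but gives no formulas. The sketch below shows one possible reading, assuming that the "micro" figure is the mean of per-query precision and the "macro" figure is the mean of per-category precision. The judgment data and the function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical judgments (not from the paper):
# (subject category, query id, relevant results, retrieved results)
judgments = [
    ("physics", "q1", 3, 5),
    ("physics", "q2", 4, 5),
    ("biology", "q3", 1, 5),
]

def per_query_precision(rows):
    """Average each query's precision directly (assumed 'micro' reading)."""
    return mean(rel / ret for _, _, rel, ret in rows)

def per_category_precision(rows):
    """Average precision within each category first, then across categories
    (assumed 'macro' reading)."""
    by_category = defaultdict(list)
    for category, _, rel, ret in rows:
        by_category[category].append(rel / ret)
    return mean(mean(values) for values in by_category.values())

print(f"per-query average:    {per_query_precision(judgments):.3f}")
print(f"per-category average: {per_category_precision(judgments):.3f}")
```

Under this reading the two figures differ whenever categories contain different numbers of queries, which would be consistent with the abstract reporting two distinct values.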

A Study of Web Retrieval System for Children (아동을 위한 웹 검색 시스템에 관한 연구)

  • Choi, Jeong-Ho;Kim, Young-Chul;Moon, Il-Young
    • Journal of Digital Contents Society / v.8 no.4 / pp.601-606 / 2007
  • Library retrieval systems on the web have grown rapidly with the popularization of the Internet, yet it remains almost impossible for children to search the web on their own. One resulting problem is that retrieval results are often unrelated to children's libraries. In this paper, we design and implement a library information retrieval system that uses a 3D environment to provide more relevant library information for children. The system is built with PHP, Apache, and a MySQL database. First, the web page that gathers documents on the web is implemented in PHP using 3D. Finally, the Apache server returns retrieval results for user queries using PHP.

Video Retrieval System supporting Adaptive Streaming Service (적응형 스트리밍 서비스를 지원하는 비디오 검색 시스템)

  • 이윤채;전형수;장옥배
    • Journal of KIISE:Computing Practices and Letters / v.9 no.1 / pp.1-12 / 2003
  • Recently, much research has been carried out on distributed processing on the Internet and on multimedia data processing. Fast, convenient, high-quality multimedia services are needed. In this paper, we design and implement a clip-based video retrieval system that operates in real time in the Web environment. Our system consists of a content-based indexing system, which supports convenient services for video content providers, and a Web-based retrieval system, which makes information retrieval easy and varied for users on the Web. The content-based indexing system uses three main methods: key frame extraction by dividing the video data, clip file creation by clustering related information, and video database construction using the clip as the unit. The Web-based retrieval system uses keyword retrieval, two-dimensional browsing of key frames, and real-time display of clips. The system supports real-time retrieval of video clips in the Web environment and provides a stable multimedia service. The proposed methods are useful for providing video content and give users an easy way to search for the video content they intend to find.

A Usability Evaluation on the Visualization Techniques of Web Retrieval Results (웹 검색 결과 시각화 기법의 사용성 평가에 관한 연구)

  • Kim, Seong-Hee;Kim, Moon-Jeong
    • Journal of the Korean Society for Library and Information Science / v.41 no.3 / pp.181-199 / 2007
  • This study examines the usefulness of visualization techniques for displaying web retrieval results. We describe the concept of visualization techniques and evaluate the usability of the SearchCrystal and KartOO search engines, both of which use visualization techniques to display retrieval results. SearchCrystal scored higher than KartOO on the usability checklists.

Analysis of Korean Patent & Trademark Retrieval Query Log to Improve Retrieval and Query Reformulation Efficiency (질의로그 데이터에 기반한 특허 및 상표검색에 관한 연구)

  • Lee, Jee-Yeon;Paik, Woo-Jin
    • Journal of the Korean Society for Information Management / v.23 no.2 / pp.61-79 / 2006
  • To develop recommendations for improving patent and trademark retrieval efficiency, we analyzed 100,016 patent and trademark search requests made by 17,559 unique users over a period of 193 days. By analyzing 2,202 multi-query sessions, in which one user issued two or more queries consecutively, we discovered a number of clues for improving retrieval efficiency. The session analysis also led to suggestions for new system features that help users reformulate queries. Patent and trademark retrieval users were found to resemble typical web users in certain respects, especially in issuing short queries. However, they used Boolean operators more than typical web search users. The multi-query sessions showed that users had five intentions in reformulating queries: paraphrasing, specialization, generalization, alternation, and interruption. Web search engine users show the same five intentions.

A Study on the Practical Application Plans of a Library Information System through Web 2.0 Site Analysis (Web 2.0 기술 적용 사이트 분석을 통한 도서관 정보시스템의 활용방안에 관한 연구)

  • Park, Mi-Sung
    • Journal of Korean Library and Information Science Society / v.39 no.1 / pp.139-168 / 2008
  • With the advent of Web 2.0 technology, its versatility has been demonstrated on various web sites. Consequently, many library users have asked for this technology to be implemented in library information systems. This paper first seeks to deepen understanding of the concepts and technologies of Web 2.0, Library 2.0, and Catalog 2.0 (OPAC 2.0) through a literature survey. It then seeks to identify the technologies most applicable to library information systems by analyzing representative web sites and libraries that use Web 2.0 technology. Finally, based on the analysis results, this study suggests practical plans for catalog, retrieval, and auxiliary services aimed at a user-friendly library information system. We also point out several side effects expected from implementing the suggested plans.

Improved UDDI Model for Web Services with Quality based Retrieval (웹 서비스 품질 기반 검색을 위한 UDDI 개선 모델)

  • 윤석현;김동준;한상용
    • Journal of KIISE:Information Networking / v.31 no.5 / pp.511-518 / 2004
  • Web Services, which follow distributed object computing technologies such as DCOM and CORBA, provide a remote procedure call mechanism based on XML-based open standards such as SOAP, WSDL, and UDDI, and have attracted attention as a means of integration and collaboration in e-business. In particular, UDDI is the Web Services registry that enables Web Services to be registered and searched, and it thus provides infrastructure for Web Services. However, the existing UDDI has some problems: its search process is very simple, and it provides neither information on Web Services quality nor quality-based retrieval. Therefore, this study proposes an improved UDDI model that evaluates Web Services quality and uses this information for searching.

A Study on Layout Extraction from Internet Documents Through Xpath (Xpath에 의한 인터넷 문서의 레이아웃 추출 방법에 관한 연구)

  • Han Kwang-Rok;Sun Bok-Keun
    • The Journal of the Korea Contents Association / v.5 no.4 / pp.237-244 / 2005
  • Currently, most Internet documents, including news data, are created from predefined templates, but templates are usually defined only for the main data and do not help information retrieval deal with indexes, advertisements, header data, and the like. Templates in such forms are not appropriate when Internet documents are used as data for information retrieval. To process Internet documents in various areas of information retrieval, it is necessary to detect additional information such as advertisements and page indexes. This study therefore proposes a method of detecting the layout of web pages by identifying the characteristics and structure of the block tags that affect page layout and by calculating distances between web pages. In the experiment, we successfully extracted 640 documents from 1,000 samples, a recall rate of 64%. The method is intended to reduce the cost of automatic web document processing and improve its efficiency by applying it to document preprocessing for information retrieval tasks such as data extraction and document summarization.