Search | Korea Science

Numerical Formula and Verification of Web Robot for Collection Speedup of Web Documents

Kim Weon;Kim Young-Ki;Chin Yong-Ok
- Journal of Internet Computing and Services / v.5 no.6 / pp.1-10 / 2004
A web robot is software that tracks and collects web documents on the Internet. The performance scalability of recent web robots has reached its limit as the number of web documents has increased sharply with the rapid growth of the Internet. Accordingly, research on performance scalability in searching and collecting web documents is strongly needed. This thesis presents the design of a multi-agent based web robot that speeds up document collection, in place of the existing sequentially executing web robot based on the Fork-Join method, together with an analysis of its performance scalability. To speed up collection, the multi-agent based web robot distributes the work among agents and handles inactive URLs ('dead-link' URLs), which are caused by overloaded web documents or by temporary network or web-server disturbances, in an independent process. The agents consist of four components: Loader, Extractor, Active URL Scanner, and Inactive URL Scanner. The thesis models the multi-agent based web robot on Amdahl's Law, introduces a numerical formula for collection speedup, and verifies the performance improvement by comparing values from the formula with experimental data. In addition, a 'Dynamic URL Partition algorithm' is introduced and implemented to minimize the workload on web servers by maximizing the access interval for each web server that is a collection target.
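The abstract does not reproduce the thesis's own speedup formula, so the following is only a minimal sketch of the textbook form of Amdahl's Law that the model is said to build on. The function name, the parameter names, and the 90% parallel fraction are assumptions chosen for illustration, not values from the thesis.

```python
def amdahl_speedup(parallel_fraction: float, n_agents: int) -> float:
    """Textbook Amdahl's Law: S(N) = 1 / ((1 - p) + p / N).

    parallel_fraction (p) is the share of collection time that can be
    divided among agents; n_agents (N) is the number of agents.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if n_agents < 1:
        raise ValueError("n_agents must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_agents)


if __name__ == "__main__":
    # Hypothetical workload: assume 90% of the collection time parallelizes.
    for n in (1, 2, 4, 8):
        print(n, round(amdahl_speedup(0.9, n), 3))
```

Under this form the speedup is bounded by 1 / (1 - p) no matter how many agents are added, which is why the serial share of the work matters as much as the agent count.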

An Implementation and Design Web-Based Instruction-Learning System Using Web Agent (웹 에이전트를 이용한 웹기반 교수-학습 시스템의 설계 및 개발)

Kim, Kap-Su;Lee, Keon-Min
- Journal of The Korean Association of Information Education / v.5 no.1 / pp.69-78 / 2001
Recently, computer-based learning has been moving from the CAI environment to the WBI environment. Most web documents for WBI learning are collected with the aid of search engines, and instructors use them as learning materials after evaluating the usefulness of the retrieved documents. However, this method has the following problems. First, the web documents selected by the instructor must be searched for repeatedly. Second, a separate instructional design step is needed to present the web documents to learners. Third, it is very difficult to analyze the relevance between the web documents and test results. In this work, we propose WAILS (Web Agent Instruction Learning System), which retrieves web documents for WBI learning and guides learners through the learning course. WAILS collects web documents for WBI learning with the aid of a web agent. Instructors can then evaluate the documents and present them to learners using the instruction-learning generating machine. Instructors can thus retrieve web documents and carry out the instruction-learning design at the same time, which facilitates WBI learning.

A Method of Efficient Web Crawling Using URL Pattern Scripts (URL 패턴 스크립트를 이용한 효율적인 웹문서 수집 방안)

Chang, Moon-Soo;Jung, June-Young
- Journal of the Korean Institute of Intelligent Systems / v.17 no.6 / pp.849-854 / 2007
It is difficult to collect only the target documents from the innumerable documents on the Web. One solution is to select target documents from web sites that serve many documents in the target domain. In this paper, we propose an intelligent crawling method that collects the needed documents based on a URL pattern script defined in XML. The proposed method applies efficiently to sites that serve structured information from a database. Using this crawling method, we collected 50 thousand web documents.
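The abstract does not show the paper's script format, so the sketch below only illustrates the general idea of driving a crawler from URL patterns declared in XML. The `crawl`, `follow`, and `collect` element names, the `pattern` attribute, and the example paths are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern script: element and attribute names are assumptions,
# not the schema used in the paper.
PATTERN_SCRIPT = r"""
<crawl site="http://example.com">
  <follow pattern="^/board/list\?page=\d+$"/>
  <collect pattern="^/board/view\?id=\d+$"/>
</crawl>
"""


def load_patterns(script: str):
    """Parse the XML script into compiled 'follow' and 'collect' regex lists."""
    root = ET.fromstring(script.strip())
    follow = [re.compile(e.get("pattern")) for e in root.findall("follow")]
    collect = [re.compile(e.get("pattern")) for e in root.findall("collect")]
    return follow, collect


def classify(path: str, follow, collect) -> str:
    """Decide what the crawler does with a URL path found on the site."""
    if any(p.search(path) for p in collect):
        return "collect"  # a target document: download and store it
    if any(p.search(path) for p in follow):
        return "follow"   # a listing page: fetch it only to find more links
    return "skip"         # everything else is ignored


if __name__ == "__main__":
    follow, collect = load_patterns(PATTERN_SCRIPT)
    for path in ("/board/list?page=3", "/board/view?id=42", "/about"):
        print(path, classify(path, follow, collect))
```

Keeping the patterns in a separate script means a new site can be supported by writing a new XML file rather than changing the crawler code.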
https://doi.org/10.5391/JKIIS.2007.17.6.849

The Present Condition of Opening of Archival Documents and Providing Reference Services in China (중국의 기록물 공개 및 서비스 현황)

Youn, Mi-Kyung
- Journal of Korean Society of Archives and Records Management / v.8 no.2 / pp.105-125 / 2008
This study reviews the system for opening archival documents and providing reference services, as well as archives web services, in China. Laws and regulations on archives management enacted since the founding of the People's Republic of China are analyzed as they relate to opening archival documents and providing reference services. The paper also considers the present condition of archives web services in China and the web service of the Beijing archives.
https://doi.org/10.14404/JKSARM.2008.8.2.105

A Dynamic Recommendation System Using User Log Analysis and Document Similarity in Clusters (사용자 로그 분석과 클러스터 내의 문서 유사도를 이용한 동적 추천 시스템)

김진수;김태용;최준혁;임기욱;이정현
- Journal of KIISE:Software and Applications / v.31 no.5 / pp.586-594 / 2004
Because web documents are created and disappear rapidly, users need a recommendation system that helps them browse web documents conveniently and accurately. One largely untapped source of knowledge about large data collections is contained in the cumulative experiences of individuals finding useful information in the collection. Recommendation systems attempt to extract such useful information by capturing and mining one or more measures of the usefulness of the data. Existing information filtering systems have the shortcoming that they require a user profile. Collaborative filtering systems have the shortcoming that users must first rate each web document, and in high-quantity, low-quality environments users may cover only a tiny percentage of the available documents. Dynamic recommendation systems based on user browsing patterns also provide users with unrelated web documents. This paper classifies web documents by the similarity between documents of the same web document type and extracts a database of users' sequential browsing patterns from the session information in the web server log file. When a user accesses a web document, the proposed dynamic recommendation system recommends the top-N set of associated web documents that are highly similar to the current document, together with a set that has sequential specificity, using the extracted information and the users' session information.
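The abstract does not name the similarity measure, so the sketch below assumes cosine similarity over raw term counts simply to illustrate the top-N step. The function names and the toy documents are hypothetical, and the sequential-pattern part of the system is not modeled.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_n_similar(current: str, docs: dict, n: int) -> list:
    """Return the names of the n documents most similar to `current`."""
    current_vec = Counter(docs[current].split())
    scored = [
        (cosine(current_vec, Counter(text.split())), name)
        for name, text in docs.items()
        if name != current
    ]
    # Highest similarity first; ties are broken by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:n]]


if __name__ == "__main__":
    # Toy documents, assumed to belong to one cluster.
    docs = {
        "a": "web cache replacement policy",
        "b": "web cache hit ratio",
        "c": "archives law china",
    }
    print(top_n_similar("a", docs, 2))
```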

Document Replacement Policy by Web Site Popularity (웹 사이트의 인기도에 의한 도큐먼트 교체정책)

Yoo, Hang-Suk;Chang, Tae-Mu
- Journal of the Korea Society of Computer and Information / v.13 no.1 / pp.227-232 / 2008
General web caches temporarily store documents on a per-document basis. When a requested document exists in the cache, the web cache sends it to the user. Conversely, when the document is not in the cache, the web cache requests it from the origin server, copies it into the cache, and then returns it to the user. When the cache capacity is exceeded, the web cache uses a replacement policy to replace existing documents with new ones. Typical replacement policies include document-based LRU and LFU, and various other policies are used to replace cached documents effectively. However, these policies consider only the time and frequency of document requests, not the popularity of each web site. Building on replacement policies that consider frequently requested documents and web site popularity, this paper presents a document replacement policy based on the popularity of each web site. The policy is intended to suit recent network environments by raising the cache hit ratio and managing cache contents efficiently, replacing intermittently requested documents with new ones.
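The abstract does not specify how site popularity enters the eviction decision. As one possible reading only, the sketch below evicts the document whose site has received the fewest requests, falling back to least-recently-used order on ties. The class name, the `fetch` callback, and the scoring rule are assumptions, not the paper's policy.

```python
from collections import Counter, OrderedDict
from urllib.parse import urlsplit


class SitePopularityCache:
    """Toy cache: evict from the least-requested site, LRU within ties."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.docs = OrderedDict()   # url -> body, least recently used first
        self.site_hits = Counter()  # host -> number of requests seen so far

    def get(self, url: str, fetch):
        """Return the document, calling fetch(url) on a cache miss."""
        self.site_hits[urlsplit(url).netloc] += 1
        if url in self.docs:
            self.docs.move_to_end(url)  # hit: mark as most recently used
            return self.docs[url]
        body = fetch(url)               # miss: ask the origin server
        if len(self.docs) >= self.capacity:
            self._evict()
        self.docs[url] = body
        return body

    def _evict(self):
        # min() returns the first minimal key, and iteration order is
        # LRU-first, so ties between sites fall back to plain LRU.
        victim = min(self.docs, key=lambda u: self.site_hits[urlsplit(u).netloc])
        del self.docs[victim]
```

With this rule, documents from rarely visited sites leave the cache before documents from frequently visited ones, even if they were requested more recently.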

Estimating Coverage of the Web Search Services Using Near-Uniform Sampling of Web Documents (균등한 웹 문서 샘플링을 이용한 웹 검색 서비스들의 커버리지 측정)

Jang, Sung-Soo;Kim, Kwang-Hyun;Lee, Joon-Ho
- The KIPS Transactions:PartD / v.15D no.3 / pp.305-312 / 2008
Web documents with useful information are widely available on the Internet and are accessible through web search services. For this reason, web search services look for better ways to collect more web documents, but they have difficulty determining their coverage of these web pages. This paper evaluates current coverage assessment methods and suggests a more effective technique: sampling web documents near-uniformly and checking how they are indexed by web search services, in order to assess both the absolute and relative coverage of the web search engines. The paper also compares Korean web search services using the suggested method. Both absolute and relative coverage were highest for Google, followed by Naver and Empas. The results are expected to help in estimating the coverage of web search services.
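The abstract does not define its coverage measures precisely, so the sketch below assumes one common reading: absolute coverage is the share of the sample a service has indexed, and relative coverage is the share of the sampled documents indexed by at least one service. The function names, the `is_indexed` callbacks, and the toy data are hypothetical, and the sample is taken as already drawn.

```python
def absolute_coverage(sample, is_indexed) -> float:
    """Share of sampled URLs that one service has indexed."""
    if not sample:
        raise ValueError("sample must not be empty")
    return sum(1 for url in sample if is_indexed(url)) / len(sample)


def relative_coverage(sample, services) -> dict:
    """Per-service share of the URLs indexed by at least one service.

    `services` maps a service name to a callable that reports whether
    the service has indexed a given URL.
    """
    union = [u for u in sample if any(check(u) for check in services.values())]
    if not union:
        return {name: 0.0 for name in services}
    return {
        name: sum(1 for u in union if check(u)) / len(union)
        for name, check in services.items()
    }


if __name__ == "__main__":
    # Toy data: services "A" and "B" are placeholders, not real engines.
    sample = ["u1", "u2", "u3", "u4"]
    services = {
        "A": {"u1", "u2", "u3"}.__contains__,
        "B": {"u1"}.__contains__,
    }
    for name, check in services.items():
        print(name, absolute_coverage(sample, check))
    print(relative_coverage(sample, services))
```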
https://doi.org/10.3745/KIPSTD.2008.15-D.3.305

A Theoretical Study on Indexing Methods using the Metadata for the Automatic Construction of a Thesaurus Browser (시소러스 브라우저 자동구현을 위한 Metadata를 이용한 색인어 처리방안에 대한 연구)

Seo, Whee
- Journal of Korean Library and Information Science Society / v.35 no.4 / pp.451-467 / 2004
This paper presents theoretical analyses of automatic indexing, which is vital in constructing a thesaurus browser, and of clustering algorithms for building hierarchical relations among terms, as well as methods for the automatic construction of a thesaurus browser. Methods for automatically selecting index terms from web documents are studied by surveying methods for analyzing and processing metadata, which plays the same bibliographic role in web documents as in traditional paper documents. Because metadata is absent from most web documents, the study also suggests adding metadata to web documents with an automatic metadata editor.

A Study on the Effective Method of Generating the Dynamic Web Documents in the Multi-user System (다중-사용자 시스템에서의 효과적인 동적 웹 문서 발생 방법에 관한 연구)

Lee Hyun-Chang;Lee Jong-Eon
- The Journal of Korean Institute of Communications and Information Sciences / v.31 no.5B / pp.478-485 / 2006
In this paper, we analyze the conditions for generating dynamic web documents on a multi-user server and propose an effective method for doing so. The PSSI technique replaces the complex process of modifying a CGI source program with simply editing an HTML web document kept in an external file. The technique combines the strengths of CGI (programming flexibility and security) with those of SSI (ease of modifying web documents). Because PSSI keeps web source documents in external files, we show that with a single CGI program each user can design and modify web documents in his or her own directory. This gives the PSSI technique an advantage in server management over the CGI method, which requires a CGI program to be set up whenever a service is needed.

Analysis and Implementation of a Web Document Converter for Wireless Internet Use XHTML On Mobile Communication Environment (이동통신환경에서 XHTML을 이용한 무선인터넷 문서변환기 분석 및 구현)

백진영;이종옥;조성언;조경룡
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2001.10a / pp.105-108 / 2001
This paper presents the design and implementation of a converter that transforms XHTML documents on a web server into WML documents when users access the web from portable devices. When a user accesses an XHTML (so-called HTML) web page and requests information, the document converter recognizes the structure of the XHTML document and reconstructs it into a simple WML document by using

