• Title/Summary/Keyword: Web documents


Design of Templating System for Web Publication (웹 출판을 위한 템플릿 시스템의 설계)

  • Abdallah, Hisham;Koo, Heung-Seo
    • Proceedings of the Korea Information Processing Society Conference / 2002.11c / pp.1777-1780 / 2002
  • This paper presents a well-designed templating system for CMS web publication using XML/XSL technology. The primary motivation is the need for a Web CMS to separate content from layout and logic. Our system provides a GUI XSLT editor (x-editor) for creating and modifying XSLT stylesheet documents easily. These documents add "layout" and "look and feel" information to an XML document that contains content and functionality. The modified XML document is processed by an XML template engine to produce dynamic or static web sites.


Analysis of Preference Criteria for Personalized Web Search (개인화된 웹 검색을 위한 선호 기준 분석)

  • Lee, Soo-Jung
    • The Journal of Korean Association of Computer Education / v.13 no.1 / pp.45-52 / 2010
  • With the rapid increase in the number of web documents, the problem of information overload in Internet search is becoming serious. To improve web search results, previous studies employed user queries and preferred words, as well as the number of links in web documents. In this study, the performance of search results exploiting these two criteria is examined, and other preference criteria for web documents are analyzed. Experimental results show that personalized web search employing queries and preferred words yields up to 1.7 times better performance than the current search engine, and that search using the number of links gives up to 1.3 times better performance. Although the content of a document is found to be the user's first preference criterion for web documents, readability and images in the document are also given a large weight. Therefore, the performance of web search personalization algorithms should improve greatly if they incorporate objective data reflecting each user's characteristics in addition to the number of queries and preferred words.


Semi Automatic Ontology Generation about XML Documents

  • Gu Mi Sug;Hwang Jeong Hee;Ryu Keun Ho;Jung Doo Yeong;Lee Keum Woo
    • Proceedings of the KSRS Conference / 2004.10a / pp.730-733 / 2004
  • Recently, XML (eXtensible Markup Language) has been becoming the standard for exchanging documents on the web. As the amount of information grows with the development of Internet technology, the semantic web is emerging to give more exact information retrieval results than the existing web. Ontology, the basis of the semantic web, provides a basic knowledge system for expressing particular knowledge, so it can yield exact retrieval results. An ontology defines the concepts of a specific domain and the relationships between them, and it has a hierarchy similar to a taxonomy. In this paper, we propose semi-automatic ontology generation based on XML documents, which interest many researchers as a means of knowledge expression. To construct the ontology for a particular domain, we suggest an algorithm to determine the domain; the domain we chose is the extraction of movie information on the web. We used generalized association rules, a data mining method, to generate the ontology from the tags and contents of XML documents. XTM (XML Topic Maps), an ISO standard, is used as the ontology language. The advantage of this method is that, because the ontology is built from terms frequently used in documents related to the domain, it is useful for querying and retrieval within that domain.
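
As a rough illustration of mining association rules from XML tags, the sketch below treats each document's set of tag names as one transaction and reports tag pairs that meet support and confidence thresholds. It is not the authors' algorithm: it computes plain pairwise rules without the taxonomy that generalized association rules add, and the function names, thresholds, and movie-like sample tags are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from itertools import permutations


def tag_set(xml_text):
    """Return the set of tag names that occur anywhere in one XML document."""
    return {element.tag for element in ET.fromstring(xml_text).iter()}


def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Find rules x -> y between single tags.

    support(x -> y)    = fraction of documents containing both x and y
    confidence(x -> y) = documents containing both / documents containing x
    """
    n = len(transactions)
    if n == 0:
        return []
    items = sorted(set().union(*transactions))
    rules = []
    for x, y in permutations(items, 2):
        both = sum(1 for t in transactions if x in t and y in t)
        has_x = sum(1 for t in transactions if x in t)
        support = both / n
        confidence = both / has_x if has_x else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((x, y, support, confidence))
    return rules
```

A rule such as `director -> title` with high confidence suggests that the two tags belong to the same concept cluster, which is the kind of evidence an ontology builder could review by hand.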


Development of A Web Mining System Based On Document Similarity (문서 유사도 기반의 웹 마이닝 시스템 개발)

  • 이강찬;민재홍;박기식;임동순;우훈식
    • The Journal of Society for e-Business Studies / v.7 no.1 / pp.75-86 / 2002
  • In this study, we propose design issues and the structure of a web mining system, drawing on our development experience, and develop a system for knowledge integration in World Wide Web environments. The developed system has three main functions: 1) gathering documents using a search agent; 2) determining similarity coefficients between any two documents from term frequencies; and 3) clustering documents based on the similarity coefficients. We believe the developed system can be used for knowledge discovery in relatively narrow domains, such as news classification and index term generation in knowledge management.
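
Functions 2) and 3) above can be sketched in a few lines. The abstract does not say which similarity coefficient or clustering method the system uses, so this example assumes cosine similarity over raw term frequencies and a simple single-link grouping with a fixed threshold; all names and the threshold value are hypothetical.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Lowercase the text, split on non-letters, and count each term."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(tf_a, tf_b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_documents(docs, threshold=0.3):
    """Group documents connected by a chain of pairs at or above the threshold.

    Uses union-find, so the result is single-link clustering.
    Returns a sorted list of clusters, each a list of document indices.
    """
    tfs = [term_frequencies(d) for d in docs]
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine_similarity(tfs[i], tfs[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

The pairwise loop is quadratic in the number of documents, which is acceptable for the relatively narrow domains the abstract mentions but would need an index for larger collections.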


A Study on the Effect of Data Fusion on the Retrieval Effectiveness of Web Documents (데이터 결합이 웹 문서 검색성능에 미치는 영향 연구)

  • Park, Ok-Hwa;Chung, Young-Mee
    • Journal of Information Management / v.38 no.1 / pp.1-19 / 2007
  • This study investigates the effect of data fusion on retrieval effectiveness through an experiment that combines multiple representations of Web documents. The types of document representation combined in the study include content terms, links, anchor text, and URL. The experimental results showed that data fusion combining these document representation methods in the Web environment did not bring any significant improvement in retrieval effectiveness.
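
For readers unfamiliar with data fusion, the sketch below shows one common way to combine the score lists produced by different document representations. The abstract does not state which fusion formula the study used; min-max normalization followed by score summation (often called CombSUM) is assumed here, and the function names and sample scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale one run's scores to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(runs):
    """Fuse several runs by summing each document's normalized scores.

    runs is a list of {doc_id: score} dictionaries, one per representation
    (for example content terms, anchor text, or URL).
    Returns (doc_id, fused_score) pairs, best first; ties break by doc_id.
    """
    fused = {}
    for scores in runs:
        for doc, s in min_max_normalize(scores).items():
            fused[doc] = fused.get(doc, 0.0) + s
    return sorted(fused.items(), key=lambda pair: (-pair[1], pair[0]))
```

A document missing from one run simply contributes nothing for that representation, so documents retrieved by several representations tend to rise in the fused ranking.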

Automatic Generation of Ontology with Simplified Sentences and Transfer Rules (단문화와 변환 규칙을 이용한 온톨로지의 자동 생성)

  • Park, In-Cheol
    • Journal of the Korea Academia-Industrial cooperation Society / v.8 no.5 / pp.1092-1097 / 2007
  • Ontology construction requires much time and cost, which is why it is difficult to build a commercial semantic web. To solve this problem, ontologies must be constructed automatically. In this paper, we propose a system that automatically generates an ontology from web documents containing important information on the web. The proposed system has two steps. The first is a simplification process, which generates simple sentences from all sentences in the documents. The second is an ontology generation process using transfer rules. Our system is very useful for application domains in which many documents are frequently updated or inserted, such as online shopping malls.


A Comparative Study of Feature Selection Methods for Korean Web Documents Clustering (한글 웹 문서 클러스터링 성능향상을 위한 자질선정 기법 비교 연구)

  • Kim Young-Gi
    • Journal of the Korean Society for Library and Information Science / v.39 no.1 / pp.45-58 / 2005
  • This paper is a comparative study of feature selection methods for Korean web document clustering. First, we focused on how the term feature and the co-link of web documents affect clustering performance. We clustered web documents by native term feature, by co-link, and by both, and compared the results with the originally allocated categories. We then selected term features for each category from training documents using $\chi^2$, Information Gain (IG), and Mutual Information (MI), and applied these features to other experimental documents. In addition, we suggested a new method named Max Feature Selection, which selects terms that have the maximum count for a category in each experimental document, applied the $\chi^2$ (or MI or IG) value to each term instead of its term frequency in the documents, and clustered them. In the results, $\chi^2$ shows better performance than IG or MI, but the difference appears to be slight. When we applied the Max Feature Selection method, however, clustering performance improved notably. Max Feature Selection is a simple but effective means of feature space reduction and shows powerful performance for Korean web document clustering.
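
The $\chi^2$ score used for term selection can be computed from a two-by-two contingency table of term presence against category membership. The sketch below uses the standard textbook form of the statistic; the function names and the `(terms, label)` input layout are assumptions for this example, and the paper's Max Feature Selection method is not reproduced here.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for one term and one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def top_terms(docs, category, k):
    """Rank terms by chi-square against `category` and keep the best k.

    docs is a list of (set_of_terms, label) pairs.
    Ties break alphabetically so the result is deterministic.
    """
    vocabulary = set().union(*(terms for terms, _ in docs))
    scored = []
    for term in vocabulary:
        a = sum(1 for terms, label in docs if label == category and term in terms)
        b = sum(1 for terms, label in docs if label != category and term in terms)
        c = sum(1 for terms, label in docs if label == category and term not in terms)
        d = sum(1 for terms, label in docs if label != category and term not in terms)
        scored.append((term, chi_square(a, b, c, d)))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:k]
```

Note that the statistic is symmetric: a term that never appears in the category scores as high as one that appears only there, so a selection step may want to check the direction of the association as well.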

Evaluation of Mobile Unified Search Contents of Naver and Google Korea (네이버와 구글의 모바일 통합 검색 컨텐츠 평가)

  • Park, So-Yeon
    • Journal of Korean Library and Information Science Society / v.42 no.4 / pp.263-280 / 2011
  • This study aims to investigate the current status of the mobile search services of Korean search portals and to analyze the mobile unified search contents of Naver and Google Korea. In particular, it analyzed characteristics of mobile unified search such as the number of retrieved documents, collection distribution, and yearly distribution. Documents were also evaluated in terms of relevance, credibility, and currency. The study compared the quality of Naver's unified Web best and unified Web with that of Google's best Web documents and Web documents. The correlation between a document's ranking and its relevance was analyzed. The results of this study can be applied to the portals' effective development of mobile search services.

Web Site Construction Using Internet Information Extraction (인터넷 정보 추출을 이용한 웹문서 구조화)

INFORMATION SEARCH BASED ON CONCEPT GRAPH IN WEB

  • Lee, Mal-Rey;Kim, Sang-Geun
    • Journal of applied mathematics & informatics / v.10 no.1_2 / pp.333-351 / 2002
  • This paper introduces a search method based on a conceptual graph. Hyperlink information is essential for constructing a conceptual graph on the web. This information is very useful because it provides human-authored summaries and further linkage for constructing the conceptual graph. It also has properties that express review, relation, hierarchy, generality, and visibility. Using these properties, we extracted the keywords of web documents and built a conceptual graph among the keywords sampled from web pages. The method extracts the keywords of web pages from anchor text, one kind of hyperlink information, and abstracts the hyperlinks between web pages as link relations between the keywords of each page. We suggest this search method, which provides query word expansion or domain knowledge through the conceptual graph of keywords. Domain knowledge is conceptualized as the conceptual graph. The method therefore does not merely list web documents, which is the defect of previous search systems; it also gives an index of concepts associated with the query word.
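
One possible reading of the anchor-text step is sketched below: the anchor text of every link pointing at a page is taken as that page's keyword, and each hyperlink between two pages becomes an edge between their keywords. This is an illustration built on Python's standard `html.parser`, not the authors' implementation; the class and function names and the dictionary-of-pages input are assumptions.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip().lower()
            self.links.append((self._href, text))
            self._href = None


def build_keyword_graph(pages):
    """Turn page-to-page hyperlinks into keyword-to-keyword edges.

    pages maps a URL to its HTML source. A page's keywords are the anchor
    texts of links that point to it. Returns {keyword: set_of_linked_keywords}.
    """
    links = []
    for url, html in pages.items():
        collector = AnchorCollector()
        collector.feed(html)
        links.extend((url, href, text) for href, text in collector.links)

    keywords = {}
    for _, href, text in links:
        if text:
            keywords.setdefault(href, set()).add(text)

    graph = {}
    for source, target, _ in links:
        for a in keywords.get(source, ()):
            for b in keywords.get(target, ()):
                if a != b:
                    graph.setdefault(a, set()).add(b)
    return graph
```

Looking up a query word in the returned graph gives its neighboring keywords, which could serve as candidates for query word expansion.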