• Title/Summary/Keyword: Document Retrieval

A Natural Language Question Answering System-an Application for e-learning

  • Gupta, Akash;Rajaraman, Prof. V.
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.285-291 / 2001
  • This paper describes a natural language question answering system that students can use to get solutions to their queries. Unlike AI question answering systems that focus on generating new answers, the present system retrieves existing ones from question-answer files. Unlike information retrieval approaches that rely on a purely lexical metric of similarity between query and document, it uses a semantic knowledge base (WordNet) to improve its ability to match questions. The paper describes the design and current implementation of the system as an intelligent tutoring system. The main drawback of existing tutoring systems is that the computer poses a question to the students and guides them in reaching the solution to the problem. In the present approach, a student asks any question related to the topic and gets a suitable reply. Depending on the query, the student gets either a direct answer or a set of questions (at most 3 or 4) that most closely resemble the input. We further analyze application fields for this kind of system and discuss the scope for future research in this area.

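The retrieval behaviour described in this abstract (match a student's question against stored question-answer pairs, then return either a direct answer or a few similar questions) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the small `SYNONYMS` table is a hypothetical stand-in for WordNet, the Jaccard overlap and the `direct_threshold` value are not taken from the paper, and the sample question-answer pairs are invented.

```python
# Hypothetical stand-in for WordNet: maps a word to a shared concept label.
SYNONYMS = {
    "launch": "start", "begin": "start",
    "application": "program", "app": "program",
    "halt": "stop", "quit": "stop",
}
STOPWORDS = {"how", "do", "does", "i", "a", "an", "the", "is", "what"}

# Invented sample of a question-answer file.
QA_PAIRS = [
    ("How do I start a program?", "Run it from the shell."),
    ("What is a compiler?", "A translator from source code to machine code."),
    ("How do I stop a program?", "Press Ctrl+C."),
]

def concepts(text):
    """Reduce a question to a set of concept labels (synonyms collapsed)."""
    words = (w.strip("?.,!").lower() for w in text.split())
    return {SYNONYMS.get(w, w) for w in words if w and w not in STOPWORDS}

def similarity(q1, q2):
    """Jaccard overlap of concept sets (an assumed metric, not the paper's)."""
    a, b = concepts(q1), concepts(q2)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def answer(query, qa_pairs, direct_threshold=0.8, max_suggestions=3):
    """Return a direct answer if one stored question matches closely,
    otherwise up to max_suggestions of the most similar stored questions."""
    scored = sorted(
        ((similarity(query, q), q, a) for q, a in qa_pairs),
        key=lambda item: item[0],
        reverse=True,
    )
    if not scored:
        return ("suggestions", [])
    best_score, _, best_answer = scored[0]
    if best_score >= direct_threshold:
        return ("answer", best_answer)
    return ("suggestions", [q for s, q, _ in scored[:max_suggestions] if s > 0])
```

With these sample pairs, "How do I launch an application?" resolves to the stored answer for starting a program because both questions reduce to the same concept set, while a vaguer query falls back to a short list of related questions.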

Classification and Retrieval of XML Document for Teacher Support System based on Web (웹 기반의 교수 지원 시스템을 위한 XML 문서의 분류 및 검색)

  • Kim, Haeng-Kon;Kim, Ji-Young;Choi, Mun-Kyoung;Kim, Soung-Won
    • Proceedings of the Korea Information Processing Society Conference / 2001.10b / pp.1615-1618 / 2001
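  • With the rapid growth of the Internet, web-based learning has become widespread, and the web is also being applied to make school administration more efficient. In particular, supporting the management of complex school tasks for teachers on the web, along with learning and administrative materials, requires documents in XML form, which offers extensibility, compatibility, and convenience. To use the information in these XML documents efficiently and accurately for teacher support, a method is therefore needed to classify, store, and retrieve them appropriately. In this paper, as groundwork for storing and retrieving teacher-support documents written in XML, we define general metadata and DTD data, and use them to provide facet search, structure-based search, and keyword search so that users can easily find the documents they want. The aim is to store teacher-support documents on the web efficiently and accurately, and to let users retrieve the documents they want accurately and quickly.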

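The three search modes this paper names (facet search, structure-based search, and keyword search over XML documents) can be sketched with Python's standard `xml.etree.ElementTree`. This is only an illustration under stated assumptions: the sample documents, the choice to model metadata as attributes on the root element, and the function names are all hypothetical, and no DTD validation is performed.

```python
import xml.etree.ElementTree as ET

# Invented sample documents; metadata is modelled as root attributes.
DOCS = {
    "algebra.xml": "<doc type='syllabus' subject='math'><title>Algebra syllabus</title>"
                   "<body><schedule>Week 1: linear equations</schedule></body></doc>",
    "grades.xml": "<doc type='report' subject='math'><title>Grade report</title>"
                  "<body><table>Midterm scores</table></body></doc>",
    "lab.xml": "<doc type='syllabus' subject='physics'><title>Lab syllabus</title>"
               "<body><schedule>Week 1: motion equations</schedule></body></doc>",
}

def parse_all(docs):
    """Parse every XML string into an element tree root."""
    return {name: ET.fromstring(xml) for name, xml in docs.items()}

def facet_search(roots, **facets):
    """Keep documents whose metadata attributes match every given facet."""
    return [name for name, root in roots.items()
            if all(root.get(key) == value for key, value in facets.items())]

def structure_search(roots, path):
    """Keep documents that contain an element at the given path."""
    return [name for name, root in roots.items() if root.find(path) is not None]

def keyword_search(roots, keyword):
    """Keep documents whose text content contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [name for name, root in roots.items()
            if needle in " ".join(root.itertext()).lower()]
```

For example, `facet_search(parse_all(DOCS), type="syllabus", subject="math")` narrows by metadata, while `structure_search(..., "body/schedule")` selects by document structure regardless of metadata.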

A Method for Precision Improvement Based on Core Query Clusters and Term Proximity (핵심질의 클러스터와 단어 근접도를 이용한 문서 검색 정확률 향상 기법)

  • Jang, Kye-Hun;Lee, Kyung-Soon
    • The KIPS Transactions:PartB / v.17B no.5 / pp.399-404 / 2010
  • In this paper, we propose a method for precision improvement based on core clusters and term proximity. The method consists of three steps. First, the initially retrieved documents are clustered according to the combination of query terms that occurs in each document. Second, core clusters are selected using the proximity between query terms. Third, the documents in the core clusters are reranked based on the context information of the query. On the TREC AP test collection, experimental results for precision at the top documents (P@100) show that the proposed method improves on the language model by 11.2%.
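The three steps in this abstract (cluster by query-term combination, pick core clusters by term proximity, rerank) can be sketched roughly as follows. This is not the authors' implementation: the proximity measure (smallest token distance between two different query terms), the `max_distance` cutoff, and the reranking rule (core-cluster documents first, initial order otherwise preserved) are simplifying assumptions, and the paper's context-based reranking is not reproduced.

```python
from collections import defaultdict
from itertools import combinations

def min_pair_distance(tokens, query_terms):
    """Smallest token distance between two different query terms,
    or None if fewer than two distinct query terms occur."""
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        if token in query_terms:
            positions[token].append(index)
    best = None
    for a, b in combinations(sorted(positions), 2):
        distance = min(abs(i - j) for i in positions[a] for j in positions[b])
        best = distance if best is None else min(best, distance)
    return best

def rerank(ranked_docs, query_terms, max_distance=3):
    """ranked_docs: list of (doc_id, tokens) in initial retrieval order."""
    query_terms = set(query_terms)

    # Step 1: cluster documents by the combination of query terms they contain.
    clusters = defaultdict(list)
    for doc_id, tokens in ranked_docs:
        key = frozenset(t for t in tokens if t in query_terms)
        clusters[key].append((doc_id, tokens))

    # Step 2: a cluster is "core" if its query terms occur close together
    # on average (assumed criterion).
    core_ids = set()
    for members in clusters.values():
        distances = [min_pair_distance(tokens, query_terms) for _, tokens in members]
        distances = [d for d in distances if d is not None]
        if distances and sum(distances) / len(distances) <= max_distance:
            core_ids.update(doc_id for doc_id, _ in members)

    # Step 3: move core-cluster documents to the top; the stable sort keeps
    # the initial order within each group.
    ordered = sorted(ranked_docs, key=lambda doc: doc[0] not in core_ids)
    return [doc_id for doc_id, _ in ordered]

# Invented example: d2 contains all query terms close together.
RANKED = [
    ("d1", "stock prices rose while analysts ignored the bond market".split()),
    ("d3", "market prices".split()),
    ("d2", "the stock market crash".split()),
]
QUERY = ["stock", "market", "crash"]
```

Here `rerank(RANKED, QUERY)` promotes `d2`, whose query terms are adjacent, above `d1`, where "stock" and "market" are eight tokens apart.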

A study on the Implementation for the effective Digital Library System (효율적인 전자도서관 체제 구축을 위한 연구)

  • 류범종;강무영;조영화
    • Proceedings of the Korea Technology Innovation Society Conference / 1998.05a / pp.3-3 / 1998
  • A digital library system is a kind of multimedia data production system, which distinguishes it from traditional libraries. While the major role of traditional libraries is serving customers, a digital library system is a total system that produces knowledge in new formats from existing and new information and serves that knowledge to customers. A digital library system must therefore incorporate open and standardized document formats such as SGML, and make the knowledge shareable with other digital library systems. To show how an effective digital library system can be implemented efficiently, a technical alternative is introduced through the Digital Library Pilot project.

Design and Performance Evaluation of Data Models for the XML Document Management (XML 문서관리를 위한 데이터 모델 설계 및 성능평가)

  • 유재수;손충범;조혜영
    • Journal of Internet Computing and Services / v.2 no.5 / pp.59-70 / 2001
  • Recently, XML has become a standard for information exchange on the Internet in various fields. As a result, research on data modeling for storing and fetching XML documents has been active. However, existing approaches have the weakness that they support neither versioning nor fast fetching of documents when documents change in dynamic environments. In this paper, we propose four kinds of hybrid data modeling schemes that combine the fragmentation model and the nonfragmentation model. Our data modeling schemes are suitable for dynamic environments. We also present guidelines on how our hybrid data modeling schemes can be used for various applications. We show through various experiments that our hybrid schemes partially outperform the existing modeling schemes in terms of insertion time, storage space, and retrieval time.

A Review of Access Conditions of the W3 and the Inline Image/Sound Processing of HTML Document for Utilizing of the Virtual Library (W3 가상도서관 활용을 위한 HTML 문서작성과 이미지/사운드 처리)

  • 유사라
    • Journal of the Korean Society for Information Management / v.12 no.1 / pp.45-66 / 1995
  • Information users of the mid-1990s, who know the Internet and its useful information services, now expect virtual library services. In particular, the increasing demand for hypertext and hypermedia information in Internet settings has centered on the W3 with the man-page information. Accordingly, the paper describes the access methods along with brief concepts of the W3, and explains URLs and HTML. It also gives the retrieval layouts of unformatted data, including images and sounds, and then provides the information sources and software for W3 clients and servers in order to keep up with the most recently posted version of W3.

Attribute-Based Classification Method for Automatic Construction of Answer Set (정답문서집합 자동 구축을 위한 속성 기반 분류 방법)

  • 오효정;장문수;장명길
    • Journal of KIISE:Software and Applications / v.30 no.7_8 / pp.764-772 / 2003
  • The main thrust of our talk will be based on our experience in developing and applying an attribute-based classification technique in the context of an operational answer-set-driven retrieval system. To alleviate the difficulty and reduce the cost of manually constructing and maintaining answer sets, i.e., the knowledge base, we have devised a new method of automating the answer document selection process by using the notion of attribute-based classification, which is in and of itself novel. We attempt to explain through experiments how helpful the proposed method is for the knowledge base construction process.

The Development of the Prototype for Electronic Journal (전자저널 개발모형에 관한 연구)

  • 정준민
    • Journal of the Korean Society for Information Management / v.18 no.3 / pp.203-218 / 2001
  • A prototype of an electronic journal is proposed based on the functional properties of the Internet or web. Additionally, a methodology for a friendly relationship between the print-oriented journal and the electronic one is suggested. The electronic journal consists of 6 menus: administration area, community service, what's up & current papers, category service, searching, and extended retrieval service. The electronic journal is designed not only to inherit the attributes of the printed journal, but also to emphasize its electronic properties. At the end of the document, however, it is noted that in the near future all media and mechanisms related to information services will be synthesized and will evolve uniquely, inheriting those properties.

Jointly Image Topic and Emotion Detection using Multi-Modal Hierarchical Latent Dirichlet Allocation

  • Ding, Wanying;Zhu, Junhuan;Guo, Lifan;Hu, Xiaohua;Luo, Jiebo;Wang, Haohong
    • Journal of Multimedia Information System / v.1 no.1 / pp.55-67 / 2014
  • Image topic and emotion analysis is an important component of online image retrieval, which has become very popular in the rapidly growing social media community. However, due to the gaps between images and texts, there is very limited work in the literature on detecting an image's topics and emotions in a unified framework, although topics and emotions are two levels of semantics that often work together to comprehensively describe an image. In this work, a unified model, the Joint Topic/Emotion Multi-Modal Hierarchical Latent Dirichlet Allocation (JTE-MMHLDA) model, is proposed; it extends the previous LDA, mmLDA, and JST models to capture topic and emotion information at the same time from heterogeneous data. Specifically, a two-level graphical structured model is built to share topics and emotions across the whole document collection. The experimental results on a Flickr dataset indicate that the proposed model efficiently discovers images' topics and emotions, and significantly outperforms the text-only system by 4.4% and the vision-only system by 18.1% in topic detection, and outperforms the text-only system by 7.1% and the vision-only system by 39.7% in emotion detection.

Detection of Porno Sites on the Web using Fuzzy Inference (퍼지추론을 적용한 웹 음란문서 검출)

  • 김병만;최상필;노순억;김종완
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.5 / pp.419-425 / 2001
  • A method to detect large numbers of porno documents on the Internet is presented in this paper. The proposed method applies a fuzzy inference mechanism to conventional information retrieval techniques. First, several example porno sites are provided by users, and candidate words representing porno documents are extracted from these documents. In this process, lexical analysis and stemming are performed. Then, several values, namely the term frequency (TF), the document frequency (DF), and the heuristic information (HI), are computed for each candidate word. Finally, fuzzy inference is performed on these three values to weight the candidate words. The weights of the candidate words are used to determine whether a given site is sexual or not. In experiments on a small test collection, the proposed method proved useful for detecting sexual sites automatically.

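The weighting step in this abstract (fuzzy inference over TF, DF, and HI to produce a weight per candidate word, then a site-level decision from those weights) can be illustrated with a small zero-order Sugeno-style sketch. The membership functions, the rule table, the defuzzification by weighted average, and the site scoring formula are all assumptions chosen for illustration; the abstract does not specify them. Inputs are assumed to be normalized to the range 0 to 1.

```python
from itertools import product

def high(x):
    """Membership in 'high' for a value normalized to [0, 1] (assumed linear)."""
    return max(0.0, min(1.0, x))

def low(x):
    """Membership in 'low', the complement of 'high'."""
    return 1.0 - high(x)

# Assumed rule table: one rule per high/low combination of (TF, DF, HI).
# The rule's output weight is the fraction of inputs that are 'high'.
RULES = [
    (labels, sum(label is high for label in labels) / 3)
    for labels in product((high, low), repeat=3)
]

def word_weight(tf, df, hi):
    """Fuzzy AND (min) per rule, then a weighted average of rule outputs."""
    numerator = denominator = 0.0
    for (m_tf, m_df, m_hi), output in RULES:
        strength = min(m_tf(tf), m_df(df), m_hi(hi))
        numerator += strength * output
        denominator += strength
    return numerator / denominator if denominator else 0.0

def site_score(tokens, weights):
    """Average candidate-word weight over all tokens of a site (assumed formula)."""
    if not tokens:
        return 0.0
    return sum(weights.get(token, 0.0) for token in tokens) / len(tokens)

def is_flagged(tokens, weights, threshold=0.3):
    """Flag a site when its score reaches a hypothetical threshold."""
    return site_score(tokens, weights) >= threshold
```

A word with uniformly high TF, DF, and HI receives a weight near 1, a word with uniformly low values a weight near 0, and a site is flagged when enough of its tokens carry high weights.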