• Title/Summary/Keyword: Document information retrieval

A Study on Clustering Query-answer Documents with Structural Features (문서구조를 이용한 질의응답문서 클러스터링에 관한 연구)

  • Choi, Sang-Hee
    • Journal of the Korean Society for Library and Information Science / v.39 no.4 / pp.105-118 / 2005
  • As the number of users who ask and answer questions in query-answer document retrieval systems grows exponentially, the query-answer document has become a crucial information resource and a new type of information retrieval service. A query-answer document consists of three structural parts: a query, an explanation of the query, and the answers chosen by the user who asked the query. To identify the role of each structural part in representing the topics of documents, the three structural parts were clustered automatically, and the results of several clustering tests were compared in this study.

Personal Electronic Document Retrieval System Using Semantic Web/Ontology Technologies (시멘틱 웹/온톨로지 기술을 이용한 개인용 전자문서 검색 시스템)

  • Kim, Hak-Lae;Kim, Hong-Gee
    • The Journal of Society for e-Business Studies / v.12 no.1 / pp.135-149 / 2007
  • There are many kinds of applications and software components for managing files on a local computer, but it is very difficult to organize personal documents consistently and to search for the desired ones precisely. In this paper, we present the development of a document management and retrieval tool named Ontalk. Our system provides a semi-automatic metadata generator and an ontology-based search engine for electronic documents. Ontalk can create and import various ontologies in RDFS or OWL for describing the metadata. Built on .NET technology, our system can easily communicate with, or be flexibly plugged into, many different programs.

A Study on the Design of Interlibrary Loan System Linked with IRS (정보검색시스템과 연계된 상호대차시스템 설계에 관한 연구: 의학도서관을 중심으로)

  • 최흥식
    • Journal of the Korean Society for Information Management / v.18 no.2 / pp.165-186 / 2001
  • The purpose of this study is to explore practical methods for an interlibrary loan (ILL) system linked with an information retrieval system. The process and current status of interlibrary loan at medical libraries were investigated and analyzed. To find a way of using data from information retrieval and document requests at the same time, the author develops a way of linking PubMed, the Union Catalog, and the ILL system. The study shows that hit records from information retrieval can be saved in MEDLINE or XML/SGML format and reused for document delivery services. It appears efficient for the Union Catalog to be maintained by all member libraries and for data to be retrieved on the server and client sides at once. In addition, it was found that user information can be checked by IP address and that the ILL system can use the saved results to request documents.

An Indexing Model for Efficient Structure Retrieval of XML Documents (XML 문서의 효율적인 구조 검색을 위한 색인 모델)

  • Park, Jong-Gwan;Son, Chung-Beom;Gang, Hyeong-Il;Yu, Jae-Su;Lee, Byeong-Yeop
    • The KIPS Transactions: Part D / v.8D no.5 / pp.451-460 / 2001
  • In this paper, we propose an indexing model for efficient structure retrieval of XML documents. The proposed indexing model consists of structured information, which supports a wide range of queries such as content-based queries and structure-attribute queries at all levels of the document hierarchy, and index organizations constructed from that information. To support structured retrieval, a new representation method for structured information is presented. Using this structured information, we design a content index, a structure index, and an attribute index for efficient retrieval. We also explain the processing procedures for mixed queries and evaluate the performance of the proposed indexing model. The proposed indexing model is shown to achieve better retrieval performance than the existing method.

A Hierarchical Index Technique for Moving Image Retrieval System based on MPEG-7 (MPEG-7에 기반한 동영상 검색 시스템을 위한 계층형 인덱스 기법)

  • Kim Tack gon;Kim Woo saeng
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.10C / pp.1444-1450 / 2004
  • MPEG-7, which is based on XML, represents various information about the content of multimedia data and supports search and browsing according to users' needs. However, the MPEG-7 standard does not provide a retrieval method, and many XML indexing techniques are not suitable for retrieving MPEG-7 documents. As a result, much research activity and interest has recently emerged in the retrieval of MPEG-7 documents. In this paper, we propose a hierarchical index based on the structural information of MPEG-7 documents and review how to process queries based on high-level feature descriptions.

Design and Implementation of Video Documents Management System (비디오 문서 관리시스템의 설계 및 구현)

  • Kweon, Jae-Gil;Bae, Jong-Min
    • The Transactions of the Korea Information Processing Society / v.7 no.8 / pp.2287-2297 / 2000
  • Video documents, which carry audio-visual and other semantic information, have complex relationships among media. While user requests for topic retrieval or specific-region retrieval are increasing, it is difficult to meet these requests with existing design methodologies. To support the systematic management and varied retrieval capabilities of video documents, we must formulate a structural and systematic model of metadata using semantic and structural information that is abstracted automatically or manually. This paper suggests a generic metadata model that is used to analyze the characteristics of video documents, supports various query types, and serves as a generic framework for video applications. We propose the generic integrated management model (GIMM) for generic metadata, design a video documents management system (VDMS), and implement it using GIMM.

Document Filtering Algorithm for Efficient Preprocessing of XML Information Retrieval (XML 정보검색의 효율적 전처리를 위한 문서여과 알고리즘)

  • Kong Yong-Hae;Kim Myung-Sook
    • Journal of the Korea Academia-Industrial cooperation Society / v.6 no.1 / pp.1-11 / 2005
  • This paper proposes a preprocessing method for efficiently processing XML queries in information retrieval over a large number of XML documents. Conventional preprocessing methods filter out XML documents by parsing each document for the query keywords or by comparing query signatures with signatures generated from the XML documents. However, these methods depend on the query and are very inefficient for a large number of XML documents. To address this, we generate a universal DTD based on the ontology of a domain. The universal DTD is applicable to XML documents that contain information from the same domain, even when they have different structures and attributes. Using the universal DTD, we then filter out the XML documents that do not belong to the domain. We evaluate the performance of this method through experiments.

A Study on the Visual Representation of TREC Text Documents in the Construction of Digital Library (디지털도서관 구축과정에서 TREC 텍스트 문서의 시각적 표현에 관한 연구)

  • Jeong, Ki-Tai;Park, Il-Jong
    • Journal of the Korean Society for Information Management / v.21 no.3 / pp.1-14 / 2004
  • Visualization of documents helps users when they search for similar documents. All research in information retrieval addresses the problem of a user with an information need facing a data source that contains an acceptable solution to that need. In various contexts, adequate solutions to this problem have included alphabetized cubbyholes housing papyrus rolls, microfilm registers, card catalogs, and inverted files coded onto discs. Many information retrieval systems rely on the use of a document surrogate. Though they might be surprised to discover it, nearly every information seeker uses an array of document surrogates. Summaries, tables of contents, abstracts, reviews, and MARC records are all document surrogates. That is, they stand in for a document, allowing a user to make some decision regarding it: whether to retrieve a book from the stacks, whether to read an entire article, and so on. In this paper, another type of document surrogate is investigated using a method of grouping term lists. Using multidimensional scaling (MDS), those surrogates are visualized on a two-dimensional graph. The distances between dots on the graph represent the similarity of the documents: the closer the dots, the more similar the documents.

A Study of Document Ranking Algorithms in a P-norm Retrieval System (P-norm 검색의 문헌 순위화 기법에 관한 실험적 연구)

  • 고미영;정영미
    • Journal of the Korean Society for Information Management / v.16 no.1 / pp.7-30 / 1999
  • This study aims to develop effective document ranking algorithms for a P-norm retrieval system, which can be implemented on top of a Boolean retrieval system without major difficulty, by using non-statistical term weights based on document structure. It also aims to enhance performance by introducing a rank adjustment process that rearranges the ranks of retrieved documents according to the similarity between the top-ranked documents and the rest. Of the non-statistical term weighting algorithms, this study uses field weight and term pair distance weight. For the rank adjustment process, five retrieval experiments were performed, ranging from using one record for the similarity measurement to using the first five records. The results show that non-statistical term weights are highly effective and that the rank adjustment process enhances performance further.
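
For reference, the P-norm model (the extended Boolean model of Salton, Fox, and Wu) scores an OR or AND query from per-term document weights. The sketch below is a minimal illustration in Python, not the authors' implementation: the `FIELD_WEIGHTS` values, the field names, and the helper names are assumptions made for this example, and the term pair distance weight and the rank adjustment step are not covered.

```python
# Hypothetical field weights: a term counts more when it appears in a more
# prominent field. These values are illustrative assumptions only.
FIELD_WEIGHTS = {"title": 1.0, "keywords": 0.8, "abstract": 0.5}


def field_weight(term, doc):
    """Non-statistical weight of `term` in `doc`: the largest weight among
    the fields whose text contains the term, or 0.0 if it is absent."""
    return max(
        (FIELD_WEIGHTS.get(field, 0.0)
         for field, text in doc.items()
         if term in text.lower().split()),
        default=0.0,
    )


def pnorm_or(doc_weights, query_weights, p):
    """P-norm similarity for an OR query: a p-norm average of term weights."""
    num = sum((a ** p) * (w ** p) for a, w in zip(query_weights, doc_weights))
    den = sum(a ** p for a in query_weights)
    return (num / den) ** (1.0 / p)


def pnorm_and(doc_weights, query_weights, p):
    """P-norm similarity for an AND query: one minus the p-norm average of
    how far each term weight falls short of 1."""
    num = sum((a ** p) * ((1.0 - w) ** p)
              for a, w in zip(query_weights, doc_weights))
    den = sum(a ** p for a in query_weights)
    return 1.0 - (num / den) ** (1.0 / p)


# Toy document with two fields (made up for the example).
doc = {"title": "XML retrieval", "abstract": "indexing of XML documents"}
terms = ["xml", "indexing"]
weights = [field_weight(t, doc) for t in terms]

print(pnorm_or(weights, [1.0, 1.0], p=2.0))
print(pnorm_and(weights, [1.0, 1.0], p=2.0))
```

With p = 1 and equal query weights, both functions reduce to the mean of the term weights; as p grows, the OR score approaches the largest term weight and the AND score approaches the smallest, which is how the model moves between vector-style and strict Boolean behavior.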

Medicine Ontology Building based on Semantic Relation and Its Application (의미관계 정보를 이용한 약품 온톨로지의 구축과 활용)

  • Lim Soo-Yeon;Park Seong-Bae;Lee Sang-Jo
    • Journal of KIISE: Software and Applications / v.32 no.5 / pp.428-437 / 2005
  • An ontology consists of a set of concepts, with their definitions, that represent the characteristics of a given domain, together with the relationships between these elements. To reduce the time and cost of building an ontology, this paper proposes a semi-automatic method for building a domain ontology from the results of text analysis. To do this, we propose a terminology processing method and use the extracted concepts and the semantic relations between them to build the ontology. The pharmacy field is selected as the experimental domain, and the resulting ontology is applied to document retrieval. To demonstrate the usefulness of the hierarchical relations in the ontology for retrieving documents, we compared a typical keyword-based retrieval method with an ontology-based retrieval method, which uses related information in the ontology as feedback. As a result, the latter improved precision and recall by 4.97% and 0.78%, respectively.
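
The comparison described in this abstract can be illustrated with a small Python sketch of keyword retrieval versus retrieval after expanding the query with hierarchically related terms, scored by precision and recall. This is a toy example under stated assumptions: the `NARROWER` hierarchy, the documents, the relevance judgments, and all function names are hypothetical, and expansion by narrower terms is only one possible reading of how the ontology's related information is used.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def expand(query_terms, narrower):
    """Add the narrower terms of each query term (assumed expansion rule)."""
    expanded = set(query_terms)
    for term in query_terms:
        expanded.update(narrower.get(term, []))
    return expanded


def retrieve(terms, docs):
    """Return the ids of documents that contain at least one of the terms."""
    terms = set(terms)
    return sorted(doc_id for doc_id, text in docs.items()
                  if terms & set(text.lower().split()))


# Toy data, made up for this example.
NARROWER = {"analgesic": ["aspirin", "ibuprofen"]}
DOCS = {
    "d1": "aspirin dosage guide",
    "d2": "analgesic overview",
    "d3": "antibiotic resistance",
}
RELEVANT = {"d1", "d2"}

query = ["analgesic"]
keyword_hits = retrieve(query, DOCS)
ontology_hits = retrieve(expand(query, NARROWER), DOCS)

print("keyword :", precision_recall(keyword_hits, RELEVANT))
print("ontology:", precision_recall(ontology_hits, RELEVANT))
```

In this toy setup the expanded query also finds the document that mentions only the narrower term, so recall rises while precision is unchanged; the figures reported in the abstract come from the authors' own experiments and are not reproduced here.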