• Title/Summary/Keyword: Text Retrieval

Search Result 342, Processing Time 0.026 seconds

A Hybrid Information Retrieval Model Using Metadata and Text (메타데이타와 텍스트 정보의 통합검색 모델)

  • Yoo, Jeong-Mok;Myaeng, Sung-Hyon;Kim, Sung-Soo;Lee, Mann-Ho
    • Journal of KIISE:Databases
    • /
    • v.34 no.3
    • /
    • pp.232-243
    • /
    • 2007
  • Metadata IR model has high precision and low recall because the query in Metadata IR model is strict that is, the query can express user information need exactly, while Full-text IR model has low precision and high recall because the query in Full-text IR model is a kind of simple keyword query which expresses user information need roughly. If user can translate one's information need into structured query well, the retrieval result will be improved. However, it is little possible to make relevant query without understanding characteristics of metadata. Unfortunately, most users do not interested in metadata, then they cannot construct well-made structured query. Amount of information contained in metadata is less than text information. In this paper, we suggest hybrid IR model using metadata and text which can provide users with lots of relevant documents by retrieving from metadata field and text field complementarily.

BADA-$IV/I^2R$: Design & Implementation of an Efficient Content-based Image Retrieval System using a High-Dimensional Image Index Structure (바다-$IV/I^2R$: 고차원 이미지 색인 구조를 이용한 효율적인 내용 기반 이미지 검색 시스템의 설계와 구현)

  • Kim, Yeong-Gyun;Lee, Jang-Seon;Lee, Hun-Sun;Kim, Wan-Seok;Kim, Myeong-Jun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.2S
    • /
    • pp.678-691
    • /
    • 2000
  • A variety of multimedia applications require multimedia database management systems to manage multimedia data, such as text, image, and video, as well as t support content-based image or video retrieval. In this paper we design and implement a content-based image retrieval system, BADA-IV/I$^2$R(Image Information Retrieval), which is developed based on BADA-IV multimedia database management system. In this system image databases can be efficiently constructed and retrieved with the visual features, such as color, shape, and texture, of image. we extend SQL statements to define image query based on both annotations and visual features of image together. A high-dimensional index structure, called CIR-tree, is also employed in the system to provide an efficient access method to image databases. We show that BADA-IV/I$^2$R provides a flexible way to define query for image retrieval and retrieves image data fast and effectively: the effectiveness and performance of image retrieval are shown by BEP(Bull's Eye Performance) that is used to measure the retrieval effectiveness in MPEG-7 and comparing the performance of CIR-tree with those of X-tree and TV-tree, respectively.

  • PDF

Web Information Retrieval Exploiting Markup Pattern (마크업 패턴을 이용한 웹 검색)

  • Kim, Min-Soo;Kim, Min-Koo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.13 no.6
    • /
    • pp.407-411
    • /
    • 2007
  • Over the years, great attention has been paid to the question of exploiting inherent semantic of HTML in the area of web document retrieval. Although HTML is mainly presentation oriented, HTML tags implicitly contain useful semantics that can be catch meaning of text. Focusing on this idea. in this paper we define 'markup pattern' and try to improve performance of web document retrieval using markup patterns. Markup pattern is a mirror of intends of web document publisher and an internal semantic of text on web document. To discover the markup pattern and exploit it, we suggest a new scheme for extracting concepts and weighting documents. For evaluation task, we select two domains-BBC and CNN web sites, and use their search engines to gather domain documents. We re-weight and re-score documents using proposed scheme, and show the performance improvement in the two domains.

Content-based Image Retrieval Using HSI Color Space and Neural Networks (HSI 컬러 공간과 신경망을 이용한 내용 기반 이미지 검색)

  • Kim, Kwang-Baek;Woo, Young-Woon
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.5 no.2
    • /
    • pp.152-157
    • /
    • 2010
  • The development of computer and internet has introduced various types of media - such as, image, audio, video, and voice - to the traditional text-based information. However, most of the information retrieval systems are based only on text, which results in the absence of ability to use available information. By utilizing the available media, one can improve the performance of search system, which is commonly called content-based retrieval and content-based image retrieval system specifically tries to incorporate the analysis of images into search systems. In this paper, a content-based image retrieval system using HSI color space, ART2 algorithm, and SOM algorithm is introduced. First, images are analyzed in the HSI color space to generate several sets of features describing the images and an SOM algorithm is used to provide candidates of training features to a user. The features that are selected by a user are fed to the training part of a search system, which uses an ART2 algorithm. The proposed system can handle the case in which an image belongs to several groups and showed better performance than other systems.

A screening study of human factors variables in designing multimedia information retrieval systems (정보습득용 멀티미디어 시스템의 인간공학적 설계변수 선별)

  • 김미정;한성호
    • Proceedings of the ESK Conference
    • /
    • 1995.10a
    • /
    • pp.56-61
    • /
    • 1995
  • Multimedia systems present information by using various media, for example, video, sound, music, animation, movie, etc., in addition to the text which has long been used for conveying the information. Among many multimedia applications, the multimedia information retrieval systems commercialized in the form of multimedia encyclopedia CD-ROMs, benefit by using various media for their ability to present information in an efficient and complete way. But using various media may cause end users' confusion and furthermore, poor user-interface design often exacerbates the systems. For appropriate design of the user interface of multimedia information retrieval systems, we investigated the characteristics of the multimedia information retrieval systems and listed 35 variables that might affect the usability of the user interface. And we selected 10 variables through some procedures such as brainstorming, literature survey, expert opinion, relevance analysis and feasibility analysis, in order to perform a screening study which will remarkably reduce the cost and time in conducting subsequent human factors experiments.

  • PDF

Video Data Modeling for Supporting Structural and Semantic Retrieval (구조 및 의미 검색을 지원하는 비디오 데이타의 모델링)

  • 복경수;유재수;조기형
    • Journal of KIISE:Databases
    • /
    • v.30 no.3
    • /
    • pp.237-251
    • /
    • 2003
  • In this paper, we propose a video retrieval system to search logical structure and semantic contents of video data efficiently. The proposed system employs a layered modelling method that orBanifes video data in raw data layer, content layer and key frame layer. The layered modelling of the proposed system represents logical structures and semantic contents of video data in content layer. Also, the proposed system supports various types of searches such as text search, visual feature based similarity search, spatio-temporal relationship based similarity search and semantic contents search.

Cluster-based Information Retrieval with Tolerance Rough Set Model

  • Ho, Tu-Bao;Kawasaki, Saori;Nguyen, Ngoc-Binh
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.2 no.1
    • /
    • pp.26-32
    • /
    • 2002
  • The objectives of this paper are twofold. First is to introduce a model for representing documents with semantics relatedness using rough sets but with tolerance relations instead of equivalence relations (TRSM). Second is to introduce two document hierarchical and nonhierarchical clustering algorithms based on this model and TRSM cluster-based information retrieval using these two algorithms. The experimental results show that TRSM offers an alterative approach to text clustering and information retrieval.

A Study on Automatic Keyword Classification (용어의 자동분류에 관한 연구)

  • Seo, Eun-Gyoung
    • Journal of the Korean Society for information Management
    • /
    • v.1 no.1
    • /
    • pp.78-99
    • /
    • 1984
  • In this paper, the automatic keyword classification which is one of the automatic construction methods of retrieval thesaurus is experimented to the Korean language on the basis that the use of retrieval thesaurus would increase the efficiency of information retrieval in the natural language retrieval system searching machine-readable data base. Furthermore, this paper proposes the application methods. In this experiment, the automatic keyword classification was based on the assumption that semantic relationships between terms can be found out by the statistical patterns of terms occurring in a text.

  • PDF

N- gram Adaptation Using Information Retrieval and Dynamic Interpolation Coefficient (정보검색 기법과 동적 보간 계수를 이용한 N-gram 언어모델의 적응)

  • Choi Joon Ki;Oh Yung-Hwan
    • MALSORI
    • /
    • no.56
    • /
    • pp.207-223
    • /
    • 2005
  • The goal of language model adaptation is to improve the background language model with a relatively small adaptation corpus. This study presents a language model adaptation technique where additional text data for the adaptation do not exist. We propose the information retrieval (IR) technique with N-gram language modeling to collect the adaptation corpus from baseline text data. We also propose to use a dynamic language model interpolation coefficient to combine the background language model and the adapted language model. The interpolation coefficient is estimated from the word hypotheses obtained by segmenting the input speech data reserved for held-out validation data. This allows the final adapted model to improve the performance of the background model consistently The proposed approach reduces the word error rate by $13.6\%$ relative to baseline 4-gram for two-hour broadcast news speech recognition.

  • PDF

Design of a hypermedia system for effective searching and browsing (탐색과 브라우징을 지원하는 하이퍼미디어 시스템의 설계)

  • 고영곤;최윤철
    • Journal of the Korean Society for information Management
    • /
    • v.10 no.1
    • /
    • pp.15-30
    • /
    • 1993
  • Hypermedia system supports associative linking concept for multimedia information using link and node concept, and overcomes the limitations of database system and text retrieval system in some application areas. This study shows the design and implementation of a hypermedia system which supports text, graphics, image and voice /sound information. This system has been designed to integrate the browsing and searching functions of the hypermedia system for efficient multimedia information retrieval and user-interface. To demonstrate the function and capability of the system, an application was made in the area of Bible and related information.

  • PDF