• Title/Summary/Keyword: Retrieval Effectiveness

Search Result 253, Processing Time 0.022 seconds

A Study on the Effectiveness of Information Retrieval (정보검색효율에 관한 연구)

  • Yoon Koo-ho
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.8
    • /
    • pp.73-101
    • /
    • 1981
  • Retrieval effectiveness is the principal criterion for measuring the performance of an information retrieval system. The effectiveness of a retrieval system depends primarily on the extent to which it can retrieve wanted documents without retrieving unwanted ones. So, ultimately, effectiveness is a function of the relevant and nonrelevant documents retrieved. Consequently, 'relevance' of information to the user's request has become one of the most fundamental concept encountered in the theory of information retrieval. Although there is at present no consensus as to how this notion should be defined, relevance has been widely used as a meaningful quantity and an adequate criterion for measures of the evaluation of retrieval effectiveness. The recall and precision among various parameters based on the 'two-by-two' table (or, contingency table) were major considerations in this paper, because it is assumed that recall and precision are sufficient for the measurement of effectiveness. Accordingly, different concepts of 'relevance' and 'pertinence' of documents to user requests and their proper usages were investigated even though the two terms have unfortunately been used rather loosely in the literature. In addition, a number of variables affecting the recall and precision values were discussed. Some conclusions derived from this study are as follows: Any notion of retrieval effectiveness is based on 'relevance' which itself is extremely difficult to define. Recall and precision are valuable concepts in the study of any information retrieval system. They are, however, not the only criteria by which a system may be judged. The recall-precision curve represents the average performance of any given system, and this may vary quite considerably in particular situations. Therefore, it is possible to some extent to vary the indexing policy, the indexing policy, the indexing language, or the search methodology to improve the performance of the system in terms of recall and precision. The 'inverse relationship' between average recall and precision could be accepted as the 'fundamental law of retrieval', and it should certainly be used as an aid to evaluation. Finally, there is a limit to the performance(in terms of effectiveness) achievable by an information retrieval system. That is : "Perfect retrieval is impossible."

  • PDF

The Study On the Effectiveness of Information Retrieval in the Vector Space Model and the Neural Network Inductive Learning Model

  • Kim, Seong-Hee
    • The Journal of Information Technology and Database
    • /
    • v.3 no.2
    • /
    • pp.75-96
    • /
    • 1996
  • This study is intended to compare the effectiveness of the neural network inductive learning model with a vector space model in information retrieval. As a result, searches responding to incomplete queries in the neural network inductive learning model produced a higher precision and recall as compared with searches responding to complete queries in the vector space model. The results show that the hybrid methodology of integrating an inductive learning technique with the neural network model can help solve information retrieval problems that are the results of inconsistent indexing and incomplete queries--problems that have plagued information retrieval effectiveness.

  • PDF

A Comparative Study of Local Features in Face-based Video Retrieval

  • Zhou, Juan;Huang, Lan
    • Journal of Computing Science and Engineering
    • /
    • v.11 no.1
    • /
    • pp.24-31
    • /
    • 2017
  • Face-based video retrieval has become an active and important branch of intelligent video analysis. Face profiling and matching is a fundamental step and is crucial to the effectiveness of video retrieval. Although many algorithms have been developed for processing static face images, their effectiveness in face-based video retrieval is still unknown, simply because videos have different resolutions, faces vary in scale, and different lighting conditions and angles are used. In this paper, we combined content-based and semantic-based image analysis techniques, and systematically evaluated four mainstream local features to represent face images in the video retrieval task: Harris operators, SIFT and SURF descriptors, and eigenfaces. Results of ten independent runs of 10-fold cross-validation on datasets consisting of TED (Technology Entertainment Design) talk videos showed the effectiveness of our approach, where the SIFT descriptors achieved an average F-score of 0.725 in video retrieval and thus were the most effective, while the SURF descriptors were computed in 0.3 seconds per image on average and were the most efficient in most cases.

A study on improving the effectiveness of a boolean retrieval system with feedback information (피드백 정보를 이용한 불논리 검색 시스템의 성능 증진에 관한 실험적 연구)

  • 신은자;정영미
    • Journal of the Korean Society for information Management
    • /
    • v.15 no.1
    • /
    • pp.129-148
    • /
    • 1998
  • The objective of this study is to develop a useful relevance feedback retrieval technique that can be applied to the current Boolean retrieval system. A feedback retrieval technique based on user model is recommended here to achieve this objective. To prove the usefulness of this feedback retrieval technique, two enhanced Boolean retrieval models including DNF model and P-norm model were evaluated first through retrieval effectiveness experiments. After selecting DNF model as the retrieval model, two feedback retrieval experiments were performed using initial and extended user models. It is proved that the feedback retrieval based on user model can greatly enhance the effectiveness of a Boolean retrieval system with a small modification.

  • PDF

A Study on the Effect of Data Fusion on the Retrieval Effectiveness of Web Documents (데이터 결합이 웹 문서 검색성능에 미치는 영향 연구)

  • Park, Ok-Hwa;Chung, Young-Mee
    • Journal of Information Management
    • /
    • v.38 no.1
    • /
    • pp.1-19
    • /
    • 2007
  • This study investigates the effect of data fusion on the retrieval effectiveness by performing an experiment combining multiple representations of Web documents. The types of document representation combined in the study include content terms, links, anchor text, and URL. The experimental results showed that the data fusion technique combining document representation methods in Web environment did not bring any significant improvement in retrieval effectiveness.

A Study on the Improvement of Retrieval Effectiveness to Clustered and Filtered Document through Query Expansion (질의어 확장에 기반을 둔 클러스터링 및 필터링 문서의 검색효율 제고에 관한 연구)

  • 노동조
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.14 no.1
    • /
    • pp.219-230
    • /
    • 2003
  • The purpose of this study is to improve of retrieval effectiveness to clustered and filtered document through query expansion. The result of this research prove that extended queries and documents, information in encyclopedia, clustering and filtering techniques are effective to promote retrieval effectiveness.

  • PDF

A Study on measuring techniques of retrieval effectiveness (검색효율 측정척도에 관한 연구)

  • Yoon Koo Ho
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.16
    • /
    • pp.177-205
    • /
    • 1989
  • Retrieval effectiveness is the principal criteria for measuring the performance of an information retrieval system. This paper deals with the characteristics of 'relevance' of information and various measuring techniques of retrieval effectivess. The outlines of this study are as follows: 1) Relevance decision for evaluation should be devided into the user-oriented and the system-oriented decisions. 2) The recall-precision measure seems to be user-oriented, and the recall-fallout measure to be system-oriented. 3) Many of composite measures can not be justified III any rational manner unfortunately. 4) The Swets model has demonstrated that it yields, in general, a straight line instead of a curve of varying curvature and emphasized the fundamentally probabilistic nature of information retrieval. 5) The Cooper model seems to be a good substitute for precision and a useful measure for systems which ranked documents. 6) The Rocchio model were proposed for the evaluation of retreval systems which ranked documents, and were designed to be independent of cut-off. 7) The Cawkell model suggested that the Shannon's equation for entropy can be applied to measuring of retrieval effectiveness.

  • PDF

A study on evaluating effectiveness of fuzzy information retrieval system (퍼지정보검색시스템의 검색효율에 관한 연구)

  • 김현희;배금표
    • Journal of the Korean Society for information Management
    • /
    • v.10 no.1
    • /
    • pp.31-52
    • /
    • 1993
  • This study compared the retrieval effectiveness of conventional Boolean and fuzzy set retrieval strategies through a retrieval experiment. The recall ratio of fuzzy set retrieval is 75%, higher than the 60% of Boolean retrieval. On the other hand, the precision ratio of fuzzy set retrieval is 69%, a little bit lower than the 73% of Boolean retrieval.

  • PDF

Improving the Retrieval Effectiveness by Incorporating Word Sense Disambiguation Process (정보검색 성능 향상을 위한 단어 중의성 해소 모형에 관한 연구)

  • Chung, Young-Mee;Lee, Yong-Gu
    • Journal of the Korean Society for information Management
    • /
    • v.22 no.2 s.56
    • /
    • pp.125-145
    • /
    • 2005
  • This paper presents a semantic vector space retrieval model incorporating a word sense disambiguation algorithm in an attempt to improve retrieval effectiveness. Nine Korean homonyms are selected for the sense disambiguation and retrieval experiments. The total of approximately 120,000 news articles comprise the raw test collection and 18 queries including homonyms as query words are used for the retrieval experiments. A Naive Bayes classifier and EM algorithm representing supervised and unsupervised learning algorithms respectively are used for the disambiguation process. The Naive Bayes classifier achieved $92\%$ disambiguation accuracy. while the clustering performance of the EM algorithm is $67\%$ on the average. The retrieval effectiveness of the semantic vector space model incorporating the Naive Bayes classifier showed $39.6\%$ precision achieving about $7.4\%$ improvement. However, the retrieval effectiveness of the EM algorithm-based semantic retrieval is $3\%$ lower than the baseline retrieval without disambiguation. It is worth noting that the performances of disambiguation and retrieval depend on the distribution patterns of homonyms to be disambiguated as well as the characteristics of queries.

Mathematical Properties of the Formulas Evaluating Boolean Operators in Information Retrieval (정보검색에서 부울연산자를 연산하는 식의 수학적 특성)

  • 이준호;이기호;조영화
    • Journal of the Korean Society for information Management
    • /
    • v.12 no.1
    • /
    • pp.87-97
    • /
    • 1995
  • Boolean retrieval systems have been most widely used in the area of information retrieval due to easy implementation and efficient retrieval. Conventional Boolean retrieval systems. however, cannot rank retrieved documents in decreasing order of query-document similarities because they cannot compute similarity coefficients between queries and documents. Extended Boolean models such as fuzzy set. Waller-Kraft, Paice, P-Norm and Infinite-One have been developed to provide the document ranking facility. In extended Boolean models, the formulas evaluating Boolean operators AND and OR are an important component to affect the quality of document ranking. In this paper we present mathematical properties of the formulas, and analyse their effect on retrieval effectiveness. Our analyses show that P-Norm is the most suitable for achieving high retrieval effectiveness.

  • PDF