• Title/Abstract/Keywords: user query

704 search results (processing time: 0.027 s)

Word Embeddings-Based Pseudo Relevance Feedback Using Deep Averaging Networks for Arabic Document Retrieval

  • Farhan, Yasir Hadi;Noah, Shahrul Azman Mohd;Mohd, Masnizah;Atwan, Jaffar
    • Journal of Information Science Theory and Practice
    • /
    • Vol. 9, No. 2
    • /
    • pp.1-17
    • /
    • 2021
  • Pseudo relevance feedback (PRF) is a powerful query expansion (QE) technique that reformulates queries by choosing expansion elements from the top k pseudo-relevant documents. Traditional PRF frameworks have robustly handled the vocabulary mismatch between user queries and pertinent documents; nevertheless, they choose expansion elements without regard to their similarity to the elements of the original query. Word embedding (WE) schemes are techniques of significant interest for QE within the information retrieval domain. Deep averaging networks (DANs) define a framework in which an averaged word representation is passed through multiple linear layers. The complete query can therefore be represented by the average vector of the query terms, and this vector may be used to determine expansion elements pertinent to the entire query. In this study, we propose a DANs-based technique that augments PRF frameworks by integrating WE similarities to facilitate Arabic information retrieval. The technique is based on the principle that the top pseudo-relevant document set is assessed to determine the distribution of candidate elements, and expansion terms are then selected according to their similarity to the average vector representing the initial query elements. The Word2Vec model is used for the experiments on a standard Arabic TREC 2001/2002 set. The majority of the evaluations indicate that the PRF implementation in the present study offers a significant performance improvement over the baseline PRF frameworks.
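
The selection step described in this abstract can be sketched roughly as follows. This is only an illustration of ranking candidate terms by similarity to the averaged query vector; it omits the trained linear layers of a DAN, and the scoring rule (relative frequency times cosine similarity), the function names, and the toy two-dimensional embeddings are all assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def average_vector(terms, embeddings):
    """Average the embedding vectors of the terms that have one."""
    vectors = [embeddings[t] for t in terms if t in embeddings]
    if not vectors:
        raise ValueError("no query term has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(query_terms, top_docs, embeddings, n_terms=2):
    """Append the n_terms best-scoring candidate terms to the query.

    Assumed score: relative frequency of the term in the pseudo-relevant
    documents multiplied by its cosine similarity to the averaged query vector.
    """
    query_vec = average_vector(query_terms, embeddings)
    counts = Counter(term for doc in top_docs for term in doc)
    total = sum(counts.values())
    scores = {}
    for term, count in counts.items():
        if term in query_terms or term not in embeddings:
            continue
        scores[term] = (count / total) * cosine(query_vec, embeddings[term])
    ranked = sorted(scores, key=scores.get, reverse=True)
    return list(query_terms) + ranked[:n_terms]


# Hypothetical toy data: 2-dimensional embeddings and two tokenized documents.
toy_embeddings = {
    "bank": [1.0, 0.0],
    "loan": [0.9, 0.1],
    "credit": [0.8, 0.2],
    "interest": [0.7, 0.3],
    "river": [0.0, 1.0],
}
toy_docs = [
    ["bank", "credit", "interest", "river"],
    ["loan", "credit", "river", "river"],
]
expanded = expand_query(["bank", "loan"], toy_docs, toy_embeddings, n_terms=2)
print(expanded)
```

In this toy run, "river" is the most frequent candidate but is ranked last because it is far from the averaged query vector, which is the behavior the abstract argues for.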

Design and Implementation of a Distributed Information Integration System Based on Metadata Registry

  • 김종환;박혜숙;문창주;백두권
    • The KIPS Transactions: Part D
    • /
    • Vol. 10D, No. 2
    • /
    • pp.233-246
    • /
    • 2003
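  • Mediator-based information integration systems support the flexible integration of heterogeneous local information systems, but they have placed little emphasis on query-processing optimization or on standardizing the metadata that describes local schema information. To address these shortcomings, the proposed distributed information integration system uses query caching to optimize query processing and an ISO/IEC 11179-based metadata registry to standardize the metadata about local schemas. The system logically integrates distributed, heterogeneous business information systems and provides users with the integrated information they need over the web. To improve reusability and ease maintenance, the system is expressed as a three-layer architecture using the layered pattern, and it was designed with the EPEM methodology, an extension of the UML methodology, in order to represent the functionality and flow of the core elements of the three-layer architecture effectively. As a concrete example, the proposed system was applied to the supply chain management domain and implemented as a web-based system. The distributed information integration system thus provides query caching through a query function manager and a query function repository to speed up query processing, and it resolves semantic heterogeneity, that is, schema heterogeneity and data heterogeneity, by using the ISO/IEC 11179-based metadata registry and a schema repository.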
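
The query caching idea mentioned in this abstract can be illustrated with a minimal sketch. The class name `QueryCache`, the whitespace-only normalization of the query text, and the absence of any invalidation policy are assumptions made for illustration; the abstract does not describe how the query function manager or the query function repository actually work.

```python
class QueryCache:
    """Cache query results so that a repeated query skips the local sources."""

    def __init__(self, execute):
        # execute: callable that runs a query against the integrated sources
        self._execute = execute
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query):
        # Simplification: collapse runs of whitespace so that trivially
        # different spellings of the same query share one cache entry.
        return " ".join(query.split())

    def run(self, query):
        key = self._key(query)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._execute(query)
        self._store[key] = result
        return result


# Hypothetical backend that records every query it actually executes.
executed = []


def fake_backend(query):
    executed.append(query)
    return [("order-1", 3), ("order-2", 5)]


cache = QueryCache(fake_backend)
first = cache.run("SELECT  id, qty FROM orders")
second = cache.run("SELECT id, qty FROM orders")
print(cache.hits, cache.misses, len(executed))
```

A real system would also need to expire or invalidate entries when the underlying local sources change, which this sketch leaves out.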

Robust Syntactic Annotation of Corpora and Memory-Based Parsing

  • Hinrichs, Erhard W.
    • Korean Society for Language and Information: Conference Proceedings
    • /
    • Korean Society for Language and Information, 2002, Language, Information, and Computation: Proceedings of the 16th Pacific Asia Conference
    • /
    • pp.1-1
    • /
    • 2002
  • This talk provides an overview of current work in my research group on the syntactic annotation of the Tübingen corpus of spoken German and of the German Reference Corpus (Deutsches Referenzkorpus: DEREKO) of written texts. Morpho-syntactic and syntactic annotation, as well as annotation of function-argument structure, is performed automatically for these corpora by a hybrid architecture that combines robust symbolic parsing using finite-state methods ("chunk parsing" in the sense of Abney) with memory-based parsing (in the sense of Daelemans). The resulting robust annotations can be used by theoretical linguists, who are interested in large-scale empirical data, and by computational linguists, who need training material for a wide range of language technology applications. To aid retrieval of annotated trees from the treebank, a query tool, VIQTORYA, with a graphical user interface and a logic-based query language has been developed. VIQTORYA allows users to query the treebanks for linguistic structures at the word level, at the level of individual phrases, and at the clausal level.

Implementation of Text Summarize Automation Using Document Length Normalization

  • 이재훈;김영천;이성주
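    • Korean Institute of Intelligent Systems: Conference Proceedings
    • /
    • Korea Fuzzy Logic and Intelligent Systems Society, Proceedings of the 2001 Fall Conference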
    • /
    • pp.51-55
    • /
    • 2001
  • With the rapid growth of the World Wide Web and electronic information services, information is becoming available online at an incredible rate. One result is the oft-decried information overload. No one has time to read everything, yet we often have to make critical decisions based on what we are able to assimilate. The technology of automatic text summarization is becoming indispensable for dealing with this problem. Text summarization is the process of distilling the most important information from a source to produce an abridged version for a particular user or task. Information retrieval (IR) is the task of searching a set of documents for query-relevant documents. Text summarization, on the other hand, can be considered the task of searching a document, viewed as a set of sentences, for topic-relevant sentences. In this paper, we show that document information that is more reliable and better suited to the query can be obtained by using the document length normalization developed in information retrieval. Experimental results of this system on newspaper articles show that the document length normalization method is superior to other methods that use the query itself.

Fuzzy Query Processing through Two-level Similarity Relation Matrices Construction

  • 이기영
    • Journal of the Korea Computer Industry Society
    • /
    • Vol. 4, No. 10
    • /
    • pp.587-598
    • /
    • 2003
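  • In this study, we constructed two-level index-term similarity relation matrices over the titles and abstracts of academic papers. The index-term similarity relation matrices, built from co-occurrence frequencies, form an index structure intended to maintain recall through query expansion based on compatibility relations while improving precision through two-stage content-based retrieval. To this end, domain knowledge was extracted through subject analysis, and the user's information need and the domain knowledge were inferred on the basis of fuzzy logic. This study aims to reduce the term mismatch inherent in queries and to improve information representation.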

Intelligent Query Analysis using Fuzzy Association Rule

  • 김미혜
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • Vol. 11, No. 6
    • /
    • pp.2214-2218
    • /
    • 2010
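  • Association rules, one of the techniques for extracting meaningful and useful knowledge from large volumes of data, describe similarities or patterns among the attributes in a database and can thereby give users useful information about the data. Previous research on association rules has focused mainly on discovering rules about the presence or absence of items in binary (boolean) databases. In this paper, we propose an intelligent query processing system that uses fuzzy association rules, built on fuzzy concepts, by converting quantitative attribute data into symbolic attribute values before extracting association rules.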
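
As a rough illustration of the idea in this abstract, the sketch below maps quantitative attributes to degrees of membership in symbolic labels and then computes the support and confidence of one fuzzy rule. The membership function, the labels "old" and "high income", the thresholds, the use of min as the fuzzy AND, and the sample records are all assumptions; the abstract does not specify which definitions the paper uses.

```python
def ramp_up(x, low, high):
    """Membership that rises linearly from 0 at low to 1 at high."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


# Hypothetical symbolic labels for two quantitative attributes.
def old(record):
    return ramp_up(record["age"], 40, 60)


def high_income(record):
    return ramp_up(record["income"], 4000, 8000)


def fuzzy_support(records, labels):
    """Mean over records of the minimum membership across the given labels."""
    return sum(min(label(r) for label in labels) for r in records) / len(records)


def fuzzy_confidence(records, antecedent, consequent):
    """support(antecedent AND consequent) / support(antecedent)."""
    base = fuzzy_support(records, antecedent)
    if base == 0:
        return 0.0
    return fuzzy_support(records, antecedent + consequent) / base


records = [
    {"age": 25, "income": 2000},
    {"age": 30, "income": 3000},
    {"age": 50, "income": 6000},
    {"age": 60, "income": 7000},
]

support = fuzzy_support(records, [old, high_income])
confidence = fuzzy_confidence(records, [old], [high_income])
print(f"old -> high income: support={support:.4f}, confidence={confidence:.4f}")
```

Unlike a boolean rule, a record here can support the rule partially: the 50-year-old with an income of 6000 contributes 0.5 rather than 0 or 1.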

An Efficient Video Retrieval Algorithm Using Key Frame Matching for Video Content Management

  • Kim, Sang Hyun
    • International Journal of Contents
    • /
    • Vol. 12, No. 1
    • /
    • pp.1-5
    • /
    • 2016
  • To manipulate large volumes of video content, effective video indexing and retrieval are required. A large number of video indexing and retrieval algorithms have been presented for frame-wise user queries or video content queries, whereas relatively few video sequence matching algorithms have been proposed for video sequence queries. In this paper, we propose an efficient algorithm that extracts key frames using color histograms and matches the video sequences using edge features. To match video sequences effectively with a low computational load, we make use of the key frames extracted by the cumulative measure and the distance between key frames, and compare two sets of key frames using the modified Hausdorff distance. Experimental results with real sequences show that the proposed video sequence matching algorithm using edge features yields higher accuracy and performance than conventional methods such as the histogram difference, Euclidean metric, Bhattacharyya distance, and directed divergence methods.
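
The comparison of two key-frame sets can be sketched with the commonly used definition of the modified Hausdorff distance: the larger of the two directed distances, where each directed distance is the mean, over the points of one set, of the distance to the nearest point of the other set. Representing each key frame as a short numeric feature vector and using Euclidean distance between frames are assumptions for illustration; the paper's edge features and any variant of the distance it uses are not given in the abstract.

```python
import math


def directed_distance(a_set, b_set):
    """Mean distance from each point in a_set to its nearest point in b_set."""
    return sum(min(math.dist(a, b) for b in b_set) for a in a_set) / len(a_set)


def modified_hausdorff(a_set, b_set):
    """Modified Hausdorff distance: the larger of the two directed distances."""
    return max(directed_distance(a_set, b_set), directed_distance(b_set, a_set))


# Hypothetical feature vectors for the key frames of two video sequences.
query_frames = [(0.0, 0.0), (1.0, 0.0)]
candidate_frames = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]

distance = modified_hausdorff(query_frames, candidate_frames)
print(distance)
```

Because the directed distances are averages rather than maxima, a single outlying key frame raises the score only in proportion to its share of the set, which makes the measure less sensitive to outliers than the classical Hausdorff distance.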

Implementing the User Interface of Looms Management System with Spatial Aggregate Query Functions

  • 전일수;부기동
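    • Korean Society for Information Technology Applications: Conference Proceedings
    • /
    • Korean Society for Information Technology Applications, 2002 Fall Joint Conference: New Information Technology Paradigms in a Changing Information Environment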
    • /
    • pp.512-519
    • /
    • 2002
  • In this paper, we implemented a loom component to be placed in a window and a looms management system that can connect to a database and process various queries. The implemented system has aggregate query processing functions for the loom components located in an area selected with the mouse, and it also has high-level query processing functions whose results are presented as charts and pivot tables; it can be used as a decision support system. The proposed system can detect temporary or persistent problems of the looms. Therefore, it can be used to raise productivity and reduce costs in textile companies by coping with such situations properly.

Integrated Methods of Various Media Generators in The SuperSQL Query Process System

  • Shin Sang-Gyu;Kim Tai-Suk;Toyama Motomichi
    • Journal of Korea Multimedia Society
    • /
    • Vol. 9, No. 6
    • /
    • pp.720-727
    • /
    • 2006
  • In this paper, we propose a method that allows the SuperSQL query processor to share as much code as possible among its various generators, each of which is responsible for the output of a certain medium. SuperSQL is an enhanced query-processing system that converts database records into a variety of formats such as XML, HTML, and PDF. However, the existing SuperSQL media generators require a different processor to be created for each medium, which duplicates development cost. This research makes three main contributions. First, it analyzes the structures of various media and examines the possibility of integration based on their common structure. Second, it facilitates the addition of a new output media generator by separating constructors and decorators from each medium. Last, it provides an integrated user interface to each medium by means of the Media Abstraction Table concept. We also show the performance and feasibility of our system using experimental results.

An Efficient Video Retrieval Algorithm Using Color and Edge Features

  • Kim Sang-Hyun
    • Journal of the Institute of Convergence Signal Processing
    • /
    • Vol. 7, No. 1
    • /
    • pp.11-16
    • /
    • 2006
  • To manipulate large video databases, effective video indexing and retrieval are required. A large number of video indexing and retrieval algorithms have been presented for frame-wise user queries or video content queries, whereas relatively few video sequence matching algorithms have been proposed for video sequence queries. In this paper, we propose an efficient algorithm that extracts key frames using color histograms and matches the video sequences using edge features. To match video sequences effectively with a low computational load, we make use of the key frames extracted by the cumulative measure and the distance between key frames, and compare two sets of key frames using the modified Hausdorff distance. Experimental results with several real sequences show that the proposed video retrieval algorithm using color and edge features yields higher accuracy and performance than conventional methods such as the histogram difference, Euclidean metric, Bhattacharyya distance, and directed divergence methods.
