• Title/Summary/Keyword: Proximity Query

Search Result 26, Processing Time 0.024 seconds

A Method for Precision Improvement Based on Core Query Clusters and Term Proximity (핵심질의 클러스터와 단어 근접도를 이용한 문서 검색 정확률 향상 기법)

  • Jang, Kye-Hun;Lee, Kyung-Soon
    • The KIPS Transactions:PartB
    • /
    • v.17B no.5
    • /
    • pp.399-404
    • /
    • 2010
  • In this paper, we propose a method for precision improvement based on core clusters and term proximity. The method is composed by three steps. The initial retrieval documents are clustered based on query term combination, which occurred in the document. Core clusters are selected by using proximity between query terms. Then, the documents in core clusters are reranked based on context information of query. On TREC AP test collection, experimental results in precision at the top documents(P@100) show that the proposed method improved 11.2% over the language model.

Comparison of Voxel Map and Sphere Tree Structures for Proximity Computation of Protein Molecules (단백질 분자에 대한 proximity 연산을 위한 복셀 맵과 스피어 트리 구조 비교)

  • Kim, Byung-Joo;Lee, Jung-Eun;Kim, Young-J.;Kim, Ku-Jin
    • Journal of Korea Multimedia Society
    • /
    • v.15 no.6
    • /
    • pp.794-804
    • /
    • 2012
  • For the geometric computations on the protein molecules, the proximity queries, such as computing the minimum distance from an arbitrary point to the molecule or detecting the collision between a point and the molecule, are essential. For the proximity queries, the efficiency of the computation time can be different according to the data structure used for the molecule. In this paper, we present the data structures and algorithms for applying proximity queries to a molecule with GPU acceleration. We present two data structures, a voxel map and a sphere tree, where the molecule is represented as a set of spheres, and corresponding algorithms. Moreover, we show that the performance of presented data structures are improved from 3 to 633 times compared to the previous data structure for the molecules containing 1,000~15,000 atoms.

Query Expansion based on Word Graph using Term Proximity (질의 어휘와의 근접도를 반영한 단어 그래프 기반 질의 확장)

  • Jang, Kye-Hun;Lee, Kyung-Soon
    • The KIPS Transactions:PartB
    • /
    • v.19B no.1
    • /
    • pp.37-42
    • /
    • 2012
  • The pseudo relevance feedback suggests that frequent words at the top documents are related to initial query. However, the main drawback associated with the term frequency method is the fact that it relies on feature independence, and disregards any dependencies that may exist between words in the text. In this paper, we propose query expansion based on word graph using term proximity. It supplements term frequency method. On TREC WT10g test collection, experimental results in MAP(Mean Average Precision) show that the proposed method achieved 6.4% improvement over language model.

Related Term Extraction with Proximity Matrix for Query Related Issue Detection using Twitter (트위터를 이용한 질의어 관련 이슈 탐지를 위한 인접도 행렬 기반 연관 어휘 추출)

  • Kim, Je-Sang;Jo, Hyo-Geun;Kim, Dong-Sung;Kim, Byeong Man;Lee, Hyun Ah
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.3 no.1
    • /
    • pp.31-36
    • /
    • 2014
  • Social network services(SNS) including Twitter and Facebook are good resources to extract various issues like public interest, trend and topic. This paper proposes a method to extract query-related issues by calculating relatedness between terms in Twitter. As a term that frequently appears near query terms should be semantically related to a query, we calculate term relatedness in retrieved documents by summing proximity that is proportional to term frequency and inversely proportional to distance between words. Then terms, relatedness of which is bigger than threshold, are extracted as query-related issues, and our system shows those issues with a connected network. By analyzing single transitions in a connected network, compound words are easily obtained.

Query Expansion Based on Word Graphs Using Pseudo Non-Relevant Documents and Term Proximity (잠정적 부적합 문서와 어휘 근접도를 반영한 어휘 그래프 기반 질의 확장)

  • Jo, Seung-Hyeon;Lee, Kyung-Soon
    • The KIPS Transactions:PartB
    • /
    • v.19B no.3
    • /
    • pp.189-194
    • /
    • 2012
  • In this paper, we propose a query expansion method based on word graphs using pseudo-relevant and pseudo non-relevant documents to achieve performance improvement in information retrieval. The initially retrieved documents are classified into a core cluster when a document includes core query terms extracted by query term combinations and the degree of query term proximity. Otherwise, documents are classified into a non-core cluster. The documents that belong to a core query cluster can be seen as pseudo-relevant documents, and the documents that belong to a non-core cluster can be seen as pseudo non-relevant documents. Each cluster is represented as a graph which has nodes and edges. Each node represents a term and each edge represents proximity between the term and a query term. The term weight is calculated by subtracting the term weight in the non-core cluster graph from the term weight in the core cluster graph. It means that a term with a high weight in a non-core cluster graph should not be considered as an expanded term. Expansion terms are selected according to the term weights. Experimental results on TREC WT10g test collection show that the proposed method achieves 9.4% improvement over the language model in mean average precision.

A Fast Algorithm for the k-Keyword Ordered Proximity Problem (순서를 고려하는 k-키워드 근접도 문제를 위한 빠른 알고리즘)

  • Kim, Jin-Wook
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.3
    • /
    • pp.281-288
    • /
    • 2010
  • In the web search engines, the proximity is used to compute the relevance of a document to the given query. There exist various research results about the proximity problems and the ordered proximity problems. In this paper, we present O(n) time algorithms for the k-keyword ordered proximity problems where n is the total number of occurrences of the k keywords in a document. Experimental results show that the proposed algorithms are about 1.2 times and over 3 times faster than the previous results when k=2 and k=5, respectively.

An Efficient Collision Queries in Parallel Close Proximity Situations

  • Kim, Dae-Hyun;Choi, Han-Soo;Kim, Yeong-Dong
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2005.06a
    • /
    • pp.2402-2406
    • /
    • 2005
  • A collision query determines the intersection between given objects, and is used in computer-aided design and manufacturing, animation and simulation systems, and physically-based modeling. Bounding volume hierarchies are one of the simplest and most widely used data structures for performing collision detection on complex models. In this paper, we present hierarchy of oriented rounded bounding volume for fast proximity queries. Designing hierarchies of new bounding volumes, we use to combine multiple bounding volume types in a single hierarchy. The new bounding volume corresponds to geometric shape composed of a core primitive shape grown outward by some offset such as the Minkowski sum of rectangular box and a sphere shape. In the experiment of parallel close proximity, a number of benchmarks to measure the performance of the new bounding box and compare to that of other bounding volumes.

  • PDF

Design and Implementation of a Query Processor for Document Management Systems (문서관리시스템을 위한 질의처리기 설계 및 구현)

  • U, Jong-Won;Yun, Seung-Hyeon;Yu, Jae-Su
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.6
    • /
    • pp.1419-1432
    • /
    • 1999
  • The Document Management System(DMS) is a system which retrieves and manages library information efficiently. Since DMS manages the information using only one table, it does not need to provide join and view operations that spend high cost in traditional DBMS. In addition, DMs requires new operations because of their property. the operation has not been supported in existing DBMSs. In this paper we define a data language which represents the structure definition and process of data on the DMS. Especially we define Ranking and Proximity operation which is needed in Document Retrieval,. We also design and implement a query processor to process the query constructed with the data language. When the exiting query processors of relational DBMS are used as a query processor of DMS, they degrade the whole system performance. The proposed query processor not only overcomes such a problem but also supports new operation which is needed in DMS.

  • PDF

Finding Top-k Answers in Node Proximity Search Using Distribution State Transition Graph

  • Park, Jaehui;Lee, Sang-Goo
    • ETRI Journal
    • /
    • v.38 no.4
    • /
    • pp.714-723
    • /
    • 2016
  • Considerable attention has been given to processing graph data in recent years. An efficient method for computing the node proximity is one of the most challenging problems for many applications such as recommendation systems and social networks. Regarding large-scale, mutable datasets and user queries, top-k query processing has gained significant interest. This paper presents a novel method to find top-k answers in a node proximity search based on the well-known measure, Personalized PageRank (PPR). First, we introduce a distribution state transition graph (DSTG) to depict iterative steps for solving the PPR equation. Second, we propose a weight distribution model of a DSTG to capture the states of intermediate PPR scores and their distribution. Using a DSTG, we can selectively follow and compare multiple random paths with different lengths to find the most promising nodes. Moreover, we prove that the results of our method are equivalent to the PPR results. Comparative performance studies using two real datasets clearly show that our method is practical and accurate.

Reordering Scheme of Location Identifiers for Indexing RFID Tags (RFID 태그의 색인을 위한 위치 식별자 재순서 기법)

  • Ahn, Sung-Woo;Hong, Bong-Hee
    • Journal of KIISE:Databases
    • /
    • v.36 no.3
    • /
    • pp.198-214
    • /
    • 2009
  • Trajectories of RFID tags can be modeled as a line, denoted by tag interval, captured by an RFID reader and indexed in a three-dimensional domain, with the axes being the tag identifier (TID), the location identifier (LID), and the time (TIME). Distribution of tag intervals in the domain space is an important factor for efficient processing of a query for tracing tags and is changed according to arranging coordinates of each domain. Particularly, the arrangement of LIDs in the domain has an effect on the performance of queries retrieving the traces of tags as times goes by because it provides the location information of tags. Therefore, it is necessary to determine the optimal ordering of LIDs in order to perform queries efficiently for retrieving tag intervals from the index. To do this, we propose LID proximity for reordering previously assigned LIDs to new LIDs and define the LID proximity function for storing tag intervals accessed together closely in index nodes when a query is processed. To determine the sequence of LIDs in the domain, we also propose a reordering scheme of LIDs based on LID proximity. Our experiments show that the proposed reordering scheme considerably improves the performance of Queries for tracing tag locations comparing with the previous method of assigning LIDs.