• Title/Summary/Keyword: top-k query


A Study on Top-k Query Processing using List-based Approach (List 기반의 접근법을 사용하는 Top-k 질의 처리 연구)

  • Ihm, Sun-Young;Park, Young-Ho
    • Annual Conference of KIPS / 2011.04a / pp.1249-1252 / 2011
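  • With the recent growth of the Internet and its usage, the volume of data is increasing rapidly. Users want to obtain the search results they need quickly. Moreover, because every user has different preferences, retrieval must be based on the user's query. This paper therefore introduces and analyzes existing studies that use a list-based approach to process top-k queries quickly and efficiently according to the user's query, and identifies their problems.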
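
As an illustration of what a list-based approach can look like, the sketch below follows the general idea of the well-known Threshold Algorithm: read several score-sorted lists in parallel and stop once the k-th best total reaches a threshold computed from the last scores read. The abstract does not say which list-based methods the paper surveys, so the choice of this algorithm, the function name `threshold_topk`, the sum aggregate, and the sample data are all assumptions.

```python
import heapq

def threshold_topk(lists, k):
    """Top-k by summed score over several lists, each sorted by score descending.

    `lists` holds lists of (object_id, score) pairs; every object is
    assumed to appear in every list (an assumption of this sketch).
    """
    lookup = [dict(lst) for lst in lists]  # random-access tables, one per list
    seen = set()
    top = []  # min-heap of (total_score, object_id), at most k entries
    depth = min(len(lst) for lst in lists)
    for pos in range(depth):
        # Sorted access: read one entry from each list at this depth.
        for lst in lists:
            obj, _ = lst[pos]
            if obj in seen:
                continue
            seen.add(obj)
            total = sum(table[obj] for table in lookup)
            if len(top) < k:
                heapq.heappush(top, (total, obj))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, obj))
        # Threshold: the best total any still-unseen object could reach.
        threshold = sum(lst[pos][1] for lst in lists)
        if len(top) == k and top[0][0] >= threshold:
            break  # no unseen object can beat the current top-k
    return sorted(top, reverse=True)

# Hypothetical data: two attribute lists over the same four objects.
price = [("a", 9), ("b", 8), ("c", 3), ("d", 1)]
rating = [("b", 9), ("a", 7), ("d", 6), ("c", 2)]
best_two = threshold_topk([price, rating], k=2)
```

With this data the scan stops after the second position of each list, without reading the entries for `c` and `d`.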

A Study on Improving the Effectiveness of Information Retrieval Through P-norm, RF, LCAF

  • Kim, Young-cheon;Lee, Sung-joo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.2 no.1 / pp.9-14 / 2002
  • Boolean retrieval is simple and elegant. However, since there is no provision for term weighting, no ranking of the answer set is generated. As a result, the size of the output might be too large or too small. Relevance feedback is the most popular query reformulation strategy. In a relevance feedback cycle, the user is presented with a list of the retrieved documents and, after examining them, marks those which are relevant. In practice, only the top 10 (or 20) ranked documents need to be examined. The main idea consists of selecting important terms, or expressions, attached to the documents that have been identified as relevant by the user, and of enhancing the importance of these terms in a new query formulation. The expected effect is that the new query will be moved towards the relevant documents and away from the non-relevant ones. Local analysis techniques are interesting because they take advantage of the local context provided with the query. In this regard, they seem more appropriate than global analysis techniques. In a local strategy, the documents retrieved for a given query q are examined at query time to determine terms for query expansion. This is similar to a relevance feedback cycle but might be done without assistance from the user.

Design Blockchain as a Service and Smart Contract with Secure Top-k Search that Improved Accuracy (정확도가 향상된 안전한 Top-k 검색 기반 서비스형 블록체인과 스마트 컨트랙트 설계)

  • Hobin Jang;Ji Young Chun;Ik Rae Jeong;Geontae Noh
    • Journal of Internet Computing and Services / v.24 no.5 / pp.85-96 / 2023
  • With the advance of cloud computing technology, the Blockchain as a Service (BaaS) offerings of cloud service providers have been used in areas such as e-commerce and finance to manage customer and distribution histories. However, if users' search histories, purchase histories, and similar data are used in a BaaS for purposes such as recommendation algorithms and search engine development, the users' search queries are exposed to the company operating the BaaS, which raises privacy issues. Z. Guan et al. ensure unlinkability between a user's search query and the search results using searchable encryption and, based on inner product similarity, select the Top-k results most relevant to the query. However, Top-k selection may not be possible when inner product similarities tie, and BaaS over the cloud is not considered. This paper therefore addresses the problem in Z. Guan et al.'s scheme by using cosine similarity, which improves the accuracy of the search results. On this basis, we design a BaaS with secure Top-k search and improved accuracy. Furthermore, we design smart contracts that preserve the privacy of users' searches and return Top-k results that are highly relevant to them.
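
The tie problem mentioned in this abstract can be seen in a small plaintext sketch. It leaves out the searchable encryption layer entirely; the vectors and the helper names `inner` and `cosine` are invented for illustration and are not taken from the paper.

```python
import math

def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity: inner product normalized by both vector lengths."""
    return inner(u, v) / (math.sqrt(inner(u, u)) * math.sqrt(inner(v, v)))

# Hypothetical query and document vectors.
query = [1, 1, 0]
docs = {"d1": [1, 1, 0], "d2": [2, 0, 3]}

# Both documents get the same inner product, so a top-1 choice is ambiguous.
inner_scores = {name: inner(query, vec) for name, vec in docs.items()}

# Normalizing by length separates the two documents in this example.
cosine_scores = {name: cosine(query, vec) for name, vec in docs.items()}
best = max(cosine_scores, key=cosine_scores.get)
```

Cosine similarity does not rule out ties in general; it only removes those caused by differences in vector length, as in this example.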

An Efficient Processing Method of Top-k(g) Skyline Group Queries for Incomplete Data (불완전 데이터를 위한 효율적 Top-k(g) 스카이라인 그룹 질의 처리 기법)

  • Park, Mi-Ra;Min, Jun-Ki
    • The KIPS Transactions:PartD / v.17D no.1 / pp.17-24 / 2010
  • Recently, there has been growing interest in skyline queries. Most work on skyline queries assumes that the data contain no null values. However, data entered through the Web or other tools are often incomplete and contain null values. As a result, several skyline processing techniques for incomplete data have been proposed. However, because these techniques deal with incomplete data only, they do not consider environments in which complete and incomplete data coexist. In this paper, we propose a novel skyline group processing technique that evaluates skyline queries in such environments. To do this, we introduce the top-k(g) skyline group query, which searches for g skyline groups with respect to the user's dimensional preference. Our experimental study shows the efficiency of the proposed technique.

Extending SQL for Moving Objects Databases

  • Nam, Kwang-Woo;Lee, Jai-Ho;Kim, Min-Soo
    • Proceedings of the KSRS Conference / 2002.10a / pp.138-143 / 2002
  • This paper describes a framework for extending GIS databases to support a moving object data type and query language. The rapid progress of wireless communications, positioning systems, and mobile computing devices has made location-aware applications essential components of commercial and industrial systems. Location-aware applications require GIS database systems to represent moving objects and to support queries on the motion properties of objects. For example, fleet management applications may need to store information about moving vehicles. Likewise, advanced CRM (Customer Relationship Management) applications may need to store and query the trajectories of mobile phone users. Given this trend, maintaining consistent information about the location of continuously moving objects and processing motion-specific queries is a challenging problem. We formally define a data model and query language for mobile objects that include complex evolving spatial structures, and propose a core algebra to process the moving object query language. The main benefit of the proposed query language and algebra is that the model can be built on top of GIS databases.


The MeSH-Term Query Expansion Models using LDA Topic Models in Health Information Retrieval (MeSH 기반의 LDA 토픽 모델을 이용한 검색어 확장)

  • You, Sukjin
    • Journal of Korean Library and Information Science Society / v.52 no.1 / pp.79-108 / 2021
  • Information retrieval in the health field has several challenges. Health information terminology is difficult for consumers (laypeople) to understand. Formulating a query with professional terms is not easy for consumers because health-related terms are more familiar to health professionals. If health terms related to a query were added automatically, consumers could find relevant information more easily. The proposed query expansion (QE) models show how to expand a query using MeSH terms. The documents were represented by the MeSH terms found in the full-text articles (i.e., Bag-of-MeSH). The MeSH terms were then used to generate LDA (Latent Dirichlet Allocation) topic models. A query and the top k retrieved documents were used to find MeSH terms as topic words related to the query. LDA topic words were filtered by threshold values of topic probability (TP) and word probability (WP). Threshold values were effective in an LDA model with a specific number of topics to increase IR performance in terms of infAP (inferred Average Precision) and infNDCG (inferred Normalized Discounted Cumulative Gain), which are common IR metrics for large data collections with incomplete judgments. The top k words were chosen by a word score based on (TP × WP) and on retrieved document ranking in an LDA model with specific thresholds. The QE model with specific thresholds for TP and WP showed improved mean infAP and infNDCG scores in an LDA model, compared with the baseline result.
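
The filtering and scoring step described in this abstract might be sketched as follows. This is a simplified reading: it keeps candidate terms whose TP and WP both meet a threshold and ranks them by TP × WP, and it leaves out the retrieved-document ranking that the abstract also mentions. The function name `expansion_terms`, the `>=` comparison, the threshold values, and the sample terms are all assumptions.

```python
import heapq

def expansion_terms(candidates, tp_min, wp_min, k):
    """Pick up to k expansion terms from (term, TP, WP) triples.

    Terms below either threshold are dropped; the rest are ranked by TP * WP.
    """
    kept = [
        (tp * wp, term)
        for term, tp, wp in candidates
        if tp >= tp_min and wp >= wp_min
    ]
    return [term for _, term in heapq.nlargest(k, kept)]

# Hypothetical candidate topic words with made-up probabilities.
candidates = [
    ("Hypertension", 0.50, 0.20),
    ("Blood Pressure", 0.50, 0.10),
    ("Sodium", 0.05, 0.90),    # dropped: TP below threshold
    ("Exercise", 0.40, 0.01),  # dropped: WP below threshold
]
terms = expansion_terms(candidates, tp_min=0.1, wp_min=0.05, k=2)
```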

Survey on Top-k Query Processing Considering Attractive and Repulsive Dimensions (선호 차원과 배척 차원을 모두 고려한 top-k 질의 처리 연구 조사)

  • Lee, Juneyoung;Seo, In;Choi, Dong-june;Kim, Kyoungmin;Kim, Dongwon
    • Annual Conference of KIPS / 2017.04a / pp.804-807 / 2017
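  • A top-k query requests the k records with the highest scores among those that satisfy a given condition. Several studies have addressed the technical difficulties that arise when the ranking function used to score objects is not monotone. Among these, this paper introduces and compares existing top-k query processing techniques that efficiently handle non-monotone ranking functions in which each dimension is either attractive or repulsive.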
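
To make the setting concrete, the sketch below scores records against a query point so that closeness is rewarded on attractive dimensions and distance is rewarded on repulsive ones, then picks the top k by a full scan. The scoring formula, the function names, and the data are assumptions chosen for illustration; the abstract does not give the ranking functions or the techniques that the survey compares.

```python
import heapq

def score(point, query, attractive, repulsive):
    """Higher is better: near the query on attractive dimensions, far on repulsive ones."""
    near = sum(abs(point[d] - query[d]) for d in attractive)
    far = sum(abs(point[d] - query[d]) for d in repulsive)
    return far - near

def topk_bruteforce(points, query, attractive, repulsive, k):
    """Baseline that scores every record; index-based techniques try to avoid this scan."""
    return heapq.nlargest(
        k, points, key=lambda name: score(points[name], query, attractive, repulsive)
    )

# Hypothetical records: dimension 0 is attractive, dimension 1 is repulsive.
points = {"p1": (5, 1), "p2": (5, 9), "p3": (1, 9), "p4": (9, 5)}
result = topk_bruteforce(points, query=(5, 5), attractive=[0], repulsive=[1], k=2)
```

Here `p1` and `p2` score equally even though their values on dimension 1 lie on opposite sides of the query, which shows why such a function is not monotone in the raw attribute values.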

Finding Top-k Answers in Node Proximity Search Using Distribution State Transition Graph

  • Park, Jaehui;Lee, Sang-Goo
    • ETRI Journal / v.38 no.4 / pp.714-723 / 2016
  • Considerable attention has been given to processing graph data in recent years. An efficient method for computing the node proximity is one of the most challenging problems for many applications such as recommendation systems and social networks. Regarding large-scale, mutable datasets and user queries, top-k query processing has gained significant interest. This paper presents a novel method to find top-k answers in a node proximity search based on the well-known measure, Personalized PageRank (PPR). First, we introduce a distribution state transition graph (DSTG) to depict iterative steps for solving the PPR equation. Second, we propose a weight distribution model of a DSTG to capture the states of intermediate PPR scores and their distribution. Using a DSTG, we can selectively follow and compare multiple random paths with different lengths to find the most promising nodes. Moreover, we prove that the results of our method are equivalent to the PPR results. Comparative performance studies using two real datasets clearly show that our method is practical and accurate.
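
For reference, the PPR scores that this abstract says the method reproduces can be computed by plain power iteration, as sketched below. This is the textbook baseline, not the paper's DSTG method; the restart probability `alpha`, the iteration count, the function names, and the toy graph are assumptions.

```python
import heapq

def personalized_pagerank(graph, source, alpha=0.15, iterations=50):
    """Power iteration for PPR; `graph` maps each node to its out-neighbors.

    Every node is assumed to have at least one out-neighbor.
    """
    scores = {node: 0.0 for node in graph}
    scores[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        nxt[source] += alpha  # restart mass always returns to the source
        for node, neighbors in graph.items():
            share = (1 - alpha) * scores[node] / len(neighbors)
            for nb in neighbors:
                nxt[nb] += share
        scores = nxt
    return scores

def topk_nodes(scores, k, exclude=()):
    """Return the k highest-scoring nodes, skipping any in `exclude`."""
    candidates = (n for n in scores if n not in exclude)
    return heapq.nlargest(k, candidates, key=scores.get)

# Hypothetical directed graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ppr = personalized_pagerank(graph, source="a")
nearest = topk_nodes(ppr, k=2, exclude={"a"})
```

This baseline updates every node on every iteration, which is the cost that a selective top-k method would aim to reduce.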

Efficient Top-K Queries Computation for Encrypted Data in the Cloud (클라우드 환경에서의 암호화 데이터에 대한 효율적인 Top-K 질의 수행 기법)

  • Kim, Jong Wook
    • Journal of Korea Multimedia Society / v.18 no.8 / pp.915-924 / 2015
  • With the growing popularity of cloud computing services, users can more easily manage massive amounts of data by outsourcing them to the cloud, or analyze large amounts of data more efficiently by leveraging the IT infrastructure that the cloud provides. This, however, raises security concerns about sensitive data. To provide data security, it is essential to encrypt sensitive data before uploading it to cloud computing services. Although encryption helps provide data security, it degrades the performance of massive data analytics because it prevents the use of indexes and mathematical operations on the encrypted data. Thus, in this paper, we propose a novel algorithm that makes it possible to process a large amount of encrypted data efficiently. In particular, we propose a novel top-k processing algorithm for massive amounts of encrypted data in cloud computing environments, and verify the performance of the proposed approach through experiments on real data.

Secure Top-k Query Processing in Wireless Sensor Networks (무선 센서 네트워크에서 안전한 Top-k 질의 처리 기법)

  • Lee, Myong-Soo;Shim, Kyu-Sun;Park, Sang-Hyun;Lee, SangKeun
    • Annual Conference of KIPS / 2009.11a / pp.723-724 / 2009
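  • In wireless sensor networks, data transmission is the main source of energy consumption. One of the main techniques for saving energy is to aggregate sensor data so that less data has to be transmitted. Because wireless sensor networks operate in open spaces, they are vulnerable to external attacks, and several existing studies have proposed security techniques for aggregation. However, these techniques are limited to specific operators and remain vulnerable for top-k queries, which are widely useful. This paper proposes a security technique that efficiently processes top-k queries when aggregation is applied for energy efficiency in wireless sensor networks.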