• Title/Summary/Keyword: Queries

Semantic-based Query Generation For Information Retrieval

  • Shin Seung-Eun;Seo Young-Hoon
    • International Journal of Contents / v.1 no.2 / pp.39-43 / 2005
  • In this paper, we describe a mechanism for generating semantic-based queries for high-accuracy information retrieval and question answering. General information retrieval systems struggle to return correct results because they do not analyze the semantics of the user's natural language question. We analyze the user's question semantically, extract semantic features, and generate semantic-based queries from them. These queries are generated using a semantic-based question analysis grammar and query generation rules. They are represented as semantic features and grammatical morphemes that reflect the semantic and syntactic structure of the user's question. We evaluated our mechanism using 100 questions whose answer type is a person, on the TREC-9 corpus and the Web. Precision at 10 documents improved by 0.28 when semantic-based queries were used for information retrieval.
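The "precision at 10 documents" measure reported above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code; the function name, document IDs, and relevance judgments are all hypothetical.

```python
def precision_at_k(ranked_doc_ids, relevant_doc_ids, k=10):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_doc_ids)
    # Divide by k (not len(top_k)) so that short result lists are penalized.
    return hits / k


# Hypothetical data for one question: "r*" documents are relevant, "x*" are not.
relevant = {"r1", "r2", "r3", "r4", "r5"}
keyword_ranking = ["x1", "r1", "x2", "x3", "r2", "x4", "x5", "x6", "x7", "x8"]
semantic_ranking = ["r1", "r2", "x1", "r3", "x2", "r4", "x3", "r5", "x4", "x5"]

p_keyword = precision_at_k(keyword_ranking, relevant)    # 2 hits in the top 10
p_semantic = precision_at_k(semantic_ranking, relevant)  # 5 hits in the top 10
```

An improvement figure such as the one in the abstract would be the difference between two such scores averaged over all test questions.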

A Study on Representative Skyline Using Connected Component Clustering

  • Choi, Jong-Hyeok;Nasridinov, Aziz
    • Journal of Multimedia Information System / v.6 no.1 / pp.37-42 / 2019
  • Skyline queries are used in a variety of fields to make optimal decisions. However, as the volume and dimensionality of the data increase, both the number of skyline points and the time needed to discover them grow. Because the number of skyline points matters in many real-life applications, various studies have addressed this problem. However, previous studies have used k-parameter methods such as top-k and k-means to discover representative skyline points (RSPs) from the entire skyline point set, resulting in high query response time and reduced representativeness due to the dependency on k. To solve this problem, we propose a new Connected Component Clustering based Representative Skyline Query (3CRS) that can discover RSPs quickly even in high-dimensional data through connected component clustering. 3CRS performs fast discovery and clustering of skyline points through hash indexes and connected components and selects RSPs from each cluster. This paper demonstrates the superiority of the proposed method by comparing it with representative skyline queries using k-means and DBSCAN on a real-world dataset.
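For readers unfamiliar with skyline queries, the underlying dominance test can be sketched as below. This is a naive quadratic baseline under the assumption that smaller values are better in every dimension; it is not the 3CRS method, and the hotel data is made up for illustration.

```python
def dominates(p, q):
    """True if p is no worse than q in every dimension and strictly
    better in at least one (assuming smaller values are better)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points):
    """Return the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (price, distance to beach); lower is better for both.
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 4)]
result = skyline(hotels)  # (70, 5) and (90, 4) are dominated by (60, 3)
```

Representative skyline methods such as the one in the abstract then select a small subset of the skyline set, which can become very large in high dimensions.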

Query Processing based Branch Node Stream for XML Message Broker

  • Ko, Hye-Kyeong
    • International journal of advanced smart convergence / v.10 no.2 / pp.64-72 / 2021
  • XML message brokers are important because XML has become a de facto standard for data exchange in many applications. The message brokers considered in this paper store many users. This paper studies the processing of twig pattern queries over XML documents using branch node streams in the XML message broker structure, and proposes a method to reduce query processing time when parsing documents with XML twig patterns. When a twig query is received from the client, the proposed method removes the twigging value of any branch node that does not include the labeling value of the branch node stream. The leaf node discovery time is reduced by shortening the navigation time of the nodes in XML documents that are matched to the leaf nodes of the client's twig query. As a result, the overall processing time to respond to queries is reduced, allowing for rapid question-answer processing.

Efficient Authentication of Aggregation Queries for Outsourced Databases (아웃소싱 데이터베이스에서 집계 질의를 위한 효율적인 인증 기법)

  • Shin, Jongmin;Shim, Kyuseok
    • Journal of KIISE / v.44 no.7 / pp.703-709 / 2017
  • Database outsourcing offloads storage and computationally intensive tasks to a third-party server. Data owners can therefore manage big data and handle queries from clients without building a costly infrastructure. However, because network systems are insecure, the third-party server may be untrusted, and the query results it returns may have been tampered with. This problem has motivated significant research efforts on authenticating various queries such as range queries, kNN queries, and function queries. Although aggregation queries play a key role in analyzing big data, authenticating aggregation queries has not been extensively studied, and previous works are not efficient for data with high dimensionality or a large number of distinct values. In this paper, we propose the AMR-tree, a data structure for authenticating aggregation queries. We also propose an efficient proof construction method and a verification method based on the AMR-tree. Furthermore, we validate the performance of the proposed algorithm through experiments that vary parameters such as the number of distinct values, the number of records, and the dimension of the data.

A Study on the Social and Cultural Characteristics of Web Queries (웹 검색질의어 분석을 통한 사회·문화적 특성에 관한 연구)

  • Kim, Seong-Hee
    • Journal of Information Management / v.42 no.4 / pp.155-174 / 2011
  • This study classifies search engine queries according to web query topic and the user intent behind them. First, we classified a data set of 10,000 web queries by topic. The results showed significant differences in the topics of interest over time. We also categorized 500 popular web search engine queries as informational, navigational, or transactional. As a result, 82 percent of the web queries were informational in nature, with about 10.8 percent navigational and 7.2 percent transactional. These results will help establish policies for providing internet content based on user intent and help identify social and cultural characteristics.

An Indexing Technique for Range Sum Queries in Spatio-Temporal Databases (시공간 데이타베이스에서 영역 합 질의를 위한 색인 기법)

  • Cho Hyung-Ju;Choi Yong-Jin;Min Jun-Ki;Chung Chin-Wan
    • Journal of KIISE:Databases / v.32 no.2 / pp.129-141 / 2005
  • Although spatio-temporal databases have received considerable attention recently, there has been little work on processing range sum queries on the historical records of moving objects despite their importance. Since directly accessing a huge amount of data to answer range sum queries incurs prohibitive computation cost, materialization techniques based on existing index structures have recently been suggested. A simple but effective solution is to apply the materialization technique to the MVR-tree, known as the most efficient structure for window queries with spatio-temporal conditions. However, the MVR-tree has difficulty maintaining pre-aggregated results inside its internal nodes due to cyclic paths between nodes. Aggregate structures based on other index structures such as the HR-tree and the 3DR-tree do not provide satisfactory query performance. In this paper, we propose a new indexing technique called the Adaptive Partitioned Aggregate R-Tree (APART) and query processing algorithms to efficiently process range sum queries in many situations. Experimental results show that the APART typically performs more than 2 times better than existing aggregate structures in a wide range of scenarios.
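The general idea of materialization (answering range sums from pre-aggregated values instead of scanning raw records) can be illustrated with a simple 2D prefix-sum grid. This sketch is an assumption-laden simplification: it uses a fixed grid of hypothetical object counts and is not the APART, MVR-tree, or any R-tree based structure.

```python
def build_prefix_sums(grid):
    """prefix[r][c] holds the sum of grid[0..r-1][0..c-1]."""
    rows, cols = len(grid), len(grid[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            prefix[r + 1][c + 1] = (grid[r][c] + prefix[r][c + 1]
                                    + prefix[r + 1][c] - prefix[r][c])
    return prefix


def range_sum(prefix, r1, c1, r2, c2):
    """Sum of grid[r1..r2][c1..c2] (inclusive) from four lookups."""
    return (prefix[r2 + 1][c2 + 1] - prefix[r1][c2 + 1]
            - prefix[r2 + 1][c1] + prefix[r1][c1])


# Hypothetical counts of moving objects observed per grid cell.
counts = [
    [1, 2, 0],
    [0, 3, 1],
    [4, 0, 2],
]
prefix = build_prefix_sums(counts)
total_in_window = range_sum(prefix, 1, 1, 2, 2)  # cells 3, 1, 0, 2
```

Each query costs a constant number of lookups regardless of how many records fall in the window, which is the benefit that aggregate index structures aim to provide for far less regular data.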

Distance Browsing Query Processing using Query Result Set (질의 결과를 이용한 거리 브라우징 질의의 처리)

  • Park Dong-Joo;Park Sangwon;Chung Tae-Sun;Lee Sang-Won
    • The KIPS Transactions:PartD / v.12D no.5 s.101 / pp.673-682 / 2005
  • Distance browsing queries, namely k-nearest neighbor queries, are the most important queries in spatial database applications, e.g., Geographic Information Systems (GISs). Recently, GIS applications have tended to extend toward wide multi-user environments such as the Web. Since many techniques for such queries, of which Hjaltason and Samet's algorithm is the most efficient, were optimized for only a single query, they need to be adapted to multi-user environments. A good approach is to store many individual query results in a cache (query result caching) and reuse them in evaluating incoming queries (query result matching). In this paper, we propose an extension of Hjaltason and Samet's algorithm capable of reusing previous query results in a cache for answering distance browsing queries in multi-user GIS environments. Our experimental results confirm the efficiency of our approach.

Evaluating real-time search query variation for intelligent information retrieval service (지능 정보검색 서비스를 위한 실시간검색어 변화량 평가)

  • Chong, Min-Young
    • Journal of Digital Convergence / v.16 no.12 / pp.335-342 / 2018
  • The search service, a core service of a portal site, presents the search queries that are increasing most rapidly among the submitted queries, based on the highest instantaneous search frequency, so it is difficult to promptly surface a search query that holds a high degree of interest over a certain period. It is therefore necessary to overcome this problem and provide a more intelligent information retrieval service through improved analysis of changes in search queries. In this paper, we present criteria for measuring the interest, continuity, and attention of real-time search queries. In addition, according to these criteria, we measure and summarize changes in real-time search queries by hour, day, week, and month over a period of time to identify issues of high interest, long-lasting issues of interest, and issues that need attention in the future.

A Node Relocation Strategy of Trajectory Indexes for Efficient Processing of Spatiotemporal Range Queries (효율적인 시공간 영역 질의 처리를 위한 궤적 색인의 노드 재배치 전략)

  • Lim Duksung;Cho Daesoo;Hong Bonghee
    • Journal of KIISE:Databases / v.31 no.6 / pp.664-674 / 2004
  • The trajectory preservation property, which stores only one trajectory in a leaf node, is the most important feature of index structures such as the TB-tree for retrieving objects' moving paths in spatio-temporal space. It performs well for trajectory-related queries such as navigational queries and combined queries. However, the MBRs of non-leaf nodes in the TB-tree contain large amounts of dead space because trajectory preservation is achieved at the expense of the spatial locality of trajectories. As dead space increases, the overlap between nodes also increases, and thus the cost of classical range queries increases. We present a new split policy and entry relocation policies that improve the performance of range queries without degrading the performance of trajectory-related queries. To maximally reduce the dead space of a non-leaf node's MBR, the Maximal Area Reduction (MAR) policy is used as the split policy for non-leaf nodes. The entry relocation policy has non-leaf nodes exchange entries with each other in order to reduce dead space in these nodes. We propose two algorithms for the entry relocation policy and evaluate their performance against the TB-tree under a varying set of spatio-temporal queries.

Continuous Query Processing Utilizing Follows Relationship between Queries in Stock Databases (주식 데이타베이스에서 질의간 따름 관계를 이용한 연속 질의의 처리)

  • Ha, You-Min;Kim, Sang-Wook;Park, Sang-Hyun
    • Journal of KIISE:Databases / v.33 no.6 / pp.644-653 / 2006
  • This paper analyzes the properties of user queries for stock investment recommendation and defines the 'following relation', a new relation between two queries. A following relation between two queries $Q_1$, $Q_2$ and a recommendation value $X$ means 'if the recommendation value of a preceding query $Q_1$ is $X$, then a following query $Q_2$ always has $X$ as its recommendation value'. If there exists a following relation between $Q_1$ and $Q_2$, the recommendation value of $Q_2$ is decided immediately by that of $Q_1$, so we can skip running $Q_2$. We suggest two methods in this paper. The first analyzes all the following relations among user queries and represents them as a graph. The second searches the graph and decides the order in which queries are processed so as to maximize the number of query executions that can be eliminated. When the suggested procedures are applied, most user queries do not need to be processed directly, so the performance of running the overall query set is greatly improved. We verified the superiority of the suggested methods through experiments using real stock market data. According to the results of our experiments, overall query processing time was reduced to less than 10% of that of the traditional procedure with our proposed methods.
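The following relation described above can be sketched as a small propagation routine. This is a minimal illustration under stated assumptions: relations are given as hypothetical `(q1, q2, x)` triples, `run_query` stands in for actually evaluating a query, and queries are visited in the order given rather than in the optimized order the paper's second method computes.

```python
from collections import defaultdict


def evaluate_with_following(queries, following, run_query):
    """Evaluate queries, skipping any whose value is implied by a
    following relation (q1, q2, x): if q1 yields x, then q2 yields x."""
    followers = defaultdict(list)
    for q1, q2, x in following:
        followers[(q1, x)].append(q2)

    results = {}   # query -> recommendation value
    executed = []  # queries that actually had to be run

    for q in queries:
        if q in results:
            continue  # value already implied by an earlier query
        results[q] = run_query(q)
        executed.append(q)
        # Propagate the value along chains of following relations.
        stack = [q]
        while stack:
            cur = stack.pop()
            for nxt in followers.get((cur, results[cur]), []):
                if nxt not in results:
                    results[nxt] = results[cur]
                    stack.append(nxt)
    return results, executed


# Hypothetical queries and relations; the values are made up.
true_values = {"Q1": "BUY", "Q2": "BUY", "Q3": "BUY", "Q4": "SELL"}
relations = [("Q1", "Q2", "BUY"), ("Q2", "Q3", "BUY")]
results, executed = evaluate_with_following(
    ["Q1", "Q2", "Q3", "Q4"], relations, true_values.__getitem__)
```

In this example only Q1 and Q4 are executed; Q2 and Q3 receive their values through the relation chain, which is the saving the abstract attributes to eliminated query runs.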