• Title/Summary/Keyword: Query Index


Enhanced Grid-Based Trajectory Cloaking Method for Efficient Search and User Information Protection in Location-Based Services (위치기반 서비스에서 효율적 검색과 사용자 정보보호를 위한 향상된 그리드 기반 궤적 클로킹 기법)

  • Youn, Ji-Hye;Song, Doo-Hee;Cai, Tian-Yuan;Park, Kwang-Jin
    • KIPS Transactions on Computer and Communication Systems, v.7 no.8, pp.195-202, 2018
  • With the spread of location-based applications such as smartphones and GPS navigation, research on protecting location and trajectory privacy has become active. To receive location-related services, users must disclose their exact location to the server. However, disclosing a user's location exposes not only the current position but also the trajectory to the server, which raises privacy concerns. Furthermore, users request from the server not only location information but also multimedia information (photographs, reviews, etc. of the location), which increases both the server's processing cost and the amount of data the user must receive. To solve these problems, this study proposes the EGTC (Enhanced Grid-based Trajectory Cloaking) technique. As in the existing GTC (Grid-based Trajectory Cloaking) technique, the EGTC method divides the user trajectory into grid cells according to the user privacy level (UPL) and creates a cloaking region in which a random query sequence is determined. In the next step, the necessary information is received as an index by treating the sub-grid cell corresponding to the path along which the user wishes to move as c(x,y). The proposed method guarantees trajectory privacy, as the existing GTC method does, while reducing the amount of information the user must receive. The effectiveness of the proposed method is demonstrated through experimental results.
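
A rough sketch of the grid-cloaking idea described in this abstract. The paper's exact cell-selection rules and parameters are not given here, so the cell size, the padding rule, and all names are hypothetical:

    import random

    def cloak_trajectory(points, cell_size, upl):
        """Map each GPS point to its grid cell, pad the visited cells with
        neighboring dummy cells up to the privacy level `upl`, and query the
        cells in random order (hypothetical simplification)."""
        visited = {(int(x // cell_size), int(y // cell_size)) for x, y in points}
        cloak = set(visited)
        while len(cloak) < upl * len(visited):
            cx, cy = random.choice(list(cloak))
            cloak.add((cx + random.choice([-1, 0, 1]), cy + random.choice([-1, 0, 1])))
        sequence = list(cloak)
        random.shuffle(sequence)   # random query order hides the true path order
        return sequence

    # Example: a 3-point trajectory cloaked at privacy level 2
    print(cloak_trajectory([(12.4, 7.9), (13.1, 8.2), (14.0, 8.8)], cell_size=1.0, upl=2))

The server sees only the shuffled cell sequence, so it cannot tell the real cells, or their order, from the dummies.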

A Search Method for Components Based on XML Component Specification (XML 컴포넌트 명세서 기반의 컴포넌트 검색 기법)

  • Park, Seo-Young;Shin, Yoeng-Gil;Wu, Chi-Su
    • Journal of KIISE:Software and Applications, v.27 no.2, pp.180-192, 2000
  • Recently, component technology has played a main role in software reuse. It has shifted code-based reuse toward binary, interface-based reuse, because components can be combined into software under development simply through their interfaces. Since components and component users have increased rapidly, users need to be able to find the most suitable components among the enormous number available on the Internet. It is desirable to use web-document-style specifications for component specifications on the Internet. This paper proposes using XML component specifications instead of HTML specifications, because HTML cannot represent the semantics of contexts. We also propose an XML context-based search method built on XML component specifications. In their queries, component users use contexts for the component properties and terms for the values of those properties. The index structure for the context-based search method is an inverted file of term-context-component specification entries. Besides the XML context-based search method, a variety of search methods built on it, such as keyword search, faceted search, and browsing, are provided for users' convenience. The search engine uses a three-layer architecture, consisting of an interface layer, a query expansion layer, and an XML search engine layer, for an efficient index scheme. In this paper, an XML DTD (Document Type Definition) for component specifications is defined, and experimental results comparing the search performance of XML with that of HTML are discussed.
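
A minimal sketch of the term-context inverted-file idea mentioned in the abstract. The paper's DTD, contexts, and query syntax are not reproduced here, so the structure and all identifiers are hypothetical:

    from collections import defaultdict

    # inverted file keyed by (term, context): term -> context -> component ids
    index = defaultdict(lambda: defaultdict(set))

    def add_component(comp_id, spec):
        """Index a component specification given as {context: [terms]}."""
        for context, terms in spec.items():
            for term in terms:
                index[term][context].add(comp_id)

    def context_search(term, context):
        """Return components whose given context contains the term."""
        return index[term][context]

    add_component("comp-1", {"function": ["sort", "merge"], "platform": ["java"]})
    add_component("comp-2", {"function": ["search"], "platform": ["java"]})
    print(context_search("java", "platform"))   # {'comp-1', 'comp-2'}

Keying the postings by context as well as term is what distinguishes this from a plain keyword index: "java" as a platform and "java" as, say, a function name would land in different posting lists.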


An Efficient Technique for Processing Frequent Updates in the R-tree (R-트리에서 빈번한 변경 질의 처리를 위한 효율적인 기법)

  • Kwon, Dong-Seop;Lee, Sang-Jun;Lee, Suk-Ho
    • Journal of KIISE:Databases, v.31 no.3, pp.261-273, 2004
  • Advances in information and communication technologies have been creating new classes of database applications. For example, in moving object databases, which track the positions of many objects, or stream databases, which process data streams from many sensors, the data usually change very rapidly and continuously. Traditional database systems, however, have trouble processing such rapidly and continuously changing data, because they assume that a data item stored in the database remains constant until it is explicitly modified. The problem becomes more serious in the R-tree, a typical index structure for multidimensional data, because modifying data in the R-tree can trigger cascading node splits or merges. To process frequent updates more efficiently, we propose a novel update technique for the R-tree, which we call the leaf-update technique. If the new value of a data item lies within the MBR of the leaf the item belongs to, the leaf-update technique changes only that leaf node, not the whole tree. Using this leaf-update approach together with a leaf-access hash table for direct access to leaf nodes, the proposed technique greatly reduces update cost. In addition, the leaf-update technique can be adopted in diverse variants of the R-tree and in the many applications that use the R-tree, since it is based on the R-tree and guarantees its correctness. In this paper, we prove the effectiveness of the leaf-update technique theoretically and present experimental results showing that it outperforms the traditional approach.
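
A toy sketch of the leaf-update shortcut as described in the abstract. The R-tree internals are elided, so the `Leaf` class, the `reinsert` callback, and all names are hypothetical stand-ins:

    class Leaf:
        def __init__(self, mbr):
            self.mbr = mbr          # ((xmin, ymin), (xmax, ymax))
            self.entries = {}       # obj_id -> (x, y)

    def leaf_update(leaf_table, obj_id, new_point, reinsert):
        """Touch only the object's leaf when the new position stays inside the
        leaf MBR; otherwise fall back to an ordinary delete + re-insert
        (`reinsert` stands in for the full R-tree update path)."""
        leaf = leaf_table[obj_id]                    # direct access via the hash table
        (xmin, ymin), (xmax, ymax) = leaf.mbr
        x, y = new_point
        if xmin <= x <= xmax and ymin <= y <= ymax:
            leaf.entries[obj_id] = new_point         # cheap path: no structural change
        else:
            reinsert(obj_id, new_point)              # slow path: may split or merge nodes

    # toy usage: one leaf covering the unit square
    leaf = Leaf(((0.0, 0.0), (1.0, 1.0)))
    leaf.entries["car-7"] = (0.2, 0.3)
    table = {"car-7": leaf}
    leaf_update(table, "car-7", (0.4, 0.5), reinsert=lambda i, p: None)
    print(leaf.entries["car-7"])   # (0.4, 0.5)

Since moving objects mostly make small movements, most updates take the cheap path and never descend the tree from the root.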

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems, v.24 no.4, pp.111-136, 2018
  • In this paper, we propose a methodology for extracting answer information for queries from the various types of unstructured documents collected from multiple web sources, in order to expand a knowledge base. The proposed methodology consists of the following steps: 1) collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news for "subject-predicate" separated queries, and classify the suitable documents; 2) determine whether each sentence is suitable for extracting information, and derive a confidence score; 3) based on the predicate feature, extract the information from the suitable sentences, and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker; compared with the baseline model, the proposed system shows a higher performance index. The contribution of this study is the development of a sequence tagging model based on a bi-directional LSTM-CRF that uses the predicate feature of the query, yielding a robust model that maintains high recall even across the various document types collected from multiple sources. Information extraction for knowledge base expansion must account for the heterogeneous characteristics of source-specific document types, and the proposed methodology proved to extract information effectively from various document types compared to the baseline model, whereas previous research performs poorly when extracting from document types different from the training data. In addition, by predicting the suitability of documents and sentences before the extraction step, this study prevents unnecessary extraction attempts on documents that do not contain the answer, providing a way to maintain precision even in a real web environment. Because the target is unstructured documents on the real web, there is no guarantee that a document contains the correct answer; previous machine reading comprehension studies show low precision in this setting because they frequently attempt to extract an answer even from documents with no correct answer, so the suitability-prediction policy is meaningful in that it helps maintain extraction performance on the real web. The limitations of this study and future research directions are as follows. First, data preprocessing: the units of knowledge extraction are identified through morphological analysis based on the open-source KoNLPy Python package, and extraction can go wrong when morphological analysis is performed incorrectly, so an improved morphological analyzer is needed. Second, entity ambiguity: the system cannot distinguish entities that share the same name, so if several people with the same name appear in the news, it may fail to extract information about the intended one; future research needs measures for disambiguating identical names. Third, evaluation query data: we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker and built an evaluation data set of 800 documents (400 questions x 7 documents per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news), judging whether each contains a correct answer. To ensure external validity, it is desirable to evaluate the system on more queries, which is a costly manual activity; future research should do so. It is also necessary to develop a Korean benchmark data set for information extraction over queries from multi-source web documents, to enable more objective evaluation.
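
A highly simplified sketch of the three-stage pipeline described above. The actual system uses trained classifiers and a bi-directional LSTM-CRF tagger; here hypothetical placeholder scoring functions stand in for all trained models, and every name and threshold is an assumption:

    def extract_answer(query, documents, doc_model, sent_model, tagger, threshold=0.5):
        """Stage 1: keep documents relevant to the (subject, predicate) query.
        Stage 2: keep sentences suitable for extraction, with a confidence.
        Stage 3: tag answer spans and combine the confidences."""
        subject, predicate = query
        results = []
        for doc in documents:
            if doc_model(subject, predicate, doc) < threshold:
                continue                      # skip documents unlikely to hold the answer
            for sent in doc.split("."):       # naive sentence split for illustration
                sent_conf = sent_model(subject, predicate, sent)
                if sent_conf < threshold:
                    continue                  # avoid extraction attempts on poor sentences
                span, tag_conf = tagger(predicate, sent)
                if span:
                    results.append((span, sent_conf * tag_conf))
        return max(results, key=lambda r: r[1], default=None)

    # toy stand-ins for the trained models
    doc_model = lambda s, p, d: 1.0 if s in d else 0.0
    sent_model = lambda s, p, sent: 1.0 if s in sent else 0.0
    tagger = lambda p, sent: (sent.split()[-1], 0.9) if p in sent else (None, 0.0)
    print(extract_answer(("Einstein", "born"), ["Einstein was born in Ulm."],
                         doc_model, sent_model, tagger))   # ('Ulm', 0.9)

The two early-exit checks are the precision mechanism the abstract emphasizes: documents and sentences judged unsuitable never reach the tagger at all.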

A Spatio-Temporal Clustering Technique for the Moving Object Path Search (이동 객체 경로 탐색을 위한 시공간 클러스터링 기법)

  • Lee, Ki-Young;Kang, Hong-Koo;Yun, Jae-Kwan;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society, v.7 no.3 s.15, pp.67-81, 2005
  • Recently, with the development of Geographic Information Systems, interest in and research on new application services such as Location-Based Services and Telematics, which provide emergency services, neighbor information search, and route search, have been increasing. A user's search in the spatio-temporal databases used in Location-Based Services or Telematics usually fixes the current time on the time axis and queries spatial and aspatial attributes, so if the query range on the time axis is extensive, the search operation is difficult to handle efficiently. To solve this problem, the snapshot, a method of summarizing the location data of moving objects, was introduced. However, if the range of data to be stored is wide, more storage space is required, and snapshots are created even for regions that are rarely searched, so the snapshot method generally wastes storage space and memory. Therefore, in this paper, we propose the Hash-based Spatio-Temporal Clustering Algorithm (H-STCA), which extends the two-dimensional spatial hash algorithm previously used for spatial clustering to three dimensions, to overcome the disadvantages of the snapshot method. This paper also proposes a knowledge extraction algorithm that extracts knowledge for moving-object path search from past location data, based on H-STCA. Moreover, the performance evaluation results show that, for huge amounts of moving-object data, the snapshot clustering method using H-STCA outperforms both the spatio-temporal index methods and the original snapshot method in search time, storage structure construction time, and optimal path search time. In particular, the more the number of moving objects increased, the more the snapshot clustering method using H-STCA improved relative to the existing spatio-temporal index methods and the original snapshot method.
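
A minimal sketch of extending a 2D spatial hash to the three dimensions (x, y, t) used for spatio-temporal clustering. The paper's actual hash function is not given, so the cell size, time slot, and bucket count below are hypothetical:

    def st_hash(x, y, t, cell=100.0, slot=60.0, n_buckets=1024):
        """Hash a spatio-temporal point into a bucket by quantizing space into
        `cell`-sized grid cells and time into `slot`-second slots."""
        ix, iy, it = int(x // cell), int(y // cell), int(t // slot)
        return hash((ix, iy, it)) % n_buckets

    # nearby points in space and time land in the same bucket
    print(st_hash(120.0, 340.0, 30.0), st_hash(130.0, 350.0, 45.0))   # equal buckets
    print(st_hash(120.0, 340.0, 30.0) == st_hash(900.0, 340.0, 30.0)) # usually False

Quantizing the time axis alongside the spatial axes is what lets past trajectory data cluster into buckets instead of requiring a per-instant snapshot.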


An Efficient Location Encoding Method Based on Hierarchical Administrative District (계층적 행정 구역에 기반한 효율적인 위치 정보 표현 방식)

  • Lee Sang-Yoon;Park Sang-Hyun;Kim Woo-Cheol;Lee Dong-Won
    • Journal of KIISE:Databases, v.33 no.3, pp.299-309, 2006
  • Due to the rapid development of mobile communication technologies, mobile devices such as cell phones and PDAs have become increasingly popular. As different devices require different applications, various new services are being developed to satisfy these needs. One popular service under heavy demand is the Location-Based Service (LBS), which exploits the spatial information of moving objects as it changes over time. To support LBS efficiently, it is necessary to index and query large amounts of spatio-temporal information about moving objects effectively. Therefore, in this paper, we investigate how such location information can be efficiently stored and indexed. In particular, we propose a novel location encoding method based on hierarchical administrative district information. Our proposal differs from conventional approaches, in which moving objects are often expressed as geometric points (x,y) in two-dimensional space; instead, moving objects are encoded as one-dimensional points using both administrative district and road information. Our method is especially useful for monitoring traffic situations or tracing the locations of moving objects through approximate spatial queries.
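
One way to picture such a hierarchical encoding. The paper's actual code layout is not given, so the digit widths and level names below are purely illustrative:

    def encode_location(province, city, district, road, position):
        """Pack hierarchy levels into one integer so that a numeric range
        covers an entire administrative area (illustrative digit widths)."""
        assert province < 100 and city < 100 and district < 100 and road < 1000 and position < 1000
        return (((province * 100 + city) * 100 + district) * 1000 + road) * 1000 + position

    def district_range(province, city, district):
        """All codes inside one district form a contiguous 1-D interval,
        so an area query becomes a single range scan on the index."""
        base = ((province * 100 + city) * 100 + district) * 1000 * 1000
        return base, base + 1000 * 1000 - 1

    code = encode_location(11, 23, 5, 417, 88)
    lo, hi = district_range(11, 23, 5)
    print(lo <= code <= hi)   # True: the point falls in its district's range

The design benefit of a 1-D code is that approximate queries like "all objects in this district" need only an ordinary B-tree range scan rather than a 2-D spatial index.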

Design and Implementation of Web GIS Server Using Node.js (Node.js를 활용한 웹GIS 서버의 설계와 구현)

  • Jun, Sang Hwan;Doh, Kyoung Tae
    • Spatial Information Research, v.21 no.3, pp.45-53, 2013
  • Web GIS, based on the latest web technology, has evolved to provide efficient and accurate spatial information to users, and Web GIS servers have steadily improved their performance in responding to user requests and serving spatial information. This research presents the design and implementation of a Web GIS server, named NodeMap, built on Node.js, an emerging framework noted for its event-driven, non-blocking I/O model for running JavaScript on the server. NodeMap is a Web GIS server that supports the OGC implementation specifications. It is designed to process GIS data using a DBMS that supports spatial indexes and standard spatial query functions; it uses the Node-Canvas module, which supports the HTML5 canvas, to render spatial information onto tile maps; and it uses the Express module, a framework based on the Connect module. Benchmarking of NodeMap confirmed the applicability of Node.js as a technology for developing future Web GIS servers. Having completed its quality tests, this study demonstrates the compatibility and potential of Node.js as a Web GIS server development technology and its promise for internet GIS services.
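
Tile map servers of this kind resolve a tile address (zoom, x, y) to a geographic bounding box before rendering it. NodeMap itself is written in JavaScript; below is only a small Python illustration of the standard Web Mercator ("slippy map") tile arithmetic, not the NodeMap code:

    import math

    def tile_to_bbox(z, x, y):
        """Return (lon_min, lat_min, lon_max, lat_max) of slippy-map tile (z, x, y)."""
        n = 2 ** z
        def lon(i): return i / n * 360.0 - 180.0
        def lat(j): return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * j / n))))
        return lon(x), lat(y + 1), lon(x + 1), lat(y)

    # the zoom-10 tile containing Seoul (approx. lon 126.98, lat 37.57)
    print(tile_to_bbox(10, 873, 396))

With the bounding box in hand, the server queries the spatial DBMS for features inside it and renders them onto the tile canvas.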

Vector Approximation Bitmap Indexing Method for High Dimensional Multimedia Database (고차원 멀티미디어 데이터 검색을 위한 벡터 근사 비트맵 색인 방법)

  • Park Joo-Hyoun;Son Dea-On;Nang Jong-Ho;Joo Bok-Gyu
    • The KIPS Transactions:PartD, v.13D no.4 s.107, pp.455-462, 2006
  • Recently, filtering approaches using vector approximation, such as the VA-file [1] or the LPC-file [2], have been proposed to support similarity search in high-dimensional data spaces. This approach filters out many irrelevant vectors by calculating approximate distances from a query vector using compact approximations of the vectors in the database. Accordingly, the total elapsed time for similarity search is reduced, because disk I/O time is cut by reading the compact approximations instead of the original vectors. However, the search time of the VA-file or LPC-file is not much shorter than that of brute-force search, because computing the approximate distances still requires many calculations. This paper proposes a new bitmap index structure to minimize this calculation time. To improve calculation speed, each object is encoded as a bit pattern that represents the spatial position of its feature vector in the data space, and distances between objects are computed with XOR bit operations, which are much faster than real vector calculations. According to the experiments, the proposed method shortens the total search time to about one fourth of the sequential search time, and to as little as half that of the existing methods, by greatly reducing the calculation time, although it has a longer data reading time than the existing vector-approximation-based approaches. Consequently, the search performance of existing vector approximation methods can be improved even further by shortening their filtering calculation time, provided the database read speed is fast enough.
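
A toy illustration of the bitmap-filtering idea. Real VA-style signatures quantize each dimension into several bits; using one bit per dimension (which half of the space the coordinate falls in) is a deliberate simplification, and all names are hypothetical:

    def signature(vec, midpoints):
        """Encode a vector as a bit pattern: bit i is 1 when coordinate i lies
        in the upper half of dimension i (simplified one-bit quantization)."""
        bits = 0
        for i, (v, m) in enumerate(zip(vec, midpoints)):
            if v >= m:
                bits |= 1 << i
        return bits

    def approx_distance(sig_a, sig_b):
        """Fast filter: count differing bits with XOR instead of computing a
        real vector distance (Hamming distance as a cheap proxy)."""
        return bin(sig_a ^ sig_b).count("1")

    mids = [0.5, 0.5, 0.5, 0.5]
    q = signature([0.9, 0.1, 0.7, 0.2], mids)
    d1 = signature([0.8, 0.2, 0.6, 0.1], mids)   # similar vector
    d2 = signature([0.1, 0.9, 0.2, 0.8], mids)   # dissimilar vector
    print(approx_distance(q, d1), approx_distance(q, d2))   # 0 4

Because a single XOR plus a popcount replaces a full floating-point distance computation, candidates like d2 can be discarded almost for free, and only the survivors need their original vectors read and compared exactly.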

A Study on Spatial Data Integration using Graph Database: Focusing on Real Estate (그래프 데이터베이스를 활용한 공간 데이터 통합 방안 연구: 부동산 분야를 중심으로)

  • Ju-Young KIM;Seula PARK;Ki-Yun YU
    • Journal of the Korean Association of Geographic Information Studies, v.26 no.3, pp.12-36, 2023
  • Graph databases, which store different types of data and their relationships modeled as a graph, can be effective for managing and analyzing real estate spatial data linked by complex relationships. However, they are not widely used for this purpose because of their limited spatial functionality. In this study, we propose a uniform grid-based approach to managing real estate spatial data in a graph database, so as to answer various real estate-related spatial questions. After analyzing real estate communities to identify the relevant data and adopting national point numbers as unit grids, we construct a graph schema that links diverse real estate data and build a test database. We then test basic topological relationships and spatial functions using the Jackpine benchmark, and further run query tests on various scenarios to verify the appropriateness of the proposed method. The results show that the proposed method successfully executed 25 of 29 spatial topological relationships and spatial functions, and achieved about 97% accuracy over those 25 functions and 15 scenarios. The significance of this study lies in proposing an efficient data integration method that can answer real estate-related spatial questions despite the limited spatial operation capabilities of graph databases. However, there remain limitations, such as incorrect spatial topological relationships arising from the grid-based index and query inefficiency due to list comparisons, which should be addressed in follow-up studies.
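
A minimal sketch of the uniform-grid integration idea. The study links features to national point numbers inside a property-graph DBMS; here plain Python dictionaries stand in for the graph store, and the grid size and labels are hypothetical:

    from collections import defaultdict

    # graph as adjacency: grid-cell nodes linked to the features they contain
    cell_of = {}                       # feature id -> grid cell id
    features_in = defaultdict(set)     # grid cell id -> feature ids

    def grid_cell(x, y, size=100.0):
        """Unit grid cell playing the role of a national point number."""
        return (int(x // size), int(y // size))

    def link(feature_id, x, y):
        """Create the feature -[LOCATED_IN]-> cell relationship."""
        cell = grid_cell(x, y)
        cell_of[feature_id] = cell
        features_in[cell].add(feature_id)

    def nearby(feature_id):
        """Spatial question answered by graph traversal: neighbors share
        the feature's cell or one of the eight adjoining cells."""
        cx, cy = cell_of[feature_id]
        return {f for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  for f in features_in[(cx + dx, cy + dy)]} - {feature_id}

    link("apartment-101", 150.0, 250.0)
    link("school-7", 180.0, 310.0)
    print(nearby("apartment-101"))   # {'school-7'}

Routing spatial questions through fixed grid-cell nodes is what lets a graph database with weak geometric operators still answer proximity queries; the trade-off, as the abstract notes, is that coarse cells can assert topological relationships the exact geometries would not.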

Data Mining and Construction of Database Concerning Effects of Vitis Genus (산머루 관련 정보수집 및 데이터베이스의 구축)

  • Kim, Min-A;Jo, Yun-Ju;Shin, Jee-Young;Shin, Min-Kyu;Bae, Hyun-Su;Hong, Moo-Chang;Kim, Yang-Seok
    • Journal of Physiology & Pathology in Korean Medicine, v.26 no.4, pp.551-556, 2012
  • Databases for oriental medicine once existed only as printed documentation and have since developed into databases supporting random access in the information society. However, such databases are not very diversified, and databases for herbal source materials exist mostly in the style of broad dictionaries; databases that cover individual raw herbal medicines in depth remain insufficient in both quantity and quality. The Korean wild grape is a deciduous plant of the Vitaceae, and various medical effects have been found for it experimentally. It is a medicinal material with high potential for academic study and commercialization, because it can be applied in diverse industrial fields including health products, food, and beauty. Focusing on this potential, we formed a cooperative system among the Muju cluster business group for Korean mountain wild grapes, the Physiology Laboratory of Kyung Hee University Oriental Medicine, and the Medical Classics Laboratory of Kyung Hee University Oriental Medicine, and built a database on Korean wild grapes as a touchstone for establishing in-depth databases of single biomedical materials. First, to cover Eastern and Western research records and writings related to Korean wild grapes, the ancient literature of Northeast Asia was categorized into classical literature (Korean literature published by government organizations, Korean classical literature, Chinese classical literature, and Korean and Chinese classical literature on oriental medicine) and modern literature (modern literature on oriental medicine and on domestic and foreign herbal medicine), and text-mining work was performed in cooperation with the Medical Classics Laboratory. Experimental and theoretical data on the Korean wild grape were also collected from the Medline database maintained by the U.S. National Library of Medicine to organize domestic and foreign theses on the topic, and hyperlink and download functions were added so that users can search theses and view them directly. The thesis search function provides various auxiliary capabilities, and searches are available through diverse queries, such as the names of Korean wild grape subspecies and logical-intersection indexes over active ingredients, efficacy, and elements; it is intended to help researchers design Korean wild grape experiments more easily. In addition, patent data related to Korean wild grapes, collected from the European Patent Office in view of their commercialization potential, were incorporated, and a system for searching and viewing them was established from the same standpoint. Perl was used for query programming and MS-SQL for database establishment and management in the design of this database. Currently, the data are available for free use at the following address: http://163.180.41.43:8011/index.html