• Title/Summary/Keyword: Query Filtering

A Proposal of Methods for Extracting Temporal Information of History-related Web Document based on Historical Objects Using Machine Learning Techniques (역사객체 기반의 기계학습 기법을 활용한 웹 문서의 시간정보 추출 방안 제안)

  • Lee, Jun;KWON, YongJin
    • Journal of Internet Computing and Services / v.16 no.4 / pp.39-50 / 2015
  • When retrieving information through a search engine, some users want documents that correspond to a specific time period. For example, a user who wants documents describing the situation before the era of the 'Japanese invasions of Korea' may submit the keyword 'Japanese invasions of Korea' as a search query. The search engine then returns all documents about the 'Japanese invasions of Korea' without regard to chronological order, which forces the user to do additional work. In addition, for a large percentage of history-related documents, the date on which the document was created differs from the time period that its content records. If the time period of a document's content can be extracted, it can support more effective retrieval and various applications. In this paper, we therefore study the extraction of time periods for historical documents of the Joseon era, using historical literature of that era to deduce the time period that corresponds to a document's content. We define historical objects based on historical literature collected from the web and confirm the feasibility of extracting the time period of a web document with machine learning techniques. In addition to the machine learning techniques, we propose and apply similarity filtering based on comparisons between historical objects. Finally, we evaluate the accuracy of the resulting temporal indexing and its improvement.

Vector Approximation Bitmap Indexing Method for High Dimensional Multimedia Database (고차원 멀티미디어 데이터 검색을 위한 벡터 근사 비트맵 색인 방법)

  • Park Joo-Hyoun;Son Dea-On;Nang Jong-Ho;Joo Bok-Gyu
    • The KIPS Transactions:PartD / v.13D no.4 s.107 / pp.455-462 / 2006
  • Recently, filtering approaches based on vector approximation, such as the VA-file[1] and the LPC-file[2], have been proposed to support similarity search in high-dimensional data spaces. These approaches filter out many irrelevant vectors by computing an approximate distance from the query vector using compact approximations of the vectors in the database. The total elapsed time for similarity search is reduced because disk I/O time is cut by reading the compact approximations instead of the original vectors. However, the search time of the VA-file or LPC-file is not much shorter than that of brute-force search, because computing the approximate distances requires a lot of computation. This paper proposes a new bitmap index structure that minimizes this computation time. To speed up the computation, the value of each object is stored as a bit pattern that represents the spatial position of its feature vector in the data space, and the distance between objects is computed with a bitwise XOR operation, which is much faster than arithmetic on the real-valued vectors. In our experiments, the proposed method reduced the total search time to about one fourth of the sequential search time and was up to twice as fast as existing methods, because it greatly shortens the computation time, although its data reading time is longer than that of the existing vector-approximation-based approaches. Consequently, we confirm that search performance can be improved further by shortening the filtering computation time of existing vector approximation methods when the database is fast enough.
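
The filter-then-refine idea in this abstract can be illustrated with a minimal sketch. It is not the paper's index structure: the one-bit-per-dimension encoding, the `threshold` parameter, and the function names `to_bitmap`, `hamming`, and `bitmap_search` are assumptions made for illustration. Each vector is reduced to a bit pattern, candidates are ranked by XOR plus bit count, and only the surviving candidates are compared with an exact distance.

```python
import math


def to_bitmap(vec, threshold=0.5):
    """Encode a vector in [0, 1]^d as an int: bit i is 1 when vec[i] >= threshold.

    Assumption: one bit per dimension; the paper's bit layout may differ.
    """
    bits = 0
    for i, x in enumerate(vec):
        if x >= threshold:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Approximate distance between two bitmaps: XOR, then count the set bits."""
    return bin(a ^ b).count("1")


def bitmap_search(query, vectors, bitmaps, n_candidates, k):
    """Filter with the cheap bitmap distance, then refine with exact distances."""
    query_bits = to_bitmap(query)
    # Filtering step: keep the n_candidates vectors with the closest bitmaps.
    order = sorted(range(len(vectors)), key=lambda i: hamming(query_bits, bitmaps[i]))
    candidates = order[:n_candidates]
    # Refinement step: exact Euclidean distance on the surviving candidates only.
    candidates.sort(key=lambda i: math.dist(query, vectors[i]))
    return candidates[:k]


vectors = [
    [0.1, 0.1, 0.1, 0.1],
    [0.9, 0.9, 0.9, 0.9],
    [0.2, 0.1, 0.3, 0.2],
    [0.9, 0.1, 0.9, 0.1],
]
bitmaps = [to_bitmap(v) for v in vectors]
nearest = bitmap_search([0.15, 0.1, 0.2, 0.1], vectors, bitmaps, n_candidates=2, k=1)
```

Because the bitmap distance is only an approximation, this sketch can miss true neighbours when `n_candidates` is small; it trades accuracy for cheaper filtering.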

Development of Personalized Recommendation System using RFM method and k-means Clustering (RFM기법과 k-means 기법을 이용한 개인화 추천시스템의 개발)

  • Cho, Young-Sung;Gu, Mi-Sug;Ryu, Keun-Ho
    • Journal of the Korea Society of Computer and Information / v.17 no.6 / pp.163-172 / 2012
  • Collaborative filtering, which existing recommendation systems use as an explicit method, cannot reflect the exact attributes of items and still suffers from sparsity and scalability problems, even though it has been used in practice and refined to address these defects. This paper proposes a personalized recommendation system that uses the RFM method and k-means clustering for u-commerce, which requires real-time accessibility and agility. The system uses an implicit method that avoids the complicated query processing of requests and responses for ratings, and it relies on RFM analysis and k-means clustering to reflect item attributes in order to find items with high purchasability. The proposed system clusters customers using feature vectors built from customer information, calculates the preference for each item category from purchase history data, and can thereby recommend items efficiently. To estimate its performance, the proposed system is compared with an existing system. The experiment, which uses a dataset collected from an internet cosmetics shopping mall, shows an improvement when evaluated according to the criteria of logicality.
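
As a rough illustration of the two building blocks named in this abstract, the sketch below derives RFM (recency, frequency, monetary) features from a purchase log and groups customers with a small k-means loop. The record layout, the function names `rfm_features` and `kmeans`, and the sample purchases are all hypothetical; the paper's actual feature vectors and preference calculation are not reproduced here.

```python
from datetime import date


def rfm_features(purchases, today):
    """Build (recency_days, frequency, monetary) per customer.

    purchases: iterable of (customer_id, purchase_date, amount); hypothetical layout.
    """
    feats = {}
    for cid, day, amount in purchases:
        recency, freq, money = feats.get(cid, (None, 0, 0.0))
        days_ago = (today - day).days
        recency = days_ago if recency is None else min(recency, days_ago)
        feats[cid] = (recency, freq + 1, money + amount)
    return feats


def kmeans(points, k, iterations=20):
    """Plain k-means with squared Euclidean distance.

    Assumption: the first k points seed the centroids, which keeps the run deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up purchase log for illustration only.
purchases = [
    ("a", date(2024, 1, 1), 10.0),
    ("a", date(2024, 1, 20), 30.0),
    ("b", date(2023, 6, 1), 5.0),
    ("c", date(2024, 1, 25), 50.0),
    ("c", date(2024, 1, 28), 60.0),
    ("d", date(2023, 5, 1), 8.0),
]
feats = rfm_features(purchases, today=date(2024, 1, 31))
customer_ids = sorted(feats)
labels, centroids = kmeans([feats[c] for c in customer_ids], k=2)
```

The raw features are clustered unscaled here, so recency in days dominates the distance; a real system would likely normalize each feature before clustering.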

Cloaking Method supporting K-anonymity and L-diversity for Privacy Protection in Location-Based Services (위치기반 서비스에서 개인 정보 보호를 위한 K-anonymity 및 L-diversity를 지원하는 Cloaking 기법)

  • Kim, Ji-Hee;Lee, Ah-Reum;Kim, Yong-Ki;Um, Jung-Ho;Chang, Jae-Woo
    • Journal of Korea Spatial Information System Society / v.10 no.4 / pp.1-10 / 2008
  • In the wireless internet, a user's location information is an important resource for many applications. One such application is Location-Based Services (LBSs), which are becoming popular. Because users of an LBS system issue location-based queries to LBS servers by sending their exact location, adversaries can misuse this location information. A mechanism that protects users' privacy is therefore needed. In this paper, we propose a cloaking method that considers both K-anonymity and L-diversity. Our cloaking method creates a minimum cloaking region by first finding L buildings (L-diversity) and then finding K users (K-anonymity). To support this, we use an R*-tree-based index structure and filtering methods designed for the minimum cloaking region. Finally, we show through a performance analysis that our method outperforms the existing grid-based cloaking method.
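
The two conditions on the cloaking region can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it grows a square around the requester with a linear scan instead of an R*-tree, and the names `count_inside` and `cloak` and the parameters `step` and `max_half` are assumptions. The region is accepted once it contains at least L buildings and at least K users.

```python
def count_inside(points, cx, cy, half):
    """Count points inside the axis-aligned square of half-width `half` around (cx, cy)."""
    return sum(1 for x, y in points if abs(x - cx) <= half and abs(y - cy) <= half)


def cloak(user, buildings, users, k, l, step=1.0, max_half=1000.0):
    """Return the smallest tested square (min_x, min_y, max_x, max_y) that holds
    at least l buildings (L-diversity) and at least k users (K-anonymity).

    Assumptions: the requester is included in `users`, the square is centred on
    the requester for simplicity, and None is returned if no square up to
    max_half satisfies both conditions.
    """
    cx, cy = user
    half = step
    while half <= max_half:
        enough_buildings = count_inside(buildings, cx, cy, half) >= l
        enough_users = count_inside(users, cx, cy, half) >= k
        if enough_buildings and enough_users:
            return (cx - half, cy - half, cx + half, cy + half)
        half += step
    return None


# Made-up coordinates for illustration only.
buildings = [(1, 1), (3, 0), (10, 10)]
users = [(0, 0), (2, 2), (4, -1), (20, 20)]
region = cloak((0, 0), buildings, users, k=3, l=2)
```

Centring the region on the requester would let an adversary infer the exact location from the region itself, so a practical scheme would shift or otherwise decouple the region from the requester's position.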

Tracking Moving Objects Using Signature-based Data Aggregation in Sensor Network (센서네트워크에서 시그니처 기반 데이터 집계를 이용한 이동객체 트래킹 기법)

  • Kim, Yong-Ki;Kim, Young-Jin;Yoon, Min;Chang, Jae-Woo
    • Journal of Korea Spatial Information System Society / v.11 no.2 / pp.99-110 / 2009
  • Many applications are currently being developed on top of sensor network technology, and tracking moving objects in a sensor network is one of the main issues in this field. There has been some research on this issue, but most of the existing work has two problems. The first is the communication overhead of visiting sensor nodes many times to track a moving object. The second is the inability to handle many moving objects at once. To resolve these problems, we propose a signature-based tracking method for moving objects that uses efficient data aggregation, called SigMO-TRK. We first design a local routing hierarchy tree that aggregates moving objects' trajectories efficiently by using a space filtering technique. Second, we track all trajectories of moving objects efficiently by using signatures generated by our approach. In addition, by extending SigMO-TRK, we can retrieve trajectories of moving objects that are similar to a given query. Finally, using the TOSSIM simulator, we show that our signature-based tracking method outperforms the existing tracking method in terms of energy efficiency.

A Benchmark Test of Spatial Big Data Processing Tools and a MapReduce Application

  • Nguyen, Minh Hieu;Ju, Sungha;Ma, Jong Won;Heo, Joon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.35 no.5 / pp.405-414 / 2017
  • Spatial data processing often poses challenges because of the unique characteristics of spatial data, and it becomes more complex in spatial big data processing. Some tools have been developed and made available, but they are not familiar to ordinary users. This paper presents a benchmark test of two notable tools for spatial big data processing: GIS Tools for Hadoop and SpatialHadoop. A MapReduce application is also introduced as a baseline for evaluating the effectiveness of the two tools and for deriving the impact of the number of maps and reduces on performance. Using these tools and New York taxi trajectory data, we perform a spatial processing task that filters the drop-off locations within the Manhattan area. The performance of the tools is then observed as the data size increases and the number of worker nodes changes. The results of this study are as follows. 1) GIS Tools for Hadoop automatically creates a Quadtree index in each spatial processing run, so its performance is improved significantly. However, users should be familiar with Java to handle this tool conveniently. 2) SpatialHadoop does not automatically create a spatial index for the data. As a result, its performance is much lower than that of GIS Tools for Hadoop on the same spatial processing task. However, SpatialHadoop achieved the best result for range queries. 3) The performance of our MapReduce application increased four times after changing the number of reduces from 1 to 12.
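
The filtering task described here can be pictured as a map step that keeps trips whose drop-off point falls inside a region and a reduce step that collects the matches. The sketch below runs that pattern in a single process; it does not use Hadoop, GIS Tools for Hadoop, or SpatialHadoop. The trip record layout, the rectangular `box` standing in for the Manhattan boundary, and the names `map_trip`, `reduce_ids`, and `run_job` are assumptions for illustration.

```python
from collections import defaultdict


def map_trip(trip, box):
    """Map step: emit ("inside", trip id) when the drop-off lies in the box.

    box is (min_lon, min_lat, max_lon, max_lat); a rectangle is a simplifying
    assumption, since a real boundary would be a polygon.
    """
    min_lon, min_lat, max_lon, max_lat = box
    lon, lat = trip["dropoff"]
    if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
        yield ("inside", trip["id"])


def reduce_ids(key, values):
    """Reduce step: gather all trip ids emitted under the same key."""
    return key, sorted(values)


def run_job(trips, box):
    """Run map, shuffle (group by key), and reduce in one process."""
    groups = defaultdict(list)
    for trip in trips:
        for key, value in map_trip(trip, box):
            groups[key].append(value)
    return dict(reduce_ids(key, values) for key, values in groups.items())


# Made-up trips on a unit square, not real taxi data.
trips = [
    {"id": 1, "dropoff": (0.5, 0.5)},
    {"id": 2, "dropoff": (2.0, 0.5)},
    {"id": 3, "dropoff": (0.1, 0.9)},
]
result = run_job(trips, box=(0.0, 0.0, 1.0, 1.0))
```

In a cluster setting the map calls would run in parallel over input splits and the grouping would be done by the framework's shuffle phase; the number of reduce tasks is the parameter whose effect the abstract reports.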

Implementation of Bytecode based Data Service Middleware Supporting Energy Efficiency in Geosensor Networks (지오센서 네트워크에서 에너지 효율성을 지원하는 바이트코드 기반 데이터 서비스 미들웨어 구현)

  • Hong, Seung-Tae;Yoon, Min;Chang, Jae-Woo
    • Spatial Information Research / v.18 no.4 / pp.75-88 / 2010
  • Recent developments in wireless communication and mobile positioning technologies have made geosensor networks widely used in various real-world fields. As a result, much research has been done on middleware that uses limited energy resources efficiently. However, because traditional middleware does not consider the characteristics of sensor nodes, such as computing power and specifications, it cannot support sensor nodes that have only restricted system resources. Therefore, in this paper, we design and implement a new bytecode-based data service middleware that supports energy efficiency in geosensor networks. First, the proposed middleware provides functions optimized for sensor nodes by using a minimal bytecode instruction set and a data manager that supports hardware abstraction. Second, it increases the energy efficiency of sensor nodes through both data aggregation query processing and data filtering, which minimize data transmission by eliminating unnecessary data. Finally, we show through our performance analysis that the proposed middleware is more energy efficient than the existing SwissQM.

Filtering Method for Analyzing Renewable Energy Stream Data (신재생 에너지 스트림 데이터 분석을 위한 필터링 기법)

  • Jin, Cheng Hao;Li, Xun;Kim, Kyu Ik;Hwang, Mi Yeong;Kim, Sang Yeob;Kim, Kwang Deuk;Ryu, Keun Ho
    • Journal of Convergence Society for SMB / v.1 no.1 / pp.39-44 / 2011
  • Recently, because of excessive use all over the world, fossil fuels such as coal, oil, and natural gas have been nearly exhausted, and they also cause serious environmental pollution. There is therefore a strong need to develop solar, wind, hydro, biomass, and geothermal energy to replace fossil fuels and avoid these problems. With advances in sensor technology, data on these energy sources is collected as stream data that arrives online; it is characterized as high-speed, real-time, and unbounded, and it requires fast processing to obtain up-to-date results. Traditional data processing techniques are therefore not suited to stream data. In this paper, we propose a Kalman filter-based algorithm to process renewable energy stream data.
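
To make the Kalman filtering step concrete, here is a minimal one-dimensional sketch that smooths a stream of scalar sensor readings one value at a time. The constant-value state model, the noise variances, and the function name `kalman_1d` are assumptions; the abstract does not give the paper's model or parameters.

```python
def kalman_1d(measurements, process_var=1e-3, measurement_var=1.0, x0=0.0, p0=1.0):
    """Filter a stream of scalar readings with a one-dimensional Kalman filter.

    Assumptions: the true value is modelled as roughly constant between readings,
    and process_var / measurement_var are illustrative defaults, not tuned values.
    """
    x, p = x0, p0  # current estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is carried over, its uncertainty grows slightly.
        p = p + process_var
        # Update: blend prediction and new reading using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# A constant reading of 10.0 starting from an initial estimate of 0.0:
# the estimate moves toward 10.0 as more readings arrive.
smoothed = kalman_1d([10.0] * 50, x0=0.0)
```

Each reading is processed in constant time and constant memory, which is what makes this kind of filter a natural fit for unbounded streams.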

Dynamic Recommendation System for a Web Library by Using Cluster Analysis and Bayesian Learning (군집분석과 베이지안 학습을 이용한 웹 도서 동적 추천 시스템)

  • Choi, Jun-Hyeog;Kim, Dae-Su;Rim, Kee-Wook
    • Journal of the Korean Institute of Intelligent Systems / v.12 no.5 / pp.385-392 / 2002
  • Collaborative filtering for personalization can suggest new items and information that a user hasn't expected, but it has some problems. Calculating the similarity between users is complex, and the method doesn't reflect a user's interest dynamically when the user inputs a query. In this paper, we classify users by their interests, which simplifies the similarity calculation. We propose an algorithm that readjusts a user's interest dynamically using a profile and Bayesian learning. When a user inputs a keyword to search for an item, the user's interest is readjusted. The user's profile, which consists of the keywords used and their occurrence frequencies, is designed and used to reflect the user's recent interests. An experiment with a data set collected from a university library shows that our method of adjusting a user's interest using the profile and Bayesian learning can improve user satisfaction. It recommends items that the user would be interested in.

A Signature-based Video Indexing Scheme using Spatio-Temporal Modeling for Content-based and Concept-based Retrieval on Moving Objects (이동 객체의 내용 및 개념 기반 검색을 위한 시공간 모델링에 근거한 시그니쳐 기반 비디오 색인 기법)

  • Sim, Chun-Bo;Jang, Jae-U
    • The KIPS Transactions:PartD / v.9D no.1 / pp.31-42 / 2002
  • In this paper, we propose a new spatio-temporal representation scheme that can effectively model the trajectories of moving objects in video data, and a new signature-based access method for those trajectories that supports efficient retrieval for user queries based on moving objects' trajectories. The proposed spatio-temporal representation scheme supports content-based retrieval based on moving objects' trajectories and concept-based retrieval based on concepts (semantics) acquired from the location information of those trajectories. Compared with sequential search, our signature-based access method improves retrieval performance by greatly reducing the number of disk accesses: it first scans all signatures and performs filtering, and then accesses the data file using only the retrieved candidate signatures. Finally, we show experimental results indicating that the proposed scheme is superior to Li and Shan's scheme in terms of both retrieval effectiveness and efficiency.
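
The scan-signatures-first access pattern can be sketched with a generic superimposed-coding signature file. This is an illustration of the general technique, not the paper's trajectory signatures: the 64-bit signature width, the hashing scheme, the direction labels used as terms, and the names `term_bits`, `signature`, and `signature_search` are all assumptions.

```python
import hashlib

SIG_BITS = 64  # assumed signature width


def term_bits(term, n_hashes=3):
    """Set n_hashes bit positions for one term (superimposed coding)."""
    bits = 0
    for i in range(n_hashes):
        digest = hashlib.sha256(f"{i}:{term}".encode()).digest()
        bits |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return bits


def signature(terms):
    """OR together the bit patterns of all terms in a record or query."""
    sig = 0
    for term in terms:
        sig |= term_bits(term)
    return sig


def signature_search(query_terms, records, signatures):
    """Scan the signatures first, then check only the candidate records."""
    query_sig = signature(query_terms)
    # Filtering step: a record can match only if its signature covers the query's bits.
    candidates = [i for i, sig in enumerate(signatures) if (sig & query_sig) == query_sig]
    # Verification step: read the candidate records to discard false drops.
    return [i for i in candidates if set(query_terms) <= set(records[i])]


# Hypothetical records: each lists the motion directions seen in one trajectory.
records = [
    ["north", "east"],
    ["south", "west"],
    ["north", "east", "south"],
]
signatures = [signature(r) for r in records]
matches = signature_search(["north", "east"], records, signatures)
```

The signature test never rejects a true match, because every bit set by a query term is also set in the signature of any record containing that term; it may admit false candidates, which the verification step removes.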