• Title/Summary/Keyword: Nearest neighbor search

An Improvement in K-NN Graph Construction using re-grouping with Locality Sensitive Hashing on MapReduce (MapReduce 환경에서 재그룹핑을 이용한 Locality Sensitive Hashing 기반의 K-Nearest Neighbor 그래프 생성 알고리즘의 개선)

  • Lee, Inhoe;Oh, Hyesung;Kim, Hyoung-Joo
    • KIISE Transactions on Computing Practices / v.21 no.11 / pp.681-688 / 2015
  • The k-nearest neighbor (k-NN) graph construction is an important operation with many web-related applications, including collaborative filtering, similarity search, and many others in data mining and machine learning. Despite its many elegant properties, the brute-force k-NN graph construction method has a computational complexity of $O(n^2)$, which is prohibitive for large-scale data sets. Thus, Locality Sensitive Hashing (LSH), which is efficient for high-dimensional and sparse data, is increasingly used on MapReduce, a (Key, Value)-based distributed framework. Based on a two-stage strategy, we use LSH to divide users into small subsets and then calculate the similarity between pairs within each subset using a brute-force method on MapReduce. The candidate-group generation stage is important because the brute-force calculation in the following step operates on its output. However, existing methods do not prevent large candidate groups. In this paper, we propose an efficient algorithm for approximate k-NN graph construction that regroups candidate groups. Experimental results show that our approach is more effective than existing methods in terms of graph accuracy and scan rate.
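The two-stage idea in this abstract (hash items into candidate groups, then compare pairs only within each group) can be sketched on a single machine. The code below is a minimal illustration, not the authors' algorithm: it assumes random-hyperplane LSH for cosine similarity, and the regrouping step (re-hashing any oversized group with fresh hyperplanes) is a hypothetical stand-in for the paper's regrouping. All function names and the parameters `n_bits`, `max_size`, and `max_depth` are assumptions made for this example, and no MapReduce is involved.

```python
import random
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0


def signature(vec, planes):
    """One bit per hyperplane: the side of the plane on which vec lies."""
    return tuple(sum(a * p for a, p in zip(vec, plane)) >= 0 for plane in planes)


def lsh_groups(vectors, ids, dim, n_bits, max_size, rng, depth=0, max_depth=4):
    """Partition ids into candidate groups by LSH signature.

    Hypothetical regrouping rule: a group larger than max_size is hashed
    again with fresh hyperplanes, up to max_depth levels.
    """
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
    buckets = defaultdict(list)
    for i in ids:
        buckets[signature(vectors[i], planes)].append(i)
    groups = []
    for members in buckets.values():
        if len(members) > max_size and depth < max_depth:
            groups.extend(lsh_groups(vectors, members, dim, n_bits,
                                     max_size, rng, depth + 1, max_depth))
        else:
            groups.append(members)
    return groups


def knn_graph(vectors, k, n_bits=2, max_size=4, seed=0):
    """Approximate k-NN graph: brute force inside each candidate group only."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    ids = list(range(len(vectors)))
    graph = {i: [] for i in ids}
    for members in lsh_groups(vectors, ids, dim, n_bits, max_size, rng):
        for i in members:
            scored = sorted(((cosine(vectors[i], vectors[j]), j)
                             for j in members if j != i), reverse=True)
            graph[i] = [j for _, j in scored[:k]]
    return graph
```

Capping the group size bounds the quadratic cost of the brute-force stage, which is the concern the abstract raises about large candidate groups. A single hash pass like this one can miss true neighbors that fall into different groups; practical systems typically repeat the hashing several times and merge the results.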

A Nearest Neighbor Query Processing Algorithm Supporting K-anonymity Based on Weighted Adjacency Graph in LBS (위치 기반 서비스에서 K-anonymity를 보장하는 가중치 근접성 그래프 기반 최근접 질의처리 알고리즘)

  • Jang, Mi-Young;Chang, Jae-Woo
    • Spatial Information Research / v.20 no.4 / pp.83-92 / 2012
  • Location-based services (LBS) are increasingly popular due to improvements in geo-positioning capabilities and wireless communication technology. However, in order to use LBS, a user issuing a query must send his/her exact location to the LBS provider. Therefore, preserving the user's privacy while providing LBS is a key challenge. To solve this problem, the existing method employs a 2PASS cloaking framework that not only hides the actual user location but also reduces bandwidth consumption. However, 2PASS does not fully guarantee user privacy because it does not take the real user distribution into account. Hence, in this paper, we propose a nearest neighbor query processing algorithm that supports the K-anonymity property based on a weighted adjacency graph (WAG). Our algorithm not only protects the location of a user by guaranteeing K-anonymity in a query region, but also improves bandwidth usage by reducing unnecessary searches for a query result. Experimental results demonstrate that our algorithm outperforms the existing one in terms of query processing time and bandwidth usage.

Design of an Efficient Parallel High-Dimensional Index Structure (효율적인 병렬 고차원 색인구조 설계)

  • Park, Chun-Seo;Song, Seok-Il;Sin, Jae-Ryong;Yu, Jae-Su
    • Journal of KIISE:Databases / v.29 no.1 / pp.58-71 / 2002
  • Generally, multi-dimensional data such as image and spatial data require a large amount of storage space. There is a limit to storing and managing such large amounts of data on a single workstation. If we manage the data in a parallel computing environment, which is being actively researched these days, we can obtain greatly improved performance. In this paper, we propose a parallel high-dimensional index structure that exploits the parallelism of the parallel computing environment. The proposed index structure is an nP(processor)-n$\times$mD(disk) architecture, which is a hybrid of nP-nD and 1P-nD. Its node structure increases fan-out and reduces the height of an index tree. Also, a range search algorithm that maximizes I/O parallelism is devised, and it is applied to K-nearest neighbor queries. Various experiments show that the proposed method outperforms other parallel index structures.

A Fast Motion Estimation Scheme using Spatial and Temporal Characteristics (시공간 특성을 이용한 고속 움직임 백터 예측 방법)

  • 노대영;장호연;오승준;석민수
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.4 / pp.237-247 / 2003
  • The Motion Estimation (ME) process is an important part of a video encoding system since it can significantly reduce the bitrate while keeping the output quality of an encoded sequence. Unfortunately, this process may dominate the encoding time when the straightforward full search (FS) algorithm is used. Many fast algorithms proposed so far reduce the computational complexity by limiting the number of search locations. This is accomplished at the expense of motion estimation accuracy. In this paper, we introduce a new fast motion estimation method based on the spatio-temporal correlation of adjacent blocks. A reliable predicted motion vector (RPMV) is defined. The reliability of the RPMV is shown on the basis of motion vectors obtained by FS. The scalar and the direction of the RPMV are used in our proposed scheme. The experimental results show that the proposed method is about 11~14% faster than the nearest neighbor method, which is a well-known conventional fast scheme.

Novel Method for Face Recognition using Laplacian of Gaussian Mask with Local Contour Pattern

  • Jeon, Tae-jun;Jang, Kyeong-uk;Lee, Seung-ho
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.11 / pp.5605-5623 / 2016
  • We propose a face recognition method that utilizes the LCP face descriptor. The proposed method applies a LoG mask to extract a face contour response, and employs the LCP algorithm to produce a binary pattern representation that ensures high recognition performance even under changes in illumination, noise, and aging. The proposed LCP algorithm reduces noise well and efficiently removes unnecessary information from the face by extracting a face contour response using the LoG mask, whose behavior is similar to that of the human eye. The majority of reported algorithms search for face contour response information. In contrast, the proposed LCP algorithm expresses major facial information with only 8 bits by applying a threshold value to the search area. Therefore, compared to previous approaches, the LCP algorithm maintains consistent accuracy under varying circumstances, and produces a high face recognition rate with a relatively small feature vector. The test results indicate that the LCP algorithm produces a higher face recognition rate than human visual recognition, and outperforms the existing methods.

Stochastic Optimization Approach for Parallel Expansion of the Existing Water Distribution Systems (추계학적 최적화방법에 의한 기존관수로시스템의 병열관로 확장)

  • Ahn, Tae-Jin;Choi, Gye-Woon;Park, Jung-Eung
    • Water for future / v.28 no.2 / pp.169-180 / 1995
  • The cost of a looped pipe network is affected by a set of loop flows. The mathematical model for optimizing the looped pipe network is expressed in terms of the optimal set of loop flows so that a stochastic optimization method can be applied. Because the feasible region of the looped pipe network problem is nonconvex with multiple local optima, the Modified Stochastic Probing Method is suggested to efficiently search the feasible region. The method consists of two phases: i) a global search phase (the stochastic probing method) and ii) a local search phase (the nearest neighbor method). While the global search sequentially improves a local minimum, the local search escapes from a local minimum in which the global search phase is trapped and also refines the final solution. In order to test the method, a standard test problem from the literature is considered for the optimal design of the parallel expansion of an existing network. The optimal solutions thus found have significantly smaller costs than those reported previously by other researchers.

An Advanced Scheme for Searching Spatial Objects and Identifying Hidden Objects (숨은 객체 식별을 위한 향상된 공간객체 탐색기법)

  • Kim, Jongwan;Cho, Yang-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.7 / pp.1518-1524 / 2014
  • In this paper, a new spatial query method called Surround Search (SuSe) is suggested. This method makes it possible to search from a query point for the spatial object of interest closest to the user. SuSe differs from existing spatial object query schemes in that it locates the closest spatial object of interest around the query point. While SuSe searches the surroundings, the spatial objects are stored in an R-tree, and MINDIST, the distance between the query location and objects, is measured by considering an angle that existing spatial object query methods have not considered. The angle between target objects is computed from the query point in order to distinguish an object hidden behind another object. The distinct feature of the proposed scheme is that it can find faraway or hidden objects, in contrast to the existing method. SuSe is able to search for spatial objects more precisely, and it is expected to outperform its predecessor.
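For readers unfamiliar with MINDIST, the sketch below shows the conventional definition commonly used with R-trees: the minimum Euclidean distance from a query point to an axis-aligned bounding rectangle. It does not include the angle consideration that the abstract attributes to SuSe, whose details are not given here. The function name and argument layout are assumptions made for this example.

```python
def mindist(point, rect_min, rect_max):
    """Minimum Euclidean distance from a point to an axis-aligned rectangle.

    rect_min and rect_max hold the lower and upper corner coordinates.
    The result is 0.0 when the point lies inside or on the rectangle.
    """
    total = 0.0
    for p, lo, hi in zip(point, rect_min, rect_max):
        if p < lo:
            d = lo - p      # point is below the rectangle on this axis
        elif p > hi:
            d = p - hi      # point is above the rectangle on this axis
        else:
            d = 0.0         # point is within the rectangle's extent on this axis
        total += d * d
    return total ** 0.5
```

In a nearest neighbor search over an R-tree, a value like this is typically used to order and prune subtrees: a node whose rectangle is farther from the query point than the best object found so far cannot contain a closer object.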

An Efficient Continuous Nearest Neighbor Search Scheme Using the Slab (슬랩을 이용한 효율적인 연속적 최근접 이운 탐색기법)

  • 한석;박광진;김종완;황종선
    • Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.226-228 / 2004
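  • Recently, interest in location-based services (LBS), which use the location information of moving objects, has been increasing. Traditionally, spatial objects with static location information have been stored and managed in a GIS (Geographic Information System) server. Because the location of a moving object changes very frequently over time and its location information is continually updated, moving objects are difficult to manage with a traditional GIS server. In this paper, we use slabs to solve a problem of existing continuous nearest neighbor search schemes, in which the search space and computation cost grow depending on the order in which data are processed. The search region is reduced by searching only the interior of the slab, which is the space between the vertical extension lines of the nearest neighbors, and the computation cost is reduced by processing only the points inside it.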

A Genetic Algorithm for the Traveling Salesman Problem Using Prufer Number (Prufer 수를 이용한 외판원문제의 유전해법)

  • 이재승;신해웅;강맹규
    • Journal of Korean Society of Industrial and Systems Engineering / v.20 no.41 / pp.1-14 / 1997
  • This study proposes a genetic algorithm using the Prüfer number for the traveling salesman problem (PNGATSP). Nearest neighbor nodes are mixed with randomly selected nodes when generating initial solutions. The proposed PNGATSP adopts a few ideas that differ from traditional genetic algorithms. For instance, an exponential fitness function and elitism are used, and the Prüfer number is used for encoding the TSP. Genetic operators are selected by experiments that identify which of four combinations of conventional and new genetic operators yields a good solution. For each combination, a robust set of parameters is determined by an experimental design approach. The features of the Prüfer number code for the TSP and the search power of a GA using the Prüfer number are analyzed. The best combination is OX (order crossover) and swap, which is superior to the other tested combinations of genetic operators by a deviation of 1.0%~12.8%.
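As background for the encoding mentioned in this abstract, the sketch below shows the standard Prüfer correspondence between a labeled tree on n nodes and a sequence of n-2 labels. A Hamiltonian path is one such tree. How the paper maps Prüfer numbers to TSP tours and applies genetic operators to them is not described here, so the code covers only the generic encode and decode steps. The function names and the 0-based node labels are assumptions made for this example.

```python
def prufer_encode(edges, n):
    """Encode a labeled tree on nodes 0..n-1 as its Prüfer sequence."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seq = []
    for _ in range(n - 2):
        # Remove the smallest-labeled leaf and record its only neighbor.
        leaf = min(v for v in adj if len(adj[v]) == 1)
        neighbor = next(iter(adj[leaf]))
        seq.append(neighbor)
        adj[neighbor].remove(leaf)
        del adj[leaf]
    return seq


def prufer_decode(seq):
    """Rebuild the edge list of the tree encoded by a Prüfer sequence."""
    n = len(seq) + 2
    # A node's degree is one more than its number of occurrences in seq.
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        # Attach the smallest-labeled current leaf to the next sequence entry.
        leaf = min(u for u in range(n) if degree[u] == 1)
        edges.append((leaf, v))
        degree[leaf] -= 1
        degree[v] -= 1
    # Exactly two nodes of degree 1 remain; they form the last edge.
    u, w = [x for x in range(n) if degree[x] == 1]
    edges.append((u, w))
    return edges
```

For example, the path 0-1-2-3 encodes to the sequence `[1, 2]`, and decoding `[1, 2]` returns the same three edges. Every sequence of n-2 labels decodes to a valid tree, which makes the encoding convenient for genetic operators, but a decoded tree is not necessarily a path.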

Density-based Outlier Detection for Very Large Data (대용량 자료 분석을 위한 밀도기반 이상치 탐지)

  • Kim, Seung;Cho, Nam-Wook;Kang, Suk-Ho
    • Journal of the Korean Operations Research and Management Science Society / v.35 no.2 / pp.71-88 / 2010
  • A density-based outlier detection method such as LOF (Local Outlier Factor) tries to find outlying observations by using the density of their surrounding space. Despite several advantages of density-based outlier detection, its computational complexity has been one of the major barriers to its application. In this paper, we present an LOF algorithm that reduces the computation time of density-based outlier detection. kd-tree indexing and an approximate k-nearest neighbor search algorithm (ANN) are adopted in the proposed method. A set of experiments was conducted to examine the performance of the proposed algorithm. The results show that the proposed method can effectively detect local outliers in reduced computation time.
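To make the LOF computation concrete, the sketch below follows the usual definition: k-distance, reachability distance, local reachability density, and the ratio of neighbor densities to a point's own density. It deliberately uses a brute-force neighbor search rather than the kd-tree and approximate k-NN search adopted in the paper, so it shows what is being computed, not how the paper speeds it up. It also assumes distinct points and takes exactly k neighbors per point, ignoring distance ties. The function names are assumptions made for this example.

```python
import math


def knn(points, i, k):
    """Brute-force k nearest neighbors of points[i] as (distance, index) pairs."""
    dists = sorted((math.dist(points[i], points[j]), j)
                   for j in range(len(points)) if j != i)
    return dists[:k]


def lof_scores(points, k):
    """Local Outlier Factor of every point; values well above 1 suggest outliers."""
    n = len(points)
    neighbors = [knn(points, i, k) for i in range(n)]
    # k-distance: distance to the k-th nearest neighbor.
    k_dist = [nb[-1][0] for nb in neighbors]
    # Local reachability density: inverse of the mean reachability distance,
    # where reach(i, j) = max(k_dist[j], dist(i, j)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbors divided by the point's own density.
    return [sum(lrd[j] for _, j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]
```

The neighbor search dominates the cost: this brute-force version computes all pairwise distances, which is the quadratic bottleneck that an index structure or approximate search is meant to avoid.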