• Title/Summary/Keyword: K-Nearest Neighbor algorithm


A study on neighbor selection methods in k-NN collaborative filtering recommender system (근접 이웃 선정 협력적 필터링 추천시스템에서 이웃 선정 방법에 관한 연구)

  • Lee, Seok-Jun
    • Journal of the Korean Data and Information Science Society / v.20 no.5 / pp.809-818 / 2009
  • Collaborative filtering predicts an active user's preference for specific items traded in e-commerce by using other users' preference information. To improve prediction accuracy, collaborative filtering needs enough user preference information. However, too much preference information can degrade prediction accuracy, and too little can degrade it as well. This research proposes a method for deciding a suitable number of neighbor users for the collaborative filtering algorithm, improving on existing k-nearest-neighbor selection methods. The results provide useful methods for improving prediction accuracy and refine the exploratory data analysis approach for deciding an appropriate number of nearest neighbors.
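
To make the role of the neighbor count concrete, here is a minimal sketch of user-based k-NN collaborative filtering. It is not the method from the paper: the Pearson similarity, the mean-centered weighted average, and the toy `ratings` data are all assumptions chosen for illustration. Changing `k` changes which neighbors contribute and therefore the predicted rating.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def user_mean(user_ratings):
    """Average of all ratings given by one user."""
    return sum(user_ratings.values()) / len(user_ratings)


def predict(ratings, user, item, k):
    """Predict `user`'s rating of `item` from the k most similar raters."""
    base = user_mean(ratings[user])
    scored = [
        (pearson(ratings[user], other), name)
        for name, other in ratings.items()
        if name != user and item in other
    ]
    # Keep the k most similar users, then drop non-positive similarities.
    top = sorted(scored, key=lambda t: t[0], reverse=True)[:k]
    top = [(s, name) for s, name in top if s > 0]
    total = sum(s for s, _ in top)
    if total == 0:
        return base  # no usable neighbors: fall back to the user's own mean
    offset = sum(
        s * (ratings[name][item] - user_mean(ratings[name])) for s, name in top
    )
    return base + offset / total


# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 5},
    "cat": {"a": 1, "b": 5, "c": 2, "d": 1},
    "dan": {"a": 5, "b": 3, "c": 4, "d": 4},
}

for k in (1, 2):
    print(k, predict(ratings, "ann", "d", k))
```

With this toy data the prediction for `k=1` differs from the one for `k=2`, which is the sensitivity the abstract is concerned with.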


Performance analysis of maximum likelihood detection for the spatial multiplexing system with multiple antennas (다중 안테나를 갖는 공간 다중화 시스템을 위한 maximum likelihood 검출기의 성능 분석)

  • Shin Myeongcheol;Song Young Seog;Kwon Dong-Seung;Seo Jeongtae;Lee Chungyong
    • Journal of the Institute of Electronics Engineers of Korea TC / v.42 no.12 / pp.103-110 / 2005
  • The performance of maximum likelihood (ML) detection over a given channel is analyzed for a spatially multiplexed MIMO system. To obtain the vector symbol error rate, we define error vectors that represent the geometrical relation between lattice points. The properties of the error vectors are analyzed to show that all lattice points in an infinite lattice almost surely have four nearest neighbors after random channel transformation. Using this information and the minimum distance obtained by the modified sphere decoding algorithm, we formulate the analytical vector symbol error performance over the given channel. To verify the result, we simulate ML performance over various random channels, classified into three categories: unitary, dense, and sparse. The simulation results verify that the derived analytical result closely approximates the performance of the ML detector over all of the random MIMO channels.

Classification Protein Subcellular Locations Using n-Gram Features (단백질 서열의 n-Gram 자질을 이용한 세포내 위치 예측)

  • Kim, Jinsuk
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.12-16 / 2007
  • The function of a protein is closely correlated with its subcellular location(s). Given a protein sequence, determining its subcellular location is therefore a vitally important problem. We have developed a new prediction method for protein subcellular location(s), based on n-gram feature extraction and the k-nearest neighbor (kNN) classification algorithm. It assigns a protein sequence to one or more subcellular compartments based on the locations of the top k sequences that show the highest similarity weights against the input sequence. The similarity weight is a similarity measure determined by comparing the n-gram features of two sequences. Currently, our method extracts penta-grams as features of protein sequences, computes scores for the potential localization site(s) using the kNN algorithm, and finally presents the locations and their associated scores. We constructed a large-scale data set of protein sequences with known subcellular locations from the SWISS-PROT database. This data set contains 51,885 entries with one or more known subcellular locations. Our method shows a very high prediction precision of about 93% on this data set and, compared with another method, it also showed a comparable prediction improvement on a test collection used in previous work.
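
The pipeline described above (penta-gram extraction, similarity against labeled sequences, score accumulation over the top k) can be sketched as follows. The abstract does not define the similarity weight, so the shared n-gram count used here is an assumption, as are the function names and the toy sequences.

```python
from collections import Counter, defaultdict


def ngrams(seq, n=5):
    """Multiset of overlapping n-grams in a sequence (penta-grams by default)."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def similarity(a, b):
    """Number of shared n-grams; an assumed stand-in for the similarity weight."""
    return sum((a & b).values())


def knn_locations(query, labeled, k=3, n=5):
    """Score each location by summing similarities over the top-k sequences."""
    q = ngrams(query, n)
    scored = [(similarity(q, ngrams(seq, n)), locs) for seq, locs in labeled]
    top = sorted(scored, key=lambda t: t[0], reverse=True)[:k]
    scores = defaultdict(int)
    for sim, locs in top:
        for loc in locs:  # a sequence may carry several locations
            scores[loc] += sim
    return dict(scores)


# Hypothetical labeled sequences: (sequence, list of known locations).
labeled = [
    ("MKTAYIAKQR", ["nucleus"]),
    ("MKTAYIAKQL", ["nucleus", "cytoplasm"]),
    ("GGGGGGGGGG", ["membrane"]),
]

print(knn_locations("MKTAYIAKQR", labeled, k=2))
```

Because each neighbor contributes to every location it is annotated with, a query can receive scores for more than one compartment, matching the multi-location output the abstract describes.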


A Method of Highspeed Similarity Retrieval based on Self-Organizing Maps (자기 조직화 맵 기반 유사화상 검색의 고속화 수법)

  • Oh, Kun-Seok;Yang, Sung-Ki;Bae, Sang-Hyun;Kim, Pan-Koo
    • The KIPS Transactions:PartB / v.8B no.5 / pp.515-522 / 2001
  • Feature-based similarity retrieval has become an important research issue in image database systems. The features of image data are useful for discriminating between images. In this paper, we propose a high-speed k-nearest neighbor search algorithm based on Self-Organizing Maps. A Self-Organizing Map (SOM) provides a mapping from high-dimensional feature vectors onto a two-dimensional space. A topological feature map preserves the mutual relations (similarity) of the input data in feature space and clusters mutually similar feature vectors in neighboring nodes. Each node of the topological feature map holds a node vector and the similar images closest to that node vector. We implemented k-NN search for similar image classification by (1) accessing the topological feature map and (2) applying a pruning strategy for high-speed search. We evaluated the performance of our algorithm using color feature vectors extracted from images, and the experiments produced promising results.


An Algorithm of Curved Hull Plates Classification for the Curved Hull Plates Forming Process (곡가공 프로세스를 고려한 곡판 분류 알고리즘)

  • Noh, Ja-Ckyou;Shin, Jong-Gye
    • Journal of the Society of Naval Architects of Korea / v.46 no.6 / pp.675-687 / 2009
  • In general, the forming process for curved hull plates consists of subtasks such as roll bending, line heating, and triangle heating. To complement the automated curved hull forming system, an algorithm is needed that classifies the curved hull plates of a ship into standard shapes according to the forming techniques involved. In this paper, the curved hull plates are classified into four standard shapes (saddle, convex, flat, and cylindrical) and combinations of them, each related to the forming tasks necessary to produce the shape. In preprocessing, the Gaussian curvature and the mean curvature are calculated at the mid-point of each mesh of the surface modeled with Coons patches. The nearest neighbor method is then applied to classify the input plate type. The developed algorithm was verified in tests with sample plates from real ship data.
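
As a rough illustration of why the two curvatures are useful features, the sketch below labels a single surface point from its principal curvatures using the standard sign rules for Gaussian curvature K = k1·k2 and mean curvature H = (k1 + k2)/2. This is a swapped-in rule-based labeling, not the nearest neighbor step of the paper, and the tolerance `eps` and the mapping of K > 0 to the label "convex" are assumptions.

```python
def curvatures(k1, k2):
    """Gaussian and mean curvature from the two principal curvatures."""
    return k1 * k2, (k1 + k2) / 2.0


def shape_label(k1, k2, eps=1e-9):
    """Label a surface point by the signs of its Gaussian and mean curvature."""
    gauss, mean = curvatures(k1, k2)
    if gauss < -eps:
        return "saddle"       # principal curvatures of opposite sign
    if gauss > eps:
        return "convex"       # both principal curvatures of the same sign
    if abs(mean) <= eps:
        return "flat"         # both principal curvatures vanish
    return "cylindrical"      # exactly one principal curvature vanishes


for pair in [(0.1, 0.2), (0.1, -0.2), (0.0, 0.0), (0.0, 0.3)]:
    print(pair, shape_label(*pair))
```

A plate that mixes labels across its mesh points would correspond to one of the combined shape types mentioned in the abstract.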

Estimation of Aboveground Biomass Carbon Stock in Danyang Area using kNN Algorithm and Landsat TM Seasonal Satellite Images (kNN 알고리즘과 계절별 Landsat TM 위성영상을 이용한 단양군 지역의 지상부 바이오매스 탄소저장량 추정)

  • Jung, Jae-Hoon;Heo, Joon;Yoo, Su-Hong;Kim, Kyung-Min;Lee, Jung-Bin
    • Journal of Korean Society for Geospatial Information Science / v.18 no.4 / pp.119-129 / 2010
  • The joint use of remotely sensed data and field measurements has been widely adopted to estimate aboveground carbon stock in many countries. Recently, the Korea Forest Research Institute developed new carbon emission factors for each kind of tree, making more accurate estimates possible. In this study, the aboveground carbon stock of the Danyang area in South Korea was estimated using the k-Nearest Neighbor (kNN) algorithm with the 5th National Forest Inventory (NFI) data. Considering the spectral response of forested areas under the climate of the Korean Peninsula, which has four distinct seasons, seasonal Landsat TM satellite images were collected. The estimated total carbon stock of the Danyang area ranged from 3,329,037.51 tonC to 3,542,768.49 tonC, but no seasonal trend was found.
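
In this kind of study, kNN is used as a regressor: each image pixel receives a carbon value derived from the field plots whose spectral values are most similar. The sketch below assumes Euclidean distance in spectral space and an unweighted mean over the k nearest plots; the abstract does not state the distance measure or weighting, and the plot values are invented for illustration.

```python
from math import dist


def knn_estimate(pixel, plots, k=3):
    """Mean carbon value of the k field plots nearest to `pixel` in spectral space."""
    nearest = sorted(plots, key=lambda plot: dist(pixel, plot[0]))[:k]
    return sum(carbon for _, carbon in nearest) / len(nearest)


# Hypothetical field plots: (spectral band values, carbon stock per unit area).
plots = [
    ((0.10, 0.20), 40.0),
    ((0.10, 0.25), 50.0),
    ((0.80, 0.90), 10.0),
    ((0.15, 0.20), 60.0),
]

print(knn_estimate((0.10, 0.20), plots, k=3))
```

Summing such per-pixel estimates over the study area, scaled by pixel area, would give an area total of the kind reported in the abstract.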

The Optimized Detection Range of RFID-based Positioning System using k-Nearest Neighbor Algorithm

  • Kim, Jung-Hwan;Heo, Joon;Han, Soo-Hee;Kim, Sang-Min
    • Proceedings of the Korean Association of Geographic Information Studies Conference / 2008.10a / pp.270-271 / 2008
  • Positioning technology for moving objects is an essential component of ubiquitous computing environments and applications, for which Radio Frequency Identification (RFID) has been considered a core technology for ubiquitous wireless communication. An RFID-based positioning system calculates the position of a moving object with the k-nearest neighbor (k-NN) algorithm, using k detected tags with known coordinates; k is determined by the detection range of the RFID system. In this paper, the RFID-based positioning system determines the position of a moving object not with a weight factor that depends on received signal strength, but under the assumption that tags within the detection range always operate and have the same weight, because the latter system is much more economical than the former. The tag geometries were chosen with large buildings such as office buildings, shopping malls, and warehouses in mind: a line in 1-dimensional space, a square in 2-dimensional space, and a cube in 3-dimensional space. In 1-dimensional space, the optimal detection range is determined to be 125% of the tag spacing through both analytical and numerical approaches. Here, the analytical approach is a mathematical proof and the numerical approach is a simulation in MATLAB. Because the analytical approach is very difficult in 2- and 3-dimensional space, the optimal detection range there is determined numerically: 134% of the tag spacing in 2-dimensional space and 143% in 3-dimensional space. This result can serve as a fundamental study for designing RFID-based positioning systems.
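
The 1-dimensional case can be reproduced with a small numerical sketch. It assumes evenly spaced tags on a line, ideal detection (every tag within the range is read), and a position estimate equal to the unweighted mean of the detected tag coordinates; the function names, the scan interval, and the step count are choices made for this example, not details from the paper.

```python
def estimate_position(true_pos, tags, detection_range):
    """Unweighted mean of all tag coordinates within the detection range."""
    detected = [t for t in tags if abs(t - true_pos) <= detection_range]
    if not detected:
        return None  # no tag in range: position cannot be estimated
    return sum(detected) / len(detected)


def mean_error(ratio, spacing=1.0, n_tags=11, steps=2000):
    """Average absolute error for a detection range of `ratio` * `spacing`.

    Positions are sampled away from both ends of the tag line so that
    edge effects do not distort the comparison.
    """
    tags = [i * spacing for i in range(n_tags)]
    lo, hi = 4 * spacing, 6 * spacing
    total = 0.0
    for i in range(steps):
        pos = lo + (hi - lo) * (i + 0.5) / steps
        total += abs(estimate_position(pos, tags, ratio * spacing) - pos)
    return total / steps


for ratio in (1.0, 1.25, 1.5):
    print(ratio, round(mean_error(ratio), 4))
```

Under these assumptions the mean error at a ratio of 1.25 is lower than at 1.0 or 1.5, which is consistent with the 125% figure reported for the 1-dimensional layout.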


Human activity recognition with analysis of angles between skeletal joints using a RGB-depth sensor

  • Ince, Omer Faruk;Ince, Ibrahim Furkan;Yildirim, Mustafa Eren;Park, Jang Sik;Song, Jong Kwan;Yoon, Byung Woo
    • ETRI Journal / v.42 no.1 / pp.78-89 / 2020
  • Human activity recognition (HAR) has become effective as a computer vision tool for video surveillance systems. In this paper, a novel biometric system that can detect human activities in 3D space is proposed. In order to implement HAR, joint angles obtained using an RGB-depth sensor are used as features. Because HAR is operated in the time domain, angle information is stored using the sliding kernel method. Haar-wavelet transform (HWT) is applied to preserve the information of the features before reducing the data dimension. Dimension reduction using an averaging algorithm is also applied to decrease the computational cost, which provides faster performance while maintaining high accuracy. Before the classification, a proposed thresholding method with inverse HWT is conducted to extract the final feature set. Finally, the K-nearest neighbor (k-NN) algorithm is used to recognize the activity with respect to the given data. The method compares favorably with the results using other machine learning algorithms.
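
Two of the building blocks named above, joint angles as features and a Haar step over a time window, can be sketched briefly. This is only an illustration: the joint coordinates are invented, and the unnormalized pairwise average/difference form of the Haar step is an assumption, since the abstract does not give the exact transform or window layout.

```python
from math import acos, degrees, sqrt


def joint_angle(a, b, c):
    """Angle in degrees at joint `b` between the segments b->a and b->c."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = sqrt(sum(ui * ui for ui in u)) * sqrt(sum(vi * vi for vi in v))
    if norm == 0:
        raise ValueError("joints must not coincide")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return degrees(acos(max(-1.0, min(1.0, dot / norm))))


def haar_level(signal):
    """One unnormalized Haar level: pairwise averages and pairwise details."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(x + y) / 2 for x, y in pairs]
    details = [(x - y) / 2 for x, y in pairs]
    return averages, details


# Hypothetical 3D joints: shoulder, elbow, wrist.
print(joint_angle((1, 0, 0), (0, 0, 0), (0, 1, 0)))

# Hypothetical window of one joint angle over four frames.
print(haar_level([90, 100, 120, 80]))
```

The averages halve the length of the window, which is the kind of dimension reduction the abstract refers to, while the details retain what is needed to invert the step.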

Pattern Recognition System Combining KNN rules and New Feature Weighting algorithm (KNN 규칙과 새로운 특징 가중치 알고리즘을 결합한 패턴 인식 시스템)

  • Lee Hee-Sung;Kim Euntai;Kim Dongyeon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.42 no.4 s.304 / pp.43-50 / 2005
  • This paper proposes a new pattern recognition system that combines a new adaptive feature weighting based on the genetic algorithm (GA) with modified KNN (K Nearest-Neighbor) rules. The proposed feature weighting avoids overfitting and finds a proper feature weighting value by determining the middle value of the weights using the GA. New GA operators are introduced to obtain high system performance. Moreover, a class-dependent feature weighting strategy is employed. Whereas classical methods use the same feature space for all classes, the proposed method uses a different feature space for each class. The KNN rule is modified to estimate the class of a test pattern using the adaptive feature space. Experiments were performed with the unconstrained handwritten numeral database of Concordia University in Canada to show the performance of the proposed method.
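
The idea of a class-dependent feature space can be illustrated with a small sketch in which the distance to a training sample is computed with the weight vector of that sample's class. The weights here are fixed by hand, whereas the paper obtains them with a genetic algorithm; the weighted Euclidean distance, the majority vote, and all names and values are assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def weighted_distance(x, y, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return sqrt(sum(w * (xi - yi) ** 2 for xi, yi, w in zip(x, y, weights)))


def classify(x, train, class_weights, k=3):
    """Majority vote among the k nearest samples under class-dependent weights."""
    nearest = sorted(
        train,
        key=lambda sample: weighted_distance(x, sample[0], class_weights[sample[1]]),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical training set: (feature vector, class label).
train = [((0, 0), "A"), ((0, 1), "A"), ((5, 0), "B"), ((5, 1), "B")]
query = (2.6, 0.5)

uniform = {"A": (1.0, 1.0), "B": (1.0, 1.0)}
adapted = {"A": (0.1, 1.0), "B": (1.0, 1.0)}  # class A down-weights feature 0

print(classify(query, train, uniform))
print(classify(query, train, adapted))
```

With uniform weights the query falls to class B; once class A down-weights the first feature, the same query is assigned to class A, showing how per-class weights reshape the neighborhoods.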

Security tendency analysis techniques through machine learning algorithms applications in big data environments (빅데이터 환경에서 기계학습 알고리즘 응용을 통한 보안 성향 분석 기법)

  • Choi, Do-Hyeon;Park, Jung-Oh
    • Journal of Digital Convergence / v.13 no.9 / pp.269-276 / 2015
  • Recently, as the big data industry has grown, global security companies have expanded their scope from structured to unstructured data for intelligent security threat monitoring and prevention, and they increasingly use user tendency analysis for security prevention. This is because the scope of information that can be deduced from analysis of existing structured data (quantified existing available data) is limited. This study applies machine learning algorithms (Naïve Bayes, Decision Tree, K-nearest neighbor, Apriori) in a big data environment to analyze security tendency (classification of items by purpose, positive/negative judgment, and key analysis of keyword relevance). The capability analysis confirmed that security items and specific indexes for deciding security tendency could be extracted from structured and unstructured data.