• Title/Summary/Keyword: K-nearest neighbors (KNN)

KNN-based Image Annotation by Collectively Mining Visual and Semantic Similarities

  • Ji, Qian;Zhang, Liyan;Li, Zechao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.9 / pp.4476-4490 / 2017
  • The aim of image annotation is to determine labels that accurately describe the semantic information of images. Many approaches have been proposed to automate the image annotation task while achieving good performance. However, in most cases, the semantic similarities of images are ignored. To this end, we propose a novel Visual-Semantic Nearest Neighbor (VS-KNN) method that collectively explores visual and semantic similarities for image annotation. First, for each label, the visual nearest neighbors of a given test image are constructed from the training images associated with this label. Second, each neighboring subset is determined by mining the semantic similarity and the visual similarity. Finally, the relevance between the images and labels is determined based on maximum a posteriori estimation. Extensive experiments were conducted using three widely used image datasets. The experimental results show the effectiveness of the proposed method in comparison with state-of-the-art methods.

KNN/ANN Hybrid Location Determination Algorithm for Indoor Location Base Service (실내 위치기반서비스를 위한 KNN/ANN Hybrid 측위 결정 알고리즘)

  • Lee, Jang-Jae;Jung, Min-A;Lee, Seong-Ro;Song, Iick-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.2 / pp.109-115 / 2011
  • As a fingerprinting method, k-nearest neighbor (KNN) has been widely applied to indoor positioning in wireless local area networks (WLAN), but its performance is sensitive to the number of neighbors k and the positions of the reference points (RPs). This paper therefore presents a KNN/ANN hybrid algorithm in which an artificial neural network (ANN) clustering algorithm is applied to improve KNN. For any pattern-matching-based algorithm in a WLAN environment, the signal-to-noise ratio (SNR) characteristics of multiple access points (APs) are used to build a database in the training phase; in the estimation phase, the actual two-dimensional coordinates of the mobile unit (MU) are estimated by comparing the newly recorded SNR with the fingerprints stored in the database. In the proposed algorithm, KNN first selects k RPs, based on SNR, as the data samples for the ANN. The k RPs are then classified into different clusters by the ANN based on SNR. Experimental results indicate that the proposed KNN/ANN hybrid algorithm generally outperforms the KNN algorithm when the location error is less than 2 m.
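
The KNN fingerprinting step described in this abstract can be sketched roughly as follows. This is a minimal illustration of plain KNN positioning only; it does not include the ANN clustering stage of the hybrid algorithm. The function name `knn_locate`, the radio-map layout, and all SNR values are hypothetical and are not taken from the paper.

```python
import math

def knn_locate(fingerprints, observed_snr, k=3):
    """Estimate (x, y) as the mean position of the k reference points
    whose stored SNR vectors are closest to the observed SNR vector."""
    # fingerprints: list of ((x, y), [snr_to_ap1, snr_to_ap2, ...])
    ranked = sorted(fingerprints, key=lambda fp: math.dist(fp[1], observed_snr))
    nearest = ranked[:k]
    x = sum(pos[0] for pos, _ in nearest) / len(nearest)
    y = sum(pos[1] for pos, _ in nearest) / len(nearest)
    return x, y

# Hypothetical radio map: four reference points, SNR (dB) to three APs.
radio_map = [
    ((0.0, 0.0), [30.0, 10.0, 12.0]),
    ((2.0, 0.0), [28.0, 14.0, 12.0]),
    ((0.0, 2.0), [27.0, 11.0, 16.0]),
    ((8.0, 8.0), [10.0, 30.0, 29.0]),
]
estimate = knn_locate(radio_map, [29.0, 12.0, 13.0], k=3)
```

The result depends directly on k and on where the reference points sit, which is the sensitivity the abstract attributes to plain KNN.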

Linear interpolation and Machine Learning Methods for Gas Leakage Prediction Base on Multi-source Data Integration (다중소스 데이터 융합 기반의 가스 누출 예측을 위한 선형 보간 및 머신러닝 기법)

  • Dashdondov, Khongorzul;Jo, Kyuri;Kim, Mi-Hye
    • Journal of the Korea Convergence Society / v.13 no.3 / pp.33-41 / 2022
  • In this article, we propose to predict natural gas (NG) leakage levels through feature selection based on a factor analysis (FA) of integrated Korean Meteorological Agency data and natural gas leakage data, so that complex factors are taken into account. The method is divided into three modules. First, we fill in missing data in the integrated data set using linear interpolation and select essential features using FA with OrdinalEncoder (OE)-based normalization. The dataset is then labeled by K-means clustering. The final module uses four algorithms, K-nearest neighbors (KNN), decision tree (DT), random forest (RF), and Naive Bayes (NB), to predict gas leakage levels. The proposed method is evaluated by accuracy, area under the ROC curve (AUC), and mean standard error (MSE). The test results indicate that the OrdinalEncoder-Factor analysis (OE-F)-based classification method yields an improvement. Moreover, OE-F-based KNN (OE-F-KNN) showed the best performance, with 95.20% accuracy, an AUC of 96.13%, and an MSE of 0.031.
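
The missing-data step mentioned in this abstract can be illustrated with a small linear interpolation routine. This is only a sketch: the function name `interpolate_missing`, the use of `None` for missing readings, and the choice to copy the nearest known value into leading or trailing gaps are assumptions made here, not details given in the abstract.

```python
def interpolate_missing(values):
    """Fill None entries by linear interpolation between the nearest known
    neighbours; leading or trailing gaps copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Closest known indices on each side of the gap (None if absent).
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled

# Hypothetical sensor readings with gaps.
readings = [1.0, None, None, 4.0, None]
filled = interpolate_missing(readings)
```

For the hypothetical readings above, the two interior gaps are filled with values of about 2.0 and 3.0, and the trailing gap copies the last known reading.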

An Efficient KNN Query Processing Method in Sensor Networks (센서 네트워크에서 효율적인 KNN 질의처리 방법)

  • Son, In-Keun;Hyun, Dong-Joon;Chung, Yon-Dohn;Lee, Eun-Kyu;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.32 no.4 / pp.429-440 / 2005
  • As rapid improvements in electronic technologies make sensor hardware more powerful and capable, the application range of sensor networks is becoming broader. The main purpose of sensor networks is to monitor phenomena in regions of interest (e.g., factory warehouses, disaster areas, wild fields, etc.) and return the required data. The k Nearest Neighbor (KNN) query, which finds the k objects that are geographically closest to a given point, is an important application in sensor networks. However, most previous approaches either seem impractical or are not energy-efficient in resource-limited sensor networks. In this paper, we propose an efficient KNN query processing method for sensor networks. In the proposed method, we dynamically enlarge the search boundary, if necessary, and traverse the nodes inside the boundary until the k nearest neighbors are found. Since only the representative sensor nodes are visited, our algorithm reduces the number of messages. We show through experiments that the proposed method performs better than the existing method in various network environments.
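
The idea of enlarging the search boundary until k neighbors are found can be sketched centrally as below. This is a simplified illustration under assumptions made here: node positions are known in one list, the boundary is a circle, and the radius grows by a fixed factor. It does not model representative nodes, in-network routing, or message counts, and the name `expanding_knn` is hypothetical.

```python
import math

def expanding_knn(nodes, query, k, radius=1.0, growth=2.0):
    """Grow a circular boundary around `query` until it holds at least k
    nodes, then return the k nodes nearest to the query point."""
    if k > len(nodes):
        raise ValueError("k exceeds the number of nodes")
    if radius <= 0 or growth <= 1:
        raise ValueError("radius must be positive and growth greater than 1")
    while True:
        inside = [n for n in nodes if math.dist(n, query) <= radius]
        if len(inside) >= k:
            # Every node outside the circle is farther than every node
            # inside it, so the k nearest overall are among `inside`.
            return sorted(inside, key=lambda n: math.dist(n, query))[:k]
        radius *= growth

# Hypothetical sensor positions.
sensors = [(0.0, 1.0), (3.0, 0.0), (0.0, 5.0), (10.0, 10.0)]
neighbors = expanding_knn(sensors, (0.0, 0.0), k=2)
```

Starting from a small radius keeps the number of examined nodes low when the neighbors are close, at the cost of extra rounds when they are not.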

BIM Mesh Optimization Algorithm Using K-Nearest Neighbors for Augmented Reality Visualization (증강현실 시각화를 위해 K-최근접 이웃을 사용한 BIM 메쉬 경량화 알고리즘)

  • Pa, Pa Win Aung;Lee, Donghwan;Park, Jooyoung;Cho, Mingeon;Park, Seunghee
    • KSCE Journal of Civil and Environmental Engineering Research / v.42 no.2 / pp.249-256 / 2022
  • Various studies are being actively conducted to show that real-time visualization technology combining BIM (Building Information Modeling) and AR (Augmented Reality) helps to improve construction management decision-making and processing efficiency. However, when large-capacity BIM data is projected into AR, there are various limitations, such as data transmission and connection problems and the image cut-off issue. To improve visualization efficiency, a mesh optimization algorithm based on the k-nearest neighbors (KNN) classification framework is proposed to reconstruct BIM data, in place of existing mesh optimization methods that are complicated and cannot adequately handle 3D model meshes with numerous boundaries. In the proposed algorithm, the target BIM model is optimized with Unity C# code based on triangle centroid concepts and classified using KNN. As a result, the algorithm can check the number of mesh vertices and triangles before and after optimization for the entire model and for each structure. In addition, it is able to reduce the mesh vertices of the original model by approximately 56% and the triangles by about 42%. Moreover, compared to the original model, the optimized model shows no visual differences in the model elements and information, meaning that high-performance visualization can be expected when using AR devices.

Deterministic and probabilistic analysis of tunnel face stability using support vector machine

  • Li, Bin;Fu, Yong;Hong, Yi;Cao, Zijun
    • Geomechanics and Engineering / v.25 no.1 / pp.17-30 / 2021
  • This paper develops a convenient approach for deterministic and probabilistic evaluations of tunnel face stability using support vector machine classifiers. The proposed method comprises two major steps: construction of the training dataset and determination of instance-based classifiers. In step one, an orthogonal design is used to produce representative samples after the ranges and levels of the factors that influence tunnel face stability are specified. The training dataset is then labeled by two-dimensional strength reduction analyses embedded within OptumG2. For any unknown instance, the second step applies the training dataset for classification, which is achieved by an ad hoc Python program. The classification of unknown samples starts with the selection of instance-based training samples using the k-nearest neighbors algorithm, followed by the construction of an instance-based SVM-KNN classifier. It eventually provides the labels of the unknown instances without calculating their corresponding performance functions. Probabilistic evaluations are performed by Monte Carlo simulation based on the SVM-KNN classifier. The ratio of the number of unstable samples to the total number of simulated samples is computed and taken as the failure probability, which is validated and compared with the response surface method.
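
The failure-probability estimate described at the end of this abstract (unstable samples divided by total simulated samples) can be sketched as follows. The classifier here is a toy stand-in, not the paper's SVM-KNN classifier, and the names `failure_probability`, `sample_support_ratio`, and `toy_classifier`, as well as the 0.2 threshold, are hypothetical.

```python
import random

def failure_probability(classify, sample, n_samples=20000, seed=0):
    """Monte Carlo estimate: fraction of simulated samples that the
    classifier labels as unstable."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    unstable = sum(
        1 for _ in range(n_samples) if classify(sample(rng)) == "unstable"
    )
    return unstable / n_samples

def sample_support_ratio(rng):
    # Hypothetical random input: one factor drawn uniformly from [0, 1).
    return rng.random()

def toy_classifier(ratio):
    # Stand-in for a trained classifier: unstable below a fixed threshold.
    return "unstable" if ratio < 0.2 else "stable"

pf = failure_probability(toy_classifier, sample_support_ratio)
```

Because each sample only needs a classifier call rather than a numerical stability analysis, a large number of samples remains affordable, which appears to be the motivation for the classifier-based approach.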

Design of Radial Basis Function with the Aid of Fuzzy KNN and Conditional FCM (퍼지 kNN과 Conditional FCM을 이용한 퍼지 RBF의 설계)

  • Roh, Seok-Beon;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.6 / pp.1223-1229 / 2009
  • The performance of Radial Basis Function Neural Networks depends on how the radial basis functions are set up over the input space, which is an important step in designing such networks. The existing method for initializing the locations of the radial basis functions over the input space is conditional fuzzy C-means clustering. However, researchers who use conditional fuzzy C-means clustering cannot obtain the modeling performance they expect, because it cannot project the information extracted over the output space into the input space. To compensate for this drawback, we apply a fuzzy K-nearest neighbors approach to project the auxiliary information defined over the output space into the input space without loss of information.

Expressway Travel Time Prediction Using K-Nearest Neighborhood (KNN 알고리즘을 활용한 고속도로 통행시간 예측)

  • Shin, Kangwon;Shim, Sangwoo;Choi, Keechoo;Kim, Soohee
    • KSCE Journal of Civil and Environmental Engineering Research / v.34 no.6 / pp.1873-1879 / 2014
  • There are various methodologies for forecasting travel time using real-time data, but the K-nearest neighborhood (KNN) method is generally well regarded for forecasting when enough historical data are available. The objective of this study is to evaluate the applicability of the KNN method. In this study, real-time and historical data on toll collection system (TCS) traffic flow and dedicated short range communication (DSRC) link travel time, together with historical path travel time data, are used as input data for the KNN approach. The proposed method searches the real-time and historical data for the path travel times whose TCS traffic flow and DSRC link travel time are nearest, and then calculates the predicted path travel time using a weighted average method. The results show that accuracy increases as the weight given to DSRC link travel time increases. Moreover, the trends of the forecasted and real travel times are similar. In addition, the error in forecasted travel time could be further reduced as more historical data become available in the database.
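
The nearest-neighbor search followed by a weighted average, as described in this abstract, might look roughly like the sketch below. The record layout, the feature weights `w_flow` and `w_link`, the inverse-distance weighting, and all numbers are assumptions for illustration; the abstract does not specify the distance measure or the weighting scheme, and feature scaling is omitted here.

```python
import math

def predict_travel_time(history, flow, link_time, k=2, w_flow=0.3, w_link=0.7):
    """Predict path travel time from the k historical records nearest to the
    current (TCS flow, DSRC link time) pair, using an inverse-distance
    weighted average of their path travel times."""
    # history: list of (tcs_flow, dsrc_link_time, path_travel_time)
    def distance(rec):
        return math.sqrt(
            w_flow * (rec[0] - flow) ** 2 + w_link * (rec[1] - link_time) ** 2
        )

    nearest = sorted(history, key=distance)[:k]
    # A small constant avoids division by zero on an exact match.
    weights = [1.0 / (distance(rec) + 1e-9) for rec in nearest]
    return sum(w * rec[2] for w, rec in zip(weights, nearest)) / sum(weights)

# Hypothetical records: (vehicles/hour, link minutes, path minutes).
records = [(1000, 20, 35), (1200, 25, 42), (3000, 60, 90)]
prediction = predict_travel_time(records, flow=1100, link_time=22)
```

Raising `w_link` relative to `w_flow` makes the DSRC link travel time dominate the neighbor search, which is the kind of weighting the abstract reports as improving accuracy.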

Behavior and Script Similarity-Based Cryptojacking Detection Framework Using Machine Learning (머신러닝을 활용한 행위 및 스크립트 유사도 기반 크립토재킹 탐지 프레임워크)

  • Lim, EunJi;Lee, EunYoung;Lee, IlGu
    • Journal of the Korea Institute of Information Security & Cryptology / v.31 no.6 / pp.1105-1114 / 2021
  • Due to the recent surge in the popularity of cryptocurrency, the threat of cryptojacking, in which malicious code mines cryptocurrencies, is increasing. In particular, web-based cryptojacking is easy to carry out because the attacker can mine cryptocurrencies using the victim's PC resources simply by adding mining scripts to a website that the victim accesses. A cryptojacking attack causes poor performance and malfunction. It can also cause hardware failure due to overheating and aging caused by mining. It is difficult for victims to recognize the damage caused by cryptojacking, so research is needed to detect and block it efficiently. In this work, we take representative, distinct symptoms of cryptojacking as indicators and propose a new architecture. We used a K-Nearest Neighbors (KNN) model trained on computer performance indicators as the behavior-based dynamic analysis technique. In addition, a K-means model trained on the frequency of malicious script words was used as the script similarity-based static analysis technique. The KNN model had 99.6% accuracy, and the K-means model had a silhouette coefficient of 0.61 for normal clusters.

A Study on the Drug Classification Using Machine Learning Techniques (머신러닝 기법을 이용한 약물 분류 방법 연구)

  • Anmol Kumar Singh;Ayush Kumar;Adya Singh;Akashika Anshum;Pradeep Kumar Mallick
    • Advanced Industrial SCIence / v.3 no.2 / pp.8-16 / 2024
  • This paper presents a drug classification system whose goal is to predict the appropriate drug for patients based on their demographic and physiological traits. The dataset consists of attributes such as Age, Sex, BP (Blood Pressure), Cholesterol Level, and Na_to_K (Sodium to Potassium ratio), with the objective of determining the kind of drug being given. The models used in this paper are K-Nearest Neighbors (KNN), Logistic Regression, and Random Forest. GridSearchCV with 5-fold cross-validation was used to fine-tune the hyperparameters, and each model was trained and tested on the dataset. To assess the performance of each model both with and without hyperparameter tuning, evaluation metrics such as accuracy, confusion matrices, and classification reports were used. The accuracy of the models was 0.7, 0.875, and 0.975 without GridSearchCV and 0.75, 1.0, and 0.975 with GridSearchCV. According to GridSearchCV, Logistic Regression is the most suitable model for drug classification among the three models used, followed by K-Nearest Neighbors. Also, Na_to_K is an essential feature in predicting the outcome.
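
The hyperparameter search with 5-fold cross-validation mentioned in this abstract can be illustrated for KNN with a small hand-written routine. This is a plain-Python stand-in for the idea behind a grid search, not scikit-learn's GridSearchCV; the function names, the interleaved (unshuffled, unstratified) fold assignment, and the toy data are assumptions made here.

```python
import math
from collections import Counter

def knn_predict(train, x, k):
    """Majority vote among the k training samples nearest to x."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cv_accuracy(data, k, folds=5):
    """Accuracy over all held-out samples, using interleaved folds."""
    correct = 0
    for f in range(folds):
        test = data[f::folds]
        train = [s for i, s in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)

def grid_search_k(data, candidates, folds=5):
    """Return the candidate k with the best cross-validated accuracy."""
    scores = {k: cv_accuracy(data, k, folds) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical two-class data: two well-separated clusters.
samples = [((i * 0.1, 0.0), "A") for i in range(10)]
samples += [((5.0 + i * 0.1, 5.0), "B") for i in range(10)]
best_k, cv_scores = grid_search_k(samples, candidates=[1, 3, 5])
```

In practice a library routine would also shuffle or stratify the folds and refit the best model on all the data; those steps are left out to keep the sketch short.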