• Title/Summary/Keyword: k-Nearest Neighbor Classification


Analysis of Texture Features and Classifications for the Accurate Diagnosis of Prostate Cancer (전립선암의 정확한 진단을 위한 질감 특성 분석 및 등급 분류)

  • Kim, Cho-Hee;So, Jae-Hong;Park, Hyeon-Gyun;Madusanka, Nuwan;Deekshitha, Prakash;Bhattacharjee, Subrata;Choi, Heung-Kook
    • Journal of Korea Multimedia Society / v.22 no.8 / pp.832-843 / 2019
  • Prostate cancer is a high-risk disease with a high incidence that occurs only in men. Accurate diagnosis is necessary because the number of cancer patients is increasing. Because the progression of prostate cancer is also difficult to predict, it is necessary to anticipate it in advance through prognosis. Therefore, this paper attempts grade classification based on texture feature extraction. Two main classification methods are used. The first uses one-way Analysis of Variance (ANOVA) to determine whether texture features are significant, compares them with all texture features, and then performs only one classification, i.e. Benign versus. The second method consists of more detailed classifications, without ANOVA, for better analysis between different grades. The results of both methods are compared and analyzed using machine learning models such as Support Vector Machine and K-Nearest Neighbor. The best result, obtained with the second method for Benign versus Grade 4&5, was an accuracy of 90.0 percent.
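
The ANOVA screening step mentioned in this abstract can be illustrated with a minimal sketch of the one-way ANOVA F statistic for a single feature. The function name and the toy values are hypothetical; the paper's actual texture features and significance level are not given in the abstract.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for one feature.

    groups: list of lists, one list of feature values per class
    (hypothetical layout, e.g. one list per tumour grade).
    """
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand = mean(x for g in groups for x in g)
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the samples around their own class mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy values only; a large F suggests the feature separates the classes.
f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A feature would typically be kept when its F statistic exceeds the critical value for the chosen significance level; that comparison is left out here.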

Classification of Surface Defects on Steel Strip by KNN Classifier (KNN 분류기에 의한 강판 표면 결함의 분류)

  • Kim C.H.;Choi S.H.;Joo W.J.;Kim K.B.
    • Proceedings of the Korean Society of Precision Engineering Conference / 2005.10a / pp.379-383 / 2005
  • This paper proposes a new steel strip surface inspection system. The system acquires bright- and dark-field images of defects using a stroboscopic IR LED light and an area camera system, and the defect images are preprocessed and segmented in real time for feature extraction. A total of 4113 defect samples of cold rolled steel strips are used to develop a KNN (k-Nearest Neighbor) classifier that classifies the defects into 8 different types. The developed KNN classifier demonstrates a classification performance of about 85%, which is considered a very plausible result.


kNNDD-based One-Class Classification by Nonparametric Density Estimation (비모수 추정방법을 활용한 kNNDD의 이상치 탐지 기법)

  • Son, Jung-Hwan;Kim, Seoung-Bum
    • Journal of Korean Institute of Industrial Engineers / v.38 no.3 / pp.191-197 / 2012
  • One-class classification (OCC) is one of the recently growing areas in data mining and pattern recognition. In the present study we examine the k-nearest neighbors data description (kNNDD) algorithm, one of the widely used OCC algorithms. In particular, we propose using nonparametric estimation methods to determine the threshold of the kNNDD algorithm. A simulation study was conducted to explore the characteristics of the proposed approach and to compare it with the existing approach for determining the threshold. The results demonstrate the usefulness and flexibility of the proposed approach.
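
As a rough illustration of a k-NN distance score with a data-driven threshold, the sketch below scores a point by the distance to its k-th nearest target-class sample and sets the threshold from an empirical quantile of leave-one-out training scores. The empirical quantile is only a simple stand-in assumed here; it is not the nonparametric estimator proposed in the paper, and all function names are hypothetical.

```python
import math

def kth_nn_distance(point, data, k):
    """Distance from point to its k-th nearest neighbour in data."""
    dists = sorted(math.dist(point, other) for other in data)
    return dists[k - 1]

def fit_threshold(train, k, quantile=0.95):
    """Empirical quantile of leave-one-out k-NN distances (assumed rule)."""
    scores = sorted(
        kth_nn_distance(p, train[:i] + train[i + 1:], k)
        for i, p in enumerate(train)
    )
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]

def is_outlier(point, train, k, threshold):
    """Flag a point whose k-NN distance exceeds the fitted threshold."""
    return kth_nn_distance(point, train, k) > threshold
```

Only target-class samples are needed for training, which is what makes the approach one-class.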

Evaluation of Polycystic Ovary Syndrome Classification Model Using Machine Learning (머신러닝을 이용한 다낭성 난소 증후군 분류 모델 평가)

  • So-Young Jo;Soo-Young Ye
    • Journal of Radiation Industry / v.18 no.3 / pp.173-176 / 2024
  • In this paper, general characteristics, blood tests, and ultrasound examination results were used to classify the presence of polycystic ovary syndrome (PCOS). The classification algorithms used were SVM (Support Vector Machine) and k-NN (k-Nearest Neighbors). Out of a total of 300 data samples, 210 were used as training data and 90 as test data. The results showed that SVM achieved higher accuracy compared to k-NN, confirming its greater utility in diagnosing the presence of PCOS. Future research is expected to improve classification performance by incorporating various additional indicators and securing more data. Additionally, it is expected to serve as a foundational resource for predicting and classifying other diseases.

Identification of Plastic Wastes by Using Fuzzy Radial Basis Function Neural Networks Classifier with Conditional Fuzzy C-Means Clustering

  • Roh, Seok-Beom;Oh, Sung-Kwun
    • Journal of Electrical Engineering and Technology / v.11 no.6 / pp.1872-1879 / 2016
  • Techniques for recycling and reusing plastics attract public attention, and this attention and demand drive improvements in recycling techniques. However, the identification of black plastic wastes still has a major problem: the spectrum extracted by near infrared radiation spectroscopy is not clear and is contaminated by noise. To overcome this problem, we apply Raman spectroscopy to extract a clear spectrum of the plastic material. In addition, to improve the classification ability of fuzzy Radial Basis Function Neural Networks, we apply a supervised-learning-based clustering method instead of an unsupervised clustering method. The conditional fuzzy C-Means clustering method, a supervised-learning-based clustering algorithm, is used to determine the locations of the radial basis functions. It analyzes the data distribution over the input space under the supervision of auxiliary information, which is defined using a k-Nearest Neighbor approach.

Comparison of the performance of classification algorithms using cytotoxicity data (세포독성 자료를 이용한 분류 알고리즘 성능 비교)

  • Yoon, Yeochang;Jeung, Eui Bae;Jo, Na Rae;Ju, Su In;Lee, Sung Duck
    • The Korean Journal of Applied Statistics / v.31 no.3 / pp.417-426 / 2018
  • An alternative developmental toxicity test using embryoid bodies derived from mouse embryonic stem cells has been developed. This alternative method does not administer chemicals to animals but instead treats cells with chemicals. This study suggests the use of Discriminant Analysis, Support Vector Machine, Artificial Neural Network, and k-Nearest Neighbor. Algorithm performance was compared using accuracy and a weighted Cohen's kappa coefficient. In the application, these classification techniques were applied to cytotoxicity data to classify drug toxicity, and the results were compared.
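
A weighted Cohen's kappa can be computed as in the sketch below. Linear disagreement weights are an assumption made here for illustration; the abstract does not state which weighting scheme the authors used, and the function name is hypothetical.

```python
def weighted_kappa(y_true, y_pred, n_classes):
    """Weighted Cohen's kappa with linear weights |i - j| / (n_classes - 1).

    Labels are assumed to be integers 0 .. n_classes - 1 on an ordinal scale.
    """
    n = len(y_true)
    # Observed joint proportions of (true, predicted) label pairs.
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1 / n
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    weighted_observed = weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = abs(i - j) / (n_classes - 1)
            weighted_observed += w * observed[i][j]
            # Expected proportion under chance agreement.
            weighted_expected += w * row[i] * col[j]
    return 1.0 - weighted_observed / weighted_expected
```

Unlike plain accuracy, this measure penalizes a prediction more heavily the further it falls from the true class on the ordinal scale.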

Classification of Surface Defect on Steel Strip by KNN Classifier (KNN 분류기에 의한 강판 표면 결함의 분류)

  • Kim Cheol-Ho;Choi Se-Ho;Kim Gi-Bum;Joo Won-Jong
    • Journal of the Korean Society for Precision Engineering / v.23 no.8 s.185 / pp.80-88 / 2006
  • This paper proposes a new steel strip surface inspection system. The system acquires bright- and dark-field images of defects using a stroboscopic IR LED illuminator and an area camera system, and the defect images are preprocessed and segmented in real time for feature extraction. A total of 4113 defect samples of hot rolled steel strip are used to develop a KNN (k-Nearest Neighbor) classifier that classifies the defects into 8 different types. The developed KNN classifier demonstrates a classification performance of about 85%, which is considered a very plausible result.
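
For readers unfamiliar with the classifier named here, a minimal k-NN majority-vote sketch follows. It assumes Euclidean distance on already extracted feature vectors; the paper's actual features, distance measure, and value of k are not given in the abstract, and the function name is hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest samples."""
    # Sort the labelled samples by Euclidean distance to the query.
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

With 8 defect types, the same function applies unchanged; only the label set grows.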

Prediction of arrhythmia using multivariate time series data (다변량 시계열 자료를 이용한 부정맥 예측)

  • Lee, Minhai;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.671-681 / 2019
  • Studies on predicting arrhythmia using machine learning have been actively conducted as the number of arrhythmia patients increases. Existing studies have predicted arrhythmia based on multivariate data of feature variables extracted from RR interval data at a specific time point. In this study, we consider that the pattern of changes in the heart state over time can be important information for arrhythmia prediction. Therefore, we investigate the usefulness of predicting arrhythmia with multivariate time series data obtained by extracting and accumulating the multivariate vectors of the feature variables at various time points. Using the 1-nearest neighbor classification method and its ensemble for comparison, we confirm that the method based on multivariate time series data can achieve better classification performance than the method based on multivariate data, provided that an appropriate time series distance function is selected.
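
The idea of 1-nearest neighbor classification with a time series distance can be sketched as follows. Dynamic time warping (DTW) is used purely as an assumed example of such a distance; the abstract does not name the distance functions the authors compared, and the function names are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i items of a and first j of b.
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat b[j-1]
                cost[i][j - 1],      # repeat a[i-1]
                cost[i - 1][j - 1],  # advance both
            )
    return cost[len(a)][len(b)]

def one_nn_predict(train_series, train_labels, query):
    """Return the label of the training series closest to query under DTW."""
    best = min(
        range(len(train_series)),
        key=lambda i: dtw_distance(train_series[i], query),
    )
    return train_labels[best]
```

Each series is a list of feature vectors, one per time point, which matches the accumulated multivariate vectors described in the abstract.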

A Classification Algorithm Based on Data Clustering and Data Reduction for Intrusion Detection System over Big Data

  • Wang, Qiuhua;Ouyang, Xiaoqin;Zhan, Jiacheng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.7 / pp.3714-3732 / 2019
  • With the rapid development of networks, the Intrusion Detection System (IDS) plays an increasingly important role in network applications. Many data mining algorithms are used to build IDSs. However, with the advent of the big data era, massive amounts of data are generated. When dealing with large-scale data sets, most data mining algorithms suffer from a high computational burden, which makes the IDS much less efficient. To build an efficient IDS over big data, we propose a classification algorithm based on data clustering and data reduction. In the training stage, the training data are divided into clusters of similar size by the Mini Batch K-Means algorithm, and the center of each cluster is used as its index. Then, we select representative instances for each cluster to perform data reduction and use the clusters consisting of representative instances to build a K-Nearest Neighbor (KNN) detection model. In the detection stage, we sort the clusters according to the distances between the test sample and the cluster indexes, and obtain the k nearest clusters, in which we find the k nearest neighbors. Experimental results show that searching for neighbors by cluster indexes reduces the computational complexity significantly, and that classification with the reduced data of representative instances not only improves efficiency but also maintains high accuracy.
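
The detection stage described in this abstract might look roughly like the sketch below. It assumes the clusters and their centers are already available; the Mini Batch K-Means training and the selection of representative instances are not shown, and the data layout and function name are hypothetical.

```python
import math
from collections import Counter

def cluster_knn_predict(clusters, query, k=3):
    """Classify query by searching only the k clusters with the nearest centers.

    clusters: list of (center, members), where members is a list of
    (point, label) pairs (assumed layout).
    """
    # Step 1: rank clusters by the distance between the query and each center.
    nearest_clusters = sorted(clusters, key=lambda c: math.dist(c[0], query))[:k]
    # Step 2: look for neighbours only inside those clusters.
    candidates = [m for _, members in nearest_clusters for m in members]
    neighbours = sorted(candidates, key=lambda m: math.dist(m[0], query))[:k]
    # Step 3: majority vote among the neighbours found.
    return Counter(label for _, label in neighbours).most_common(1)[0][0]
```

The saving comes from step 1: distances are computed to every center but only to the members of the selected clusters, not to the whole training set.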

Method for Assessing Landslide Susceptibility Using SMOTE and Classification Algorithms (SMOTE와 분류 기법을 활용한 산사태 위험 지역 결정 방법)

  • Yoon, Hyung-Koo
    • Journal of the Korean Geotechnical Society / v.39 no.6 / pp.5-12 / 2023
  • Proactive assessment of landslide susceptibility is necessary for minimizing casualties. This study proposes a methodology for classifying the landslide safety factor using classification algorithms based on machine learning techniques. The high-risk area model is adopted to perform the classification, and eight geotechnical parameters are adopted as inputs. Four classification algorithms (decision tree, k-nearest neighbor, logistic regression, and random forest) are employed to compare classification accuracy for safety factors ranging between 1.2 and 2.0. Notably, high accuracy is demonstrated in the safety factor range of 1.2~1.7, but relatively low accuracy is obtained in the range of 1.8~2.0. To overcome this issue, the synthetic minority over-sampling technique (SMOTE) is adopted to generate additional data. The application of SMOTE improves the average accuracy by ~250% in the safety factor range of 1.8~2.0. The results demonstrate that the SMOTE algorithm improves the accuracy of classification algorithms when applied to geotechnical data.
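
The core interpolation step of SMOTE can be sketched in a few lines. This is a simplified illustration under assumed parameters (k, seed, toy points), not the configuration used in the study, and the function name is hypothetical.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the chosen sample.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Toy minority-class points; each new sample lies between two existing ones.
new_points = smote([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], n_new=5)
```

Because every synthetic sample lies on a segment between two real minority samples, the under-represented range gains training data without exact duplicates.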