• Title/Summary/Keyword: K-Nearest Neighbor algorithm

Search Result 271, Processing Time 0.03 seconds

Machine-learning Approaches with Multi-temporal Remotely Sensed Data for Estimation of Forest Biomass and Forest Reference Emission Levels (시계열 위성영상과 머신러닝 기법을 이용한 산림 바이오매스 및 배출기준선 추정)

  • Yong-Kyu, Lee;Jung-Soo, Lee
    • Journal of Korean Society of Forest Science
    • /
    • v.111 no.4
    • /
    • pp.603-612
    • /
    • 2022
  • The study aims were to evaluate a machine-learning, algorithm-based, forest biomass-estimation model to estimate subnational forest biomass and to comparatively analyze REDD+ forest reference emission levels. Time-series Landsat satellite imagery and ESA Biomass Climate Change Initiative information were used to build a machine-learning-based biomass estimation model. The k-nearest neighbors algorithm (kNN), which is a non-parametric learning model, and the tree-based random forest (RF) model were applied to the machine-learning algorithm, and the estimated biomasses were compared with the forest reference emission levels (FREL) data, which was provided by the Paraguayan government. The root mean square error (RMSE), which was the optimum parameter of the kNN model, was 35.9, and the RMSE of the RF model was lower at 34.41, showing that the RF model was superior. As a result of separately using the FREL, kNN, and RF methods to set the reference emission levels, the gradient was set to approximately -33,000 tons, -253,000 tons, and -92,000 tons, respectively. These results showed that the machine learning-based estimation model was more suitable than the existing methods for setting reference emission levels.

Performance comparison of machine learning classification methods for decision of disc cutter replacement of shield TBM (쉴드 TBM 디스크 커터 교체 유무 판단을 위한 머신러닝 분류기법 성능 비교)

  • Kim, Yunhee;Hong, Jiyeon;Kim, Bumjoo
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.22 no.5
    • /
    • pp.575-589
    • /
    • 2020
  • In recent years, Shield TBM construction has been continuously increasing in domestic tunnels. The main excavation tool in the shield TBM construction is a disc cutter which naturally wears during the excavation process and significantly degrades the excavation efficiency. Therefore, it is important to know the appropriate time of the disc cutter replacement. In this study, it is proposed a predictive model that can determine yes/no of disc cutter replacement using machine learning algorithm. To do this, the shield TBM machine data which is highly correlated to the disc cutter wears and the disc cutter replacement from the shield TBM field which is already constructed are used as the input data in the model. Also, the algorithms used in the study were the support vector machine, k-nearest neighbor algorithm, and decision tree algorithm are all classification methods used in machine learning. In order to construct an optimal predictive model and to evaluate the performance of the model, the classification performance evaluation index was compared and analyzed.

Real Time Face Detection and Recognition using Rectangular Feature based Classifier and Class Matching Algorithm (사각형 특징 기반 분류기와 클래스 매칭을 이용한 실시간 얼굴 검출 및 인식)

  • Kim, Jong-Min;Kang, Myung-A
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.1
    • /
    • pp.19-26
    • /
    • 2010
  • This paper proposes a classifier based on rectangular feature to detect face in real time. The goal is to realize a strong detection algorithm which satisfies both efficiency in calculation and detection performance. The proposed algorithm consists of the following three stages: Feature creation, classifier study and real time facial domain detection. Feature creation organizes a feature set with the proposed five rectangular features and calculates the feature values efficiently by using SAT (Summed-Area Tables). Classifier learning creates classifiers hierarchically by using the AdaBoost algorithm. In addition, it gets excellent detection performance by applying important face patterns repeatedly at the next level. Real time facial domain detection finds facial domains rapidly and efficiently through the classifier based on the rectangular feature that was created. Also, the recognition rate was improved by using the domain which detected a face domain as the input image and by using PCA and KNN algorithms and a Class to Class rather than the existing Point to Point technique.

Determination of the stage and grade of periodontitis according to the current classification of periodontal and peri-implant diseases and conditions (2018) using machine learning algorithms

  • Kubra Ertas;Ihsan Pence;Melike Siseci Cesmeli;Zuhal Yetkin Ay
    • Journal of Periodontal and Implant Science
    • /
    • v.53 no.1
    • /
    • pp.38-53
    • /
    • 2023
  • Purpose: The current Classification of Periodontal and Peri-Implant Diseases and Conditions, published and disseminated in 2018, involves some difficulties and causes diagnostic conflicts due to its criteria, especially for inexperienced clinicians. The aim of this study was to design a decision system based on machine learning algorithms by using clinical measurements and radiographic images in order to determine and facilitate the staging and grading of periodontitis. Methods: In the first part of this study, machine learning models were created using the Python programming language based on clinical data from 144 individuals who presented to the Department of Periodontology, Faculty of Dentistry, Süleyman Demirel University. In the second part, panoramic radiographic images were processed and classification was carried out with deep learning algorithms. Results: Using clinical data, the accuracy of staging with the tree algorithm reached 97.2%, while the random forest and k-nearest neighbor algorithms reached 98.6% accuracy. The best staging accuracy for processing panoramic radiographic images was provided by a hybrid network model algorithm combining the proposed ResNet50 architecture and the support vector machine algorithm. For this, the images were preprocessed, and high success was obtained, with a classification accuracy of 88.2% for staging. However, in general, it was observed that the radiographic images provided a low level of success, in terms of accuracy, for modeling the grading of periodontitis. Conclusions: The machine learning-based decision system presented herein can facilitate periodontal diagnoses despite its current limitations. Further studies are planned to optimize the algorithm and improve the results.

Intensity and Ambient Enhanced Lidar-Inertial SLAM for Unstructured Construction Environment (비정형의 건설환경 매핑을 위한 레이저 반사광 강도와 주변광을 활용한 향상된 라이다-관성 슬램)

  • Jung, Minwoo;Jung, Sangwoo;Jang, Hyesu;Kim, Ayoung
    • The Journal of Korea Robotics Society
    • /
    • v.16 no.3
    • /
    • pp.179-188
    • /
    • 2021
  • Construction monitoring is one of the key modules in smart construction. Unlike structured urban environment, construction site mapping is challenging due to the characteristics of an unstructured environment. For example, irregular feature points and matching prohibit creating a map for management. To tackle this issue, we propose a system for data acquisition in unstructured environment and a framework for Intensity and Ambient Enhanced Lidar Inertial Odometry via Smoothing and Mapping, IA-LIO-SAM, that achieves highly accurate robot trajectories and mapping. IA-LIO-SAM utilizes a factor graph same as Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping (LIO-SAM). Enhancing the existing LIO-SAM, IA-LIO-SAM leverages point's intensity and ambient value to remove unnecessary feature points. These additional values also perform as a new factor of the K-Nearest Neighbor algorithm (KNN), allowing accurate comparisons between stored points and scanned points. The performance was verified in three different environments and compared with LIO-SAM.

Flood Frequency Analysis with the consideration of the heterogeneous impacts from TC and non-TC rainfalls: application to daily flows in the Nam River Basin, South Korea

  • Alcantara, Angelika;Ahn, Kuk-Hyun
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2020.06a
    • /
    • pp.121-121
    • /
    • 2020
  • Varying dominant processes, including Tropical Cyclone (TC) and non-TC rainfall events, have been known to drive the occurrence of precipitation in South Korea. With the changes in the pattern of the Earth's climate due to anthropogenic activities, nonstationarity or changes in the magnitude and frequency of these dominant processes have been separately observed for the past decades and are expected to continue in the coming years. These changes often cause unprecedented hydrologic events such as extreme flooding which pose a greater risk to the society. This study aims to take into account a more reliable future climate condition with two dominant processes. Diverse statistical models including the hidden markov chain, K-nearest neighbor algorithm, and quantile mappings are utilized to mimic future rainfall events based on the recorded historical data with the consideration of the varying effects of TC and non-TC events. The data generated is then utilized to the hydrologic model to conduct a flood frequency analysis. Results in this study emphasize the need to consider the nonstationarity of design rainfalls to fully grasp the degree of future flooding events when designing urban water infrastructures.

  • PDF

A personalized exercise recommendation system using dimension reduction algorithms

  • Lee, Ha-Young;Jeong, Ok-Ran
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.6
    • /
    • pp.19-28
    • /
    • 2021
  • Nowadays, interest in health care is increasing due to Coronavirus (COVID-19), and a lot of people are doing home training as there are more difficulties in using fitness centers and public facilities that are used together. In this paper, we propose a personalized exercise recommendation algorithm using personalized propensity information to provide more accurate and meaningful exercise recommendation to home training users. Thus, we classify the data according to the criteria for obesity with a k-nearest neighbor algorithm using personal information that can represent individuals, such as eating habits information and physical conditions. Furthermore, we differentiate the exercise dataset by the level of exercise activities. Based on the neighborhood information of each dataset, we provide personalized exercise recommendations to users through a dimensionality reduction algorithm (SVD) among model-based collaborative filtering methods. Therefore, we can solve the problem of data sparsity and scalability of memory-based collaborative filtering recommendation techniques and we verify the accuracy and performance of the proposed algorithms.

Evaluation of Machine Learning Algorithm Utilization for Lung Cancer Classification Based on Gene Expression Levels

  • Podolsky, Maxim D;Barchuk, Anton A;Kuznetcov, Vladimir I;Gusarova, Natalia F;Gaidukov, Vadim S;Tarakanov, Segrey A
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.17 no.2
    • /
    • pp.835-838
    • /
    • 2016
  • Background: Lung cancer remains one of the most common cancers in the world, both in terms of new cases (about 13% of total per year) and deaths (nearly one cancer death in five), because of the high case fatality. Errors in lung cancer type or malignant growth determination lead to degraded treatment efficacy, because anticancer strategy depends on tumor morphology. Materials and Methods: We have made an attempt to evaluate effectiveness of machine learning algorithms in the task of lung cancer classification based on gene expression levels. We processed four publicly available data sets. The Dana-Farber Cancer Institute data set contains 203 samples and the task was to classify four cancer types and sound tissue samples. With the University of Michigan data set of 96 samples, the task was to execute a binary classification of adenocarcinoma and non-neoplastic tissues. The University of Toronto data set contains 39 samples and the task was to detect recurrence, while with the Brigham and Women's Hospital data set of 181 samples it was to make a binary classification of malignant pleural mesothelioma and adenocarcinoma. We used the k-nearest neighbor algorithm (k=1, k=5, k=10), naive Bayes classifier with assumption of both a normal distribution of attributes and a distribution through histograms, support vector machine and C4.5 decision tree. Effectiveness of machine learning algorithms was evaluated with the Matthews correlation coefficient. Results: The support vector machine method showed best results among data sets from the Dana-Farber Cancer Institute and Brigham and Women's Hospital. All algorithms with the exception of the C4.5 decision tree showed maximum potential effectiveness in the University of Michigan data set. However, the C4.5 decision tree showed best results for the University of Toronto data set. Conclusions: Machine learning algorithms can be used for lung cancer morphology classification and similar tasks based on gene expression level evaluation.

EAR: Enhanced Augmented Reality System for Sports Entertainment Applications

  • Mahmood, Zahid;Ali, Tauseef;Muhammad, Nazeer;Bibi, Nargis;Shahzad, Imran;Azmat, Shoaib
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.12
    • /
    • pp.6069-6091
    • /
    • 2017
  • Augmented Reality (AR) overlays virtual information on real world data, such as displaying useful information on videos/images of a scene. This paper presents an Enhanced AR (EAR) system that displays useful statistical players' information on captured images of a sports game. We focus on the situation where the input image is degraded by strong sunlight. Proposed EAR system consists of an image enhancement technique to improve the accuracy of subsequent player and face detection. The image enhancement is followed by player and face detection, face recognition, and players' statistics display. First, an algorithm based on multi-scale retinex is proposed for image enhancement. Then, to detect players' and faces', we use adaptive boosting and Haar features for feature extraction and classification. The player face recognition algorithm uses boosted linear discriminant analysis to select features and nearest neighbor classifier for classification. The system can be adjusted to work in different types of sports where the input is an image and the desired output is display of information nearby the recognized players. Simulations are carried out on 2096 different images that contain players in diverse conditions. Proposed EAR system demonstrates the great potential of computer vision based approaches to develop AR applications.

An effective automated ontology construction based on the agriculture domain

  • Deepa, Rajendran;Vigneshwari, Srinivasan
    • ETRI Journal
    • /
    • v.44 no.4
    • /
    • pp.573-587
    • /
    • 2022
  • The agricultural sector is completely different from other sectors since it completely relies on various natural and climatic factors. Climate changes have many effects, including lack of annual rainfall and pests, heat waves, changes in sea level, and global ozone/atmospheric CO2 fluctuation, on land and agriculture in similar ways. Climate change also affects the environment. Based on these factors, farmers chose their crops to increase productivity in their fields. Many existing agricultural ontologies are either domain-specific or have been created with minimal vocabulary and no proper evaluation framework has been implemented. A new agricultural ontology focused on subdomains is designed to assist farmers using Jaccard relative extractor (JRE) and Naïve Bayes algorithm. The JRE is used to find the similarity between two sentences and words in the agricultural documents and the relationship between two terms is identified via the Naïve Bayes algorithm. In the proposed method, the preprocessing of data is carried out through natural language processing techniques and the tags whose dimensions are reduced are subjected to rule-based formal concept analysis and mapping. The subdomain ontologies of weather, pest, and soil are built separately, and the overall agricultural ontology are built around them. The gold standard for the lexical layer is used to evaluate the proposed technique, and its performance is analyzed by comparing it with different state-of-the-art systems. Precision, recall, F-measure, Matthews correlation coefficient, receiver operating characteristic curve area, and precision-recall curve area are the performance metrics used to analyze the performance. The proposed methodology gives a precision score of 94.40% when compared with the decision tree(83.94%) and K-nearest neighbor algorithm(86.89%) for agricultural ontology construction.