• Title/Summary/Keyword: random forest (RF)

Analysis of Land Cover Changes Based on Classification Result Using PlanetScope Satellite Imagery

  • Yoon, Byunghyun;Choi, Jaewan
    • Korean Journal of Remote Sensing / v.34 no.4 / pp.671-680 / 2018
  • Compared to imagery from traditional satellites, PlanetScope imagery can be captured every day by a constellation of dozens or even hundreds of satellites on a relatively small budget. This study aimed to detect changed areas and update a land cover map using a PlanetScope image. To generate a classification map, pixel-based Random Forest (RF) classification was performed using additional features, such as the Normalized Difference Water Index (NDWI) and the Normalized Difference Vegetation Index (NDVI). The classification result was converted to vector data and compared with the existing land cover map to estimate the changed area. To estimate the accuracy and trends of the changed area, the quantitative quality of the supervised classification result using the PlanetScope image was evaluated first. In addition, the patterns of the changed area that corresponded to the classification result were analyzed using the PlanetScope satellite image. Experimental results showed that the PlanetScope image can be used to effectively detect changed areas on large-scale land cover maps, and that supervised classification results can be used to update the changed areas.
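
The two index features named in this abstract are both normalized band differences, which can be sketched as below. This is a minimal illustration, not the authors' pipeline: the abstract does not say which NDWI variant was used, so the green/NIR (McFeeters) form is an assumption, and the function names and reflectance values are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def pixel_features(blue, green, red, nir):
    """Build a per-pixel feature vector: four bands plus NDVI and NDWI."""
    # NDVI: vegetation reflects strongly in NIR and weakly in red.
    ndvi = normalized_difference(nir, red)
    # NDWI (green/NIR form, assumed here): water reflects green and absorbs NIR.
    ndwi = normalized_difference(green, nir)
    return [blue, green, red, nir, ndvi, ndwi]


# Hypothetical reflectance values for a vegetated pixel.
features = pixel_features(blue=0.05, green=0.08, red=0.06, nir=0.40)
```

A feature vector like this, computed for every pixel, is the kind of input a pixel-based classifier such as RF would be trained on.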

Single nucleotide polymorphism marker combinations for classifying Yeonsan Ogye chicken using a machine learning approach

  • Eunjin, Cho;Sunghyun, Cho;Minjun, Kim;Thisarani Kalhari, Ediriweera;Dongwon, Seo;Seung-Sook, Lee;Jihye, Cha;Daehyeok, Jin;Young-Kuk, Kim;Jun Heon, Lee
    • Journal of Animal Science and Technology / v.64 no.5 / pp.830-841 / 2022
  • Genetic analysis has great potential as a tool to differentiate between different species and breeds of livestock. In this study, the optimal combinations of single nucleotide polymorphism (SNP) markers for discriminating the Yeonsan Ogye chicken (Gallus gallus domesticus) breed were identified using high-density 600K SNP array data. In 3,904 individuals from 198 chicken breeds, SNP markers specific to the target population were discovered through a case-control genome-wide association study (GWAS) and filtered out based on the linkage disequilibrium blocks. Significant SNP markers were selected by feature selection applying two machine learning algorithms: Random Forest (RF) and AdaBoost (AB). Using a machine learning approach, the 38 (RF) and 43 (AB) optimal SNP marker combinations for the Yeonsan Ogye chicken population demonstrated 100% accuracy. Hence, the GWAS and machine learning models used in this study can be efficiently utilized to identify the optimal combination of markers for discriminating target populations using multiple SNP markers.

Wildfire Severity Mapping Using Sentinel Satellite Data Based on Machine Learning Approaches (Sentinel 위성영상과 기계학습을 이용한 국내산불 피해강도 탐지)

  • Sim, Seongmun;Kim, Woohyeok;Lee, Jaese;Kang, Yoojin;Im, Jungho;Kwon, Chunguen;Kim, Sungyong
    • Korean Journal of Remote Sensing / v.36 no.5_3 / pp.1109-1123 / 2020
  • In South Korea, where forest is the major land cover class (over 60% of the country), many wildfires occur every year. Wildfires weaken the shear strength of the soil, forming a layer of soil that is vulnerable to landslides. It is important to identify the severity of a wildfire as well as the burned area to sustainably manage the forest. Although satellite remote sensing has been widely used to map wildfire severity, it is often difficult to determine the severity using only the temporal change of satellite-derived indices such as the Normalized Difference Vegetation Index (NDVI) and the Normalized Burn Ratio (NBR). In this study, we proposed an approach for determining wildfire severity based on machine learning through the synergistic use of Sentinel-1A Synthetic Aperture Radar-C data and Sentinel-2A Multi Spectral Instrument data. Three wildfire cases (Samcheok in May 2017, Gangreung·Donghae in April 2019, and Gosung·Sokcho in April 2019) were used to develop wildfire severity mapping models with three machine learning algorithms (i.e., Random Forest, Logistic Regression, and Support Vector Machine). The results showed that the random forest model yielded the best performance, with an overall accuracy of 82.3%. The cross-site validation to examine the spatiotemporal transferability of the machine learning models showed that the models were highly sensitive to temporal differences between the training and validation sites, especially in the early growing season. This implies that a more robust model with high spatiotemporal transferability can be developed when more wildfire cases from different seasons and areas are added in the future.
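
The "temporal change of satellite-derived indices" that this abstract contrasts with its machine learning approach is commonly expressed as the differenced NBR (dNBR). The sketch below uses the standard NBR definition from near-infrared and shortwave-infrared reflectance; the function names and sample values are hypothetical, and the choice of bands for Sentinel-2 is not stated in the abstract.

```python
def nbr(nir, swir):
    """Normalized Burn Ratio: (NIR - SWIR) / (NIR + SWIR)."""
    total = nir + swir
    return (nir - swir) / total if total != 0 else 0.0


def dnbr(pre_nir, pre_swir, post_nir, post_swir):
    """Differenced NBR: pre-fire NBR minus post-fire NBR.

    Larger positive values indicate a stronger drop in NBR after the fire.
    """
    return nbr(pre_nir, pre_swir) - nbr(post_nir, post_swir)


# Hypothetical reflectances: healthy canopy before, burned surface after.
change = dnbr(pre_nir=0.4, pre_swir=0.1, post_nir=0.2, post_swir=0.3)
```

An index difference like this could serve as one input feature alongside radar and optical bands, rather than as the sole severity indicator.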

Classifying Severity of Senior Driver Accidents In Capital Regions Based on Machine Learning Algorithms (머신러닝 기반의 수도권 지역 고령운전자 차대사람 사고심각도 분류 연구)

  • Kim, Seunghoon;Lym, Youngbin;Kim, Ki-Jung
    • Journal of Digital Convergence / v.19 no.4 / pp.25-31 / 2021
  • As society ages, traffic accidents involving elderly drivers have attracted broader public attention. A rapid increase in senior involvement in crashes calls for developing appropriate crash-severity prediction models specific to senior drivers. In that regard, this study leverages machine learning (ML) algorithms to predict the severity of vehicle-pedestrian collisions induced by elderly drivers. Specifically, four ML algorithms (i.e., Logistic model, K-nearest Neighbor (KNN), Random Forest (RF), and Support Vector Machine (SVM)) were developed and compared. Our results show that the Logistic model and SVM outperformed the others in terms of overall prediction accuracy, while RF performed best on the precision measure. We also suggest that driver education and technology development would be effective countermeasures against the severity risks of senior driver-induced collisions. These findings can support informed decision-making by policymakers to enhance public safety.

Water level forecasting for extended lead times using preprocessed data with variational mode decomposition: A case study in Bangladesh

  • Shabbir Ahmed Osmani;Roya Narimani;Hoyoung Cha;Changhyun Jun;Md Asaduzzaman Sayef
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.179-179 / 2023
  • This study suggests a new approach to water level forecasting for extended lead times using original data preprocessed with variational mode decomposition (VMD). Two machine learning algorithms, light gradient boosting machine (LGBM) and random forest (RF), were used to forecast water levels at extended lead times (i.e., 5, 10, 15, 20, 25, 30, 40, and 50 days). First, the original data at two water level stations (i.e., SW173 and SW269 in Bangladesh) and their VMD-decomposed data were prepared with antecedent lag times to build datasets for the different lead times. Mean absolute error (MAE), root mean squared error (RMSE), and mean squared error (MSE) were used to evaluate the performance of the machine learning models in water level forecasting. The results show that errors were lower when the decomposed datasets were used to predict water levels than when the original data were used alone. It was also noted that LGBM produced lower MAE, RMSE, and MSE values than RF, indicating better performance. For instance, at the SW173 station, LGBM outperformed RF on both decomposed and original data with MAE values of 0.511 and 1.566, compared to RF's MAE values of 0.719 and 1.644, respectively, at a 30-day lead time. The models' performance decreased with increasing lead time. In summary, preprocessing original data and applying machine learning models with decomposition techniques have shown promising results for water level forecasting at longer lead times. It is expected that the approach of this study can assist water management authorities in taking precautionary measures based on forecasted water levels, which is crucial for sustainable water resource utilization.
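
The three error measures named in this abstract have standard definitions, sketched below in plain Python. The observed and predicted water levels are made-up numbers for illustration only; they are not taken from the study.

```python
import math


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error, the square root of MSE."""
    return math.sqrt(mse(observed, predicted))


# Hypothetical water levels (m) and forecasts for four days.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.0, 6.0]

errors = {
    "MAE": mae(observed, predicted),
    "MSE": mse(observed, predicted),
    "RMSE": rmse(observed, predicted),
}
```

Because RMSE is the square root of MSE, the two always rank models in the same order; MAE can rank them differently because it weights large errors less heavily.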

Predicting Surgical Complications in Adult Patients Undergoing Anterior Cervical Discectomy and Fusion Using Machine Learning

  • Arvind, Varun;Kim, Jun S.;Oermann, Eric K.;Kaji, Deepak;Cho, Samuel K.
    • Neurospine / v.15 no.4 / pp.329-337 / 2018
  • Objective: Machine learning algorithms excel at leveraging big data to identify complex patterns that can be used to aid in clinical decision-making. The objective of this study is to demonstrate the performance of machine learning models in predicting postoperative complications following anterior cervical discectomy and fusion (ACDF). Methods: Artificial neural network (ANN), logistic regression (LR), support vector machine (SVM), and random forest decision tree (RF) models were trained on a multicenter data set of patients undergoing ACDF to predict surgical complications based on readily available patient data. Following training, these models were compared to the predictive capability of American Society of Anesthesiologists (ASA) physical status classification. Results: A total of 20,879 patients were identified as having undergone ACDF. Following exclusion criteria, patients were divided into 14,615 patients for training and 6,264 for testing data sets. ANN and LR consistently outperformed ASA physical status classification in predicting every complication (p < 0.05). The ANN outperformed LR in predicting venous thromboembolism, wound complication, and mortality (p < 0.05). The SVM and RF models were no better than random chance at predicting any of the postoperative complications (p < 0.05). Conclusion: ANN and LR algorithms outperform ASA physical status classification for predicting individual postoperative complications. Additionally, neural networks have greater sensitivity than LR when predicting mortality and wound complications. With the growing size of medical data, the training of machine learning on these large datasets promises to improve risk prognostication, with the ability of continuously learning making them excellent tools in complex clinical scenarios.

Comparison of machine learning algorithms to evaluate strength of concrete with marble powder

  • Sharma, Nitisha;Upadhya, Ankita;Thakur, Mohindra S.;Sihag, Parveen
    • Advances in materials Research / v.11 no.1 / pp.75-90 / 2022
  • In this paper, the performance of soft computing algorithms such as group method of data handling (GMDH), random forest (RF), random tree (RT), linear regression (LR), M5P, and artificial neural network (ANN) was examined for predicting the compressive strength of concrete mixed with marble powder. The assessment suggests that the ANN-based model gives better results than the other algorithms for estimating the compressive strength of concrete. The coefficient of correlation (CC) was highest for the ANN model (0.9139), followed by RT (0.8241), and the root mean square error (RMSE) was lowest for ANN (4.5611), followed by RT (5.4246). Similarly, Willmott's index and the Nash-Sutcliffe coefficient were 0.9458 and 0.7502 for ANN, followed by the RT model (0.8763 and 0.6628). The results showed that, for both the training and testing subsets, ANN has the potential to estimate the compressive strength of concrete. The sensitivity analysis also suggests that, for this data set, the water-cement ratio has a large impact, compared with the other parameters, on the ANN-based estimate of the compressive strength of concrete with marble powder.
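
Three of the goodness-of-fit measures reported in this abstract can be sketched as follows, using their commonly cited definitions: the Pearson coefficient of correlation, Willmott's index of agreement, and the Nash-Sutcliffe efficiency. The abstract does not give formulas, so treating these as the forms the authors used is an assumption; the function names and sample data are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation_coefficient(obs, pred):
    """Pearson correlation between observed and predicted values."""
    o_bar, p_bar = mean(obs), mean(pred)
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_p = sum((p - p_bar) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def willmott_index(obs, pred):
    """Willmott's index of agreement: 1 means perfect agreement."""
    o_bar = mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2 for o, p in zip(obs, pred))
    return 1 - num / den


def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches predicting the mean."""
    o_bar = mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - o_bar) ** 2 for o in obs)
    return 1 - num / den


# Hypothetical compressive strengths (MPa) and model predictions.
obs = [20.0, 30.0, 40.0]
pred = [22.0, 29.0, 41.0]
scores = (correlation_coefficient(obs, pred),
          willmott_index(obs, pred),
          nash_sutcliffe(obs, pred))
```

A model that always predicts the observed mean scores 0 on the Nash-Sutcliffe efficiency, which is why values such as 0.75 are read as a clear improvement over that baseline.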

The Analysis of the Activity Patterns of Dog with Wearable Sensors Using Machine Learning

  • Hussain, Ali;Ali, Sikandar;Kim, Hee-Cheol
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.141-143 / 2021
  • The activity patterns of animal species are difficult to assess, and the behavior of freely moving individuals cannot be assessed by direct observation. Understanding the activity patterns of animals such as dogs and cats has therefore become a major challenge. One approach for monitoring these behaviors is the continuous collection of data by human observers. In this study, we assess the activity patterns of dogs using wearable sensor data from an accelerometer and a gyroscope. A wearable, sensor-based system is suitable for this purpose and can monitor dogs in real time. The basic purpose of this study was to develop a system that can detect activities based on accelerometer and gyroscope signals. We therefore propose a method based on data collected from 10 dogs of nine different breeds, of different sizes and ages, and of both genders. We applied six state-of-the-art classifiers: random forest (RF), support vector machine (SVM), gradient boosting machine (GBM), XGBoost, k-nearest neighbors (KNN), and a decision tree classifier. The random forest showed good classification results. We achieved an accuracy of 86.73% in detecting the activities.

Machine Learning Based MMS Point Cloud Semantic Segmentation (머신러닝 기반 MMS Point Cloud 의미론적 분할)

  • Bae, Jaegu;Seo, Dongju;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.38 no.5_3 / pp.939-951 / 2022
  • The most important factor in designing autonomous driving systems is to recognize the exact location of the vehicle within the surrounding environment. To date, various sensors and navigation systems have been used for autonomous driving systems; however, all have limitations. Therefore, the need for high-definition (HD) maps that provide high-precision infrastructure information for safe and convenient autonomous driving is increasing. HD maps are drawn using three-dimensional point cloud data acquired through a mobile mapping system (MMS). However, this process requires manual work due to the large numbers of points and drawing layers, increasing the cost and effort associated with HD mapping. The objective of this study was to improve the efficiency of HD mapping by segmenting semantic information in an MMS point cloud into six classes: roads, curbs, sidewalks, medians, lanes, and other elements. Segmentation was performed using various machine learning techniques including random forest (RF), support vector machine (SVM), k-nearest neighbor (KNN), and gradient-boosting machine (GBM), and 11 variables including geometry, color, intensity, and other road design features. MMS point cloud data for a 130-m section of a five-lane road near Minam Station in Busan were used to evaluate the segmentation models; the average F1 scores of the models were 95.43% for RF, 92.1% for SVM, 91.05% for GBM, and 82.63% for KNN. The RF model showed the best segmentation performance, with F1 scores of 99.3%, 95.5%, 94.5%, 93.5%, and 90.1% for roads, sidewalks, curbs, medians, and lanes, respectively. The variable importance results of the RF model showed high mean decrease accuracy and mean decrease gini for XY dist. and Z dist. variables related to road design, respectively. Thus, variables related to road design contributed significantly to the segmentation of semantic information. The results of this study demonstrate the applicability of segmentation of MMS point cloud data based on machine learning, and will help to reduce the cost and effort associated with HD mapping.
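
Per-class and average F1 scores like those reported in this abstract can be computed from true and predicted labels as sketched below. The abstract does not say how the average was taken, so the unweighted (macro) mean used here is an assumption, and the labels are a toy example rather than the study's data.

```python
def f1_per_class(y_true, y_pred):
    """Return {class: F1}, where F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy point labels: one curb point is misclassified as a lane.
y_true = ["road", "road", "curb", "curb", "lane"]
y_pred = ["road", "road", "curb", "lane", "lane"]

per_class = f1_per_class(y_true, y_pred)
average = macro_f1(y_true, y_pred)
```

A macro average treats every class equally, so a rare class such as lanes affects the average as much as the dominant road class.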

Implementation of a Machine Learning-based Recommender System for Preventing the University Students' Dropout (대학생 중도탈락 예방을 위한 기계 학습 기반 추천 시스템 구현 방안)

  • Jeong, Do-Heon
    • Journal of the Korea Convergence Society / v.12 no.10 / pp.37-43 / 2021
  • This study proposed an effective automatic classification technique to identify dropout patterns of university students and, based on this, an intelligent recommender system to prevent dropouts. To this end, 1) a data processing method to improve the performance of machine learning was proposed based on actual enrollment/dropout data of university students, and 2) performance comparison experiments were conducted using five types of machine learning algorithms. 3) In the experiments, the proposed method showed superior performance to the baseline method for all algorithms. The precision for identifying enrolled students reached up to 95.6% with Random Forest (RF), and the recall for dropout students reached up to 80.0% with Naive Bayes (NB). 4) Finally, based on the experimental results, a method for using a counseling recommender system to give priority to students who are likely to drop out was suggested. It was confirmed that reasonable decision-making can be supported through convergence research that uses technologies from the IT field to solve educational issues, and we plan to apply various artificial intelligence technologies through continued research in the future.
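
The two measures quoted in this abstract, precision for the enrolled class and recall for the dropout class, answer different questions, as the sketch below illustrates. The labels are a toy example and the function names are hypothetical; nothing here reproduces the study's data or method.

```python
def precision(y_true, y_pred, positive):
    """Of the items predicted as `positive`, the fraction that truly are."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred, positive):
    """Of the items that truly are `positive`, the fraction that were found."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy student records: actual status and a classifier's prediction.
y_true = ["enrolled", "enrolled", "enrolled", "dropout", "dropout"]
y_pred = ["enrolled", "enrolled", "dropout", "dropout", "enrolled"]

enrolled_precision = precision(y_true, y_pred, positive="enrolled")
dropout_recall = recall(y_true, y_pred, positive="dropout")
```

For a counseling recommender, dropout recall is arguably the more relevant figure, since a missed at-risk student receives no intervention.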