• Title/Summary/Keyword: Extreme Random Forest

Search results: 49

Forest Vertical Structure Mapping from Bi-Seasonal Sentinel-2 Images and UAV-Derived DSM Using Random Forest, Support Vector Machine, and XGBoost

  • Young-Woong Yoon;Hyung-Sup Jung
    • Korean Journal of Remote Sensing
    • /
    • v.40 no.2
    • /
    • pp.123-139
    • /
    • 2024
  • Forest vertical structure is vital for comprehending ecosystems and biodiversity, in addition to fundamental forest information. Currently, the forest vertical structure is predominantly assessed via an in-situ method, which is not only difficult to apply to inaccessible locations or large areas but also costly and requires substantial human resources. Therefore, mapping systems based on remote sensing data have been actively explored. Recently, research on analyzing and classifying images using machine learning techniques has been actively conducted and applied to map the vertical structure of forests accurately. In this study, Sentinel-2 and digital surface model images were obtained on two different dates separated by approximately one month, and the spectral index and tree height maps were generated separately. Furthermore, according to the acquisition time, the input data were separated into cases 1 and 2, which were then combined to generate case 3. Using these data, forest vertical structure mapping models based on random forest, support vector machine, and extreme gradient boost (XGBoost) were generated. Consequently, nine models were generated, with the XGBoost model in case 3 performing the best, with an average precision of 0.99 and an F1 score of 0.91. We confirmed that generating a forest vertical structure mapping model utilizing bi-seasonal data and an appropriate model can result in an accuracy of 90% or higher.
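
The abstract above reports precision and F1 scores for multi-class maps. As a rough illustration of how such scores are derived from paired labels, here is a minimal sketch in plain Python. The function names and the integer class labels are assumptions made for this example and are not taken from the paper, and macro-averaging is only one of several possible averaging choices.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} from two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # predicted class p, but it was wrong
            fn[t] += 1          # true class t was missed
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_scores(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical labels for three structure classes (1, 2, 3).
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 3, 3]
print(per_class_scores(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Macro-averaging weights every class equally, which matters when some structure classes are rare. A sample-weighted average would give a different number for the same predictions.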

Evaluating the Efficiency of Models for Predicting Seismic Building Damage (지진으로 인한 건물 손상 예측 모델의 효율성 분석)

  • Chae Song Hwa;Yujin Lim
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.5
    • /
    • pp.217-220
    • /
    • 2024
  • Predicting earthquake occurrences accurately is challenging, and preparing all buildings with seismic design for such random events is a difficult task. Analyzing building features to predict potential damage and reinforcing vulnerabilities based on this analysis can minimize damages even in buildings without seismic design. Therefore, research analyzing the efficiency of building damage prediction models is essential. In this paper, we compare the accuracy of earthquake damage prediction models using machine learning classification algorithms, including Random Forest, Extreme Gradient Boosting, LightGBM, and CatBoost, utilizing data from buildings damaged during the 2015 Nepal earthquake.

Development of Prediction Model of Chloride Diffusion Coefficient using Machine Learning (기계학습을 이용한 염화물 확산계수 예측모델 개발)

  • Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures
    • /
    • v.23 no.3
    • /
    • pp.87-94
    • /
    • 2023
  • Chloride is one of the most common threats to reinforced concrete (RC) durability. The alkaline environment of concrete forms a passive layer on the surface of reinforcement bars that protects them from corrosion. However, when the chloride concentration at the reinforcement bar reaches a certain level, the passive protection layer deteriorates, causing corrosion and ultimately reducing the structure's safety and durability. Therefore, understanding and predicting chloride diffusion are important for evaluating the safety and durability of RC structures. In this study, the chloride diffusion coefficient is predicted by machine learning techniques. Various machine learning techniques such as multiple linear regression, decision tree, random forest, support vector machine, artificial neural networks, extreme gradient boosting, and k-nearest neighbor were used, and the accuracy of these models was compared. To evaluate the accuracy, root mean square error (RMSE), mean square error (MSE), mean absolute error (MAE), and coefficient of determination (R²) were used as prediction performance indices. The k-fold cross-validation procedure was used to estimate the performance of the machine learning models when making predictions on data not used during training. Grid search was applied for hyperparameter optimization. The numerical simulations showed that ensemble learning methods such as random forest and extreme gradient boosting successfully predicted the chloride diffusion coefficient, and artificial neural networks also provided accurate results.
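
The four performance indices named in the abstract above (RMSE, MSE, MAE, R²) and the k-fold split can be written out in a few lines. The following is a minimal sketch in plain Python. The function names are assumptions made for this example, and the fold generator does not shuffle, which a real evaluation would normally do first.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R^2 for two equal-length numeric lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
for train_idx, test_idx in k_fold_indices(5, 2):
    print(train_idx, test_idx)
```

Each sample appears in exactly one test fold, so averaging the metrics over the folds estimates performance on data not used for training.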

A Study on the Prediction of CNC Tool Wear Using Machine Learning Technique (기계학습 기법을 이용한 CNC 공구 마모도 예측에 관한 연구)

  • Lee, Kangbae;Park, Sungho;Sung, Sangha;Park, Domyoung
    • Journal of the Korea Convergence Society
    • /
    • v.10 no.11
    • /
    • pp.15-21
    • /
    • 2019
  • The fourth industrial revolution is noted. It is a smarter factory. At present, research on CNC (Computerized Numeric Controller) is actively underway in the manufacturing field. Domestic CNC equipment, acoustic sensors, vibration sensors, etc. This study can improve efficiency through CNC. Collect various data such as X-axis, Y-axis, Z-axis force, moving speed. Data exploration of the characteristics of the collected data. You can use your data as Random Forest (RF), Extreme Gradient Boost (XGB), and Support Vector Machine (SVM). The result of this study is CNC equipment.

A study of quantitative precipitation estimation method using advanced machine learning algorithms. (기계학습을 이용한 레이더 강우추정 기법 연구)

  • Shin, Ju-Young;Ro, Yonghun
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2019.05a
    • /
    • pp.58-58
    • /
    • 2019
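  • Active recent research on machine learning has produced many new machine learning techniques. These state-of-the-art techniques are known to outperform previously used machine learning techniques and empirical formulas in predicting and reproducing natural phenomena. The Z-R relationship is widely used for rainfall estimation from radar data. Under ideal conditions, radar rainfall estimation based on the Z-R relationship performs well, but rainfall estimation from real radar data very often takes place under non-ideal conditions. Radar rainfall estimation methods based on machine learning have been developed to compensate for this limitation of the Z-R relationship, but few studies have so far been conducted on Korean radar data. To improve the accuracy of radar-based rainfall estimation, the applicability of state-of-the-art machine learning techniques to radar rainfall estimation needs to be evaluated. This study evaluated the applicability of random forest, stochastic gradient boosted model, and extreme learning machine to radar rainfall estimation. Data from the Gwangdeoksan dual-polarization radar for 2018 were used to develop the estimation methods and compare their performance. Various combinations of dual-polarization parameters were applied as input variables of the radar rainfall estimation methods. The parameters of the Z-R relationship used in previous studies were also estimated using the rainfall events and the combinations of dual-polarization parameters. The machine learning-based radar rainfall estimation methods showed higher estimation accuracy than the Z-R relationship in terms of the correlation coefficient and root mean square error. In particular, the developed methods showed high accuracy for heavy rainfall events. Among the machine learning techniques applied, the extreme learning machine was found to be the most suitable for developing radar rainfall estimation methods.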
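
The Z-R relationship mentioned in the abstract above is a power law, Z = a·R^b, between radar reflectivity Z (mm⁶/m³) and rain rate R (mm/h). The sketch below shows how it can be inverted and how a and b can be fitted by least squares in log space. The function names are assumptions made for this example, and the defaults a = 200, b = 1.6 are the classic Marshall-Palmer values, not the parameters estimated in the study.

```python
import math


def rain_rate_from_dbz(dbz, a=200.0, b=1.6):
    """Invert Z = a * R**b for R (mm/h), given reflectivity in dBZ."""
    z_linear = 10.0 ** (dbz / 10.0)   # dBZ -> linear reflectivity (mm^6/m^3)
    return (z_linear / a) ** (1.0 / b)


def fit_zr_parameters(reflectivity, rain_rate):
    """Least-squares fit of log10(Z) = log10(a) + b * log10(R); returns (a, b)."""
    xs = [math.log10(r) for r in rain_rate]
    ys = [math.log10(z) for z in reflectivity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = 10.0 ** (mean_y - b * mean_x)
    return a, b


# Synthetic, noise-free data generated from the assumed power law.
rates = [1.0, 2.0, 5.0, 10.0]
z_values = [200.0 * r ** 1.6 for r in rates]
print(fit_zr_parameters(z_values, rates))
print(rain_rate_from_dbz(40.0))
```

A fixed power law has only two free parameters, which is one reason a machine learning model fed with several dual-polarization variables can fit non-ideal observations more closely.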

Quality Prediction Model for Manufacturing Process of Free-Machining 303-series Stainless Steel Small Rolling Wire Rods (쾌삭 303계 스테인리스강 소형 압연 선재 제조 공정의 생산품질 예측 모형)

  • Seo, Seokjun;Kim, Heungseob
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.44 no.4
    • /
    • pp.12-22
    • /
    • 2021
  • This article proposes a machine learning model, i.e., a classifier, for predicting the production quality of free-machining 303-series stainless steel (STS303) small rolling wire rods according to the operating conditions of the manufacturing process. To develop the classifier, manufacturing data for 37 operating variables were collected from the manufacturing execution system (MES) of Company S, and 12 types of derived variables were generated based on a literature review and interviews with field experts. The research consisted of data preprocessing, exploratory data analysis, feature selection, machine learning modeling, and the evaluation of alternative models. In the preprocessing stage, missing values and outliers are removed, and oversampling with SMOTE (synthetic minority over-sampling technique) is applied to resolve data imbalance. Features are selected by the variable importance of LASSO (least absolute shrinkage and selection operator) regression, extreme gradient boosting (XGBoost), and random forest models. Finally, logistic regression, support vector machine (SVM), random forest, and XGBoost are developed as classifiers to predict adequate or defective products under new operating conditions. The optimal hyper-parameters for each model are investigated by the grid search and random search methods based on k-fold cross-validation. In the experiment, XGBoost showed relatively high predictive performance compared to the other models, with an accuracy of 0.9929, specificity of 0.9372, F1-score of 0.9963, and logarithmic loss of 0.0209. The classifier developed in this study is expected to improve productivity by enabling effective management of the manufacturing process for the STS303 small rolling wire rods.
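
The oversampling step in the abstract above rests on one core idea: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that interpolation in plain Python. The function name and parameters are assumptions made for this example, it needs at least two minority points, and it is a simplified illustration rather than the implementation used in the paper.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours.

    minority: list of equal-length numeric tuples (at least two points).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical defective-class samples described by two operating variables.
defective = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
for point in smote_like_oversample(defective, n_new=5):
    print(point)
```

Because the new points are interpolated rather than duplicated, the minority class gains variety. Oversampling is normally applied to the training folds only, so that validation data stay untouched.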

Projecting the spatial-temporal trends of extreme climatology in South Korea based on optimal multi-model ensemble members

  • Mirza Junaid Ahmad;Kyung-sook Choi
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.314-314
    • /
    • 2023
  • Extreme climate events can have a large impact on human life by hampering social, environmental, and economic development. Global circulation models (GCMs) are widely used numerical models for understanding anticipated future climate change. However, different GCMs can project different future climates due to structural differences, varying initial boundary conditions, and assumptions about the physical phenomena. The multi-model ensemble (MME) approach can reduce the uncertainties associated with the different GCM outcomes. In this study, a comprehensive rating metric was used to select the best-performing GCMs out of 11 CMIP5 and 13 CMIP6 GCMs, according to their skills in terms of four temporal and five spatial performance indices, in replicating the 21 extreme climate indices during the baseline (1975-2017) in South Korea. The MME data were derived by averaging the simulations from all selected GCMs and from the three top-ranked GCMs. The random forest (RF) algorithm was also used to derive the MME data from the three top-ranked GCMs. The RF-derived MME data of the three top-ranked GCMs showed the highest performance in simulating the baseline extreme climate and were subsequently used to project the future extreme climate indices under both the representative concentration pathway (RCP) and the shared socioeconomic pathway (SSP) scenarios. The extreme cold and warming indices had declining and increasing trends, respectively, and most extreme precipitation indices had increasing trends over the period 2031-2100. Compared with the other scenarios, RCP8.5 showed drastic changes in future extreme climate indices. The coasts in the east, south, and west had stronger warming than the rest of the country, while mountain areas in the north experienced more extreme cold. While extreme cold climatology gradually declined from north to south, extreme warming climatology continuously grew from coastal to inland and northern mountainous regions. The results showed that the socially, environmentally, and agriculturally important regions of South Korea were at increased risk of facing the detrimental impacts of extreme climatology.
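
The simplest multi-model ensemble described in the abstract above averages the simulations of the top-ranked models. The sketch below shows that step in plain Python. The function name, the model names, and the single `skill` score per model are assumptions made for this example; the paper's comprehensive rating metric combines several temporal and spatial indices and is not reproduced here.

```python
def top_k_ensemble_mean(simulations, skill, k):
    """Average the k best models, time step by time step.

    simulations: {model_name: [value per time step]}
    skill: {model_name: score}, where a higher score means a better model.
    Returns (selected_model_names, ensemble_mean_series).
    """
    top = sorted(skill, key=skill.get, reverse=True)[:k]
    n_steps = len(simulations[top[0]])
    mean = [sum(simulations[m][t] for m in top) / k for t in range(n_steps)]
    return top, mean


# Hypothetical model names, index values and skill scores.
simulations = {"A": [1.0, 2.0], "B": [3.0, 4.0], "C": [5.0, 6.0]}
skill = {"A": 0.9, "B": 0.5, "C": 0.7}
print(top_k_ensemble_mean(simulations, skill, k=2))
```

An arithmetic mean gives every selected model the same weight. The RF-derived ensemble in the abstract instead learns how to combine the model outputs, which is a different technique from the one sketched here.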

Classification of Soil Creep Hazard Class Using Machine Learning (기계학습기법을 이용한 땅밀림 위험등급 분류)

  • Lee, Gi Ha;Le, Xuan-Hien;Yeon, Min Ho;Seo, Jun Pyo;Lee, Chang Woo
    • Journal of Korean Society of Disaster and Security
    • /
    • v.14 no.3
    • /
    • pp.17-27
    • /
    • 2021
  • In this study, classification models were built using machine learning techniques that can classify the soil creep risk into three classes from A to C (A: risk, B: moderate, C: good). A total of six machine learning techniques were used: K-Nearest Neighbor, Support Vector Machine, Logistic Regression, Decision Tree, Random Forest, and Extreme Gradient Boosting. Their classification accuracy was then analyzed using the nationwide soil creep field survey data from 2019 and 2020. In the classification accuracy analysis, all six methods showed excellent accuracy of 0.9 or more. The methods trained on numerical data performed better than the methods based on the character data of the field survey evaluation table. Moreover, the methods trained on the data group reflecting expert opinion (R1~R4) had higher accuracy than those trained on the field survey evaluation score data group (C1~C4). Machine learning can be used as a tool for predicting soil creep if high-quality data are continuously secured and updated in the future.

Unveiling the mysteries of flood risk: A machine learning approach to understanding flood-influencing factors for accurate mapping

  • Roya Narimani;Shabbir Ahmed Osmani;Seunghyun Hwang;Changhyun Jun
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.164-164
    • /
    • 2023
  • This study investigates the importance of flood-influencing factors for the accuracy of flood risk mapping by integrating remote sensing-based and machine learning techniques. Here, the Extreme Gradient Boosting (XGBoost) and Random Forest (RF) algorithms integrated with GIS-based techniques were used to develop and generate flood risk maps. For the study area of Napa County in the United States, rainfall data from 12 stations, Sentinel-1 SAR, and Sentinel-2 optical images were applied to extract 13 flood-influencing factors including altitude, aspect, slope, topographic wetness index, normalized difference vegetation index, stream power index, sediment transport index, land use/land cover, terrain roughness index, distance from the river, soil, rainfall, and geology. These 13 raster maps were used as input data for the XGBoost and RF algorithms for modeling flood-prone areas using ArcGIS, Python, and R. The results indicate that XGBoost performed better than RF in modeling flood-prone areas, with an ROC of 97.45%, Kappa of 93.65%, and accuracy score of 96.83% compared to RF's 82.21%, 70.54%, and 88%, respectively. In conclusion, XGBoost is more efficient than RF for flood risk mapping and can potentially be utilized for flood mitigation strategies. It should be noted that all flood-influencing factors had a positive effect, but altitude, slope, and rainfall were the most influential features in modeling flood risk maps using XGBoost.
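
The abstract above reports both accuracy and Kappa. Cohen's kappa corrects the observed agreement for the agreement expected by chance, which is why it is lower than accuracy for the same predictions. The sketch below computes both from a confusion matrix in plain Python. The function name and the example counts are assumptions made for this example and are not data from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical flooded / non-flooded validation counts.
print(accuracy_and_kappa([[45, 5], [5, 45]]))
```

With balanced classes, chance agreement is 0.5, so an accuracy of 0.9 corresponds to a kappa of 0.8 in this example.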

Cognitive Impairment Prediction Model Using AutoML and Lifelog

  • Hyunchul Choi;Chiho Yoon;Sae Bom Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.11
    • /
    • pp.53-63
    • /
    • 2023
  • This study developed a cognitive impairment prediction model as a screening test for preventing dementia in the elderly by using automated machine learning (AutoML). We used the 'Wearable lifelog data for high-risk dementia patients' dataset of the National Information Society Agency and conducted the analysis using PyCaret 3.0.0 in the Google Colaboratory environment. The analysis proceeded as follows: first, five models demonstrating excellent classification performance were selected for model development and lifelog data analysis; next, ensemble learning was used to integrate these models and assess their performance. The Voting Classifier, Gradient Boosting Classifier, Extreme Gradient Boosting, Light Gradient Boosting Machine, Extra Trees Classifier, and Random Forest Classifier models showed high predictive performance, in that order. The findings furthermore emphasized the crucial importance of 'Average respiration per minute during sleep' and 'Average heart rate per minute during sleep' as the most critical feature variables for accurate predictions. Finally, the results suggest the possibility of using machine learning and lifelog data as a means to more effectively manage and prevent cognitive impairment in the elderly.
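
The voting ensemble that ranks first in the abstract above combines the outputs of several base models. The sketch below shows the two common variants in plain Python: hard voting on predicted labels and soft voting on predicted probabilities. The function names and the example values are assumptions made for this example. It does not use or reproduce the PyCaret API.

```python
from collections import Counter


def hard_vote(predictions):
    """Majority label per sample; predictions is a list of per-model label lists."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def soft_vote(probabilities, threshold=0.5):
    """Average per-model P(class = 1) per sample and threshold the mean."""
    n_models = len(probabilities)
    return [1 if sum(ps) / n_models >= threshold else 0 for ps in zip(*probabilities)]


# Three hypothetical base models scoring the same samples (1 = impaired).
labels = [[1, 0, 0], [1, 1, 0], [0, 1, 0]]
probs = [[0.9, 0.2], [0.6, 0.4], [0.3, 0.1]]
print(hard_vote(labels))
print(soft_vote(probs))
```

Using an odd number of base models avoids ties in binary hard voting. Soft voting additionally lets a confident model outweigh two uncertain ones.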