• Title/Summary/Keyword: random forest (RF)

A Random Forest Model Based Pollution Severity Classification Scheme of High Voltage Transmission Line Insulators

  • Kannan, K.;Shivakumar, R.;Chandrasekar, S.
    • Journal of Electrical Engineering and Technology / v.11 no.4 / pp.951-960 / 2016
  • Tower insulators in electric power transmission networks play a crucial role in preserving the reliability of the system. Electrical utilities frequently face the problem of insulator flashover caused by pollution deposited on the insulator surface. Several research works based on leakage current (LC) measurement have already been carried out to develop diagnostic techniques for these insulators. Since the LC signal is highly intermittent in nature, estimating pollution severity from LC measurements over a short period of time will not produce accurate results. Reports on the measurement and analysis of LC signals over a long period of time are scarce. This paper attempts to use a Random Forest (RF) classifier, which produces accurate results on large databases, to analyze the pollution severity of high voltage tower insulators. Leakage current characteristics over a long period of time were measured in the laboratory on a porcelain insulator. Pollution experiments were conducted at 11 kV AC voltage. Time domain analysis and the wavelet transform technique were used to extract both basic features and histogram features of the LC signal. The RF model was trained and tested with a variety of LC signals measured over a lengthy period of time, and the proposed RF-based pollution severity classifier was found to be efficient and likely to be helpful to electrical utilities for real-time implementation.

A Cross-Validation of Seismic Vulnerability Assessment Model: Application to Earthquake of 9.12 Gyeongju and 2017 Pohang (지진 취약성 평가 모델 교차검증: 경주(2016)와 포항(2017) 지진을 대상으로)

  • Han, Jihye;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.649-655 / 2021
  • This study aims to cross-validate the optimal seismic vulnerability assessment model, developed in previous studies conducted in Gyeongju, by applying it to another region. The test area was Pohang City, the site of the 2017 Pohang Earthquake, and the dataset was built with the same influencing factors and earthquake-damaged buildings as in the previous studies. The validation dataset was built via random sampling, and the prediction accuracy was derived by applying it to the random forest (RF) model trained for Gyeongju. The success and prediction accuracies of the model in Gyeongju were 100% and 94.9%, respectively, whereas the prediction accuracy obtained with the Pohang validation dataset was 70.4%.

Applying advanced machine learning techniques in the early prediction of graduate ability of university students

  • Pham, Nga;Tiep, Pham Van;Trang, Tran Thu;Nguyen, Hoai-Nam;Choi, Gyoo-Seok;Nguyen, Ha-Nam
    • International Journal of Internet, Broadcasting and Communication / v.14 no.3 / pp.285-291 / 2022
  • The number of people enrolling in universities is rising due to the simplicity of applying and the benefit of earning a bachelor's degree. However, the on-time graduation rate has declined because many students fail to complete their courses and take longer to get their diplomas. Although various reasons lead to this problem, it is crucial to emphasize the causes originating from the management and care of learners. In fact, understanding students' difficult situations and offering timely advice would help prevent college dropouts and delayed graduation. In this study, we present a machine learning-based method for the early detection of at-risk students, using data obtained from graduates of the Faculty of Information Technology, Dainam University, Vietnam. We experiment with several fundamental machine learning methods before applying parameter optimization techniques. Compared with the other strategies, Random Forest with Grid Search (RF&GS) and Random Forest with Random Search (RF&RS) provided more accurate predictions for identifying at-risk students.

A Survival Prediction Model of Rats in Uncontrolled Acute Hemorrhagic Shock Using the Random Forest Classifier (랜덤 포리스트를 이용한 비제어 급성 출혈성 쇼크의 흰쥐에서의 생존 예측)

  • Choi, J.Y.;Kim, S.K.;Koo, J.M.;Kim, D.W.
    • Journal of Biomedical Engineering Research / v.33 no.3 / pp.148-154 / 2012
  • Hemorrhagic shock is a primary cause of injury-related deaths worldwide. Although many studies have tried to accurately diagnose hemorrhagic shock at an early stage, such attempts were not successful because of the compensatory mechanisms of the human body. The objective of this study was to construct a survival prediction model of rats in acute hemorrhagic shock using a random forest (RF) model. Heart rate (HR), mean arterial pressure (MAP), respiration rate (RR), lactate concentration (LC), and peripheral perfusion (PP) measured in rats were used as input variables for the RF model, and its performance was compared with that of a logistic regression (LR) model. Before constructing the models, we performed 5-fold cross validation for RF variable selection and forward stepwise variable selection for the LR model to examine which variables were important for the models. For the LR model, sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (ROC-AUC) were 0.83, 0.95, 0.88, and 0.96, respectively. For the RF model, sensitivity, specificity, accuracy, and AUC were 0.97, 0.95, 0.96, and 0.99, respectively. In conclusion, the RF model was superior to the LR model for survival prediction in the rat model.
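
As a reading aid for the sensitivity, specificity, and accuracy figures quoted in the abstract above, the following minimal sketch shows how these three metrics are commonly derived from confusion-matrix counts. The function name `binary_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives that were rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not data from the paper).
sens, spec, acc = binary_metrics(tp=30, fn=10, tn=50, fp=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Reporting all three together matters because accuracy alone can hide a model that misses most positive cases when the classes are imbalanced.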

Predicting the CPT-based pile set-up parameters using HHO-RF and PSO-RF hybrid models

  • Yun Dawei;Zheng Bing;Gu Bingbing;Gao Xibo;Behnaz Razzaghzadeh
    • Structural Engineering and Mechanics / v.86 no.5 / pp.673-686 / 2023
  • Determining pile properties from the cone penetration test (CPT) is costly and requires several in-situ tests. In the present study, two novel hybrid learning models, namely PSO-RF and HHO-RF, which combine random forest (RF) with particle swarm optimization (PSO) and Harris hawks optimization (HHO), respectively, were developed and applied to predict the pile set-up parameter "A" from CPT data for project design purposes. To forecast "A," CPT data were collected from different sites in Louisiana, and the selected input variables were plasticity index (PI), undrained shear strength (Su), and over consolidation ratio (OCR). The results show that both the PSO-RF and HHO-RF models perform acceptably in predicting the set-up parameter "A," with R² larger than 0.9094, indicating an acceptable correlation between observed and predicted values. HHO-RF performs better than the PSO-RF model, with R² and RMSE equal to 0.9328 and 0.0292 for the training phase and 0.9729 and 0.024 for the testing data, respectively. Moreover, the PI and OBJ indices were considered; the HHO-RF model obtained lower values for both, so it outperforms PSO-RF in predicting the pile set-up parameter "A" and is selected as the proposed model. The results therefore demonstrate that the HHO algorithm determines the optimal values of the RF hyperparameters better than PSO.

Reservoir Water Level Forecasting Using Machine Learning Models (기계학습모델을 이용한 저수지 수위 예측)

  • Seo, Youngmin;Choi, Eunhyuk;Yeo, Woonki
    • Journal of The Korean Society of Agricultural Engineers / v.59 no.3 / pp.97-110 / 2017
  • This study investigates the efficiencies of machine learning models, including artificial neural network (ANN), generalized regression neural network (GRNN), adaptive neuro-fuzzy inference system (ANFIS), and random forest (RF), for reservoir water level forecasting at the Chungju Dam, South Korea. The models' efficiencies are assessed based on model efficiency indices and graphical comparison. The forecasting results of the models depend on the lead time and the combination of input variables. For lead time t = 1 day, the ANFIS1 and ANN6 models yield better forecasting results than the RF6 and GRNN6 models. For lead time t = 5 days, the ANN1 and RF6 models produce better forecasting results than the ANFIS1 and GRNN3 models. For lead time t = 10 days, the ANN3 and RF1 models perform better than the ANFIS3 and GRNN3 models. The ANN model is found to yield the best performance for all lead times in terms of model efficiency and graphical comparison. These results indicate that the optimal combination of input variables and forecasting model should be chosen for each lead time in reservoir water level forecasting, instead of a single combination of input variables and forecasting model for all lead times.

Development of Machine Learning Based Precipitation Imputation Method (머신러닝 기반의 강우추정 방법 개발)

  • Heechan Han;Changju Kim;Donghyun Kim
    • Journal of Wetlands Research / v.25 no.3 / pp.167-175 / 2023
  • Precipitation data is one of the essential input datasets used in various fields such as wetland management, hydrological simulation, and water resource management. To manage water resources efficiently using precipitation data, it is essential to secure as much data as possible by minimizing the missing rate of the data. In addition, more efficient hydrological simulation is possible if precipitation data for ungauged areas are secured. However, missing precipitation data have been estimated mainly by statistical equations. The purpose of this study is to propose a new method to restore missing precipitation data using machine learning algorithms that can predict new data based on correlations between data. Moreover, the applicability of machine learning techniques for restoring missing precipitation data is evaluated against existing statistical methods. Two representative machine learning algorithms, Artificial Neural Network (ANN) and Random Forest (RF), were applied. In classifying the occurrence of precipitation, the RF algorithm was more accurate than the ANN algorithm. The F1-score and accuracy, which are evaluation indicators of the classification model, were 0.80 and 0.77 for RF, compared with 0.76 and 0.71 for ANN. In addition, RF also estimated the amount of precipitation more accurately than the ANN algorithm. The RMSE values of the RF and ANN algorithms were 2.8 mm/day and 2.9 mm/day, respectively, and the values were calculated as 0.68 and 0.73.

A Performance Comparison of Machine Learning Classification Methods for Soil Creep Susceptibility Assessment (땅밀림 위험지 평가를 위한 기계학습 분류모델 비교)

  • Lee, Jeman;Seo, Jung Il;Lee, Jin-Ho;Im, Sangjun
    • Journal of Korean Society of Forest Science / v.110 no.4 / pp.610-621 / 2021
  • Soil creep, primarily caused by earthquakes and torrential rainfall events, has occurred widely across the country. The Korea Forest Service attempted to quantify soil creep susceptible areas using a discriminant value table to prevent or mitigate casualties and property damage in advance. With the advent of advanced computer technologies, machine learning-based classification models have been employed for managing mountainous disasters, such as landslides and debris flows. This study aims to quantify soil creep susceptibility using several classifiers, namely the k-Nearest Neighbor (k-NN), Naive Bayes (NB), Random Forest (RF), and Support Vector Machine (SVM) models. To develop the classification models, we downsampled 292 data points from 4,618 field survey records. About 70% of the selected data were used for training, with the remaining 30% used for model testing. On the test dataset, the developed models achieved classification accuracies of 0.727 for k-NN, 0.750 for NB, 0.807 for RF, and 0.750 for SVM. Furthermore, we estimated Cohen's Kappa index as 0.534, 0.580, 0.673, and 0.585, with AUC values of 0.872, 0.912, 0.943, and 0.834, respectively. Overall, the classifiers for soil creep susceptibility ranked RF, NB, SVM, and k-NN, in that order. Our findings indicate that machine learning classifiers can provide valuable information for establishing and implementing natural disaster management plans in mountainous areas.
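
The abstract above reports Cohen's Kappa alongside accuracy. As a minimal sketch of how that index is usually computed, the snippet below derives it from a confusion matrix by comparing observed agreement with the agreement expected by chance. The function name `cohens_kappa` and the example matrix are hypothetical and are not data from the study.

```python
def cohens_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows: actual, columns: predicted)."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: product of the marginal totals for each class.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


# Hypothetical 2x2 confusion matrix for illustration only.
example = [[40, 10],
           [5, 45]]
kappa = cohens_kappa(example)
print(f"kappa={kappa:.3f}")
```

Because kappa discounts chance agreement, it is typically lower than raw accuracy, which is consistent with the pattern of values listed in the abstract.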

Development of a Water Quality Indicator Prediction Model for the Korean Peninsula Seas using Artificial Intelligence (인공지능 기법을 활용한 한반도 해역의 수질평가지수 예측모델 개발)

  • Seong-Su Kim;Kyuhee Son;Doyoun Kim;Jang-Mu Heo;Seongeun Kim
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.1 / pp.24-35 / 2023
  • Rapid industrialization and urbanization have led to severe marine pollution. A Water Quality Index (WQI) has been developed to allow the effective management of marine pollution. However, the WQI suffers from loss of information due to the complex calculations involved, changes in standards, calculation errors by practitioners, and statistical errors. Consequently, research on the use of artificial intelligence techniques to predict the marine and coastal WQI is being conducted both locally and internationally. In this study, six techniques (RF, XGBoost, KNN, Ext, SVM, and LR) were studied using marine environmental measurement data (2000-2020) to determine the most appropriate artificial intelligence technique for estimating the WQI of five ecoregions in the Korean seas. Our results show that the random forest method offers the best performance among the methods studied. The residual analysis of the predicted and actual WQI scores using the random forest method shows that the temporal and spatial prediction performance was exceptional for all ecoregions. In conclusion, the RF model for WQI prediction developed in this study is considered applicable to Korean seas with high accuracy.

Evaluation and Predicting PM10 Concentration Using Multiple Linear Regression and Machine Learning (다중선형회귀와 기계학습 모델을 이용한 PM10 농도 예측 및 평가)

  • Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.36 no.6_3 / pp.1711-1720 / 2020
  • Particulate matter (PM) that has been artificially generated during the recent period of rapid industrialization and urbanization moves and disperses according to weather conditions, and adversely affects the human skin and respiratory system. The purpose of this study is to predict the PM10 concentration in Seoul using meteorological factors as the input dataset for multiple linear regression (MLR), support vector machine (SVM), and random forest (RF) models, and to compare and evaluate the performance of the models. First, the PM10 concentration data obtained at 39 air quality monitoring sites (AQMS) in Seoul were divided into training and validation datasets (8:2 ratio). The nine meteorological factors (mean, maximum, and minimum temperature, precipitation, average and maximum wind speed, wind direction, yellow dust, and relative humidity), obtained by the automatic weather system (AWS), formed the input dataset of the models. The coefficients of determination (R²) between the observed PM10 concentration and that predicted by the MLR, SVM, and RF models were 0.260, 0.772, and 0.793, respectively, and the RF model best predicted the PM10 concentration. Among the AQMS used for model validation, the Gwanak-gu and Gangnam-daero AQMS are relatively close to the AWS, and the SVM and RF models were highly accurate according to the model validations. The Jongno-gu AQMS is relatively far from the AWS, but since the PM10 concentrations of the two adjacent AQMS were used for model training, both models presented high accuracy. By contrast, the Yongsan-gu AQMS was relatively far from both other AQMS and the AWS, and both models performed poorly there.
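
The regression abstracts in this listing compare models by the coefficient of determination (R²) and the root mean square error (RMSE). The sketch below shows one common way to compute both from observed and predicted values. It assumes the definition R² = 1 - SS_res / SS_tot, which may differ from the definition used in a given paper (some report the squared correlation instead). The function names and sample values are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    return 1 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the same unit as the observations."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed and predicted values for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]
r2_value = r_squared(obs, pred)
rmse_value = rmse(obs, pred)
print(f"R2={r2_value:.3f} RMSE={rmse_value:.3f}")
```

R² is unitless and describes the share of variance explained, while RMSE keeps the unit of the target variable, so the two are usually read together.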