• Title/Summary/Keyword: random forest model

Application of machine learning for merging multiple satellite precipitation products

  • Van, Giang Nguyen;Jung, Sungho;Lee, Giha
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.134-134 / 2021
  • Precipitation is a crucial component of the water cycle and plays a key role in hydrological processes. Traditionally, gauge-based measurement has been the main way to estimate rainfall with high accuracy, but gauges are sparsely distributed in mountainous areas. Satellite-based precipitation products (SPPs) now provide gridded precipitation with spatio-temporal variability, but their estimates carry considerable uncertainty and their spatial resolution is quite coarse. To overcome these limitations, this study generates a new gridded daily precipitation product for the period 2003-2017 from Automatic Weather System (AWS) observations in Korea and multiple SPPs (i.e., CHIRPSv2, CMORPH, GSMaP, TRMMv7). A machine learning-based Random Forest (RF) model is used to merge the products, and several statistical linear merging methods are applied for comparison. To assess the RF model, observations from 64 Automated Synoptic Observation System (ASOS) stations were collected, and the accuracy of each product was evaluated using Kling-Gupta efficiency (KGE), probability of detection (POD), false alarm rate (FAR), and critical success index (CSI). The precipitation generated by the random forest model showed higher accuracy than each individual satellite rainfall product and reflected spatio-temporal variability better than the other statistical merging methods. Therefore, a random forest-based ensemble satellite precipitation product can be used efficiently for hydrological simulations in ungauged basins such as the Mekong River.
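
The four evaluation scores named in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the 0.1 mm/day wet-day threshold, the use of FAR as false alarms divided by all estimated events, and the 2009 form of KGE are all assumptions made here. Inputs need at least one observed and one estimated event, and non-constant series, to avoid division by zero.

```python
import math

def detection_scores(obs, est, threshold=0.1):
    """Return (POD, FAR, CSI) for events at or above `threshold` (assumed mm/day)."""
    pairs = list(zip(obs, est))
    hits = sum(1 for o, e in pairs if o >= threshold and e >= threshold)
    misses = sum(1 for o, e in pairs if o >= threshold and e < threshold)
    false_alarms = sum(1 for o, e in pairs if o < threshold and e >= threshold)
    pod = hits / (hits + misses)                 # share of observed events detected
    far = false_alarms / (hits + false_alarms)   # share of estimated events that were wrong
    csi = hits / (hits + misses + false_alarms)  # hits over all observed or estimated events
    return pod, far, csi

def kge(obs, est):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    n = len(obs)
    mean_o, mean_e = sum(obs) / n, sum(est) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    std_e = math.sqrt(sum((e - mean_e) ** 2 for e in est) / n)
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est)) / n
    r = cov / (std_o * std_e)   # linear correlation
    alpha = std_e / std_o       # variability ratio
    beta = mean_e / mean_o      # bias ratio
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A KGE of 1 indicates a perfect match; POD and CSI are best at 1 and FAR is best at 0.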

A Study on the prediction of BMI(Benthic Macroinvertebrate Index) using Machine Learning Based CFS(Correlation-based Feature Selection) and Random Forest Model (머신러닝 기반 CFS(Correlation-based Feature Selection)기법과 Random Forest모델을 활용한 BMI(Benthic Macroinvertebrate Index) 예측에 관한 연구)

  • Go, Woo-Seok;Yoon, Chun Gyeong;Rhee, Han-Pil;Hwang, Soon-Jin;Lee, Sang-Woo
    • Journal of Korean Society on Water Environment / v.35 no.5 / pp.425-431 / 2019
  • Recently, interest has been growing in good-quality water resources and in water welfare as a means of improving quality of life. This study predicts the benthic macroinvertebrate index (BMI), an indicator of aquatic ecological health, using the machine learning-based CFS (Correlation-based Feature Selection) method and the random forest model, and compares the measured and predicted BMI values. From data collected over 10 years in a branch of the Han River, 1312 records were extracted and used. Pearson correlation analysis of these data showed a lack of correlation between any single factor and the BMI, so the CFS method for multiple regression analysis was introduced. The study derived 10 factors (water temperature, DO, electrical conductivity, turbidity, BOD, $NH_3-N$, T-N, $PO_4-P$, T-P, and average flow rate) that are considered to be related to the BMI, and the random forest model was built on these ten factors. To assess the validity of the model, $R^2$, %Difference, NSE (Nash-Sutcliffe Efficiency), and RMSE (Root Mean Square Error) were used. The reported values were 0.9438, -0.997, and 0.992, and the accuracy rate was about 71.6%. These results can suggest future directions for water resource management and a pre-review function for water ecological prediction.
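
For readers unfamiliar with the validation measures listed in this abstract, the following is a small sketch of $R^2$, NSE, and RMSE using their standard textbook definitions. The function names are illustrative, $R^2$ is taken here as the squared Pearson correlation (an assumption, since the abstract does not define it), and %Difference is omitted because its exact definition is not given.

```python
import math

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus squared error relative to the variance of the observations."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - residual / spread

def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)
```

An NSE of 1 means a perfect fit, while 0 means the model is no better than predicting the observed mean. A constant offset lowers NSE but leaves this form of $R^2$ at 1, which is one reason the measures are reported together.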

Random Forest Model for Silicon-to-SPICE Gap and FinFET Design Attribute Identification

  • Won, Hyosig;Shimazu, Katsuhiro
    • IEIE Transactions on Smart Processing and Computing / v.5 no.5 / pp.358-365 / 2016
  • We propose a novel application of random forest, a machine learning-based general classification algorithm, to analyze the influence of design attributes on the silicon-to-SPICE (S2S) gap. To improve modeling accuracy, we introduce magnification of learning data as well as randomization for the counting of design attributes to be used for each tree in the forest. From the automatically generated decision trees, we can extract the so-called importance and impact indices, which identify the most significant design attributes determining the S2S gap. We apply the proposed method to actual silicon data, and observe that the identified design attributes show a clear trend in the S2S gap. We finally unveil 10nm key fin-shaped field effect transistor (FinFET) structures that result in a large S2S gap using the measurement data from 10nm test vehicles specialized for model-hardware correlation.

A Real-Time Sound Recognition System with a Decision Logic of Random Forest for Robots (Random Forest를 결정로직으로 활용한 로봇의 실시간 음향인식 시스템 개발)

  • Song, Ju-man;Kim, Changmin;Kim, Minook;Park, Yongjin;Lee, Seoyoung;Son, Jungkwan
    • The Journal of Korea Robotics Society / v.17 no.3 / pp.273-281 / 2022
  • In this paper, we propose a robot sound recognition system that detects various sound events in real time using a microphone on the robot. To achieve real-time performance, we use a VGG11 model, which includes several convolutional layers, with a real-time normalization scheme. The VGG11 model is trained on a database augmented across 24 environments (12 reverberation times and 2 signal-to-noise ratios). In addition, a decision logic based on the random forest algorithm is designed to generate event signals for robot applications. For specific classes of acoustic events, this logic performs better than using the outputs of the network model alone. Experimental results demonstrate the performance of the proposed sound recognition system on a real-time device for robots.

Object Classification Method Using Dynamic Random Forests and Genetic Optimization

  • Kim, Jae Hyup;Kim, Hun Ki;Jang, Kyung Hyun;Lee, Jong Min;Moon, Young Shik
    • Journal of the Korea Society of Computer and Information / v.21 no.5 / pp.79-89 / 2016
  • In this paper, we propose an object classification method that uses a genetic algorithm and a dynamic random forest consisting of an optimal combination of unit trees. A random forest is an ensemble classification model that combines unit decision trees through bagging; by randomizing the training samples and the feature selection assigned to each decision tree, it can achieve good generalization performance when a large number of trees are combined. However, because the random forest is composed of randomly built unit trees, it shows excellent classification performance only when a sufficient number of trees are combined. There is no quantitative method for determining the number of trees, so there is no choice but to keep generating random trees repeatedly. The proposed algorithm composes a random forest from an optimal combination of trees while maintaining the generalization performance of the random forest. To achieve this, the problem of improving classification performance is cast as an optimization problem of finding the optimal tree combination, which is solved with a genetic algorithm. Experiments showed that the proposed algorithm improves classification performance by about 3~5% over the existing random forest in specific cases, such as a common database and our own infrared database. In addition, the optimal tree combination was found at about 55~60% of the maximum number of trees.
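
The idea of searching for a tree combination with a genetic algorithm can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the paper's algorithm: each candidate is a 0/1 mask over already trained trees, fitness is majority-vote accuracy on a validation set with binary labels, and the operators are one-point crossover, bit-flip mutation, and elitism. The names `vote_accuracy` and `select_trees` are hypothetical.

```python
import random

def vote_accuracy(mask, tree_preds, labels):
    """Majority-vote accuracy of the trees kept by `mask` (binary labels 0/1)."""
    chosen = [preds for keep, preds in zip(mask, tree_preds) if keep]
    if not chosen:
        return 0.0
    correct = 0
    for i, label in enumerate(labels):
        votes_for_one = sum(preds[i] for preds in chosen)
        predicted = 1 if 2 * votes_for_one > len(chosen) else 0
        correct += predicted == label
    return correct / len(labels)

def select_trees(tree_preds, labels, pop_size=20, generations=30,
                 mutation=0.05, seed=0):
    """Search for a 0/1 mask over trees that maximizes vote accuracy."""
    rng = random.Random(seed)
    n = len(tree_preds)

    def fitness(mask):
        return vote_accuracy(mask, tree_preds, labels)

    # Start from the full forest plus random subsets of it.
    population = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                              for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[:pop_size // 2]      # elitism: the best half survives
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability `mutation`.
            child = [bit ^ (rng.random() < mutation) for bit in child]
            children.append(child)
        population = elite + children
    return max(population, key=fitness)
```

Because the full forest is in the initial population and the best masks always survive, the returned subset is never worse than the full forest on the validation data used for the search. Whether it generalizes would need a separate held-out set.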

A comparative study of conceptual model and machine learning model for rainfall-runoff simulation (강우-유출 모의를 위한 개념적 모형과 기계학습 모형의 성능 비교)

  • Lee, Seung Cheol;Kim, Daeha
    • Journal of Korea Water Resources Association / v.56 no.9 / pp.563-574 / 2023
  • Recently, climate change has affected the functional responses of river basins to meteorological variables, emphasizing the importance of rainfall-runoff simulation research. Simultaneously, the growing interest in machine learning has led to its increased application in hydrological studies. However, it is not yet clear whether machine learning models are more advantageous than conventional conceptual models. In this study, we compared the performance of the conventional GR6J model with the machine learning-based Random Forest model across 38 basins in Korea using both gauged and ungauged basin prediction methods. For gauged basin predictions, each model was calibrated or trained using observed daily runoff data, and its performance was evaluated over a separate validation period. Subsequently, ungauged basin simulations were evaluated using proximity-based parameter regionalization with Leave-One-Out Cross-Validation (LOOCV). In gauged basins, the Random Forest consistently outperformed the GR6J, exhibiting superiority across basins regardless of whether they had strong or weak rainfall-runoff correlations. This suggests that the inherent data-driven training structures of machine learning models, in contrast to the conceptual models, offer distinct advantages in data-rich scenarios. However, the advantages of the machine learning algorithm were not replicated in ungauged basin predictions, where it performed worse than the GR6J. In conclusion, this study suggests that while the Random Forest model showed enhanced performance in trained locations, the existing GR6J model may be a better choice for prediction in ungauged basins.
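
The ungauged-basin evaluation described in this abstract can be pictured as the loop below. It is a schematic sketch under assumptions of our own: "proximity" is taken as Euclidean distance between basin centroids, a single nearest donor is used, and the `basins` dictionary layout and the `simulate` and `score` callables are hypothetical placeholders for whatever model and metric are being tested.

```python
import math

def nearest_donor(target, basins):
    """Return the name of the basin whose centroid is closest to `target`."""
    others = (name for name in basins if name != target)
    return min(others,
               key=lambda name: math.dist(basins[target]["xy"], basins[name]["xy"]))

def loocv_regionalization(basins, simulate, score):
    """Treat each basin in turn as ungauged and borrow its nearest neighbour's parameters."""
    results = {}
    for target in basins:
        donor = nearest_donor(target, basins)
        # Run the model with donor parameters but the target basin's own forcing.
        simulated = simulate(basins[donor]["params"], basins[target]["forcing"])
        results[target] = score(basins[target]["observed"], simulated)
    return results
```

Each basin's own observations are used only for scoring, never for choosing its parameters, which is what makes the result an estimate of performance at ungauged sites.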

A Study on the Performance Evaluation of Machine Learning for Predicting the Number of Movie Audiences (영화 관객 수 예측을 위한 기계학습 기법의 성능 평가 연구)

  • Jeong, Chan-Mi;Min, Daiki
    • The Journal of Society for e-Business Studies / v.25 no.2 / pp.49-63 / 2020
  • Accurate prediction of box office performance at an early stage is crucial for the film industry to make better managerial decisions. With the aim of improving prediction performance, this paper evaluates the use of machine learning methods. We tested both classification- and regression-based methods, including k-NN, SVM, and Random Forest. We first evaluated the input variables, which showed that reputation-related information generated during the first two weeks after release is significant. Prediction test results show that regression-based methods provide lower prediction error, and that Random Forest in particular outperforms the other machine learning methods. Regression-based methods have better predictive power for films with small box office earnings, whereas classification-based methods work better for predicting large box office earnings.

A Cross-Validation of Seismic Vulnerability Assessment Model: Application to Earthquake of 9.12 Gyeongju and 2017 Pohang (지진 취약성 평가 모델 교차검증: 경주(2016)와 포항(2017) 지진을 대상으로)

  • Han, Jihye;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.649-655 / 2021
  • This study aims to cross-validate the optimal seismic vulnerability assessment model, developed in previous studies conducted in Gyeongju, by applying it to another region. The test area was Pohang City, the site of the 2017 Pohang Earthquake, and the dataset was built with the same influencing factors and earthquake-damaged buildings as in the previous studies. The validation dataset was built via random sampling, and the prediction accuracy was derived by applying it to the random forest (RF)-based model of Gyeongju. The success and prediction accuracies of the model in Gyeongju were 100% and 94.9%, respectively, whereas the prediction accuracy obtained with the Pohang validation dataset was 70.4%.

A random forest-regression-based inverse-modeling evolutionary algorithm using uniform reference points

  • Gholamnezhad, Pezhman;Broumandnia, Ali;Seydi, Vahid
    • ETRI Journal / v.44 no.5 / pp.805-815 / 2022
  • The model-based evolutionary algorithms are divided into three groups: estimation of distribution algorithms, inverse modeling, and surrogate modeling. Existing inverse modeling is mainly applied to solve multi-objective optimization problems and is not suitable for many-objective optimization problems. Some inversed-model techniques, such as the inversed-model of multi-objective evolutionary algorithm, constructed from the Pareto front (PF) to the Pareto solution on nondominated solutions using a random grouping method and Gaussian process, were introduced. However, some of the most efficient inverse models might be eliminated during this procedure. Also, there are challenges, such as the presence of many local PFs and developing poor solutions when the population has no evident regularity. This paper proposes inverse modeling using random forest regression and uniform reference points that map all nondominated solutions from the objective space to the decision space to solve many-objective optimization problems. The proposed algorithm is evaluated using the benchmark test suite for evolutionary algorithms. The results show an improvement in diversity and convergence performance (quality indicators).

A Study on Prediction Techniques through Machine Learning of Real-time Solar Radiation in Jeju (제주 실시간 일사량의 기계학습 예측 기법 연구)

  • Lee, Young-Mi;Bae, Joo-Hyun;Park, Jeong-keun
    • Journal of Environmental Science International / v.26 no.4 / pp.521-527 / 2017
  • Solar radiation forecasts are important for predicting the amount of ice on roads and the potential solar energy. To improve solar radiation predictability in Jeju, we applied machine learning with various data mining techniques, such as tree models, conditional inference trees, random forest, support vector machines, and logistic regression. To validate the machine learning models, the simulation results were compared with solar radiation data observed at the Jeju observation site. According to the model assessment, solar radiation prediction using random forest is the most effective method. The error rate of the random forest model is 17%.