• Title/Summary/Keyword: random forest model

Performance Comparison Analysis of Artificial Intelligence Models for Estimating Remaining Capacity of Lithium-Ion Batteries

  • Kyu-Ha Kim;Byeong-Soo Jung;Sang-Hyun Lee
    • International Journal of Advanced Culture Technology
    • /
    • v.11 no.3
    • /
    • pp.310-314
    • /
    • 2023
  • The purpose of this study is to predict the remaining capacity of lithium-ion batteries and to compare the performance of five artificial intelligence models: linear regression, decision tree, random forest, neural network, and an ensemble model. Measured data from the CS2 lithium-ion battery, stored in Excel format, were used, and the prediction accuracy of each model was measured using mean square error, mean absolute error, coefficient of determination, and root mean square error. The Root Mean Square Error (RMSE) was 0.045 for the linear regression model, 0.038 for the decision tree model, 0.034 for the random forest model, 0.032 for the neural network model, and 0.030 for the ensemble model. The ensemble model had the best prediction performance, followed by the neural network model. The decision tree and random forest models also performed well, while the linear regression model performed poorly compared to the other models. It was therefore concluded that ensemble models and neural network models should be prioritized in order to improve the efficiency of battery management and energy systems.
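The four evaluation indicators named in this abstract can be computed as in the minimal sketch below. The function name `regression_metrics` and the sample values are hypothetical illustrations and are not taken from the paper or its CS2 data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; not the authors' code.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mse = ss_res / n                              # mean square error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    rmse = math.sqrt(mse)                         # root mean square error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the observations have no variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "mae": mae, "rmse": rmse, "r2": r2}


# Made-up capacity values, used only to exercise the function.
metrics = regression_metrics([1.0, 0.9, 0.8, 0.7], [0.98, 0.91, 0.83, 0.66])
```

A lower RMSE indicates a closer fit, which is the basis on which the abstract ranks the ensemble model (0.030) ahead of linear regression (0.045).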

The Development of Biomass Model for Pinus densiflora in Chungnam Region Using Random Effect (임의효과를 이용한 충남지역 소나무림의 바이오매스 모형 개발)

  • Pyo, Jungkee;Son, Yeong Mo
    • Journal of Korean Society of Forest Science
    • /
    • v.106 no.2
    • /
    • pp.213-218
    • /
    • 2017
  • The purpose of this study was to develop an age-biomass model containing a random effect for the Chungnam region. To develop the biomass model by species and tree component, data for Pinus densiflora in the central region were collected from 30 plots (150 trees). A mixed model was used, with a fixed effect for the age-biomass relation of Pinus densiflora and a random effect representing the correlation among survey areas. To evaluate the model with the random effect, the Akaike information criterion (AIC) was used together with the variance-covariance matrix and the residual of the repeated data. The estimated variance-covariance and residual were -1.0022 and 0.6240, respectively. The model with the random effect (AIC = 377.2) has a low AIC value in comparison with other studies relating to random effects. Because a random effect associated with categorical data was used in the data fitting process, the model can be calibrated to fit the Chungnam region by obtaining measurements. Therefore, the results of this study could be a useful method for developing biomass models by region using random effects.

Developing a Pedestrian Satisfaction Prediction Model Based on Machine Learning Algorithms (기계학습 알고리즘을 이용한 보행만족도 예측모형 개발)

  • Lee, Jae Seung;Lee, Hyunhee
    • Journal of Korea Planning Association
    • /
    • v.54 no.3
    • /
    • pp.106-118
    • /
    • 2019
  • In order to develop a pedestrian navigation service that provides optimal pedestrian routes based on pedestrian satisfaction levels, a prediction model is required that can estimate a pedestrian's satisfaction level under given conditions. Thus, the aim of the present study is to develop a pedestrian satisfaction prediction model based on three machine learning algorithms: Logistic Regression, Random Forest, and Artificial Neural Network models. The 2009, 2012, 2013, 2014, and 2015 Pedestrian Satisfaction Survey Data in Seoul, Korea are used to train and test the machine learning models. As a result, the Random Forest model shows the best prediction performance among the three (Accuracy: 0.798, Recall: 0.906, Precision: 0.842, F1 Score: 0.873, AUC: 0.795). The Artificial Neural Network ranks second (Accuracy: 0.773, Recall: 0.917, Precision: 0.811, F1 Score: 0.868, AUC: 0.738) and the Logistic Regression model ranks third (Accuracy: 0.764, Recall: 1.000, Precision: 0.764, F1 Score: 0.868, AUC: 0.575). The precision score of the Random Forest model implies that approximately 84.2% of pedestrians may be satisfied if they walk in the areas suggested by the Random Forest model.
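As a small worked check of how the reported metrics relate, the F1 score is the harmonic mean of precision and recall. The sketch below uses a hypothetical helper named `f1_from_precision_recall` and plugs in the Random Forest precision and recall quoted in the abstract.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0  # convention when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Random Forest precision and recall as reported in the abstract.
rf_f1 = f1_from_precision_recall(precision=0.842, recall=0.906)
print(round(rf_f1, 3))  # prints 0.873
```

Because the published precision and recall are themselves rounded to three decimals, recomputing F1 from them can differ from a reported F1 in the last digit.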

A Mixed-effects Height-Diameter Model for Pinus densiflora Trees in Gangwon Province, Korea

  • Lee, Young Jin;Coble, Dean W.;Pyo, Jung Kee;Kim, Sung Ho;Lee, Woo Kyun;Choi, Jung Kee
    • Journal of Korean Society of Forest Science
    • /
    • v.98 no.2
    • /
    • pp.178-182
    • /
    • 2009
  • A new mixed-effects model was developed that predicts individual-tree total height for Pinus densiflora trees in Gangwon province as a function of individual-tree diameter (cm). The mixed-effects model contains two random-effects parameters. Maximum likelihood estimation was used to fit the model to 560 height-diameter observations of individual trees measured throughout Gangwon province in 2007 as part of the National Forest Inventory Program in Korea. The new model is an improvement over fixed-effects models because it can be calibrated to a local area, such as an inventory plot or individual stand. The new model also appears to be an improvement over the Forest Resources Evaluation and Prediction Program for the ten calibration trees used in this study. An example is provided that describes how to estimate the random-effects parameters using ten calibration trees.

Applicability Evaluation of a Mixed Model for the Analysis of Repeated Inventory Data : A Case Study on Quercus variabilis Stands in Gangwon Region (반복측정자료 분석을 위한 혼합모형의 적용성 검토: 강원지역 굴참나무 임분을 대상으로)

  • Pyo, Jungkee;Lee, Sangtae;Seo, Kyungwon;Lee, Kyungjae
    • Journal of Korean Society of Forest Science
    • /
    • v.104 no.1
    • /
    • pp.111-116
    • /
    • 2015
  • The purpose of this study was to evaluate a mixed model of the dbh-height relation containing a random effect. Data were obtained from a survey site for Quercus variabilis in the Gangwon region, and the same site was remeasured after three years. A mixed model was used, with a fixed effect for the dbh-height relation of Quercus variabilis and a random effect representing the correlation between survey periods. To evaluate the model with the random effect, the Akaike information criterion (AIC) was used together with the variance-covariance matrix and the residual of the repeated data. The estimated variance-covariance and residual were -0.0291 and 0.1007, respectively. The model with the random effect (AIC = -215.5) has a lower AIC value than the model with only the fixed effect (AIC = -154.4). Because a random effect associated with categorical data is used in the data fitting process, the model can be calibrated to fit a repeatedly measured site by obtaining measurements. Therefore, the results of this study could be a useful method for developing models from repeated measurements.
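The model comparison in this abstract rests on the rule that, among models fitted to the same data, the one with the lower AIC is preferred. A minimal sketch follows, using the standard definition AIC = 2k - 2 ln(L); the function names `aic` and `preferred_model` are hypothetical, and only the two AIC values are taken from the abstract.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def preferred_model(aic_by_model):
    """Return the name of the model with the lowest AIC."""
    return min(aic_by_model, key=aic_by_model.get)


# AIC values as reported in the abstract.
reported = {"random effect": -215.5, "fixed effect": -154.4}
best = preferred_model(reported)
delta = reported["fixed effect"] - reported["random effect"]
print(best, round(delta, 1))  # prints: random effect 61.1
```

The comparison is only meaningful when both models are fitted to the same observations, as is the case for the two models described here.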

Forest Vertical Structure Mapping from Bi-Seasonal Sentinel-2 Images and UAV-Derived DSM Using Random Forest, Support Vector Machine, and XGBoost

  • Young-Woong Yoon;Hyung-Sup Jung
    • Korean Journal of Remote Sensing
    • /
    • v.40 no.2
    • /
    • pp.123-139
    • /
    • 2024
  • Forest vertical structure is vital for comprehending ecosystems and biodiversity, in addition to fundamental forest information. Currently, the forest vertical structure is predominantly assessed via an in-situ method, which is not only difficult to apply to inaccessible locations or large areas but also costly and requires substantial human resources. Therefore, mapping systems based on remote sensing data have been actively explored. Recently, research on analyzing and classifying images using machine learning techniques has been actively conducted and applied to map the vertical structure of forests accurately. In this study, Sentinel-2 and digital surface model images were obtained on two different dates separated by approximately one month, and the spectral index and tree height maps were generated separately. Furthermore, according to the acquisition time, the input data were separated into cases 1 and 2, which were then combined to generate case 3. Using these data, forest vertical structure mapping models based on random forest, support vector machine, and extreme gradient boosting (XGBoost) were generated. Consequently, nine models were generated, with the XGBoost model in case 3 performing the best, with an average precision of 0.99 and an F1 score of 0.91. We confirmed that generating a forest vertical structure mapping model utilizing bi-seasonal data and an appropriate model can result in an accuracy of 90% or higher.

Rice yield prediction in South Korea by using random forest (Random Forest를 이용한 남한지역 쌀 수량 예측 연구)

  • Kim, Junhwan;Lee, Juseok;Sang, Wangyu;Shin, Pyeong;Cho, Hyeounsuk;Seo, Myungchul
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.21 no.2
    • /
    • pp.75-84
    • /
    • 2019
  • In this study, the random forest approach was used to predict the national mean rice yield of South Korea by using mean climatic factors at a national scale. The random forest model used monthly climate variables and year as important predictors of crop yield. Annual yield change would be affected by technical improvement in crop management as well as by climate, and year as a prediction factor represents this technical improvement. Thus, it is likely that the variables of importance identified for the random forest model could result in a large error in prediction of rice yield in practice. It was also found that eliminating the trend from the yield data resulted in reasonable accuracy in prediction of yield using the random forest model. For example, yield prediction using the training set (data obtained from 1991 to 2005) had relatively high degree-of-agreement statistics. Although the degree-of-agreement statistics for yield prediction on the test set (2006-2015) were not as good as those for the training set, the relative root mean square error (RRMSE) was less than 5%. In the variable importance plot, a significant difference was noted in the importance of climate factors between the training and test sets. This difference could be attributed to the shifting of the transplanting date, which might have affected the growing season. This suggests that acceptable yield prediction could be achieved using random forest when the data set includes consistent planting or transplanting dates in the predicted area.
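The abstract does not state how RRMSE was computed. One common definition, assumed here, divides the RMSE by the mean of the observed values and expresses the result as a percentage. The function name `rrmse_percent` and the sample yields below are hypothetical and are not data from the study.

```python
import math


def rrmse_percent(observed, predicted):
    """Relative RMSE in percent: 100 * RMSE / mean(observed).

    Assumed definition for illustration; the paper may use another.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    mean_observed = sum(observed) / n
    if mean_observed == 0:
        raise ValueError("mean of observed values must be non-zero")
    return 100.0 * math.sqrt(mse) / mean_observed


# Made-up yields in arbitrary units, used only to exercise the function.
value = rrmse_percent([500.0, 520.0, 480.0], [510.0, 510.0, 490.0])
```

Under this definition, an RRMSE below 5% means the typical prediction error is less than one twentieth of the average observed yield.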

Using Mechanical Learning Analysis of Determinants of Housing Sales and Establishment of Forecasting Model (기계학습을 활용한 주택매도 결정요인 분석 및 예측모델 구축)

  • Kim, Eun-mi;Kim, Sang-Bong;Cho, Eun-seo
    • Journal of Cadastre & Land InformatiX
    • /
    • v.50 no.1
    • /
    • pp.181-200
    • /
    • 2020
  • This study used the OLS model to estimate the determinants affecting the tenure of a home and then compared its predictive power with that of SVM, Decision Tree, Random Forest, Gradient Boosting, XGBoost, and LightGBM. It differs from preceding studies in that the Stacking model, one of the ensemble models, is used to establish a more predictive model for identifying the volume of housing transactions in the housing market. OLS analysis showed that sales profits, housing prices, the number of household members, and the type of residential housing (detached housing, apartments) affected the period of housing ownership. When predictability was compared with the machine learning models using RMSE, the machine learning models showed higher predictability. Afterwards, the predictive power of each machine learning model was compared after rebuilding the data with the influencing variables, and Random Forest showed the best predictive power. In addition, the most predictive models (Random Forest, Decision Tree, Gradient Boosting, and XGBoost) were applied as individual models, and the Stacking model was constructed using Linear, Ridge, and Lasso models as meta models. As a result of the analysis, the RMSE value was lowest with the Ridge model, at 0.5181, yielding the most predictive model.

Correlation Analysis of Airline Customer Satisfaction using Random Forest with Deep Neural Network and Support Vector Machine Model

  • Hong, Sang Hoon;Kim, Bumsu;Jung, Yong Gyu
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.12 no.4
    • /
    • pp.26-32
    • /
    • 2020
  • There are many airline customer evaluation data sets, but they are insufficient for predicting customer satisfaction in practice. In particular, there has been little verification of the value of the data or development of customer satisfaction prediction models based on customer evaluation data. In this paper, airline customer satisfaction is analyzed through a correlation analysis experiment on customer evaluation data provided by Google's Kaggle. Accuracy differed across three variable sets: all variables, the top 4 variables, and the top 8 variables with the highest correlation. To build an airline customer satisfaction prediction model, these variable sets are applied to three classification algorithms (Random Forest, SVM, and DNN) in a classification experiment. The data are divided into training data and verification data at a ratio of 7:3. As a result, the DNN model showed the lowest accuracy at 86.4%, followed by the SVM model at 89%, while the Random Forest model showed the highest accuracy and performance at 95.7%.

The Effect of Highland Weather and Soil Information on the Prediction of Chinese Cabbage Weight (기상 및 토양정보가 고랭지배추 단수예측에 미치는 영향)

  • Kwon, Taeyong;Kim, Rae Yong;Yoon, Sanghoo
    • Journal of Environmental Science International
    • /
    • v.28 no.8
    • /
    • pp.701-707
    • /
    • 2019
  • Highland farming is agriculture that takes place 400 m above sea level and typically involves both low temperatures and long sunshine hours. Most highland Chinese cabbages are harvested in Gangwon province. The Ubiquitous Sensor Network (USN) has been deployed to observe Chinese cabbage growth because of the lack of installed weather stations in the highlands. Five representative Chinese cabbage cultivation spots were selected for USN and meteorological data collection between 2015 and 2017. The purpose of this study is to develop a weight prediction model for Chinese cabbages using the meteorological and growth data that were collected one week prior. Both a regression model and a random forest model were considered for this study, with the regression assumptions being satisfied. The Root Mean Square Error (RMSE) was used to evaluate the predictive performance of the models. In the regression model, the variables influencing the weight of cabbage were the number of cabbage leaves, wind speed, precipitation, and soil electrical conductivity. In the random forest model, cabbage width, the number of cabbage leaves, soil temperature, precipitation, temperature, soil moisture at a depth of 30 cm, cabbage leaf width, soil electrical conductivity, humidity, and cabbage leaf length were screened. The RMSE of the random forest model was 265.478, lower than that of the regression model (404.493); this is because the random forest model could explain nonlinearity.