• Title/Summary/Keyword: MSE for prediction


Prediction of Postoperative Lung Function in Lung Cancer Patients Using Machine Learning Models

  • Oh Beom Kwon;Solji Han;Hwa Young Lee;Hye Seon Kang;Sung Kyoung Kim;Ju Sang Kim;Chan Kwon Park;Sang Haak Lee;Seung Joon Kim;Jin Woo Kim;Chang Dong Yeo
    • Tuberculosis and Respiratory Diseases
    • /
    • v.86 no.3
    • /
    • pp.203-215
    • /
    • 2023
  • Background: Surgical resection is the standard treatment for early-stage lung cancer. Since postoperative lung function is related to mortality, predicted postoperative lung function is used to determine the treatment modality. The aim of this study was to evaluate the predictive performance of linear regression and machine learning models. Methods: We extracted data from the Clinical Data Warehouse and developed three sets: set I, the linear regression model; set II, machine learning models omitting the missing data; and set III, machine learning models imputing the missing data. Six machine learning models were implemented: the least absolute shrinkage and selection operator (LASSO), Ridge regression, ElasticNet, Random Forest, eXtreme gradient boosting (XGBoost), and the light gradient boosting machine (LightGBM). The forced expiratory volume in 1 second measured 6 months after surgery was defined as the outcome. Five-fold cross-validation was performed for hyperparameter tuning of the machine learning models. The dataset was split into training and test datasets at a 70:30 ratio. Implementation was done after dataset splitting in set III. Predictive performance was evaluated by R2 and mean squared error (MSE) in the three sets. Results: A total of 1,487 patients were included in sets I and III, and 896 patients were included in set II. In set I, the R2 value was 0.27. In set II, LightGBM was the best model, with the highest R2 value of 0.5 and the lowest MSE of 154.95. In set III, LightGBM was the best model, with the highest R2 value of 0.56 and the lowest MSE of 174.07. Conclusion: The LightGBM model showed the best performance in predicting postoperative lung function.
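Most entries in this listing report MSE together with R2. As a minimal sketch of how these two metrics are usually defined (the function names `mse` and `r2` and the sample values are illustrative assumptions, not taken from any of the listed papers):

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations and two sets of predictions.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]
mean_only = [2.5, 2.5, 2.5, 2.5]  # always predicts the mean of `observed`

print(mse(observed, perfect), r2(observed, perfect))
print(mse(observed, mean_only), r2(observed, mean_only))
```

MSE is expressed in squared units of the outcome, so values from different studies (or from different subsets of one dataset) are not directly comparable, whereas R2 is unitless. A model that always predicts the mean of the observed values has an R2 of 0.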

Very Short- and Long-Term Prediction Method for Solar Power (초 장단기 통합 태양광 발전량 예측 기법)

  • Mun Seop Yun;Se Ryung Lim;Han Seung Jang
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.6
    • /
    • pp.1143-1150
    • /
    • 2023
  • The global climate crisis and the implementation of low-carbon policies have led to growing interest in renewable energy and a growing number of related industries. Among them, solar power is attracting attention as a representative eco-friendly energy source that is not depleted and does not emit pollutants or greenhouse gases. As a result, the supply of solar power facilities is increasing all over the world. However, solar power is easily affected by environmental factors such as geography and weather, so accurate solar power forecasting is important for stable operation and efficient management. It is very hard to predict the exact amount of solar power using statistical methods. In addition, conventional prediction methods have focused on only short- or long-term prediction, so obtaining prediction models for different prediction horizons takes a long time. Therefore, this study utilizes a many-to-many structure of a recurrent neural network (RNN) to integrate short-term and long-term predictions of solar power generation. We compare various RNN-based very short- and long-term prediction methods for solar power in terms of MSE and R2 values.

Stock prediction using combination of BERT sentiment Analysis and Macro economy index

  • Jang, Euna;Choi, HoeRyeon;Lee, HongChul
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.5
    • /
    • pp.47-56
    • /
    • 2020
  • The stock index is used not only as an economic indicator for a country, but also as an indicator for investment judgment, which is why research into predicting the stock index is ongoing. The task of predicting the stock price index involves technical, basic, and psychological factors, and it is also necessary to consider complex factors for prediction accuracy. Therefore, it is necessary to study models for predicting the stock price index by selecting and reflecting technical and auxiliary factors that affect the fluctuation of the stock price. Most existing studies on this topic are forecasting studies that use news information or macroeconomic indicators that create market fluctuations, or that reflect only a few combinations of indicators. In this paper, we propose an effective combination of news sentiment analysis and various macroeconomic indicators in order to predict the US Dow Jones Index. After crawling more than 93,000 business news articles from the New York Times over two years, the sentiment results analyzed using the latest natural language processing techniques, BERT and NLTK, were combined with five macroeconomic indicators, gold prices, oil prices, and five foreign exchange rates affecting the US economy, and the combination was applied to the prediction algorithm LSTM, which is known to be the most suitable for combining numeric and text information. As a result of experimenting with various combinations, the combination of DJI, NLTK, BERT, OIL, GOLD, and EURUSD in the DJI index prediction yielded the smallest MSE value.

Maximum damage prediction for regular reinforced concrete frames under consecutive earthquakes

  • Amiri, Gholamreza Ghodrati;Rajabi, Elham
    • Earthquakes and Structures
    • /
    • v.14 no.2
    • /
    • pp.129-142
    • /
    • 2018
  • The current paper introduces a new approach for developing a damage index to obtain the maximum damage in reinforced concrete frames caused by as-recorded single and consecutive earthquakes. To do so, two sets of strong ground motions are selected based on maximum and approximately maximum peak ground acceleration (PGA) from the "PEER" and "USGS" centers. Consecutive earthquakes in the first and second groups not only occurred in similar directions and at the same stations, but their real time gaps between successive shocks are also less than 10 minutes and 10 days, respectively. A suite of six concrete moment resisting frames, including 3, 5, 7, 10, 12 and 15 stories, are then designed in OpenSees software and analyzed more than 850 times under the two groups of as-recorded strong ground motion records with and without seismic sequence phenomena. The idealized multilayer artificial neural networks with the least value of Mean Square Error (MSE) and the maximum value of regression (R) between outputs and targets were then employed to generate empirical charts and several correction equations for design utilization. To investigate the effectiveness of the proposed damage index, calibration of the new approach to existing real data (the result of the Park-Ang damage index, 1985) was conducted. The obtained results show good precision of the developed ANN-based model in predicting the maximum damage of regular reinforced concrete frames.

Development of Multilayer Perceptron Model for the Prediction of Alcohol Concentration of Makgeolli

  • Kim, JoonYong;Rho, Shin-Joung;Cho, Yun Sung;Cho, EunSun
    • Journal of Biosystems Engineering
    • /
    • v.43 no.3
    • /
    • pp.229-236
    • /
    • 2018
  • Purpose: Makgeolli is a traditional alcoholic beverage made from rice with a fermentation starter called "nuruk." The concentration of alcohol in makgeolli depends on the temperature of the fermentation tank. It is important to monitor the alcohol concentration to manage the makgeolli production process. Methods: Data were collected from 84 makgeolli fermentation tanks over a one-year period. Independent variables included the temperatures of the tanks and the room where the tanks were located, as well as the quantity, acidity, and water concentration of the source. Software for the multilayer perceptron (MLP) model was written in Python using the Scikit-learn library. Results: Many models were created for which the optimization converged within 100 iterations, and their coefficients of determination $R^2$ were considerably high. The coefficients of determination $R^2$ of the best model with the training set and the test set were 0.94 and 0.93, respectively. The fact that the difference between them was very small indicated that the model was not overfitted. The maximum and minimum error was approximately 2% and the total MSE was 0.078%. Conclusions: The MLP model could help predict the alcohol concentration and control the production process of makgeolli. In future research, the optimization of the production process will be studied based on the model.

Prediction of unconfined compressive and Brazilian tensile strength of fiber reinforced cement stabilized fly ash mixes using multiple linear regression and artificial neural network

  • Chore, H.S.;Magar, R.B.
    • Advances in Computational Design
    • /
    • v.2 no.3
    • /
    • pp.225-240
    • /
    • 2017
  • This paper presents the application of multiple linear regression (MLR) and artificial neural network (ANN) techniques for developing models to predict the unconfined compressive strength (UCS) and Brazilian tensile strength (BTS) of fiber reinforced cement stabilized fly ash mixes. UCS and BTS are highly nonlinear functions of the mix constituents, which makes their modeling and prediction a difficult task. To establish the relationship between the independent and dependent variables, a computational technique like ANN is employed, which provides an efficient and easy approach to model the complex and nonlinear relationship. The data generated in the laboratory through a systematic experimental programme for evaluating UCS and BTS of fiber reinforced cement fly ash mixes with respect to 7, 14 and 28 days' curing are used for development of the MLR and ANN models. The data used in the models are arranged in the format of four input parameters that cover the contents of cement and fibers along with maximum dry density (MDD) and optimum moisture content (OMC), respectively, and one dependent variable, either the unconfined compressive strength or the Brazilian tensile strength. ANN models are trained and tested for various combinations of input and output data sets. Performance of the networks is checked with the statistical error criteria of correlation coefficient (R), mean square error (MSE) and mean absolute error (MAE). It is observed that the ANN model predicts both the unconfined compressive and the Brazilian tensile strength quite well in terms of R, RMSE and MAE. This study shows that, as an alternative to classical modeling techniques, the ANN approach can be used accurately for predicting the unconfined compressive strength and Brazilian tensile strength of fiber reinforced cement stabilized fly ash mixes.

Magnetic Flux Leakage (MFL) based Defect Characterization of Steam Generator Tubes using Artificial Neural Networks

  • Daniel, Jackson;Abudhahir, A.;Paulin, J. Janet
    • Journal of Magnetics
    • /
    • v.22 no.1
    • /
    • pp.34-42
    • /
    • 2017
  • Material defects in the Steam Generator Tubes (SGT) of a sodium-cooled fast breeder reactor (PFBR) can lead to leakage of water into sodium. The reaction between water and sodium can lead to major accidents. Therefore, the examination of steam generator tubes for the early detection of defects is an important requirement for safety and economic considerations. In this work, the Magnetic Flux Leakage (MFL) based Non-Destructive Testing (NDT) technique is used to perform the defect detection process. The rectangular notch defects on the outer surface of steam generator tubes are modeled using COMSOL Multiphysics 4.3a software. The obtained MFL images are de-noised to improve the integrity of flaw-related information. Grey Level Co-occurrence Matrix (GLCM) features are extracted from the MFL images and taken as input parameters to train the neural network. A comparative study on characterization has been carried out using feed-forward back propagation (FFBP) and cascade-forward back propagation (CFBP) algorithms. The results of both algorithms are evaluated with Mean Square Error (MSE) as a prediction performance measure. The average percentage errors for length, depth and width are also computed. The results show that the feed-forward back propagation network model performs better in characterizing the defects.

Prediction of Net Irrigation Water Requirement in paddy field Based on Machine Learning (머신러닝 기법을 활용한 논 순용수량 예측)

  • Kim, Soo-Jin;Bae, Seung-Jong;Jang, Min-Won
    • Journal of Korean Society of Rural Planning
    • /
    • v.28 no.4
    • /
    • pp.105-117
    • /
    • 2022
  • This study tested SVM (support vector machine), RF (random forest), and ANN (artificial neural network) machine-learning models that can predict net irrigation water requirements in paddy fields. For the Jeonju and Jeongeup meteorological stations, the net irrigation water requirement was calculated using K-HAS from 1981 to 2021 and set as the label. For each algorithm, twelve models were constructed based on cumulative precipitation, precipitation, crop evapotranspiration, and month. Compared to the CE model, the R2 of the CEP model was higher, and MAE, RMSE, and MSE were lower. Comprehensively considering learning performance and learning time, it is judged that the RF algorithm has the best usability and that the predictive power for five days is better than for three days. The results of this study are expected to provide the scientific information necessary for the decision-making of on-site water managers through connection with weather forecast data. In the future, if the actual amounts of irrigation and supply are measured, it will be necessary to develop a learning model that reflects them.

Real-time prediction on the slurry concentration of cutter suction dredgers using an ensemble learning algorithm

  • Han, Shuai;Li, Mingchao;Li, Heng;Tian, Huijing;Qin, Liang;Li, Jinfeng
    • International conference on construction engineering and project management
    • /
    • 2020.12a
    • /
    • pp.463-481
    • /
    • 2020
  • Cutter suction dredgers (CSDs) are widely used in various dredging constructions such as channel excavation, wharf construction, and reef construction. During a CSD construction, the main operation is to control the swing speed of the cutter to keep the slurry concentration in a proper range. However, the slurry concentration cannot be monitored in real time, i.e., there is a "time-lag effect" in the log of slurry concentration, making it difficult for operators to make the optimal control decision. To address this issue, a solution scheme that uses real-time monitored indicators to predict the current slurry concentration is proposed in this research. The characteristics of the CSD monitoring data are first studied, and a set of preprocessing methods are presented. Then we put forward the concept of "index class" to select the important indices. Finally, an ensemble learning algorithm is set up to fit the relationship between the slurry concentration and the indices of the index classes. In the experiment, log data over seven days of a practical dredging construction are collected. For comparison, the Deep Neural Network (DNN), Long Short-Term Memory (LSTM), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Bayesian Ridge algorithms are tried. The results show that our method has the best performance, with an R2 of 0.886 and a mean square error (MSE) of 5.538. This research provides an effective way to predict the slurry concentration of CSDs in real time and can help to improve the stationarity and production efficiency of dredging construction.

A Demand Forecasting for Aircraft Spare Parts using ARIMA (ARIMA를 이용한 항공기 수리부속의 수요 예측)

  • Park, Young-Jin;Jeon, Geon-Wook
    • Journal of the military operations research society of Korea
    • /
    • v.34 no.2
    • /
    • pp.79-101
    • /
    • 2008
  • This study aims to improve the method for forecasting repair part demand for Republic of Korea Air Force aircraft. The demand prediction methods currently in use are the weighted moving average, linear moving average, trend analysis, simple exponential smoothing, and linear exponential smoothing. However, these methods use fixed weights and a fixed moving average range. In addition, NORS (Not Operationally Ready Supply) is increasing. The recommended method, Box-Jenkins ARIMA, can solve the problems of these methods and improve estimation accuracy. The current prediction methods and ARIMA are compared using the mean squared error (MSE), which reacts sensitively to changes in error. ARIMA has higher accuracy than the existing forecasting methods. If the method of this study is applied to other items, its demand forecasting capability can be further demonstrated.
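The comparison described in this abstract, ranking forecasting methods by the MSE of their forecasts, can be sketched as follows. This is only an illustration under stated assumptions: the demand series is made up, the window and smoothing factor are arbitrary, and ARIMA itself is not implemented here. Only two of the baseline methods named in the abstract (moving average and simple exponential smoothing) are shown.

```python
def mse(actual, forecast):
    """Mean squared error between actual values and forecasts."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def moving_average_forecasts(series, window):
    """One-step-ahead forecasts: mean of the previous `window` observations."""
    return [sum(series[i - window:i]) / window
            for i in range(window, len(series))]


def exp_smoothing_forecasts(series, alpha, start):
    """One-step-ahead simple exponential smoothing forecasts for series[start:]."""
    level = series[0]  # initialise the level with the first observation
    forecasts = []
    for i in range(1, len(series)):
        if i >= start:
            forecasts.append(level)  # forecast made before seeing series[i]
        level = alpha * series[i] + (1 - alpha) * level
    return forecasts


# Hypothetical monthly demand for one spare part (not data from the paper).
demand = [10, 12, 11, 13, 12, 14]
window = 2  # assumed fixed moving-average range
alpha = 0.5  # assumed fixed smoothing weight

actual = demand[window:]  # score both methods on the same periods
ma_mse = mse(actual, moving_average_forecasts(demand, window))
es_mse = mse(actual, exp_smoothing_forecasts(demand, alpha, start=window))
print(f"moving average MSE: {ma_mse:.3f}")
print(f"exponential smoothing MSE: {es_mse:.3f}")
```

Because the errors are squared, a single large miss raises the MSE much more than several small ones, which is the sensitivity to changes in error that the abstract refers to. Both methods are scored on the same periods so that the two MSE values are comparable.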