• Title/Summary/Keyword: mean squared prediction error

Enhancement of durability of tall buildings by using deep-learning-based predictions of wind-induced pressure

  • K.R. Sri Preethaa;N. Yuvaraj;Gitanjali Wadhwa;Sujeen Song;Se-Woon Choi;Bubryur Kim
    • Wind and Structures / v.36 no.4 / pp.237-247 / 2023
  • The emergence of high-rise buildings has necessitated frequent structural health monitoring and maintenance for safety reasons. Wind causes damage and structural changes in tall structures; thus, safe structures should be designed. The pressure developed on tall buildings has been used in previous studies to assess the impact of wind on structures. The wind tunnel test is a primary research method commonly used to quantify the aerodynamic characteristics of high-rise buildings. Wind pressure is measured by placing pressure sensor taps at different locations on tall buildings, and the collected data are used for analysis. However, sensors may malfunction and produce erroneous data; these data losses make it difficult to analyze aerodynamic properties. Therefore, it is essential to generate the missing data from the original data obtained from neighboring pressure sensor taps at various intervals. This study proposes a deep-learning-based model, the deep convolutional generative adversarial network (DCGAN), to restore missing data associated with faulty pressure sensors installed on high-rise buildings. The performance of the proposed DCGAN is validated against a standard imputation model known as the generative adversarial imputation network (GAIN). The average mean-square error (AMSE) and average R-squared (ARSE) are used as performance metrics. The ARSE values obtained by DCGAN on the front, back, left, and right sides of the building model are 0.970, 0.972, 0.984, and 0.978, respectively. The AMSE values produced by DCGAN on the four sides of the building model are 0.008, 0.010, 0.015, and 0.014. The average standard deviations of the actual pressure sensor measurements on the four sides of the model were 0.1738, 0.1758, 0.2234, and 0.2278. The average standard deviations of the pressure values generated by the proposed DCGAN imputation model were closer to the measured values, at 0.1736, 0.1746, 0.2191, and 0.2239 on the four sides, respectively. In comparison, the standard deviations of the values predicted by GAIN are 0.1726, 0.1735, 0.2161, and 0.2209, which are farther from the actual values. The results demonstrate that the DCGAN model is better suited to data imputation than the GAIN model, with improved accuracy and lower error rates. Additionally, the DCGAN is used to estimate the wind pressure in regions of buildings where no pressure sensor taps are available; the model yielded greater prediction accuracy than GAIN.
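The two performance metrics named in this abstract can be sketched as follows. This is only an illustration, not the authors' code: it assumes that AMSE and ARSE are the per-tap mean squared error and R-squared averaged over pressure taps, and the function names and toy values are hypothetical.

```python
from statistics import mean

def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def average_over_taps(metric, actual_by_tap, predicted_by_tap):
    """Apply a metric to each tap's series and average the results (assumed meaning of AMSE/ARSE)."""
    return mean(metric(a, p) for a, p in zip(actual_by_tap, predicted_by_tap))

# Hypothetical toy data: two taps, three readings each.
actual = [[0.1, 0.3, 0.5], [0.2, 0.2, 0.8]]
imputed = [[0.1, 0.3, 0.5], [0.2, 0.4, 0.6]]

amse = average_over_taps(mse, actual, imputed)
arse = average_over_taps(r_squared, actual, imputed)
print(f"AMSE={amse:.4f}  ARSE={arse:.4f}")
```

A lower AMSE and an ARSE closer to 1 indicate imputed values that track the measured pressures more closely.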

A Comparative Study On Accident Prediction Model Using Nonlinear Regression And Artificial Neural Network, Structural Equation for Rural 4-Legged Intersection (비선형 회귀분석, 인공신경망, 구조방정식을 이용한 지방부 4지 신호교차로 교통사고 예측모형 성능 비교 연구)

  • Oh, Ju Taek;Yun, Ilsoo;Hwang, Jeong Won;Han, Eum
    • Journal of Korean Society of Transportation / v.32 no.3 / pp.266-279 / 2014
  • Diverse methods have been applied to evaluate roadway safety, including before-after studies, simple comparisons using historical traffic accident data, and methods based on expert opinion or the literature. In particular, many research efforts have developed traffic accident prediction models to identify the critical elements causing accidents and to evaluate the level of safety. A traffic accident prediction model must secure predictability and transferability. With predictability, the model can predict the frequency of accidents more accurately, both qualitatively and quantitatively. With transferability, the model can be used at other locations with acceptable accuracy. To this end, traffic accident prediction models using non-linear regression, an artificial neural network, and a structural equation were developed in this study. The predictability and transferability of the three models were compared using a model development data set collected from 90 signalized intersections and a model validation data set from 33 other signalized intersections, based on mean absolute deviation and mean squared prediction error. In the comparison using the model development data set, the artificial neural network showed the highest predictability. However, the non-linear regression model was found to be the most appropriate in the comparison using the model validation data set. In conclusion, the artificial neural network has a strong ability to represent the relationship between the frequency of traffic accidents and traffic and road design elements. However, its predictability decreased significantly when it was applied to new data that were not used in model development.
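The two comparison criteria in this abstract, mean absolute deviation (MAD) and mean squared prediction error (MSPE), are commonly computed on a validation set as sketched below. The definitions are the usual textbook ones and are assumed here; the accident counts are hypothetical and do not come from the study.

```python
def mad(observed, predicted):
    """Mean absolute deviation: average of |observed - predicted|."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mspe(observed, predicted):
    """Mean squared prediction error: average of (observed - predicted)^2 on held-out data."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical accident counts at four validation intersections.
observed = [2, 0, 5, 3]
predicted = [1.5, 1.0, 4.0, 3.5]

print(mad(observed, predicted), mspe(observed, predicted))
```

Computing both criteria on the development set and again on the validation set is one way to expose the pattern the abstract reports, where a flexible model fits the development data well but transfers poorly.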

Prediction of Seasonal Nitrate Concentration in Springs on the Southern Slope of Jeju Island using Multiple Linear Regression of Geographic Spatial Data (지리 공간 자료의 다중회귀분석을 이용한 제주도 남측사면 용천수의 시기별 질산성 질소 농도 예측)

  • Jung, Youn-Young;Koh, Dong-Chan;Kang, Bong-Rae;Ko, Kyung-Suk;Yu, Yong-Jae
    • Economic and Environmental Geology / v.44 no.2 / pp.135-152 / 2011
  • Nitrate concentrations in springs on the southern slope of Jeju Island were predicted using multiple linear regression (MLR) of spatial variables including hydrogeological parameters and land use characteristics. The springs showed a wide range of nitrate concentrations, from <0.02 to 86 mg/L, with a mean of 20 mg/L. Spatial variables were generated for a circular buffer, with the optimal buffer radius set to 400 m. Selected regression models were tested using p values and Durbin-Watson statistics. Explanatory variables were selected using the adjusted $R^2$, Cp (total squared error), AIC (Akaike's Information Criterion), and significance. In addition, mutual linear relations between variables were considered. A small portion of springs, usually <10% of the total samples, were identified as outliers, indicating the limitations of MLR using circular buffers. The adjusted $R^2$ of the proposed models improved from 0.75 to 0.87 when outliers were eliminated. In particular, the areal proportion of natural area had the greatest influence on the nitrate concentrations in springs. Among anthropogenic land uses, the influence on nitrate contamination decreases in the order of orchard, residential area, and dry farmland. The quality of springs in the study area appears to be controlled by land use rather than by hydrogeological parameters. Above all, it is worth highlighting that the contamination susceptibility of springs is highly sensitive to nearby land uses, in particular orchards.
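For reference, the adjusted $R^2$ used above for variable selection has the standard form below. The symbols $n$ (number of samples, here springs) and $p$ (number of explanatory variables) are introduced only for this illustration; the abstract does not report their values.

$$R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}$$

Unlike the plain $R^2$, this quantity can decrease when an added variable does not improve the fit enough to offset the lost degree of freedom, which is why it is suited to comparing models with different numbers of variables.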

Development of groundwater level monitoring and forecasting technique for drought analysis (II) - Groundwater drought forecasting Using SPI, SGI and ANN (가뭄 분석을 위한 지하수위 모니터링 및 예측기법 개발(II) - 표준강수지수, 표준지하수지수 및 인공신경망을 이용한 지하수 가뭄 예측)

  • Lee, Jeongju;Kang, Shinuk;Kim, Taeho;Chun, Gunil
    • Journal of Korea Water Resources Association / v.51 no.11 / pp.1021-1029 / 2018
  • A primary objective of this study is to develop a drought forecasting technique based on groundwater, which can be exploited for water supply under drought stress. For this purpose, we explored the lagged relationships between the regionalized SGI (standardized groundwater level index) and SPI (standardized precipitation index) in view of drought propagation. A regional prediction model was constructed using a NARX (nonlinear autoregressive exogenous) artificial neural network model, which can effectively capture nonlinear relationships with the lagged independent variable. During the training phase, model performance was found to be satisfactory, with a correlation coefficient over 0.7. Model performance was also described by the root mean squared error (RMSE). It can be concluded that the proposed approach is able to provide reliable SGI forecasts along with the rainfall forecasts provided by the Korea Meteorological Administration.

Prediction of Postoperative Lung Function in Lung Cancer Patients Using Machine Learning Models

  • Oh Beom Kwon;Solji Han;Hwa Young Lee;Hye Seon Kang;Sung Kyoung Kim;Ju Sang Kim;Chan Kwon Park;Sang Haak Lee;Seung Joon Kim;Jin Woo Kim;Chang Dong Yeo
    • Tuberculosis and Respiratory Diseases / v.86 no.3 / pp.203-215 / 2023
  • Background: Surgical resection is the standard treatment for early-stage lung cancer. Since postoperative lung function is related to mortality, predicted postoperative lung function is used to determine the treatment modality. The aim of this study was to evaluate the predictive performance of linear regression and machine learning models. Methods: We extracted data from the Clinical Data Warehouse and developed three sets: set I, the linear regression model; set II, machine learning models omitting the missing data; and set III, machine learning models imputing the missing data. Six machine learning models, the least absolute shrinkage and selection operator (LASSO), Ridge regression, ElasticNet, Random Forest, eXtreme gradient boosting (XGBoost), and the light gradient boosting machine (LightGBM), were implemented. The forced expiratory volume in 1 second measured 6 months after surgery was defined as the outcome. Five-fold cross-validation was performed for hyperparameter tuning of the machine learning models. The dataset was split into training and test datasets at a 70:30 ratio. Implementation was done after dataset splitting in set III. Predictive performance was evaluated by $R^2$ and mean squared error (MSE) in the three sets. Results: A total of 1,487 patients were included in sets I and III, and 896 patients were included in set II. In set I, the $R^2$ value was 0.27. In set II, LightGBM was the best model, with the highest $R^2$ value of 0.5 and the lowest MSE of 154.95. In set III, LightGBM was the best model, with the highest $R^2$ value of 0.56 and the lowest MSE of 174.07. Conclusion: The LightGBM model showed the best performance in predicting postoperative lung function.

A simulation study for various propensity score weighting methods in clinical problematic situations (임상에서 발생할 수 있는 문제 상황에서의 성향 점수 가중치 방법에 대한 비교 모의실험 연구)

  • Siseong Jeong;Eun Jeong Min
    • The Korean Journal of Applied Statistics / v.36 no.5 / pp.381-397 / 2023
  • The most representative design used in clinical trials is randomization, which is used to accurately estimate the treatment effect. However, a comparison between the treatment group and the control group in an observational study without randomization is biased by various unadjusted differences, such as differences in patient characteristics. Propensity score weighting is a widely used method to address these problems, minimizing bias by adjusting for confounders when assessing treatment effects. Inverse probability weighting, the most popular method, assigns weights that are proportional to the inverse of the conditional probability of receiving a specific treatment assignment, given the observed covariates. However, this method often suffers from extreme propensity scores, resulting in biased estimates and excessive variance. Several alternative methods, including trimming, overlap weights, and matching weights, have been proposed to mitigate these issues. In this paper, we conduct a simulation study to compare the performance of various propensity score weighting methods under diverse situations, such as limited overlap, a misspecified propensity score, and treatment contrary to prediction. In the simulation results, overlap weights and matching weights consistently outperform inverse probability weighting and trimming in terms of bias, root mean squared error, and coverage probability.
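The weighting schemes compared in this abstract can be written as functions of the propensity score e (the estimated probability of receiving treatment). The sketch below uses the standard textbook forms of these weights; it is not taken from the paper, and the trimming threshold `alpha=0.1` is a hypothetical choice.

```python
def ipw(e, treated):
    """Inverse probability weight: 1/e for treated units, 1/(1-e) for controls."""
    return 1.0 / e if treated else 1.0 / (1.0 - e)

def overlap(e, treated):
    """Overlap weight: 1-e for treated units, e for controls (always bounded in [0, 1])."""
    return 1.0 - e if treated else e

def matching(e, treated):
    """Matching weight: min(e, 1-e) times the inverse probability weight (bounded by 1)."""
    return min(e, 1.0 - e) * ipw(e, treated)

def trimmed_ipw(e, treated, alpha=0.1):
    """IPW with symmetric trimming: units with e outside [alpha, 1-alpha] get weight 0."""
    return ipw(e, treated) if alpha <= e <= 1.0 - alpha else 0.0

# A treated unit with an extreme propensity score of 0.02.
for name, fn in [("ipw", ipw), ("overlap", overlap),
                 ("matching", matching), ("trimmed", trimmed_ipw)]:
    print(name, round(fn(0.02, True), 4))
```

The example shows why extreme scores are a problem for inverse probability weighting: a single treated unit with e = 0.02 receives a weight of about 50, while the overlap and matching weights stay bounded and trimming discards the unit.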

Prediction accuracy of incisal points in determining occlusal plane of digital complete dentures

  • Kenta Kashiwazaki;Yuriko Komagamine;Sahaprom Namano;Ji-Man Park;Maiko Iwaki;Shunsuke Minakuchi;Manabu Kanazawa
    • The Journal of Advanced Prosthodontics / v.15 no.6 / pp.281-289 / 2023
  • PURPOSE. This study aimed to predict the positional coordinates of incisor points from the scan data of conventional complete dentures and verify their accuracy. MATERIALS AND METHODS. The standard triangulated language (STL) data of 100 scanned pairs of complete upper and lower dentures were imported into computer-aided design software, from which the position coordinates of the points corresponding to each landmark of the jaw were obtained. The x, y, and z coordinates of the incisor point (XP, YP, and ZP) were obtained from the maxillary and mandibular landmark coordinates using regression or calculation formulas, and the accuracy was verified by determining the deviation between the measured and predicted coordinate values. YP was obtained in two ways, using the hamular-incisive-papilla plane (HIP) and facial measurements. Multiple regression analysis was used to predict ZP. The root mean squared error (RMSE) values were used to verify the accuracy of XP and YP. To verify the accuracy of ZP, the RMSE value was obtained after cross-validation using the remaining 30 cases of denture STL data. RESULTS. The RMSE was 2.22 for predicting XP. When predicting YP, the RMSE values of the methods using the HIP plane and facial measurements were 3.18 and 0.73, respectively. Cross-validation revealed the RMSE for ZP to be 1.53. CONCLUSION. YP and ZP could be predicted from anatomical landmarks of the maxillary and mandibular edentulous jaw, suggesting that YP could be predicted with better accuracy with the addition of the position of the lower border of the upper lip.

Predicting Forest Gross Primary Production Using Machine Learning Algorithms (머신러닝 기법의 산림 총일차생산성 예측 모델 비교)

  • Lee, Bora;Jang, Keunchang;Kim, Eunsook;Kang, Minseok;Chun, Jung-Hwa;Lim, Jong-Hwan
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.1 / pp.29-41 / 2019
  • Terrestrial Gross Primary Production (GPP) is the largest global carbon flux, and forest ecosystems are important because of their ability to store much larger amounts of carbon than other terrestrial ecosystems. There have been several attempts to estimate GPP using mechanism-based models. However, mechanism-based models, which include biological, chemical, and physical processes, are limited by a lack of flexibility in predicting non-stationary ecological processes caused by local and global change. Instead, mechanism-free methods are strongly recommended for estimating nonlinear dynamics that occur in nature, such as GPP. Therefore, we used mechanism-free machine learning techniques to estimate the daily GPP. In this study, support vector machine (SVM), random forest (RF), and artificial neural network (ANN) models were used and compared with the traditional multiple linear regression model (LM). MODIS products and meteorological parameters from eddy covariance data were employed to train the machine learning and LM models from 2006 to 2013. The GPP prediction models were compared with daily GPP from eddy covariance measurements in a deciduous forest in South Korea in 2014 and 2015. Statistical measures including the correlation coefficient (R), root mean square error (RMSE), and mean squared error (MSE) were used to evaluate the performance of the models. In general, the models from machine-learning algorithms (R = 0.85 - 0.93, MSE = 1.00 - 2.05, p < 0.001) showed better performance than the linear regression model (R = 0.82 - 0.92, MSE = 1.24 - 2.45, p < 0.001). These results provide insight into the high predictability and the possibility of expansion offered by mechanism-free machine-learning models and remote sensing for predicting non-stationary ecological processes such as seasonal GPP.

Development of Unfolding Energy Spectrum with Clinical Linear Accelerator based on Transmission Data (물질투과율 측정정보 기반 의료용 선형가속기의 에너지스펙트럼 유도기술 개발)

  • Choi, Hyun Joon;Park, Hyo Jun;Yoo, Do Hyeon;Kim, Byoung-Chul;Yi, Chul-Young;Min, Chul Hee
    • Journal of Radiation Protection and Research / v.41 no.1 / pp.41-47 / 2016
  • Background: For accurate dose assessment in radiation therapy, the energy spectrum of the photon beam generated from the linac head is essential. The aim of this study is to develop a technique to accurately unfold the energy spectrum with the transmission analysis method. Materials and Methods: A clinical linear accelerator and the Monte Carlo method were employed to evaluate the transmission signals according to the thickness of the observer material, and then the response function of the ion chamber was determined with mono-energy beams. Finally, the energy spectrum was unfolded with the HEPROW program. The Elekta Synergy Platform and the Geant4 toolkit were used in this study. Results and Discussion: In the comparison between calculated and measured transmission signals using aluminum alloy as an attenuator, the root mean squared error was 0.43%. In the comparison between the spectrum unfolded using the HEPROW program and the spectrum calculated using Geant4, the differences in peak and mean energy were 0.066 and 0.03 MeV, respectively. However, for accurate prediction of the energy spectrum, additional experiments with various types of material and improvement of the unfolding program are required. Conclusion: This research demonstrates that the spectrum unfolding technique can be used for megavoltage photon beams with aluminum alloy and the HEPROW program.

A Study on Developing a VKOSPI Forecasting Model via GARCH Class Models for Intelligent Volatility Trading Systems (지능형 변동성트레이딩시스템개발을 위한 GARCH 모형을 통한 VKOSPI 예측모형 개발에 관한 연구)

  • Kim, Sun-Woong
    • Journal of Intelligence and Information Systems / v.16 no.2 / pp.19-32 / 2010
  • Volatility plays a central role in both academic and practical applications, especially in pricing financial derivative products and in volatility trading strategies. This study presents a novel mechanism based on generalized autoregressive conditional heteroskedasticity (GARCH) models that is able to enhance the performance of intelligent volatility trading systems by predicting Korean stock market volatility more accurately. In particular, we embedded the concept of volatility asymmetry, documented widely in the literature, into our model. The newly developed Korean stock market volatility index of the KOSPI 200, the VKOSPI, is used as a volatility proxy. It is the price of a linear portfolio of KOSPI 200 index options and measures the effect of the expectations of dealers and option traders on stock market volatility for 30 calendar days. The KOSPI 200 index options market started in 1997 and has become the most actively traded market in the world. Its trading volume is more than 10 million contracts a day, the highest of all the stock index option markets. Therefore, analyzing the VKOSPI is of great importance in understanding the volatility inherent in option prices and can offer some trading ideas for futures and option dealers. Use of the VKOSPI as a volatility proxy avoids the statistical estimation problems associated with other measures of volatility, since the VKOSPI is the model-free expected volatility of market participants calculated directly from transacted option prices. This study estimates symmetric and asymmetric GARCH models for the KOSPI 200 index from January 2003 to December 2006 by the maximum likelihood procedure. The asymmetric GARCH models include the GJR-GARCH model of Glosten, Jagannathan and Runkle, the exponential GARCH model of Nelson, and the power autoregressive conditional heteroskedasticity (ARCH) model of Ding, Granger and Engle. The symmetric GARCH model refers to the basic GARCH(1,1).
Tomorrow's forecasted value and change direction of stock market volatility are obtained by recursive GARCH specifications from January 2007 to December 2009 and are compared with the VKOSPI. Empirical results indicate that negative unanticipated returns increase volatility more than positive return shocks of equal magnitude decrease volatility, indicating the existence of volatility asymmetry in the Korean stock market. The point value and change direction of tomorrow's VKOSPI are estimated and forecasted by the GARCH models. A volatility trading system is developed using the forecasted change direction of the VKOSPI: if tomorrow's VKOSPI is expected to rise, a long straddle or strangle position is established, and a short straddle or strangle position is taken if the VKOSPI is expected to fall tomorrow. Total profit is calculated as the cumulative sum of the VKOSPI percentage changes. If the forecasted direction is correct, the absolute value of the VKOSPI percentage change is added to the trading profit; it is subtracted from the trading profit if the forecasted direction is not correct. For the in-sample period, the power ARCH model fits best according to a statistical metric, the Mean Squared Prediction Error (MSPE), and the exponential GARCH model shows the highest Mean Correct Prediction (MCP). The power ARCH model also fits best for the out-of-sample period and provides the highest probability of correctly forecasting the direction of tomorrow's VKOSPI change. Generally, the power ARCH model shows the best fit for the VKOSPI. All the GARCH models provide trading profits for the volatility trading system, and the exponential GARCH model shows the best performance, an annual profit of 197.56%, during the in-sample period. The GARCH models, except for the exponential GARCH model, yield trading profits during the out-of-sample period. During the out-of-sample period, the power ARCH model shows the largest annual trading profit of 38%.
The volatility clustering and asymmetry found in this research are the reflection of volatility non-linearity. This further suggests that combining the asymmetric GARCH models and artificial neural networks can significantly enhance the performance of the suggested volatility trading system, since artificial neural networks have been shown to effectively model nonlinear relationships.
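The profit rule and the Mean Correct Prediction (MCP) measure described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: MCP is taken to be the fraction of days on which the forecasted direction matches the realized direction, and the function names and numbers are hypothetical.

```python
def direction(x):
    """Sign of a change: +1 for a rise, -1 for a fall, 0 for no change."""
    return (x > 0) - (x < 0)

def evaluate_direction_trading(forecast_changes, actual_pct_changes):
    """Return (cumulative profit, MCP) for a direction-based volatility strategy.

    On each day the absolute realized percentage change is added to the profit
    when the forecasted direction is correct and subtracted when it is not.
    """
    profit = 0.0
    hits = 0
    for forecast, actual in zip(forecast_changes, actual_pct_changes):
        correct = direction(forecast) == direction(actual)
        profit += abs(actual) if correct else -abs(actual)
        hits += correct
    return profit, hits / len(actual_pct_changes)

# Hypothetical forecasted changes and realized VKOSPI percentage changes.
forecasts = [0.5, -0.2, 0.1, -0.3]
realized = [2.0, 1.0, 3.0, -4.0]

profit, mcp = evaluate_direction_trading(forecasts, realized)
print(profit, mcp)
```

Because the profit depends only on the forecasted direction, a model with the lowest MSPE for the point forecast need not be the most profitable one, which is consistent with the abstract reporting different best models for MSPE and for MCP in the in-sample period.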