• Title/Summary/Keyword: Multiple-Linear-Regression

Correlation Analysis of Reservoir Water Quality with respect to Land Use Types of Watersheds (유역 토지이용과 저수지 수질의 상관관계 분석)

  • Youn, Dong-Koun;Chung, Sang-Ok
    • Current Research on Agriculture and Life Sciences / v.24 / pp.49-53 / 2006
  • The objective of this study was to present regression equations between reservoir water quality and the land use types of the watersheds. To derive the regression equations, multiple linear regression analysis was applied to observed data from 88 reservoirs in Kyungpook Province. The measured values of BOD, COD, T-N, and T-P were correlated with the areas of the land use types. In total, 23 regression equations were obtained across the water quality items and watershed sizes. Of these, 2 regression equations have a multiple correlation coefficient (MCC) above 0.90, 10 have MCC values from 0.70 to 0.90, 9 have MCC values from 0.40 to 0.70, and 2 have MCC values from 0.20 to 0.40. The results of this study can be used to estimate reservoir water quality simply and quickly in the planning phase.
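
The multiple correlation coefficient reported above is, for an ordinary least-squares fit with an intercept, the square root of the coefficient of determination. The following is a minimal sketch of that calculation, not the authors' procedure: the data values, the variable names `land_use` and `bod`, and the helper functions are all hypothetical and chosen only for illustration.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(xs, ys):
    """Return [intercept, b1, ..., bk] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def multiple_correlation(coef, xs, ys):
    """MCC = sqrt(R^2) = sqrt(1 - SS_res / SS_tot)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - predict(coef, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


# Hypothetical data: two land-use areas per watershed and a synthetic,
# noise-free water quality value generated from known coefficients.
land_use = [(10.0, 2.0), (20.0, 1.0), (15.0, 4.0), (30.0, 3.0), (25.0, 6.0)]
bod = [1.0 + 0.1 * a + 0.5 * b for a, b in land_use]

coef = fit_mlr(land_use, bod)
mcc = multiple_correlation(coef, land_use, bod)
```

Because the synthetic response is noise-free, the fit recovers the generating coefficients and the MCC is essentially 1; real reservoir data would give lower values, as in the ranges the abstract reports.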

Robust inference for linear regression model based on weighted least squares

  • Park, Jin-Pyo
    • Journal of the Korean Data and Information Science Society / v.13 no.2 / pp.271-284 / 2002
  • In this paper we consider robust inference for the parameters of a linear regression model based on weighted least squares. First, we consider a sequential test for multiple outliers. Next, we suggest a way to assign a weight to each observation $(x_i,\;y_i)$ and recommend a robust inference procedure for the linear model. Finally, to check the performance of the confidence interval for the slope under the proposed method, we conducted a Monte Carlo simulation and present some numerical results and examples.
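
To illustrate how per-observation weights change a fitted slope, here is a minimal sketch of weighted least squares for a single predictor. The abstract does not state the weighting rule, so the weights below (zero weight on one suspected outlier) are an assumption made purely for illustration, as are the data and the function name `weighted_line_fit`.

```python
def weighted_line_fit(xs, ys, ws):
    """Return (intercept, slope) minimizing sum of w_i * (y_i - a - b * x_i)^2."""
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: y = 1 + 2x, except the last point, which is an outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 30.0]

# Equal weights reproduce ordinary least squares; the outlier inflates the slope.
ols_intercept, ols_slope = weighted_line_fit(xs, ys, [1.0] * 5)

# Assumed weights: the suspected outlier receives weight 0.
wls_intercept, wls_slope = weighted_line_fit(xs, ys, [1.0, 1.0, 1.0, 1.0, 0.0])
```

With equal weights the outlier pulls the slope well above 2, while the down-weighted fit recovers the line through the remaining four points.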

TIME SERIES PREDICTION USING INCREMENTAL REGRESSION

  • Kim, Sung-Hyun;Lee, Yong-Mi;Jin, Long;Chai, Duck-Jin;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / v.2 / pp.635-638 / 2006
  • Among conventional prediction techniques in data mining, regression uses the model generated in the training step and applies it to new input data without any change. If such a model is applied directly to time series, prediction accuracy decreases. This paper proposes an incremental regression for time series prediction, such as typhoon track prediction. The technique takes into account that the characteristics of a time series may change over time. It is composed of two steps. The first step executes a fractional process for applying input data to the regression model. The second step updates the model by using this information as new data. Additionally, the model is maintained using only recent data held in a queue. This approach has two advantages. It keeps the minimum information of the model in a matrix, so space complexity is reduced. Moreover, it prevents the error rate from growing by updating the model over time. The accuracy of the proposed method is measured by RME (Relative Mean Error) and RMSE (Root Mean Square Error). In a typhoon track prediction experiment, the proposed technique, IMLR (Incremental Multiple Linear Regression), was more efficient than MLR (Multiple Linear Regression) and SVR (Support Vector Regression).
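
One way to picture a regression model that is kept as a matrix and refreshed from a queue of recent data is to maintain the sums X'X and X'y over a sliding window. The sketch below follows that idea under stated assumptions: the class name `SlidingWindowRegression`, the window size, and the synthetic stream are hypothetical, and this is not claimed to be the IMLR algorithm of the paper.

```python
from collections import deque


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


class SlidingWindowRegression:
    """Linear regression over the most recent `window` observations."""

    def __init__(self, n_features, window):
        k = n_features + 1  # +1 for the intercept column
        self.window = window
        self.queue = deque()
        self.xtx = [[0.0] * k for _ in range(k)]
        self.xty = [0.0] * k

    def _accumulate(self, row, y, sign):
        # Add (sign=+1) or remove (sign=-1) one observation from X'X and X'y.
        for i, ri in enumerate(row):
            self.xty[i] += sign * ri * y
            for j, rj in enumerate(row):
                self.xtx[i][j] += sign * ri * rj

    def update(self, x, y):
        row = [1.0] + list(x)
        self.queue.append((row, y))
        self._accumulate(row, y, +1.0)
        if len(self.queue) > self.window:
            old_row, old_y = self.queue.popleft()
            self._accumulate(old_row, old_y, -1.0)

    def coefficients(self):
        # Needs at least as many distinct observations as coefficients.
        return solve(self.xtx, self.xty)


# Hypothetical stream whose underlying relationship changes at t = 10.
model = SlidingWindowRegression(n_features=1, window=5)
for t in range(20):
    y = 1.0 + 2.0 * t if t < 10 else 5.0 - 1.0 * t
    model.update([float(t)], y)

intercept, slope = model.coefficients()
```

After the stream ends, the window holds only observations from the second regime, so the coefficients reflect the recent relationship rather than an average over the whole history.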

A Comparative Study of the Results of the Regression Analysis by Linear Programming (선형계획법을 이용한 회귀분석 결과의 비교 연구)

  • Kim, Gwang-Su;Jeong, Ji-An;Lee, Jin-Gyu
    • Journal of Korean Society for Quality Management / v.21 no.1 / pp.161-170 / 1993
  • This study presents linear regression analysis involving more than one regressor variable, because regression analysis is the most widely used statistical technique for describing, predicting, and estimating relationships in given data. The multiple linear regression model may be solved directly by two linear programming methods: minimizing the sum of the absolute deviations (MSD) and minimizing the maximum deviation (MMD). In addition, the results of the techniques were compared for accuracy and tested for statistical validity.
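
The two criteria named above have standard linear programming formulations, sketched here for reference. The notation ($\beta_j$ for coefficients, $e_i^{+}$ and $e_i^{-}$ for positive and negative deviations, $d$ for the maximum deviation) is an assumption and may differ from the paper's own.

```latex
% Minimize the sum of absolute deviations (MSD):
\begin{aligned}
\min_{\beta,\,e^{+},\,e^{-}} \quad & \sum_{i=1}^{n} \left( e_i^{+} + e_i^{-} \right) \\
\text{s.t.} \quad & y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij} = e_i^{+} - e_i^{-}, \quad i = 1, \dots, n, \\
& e_i^{+} \ge 0, \quad e_i^{-} \ge 0 .
\end{aligned}

% Minimize the maximum deviation (MMD):
\begin{aligned}
\min_{\beta,\,d} \quad & d \\
\text{s.t.} \quad & -d \le y_i - \beta_0 - \sum_{j=1}^{k} \beta_j x_{ij} \le d, \quad i = 1, \dots, n .
\end{aligned}
```

In both problems the coefficients $\beta_j$ are unrestricted in sign, which is why the absolute value is split into two non-negative parts in the first formulation.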

A Study on Predictive Models based on the Machine Learning for Evaluating the Extent of Hazardous Zone of Explosive Gases (기계학습 기반의 가스폭발위험범위 예측모델에 관한 연구)

  • Jung, Yong Jae;Lee, Chang Jun
    • Korean Chemical Engineering Research / v.58 no.2 / pp.248-256 / 2020
  • In this study, predictive models based on machine learning are developed for evaluating the extent of the hazardous zone of explosive gases. They can provide important guidelines for installing explosion-proof apparatus. To train the predictive models, 1,200 research data sets covering 12 combustible gases and their extents of hazardous zone are generated. The extent of the hazardous zone is set as the output variable, and 12 variables affecting the output are set as input variables. Multiple linear regression, principal component regression, and an artificial neural network are employed to train the predictive models. The mean absolute percentage errors of multiple linear regression, principal component regression, and the artificial neural network are 44.2%, 49.3%, and 5.7%, and the root mean square errors are 1.389 m, 1.602 m, and 0.203 m, respectively. Therefore, it can be concluded that the artificial neural network shows the best performance. This model can be easily used to evaluate the extent of the hazardous zone for explosive gases.
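
The two error measures used in the comparison above can be computed as follows. This is a minimal sketch using the usual definitions of both metrics; the sample values are hypothetical and unrelated to the paper's data.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Requires nonzero actual values."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the data."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical extents of hazardous zone (m) and model predictions (m).
actual = [2.0, 4.0, 5.0]
predicted = [1.0, 5.0, 5.0]

mape_value = mape(actual, predicted)
rmse_value = rmse(actual, predicted)
```

MAPE is relative to each actual value, so the same absolute error counts for more on small targets, while RMSE stays in the unit of the output and penalizes large errors more heavily.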

Prediction of lightweight concrete strength by categorized regression, MLR and ANN

  • Tavakkol, S.;Alapour, F.;Kazemian, A.;Hasaninejad, A.;Ghanbari, A.;Ramezanianpour, A.A.
    • Computers and Concrete / v.12 no.2 / pp.151-167 / 2013
  • Prediction of concrete properties is an important issue for structural engineers, and different methods have been developed for this purpose. Most of these methods are based on experimental data and use measured data for parameter estimation. Three typical methods of output estimation are Categorized Linear Regression (CLR), Multiple Linear Regression (MLR), and Artificial Neural Networks (ANN). In this paper a statistical cleansing method based on CLR is introduced. Afterwards, the MLR and ANN approaches are also employed to predict the compressive strength of structural lightweight aggregate concrete. The valid input domain is briefly discussed. Finally, the results of the three prediction methods are compared to determine the most efficient method. The results indicate that, despite the higher accuracy of ANN, the method has some limitations. These limitations include the high sensitivity of the method to its valid input domain and the selection criteria for determining the most efficient network.

Bayesian inference for an ordered multiple linear regression with skew normal errors

  • Jeong, Jeongmun;Chung, Younshik
    • Communications for Statistical Applications and Methods / v.27 no.2 / pp.189-199 / 2020
  • This paper studies a Bayesian ordered multiple linear regression model with skew normal errors. In applied regression, the inherent information available often requires constraints on the coefficients to be estimated. In addition, the assumption of normally distributed errors is sometimes not appropriate for real data. Therefore, to describe such situations more flexibly, we use the skew-normal distribution given by Sahu et al. (The Canadian Journal of Statistics, 31, 129-150, 2003) for the error terms, which includes the normal distribution. For the Bayesian methodology, the Markov chain Monte Carlo method is employed to resolve complicated integration problems. Also, under improper priors, the propriety of the associated posterior density is shown. Our proposed Bayesian model is applied to NZAPB's apple data. For model comparison between the skew normal error model and the normal error model, we use the Bayes factor and the deviance information criterion given by Spiegelhalter et al. (Journal of the Royal Statistical Society Series B (Statistical Methodology), 64, 583-639, 2002). We also consider the problem of detecting an influential point concerning skewness using Bayes factors. Finally, concluding remarks are given.

A Study on the Factors Affecting the Arson (방화 발생에 영향을 미치는 요인에 관한 연구)

  • Kim, Young-Chul;Bak, Woo-Sung;Lee, Su-Kyung
    • Fire Science and Engineering / v.28 no.2 / pp.69-75 / 2014
  • This study derives the factors that affect the occurrence of arson from statistical data (population, economic, and social factors) by multiple regression analysis. Multiple regression analysis is applied to four functional forms: linear, semi-log, inverse log, and dual log functions. Each function is analyzed with a stepwise procedure that considers the selection and deletion of independent variables at each step. To address two problems of multiple regression analysis, autocorrelation and multicollinearity, the Durbin-Watson coefficient and the Variance Inflation Factor (VIF) were considered. The optimal model was determined by adjusted R-squared. The adjusted R-squared of the linear function was 0.935 (93.5%), the highest of the four functional forms, so the linear function is the optimal model in this study. The optimal model is then interpreted. According to the analysis, the factors affecting arson were, in order, the incidence of crime (0.829), the general divorce rate (0.151), the financial autonomy rate (0.149), and the consumer price index (0.099).
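
The three diagnostics mentioned above have simple closed forms once a regression has been fitted. The sketch below uses their standard textbook definitions; the function names and the input numbers are hypothetical and are not taken from the paper.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def durbin_watson(residuals):
    """Sum of squared successive residual differences over sum of squared residuals.

    Values near 2 suggest little first-order autocorrelation.
    """
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e * e for e in residuals)
    return num / den


def variance_inflation_factor(r2_j):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    predictor j on all the other predictors."""
    return 1.0 / (1.0 - r2_j)


# Hypothetical inputs chosen only to exercise the formulas.
adj_r2 = adjusted_r_squared(0.95, n_obs=21, n_predictors=4)
dw = durbin_watson([1.0, -1.0, 1.0, -1.0])
vif = variance_inflation_factor(0.8)
```

The alternating residuals in the example give a Durbin-Watson value above 2, the pattern associated with negative autocorrelation, and a predictor that is 80% explained by the others gives a VIF of about 5.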

Calculation of Surface Broadband Emissivity by Multiple Linear Regression Model (다중선형회귀모형에 의한 지표면 광대역 방출율 산출)

  • Jo, Eun-Su;Lee, Kyu-Tae;Jung, Hyun-Seok;Kim, Bu-Yo;Zo, Il-Sung
    • Journal of the Korean earth science society / v.38 no.4 / pp.269-282 / 2017
  • In this study, the surface broadband emissivity ($3.0-14.0{\mu}m$) was calculated using a multiple linear regression model with narrow-band (channels 29, 30, and 31) emissivity data from the Moderate Resolution Imaging Spectroradiometer (MODIS) on the Earth Observing System Terra satellite. A total of 307 types of spectral emissivity data (123 soil types, 32 vegetation types, 19 types of water bodies, 43 manmade materials, and 90 rock types) from the MODIS University of California Santa Barbara emissivity library and the Advanced Spaceborne Thermal Emission & Reflection Radiometer spectral library were used to derive and verify the multiple linear regression model. The coefficient of determination ($R^2$) of the multiple linear regression model was high at 0.95 (p<0.001), and the root mean square error between the model-calculated and theoretical broadband emissivities was 0.0070. The surface broadband emissivity from our multiple linear regression model was comparable with that of Wang et al. (2005). The root mean square error between the surface broadband emissivities calculated in this study and by Wang et al. (2005) during January was 0.0054 in the Asia, Africa, and Oceania regions. The minimum and maximum differences in surface broadband emissivity between the two model results were 0.0027 and 0.0067, respectively. Similar statistical results were also derived for August. The surface broadband emissivities from our multiple linear regression model are thus considered acceptable. However, different regression models for different land covers need to be applied for more accurate calculation of the surface broadband emissivities.

An Incremental Regression Model for Time Series Data Prediction (시계열 데이터 예측을 위한 점진적인 회귀분석 모델)

  • Kim, Sung-Hyun;Lee, Yong-Mi;Jin, Long;Seo, Sung-Bo;Ryu, Keun-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2006.05a / pp.23-26 / 2006
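  • Among conventional data mining prediction techniques, regression analysis applies the model generated in the training step to new data without change. However, applying the model unchanged to time series data has the drawback that accuracy decreases as time passes. This paper therefore proposes a technique that incrementally updates the regression model, taking into account the characteristics of time series data that change over time. The technique applies all incoming data to the regression model and updates the model incrementally. The validity of the proposed technique was measured using RME (Relative Mean Error) and RMSE (Root Mean Square Error). In the accuracy experiments, the proposed IMQR (Incremental Multiple Quadratic Regression) technique outperformed the MLR (Multiple Linear Regression), MQR (Multiple Quadratic Regression), and SVR (Support Vector Regression) techniques by about 2% in RME and about 0.02 in RMSE on average.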
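
A multiple quadratic regression can be fitted with the same machinery as a multiple linear regression once the inputs are expanded with squares and pairwise products. The sketch below shows only that feature expansion; the function name `quadratic_features` and the term ordering are assumptions, and the paper's own construction may differ.

```python
from itertools import combinations_with_replacement


def quadratic_features(x):
    """Return the original inputs followed by all degree-2 terms x_i * x_j (i <= j)."""
    linear = list(x)
    second_order = [a * b for a, b in combinations_with_replacement(x, 2)]
    return linear + second_order


# Hypothetical two-variable input, e.g. a position at one time step.
features = quadratic_features((2.0, 3.0))
```

For the input (2, 3) the expansion is the two linear terms followed by 2*2, 2*3, and 3*3, and the expanded vector can then be passed to any linear least-squares routine.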
