• Title/Summary/Keyword: Regression model.

TIME SERIES PREDICTION USING INCREMENTAL REGRESSION

  • Kim, Sung-Hyun;Lee, Yong-Mi;Jin, Long;Chai, Duck-Jin;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / v.2 / pp.635-638 / 2006
  • Conventional regression-based prediction in data mining uses a model generated in the training step and applies it to new input data without any change. If such a model is applied directly to time series, prediction accuracy decreases. This paper proposes an incremental regression for time series prediction, such as typhoon track prediction. The technique takes into account that the characteristics of a time series may change over time. It is composed of two steps. The first step executes a fractional process for applying input data to the regression model. The second step updates the model by using its information as new data. Additionally, the model is maintained using only recent data held in a queue. This approach has two advantages. It keeps only the minimum information of the model in a matrix, so space complexity is reduced. Moreover, it prevents the error rate from increasing by updating the model over time. The accuracy of the proposed method is measured by RME (Relative Mean Error) and RMSE (Root Mean Square Error). In a typhoon track prediction experiment, the proposed technique, IMLR (Incremental Multiple Linear Regression), was more efficient than MLR (Multiple Linear Regression) and SVR (Support Vector Regression).
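The idea of keeping only a small matrix summary of recent data can be sketched as follows. This is a minimal illustration, not the authors' IMLR: it assumes the "minimum information" is the pair of sums X^T X and X^T y over a fixed-length queue, and it omits the paper's fractional process. The names `SlidingWindowRegression`, `update`, and `solve` are hypothetical.

```python
from collections import deque


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


class SlidingWindowRegression:
    """Multiple linear regression over the most recent `window` samples.

    Assumption: the model is summarized by X^T X and X^T y, so storage for the
    fit itself is O(d^2) regardless of how many samples have been seen.
    """

    def __init__(self, n_features, window):
        self.d = n_features + 1  # +1 for the intercept column
        self.window = window
        self.queue = deque()  # recent samples, oldest on the left
        self.xtx = [[0.0] * self.d for _ in range(self.d)]
        self.xty = [0.0] * self.d

    def _accumulate(self, x, y, sign):
        # Add (sign=+1) or remove (sign=-1) one sample's contribution.
        row = [1.0] + list(x)
        for i in range(self.d):
            self.xty[i] += sign * row[i] * y
            for j in range(self.d):
                self.xtx[i][j] += sign * row[i] * row[j]

    def update(self, x, y):
        """Add a new observation and evict the oldest one if the queue is full."""
        self.queue.append((x, y))
        self._accumulate(x, y, +1.0)
        if len(self.queue) > self.window:
            old_x, old_y = self.queue.popleft()
            self._accumulate(old_x, old_y, -1.0)

    def coefficients(self):
        """Return [intercept, b1, ..., bd]; needs enough distinct samples to be solvable."""
        return solve(self.xtx, self.xty)

    def predict(self, x):
        beta = self.coefficients()
        return beta[0] + sum(b * v for b, v in zip(beta[1:], x))
```

Because old samples are subtracted as they leave the queue, the fitted coefficients follow the most recent regime of the series instead of averaging over its whole history.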

Population Distribution Estimation Using Regression-Kriging Model (Regression-Kriging 모형을 이용한 인구분포 추정에 관한 연구)

  • Kim, Byeong-Sun;Ku, Cha-Yong;Choi, Jin-Mu
    • Journal of the Korean Geographical Society / v.45 no.6 / pp.806-819 / 2010
  • Population data are essential and fundamental in spatial analysis and are commonly aggregated by political boundaries. A conventional method for population distribution estimation has been a regression model with land use data, but the estimation process has limitations because of the spatial autocorrelation of the population data. This study aimed to improve the accuracy of population distribution estimation by adopting a Regression-Kriging method, namely the RK model, which combines a regression model with Kriging of the residuals. The RK model was applied to part of the Seoul metropolitan area to estimate population distribution based on the residential zones. A comparison of the regression model and the RK model using RMSE, MAE, and G statistics revealed that the RK model could substantially improve the accuracy of population distribution estimation. The RK model is expected to be actively adopted for further population distribution estimation.

On study for change point regression problems using a difference-based regression model

  • Park, Jong Suk;Park, Chun Gun;Lee, Kyeong Eun
    • Communications for Statistical Applications and Methods / v.26 no.6 / pp.539-556 / 2019
  • This paper derives a method to solve change point regression problems via a process for obtaining consequential results that uses properties of a difference-based intercept estimator first introduced by Park and Kim (Communications in Statistics - Theory Methods, 2019) for outlier detection in multiple linear regression models. We describe the statistical properties of the difference-based regression model in a piecewise simple linear regression model and then propose an efficient algorithm for change point detection. We illustrate the merits of the proposed method by comparison with several existing methods in simulation studies and real data analysis. The methodology is valuable "no matter what regression lines" and "no matter what the number of change points".

The Distributions of Variance Components in Two Stage Regression Model

  • Park, Dong-Joon
    • Journal of the Korean Data and Information Science Society / v.7 no.1 / pp.87-92 / 1996
  • A regression model with nested error structure is considered. The regression model includes two error terms that are independent and normally distributed with zero means and constant variances. This error structure gives correlated response variables. The distributions of the variance components in the regression model with nested error structure are derived by using theorems for quadratic forms.

Analysis of Temperature Effects on Microbial Growth Parameters and Estimation of Food Shelf Life with Confidence Band

  • Park, Jin-Pyo;Lee, Dong-Sun
    • Preventive Nutrition and Food Science / v.13 no.2 / pp.104-111 / 2008
  • As a way to account for the variability of the primary model parameters in the secondary modeling of microbial growth, three regression approaches were compared in determining the confidence interval of the temperature-dependent primary model parameters and the estimated microbial growth during storage: bootstrapped regression with all the individual primary model parameter values; bootstrapped regression with average values at each temperature; and simple regression with regression lines of the 2.5th and 97.5th percentile values. The temperature dependences of the converted parameters ($\log q_o$, $\mu_{max}^{1/2}$, $\log N_{max}$) of the hypothetical initial physiological state, maximum specific growth rate, and maximum cell density in Baranyi's model were fitted by quadratic, linear, and linear functions, respectively. With the advantage of extracting the primary model parameters instantaneously at any temperature by using mathematical functions, the regression lines of the 2.5th and 97.5th percentile values were capable of accounting for variation in experimental data of microbial growth under constant and fluctuating temperature conditions.

Density Adaptive Grid-based k-Nearest Neighbor Regression Model for Large Dataset (대용량 자료에 대한 밀도 적응 격자 기반의 k-NN 회귀 모형)

  • Liu, Yiqi;Uk, Jung
    • Journal of Korean Society for Quality Management / v.49 no.2 / pp.201-211 / 2021
  • Purpose: This paper proposes a density adaptive grid algorithm for the k-NN regression model to reduce the computation time for large datasets without significant prediction accuracy loss. Methods: The proposed method utilizes the concept of the grid with centroid to reduce the number of reference data points so that the required computation time is much reduced. Since the grid generation process in this paper is based on quantiles of original variables, the proposed method can fully reflect the density information of the original reference data set. Results: Using five real-life datasets, the proposed k-NN regression model is compared with the original k-NN regression model. The results show that the proposed density adaptive grid-based k-NN regression model is superior to the original k-NN regression in terms of data reduction ratio and time efficiency ratio, and provides a similar prediction error if the appropriate number of grids is selected. Conclusion: The proposed density adaptive grid algorithm for the k-NN regression model is a simple and effective model which can help avoid a large loss of prediction accuracy with faster execution speed and fewer memory requirements during the testing phase.
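The grid-with-centroid idea can be sketched roughly as follows. This is an illustrative simplification, not the paper's algorithm: it assumes the grid edges are equally spaced empirical quantiles of each variable, that each non-empty cell is replaced by the mean of its points and responses, and that prediction is a plain unweighted k-NN average over those centroids. All function names are hypothetical.

```python
import math
from collections import defaultdict


def quantile_edges(values, n_bins):
    """Interior bin edges at equally spaced empirical quantiles."""
    s = sorted(values)
    return [s[(len(s) * q) // n_bins] for q in range(1, n_bins)]


def cell_of(point, edges):
    """Grid cell index: per dimension, count the edges at or below the value."""
    return tuple(
        sum(v >= e for e in dim_edges) for v, dim_edges in zip(point, edges)
    )


def build_grid(xs, ys, n_bins):
    """Replace the reference set by one (centroid, mean response) pair per non-empty cell."""
    dims = len(xs[0])
    edges = [quantile_edges([x[d] for x in xs], n_bins) for d in range(dims)]
    cells = defaultdict(list)
    for x, y in zip(xs, ys):
        cells[cell_of(x, edges)].append((x, y))
    centroids = []
    for members in cells.values():
        n = len(members)
        cx = tuple(sum(m[0][d] for m in members) / n for d in range(dims))
        cy = sum(m[1] for m in members) / n
        centroids.append((cx, cy))
    return centroids


def knn_predict(centroids, query, k):
    """Average the responses of the k centroids nearest to the query point."""
    nearest = sorted(centroids, key=lambda c: math.dist(c[0], query))[:k]
    return sum(c[1] for c in nearest) / len(nearest)
```

Because the edges follow quantiles, dense regions get narrow cells and sparse regions get wide ones, so the reduced reference set keeps the density pattern of the original data while the search runs over far fewer points.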

Fuzzy regression using regularlization method based on Tanaka's model

  • Hong Dug-Hun;Kim Kyung-Tae
    • Journal of the Korean Institute of Intelligent Systems / v.16 no.4 / pp.499-505 / 2006
  • The regularization approach to regression can be easily found in the statistics and information science literature. The technique of regularization was introduced as a way of controlling the smoothness properties of the regression function. In this paper, we present a new method to evaluate linear and non-linear fuzzy regression models based on Tanaka's model using the idea of regularization. In particular, this method is a very attractive approach to modeling non-linear fuzzy data.

An Application of a New Two-Way Regression Model for Rating Curves (수위-유량관계식에 새로운 양방향 회귀모형의 적용)

  • Lee, Chang-Hae
    • Journal of Korea Water Resources Association / v.41 no.1 / pp.17-25 / 2008
  • Whether rating curves are used in practice or new ones are derived, the characteristics of regression analysis are often neglected. For example, a discharge rating curve, which is established from a regression of observed water levels (H) on observed flow rates (Q), is sometimes used for estimating a design water level corresponding to a simulated design flood runoff. However, if the independent and dependent variables are interchanged, the regression equation changes under conventional regression analysis, which is derived from the vertical errors between the observed data and the regression line. Thus, regression equations should not be applied inversely. To avoid this problem, a new two-way variable least-squares regression analysis is proposed. The new method was applied to the rating curves of five water level stations on the main stream of the Nakdong River. A comparison of three regression models, namely regression of Q versus H (model 1), H versus Q (model 2), and two-way (model 3), showed that the new method can reduce inadvertent mistakes when applied in practice.
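The claim that a least-squares line should not be applied inversely can be checked numerically. The sketch below does not implement the paper's two-way method; it only fits ordinary least squares in both directions and shows that the two lines are not inverses of each other. The water level and discharge values are made-up illustrative numbers, and `ols` is a hypothetical helper.

```python
def ols(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical water levels H and discharges Q (illustrative values only).
h = [1.0, 2.0, 3.0, 4.0, 5.0]
q = [2.1, 3.9, 6.2, 7.8, 10.1]

slope_q_on_h, _ = ols(h, q)  # minimizes errors in Q
slope_h_on_q, _ = ols(q, h)  # minimizes errors in H

# Inverting the H-on-Q line gives a different Q-per-H slope than fitting Q on H.
inverted_slope = 1.0 / slope_h_on_q

# The product of the two slopes equals r^2, which is 1 only for a perfect fit.
product = slope_q_on_h * slope_h_on_q

print(slope_q_on_h, inverted_slope, product)
```

The two slopes agree only when the data lie exactly on a line, so any scatter in the observations makes the choice of dependent variable matter.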

Improved Exact Inference in Logistic Regression Model

  • Kim, Donguk;Kim, Sooyeon
    • Communications for Statistical Applications and Methods / v.10 no.2 / pp.277-289 / 2003
  • We propose modified exact inferential methods for the logistic regression model. The exact conditional distribution in the logistic regression model is often highly discrete, and ordinary exact inference in logistic regression is conservative because of this discreteness. For exact inference in the logistic regression model we utilize the modified P-value. The modified P-value cannot exceed the ordinary P-value, so the test of size $\alpha$ based on the modified P-value is less conservative. The modified exact confidence interval maintains at least a fixed confidence level but tends to be much narrower. The approach inverts the results of a test with a modified P-value, utilizing the test statistic and table probabilities in the logistic regression model.

Proposal for the Estimation of the Hydraulic Conductivity of Porous Asphalt Concrete Pavement using Regression Analysis (단순회귀분석에 의한 배수성 아스팔트의 투수계수 산정모델 제안)

  • Jang, Yeongsun;Kim, Dowan;Mun, Sungho;Jang, Byungkwan
    • International Journal of Highway Engineering / v.15 no.3 / pp.45-52 / 2013
  • PURPOSES: This study constructs regression models for drainage asphalt concrete specimens and provides appropriate coefficients for hydraulic conductivity prediction models. METHODS: To allow easy calculation of hydraulic conductivity from the porosity of asphalt concrete pavement, an estimation model of hydraulic conductivity is proposed using regression analysis. Ten specimens of drainage asphalt concrete pavement were made for measurement of the hydraulic conductivity. The hydraulic conductivity model proposed in this study was calculated from an empirical model based on porosity and grain size. Results from the permeability measurement test are compared with those of the empirical equation, and the suitability of the proposed model is assessed using regression analysis. RESULTS: In the regression analysis, the hydraulic conductivity calculated from the proposed model was similar to that obtained from the permeability measurement test. RMSE (Root Mean Square Error) analysis also indicated that the proposed regression model is the more accurate model. CONCLUSIONS: The proposed model can be used to estimate the hydraulic conductivity of drainage asphalt concrete pavements in the field.
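The RMSE comparison mentioned in the results can be expressed in a few lines. This is a generic sketch of the standard RMSE definition, not the study's computation; the function name `rmse` is an assumption and no values from the paper are used.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    if not measured or len(measured) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

Given measured conductivities and the predictions of two candidate models, the model with the smaller `rmse` value fits the measurements more closely on average.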