• Title/Summary/Keyword: principal components regression


Principal Component Regression by Principal Component Selection

  • Lee, Hosung; Park, Yun Mi; Lee, Seokho
    • Communications for Statistical Applications and Methods, v.22 no.2, pp.173-180, 2015
  • We propose a procedure for selecting principal components in principal component regression. Rather than simply retaining a small subset of the leading principal components, our method chooses components via variable selection procedures. The procedure consists of two steps designed to improve estimation and prediction. First, we reduce the number of principal components using conventional principal component regression to obtain a set of candidate components; we then select components from this candidate set using sparse regression techniques. The performance of our proposal is demonstrated numerically and compared with typical dimension reduction approaches (including principal component regression and partial least squares regression) on synthetic and real datasets.
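
The two-step scheme described in the abstract can be sketched in a few lines of NumPy. This is only an illustration on simulated data, not the authors' implementation: it exploits the fact that, once the candidate PC scores are orthonormalized, the lasso solution reduces to soft-thresholding of the OLS coefficients. The candidate count `k` and penalty `lam` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 8
X = rng.standard_normal((n, p))
beta = np.array([2.0, 0.0, 1.0, 0, 0, 0, 0, 0])
y = X @ beta + 0.1 * rng.standard_normal(n)

# Step 1: conventional PCR -- keep the leading k components as candidates.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 5
Z = U[:, :k] * s[:k]                    # candidate PC scores (orthogonal columns)

# Step 2: sparse selection among the candidates. With orthonormalized
# scores, the lasso solution is soft-thresholding of the OLS coefficients.
Zn = Z / np.linalg.norm(Z, axis=0)      # orthonormal columns
yc = y - y.mean()
ols = Zn.T @ yc                         # OLS coefficients under orthonormal design
lam = 1.0
lasso = np.sign(ols) * np.maximum(np.abs(ols) - lam, 0.0)
selected = np.nonzero(lasso)[0]
print("kept components:", selected)
```

The selected set need not be the leading components: a low-variance component with a strong relation to y survives the thresholding while a high-variance but irrelevant one is dropped.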

A New Deletion Criterion of Principal Components Regression with Orientations of the Parameters

  • Lee, Won-Woo
    • Journal of the Korean Statistical Society, v.16 no.2, pp.55-70, 1987
  • Principal components regression is one of the substitutes for the least squares method when multicollinearity exists in the multiple linear regression model. It is observed graphically that the performance of principal components regression depends strongly on the values of the parameters. Accordingly, a new deletion criterion that determines which principal components should be deleted from the analysis is developed, and its usefulness is verified by simulation.


Numerical Investigations in Choosing the Number of Principal Components in Principal Component Regression - CASE I

  • Shin, Jae-Kyoung; Moon, Sung-Ho
    • Journal of the Korean Data and Information Science Society, v.8 no.2, pp.127-134, 1997
  • A method is proposed for choosing the number of principal components in principal component regression based on the predicted error sum of squares (PRESS). To do this, we evaluate that statistic approximately, using a linear approximation based on a perturbation expansion. We apply the proposed method to various data sets and discuss some properties of the choice of the number of principal components in principal component regression.
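
A PRESS-based choice of the number of components can also be illustrated without the paper's perturbation approximation: for a linear smoother the leave-one-out residuals satisfy e_(i) = e_i / (1 - h_ii), so PRESS is available in closed form. A hypothetical NumPy sketch on simulated collinear data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 60, 6
X = rng.standard_normal((n, p))
X[:, 3] = X[:, 0] + 0.01 * rng.standard_normal(n)   # induce collinearity
y = X[:, 0] - X[:, 1] + 0.1 * rng.standard_normal(n)

Xc = X - X.mean(axis=0)
yc = y - y.mean()
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

def press(k):
    # PCR fit with k components; PRESS via the leave-one-out identity
    # e_(i) = e_i / (1 - h_ii), where the hat matrix is H = U_k U_k'.
    Uk = U[:, :k]
    resid = yc - Uk @ (Uk.T @ yc)
    h = np.sum(Uk**2, axis=1)           # leverages, the diagonal of H
    return np.sum((resid / (1.0 - h))**2)

scores = [press(k) for k in range(1, p + 1)]
best_k = 1 + int(np.argmin(scores))
print("PRESS by k:", np.round(scores, 2), "-> choose k =", best_k)
```

Choosing the k that minimizes PRESS trades fit against the variance inflation caused by the near-collinear directions.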


Numerical Investigations in Choosing the Number of Principal Components in Principal Component Regression - CASE II

  • Shin, Jae-Kyoung; Moon, Sung-Ho
    • Journal of the Korean Data and Information Science Society, v.10 no.1, pp.163-172, 1999
  • We propose a cross-validatory method for choosing the number of principal components in principal component regression based on the magnitudes of their correlations with y. Principal components can be ordered in two different ways: by eigenvalue (Shin and Moon, 1997) or by correlation with y. We apply our method to various data sets and compare the results of the two orderings.
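
The difference between the two orderings is easy to demonstrate: when the response is driven by a low-variance component, ordering by correlation with y finds it first, while the eigenvalue ordering ranks it late. A small synthetic illustration (not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 80, 5
X = rng.standard_normal((n, p))
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = U * s                        # PC scores, columns ordered by eigenvalue

# Make y depend mainly on a low-variance component (the 4th PC),
# so the two orderings disagree.
y = Z[:, 3] + 0.1 * rng.standard_normal(n)

corr = np.array([abs(np.corrcoef(Z[:, j], y)[0, 1]) for j in range(p)])
order_by_eigenvalue = np.arange(p)
order_by_corr = np.argsort(-corr)
print("eigenvalue order  :", order_by_eigenvalue)
print("|corr with y| order:", order_by_corr)
```

Here the correlation ordering promotes component 3 to the front, whereas the eigenvalue ordering would include three irrelevant components before reaching it.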


Simple principal component analysis using Lasso (라소를 이용한 간편한 주성분분석)

  • Park, Cheolyong
    • Journal of the Korean Data and Information Science Society, v.24 no.3, pp.533-541, 2013
  • In this study, a simple principal component analysis using the Lasso is proposed. The method consists of two steps. The first step computes principal components by ordinary principal component analysis. The second step regresses each principal component on the original data matrix by the Lasso. Each new principal component is then computed as a linear combination of the original data matrix, with the scaled Lasso coefficient estimates as the combination weights. Because the ordinary regression of each principal component on the original data matrix recovers the corresponding eigenvector, the Lasso version yields easily interpretable principal components with more zero coefficients. The method is applied to real and simulated data sets with the help of an R package for Lasso regression, and its usefulness is demonstrated.
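
The two steps can be imitated with plain NumPy: compute ordinary PC scores, then run a lasso of a score vector on the centered data matrix, taking the scaled coefficients as sparse loadings. The coordinate-descent lasso below is a minimal stand-in for the R package the paper uses, and the penalty `lam` is an arbitrary illustrative value.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Plain coordinate-descent lasso: min (1/2)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = np.sum(X**2, axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]        # partial residual
            rho = X[:, j] @ r
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return b

rng = np.random.default_rng(3)
n, p = 100, 6
X = rng.standard_normal((n, p))
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Step 1: ordinary PCA scores.  Step 2: regress the first PC score on the
# data matrix with the lasso; the scaled coefficients act as sparse loadings.
pc1 = U[:, 0] * s[0]
coef = lasso_cd(Xc, pc1, lam=20.0)
loadings = coef / np.linalg.norm(coef)
print("sparse loadings for PC1:", np.round(loadings, 3))
```

With `lam = 0` the regression would return the first eigenvector exactly; increasing `lam` trades fidelity to that eigenvector for zeros in the loadings.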

A Criterion for the Selection of Principal Components in the Robust Principal Component Regression (로버스트주성분회귀에서 최적의 주성분선정을 위한 기준)

  • Kim, Bu-Yong
    • Communications for Statistical Applications and Methods, v.18 no.6, pp.761-770, 2011
  • Robust principal components regression is suggested to deal with both multicollinearity and outliers. A main issue in robust principal components regression is the selection of an optimal set of principal components. Instead of the eigenvalues of the sample covariance matrix, a selection criterion is developed based on the condition index of the minimum volume ellipsoid estimator, which is highly robust against leverage points. In addition, least trimmed squares estimation is employed to cope with regression outliers. Monte Carlo simulation results indicate that the proposed criterion is superior to existing ones.

Procedure for the Selection of Principal Components in Principal Components Regression (주성분회귀분석에서 주성분선정을 위한 새로운 방법)

  • Kim, Bu-Yong; Shin, Myung-Hee
    • The Korean Journal of Applied Statistics, v.23 no.5, pp.967-975, 2010
  • Since least squares estimation is not appropriate when multicollinearity exists among the regressors of a linear regression model, principal components regression is used to deal with the multicollinearity problem. This article suggests a new procedure for selecting suitable principal components, based on the condition index rather than the eigenvalue. Principal components whose condition indices exceed the upper limit of the cutoff value are removed from the model, while those whose condition indices fall below the lower limit are included. For components whose condition indices lie between the two limits, a forward inclusion method is employed to select the proper ones. The limits themselves are obtained from a linear model constructed on the basis of conjoint analysis. The procedure is evaluated by Monte Carlo simulation in terms of the mean square error of the estimator, and the simulation results indicate that it is superior to the existing methods.
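
The basic device, flagging components by condition index rather than by raw eigenvalue, can be sketched as follows. The cutoff of 30 is the conventional rule of thumb, not the conjoint-analysis limits the paper derives:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 50, 5
X = rng.standard_normal((n, p))
X[:, 4] = X[:, 0] + X[:, 1] + 0.001 * rng.standard_normal(n)  # near-collinear

# Scale columns to unit length, as is usual before computing condition indices.
Xs = X - X.mean(axis=0)
Xs = Xs / np.linalg.norm(Xs, axis=0)
eigvals = np.linalg.eigvalsh(Xs.T @ Xs)[::-1]       # descending order
cond_index = np.sqrt(eigvals[0] / eigvals)

upper = 30.0   # illustrative rule-of-thumb cutoff
remove = np.nonzero(cond_index > upper)[0]
print("condition indices:", np.round(cond_index, 1))
print("components flagged for deletion:", remove)
```

The near-exact linear dependency among columns 0, 1 and 4 produces one tiny eigenvalue and hence one very large condition index, which is exactly the component flagged.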

Improving Polynomial Regression Using Principal Components Regression With the Example of the Numerical Inversion of Probability Generating Function (주성분회귀분석을 활용한 다항회귀분석 성능개선: PGF 수치역변환 사례를 중심으로)

  • Yang, Won Seok; Park, Hyun-Min
    • The Journal of the Korea Contents Association, v.15 no.1, pp.475-481, 2015
  • We use polynomial regression instead of linear regression when there is a nonlinear relation between a dependent variable and the independent variables in a regression analysis. The performance of polynomial regression, however, may deteriorate because of the correlation caused by the power terms of the independent variables. We present a polynomial regression model for the numerical inversion of the probability generating function (PGF) and show that the correlation among power terms leads to poorly estimated coefficients. We then apply principal components regression to the polynomial regression model and show that it dramatically improves the parameter estimation.
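
The mechanism is easy to reproduce: power terms of a single regressor are strongly correlated, so the centered power basis is severely ill-conditioned, and regressing on its leading principal components stabilizes the fit. A minimal sketch with illustrative degrees and component count, not the paper's PGF setup:

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0, 1, 40)
y = 1 + 2 * x - 3 * x**2 + 0.5 * x**3 + 0.01 * rng.standard_normal(40)

# Power terms x, x^2, ..., x^6 are highly correlated on [0, 1].
P = np.column_stack([x**d for d in range(1, 7)])
Pc = P - P.mean(axis=0)
print("condition number of power basis:", round(np.linalg.cond(Pc)))

# PCR: regress y on the leading principal components of the power basis.
U, s, Vt = np.linalg.svd(Pc, full_matrices=False)
k = 3
gamma = (U[:, :k].T @ (y - y.mean())) / s[:k]   # coefficients in PC space
beta = Vt[:k].T @ gamma                          # mapped back to power-term space
fitted = y.mean() + Pc @ beta
print("max abs fitting error:", float(np.max(np.abs(fitted - y))))
```

Dropping the trailing components discards exactly the directions whose tiny singular values would otherwise inflate the coefficient variance.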

Logistic Regression Classification by Principal Component Selection

  • Kim, Kiho; Lee, Seokho
    • Communications for Statistical Applications and Methods, v.21 no.1, pp.61-68, 2014
  • We propose binary classification methods obtained by modifying logistic regression classification. Instead of using the original variables, we select principal components via variable selection procedures and build the classifier on the selected components. We describe the resulting classifiers and discuss their properties. The performance of our proposals is illustrated numerically and compared with other existing classification methods using synthetic and real datasets.
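
One way to realize this idea — screen principal components by association with the class label, then fit the logistic model on the retained scores — is sketched below with plain NumPy gradient descent. The screening rule and component count are illustrative assumptions, not the authors' procedure:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 200, 6
X = rng.standard_normal((n, p))
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = U * s                                   # all PC scores
y = (Z[:, 0] - Z[:, 2] + 0.3 * rng.standard_normal(n) > 0).astype(float)

# Screen components by association with the class label, then fit
# logistic regression by plain gradient descent on the kept scores.
corr = np.array([abs(np.corrcoef(Z[:, j], y)[0, 1]) for j in range(p)])
keep = np.argsort(-corr)[:2]
Zk = Z[:, keep]

w = np.zeros(Zk.shape[1]); b = 0.0
for _ in range(2000):
    prob = 1.0 / (1.0 + np.exp(-(Zk @ w + b)))   # sigmoid of the linear score
    w -= 0.5 * (Zk.T @ (prob - y) / n)           # gradient of mean log-loss
    b -= 0.5 * float(np.mean(prob - y))
acc = float(np.mean((prob > 0.5) == (y == 1)))
print("kept components:", keep, "training accuracy:", acc)
```

Because the screening looks at correlation with the label rather than component variance, the low-variance but discriminative component 2 is retained ahead of higher-variance irrelevant ones.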

The Study of Korean Manufacturing Industry Wage : Principal Components Regression Analysis (한국 제조업의 임금결정에 대한 연구 : 외환위기 전·후를 중심으로)

  • Oh, Yu-Jin; Park, Sung-Joon; Kim, Yu-Seop
    • Journal of Labour Economics, v.28 no.1, pp.61-82, 2005
  • We investigate wage differentials in the Korean manufacturing industry, as well as the factors behind structural change in wage determination between the pre- and post-financial-crisis regimes. We use the 1995 and 1999 data from the Survey Report on the Wage Structure (SRWS) of the Ministry of Labor. Principal components regression analysis is used to tackle multicollinearity, and factor analysis is employed to reduce the set of observed and latent variables to a smaller number of factors. Our empirical investigation provides evidence of changes in the wage structure between 1995 and 1999: in 1995 the job-quality factor was the most critical determinant of wages, whereas in 1999 the industry-attributes factor had the greatest impact.
