• Title/Summary/Keyword: collinearity (공선성)

A Study on Technology Level Evaluation based on Patent without Multicollinearity (특허기반의 기술수준평가 모형의 다중 공선성을 제거한 기술수준 평가모형 제안)

  • Cho, Il-Gu;Oh, Jong-Hak
    • Proceedings of the Korea Contents Association Conference / 2014.11a / pp.461-462 / 2014
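  • Patent-based technology level evaluation models, which replace conventional expert Delphi evaluation, use patent activity, patent concentration, patent market power, patent competitiveness, and patent influence as independent variables, and multicollinearity exists among these variables. This study empirically demonstrates and proposes a more reliable technology level evaluation model by removing that multicollinearity.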

Procedure for the Selection of Principal Components in Principal Components Regression (주성분회귀분석에서 주성분선정을 위한 새로운 방법)

  • Kim, Bu-Yong;Shin, Myung-Hee
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.967-975 / 2010
  • Since least squares estimation is not appropriate when multicollinearity exists among the regressors of the linear regression model, principal components regression is used to deal with the multicollinearity problem. This article suggests a new procedure for the selection of suitable principal components. The procedure is based on the condition index instead of the eigenvalue. A principal component is removed from the model if its condition index is larger than the upper limit of the cutoff value, and it is included if its condition index is smaller than the lower limit. The forward inclusion method is employed to select among the principal components whose condition indices lie between the upper limit and the lower limit. The limits are obtained from a linear model constructed on the basis of conjoint analysis. The procedure is evaluated by Monte Carlo simulation in terms of the mean square error of the estimator. The simulation results indicate that the proposed procedure is superior to the existing methods.
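
The selection rule in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual definition of the condition index as the square root of the largest eigenvalue divided by each eigenvalue, and the function names, the eigenvalues, and the `lower` and `upper` limits are hypothetical (the paper derives its limits from a conjoint-analysis-based linear model that the abstract does not give).

```python
import math

def condition_indices(eigenvalues):
    """Condition index of each component: sqrt(largest eigenvalue / eigenvalue).

    Assumes all eigenvalues are strictly positive.
    """
    largest = max(eigenvalues)
    return [math.sqrt(largest / ev) for ev in eigenvalues]

def classify_components(eigenvalues, lower, upper):
    """Split component positions into (included, candidates, removed).

    lower and upper are hypothetical cutoff limits chosen for illustration.
    Components in `candidates` would go through forward inclusion.
    """
    included, candidates, removed = [], [], []
    for k, ci in enumerate(condition_indices(eigenvalues)):
        if ci > upper:
            removed.append(k)       # strongly collinear direction: drop
        elif ci < lower:
            included.append(k)      # well-conditioned direction: keep
        else:
            candidates.append(k)    # undecided: left to forward inclusion
    return included, candidates, removed

# Made-up eigenvalues of a 5-variable correlation matrix.
eigenvalues = [3.0, 1.2, 0.6, 0.18, 0.02]
included, candidates, removed = classify_components(eigenvalues, lower=3.0, upper=10.0)
print(included, candidates, removed)
```

The forward inclusion step itself is omitted here because the abstract does not state the criterion it uses.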

Estimation of S&T Knowledge Production Function Using Principal Component Regression Model (주성분 회귀모형을 이용한 과학기술 지식생산함수 추정)

  • Park, Su-Dong;Sung, Oong-Hyun
    • Journal of Korea Technology Innovation Society / v.13 no.2 / pp.231-251 / 2010
  • The numbers of SCI papers and patents in science and technology are expected to be related to the number of researchers and the knowledge stock (R&D stock, paper stock, patent stock). The regression results showed that severe multicollinearity existed and caused errors in the estimation and testing of the regression coefficients. To solve the multicollinearity problem and estimate the effects of the independent variables properly, principal component regression models were applied to three cases of S&T knowledge production. The estimated principal component regression function was transformed back into the original independent variables so that its effects could be interpreted properly. The analysis indicated that the principal component regression model was useful for estimating the effects of highly correlated production factors, and showed that the number of researchers, R&D stock, and paper or patent stock all had positive effects on the production of papers or patents.
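
The back-transformation mentioned in this abstract can be illustrated with a small sketch. It assumes standardized variables and the standard identity that the coefficient on each original variable is the loading-weighted sum of the coefficients on the retained components; the function name and all numbers are hypothetical and are not taken from the paper.

```python
import math

def back_transform(loadings, gamma):
    """Map component coefficients back to standardized-variable coefficients.

    loadings[j][k] is the loading of variable j on retained component k,
    gamma[k] is the regression coefficient on component k, and the result
    is beta[j] = sum over k of loadings[j][k] * gamma[k].
    """
    return [sum(v * g for v, g in zip(row, gamma)) for row in loadings]

# Two highly correlated standardized regressors: keep only the first
# principal component, whose loadings are (1/sqrt(2), 1/sqrt(2)).
s = 1.0 / math.sqrt(2.0)
loadings = [[s], [s]]
gamma = [2.0]          # hypothetical coefficient on the retained component
beta = back_transform(loadings, gamma)
print(beta)
```

Both variables receive the same positive coefficient, which is how a single retained component spreads its effect across correlated inputs. Converting to the raw scale would further require the standard deviations of the response and of each regressor.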

Evaluation of Flood Vulnerability in Taehwa River Basin Using Flood Factors (홍수 인자를 활용한 태화강 유역 홍수 취약성 평가)

  • Kim, Min Kuk;Seol, Myung Sue;Park, Jun Sue;Lee, Jae Yung;Lee, Chung Dae
    • Proceedings of the Korea Water Resources Association Conference / 2020.06a / pp.390-390 / 2020
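  • Among natural disasters, floods occur over a short period and cause heavy loss of life and large financial damage. An analysis of domestic flood damage from 1970 to 2017 shows that casualties (8,152 in total) have been gradually declining, whereas the amount of damage (17.5 trillion won in total) has been increasing (wamis, National Water Resources Management Information System). Minimizing this damage requires a flood vulnerability assessment that accounts for the characteristics of each basin or region. Flood vulnerability varies with the meteorological, topographic, and human conditions of the target area, and the choice of factors used to assess it is also very important. This study therefore analyzed the causal relationship between flood damage data and flood factors in order to select flood vulnerability indicators and carry out a vulnerability assessment. For the assessment, factors with relatively high correlation coefficients were selected through a correlation analysis between the flood damage data and the candidate factors. The candidate factors were grouped into meteorological, topographic, and socio-human factors. High correlation among the selected factors indicates collinearity, so a collinearity check based on the VIF (Variance Inflation Factor) was applied to guard against it. The factors also have different units and ranges, which makes it difficult to reflect increases or decreases in particular factors in the assessment and lowers reliability when basins are evaluated individually. The units and ranges of the factors were therefore standardized with a re-scaling method, after which the equal-weighting method was applied. The Taehwa River basin in the Nakdong River region, where flood damage is the largest of all basins, was chosen as the study area. The Taehwa River flows through the center of an urban area, and its basin has high-elevation mountainous terrain, which makes it highly vulnerable to flooding (wamis, National Water Resources Management Information System). The assessment showed that flood vulnerability was high depending on the meteorological, topographic, and human characteristics of each sub-basin. These results stem from the urban-area ratio, population density, and land-cover characteristics within the basin, and vulnerability was high mainly because of topographic factors. The flood vulnerability assessment method used in this study is expected to be useful in planning future flood damage countermeasures.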
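
The VIF check, re-scaling, and equal weighting described in this abstract might look roughly like the sketch below. It makes several assumptions the abstract does not confirm: the VIF is computed as the diagonal of the inverse correlation matrix (a standard identity), re-scaling is taken to be min-max scaling to [0, 1], and the index is the plain average of the rescaled factors. All function names and data are hypothetical.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([x - mean for x in col])
    norms = [math.sqrt(sum(x * x for x in col)) for col in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (assumes non-singular input)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF of each factor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

def rescale(values):
    """Min-max scaling to [0, 1] (assumes the values are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def equal_weight_index(columns):
    """Average of the rescaled factors for each observation (equal weights)."""
    scaled = [rescale(col) for col in columns]
    return [sum(vals) / len(vals) for vals in zip(*scaled)]

# Made-up factors for five hypothetical sub-basins.
rainfall = [100.0, 120.0, 90.0, 150.0, 130.0]
slope = [5.0, 9.0, 4.0, 8.0, 12.0]
density = [200.0, 800.0, 300.0, 1000.0, 500.0]
factors = [rainfall, slope, density]
print(vif(factors))
print(equal_weight_index(factors))
```

A factor with a large VIF would be a candidate for removal before the index is built. The threshold is a modelling choice that the abstract does not specify.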

Analyzing Financial Data from Banks and Savings Banks: Application of Bioinformatical Methods (은행과 저축은행 관련 재정 지표 분석: 생물 정보학 분석 기법의 응용)

  • Pak, Ro Jin
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.577-588 / 2014
  • The collection and storage of large volumes of data are becoming easier; however, the number of variables is sometimes larger than the number of samples (objects). We now face the problem of dependency among variables (such as multicollinearity) due to the increased number of variables. We cannot apply various statistical methods without satisfying the independence assumption. To overcome this drawback, we consider categorizing (or discretizing) observations. We have a data set of financial indices from banks in Korea that contains 78 variables from 16 banks. Genetic sequence data is also a good example of large data, and there have been numerous statistical methods to handle it. We discover much useful bank information after we transform the bank data into categorical data that resembles genetic sequence data and apply bioinformatic techniques.

Effects of Multicollinearity in Logit Model (로짓모형에 있어서 다중공선성의 영향에 관한 연구)

  • Ryu, Si-Kyun
    • Journal of Korean Society of Transportation / v.26 no.1 / pp.113-126 / 2008
  • This research aims to explore the effects of multicollinearity on the reliability and goodness of fit of the logit model. To investigate the effects of multicollinearity on the multinomial logit model, numerical experiments are performed. Explanatory variables (attributes of utility functions) with correlations ranging from rho = 0.0 to rho = 0.9 are generated, and the rho-squares and t-statistics, which are the indices of goodness of fit and reliability of the logit model, are traced. The numerical experiments validate the following findings: 1) When a new explanatory variable is added, some of the rho-squares increase while the others decrease. 2) Higher correlations between generic variables worsen the goodness of fit of the logit model. 3) Multicollinearity tends to produce over-estimated parameters. 4) The reliability of the estimated parameters tends to decrease when the correlations between attributes are high. These results suggest that we have to check for multicollinearity and apply proper treatments to reduce it when we develop a logit model.
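
The abstract does not say how the correlated attributes were generated. One common way to produce a pair of attributes with a target correlation, shown here only as an assumption, is to mix two independent standard normal draws. The function names, the sample size, and the seed are hypothetical.

```python
import math
import random

def correlated_pair(rho, n, rng):
    """Draw n pairs (x1, x2) of standard normals with population correlation rho."""
    x1, x2 = [], []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        x1.append(a)
        # rho * a contributes the shared part; z contributes the independent part.
        x2.append(rho * a + math.sqrt(1.0 - rho * rho) * z)
    return x1, x2

def sample_correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
for rho in (0.0, 0.3, 0.6, 0.9):
    x1, x2 = correlated_pair(rho, 20000, rng)
    print(rho, round(sample_correlation(x1, x2), 3))
```

Attributes generated this way could then be fed into a logit estimation to trace how fit and t-statistics change with rho. That estimation step is outside this sketch.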

Principal Components Regression in Logistic Model (로지스틱모형에서의 주성분회귀)

  • Kim, Bu-Yong;Kahng, Myung-Wook
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.571-580 / 2008
  • Logistic regression analysis is widely used in customer relationship management and credit risk management. It is well known that maximum likelihood estimation is not appropriate when multicollinearity exists among the regressors. We therefore propose logistic principal components regression to deal with the multicollinearity problem. In particular, a new method is suggested to select proper principal components. The selection method is based on the condition index instead of the eigenvalue. When a condition index is larger than the upper limit of the cutoff value, the principal component corresponding to that index is removed from the estimation. A hypothesis test is applied sequentially to eliminate principal components whose condition indices lie between the upper limit and the lower limit. The limits are obtained from a linear model constructed on the basis of conjoint analysis. The proposed method is evaluated by means of the variance of the estimates and the correct classification rate. The results indicate that the proposed method is superior to the existing method in terms of efficiency and goodness of fit.

Development of model for prediction of land sliding at steep slopes (급경사지 붕괴 예측을 위한 모형 개발)

  • Park, Ki-Byung;Joo, Yong-Sung;Park, Dug-Keun
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.691-699 / 2011
  • Land sliding is a well-known natural disaster. As part of the effort to reduce damage from land sliding, many researchers have worked on improving prediction ability. However, because previous studies were conducted mostly by non-statisticians, the previously proposed models were hardly statistically justifiable. In this paper, we predict the probability of land sliding using a logistic regression model. Since most of the explanatory variables under consideration were correlated, we propose the final model after a backward elimination process.

A Criterion for the Selection of Principal Components in the Robust Principal Component Regression (로버스트주성분회귀에서 최적의 주성분선정을 위한 기준)

  • Kim, Bu-Yong
    • Communications for Statistical Applications and Methods / v.18 no.6 / pp.761-770 / 2011
  • Robust principal components regression is suggested to deal with both the multicollinearity and outlier problem. A main aspect of the robust principal components regression is the selection of an optimal set of principal components. Instead of the eigenvalue of the sample covariance matrix, a selection criterion is developed based on the condition index of the minimum volume ellipsoid estimator which is highly robust against leverage points. In addition, the least trimmed squares estimation is employed to cope with regression outliers. Monte Carlo simulation results indicate that the proposed criterion is superior to existing ones.
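
For readers unfamiliar with least trimmed squares, the criterion it minimizes can be written in a few lines. This sketch shows only the objective function, under the standard definition (the sum of the h smallest squared residuals); it is not the estimation procedure used in the paper, and the function name and numbers are hypothetical.

```python
def lts_objective(residuals, h):
    """Least trimmed squares criterion: sum of the h smallest squared residuals.

    Residuals beyond the h smallest (in squared size) are ignored, so a few
    gross outliers do not influence the value.
    """
    squared = sorted(r * r for r in residuals)
    return sum(squared[:h])

# Made-up residuals; the value 10.0 plays the role of a regression outlier.
residuals = [1.0, -2.0, 10.0, 0.5]
print(lts_objective(residuals, h=3))   # trimmed: the outlier is dropped
print(lts_objective(residuals, h=4))   # h = n gives ordinary least squares
```

With h equal to the sample size the criterion reduces to the ordinary sum of squared residuals, which is why the choice of h controls how many observations can be treated as outliers.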