• Title/Summary/Keyword: 주성분분석 (principal component analysis)


Comparison of Significant Term Extraction Based on the Number of Selected Principal Components (주성분 보유수에 따른 중요 용어 추출의 비교)

  • Lee Chang-Beom;Ock Cheol-Young;Park Hyuk-Ro
    • The KIPS Transactions: Part B / v.13B no.3 s.106 / pp.329-336 / 2006
  • In this paper, we propose a method for extracting significant terms from a document. The technique is principal component analysis (PCA), a multivariate analysis method. PCA can exploit term-term relationships within a document through term-term correlations. We perform PCA on the correlation matrix between terms rather than the covariance matrix. We also seek thresholds for both the number of components to retain and the correlation coefficients between the retained components and terms. Experimental results on 283 Korean newspaper articles show that retaining the first six components with an absolute correlation coefficient threshold of 0.4 works best for extracting sentences based on the selected significant terms.
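
The selection rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `significant_terms`, the sentence-by-term count matrix `X`, and the synthetic data are assumptions made for the example. It runs PCA on the term-term correlation matrix and keeps every term whose correlation (loading) with any of the first six components reaches 0.4 in absolute value.

```python
import numpy as np

def significant_terms(X, n_components=6, threshold=0.4):
    """Select terms (columns of X) by their loadings on the leading components.

    X is a hypothetical sentence-by-term frequency matrix. A term is kept
    when its absolute loading on at least one retained component is at
    least `threshold`.
    """
    # Term-term correlation matrix (columns are terms).
    R = np.corrcoef(X, rowvar=False)
    # eigh returns eigenvalues in ascending order, so pick the largest ones.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    # For correlation-matrix PCA, eigenvector * sqrt(eigenvalue) is the
    # correlation between each term and each component.
    loadings = eigvecs * np.sqrt(eigvals)
    mask = (np.abs(loadings) >= threshold).any(axis=1)
    return np.flatnonzero(mask), loadings

# Synthetic example: 40 "sentences", 8 "terms" (made-up data).
rng = np.random.default_rng(0)
X = rng.poisson(2.0, size=(40, 8)).astype(float)
X[:, 1] = X[:, 0] + rng.poisson(0.5, size=40)  # two terms that co-occur
selected, loadings = significant_terms(X, n_components=6, threshold=0.4)
print("selected term indices:", selected)
```

Using the correlation matrix rather than the covariance matrix puts frequent and rare terms on the same scale, which is why the loadings can be read directly as correlation coefficients.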

A Comparative Study on Factor Recovery of Principal Component Analysis and Common Factor Analysis (주성분분석과 공통요인분석에 대한 비교연구: 요인구조 복원 관점에서)

  • Jung, Sunho;Seo, Sangyun
    • The Korean Journal of Applied Statistics / v.26 no.6 / pp.933-942 / 2013
  • Common factor analysis and principal component analysis represent two technically distinct approaches to exploratory factor analysis. Much of the psychometric literature recommends common factor analysis over principal component analysis. Nonetheless, factor analysts use principal component analysis more frequently because they believe that, although it may yield (relatively) less accurate estimates of factor loadings than common factor analysis, it most often produces a similar pattern of factor loadings, leading to essentially the same factor interpretations. A simulation study is conducted to evaluate the relative performance of the two approaches in terms of factor pattern recovery under different experimental conditions of sample size, overdetermination, and communality. The results show that principal component analysis performs better in factor recovery with small sample sizes (below 200). This tendency is more prominent when there are few variables per factor. The results are of practical use for factor analysts in marketing and the social sciences.

Procedure for the Selection of Principal Components in Principal Components Regression (주성분회귀분석에서 주성분선정을 위한 새로운 방법)

  • Kim, Bu-Yong;Shin, Myung-Hee
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.967-975 / 2010
  • Since least squares estimation is not appropriate when multicollinearity exists among the regressors of a linear regression model, principal components regression is used to deal with the multicollinearity problem. This article suggests a new procedure for selecting suitable principal components. The procedure is based on the condition index instead of the eigenvalue. Principal components whose condition indices exceed the upper limit of the cutoff value are removed from the model, while those whose condition indices fall below the lower limit are included. The forward inclusion method is used to select among principal components whose condition indices lie between the upper and lower limits. The limits are obtained from a linear model constructed on the basis of conjoint analysis. The procedure is evaluated by Monte Carlo simulation in terms of the mean squared error of the estimator. The simulation results indicate that the proposed procedure is superior to the existing methods.
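
The idea of screening principal components by condition index can be sketched as below. This is a simplified illustration under stated assumptions, not the procedure from the article: it uses a single hypothetical cutoff of 30 (a common rule of thumb) instead of the article's upper and lower limits, it omits the forward inclusion step, and the function name and synthetic data are made up for the example.

```python
import numpy as np

def pcr_by_condition_index(X, y, cutoff=30.0):
    """Principal components regression that drops components whose
    condition index sqrt(lambda_max / lambda_j) is at or above `cutoff`."""
    # Standardize the regressors and center the response.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yc = y - y.mean()
    # Eigen-decomposition of the correlation matrix, largest eigenvalue first.
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Condition index of each component (guarding against division by zero).
    cond_idx = np.sqrt(eigvals[0] / np.maximum(eigvals, 1e-12))
    keep = cond_idx < cutoff
    # Regress the response on the retained component scores.
    scores = Z @ eigvecs[:, keep]
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    # Map back to coefficients on the standardized regressors.
    beta = eigvecs[:, keep] @ gamma
    return beta, cond_idx, keep

# Synthetic data with two nearly collinear regressors (made-up example).
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.001 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + x3 + 0.1 * rng.normal(size=n)

beta, cond_idx, keep = pcr_by_condition_index(X, y)
print("condition indices:", np.round(cond_idx, 1))
print("components kept:", keep)
print("coefficients on standardized regressors:", np.round(beta, 3))
```

Dropping the component with the very large condition index removes the direction that separates the two collinear regressors, so their coefficients come out nearly equal instead of large and offsetting.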

Principal Component Transformation of the Satellite Image Data and Principal-Components-Based Image Classification (위성 영상데이터의 주성분변환 및 주성분 기반 영상분류)

  • Seo, Yong-Su
    • Journal of the Korean Association of Geographic Information Studies / v.7 no.4 / pp.24-33 / 2004
  • Advances in remote sensing technologies are rapidly increasing the number of spectral channels and, with it, data volumes. This creates a need for faster techniques for processing such data. One application in which fast processing is needed is dimension reduction of multispectral data. Principal component transformation is perhaps the most popular dimension reduction technique for multispectral data. In this paper, we discuss the processing procedure of principal component transformation and present and discuss the results of applying it to multispectral data. The principal component image data are then classified by the maximum likelihood method and the multilayer perceptron method. Finally, the performance of the two classification methods and the data reduction effects are evaluated and analyzed based on the experimental results.
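
The principal component transformation of a multispectral image can be sketched as follows. This is a minimal illustration assuming the image is stored as a `(height, width, bands)` array; the function name `pc_transform` and the synthetic six-band image are hypothetical, and the classification step from the abstract is not included.

```python
import numpy as np

def pc_transform(image, n_keep):
    """Project each pixel's band vector onto the leading principal components.

    Returns the reduced image of shape (height, width, n_keep) and the
    fraction of total variance retained.
    """
    h, w, bands = image.shape
    # Treat every pixel as one observation with `bands` variables.
    pixels = image.reshape(-1, bands).astype(float)
    centered = pixels - pixels.mean(axis=0)
    # Band-to-band covariance matrix and its eigen-decomposition.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep the components with the largest variance.
    pcs = centered @ eigvecs[:, :n_keep]
    explained = eigvals[:n_keep].sum() / eigvals.sum()
    return pcs.reshape(h, w, n_keep), explained

# Synthetic 6-band image driven by 2 latent signals plus noise (made-up data).
rng = np.random.default_rng(2)
latent = rng.normal(size=(32, 32, 2))
mixing = rng.normal(size=(2, 6))
image = latent @ mixing + 0.05 * rng.normal(size=(32, 32, 6))

pc_image, explained = pc_transform(image, n_keep=2)
print("reduced shape:", pc_image.shape)
print("variance retained:", round(float(explained), 4))
```

Because neighbouring spectral bands are usually highly correlated, a few components tend to carry most of the variance, which is what makes the reduced image a practical input for a classifier.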


Principal Component Analysis of Compositional Data using Box-Cox Contrast Transformation (Box-Cox 대비변환을 이용한 구성비율자료의 주성분분석)

  • 최병진;김기영
    • The Korean Journal of Applied Statistics / v.14 no.1 / pp.137-148 / 2001
  • Compositional data, found in many practical applications, consist of non-negative vectors of proportions subject to the constraint that the elements of each vector sum to unity. It is well known that the statistical analysis of compositional data suffers from this unit-sum constraint. Moreover, the non-linear pattern frequently displayed by such data does not lend itself to linear multivariate techniques such as principal component analysis. In this paper we develop a new type of principal component analysis for compositional data using the Box-Cox contrast transformation. Numerical illustrations are provided for comparative purposes.


Assessment of CO2 Emissions of Vehicles in Highway Sections Using Principal Component Analysis (주성분분석을 이용한 간선도로 구간 별 차량 당 CO2 다량 배출구간 평가)

  • Lee, Yoon Seok;Kim, Da Ye;Oh, Heung Un
    • KSCE Journal of Civil and Environmental Engineering Research / v.33 no.5 / pp.1981-1987 / 2013
  • $CO_2$ emissions of vehicles vary with vehicle speed, and speed in turn varies with road type, location, time, and traffic volume. In this paper, the sections in which a large quantity of $CO_2$ is emitted per vehicle are identified and analyzed using principal component analysis (PCA). The analysis yielded two principal components, and time of day was identified as the main factor explaining each component's role. The first principal component explained $CO_2$ emissions per vehicle in the early morning and afternoon hours, and the second explained $CO_2$ emissions per vehicle in the morning and afternoon peak hours. Therefore, the sections with a large quantity of $CO_2$ emissions per vehicle can be determined from the PCA scores.

A Study on the Principal Component Transformation of the Multispectral Image Data (다중분광 영상데이터의 주성분변환에 관한 연구)

  • Seo, Yong-Su (서용수)
    • Proceedings of the IEEK Conference / 2003.11a / pp.389-392 / 2003
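  • With the rapid advance of remote sensing technology, the number of spectral bands in multispectral image data is increasing quickly. As the number of bands grows, the volume of image data increases sharply, so fast image processing techniques are needed to handle the data. Principal component transformation is a widely used way to reduce the number of spectral bands so that the data can be processed quickly. In this paper, we discuss the processing procedure for principal component transformation and then analyze the principal component image data obtained by transforming multispectral image data. We also classify the principal component image data with the maximum likelihood method and analyze the results.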


Principal Component Analysis for the Growth Data of Rice (주성분분석을 이용한 수도의 생장해석)

  • Hahn, Weon-Sik;Chae, Yeong-Am
    • KOREAN JOURNAL OF CROP SCIENCE / v.31 no.2 / pp.173-178 / 1986
  • Principal component analysis was used to analyze growth data in order to understand the relationship between growth characteristics and yield as well as its components. The first principal component accounted for the average over time of the sampled specific leaf area, leaf area index, and dry weight, and the second component for the position of the changing point of growth characteristics. The component scores were affected more by nitrogen level than by variety. Yield was affected by fertility ratio and number of spikelets per hill, which are closely related to the component scores of leaf area index and dry weight per hill.


A dimensional reduction method in cluster analysis for multidimensional data: principal component analysis and factor analysis comparison (다차원 데이터의 군집분석을 위한 차원축소 방법: 주성분분석 및 요인분석 비교)

  • Hong, Jun-Ho;Oh, Min-Ji;Cho, Yong-Been;Lee, Kyung-Hee;Cho, Wan-Sup
    • The Journal of Bigdata / v.5 no.2 / pp.135-143 / 2020
  • This paper proposes a pre-processing method and a dimension reduction method for shopping cart analysis, where many variables are correlated, when segmenting consumer types in agri-food consumer panel data. Cluster analysis is widely used to divide observations into several clusters in multivariate data. However, when several variables are related, cluster analysis after dimension reduction may be more effective. In this paper, food consumption survey data from 1,987 households were clustered using the K-means method, and 17 variables were re-selected to form the clusters. Principal component analysis and factor analysis were compared as remedies for multicollinearity and as ways to reduce dimensions for clustering. In this study, both principal component analysis and factor analysis reduced the dataset to two dimensions. Principal component analysis divided the dataset into three clusters, but the differences among the clusters' characteristics were not clearly apparent. Under factor analysis, by contrast, the clusters' consumption patterns were well distinguished.

A study on principal component analysis using penalty method (페널티 방법을 이용한 주성분분석 연구)

  • Park, Cheolyong
    • Journal of the Korean Data and Information Science Society / v.28 no.4 / pp.721-731 / 2017
  • In this study, principal component analysis methods using the Lasso penalty are introduced. There are two popular ways to apply the Lasso penalty to principal component analysis. The first finds an optimal linear combination vector as the coefficient vector obtained by regressing each principal component on the original data matrix with a Lasso penalty (more generally, an elastic net penalty). The second finds an optimal linear combination vector by minimizing, under a Lasso penalty, the residual left when the original matrix is approximated by its singular value decomposition. We review both methods in detail and show that they are especially advantageous for data sets with more variables than cases. The methods are also compared in an application to a real data set using R. More specifically, they are applied to the crime data in Ahamad (1967), which has more variables than cases.
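
The second approach, a Lasso-penalized rank-one approximation of the data matrix, can be sketched with a common soft-thresholding iteration. This is an illustration under assumptions, not the code used in the article (which reports using R): the function names, the penalty value `lam`, and the synthetic data with more variables than cases are all made up for the example.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink each entry of `a` toward zero by `lam` (the Lasso proximal step)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_pc_rank_one(X, lam, n_iter=200, tol=1e-8):
    """First sparse loading vector from a Lasso-penalized rank-one fit.

    Alternates v = soft_threshold(Xc^T u, lam) and u = Xc v / ||Xc v||,
    starting from the ordinary leading singular vectors.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    u = U[:, 0]
    v = s[0] * Vt[0]
    for _ in range(n_iter):
        v_new = soft_threshold(Xc.T @ u, lam)
        if not v_new.any():
            return v_new  # penalty too large: every loading was shrunk to zero
        xv = Xc @ v_new
        u = xv / np.linalg.norm(xv)
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    # Report a unit-length loading vector.
    return v / np.linalg.norm(v)

# Synthetic data with more variables (50) than cases (20); made-up example.
rng = np.random.default_rng(3)
z = rng.normal(size=20)
X = 0.3 * rng.normal(size=(20, 50))
X[:, :5] += 3.0 * z[:, None]  # only the first five variables share a signal

v = sparse_pc_rank_one(X, lam=2.0)
print("variables with nonzero loading:", np.flatnonzero(v))
```

With `lam=0` the iteration reduces to the ordinary first principal component; increasing `lam` sets more loadings exactly to zero, which is what makes the component easier to interpret when variables outnumber cases.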