Application of Principal Component Analysis Prior to Cluster Analysis in the Concept of Informative Variables

  • Chae, Seong-San (Department of Information and Statistics, Daejeon University)
  • Published : 2003.12.01


Results of applying principal component analysis before cluster analysis are compared with results of applying an agglomerative clustering algorithm alone. In some situations, using principal components before cluster analysis improves the retrieval ability of the agglomerative clustering algorithm. On the other hand, the loss in retrieval ability of the agglomerative clustering algorithms decreases as the number of informative variables increases, where informative variables are those that carry distinct (or necessary) information compared to the other variables.
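The abstract does not say how retrieval ability is scored. As a minimal sketch, assuming agreement between a recovered partition and the true grouping is measured with the Rand index (Rand, 1971), the comparison could look like the following. The label vectors are hypothetical illustrations, not data from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two
    points in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same points")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two points")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical labels (assumptions for illustration only).
true_labels = [0, 0, 0, 1, 1, 1]   # known group structure
raw_labels = [0, 0, 1, 1, 1, 1]    # clustering on the original variables
pca_labels = [1, 1, 1, 0, 0, 0]    # clustering on principal component scores

retrieval_raw = rand_index(true_labels, raw_labels)
retrieval_pca = rand_index(true_labels, pca_labels)  # 1.0: same partition, renamed clusters
```

Because the index depends only on pairwise co-membership, it is unaffected by how the clusters are numbered, so the two pipelines can be compared directly against the same true grouping.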


