Hierarchical Clustering of Gene Expression Data Based on Self Organizing Map

자기 조직화 지도에 기반한 유전자 발현 데이터의 계층적 군집화

  • Park, Chang-Beom (WatchVision, Inc.) ;
  • Lee, Dong-Hwan (Department of Computer Science and Engineering, Korea University) ;
  • Lee, Seong-Whan (Department of Computer Science and Engineering, Korea University, Interdisciplinary Graduate Program in Bioinformatics, Korea University)
  • Published : 2003.10.31


Gene expression data are quantitative measurements of the expression levels and ratios of numerous genes under different conditions, obtained from microarray image analysis. Gene expression data analysis is the process of extracting meaningful information about genomic diseases and various biological activities from these data. In this paper, we present a hierarchical clustering method for gene expression data based on the self-organizing map (SOM), which allows the clustering results to be analyzed more efficiently. The proposed method removes the uncertainty of cluster boundaries, an inherent disadvantage of the SOM, while retaining the visualization capability of hierarchical clustering. It also exploits the fast processing speed of the SOM to handle massive data sets, and makes the SOM clustering result easier to interpret in a user-friendly way. To verify the efficiency of the proposed algorithm, we tested it on three data sets: an animal feature data set, a yeast gene expression data set, and a leukemia gene expression data set. The results demonstrated the feasibility and utility of the proposed clustering algorithm.
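The general idea of combining the two techniques can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give; it only assumes one plausible arrangement: a small SOM first compresses the samples into a few prototype vectors, and average-linkage agglomerative clustering then groups those prototypes so that each sample inherits the cluster of its best-matching unit. All function names, the grid size, the learning-rate and neighborhood schedules, and the choice of average linkage are assumptions made for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, x):
    """Index of the SOM node whose prototype is closest to x."""
    return min(range(len(weights)), key=lambda k: sq_dist(weights[k], x))


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns prototypes in row-major order.

    The linear decay of learning rate and neighborhood width is an
    assumed schedule, not one taken from the paper.
    """
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise each prototype as a copy of a randomly chosen sample.
    weights = [list(rng.choice(data)) for _ in grid]
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            br, bc = grid[best_matching_unit(weights, x)]
            for w, (r, c) in zip(weights, grid):
                # Gaussian neighborhood measured on the map grid.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                for j in range(len(w)):
                    w[j] += lr * h * (x[j] - w[j])
            step += 1
    return weights


def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering over point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(math.sqrt(sq_dist(points[i], points[j]))
                            for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the closest pair; b > a, so index a stays valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def cluster_samples(data, rows, cols, n_clusters, seed=0):
    """Label each sample with the cluster of its best-matching SOM node."""
    weights = train_som(data, rows, cols, seed=seed)
    node_clusters = agglomerate(weights, n_clusters)
    node_label = {node: label
                  for label, nodes in enumerate(node_clusters)
                  for node in nodes}
    return [node_label[best_matching_unit(weights, x)] for x in data]
```

In this arrangement the expensive pairwise step runs over the map nodes rather than over every sample, which is one way the speed of the SOM could be combined with an explicit cluster hierarchy. For real expression matrices, a library implementation of agglomerative clustering would be a more practical choice than the quadratic loop shown here.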