Clustering of 2D-Gel images

Alignment and clustering of 2D-Gel images

  • Hur Won (School of Biotechnology and Bioengineering, Kangwon National University)
  • 허원 (강원대학교 바이오산업공학부)
  • Published : 2005.04.01

Abstract

Aligning 2D-gel images of biological samples makes differences in expression profiles visible and points to candidate protein spots for further analysis. However, comparing two proteome images, one case and one control, does not always identify differentially expressed proteins, because of sample-to-sample variation, the poor reproducibility of 2D-gel electrophoresis, and inconsistent electrophoresis conditions. Multiple 2D-gel images must therefore be aligned before differences in expression profiles are visualized or proteome images are clustered. To this end, software for aligning and clustering multiple 2D-gel images was developed by applying various algorithms and statistical methods, implemented in Microsoft Visual C++. A multiresolution-multilevel algorithm was found to be suitable for fast alignment and for largely distorted images. Ten proteome images from a Fetal Alcohol Syndrome study were clustered with a k-means algorithm, which yielded a phylogenetic tree of the proteomic distance map of the samples. However, the phylogenetic tree did not discriminate between case and control. Whole-image clustering showed that proteomic distance depends more on age and sex.
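The coarse-to-fine idea behind a multiresolution alignment can be illustrated with a minimal C++ sketch. This is not the software described in the abstract: it estimates only a whole-image translation, whereas the abstract refers to largely distorted images, which would need a non-rigid warp. The `Image` struct, the function names, the squared-difference error measure, and the assumption that the background outside the frame is zero are all hypothetical choices made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Grayscale image stored row-major (hypothetical container for this sketch).
struct Image {
    int width = 0;
    int height = 0;
    std::vector<double> pixels;
    Image(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 0.0) {}
    double at(int x, int y) const { return pixels[static_cast<std::size_t>(y) * width + x]; }
    double& at(int x, int y) { return pixels[static_cast<std::size_t>(y) * width + x]; }
};

// Build the next pyramid level by averaging 2x2 blocks.
Image downsample(const Image& src) {
    assert(src.width >= 2 && src.height >= 2);
    Image dst(src.width / 2, src.height / 2);
    for (int y = 0; y < dst.height; ++y)
        for (int x = 0; x < dst.width; ++x)
            dst.at(x, y) = (src.at(2 * x, 2 * y) + src.at(2 * x + 1, 2 * y) +
                            src.at(2 * x, 2 * y + 1) + src.at(2 * x + 1, 2 * y + 1)) / 4.0;
    return dst;
}

// Mean squared difference between `fixed` and `moving` shifted by (dx, dy).
// Assumption: pixels shifted in from outside the frame are background (0).
double shiftedError(const Image& fixed, const Image& moving, int dx, int dy) {
    double sum = 0.0;
    for (int y = 0; y < fixed.height; ++y) {
        for (int x = 0; x < fixed.width; ++x) {
            const int mx = x - dx, my = y - dy;
            const bool inside = mx >= 0 && my >= 0 && mx < moving.width && my < moving.height;
            const double d = fixed.at(x, y) - (inside ? moving.at(mx, my) : 0.0);
            sum += d * d;
        }
    }
    return sum / (static_cast<double>(fixed.width) * fixed.height);
}

// Estimate the shift at the coarsest level first, then double it and refine
// it within +/- radius pixels at each finer level.
std::pair<int, int> alignTranslation(const Image& fixed, const Image& moving, int levels, int radius) {
    assert(fixed.width == moving.width && fixed.height == moving.height);
    int cx = 0, cy = 0;
    if (levels > 0 && fixed.width >= 4 && fixed.height >= 4) {
        const auto coarse = alignTranslation(downsample(fixed), downsample(moving), levels - 1, radius);
        cx = 2 * coarse.first;
        cy = 2 * coarse.second;
    }
    double best = std::numeric_limits<double>::infinity();
    std::pair<int, int> bestShift(cx, cy);
    for (int dy = cy - radius; dy <= cy + radius; ++dy) {
        for (int dx = cx - radius; dx <= cx + radius; ++dx) {
            const double err = shiftedError(fixed, moving, dx, dy);
            if (err < best) { best = err; bestShift = std::make_pair(dx, dy); }
        }
    }
    return bestShift;
}
```

With two pyramid levels and a radius of 2, a call such as `alignTranslation(fixed, moving, 2, 2)` can recover shifts of roughly eight pixels or more while testing only 25 candidate shifts per level. That saving is the reason a multiresolution scheme is attractive for fast alignment.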

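The similarity between 2D-Gel images can reveal how similar biological samples are at the proteome level and which protein spots differ between them. However, biological samples vary widely between individuals and two-dimensional electrophoresis has limited reproducibility, so comparison is often difficult and frequently turns up only meaningless differences. Overcoming this requires aligning the proteome images so that they can be compared accurately. In this study, software was developed using a multiresolution-multilevel algorithm that aligns several raw images simultaneously without locating each protein spot on the images individually. Software was also developed that automatically generates a phylogenetic tree showing how similar the aligned images are to one another. Using this method, clustering was attempted on 10 proteome images of Fetal Alcohol Syndrome (FAS) cases and controls. For the FAS samples, clustering that groups whole 2D-Gel proteome images by degree of similarity turned out to depend more on extrinsic characteristics of the sample source, namely age or sex, than on case versus control status.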
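One way a similarity tree could be derived from already aligned images is sketched below in C++. The abstract does not state which distance measure the software uses, so the metric here (1 minus the Pearson correlation of pixel intensities) is an assumption. The sketch also swaps in single-linkage agglomerative clustering rather than the k-means algorithm the abstract mentions, because agglomerative merging yields a tree directly. `Pixels`, `Merge`, `imageDistance`, and `buildTree` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Pixels = std::vector<double>;  // one aligned gel image, flattened

// Distance between two aligned images: 1 - Pearson correlation (assumed metric).
double imageDistance(const Pixels& a, const Pixels& b) {
    assert(a.size() == b.size() && !a.empty());
    double meanA = 0.0, meanB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) { meanA += a[i]; meanB += b[i]; }
    meanA /= static_cast<double>(a.size());
    meanB /= static_cast<double>(b.size());
    double cov = 0.0, varA = 0.0, varB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double da = a[i] - meanA, db = b[i] - meanB;
        cov += da * db;
        varA += da * da;
        varB += db * db;
    }
    if (varA == 0.0 || varB == 0.0) return 1.0;  // flat image: treat as uncorrelated
    return 1.0 - cov / std::sqrt(varA * varB);
}

// One internal node of the tree: the two cluster ids joined and the joining distance.
struct Merge { int left; int right; double distance; };

// Single-linkage agglomerative clustering. Leaves are numbered 0..n-1 and
// each merge creates a new cluster id n, n+1, ...
std::vector<Merge> buildTree(const std::vector<Pixels>& images) {
    const int n = static_cast<int>(images.size());
    std::vector<std::vector<double>> dist(n, std::vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            dist[i][j] = dist[j][i] = imageDistance(images[i], images[j]);

    std::vector<std::vector<int>> members;  // image indices in each active cluster
    std::vector<int> ids;                   // current id of each active cluster
    for (int i = 0; i < n; ++i) { members.push_back(std::vector<int>{i}); ids.push_back(i); }

    std::vector<Merge> merges;
    int nextId = n;
    while (members.size() > 1) {
        std::size_t bestA = 0, bestB = 1;
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t a = 0; a < members.size(); ++a) {
            for (std::size_t b = a + 1; b < members.size(); ++b) {
                double d = std::numeric_limits<double>::infinity();
                for (int i : members[a])
                    for (int j : members[b]) d = std::min(d, dist[i][j]);
                if (d < best) { best = d; bestA = a; bestB = b; }
            }
        }
        merges.push_back(Merge{ids[bestA], ids[bestB], best});
        members[bestA].insert(members[bestA].end(), members[bestB].begin(), members[bestB].end());
        ids[bestA] = nextId++;
        members.erase(members.begin() + static_cast<std::ptrdiff_t>(bestB));
        ids.erase(ids.begin() + static_cast<std::ptrdiff_t>(bestB));
    }
    return merges;
}
```

The returned merge list can be read from first to last as the tree growing from its leaves: early merges join the most similar images, and the final merge is the root. Checking whether cases and controls separate then amounts to checking whether any single merge splits the samples along that label.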
