Use of Minimal Spanning Trees on Self-Organizing Maps

자기조직도에서 최소생성나무의 활용

  • Jang, Yoo-Jin (Department of Statistics, Korea University)
  • Huh, Myung-Hoe (Department of Statistics, Korea University)
  • Park, Mi-Ra (Department of Preventive Medicine, Eulji University)
  • Published: 2009.04.30

Abstract

Self-organizing maps (SOM), an unsupervised-learning neural network method, are applied in various fields. They reduce the dimension of multidimensional data by representing observations on a low-dimensional manifold. The minimal spanning tree (MST) of a graph, on the other hand, is the most economical subset of edges that connects all vertices without forming a closed loop. In this study, we apply the MST technique to SOMs with subnodes. We propose SOMs with an embedded MST and a distance measure for choosing the optimal size and shape of the map. We demonstrate the method with Fisher's Iris data and a real gene expression data set. Simulated data sets are also analyzed to check the validity of the proposed method.

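The self-organizing map (SOM), a type of unsupervised-learning neural network model, reduces the dimension of high-dimensional data and clusters similar observations on a low-dimensional map; it is applied to data in many fields. The minimal spanning tree (MST), in turn, is a graph method that connects the observation points with line segments of the shortest total length without forming closed loops. In this study, we apply the MST to SOMs with subnodes and propose a data visualization method that approximately represents the distances between subnodes, together with a distance measure for determining the optimal shape and size of the SOM. We also examine the usefulness of the method by applying it to Fisher's Iris data, a real gene expression data set, and simulated data.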
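
To make the MST definition concrete, the following is a minimal sketch of Prim's algorithm applied to a handful of points under Euclidean distance. It is not the authors' implementation: the function name `prim_mst` and the sample `nodes`, which stand in for SOM node weight vectors, are assumptions made purely for illustration.

```python
import math


def prim_mst(points):
    """Return MST edges as (i, j, length) using Prim's algorithm.

    `points` is a list of equal-length coordinate tuples; distances
    are Euclidean. This is an O(n^2) illustrative sketch.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # shortest known distance from each point to the tree
    best_from = [-1] * n        # tree vertex that realises that distance
    best_dist[0] = 0.0          # start growing the tree from point 0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # Update the distance of every remaining point to the grown tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


# Hypothetical weight vectors standing in for SOM nodes (illustration only).
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
mst = prim_mst(nodes)
total_length = sum(length for _, _, length in mst)
```

A tree on n points always has n - 1 edges and no closed loop, so `mst` here holds three edges. The long edge reaching the outlying point (5, 5) suggests how MST edge lengths can signal separation between groups of nodes.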
