An Exploratory Methodology for Longitudinal Data Analysis Using SOM Clustering

  • Cho, Yeong Bin (Division of International Business, Dept. of Business Administration, Konkuk University)
  • Received : 2022.03.13
  • Accepted : 2022.05.20
  • Published : 2022.05.28

Abstract

A longitudinal study is a research method based on data measured repeatedly on the same subjects. Most longitudinal analysis methods are suited to prediction or inference and are often poorly suited to exploratory work. This study presents an exploratory method for analyzing longitudinal data: the data are clustered with the self-organizing map (SOM) technique, the best number of clusters is determined, and the longitudinal trajectories are then identified. The proposed methodology was applied to longitudinal data from the Employment Information Service, and a total of 2,610 samples were analyzed. Applying the methodology to these data yielded time-series clusters for each panel. This indicates that it is more effective to cluster longitudinal data in advance and then perform multilevel longitudinal analysis.
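The three steps named in the abstract (cluster with a SOM, choose the number of clusters, read off the trajectories) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: the SOM is a small one-dimensional map written from scratch, the Calinski-Harabasz index stands in for whichever validity index the paper actually uses, and the panel is synthetic rather than the Employment Information Service data. All function names (`train_som`, `best_unit`, `calinski_harabasz`, `mean_trajectories`, `make_panel`) are hypothetical.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_unit(weights, x):
    """Index of the SOM unit closest to trajectory x (the best-matching unit)."""
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j], x))

def train_som(data, n_units, epochs=40, lr0=0.5, seed=0):
    """Train a 1-D SOM; each unit is treated as one cluster (an assumption)."""
    rng = random.Random(seed)
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    sigma0 = max(n_units / 2.0, 1.0)
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            decay = 1.0 - step / total          # shrinks from 1 towards 0
            lr = lr0 * decay                    # learning rate
            sigma = sigma0 * decay + 0.01       # neighbourhood width
            bmu = best_unit(weights, x)
            for j, w in enumerate(weights):
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for t in range(len(w)):
                    w[t] += lr * h * (x[t] - w[t])
            step += 1
    return weights

def mean_trajectories(data, labels):
    """Average trajectory (one value per wave) of every non-empty cluster."""
    groups = {}
    for x, lab in zip(data, labels):
        groups.setdefault(lab, []).append(x)
    return {lab: [sum(col) / len(rows) for col in zip(*rows)]
            for lab, rows in groups.items()}

def calinski_harabasz(data, labels):
    """Between-cluster over within-cluster dispersion; larger is better."""
    means = mean_trajectories(data, labels)
    k, n = len(means), len(data)
    if k < 2 or n <= k:
        return float("-inf")
    grand = [sum(col) / n for col in zip(*data)]
    between = sum(labels.count(lab) * sq_dist(m, grand) for lab, m in means.items())
    within = sum(sq_dist(x, means[lab]) for x, lab in zip(data, labels))
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))

def make_panel(n_per_group=30, waves=5, seed=1):
    """Synthetic panel: rising, flat and falling trajectories plus noise."""
    rng = random.Random(seed)
    shapes = [lambda t: 1.0 + 0.5 * t, lambda t: 3.0, lambda t: 5.0 - 0.5 * t]
    return [[f(t) + rng.gauss(0.0, 0.2) for t in range(waves)]
            for f in shapes for _ in range(n_per_group)]

panel = make_panel()

# Step 1 and 2: cluster with SOMs of different sizes and score each solution.
scores = {}
for k in range(2, 7):
    som = train_som(panel, k)
    scores[k] = calinski_harabasz(panel, [best_unit(som, x) for x in panel])
best_k = max(scores, key=scores.get)

# Step 3: refit with the chosen size and read off the mean trajectory per cluster.
som = train_som(panel, best_k)
labels = [best_unit(som, x) for x in panel]
trajectories = mean_trajectories(panel, labels)
for lab in sorted(trajectories):
    print(lab, [round(v, 2) for v in trajectories[lab]])
```

Each printed row is one cluster's mean value per measurement wave, which is the kind of exploratory summary that could precede a multilevel longitudinal model. Treating every SOM unit as a cluster is a simplification; a larger map followed by a second clustering of the unit weights is a common alternative.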
