An approximate fitting for mixture of multivariate skew normal distribution via EM algorithm

  • Kim, Seung-Gu (Department of Data and Information, Sangji University)
  • Received: 2016.02.19
  • Reviewed: 2016.03.07
  • Published: 2016.04.30

Abstract

Fitting a mixture of multivariate skew normal distributions (MSNMix) with multiple skewness parameter vectors via the EM algorithm is computationally expensive, because the E-step requires the moments and probabilities of a multivariate truncated normal distribution. Consequently, asymmetric data are commonly fitted with an MSNMix that has a simple skewness parameter vector, since the E-step quantities can then be computed in a univariate manner at low computational cost. However, adopting a simple skewness parameter is unrealistic in many applications. This paper proposes an approximate estimation method for the MSNMix with multiple skewness parameter vectors that can also be handled in a univariate manner, which improves processing speed. We also report several experiments that demonstrate its effectiveness.
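To illustrate why univariate treatment in the E-step is cheap, the sketch below computes the first two moments of a univariate normal variable truncated to the positive half-line, which have closed forms in terms of the standard normal density and distribution function. This is the standard textbook result and is not the approximation proposed in the paper; the function names `std_normal_pdf`, `std_normal_cdf`, and `truncated_normal_moments` are hypothetical and chosen only for this example.

```python
import math


def std_normal_pdf(z: float) -> float:
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_moments(mu: float, sigma: float) -> tuple[float, float]:
    """Return (E[X | X > 0], E[X^2 | X > 0]) for X ~ N(mu, sigma^2).

    Illustrative helper (an assumption of this example, not from the paper).
    Both moments need only one density and one distribution-function
    evaluation, so the cost is constant per observation.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    z = mu / sigma
    # Ratio phi(z) / Phi(z); this can underflow for very negative z,
    # which a production implementation would need to guard against.
    ratio = std_normal_pdf(z) / std_normal_cdf(z)
    first = mu + sigma * ratio
    second = mu * mu + sigma * sigma + mu * sigma * ratio
    return first, second


if __name__ == "__main__":
    m1, m2 = truncated_normal_moments(0.0, 1.0)
    print(m1, m2)
```

In the multivariate case with multiple skewness parameter vectors, the corresponding quantities involve multivariate normal orthant probabilities, which generally have no closed form; that is the cost the abstract refers to.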
