Comparison of multiscale multiple change-points estimators

Comparison of multiple change-point estimation by the SMUCE and FDR segmentation methods

  • Kim, Jaehee (김재희) (Department of Statistics, Duksung Women's University)
  • Received : 2019.04.16
  • Accepted : 2019.06.05
  • Published : 2019.08.31

Abstract

We study the false discovery rate segmentation (FDRSeg) and simultaneous multiscale change-point estimator (SMUCE) methods for multiscale multiple change-point estimation, and compare their empirical behavior via simulation. FDRSeg is based on controlling a false discovery rate, while SMUCE is based on multiscale local likelihood ratio tests. FDRSeg seems to work best when the number of change-points is large; when there are only a few change-points, FDRSeg and SMUCE provide similar estimation results. As a real data application, multiple change-points are estimated for the well-log data.

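This study examines the theoretical properties of the FDRSeg and SMUCE methods as multiscale multiple change-point estimators and compares their empirical properties through simulation. FDRSeg (false discovery rate segmentation) estimates change-points by controlling the FDR, while SMUCE (simultaneous multiscale change-point estimator) estimates change-points by multiple testing based on local likelihood functions. When the number of change-points is small, the two methods estimate comparably well. As the number of change-points grows, FDRSeg tends to perform better, both in the estimated number of change-points and in the estimation measures. As a real data analysis, multiple change-points in the well-log data are estimated with each method and the results are compared.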
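As a rough illustration of the multiscale local likelihood ratio idea mentioned in the abstract, the sketch below simulates a piecewise-constant signal with Gaussian noise and evaluates a scale-penalized local test statistic for a candidate step function. It is a minimal sketch under stated assumptions, not the implementation used in the study: the Gaussian mean model with known standard deviation, the penalty form sqrt(2 log(e n / length)), and the function names `simulate_step_signal` and `multiscale_statistic` are all assumptions made here for illustration.

```python
import math
import random


def simulate_step_signal(n, change_points, levels, sigma, seed=0):
    """Return (signal, data): a piecewise-constant mean and noisy observations.

    change_points are 0-based indices at which a new segment starts
    (hypothetical helper, not taken from the paper).
    """
    rng = random.Random(seed)
    bounds = [0] + list(change_points) + [n]
    signal = []
    for level, start, end in zip(levels, bounds[:-1], bounds[1:]):
        signal.extend([level] * (end - start))
    data = [mu + rng.gauss(0.0, sigma) for mu in signal]
    return signal, data


def multiscale_statistic(data, fit, sigma):
    """Largest scale-penalized local statistic over all intervals on which
    the candidate fit is constant (assumed Gaussian model, known sigma).

    Adjacent segments with identical fitted values are treated as one segment.
    """
    n = len(data)
    # Prefix sums of residuals give any interval sum in O(1).
    prefix = [0.0]
    for y, f in zip(data, fit):
        prefix.append(prefix[-1] + (y - f))
    best = -math.inf
    start = 0
    for end in range(1, n + 1):
        # A constant segment of the fit ends at the data end or at a jump.
        if end == n or fit[end] != fit[start]:
            for i in range(start, end):
                for j in range(i + 1, end + 1):
                    length = j - i
                    # Standardized residual sum on the interval [i, j).
                    local = abs(prefix[j] - prefix[i]) / (sigma * math.sqrt(length))
                    # Scale penalty: shorter intervals are penalized more.
                    penalty = math.sqrt(2.0 * math.log(math.e * n / length))
                    best = max(best, local - penalty)
            start = end
    return best


# Usage: compare the true step function with a single-constant fit.
signal, data = simulate_step_signal(200, [50, 120], [0.0, 3.0, -1.0], sigma=1.0, seed=1)
flat = [sum(data) / len(data)] * len(data)
t_true = multiscale_statistic(data, signal, 1.0)
t_flat = multiscale_statistic(data, flat, 1.0)
print(t_true, t_flat)
```

A fit that misses the change-points should yield a much larger statistic than the true step function. A multiscale estimator in this spirit would seek the step function with the fewest jumps whose statistic stays below a chosen threshold.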
