Search | Korea Science

Jang, Woncheol;Kim, Gwangsu;Kim, Joungyoun
- The Korean Journal of Applied Statistics / v.29 no.6 / pp.999-1005 / 2016
The advent of big data brings the opportunity to answer many open scientific questions but also presents some interesting challenges. The main features of contemporary datasets are high dimensionality and massive sample size. In this paper, we give an overview of the major challenges caused by these two features: (i) noise accumulation and spurious correlations in high-dimensional data; (ii) computational scalability for massive data. We also present applications of big data in various fields, including disaster forecasting, digital humanities and sabermetrics.
https://doi.org/10.5351/KJAS.2016.29.6.999
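
The spurious-correlation challenge named in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' method: the function names, sample size, feature counts and seed are hypothetical choices made for illustration. It draws a response and a set of predictors that are all mutually independent, then reports the largest absolute sample correlation found among the predictors.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(n_samples, n_features, seed=0):
    """Largest |correlation| between a response and pure-noise predictors.

    Every variable is independent standard Gaussian noise (an assumption
    made for this illustration), so any correlation found is spurious.
    """
    rng = random.Random(seed)
    y = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        x = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(x, y)))
    return best


# Same sample size, very different numbers of candidate predictors.
low_dim = max_spurious_correlation(n_samples=50, n_features=5)
high_dim = max_spurious_correlation(n_samples=50, n_features=2000)
print(f"max |r| with 5 noise predictors:    {low_dim:.2f}")
print(f"max |r| with 2000 noise predictors: {high_dim:.2f}")
```

With the sample size held fixed, the largest correlation tends to grow as more unrelated predictors are screened, which is why a strong-looking association in high-dimensional data needs extra scrutiny.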

Baek, Changryong
- The Korean Journal of Applied Statistics / v.26 no.6 / pp.987-998 / 2013
This paper considers the statistical characteristics of air quality (PM10) in Korea, measured hourly in 2011. PM10 in Korea exhibits very strong correlations even at high lags, that is, long-range dependence. Its marginal distribution is power-law tailed, and the generalized Pareto distribution successfully captures a tail that is thicker than that of the log-normal distribution. However, slowly decaying autocorrelations may confuse practitioners, since a non-stationary model (such as one with changes in mean) can produce spurious long-term correlations in finite samples. We conduct a statistical test to distinguish the two models and argue that the high persistence can be explained by a non-stationary changes-in-mean model rather than by long-range dependent time series models.
https://doi.org/10.5351/KJAS.2013.26.6.987
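
The point in the abstract above, that a change in mean can mimic long-range dependence in a finite sample, can be sketched as follows. This is an illustrative simulation under assumed settings, not the paper's testing procedure and not PM10 data: the series length, shift size, lag and seed are hypothetical. It compares the sample autocorrelation at a high lag for white noise and for the same noise with a single level shift at the midpoint.

```python
import random


def sample_acf(x, lag):
    """Sample autocorrelation of x at the given lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


rng = random.Random(1)
n = 2000      # series length (assumed for illustration)
lag = 50      # a "high" lag at which white noise should be uncorrelated

# Independent Gaussian noise: no dependence at any lag.
noise = [rng.gauss(0, 1) for _ in range(n)]

# Same noise with one level shift of size 2 at the midpoint, i.e. a
# non-stationary changes-in-mean series with no true long memory.
shifted = [v + (2.0 if t >= n // 2 else 0.0) for t, v in enumerate(noise)]

acf_noise = sample_acf(noise, lag)
acf_shifted = sample_acf(shifted, lag)
print(f"lag-{lag} ACF, white noise:       {acf_noise:.3f}")
print(f"lag-{lag} ACF, with a mean shift: {acf_shifted:.3f}")
```

The shifted series shows a large autocorrelation at a lag where the underlying noise has none, so a slowly decaying sample autocorrelation function alone does not separate the two explanations.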