Clustering and classification to characterize daily electricity demand

Korean title (translated): Cluster and classification analysis of hourly electricity consumption time series patterns

  • Park, Dain (박다인) (Department of Statistics, Graduate School, Daegu University) ;
  • Yoon, Sanghoo (윤상후) (Department of Statistics and Computer Science, Daegu University & Institute of Basic Science, Daegu University)
  • Received : 2017.02.28
  • Accepted : 2017.03.27
  • Published : 2017.03.31

Abstract

The purpose of this study is to identify patterns in daily electricity demand through clustering and classification. Hourly data were collected by the Korea Power Exchange (KPX) between 2008 and 2012. Because electricity demand data are time series data, the time trend was removed before extracting the daily demand patterns. We considered k-means clustering, Gaussian mixture model clustering, and functional clustering in order to find the best clustering method. Classification analysis was then conducted to understand how the clustered patterns relate to external factors: day of the week, holidays, and weather. The data were divided into training data and test data. The training data consisted of the external factors and the assigned cluster labels for 2008 to 2011. The test data were the daily external factors for 2012. Decision tree, random forest, support vector machine, and naive Bayes classifiers were used. As a result, Gaussian model-based clustering combined with random forest showed the best prediction performance when the number of clusters was 8.
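The first two steps of the abstract (removing the trend, then clustering daily profiles) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the trend was removed, so dividing each day's 24 hourly values by that day's mean is an assumption made here. The data are synthetic, the function names (`synthetic_day`, `normalize_day`, `kmeans`) are hypothetical, the farthest-point initialization is a choice made for reproducibility, and k = 2 is used only because the toy data contain two shapes, not the 8 clusters reported in the study.

```python
import math
import random


def synthetic_day(day_index, weekend, rng):
    """Made-up hourly load (24 values) with a slow upward trend in the level."""
    level = 50000 + 10 * day_index           # assumed trend, arbitrary units
    amplitude = 0.08 if weekend else 0.25    # weekends are flatter in this toy data
    return [
        level
        * (1.0 + amplitude * math.sin((h - 9) / 24 * 2 * math.pi))
        * (1.0 + rng.gauss(0, 0.01))
        for h in range(24)
    ]


def normalize_day(hourly):
    """Assumed detrending: divide each hour by the daily mean, keeping only the shape."""
    mean = sum(hourly) / len(hourly)
    return [x / mean for x in hourly]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        # Next centroid: the profile farthest from all centroids chosen so far.
        farthest = max(
            profiles,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every daily profile.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in profiles
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the hour-by-hour mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


rng = random.Random(0)
days = [synthetic_day(d, d % 7 >= 5, rng) for d in range(28)]
profiles = [normalize_day(day) for day in days]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

In the study's setting, the resulting cluster labels would then serve as the response variable for the classifiers, with day of the week, holiday, and weather as predictors.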

Acknowledgement

Supported by: Daegu University
