A Pre-processing Process Using TadGAN-based Time-series Anomaly Detection

  • Lee, Seung Hoon (이승훈), Department of Industrial and Management Engineering, Kyonggi University Graduate School
  • Kim, Yong Soo (김용수), Department of Industrial System Engineering, Kyonggi University
  • Received : 2022.07.13
  • Accepted : 2022.08.05
  • Published : 2022.09.30

Abstract

Purpose: This study aimed to increase prediction accuracy by establishing a pre-processing process based on anomaly intervals identified with an artificial intelligence-based time series anomaly detection technique. Methods: Significant variables were extracted using feature selection techniques, and anomalies were detected using the TadGAN time series anomaly detection algorithm. Machine learning and deep learning models were then trained on normal-section data (excluding the anomaly sections), and the explanatory power of the anomaly sections was demonstrated through a performance comparison. Results: Among the machine learning models, performance was best when SHAP and TadGAN were applied; among the deep learning models, performance was best when the chi-square test and TadGAN were applied. Compared with earlier papers that applied a conventional methodology to the same data, performance improved significantly: 15% for MLR, 24% for Random Forest, 30% for XGBoost, 73% for Lasso Regression, 17% for LSTM, and 19% for GRU. Conclusion: When unsupervised anomaly detection is applied to unlabeled data in fields such as cyber security, finance, behavior pattern analysis, and social networking services (SNS), the proposed process is expected to help verify the accuracy and explainability of the detected anomaly sections and to improve model performance.
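
The pre-processing step described in the Methods (training only on normal-section data, with anomaly sections excluded) can be illustrated with a minimal sketch. It assumes the anomaly detector has already returned a list of inclusive (start, end) timestamp intervals; the function names `merge_intervals` and `keep_normal_sections` are hypothetical and are not taken from the paper or from any TadGAN implementation.

```python
from typing import List, Sequence, Tuple

# Assumption: an anomaly interval is an inclusive (start, end) pair of
# integer timestamps, as an anomaly detector might report it.
Interval = Tuple[int, int]


def merge_intervals(intervals: Sequence[Interval]) -> List[Interval]:
    """Merge overlapping anomaly intervals into disjoint ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def keep_normal_sections(
    timestamps: Sequence[int],
    rows: Sequence[Sequence[float]],
    intervals: Sequence[Interval],
) -> List[Tuple[int, Sequence[float]]]:
    """Return (timestamp, row) pairs that fall outside every anomaly interval."""
    merged = merge_intervals(intervals)
    kept = []
    for ts, row in zip(timestamps, rows):
        if not any(start <= ts <= end for start, end in merged):
            kept.append((ts, row))
    return kept


if __name__ == "__main__":
    # Toy data: ten time steps with a single feature each (illustrative only).
    timestamps = list(range(10))
    rows = [[float(t)] for t in timestamps]
    anomaly_intervals = [(2, 4), (3, 5), (8, 8)]  # hypothetical detector output
    normal = keep_normal_sections(timestamps, rows, anomaly_intervals)
    print([ts for ts, _ in normal])
```

The retained rows would then be passed to the prediction models; how the study itself handled interval boundaries is not stated in the abstract, so treating both endpoints as anomalous is an assumption of this sketch.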

Acknowledgement

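This research was conducted as part of the Gyeonggi-do Regional Research Center (GRRC) program of Gyeonggi Province [GRRC경기2020-B03, Industrial Statistics and Data Mining Research].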
