Analyzing the Impact of Plot Size in Vision-Based Time-Series Sound Classification

  • Received : 2024.10.04
  • Accepted : 2024.11.05
  • Published : 2024.11.29

Abstract

In recent years, visualizing time-series data as images for use in vision-based Artificial Intelligence (AI) models has gained significant attention. This approach transforms temporal sequences into images that can be processed by deep learning models such as Convolutional Neural Networks (CNNs). Although its effectiveness has been demonstrated in various domains, the impact of plot size on model performance remains underexplored. In this study, we investigate the effect of plot size on classification accuracy by visualizing natural sounds (e.g., cats, crows) as plots and classifying five classes of 2,000 samples each with the YOLO model. The model was trained on 320x320 plots, while test sets were generated at six sizes ranging from 112x112 to 640x640. Results show that precision and recall both decreased as the plot size of the test dataset diverged from that of the training dataset, highlighting the importance of plot size consistency in time-series visualization research.

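Recently, visualizing time-series data as images so that vision-based AI models can be applied to it has been attracting attention. This method converts time-series data into images that deep learning models such as Convolutional Neural Networks (CNNs) can process, and its effectiveness has been demonstrated in various fields, but the effect of plot size on model performance has not been sufficiently studied. To investigate how changes in plot size affect classification accuracy, this study visualized natural sounds such as cats and crows as plots and tested five classes of 2,000 samples each with the YOLO model. Training was performed on plots of 320x320 pixels, and the test datasets were generated at six pixel sizes from 112x112 to 640x640. The results confirmed that precision and recall decrease the more the plot size of the test dataset differs from that of the training dataset, which suggests that plot size consistency is important in time-series visualization research.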
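To make the notion of "plot size" concrete, the sketch below rasterizes one signal at several pixel sizes. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say how the plots were drawn, so `render_waveform` and `synthetic_tone` are hypothetical helpers, the signal is a synthetic sine rather than a recorded sound, and only the sizes the abstract names (the two endpoints and the training size) are used because the remaining test sizes are not stated.

```python
import math


def render_waveform(samples, size_px):
    """Rasterize a 1-D signal into a size_px x size_px binary image.

    Returns a list of rows; 1 marks a pixel on the trace, 0 is background.
    (Hypothetical helper; the paper's actual plotting method is not given.)
    """
    if size_px < 2 or not samples:
        raise ValueError("need a non-empty signal and size_px >= 2")
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signal

    def to_row(value):
        # Map amplitude to a pixel row; row 0 is the top of the image.
        return round((hi - value) / span * (size_px - 1))

    image = [[0] * size_px for _ in range(size_px)]
    n = len(samples)
    for col in range(size_px):
        # Slice of samples that falls into this pixel column (at least one).
        start = col * n // size_px
        stop = max(start + 1, (col + 1) * n // size_px)
        chunk = samples[start:stop]
        # Fill the vertical extent between the chunk's max and min.
        top, bottom = to_row(max(chunk)), to_row(min(chunk))
        for row in range(top, bottom + 1):
            image[row][col] = 1
    return image


def synthetic_tone(n=8000, cycles=5.0):
    """Stand-in signal (assumption): a plain sine wave, not real audio."""
    return [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


TRAIN_SIZE = 320              # training plot size stated in the abstract
TEST_SIZES = (112, 320, 640)  # only the sizes the abstract names explicitly

signal = synthetic_tone()
plots = {size: render_waveform(signal, size) for size in TEST_SIZES}
for size, image in plots.items():
    ink = sum(map(sum, image)) / (size * size)
    print(f"{size}x{size}: fraction of trace pixels = {ink:.4f}")
```

The same signal yields a different pixel pattern at each size, since each pixel column summarizes a different number of samples. A model trained only on the 320x320 renderings therefore sees a shifted input distribution at other sizes, which is the kind of mismatch the study measures.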

Acknowledgement

This work was conducted during the sabbatical year.
