Analyzing the Impact of Plot Size in Vision-Based Time-Series Sound Classification

Euihyun Jung

doi:10.9708/jksci.2024.29.11.049

Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지)

Vol. 29, No. 11 / pp. 49-56 / 2024 / 1598-849X (pISSN) / 2383-9945 (eISSN)

Korean Society of Computer Information (한국컴퓨터정보학회)

Euihyun Jung (Dept. of AI Convergence, Anyang University)

Received: 2024.10.04
Reviewed: 2024.11.05
Published: 2024.11.29

https://doi.org/10.9708/jksci.2024.29.11.049

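Abstract

Recently, methods that visualize time-series data as images and apply vision AI models to them have been attracting attention. This approach converts time-series data into images so that deep learning models such as Convolutional Neural Networks (CNNs) can process them. Its effectiveness has been demonstrated in various fields, but the effect of plot size on model performance has not been sufficiently studied. To investigate how changes in plot size affect classification accuracy, this study visualized natural sounds such as cats and crows as plots and tested five classes of 2,000 samples each with a YOLO model. Training used plots of 320x320 pixels, and the test datasets were generated at six pixel sizes ranging from 112x112 to 640x640. The results confirmed that precision and recall decreased as the plot size of the test dataset differed more from that of the training dataset, which suggests that plot size consistency is important in time-series visualization research.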

In recent years, visualizing time-series data as images for use in vision-based Artificial Intelligence (AI) models has gained significant attention. This approach transforms temporal sequences into images that can be processed by deep learning models such as Convolutional Neural Networks (CNNs). Although its effectiveness has been demonstrated in various domains, the impact of plot size on model performance remains underexplored. In this study, we investigate the effect of varying plot sizes on classification accuracy by visualizing natural sounds (e.g., cats, crows) as plots and testing five classes of 2,000 samples each with a YOLO model. Training was conducted on 320x320-pixel plots, while test sets were generated at six sizes ranging from 112x112 to 640x640 pixels. The results show that the more the plot size of the test dataset diverged from that of the training dataset, the more precision and recall decreased, highlighting the importance of plot size consistency in time-series visualization research.
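
To make the notion of "plot size" concrete, the following is a minimal sketch that rasterizes one time series as a square line plot at several pixel sizes. It is not the paper's plotting pipeline: the functions `render_plot` and `ink_fraction`, the one-pixel line width, and the synthetic decaying sine used in place of a sound clip are all assumptions made for illustration. Only the sizes 112, 320, and 640 come from the abstract; the other three test sizes are not stated there and are not guessed here.

```python
import math


def render_plot(series, size):
    """Rasterize a 1-D series as a size x size binary line plot (1 = ink).

    Hypothetical helper: a one-pixel-wide connected line, no axes or margins.
    """
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat series
    image = [[0] * size for _ in range(size)]
    prev_row = None
    for col in range(size):
        # Sample the series at the position this pixel column covers.
        idx = round(col * (len(series) - 1) / (size - 1))
        # Map the value to a pixel row; row 0 is the top of the image.
        row = round((hi - series[idx]) / span * (size - 1))
        # Fill the vertical gap to the previous column so the line stays connected.
        start, end = (row, row) if prev_row is None else sorted((prev_row, row))
        for r in range(start, end + 1):
            image[r][col] = 1
        prev_row = row
    return image


def ink_fraction(image):
    """Fraction of pixels that belong to the plotted line."""
    total = len(image) * len(image[0])
    return sum(map(sum, image)) / total


# Assumed stand-in for a sound clip: a decaying 8-cycle sine wave.
signal = [
    math.exp(-3 * t / 2000) * math.sin(2 * math.pi * 8 * t / 2000)
    for t in range(2000)
]

for size in (112, 320, 640):  # training size is 320; 112 and 640 are the extremes
    img = render_plot(signal, size)
    print(size, round(ink_fraction(img), 4))
```

In this sketch the share of inked pixels shrinks as the canvas grows, because a one-pixel line covers a smaller part of a larger image. That is only one visible way in which the same signal differs across plot sizes; the sketch does not claim it is the cause of the precision and recall drop reported above.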

Acknowledgement

This work was conducted during the sabbatical year.

