Fig. 1 CNN Architecture
Fig. 2 RNN Architecture
Fig. 3 CRNN Architecture
Fig. 4 Convergence characteristics of the loss function and accuracy over epochs as the learning rate changes
Table 1. Performance of CRNN as the learning rate changes
Table 2. Performance comparison between FNN, CNN, RNN, and CRNN