Funding Information
This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea Government (MSIT) (No. 2019-01-01768, Deep Neural Network based Real-Time Accurate Voice Source Localization using Drones).
References
- Salamon, Justin, and Juan Pablo Bello. "Deep convolutional neural networks and data augmentation for environmental sound classification." IEEE Signal Processing Letters, 24(3), pp. 279-283, 2017. https://doi.org/10.1109/LSP.2017.2657381
- Cubuk, Ekin D., et al. "AutoAugment: Learning augmentation strategies from data." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
- Cubuk, Ekin D., et al. "RandAugment: Practical automated data augmentation with a reduced search space." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 702-703, 2020.
- Hendrycks, Dan, et al. "AugMix: A simple data processing method to improve robustness and uncertainty." arXiv preprint arXiv:1912.02781, 2019.
- Sharma, Jivitesh, Ole-Christoffer Granmo, and Morten Goodwin. "Environment sound classification using multiple feature channels and deep convolutional neural networks." arXiv preprint arXiv:1908.11219, 2019.
- Park, Daniel S., et al. "SpecAugment: A simple data augmentation method for automatic speech recognition." arXiv preprint arXiv:1904.08779, 2019.
- Hwang, Yeongtae, et al. "Mel-spectrogram augmentation for sequence to sequence voice conversion." arXiv preprint arXiv:2001.01401, 2020.
- Piczak, Karol J. "ESC: Dataset for environmental sound classification." Proceedings of the 23rd ACM International Conference on Multimedia, pp. 1015-1018, 2015.
- Loshchilov, Ilya, and Frank Hutter. "SGDR: Stochastic gradient descent with warm restarts." arXiv preprint arXiv:1608.03983, 2016.
- Boddapati, Venkatesh, Andrej Petef, Jim Rasmusson, and Lars Lundberg. "Classifying environmental sounds using image recognition networks." Procedia Computer Science, 112, pp. 2048-2056, 2017. https://doi.org/10.1016/j.procs.2017.08.250
- Tokozume, Yuji, Yoshitaka Ushiku, and Tatsuya Harada. "Learning from between-class examples for deep sound recognition." CoRR, abs/1711.10282, 2017.
- Aytar, Yusuf, Carl Vondrick, and Antonio Torralba. "SoundNet: Learning sound representations from unlabeled video." Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS'16), pp. 892-900, 2016.
- Zhang, Zhichao, Shugong Xu, Shan Cao, and Shunqing Zhang. "Deep convolutional neural network with mixup for environmental sound classification." Pattern Recognition and Computer Vision, pp. 356-367, 2018.
- Zhang, Z., S. Xu, S. Zhang, T. Qiao, and S. Cao. "Learning attentive representations for environmental sound classification." IEEE Access, 7, pp. 130327-130339, 2019.
- Li, Xinyu, Venkata Chebiyyam, and Katrin Kirchhoff. "Multi-stream network with temporal attention for environmental sound classification." CoRR, abs/1901.08608, 2019.