참고문헌
- Y. Bengio, A. Courville, and P. Vincent, "Representation learning: A review and new perspectives," IEEE transactions on Pattern Analysis and Machine Intelligence, Vol.35, No.8, pp.1798-1828, 2013. https://doi.org/10.1109/TPAMI.2013.50
- A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet classification with deep convolutional neural networks," In Advances in Neural Information Processing Systems (pp. 1097-1105), 2012.
- Y. L. Boureau, and Y. L, Cun, "Sparse feature learning for deep belief networks," Proc. of Advances in Neural Information Processing Systems, pp. 1185-1192, 2008.
- J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Empirical evaluation of gated recurrent neural networks on sequence modeling," arXiv preprint arXiv:1412.3555, 2014.
- S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, Vol.9, No.8, pp.1735-1780, 1997. https://doi.org/10.1162/neco.1997.9.8.1735
- J. Bernd, D. Borth, B. Elizalde, G. Friedland, H. Gallagher, L. Gottlieb, A. Janin, S. Karabashlieva, J. Takahashi, and J. Won, "The YLI-MED corpus: Characteristics, procedures, and plans," arXiv preprint arXiv:1503.04250, 2015.
- C. Goller, and A. Kuchler, "Learning task-dependent distributed representations by backpropagation through structure," Proc. of IEEE International Conference on Neural Networks, pp.347-352, 1996.
- S. Hochreiter, "The vanishing gradient problem during learning recurrent neural nets and problem solutions," International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, Vol.6, No.2, pp.107-116, 1998. https://doi.org/10.1142/S0218488598000094
- K. Ashraf, B. Elizalde, F. Iandola, M. Moskewicz, J. Bernd, G. Friedland, and K. Keutzer, "Audio-based multimedia event detection with DNNs and sparse sampling," Proc. of the 5th ACM on International Conference on Multimedia Retrieval, pp.611-614, 2015.
- D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, Vol.323, No.9, pp.533-536, 1986. https://doi.org/10.1038/323533a0
- O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, "ImageNet Large Scale Visual Reconition Challenge," International Journal of Computer Vision, Vol.115, No.3, pp.211-252, 2015. https://doi.org/10.1007/s11263-015-0816-y
- C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," arXiv preprint arXiv:1512.00567, 2015.
- F. Eyben, M. Wöllmer, and B. Schuller, "Opensmile: the munich versatile and fast open-source audio feature extractor," Proc. of the 18th ACM International Conference on Multimedia, pp.1459-1462, 2010.
- Abadi, M., Agarwal, A., et al., "TensorFlow: Large-scale machine learning on heterogeneous systems," Available: http://tensorflow.org (retrieved 2016, Feb. 2)