Acknowledgement
Supported by the 삼성미래기술육성재단 (Samsung Science and Technology Foundation).