References
- Olah, Chris, and Shan Carter. "Attention and augmented recurrent neural networks." Distill 1.9: e1, 2016.
- http://www.comp.hkbu.edu.hk/-markus/teaching/comp7650/tnn-94-gradient.pdf
- Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural Computation 9.8: 1735-1780, 1997. https://doi.org/10.1162/neco.1997.9.8.1735
- LSTM Figure (source: https://upload.wikimedia.org/wikipedia/commons/9/98/LSTM.png)
- Maas, Andrew L., et al. "Learning word vectors for sentiment analysis." Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), 2011.
- Hashemi, M. "Enlarging smaller images before inputting into convolutional neural network: zero-padding vs. interpolation." Journal of Big Data 6: 98, 2019. https://doi.org/10.1186/s40537-019-0263-7
- Kusner, Matt, et al. "From word embeddings to document distances," International conference on machine learning. 2015.
- Rodriguez, Pau, et al. "Beyond one-hot encoding: Lower dimensional target embedding." Image and Vision Computing 75: 21-31, 2018. https://doi.org/10.1016/j.imavis.2018.04.004
- Farzad, Amir, Hoda Mashayekhi, and Hamid Hassanpour. "A comparative performance analysis of different activation functions in LSTM networks for classification," Neural Computing and Applications 31.7: 2507-2521, 2019. https://doi.org/10.1007/s00521-017-3210-6
- Jiang, Siyu, and Yimin Chen. "Hand gesture recognition by using 3DCNN and LSTM with adam optimizer," Pacific Rim Conference on Multimedia. Springer, Cham, 2017.
- Arpit, Devansh, et al. "h-detach: Modifying the LSTM gradient towards better optimization." arXiv preprint arXiv:1810.03023, 2018.
- Bottou, Leon. "Stochastic gradient descent tricks." Neural networks: Tricks of the trade. Springer, Berlin, Heidelberg, 421-436, 2012.
- Kurbiel, Thomas, and Shahrzad Khaleghian. "Training of deep neural networks based on distance measures using RMSProp," arXiv preprint arXiv:1708.01911, 2017.
- Kingma, Diederik P., and Jimmy Ba. "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
- Stehman, Stephen V. "Selecting and interpreting measures of thematic classification accuracy." Remote Sensing of Environment 62.1: 77-89, 1997. https://doi.org/10.1016/S0034-4257(97)00083-7