- Candanedo LM, Feldheim V, and Deramaix D (2017). Data driven prediction models of energy use of appliances in a low-energy house, Energy and Buildings, 140, 81-97. https://doi.org/10.1016/j.enbuild.2017.01.083
- Chen S, Wang XX, and Harris CJ (2008). NARX-based nonlinear system identification using orthogonal least squares basis hunting, IEEE Transactions on Control Systems Technology, 16, 78-84. https://doi.org/10.1109/TCST.2007.899728
- Cho K, van Merrienboer B, Bahdanau D, and Bengio Y (2014a). On the properties of neural machine translation: Encoder-decoder approaches, arXiv:1409.1259.
- Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, and Bengio Y (2014b). Learning phrase representations using RNN encoder-decoder for statistical machine translation, arXiv:1406.1078.
- Elman JL (1991). Distributed representations, simple recurrent networks, and grammatical structure, Machine Learning, 7, 195-225. https://doi.org/10.1007/BF00114844
- Hochreiter S and Schmidhuber J (1997). Long short-term memory, Neural Computation, 9, 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
- Huang Z, Xu W, and Yu K (2015). Bidirectional LSTM-CRF models for sequence tagging, arXiv:1508.01991.
- Hyndman RJ and Benitez JM (2016). Bagging exponential smoothing methods using STL decomposition and Box-Cox transformation, International Journal of Forecasting, 32, 303-312. https://doi.org/10.1016/j.ijforecast.2015.07.002
- Jang E, Gu S, and Poole B (2016). Categorical reparameterization with Gumbel-softmax, arXiv:1611.01144.
- Li H, Shen Y, and Zhu Y (2018). Stock price prediction using attention-based multi-input LSTM. In Proceedings of the 10th Asian Conference on Machine Learning, 454-469.
- Li G, Wen C, Zheng W, and Chen Y (2011). Identification of a class of nonlinear autoregressive models with exogenous inputs based on kernel machines, IEEE Transactions on Signal Processing, 59, 2146-2159. https://doi.org/10.1109/TSP.2011.2112355
- Liu B and Lane I (2016). Attention-based recurrent neural network models for joint intent detection and slot filling, arXiv:1609.01454.
- Liu Y, Gong C, Yang L, and Chen Y (2019). DSTP-RNN: a dual-stage two-phase attention-based recurrent neural network for long-term and multivariate time series prediction, arXiv:1904.07464.
- McLeod AI and Li WK (1983). Diagnostic checking ARMA time series models using squared-residual autocorrelations, Journal of Time Series Analysis, 4, 269-273. https://doi.org/10.1111/j.1467-9892.1983.tb00373.x
- Nair V and Hinton GE (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML).
- Pedregosa F, Varoquaux G, Gramfort A, et al. (2011). Scikit-learn: machine learning in Python, Journal of Machine Learning Research, 12, 2825-2830.
- Pham H, Tran V, and Yang BS (2010). A hybrid of nonlinear autoregressive model with exogenous input and autoregressive moving average model for long-term machine state forecasting, Expert Systems with Applications, 37, 3310-3317. https://doi.org/10.1016/j.eswa.2009.10.020
- Qin Y, Song D, Chen H, Cheng W, Jiang G, and Cottrell G (2017). A dual-stage attention-based recurrent neural network for time series prediction, arXiv:1704.02971.
- Rumelhart DE, Hinton GE, and Williams RJ (1986). Learning representations by back-propagating errors, Nature, 323, 533-536. https://doi.org/10.1038/323533a0
- Sutskever I, Vinyals O, and Le QV (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, 3104-3112.
- Tao Y, Ma L, Zhang W, Liu J, Liu W, and Du Q (2018). Hierarchical attention based recurrent highway networks for time series prediction, arXiv:1806.00685.
- Werbos P (1990). Backpropagation through time: What it does and how to do it, Proceedings of the IEEE, 78, 1550-1560.
- Yang Z, Yang D, Dyer C, He X, Smola A, and Hovy E (2016). Hierarchical attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1480-1489.
- Yu Y and Kim YJ (2019). Two-dimensional attention-based LSTM model for stock index prediction, Journal of Information Processing Systems, 15, 1231-1242. https://doi.org/10.3745/jips.02.0121
- Zamora-Martinez F, Romeu-Guallart P, and Pardo J (2014). UCI Machine Learning Repository: SML2010 Data Set, UCI Machine Learning Repository. Available from: https://archive.ics.uci.edu/ml/datasets/SML2010