References
- H. Wang et al., Dense trajectories and motion boundary descriptors for action recognition, Int. J. Comput. Vis. 103 (2013), no. 1, 60-79. https://doi.org/10.1007/s11263-012-0594-8
- H. Wang et al., Evaluation of local spatio-temporal features for action recognition, in Proc. British Mach. Vis. Conf. (BMVC), (London, UK), Sept. 2009, pp. 124.1-124.11.
- A. F. Bobick and J. W. Davis, The recognition of human movement using temporal templates, IEEE Trans. Pattern Anal. Mach. Intell. 23 (2001), no. 3, 257-267.
- W. Zhu et al., Co-occurrence feature learning for skeleton based action recognition using regularized deep lstm networks, Proc. AAAI Conf. Artif. Intell. 30 (2016), no. 1, 3697-3703.
- K. Simonyan and A. Zisserman, Two-stream convolutional networks for action recognition in videos, in Proc. Conf. Neural Inf. Process. Syst. (Montreal, Canada), Dec. 2014, pp. 568-576.
- S. Ji et al., 3D convolutional neural networks for human action recognition, IEEE Trans. Pattern Anal. Mach. Intell. 35 (2013), no. 1, 221-231. https://doi.org/10.1109/TPAMI.2012.59
- Z. Li et al., Videolstm convolves, attends and flows for action recognition, Comput. Vis. Image Underst. 166 (2018), 41-50. https://doi.org/10.1016/j.cviu.2017.10.011
- B. Singh et al., A multi-stream bi-directional recurrent neural network for fine-grained action detection, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (Las Vegas, NV, USA), June 2016, pp. 1961-1970.
- J. Marin et al., Learning appearance in virtual scenarios for pedestrian detection, in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. (San Francisco, CA, USA), June 2010, pp. 137-144.
- G. Varol et al., Learning from synthetic humans, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (Honolulu, HI, USA), July 2017, pp. 109-117.
- D. Vazquez et al., Virtual and real world adaptation for pedestrian detection, IEEE Trans. Pattern Anal. Mach. Intell. 36 (2014), no. 4, 797-809. https://doi.org/10.1109/TPAMI.2013.163
- M. Hoai and A. Zisserman, Improving human action recognition using score distribution and ranking, in Computer Vision-ACCV 2014, vol. 9007, Springer, Cham, Switzerland, 2014, pp. 3-20.
- J. Yue-Hei Ng et al., Beyond short snippets: deep networks for video classification, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (Boston, MA, USA), June 2015, pp. 4694-4702.
- B. Fernando et al., Modeling video evolution for action recognition, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (Boston, MA, USA), June 2015, pp. 5378-5387.
- L. Sun et al., Lattice long short-term memory for human action recognition, in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), (Venice, Italy), Oct. 2017, pp. 2147-2156.
- U. Ahsan, C. Sun, and I. Essa, Discrimnet: semi-supervised action recognition from videos using generative adversarial networks, arXiv preprint, CoRR, 2018, arXiv: 1801.07230.
- W. Lotter, G. Kreiman, and D. Cox, Deep predictive coding networks for video prediction and unsupervised learning, arXiv preprint, CoRR, 2016, arXiv: 1605.08104.
- M. Mathieu, C. Couprie, and Y. LeCun, Deep multi-scale video prediction beyond mean square error, arXiv preprint, CoRR, 2015, arXiv: 1511.05440.
- S. Tulyakov et al., Mocogan: decomposing motion and content for video generation, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (Salt Lake City, UT, USA), June 2018, pp. 1526-1535.
- C. Vondrick, H. Pirsiavash, and A. Torralba, Generating videos with scene dynamics, in Proc. Conf. Neural Inf. Process. Syst. (Barcelona, Spain), Dec. 2016, pp. 613-621.
- S. Wen et al., Generating realistic videos from keyframes with concatenated gans, IEEE Trans. Circuits Syst. Video Technol. 29 (2019), no. 8, 2337-2348. https://doi.org/10.1109/TCSVT.2018.2867934
- M. Ranzato et al., Video (language) modeling: A baseline for generative models of natural videos, arXiv preprint, CoRR, 2014, arXiv: 1412.6604.
- N. Srivastava, E. Mansimov, and R. Salakhutdinov, Unsupervised learning of video representations using lstms, in Proc. Int. Conf. Mach. Learn. (Lille, France), July 2015, pp. 843-852.
- A. Mikolajczyk and M. Grochowski, Data augmentation for improving deep learning in image classification problem, in Proc. Int. Interdiscip. PhD Workshop (IIPhDW), (Swinoujscie, Poland), May 2018, pp. 117-122.
- M. N. Haque et al., Heterogeneous ensemble combination search using genetic algorithm for class imbalanced data classification, PLoS ONE 11 (2016), no. 1, article no. e0146116.
- P. Yang et al., Ensemble-based wrapper methods for feature selection and class imbalance learning, in Advances in Knowledge Discovery and Data Mining, vol. 7818, Springer, Berlin, Heidelberg, Germany, 2013, pp. 544-555.
- N. R. Howe and A. Deschamps, Better foreground segmentation through graph cuts, arXiv preprint, CoRR, 2004, arXiv: cs/0401017.
- D. Weinland, R. Ronfard, and E. Boyer, Free viewpoint action recognition using motion history volumes, Comput. Vis. Image Underst. 104 (2006), no. 2-3, 249-257. https://doi.org/10.1016/j.cviu.2006.07.013
- S. Singh, S. A. Velastin, and H. Ragheb, Muhavi: a multicamera human action video dataset for the evaluation of action recognition methods, in Proc. IEEE Int. Conf. Adv. Video Signal Based Surveillance (Boston, MA, USA), Sept. 2010, pp. 48-55.
- N. Nida et al., Instructor activity recognition through deep spatiotemporal features and feedforward extreme learning machines, Math. Probl. Eng. 2019 (2019).
- K. He et al., Deep residual learning for image recognition, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (Las Vegas, NV, USA), June 2016, pp. 770-778.
- C. Szegedy et al., Going deeper with convolutions, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (Boston, MA, USA), June 2015, pp. 1-9.
- F. Iandola et al., Densenet: implementing efficient convnet descriptor pyramids, arXiv preprint, CoRR, 2014, arXiv: 1404.1869.
- Q. Ke et al., Skeletonnet: mining deep part features for 3D action recognition, IEEE Sig. Process. Lett. 24 (2017), no. 6, 731-735. https://doi.org/10.1109/LSP.2017.2690339
- W. Zhu et al., Hierarchical extreme learning machine for unsupervised representation learning, in Proc. Int. Joint Conf. Neural Netw. (IJCNN), (Killarney, Ireland), July 2015, pp. 1-8.
- A. A. Chaaraoui, P. Climent-Perez, and F. Florez-Revuelta, Silhouette-based human action recognition using sequences of key poses, Pattern Recognit. Lett. 34 (2013), no. 15, 1799-1807. https://doi.org/10.1016/j.patrec.2013.01.021
- A. Farhadi and M. K. Tabrizi, Learning to recognize activities from the wrong view point, in Computer Vision-ECCV 2008, vol. 5302, Springer, Berlin, Heidelberg, Germany, 2008, pp. 154-166.
- C.-H. Huang, Y.-R. Yeh, and Y.-C. F. Wang, Recognizing actions across cameras by exploring the correlated subspace, in Computer Vision-ECCV 2012: Workshops and Demonstrations, vol. 7583, Springer, Berlin, Heidelberg, Germany, 2012, pp. 342-351.
- K. K. Reddy, J. Liu, and M. Shah, Incremental action recognition using feature-tree, in Proc. IEEE Int. Conf. Comput. Vis. (Kyoto, Japan), Sept. 2009, pp. 1010-1017.
- D. Weinland, E. Boyer, and R. Ronfard, Action recognition from arbitrary views using 3D exemplars, in Proc. IEEE Int. Conf. Comput. Vis. (Rio de Janeiro, Brazil), Oct. 2007, pp. 1-7.
- F. Murtaza, M. H. Yousaf, and S. A. Velastin, Multi-view human action recognition using 2D motion templates based on MHIs and their HOG description, IET Comput. Vis. 10 (2016), no. 7, 758-767. https://doi.org/10.1049/iet-cvi.2015.0416
- J. Carreira and A. Zisserman, Quo vadis, action recognition? A new model and the kinetics dataset, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (Honolulu, HI, USA), July 2017, pp. 6299-6308.
- A. Vaswani et al., Attention is all you need, in Proc. Conf. Neural Inf. Process. Syst. (Long Beach, CA, USA), Dec. 2017, pp. 5998-6008.
- X.-Y. Zhang et al., Learning transferable self-attentive representations for action recognition in untrimmed videos with weak supervision, Proc. AAAI Conf. Artif. Intell. 33 (2019), no. 1, 9227-9234.