References
- Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
- Redmon, Joseph, and Ali Farhadi. "YOLO9000: better, faster, stronger." arXiv preprint arXiv:1612.08242 (2016).
- Liu, Wei, et al. "SSD: Single shot multibox detector." arXiv preprint arXiv:1512.02325 (2015).
- Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.
- Henriques, Joao, et al. "Exploiting the circulant structure of tracking-by-detection with kernels." Computer Vision-ECCV 2012 (2012): 702-715.
- Henriques, Joao F., et al. "High-speed tracking with kernelized correlation filters." IEEE Transactions on Pattern Analysis and Machine Intelligence 37.3 (2015): 583-596. https://doi.org/10.1109/TPAMI.2014.2345390
- Kuznetsova, Alina, and Sung Ju Hwang. "Incremental learning framework for object detection in videos." U.S. Patent Application No. 14/887,141.
- Kuznetsova, Alina, et al. "Expanding Object Detector's HORIZON: Incremental Learning Framework for Object Detection in Videos (supplementary materials)."
- Felsberg M. et al. (2016) The Thermal Infrared Visual Object Tracking VOT-TIR2016 Challenge Results. In: Computer Vision - ECCV 2016 Workshops. ECCV 2016. Lecture Notes in Computer Science, vol 9914. Springer, Cham.
- A. Robicquet, A. Sadeghian, A. Alahi, and S. Savarese. "Learning social etiquette: Human trajectory prediction in crowded scenes." European Conference on Computer Vision (ECCV), 2016.
- Tianmin Shu, Dan Xie, Brandon Rothrock, Sinisa Todorovic, and Song-Chun Zhu. "Joint inference of groups, events and human roles in aerial videos." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
- Donahue, Jeffrey, et al. "Long-term recurrent convolutional networks for visual recognition and description." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
- Pan, Pingbo, et al. "Hierarchical recurrent neural encoder for video representation with application to captioning." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
- Shetty, Rakshith, and Jorma Laaksonen. "Video captioning with recurrent networks based on frame- and video-level features and visual content classification." arXiv preprint arXiv:1512.02949 (2015).