Deep Learning-Based Video Analysis Technology

  • Published: 2015.09.18

References

  1. Moez Baccouche, Franck Mamalet, Christian Wolf, Christophe Garcia, and Atilla Baskurt. Sequential deep learning for human action recognition. In Human Behavior Understanding, pages 29-39. Springer, 2011.
  2. John L Barron, David J Fleet, and Steven S Beauchemin. Performance of optical flow techniques. International journal of computer vision, 12(1):43-77, 1994. https://doi.org/10.1007/BF01420984
  3. Moshe Blank, Lena Gorelick, Eli Shechtman, Michal Irani, and Ronen Basri. Actions as space-time shapes. In The Tenth IEEE International Conference on Computer Vision (ICCV'05), pages 1395-1402, 2005.
  4. Xianjie Chen and Alan L Yuille. Articulated pose estimation by a graphical model with image dependent pairwise relations. In Advances in Neural Information Processing Systems, pages 1736-1744, 2014.
  5. Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition, 2005. CVPR 2005. Computer Society Conference on, volume 1, pages 886-893. IEEE, 2005.
  6. David Eigen, Christian Puhrsch, and Rob Fergus. Depth map prediction from a single image using a multi-scale deep network. In Advances in Neural Information Processing Systems, pages 2366-2374, 2014.
  7. Sepp Hochreiter and Jurgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735-1780, 1997. https://doi.org/10.1162/neco.1997.9.8.1735
  8. Junlin Hu, Jiwen Lu, and Yap-Peng Tan. Discriminative deep metric learning for face verification in the wild. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 1875-1882. IEEE, 2014.
  9. Shuiwang Ji, Wei Xu, Ming Yang, and Kai Yu. 3d convolutional neural networks for human action recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 35(1):221-231, 2013. https://doi.org/10.1109/TPAMI.2012.59
  10. Andrej Karpathy, George Toderici, Sachin Shetty, Tommy Leung, Rahul Sukthankar, and Li Fei-Fei. Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 1725-1732. IEEE, 2014.
  11. Ho-Joon Kim, Joseph S Lee, and Hyun-Seung Yang. Human action recognition using a modified convolutional neural network. In Advances in Neural Networks - ISNN 2007, pages 715-723. Springer, 2007.
  12. Alexander Klaser, Marcin Marszalek, and Cordelia Schmid. A spatio-temporal descriptor based on 3d-gradients. In BMVC 2008-19th British Machine Vision Conference, pages 275-1. British Machine Vision Association, 2008.
  13. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097-1105, 2012.
  14. H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre. HMDB: a large video database for human motion recognition. In Proceedings of the International Conference on Computer Vision (ICCV), 2011.
  15. Ivan Laptev and Tony Lindeberg. On space-time interest points. International Journal of Computer Vision, 64(2-3):107-123, 2005. https://doi.org/10.1007/s11263-005-1838-7
  16. Ivan Laptev, Marcin Marszalek, Cordelia Schmid, and Benjamin Rozenfeld. Learning realistic human actions from movies. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1-8. IEEE, 2008.
  17. Quoc V Le, Will Y Zou, Serena Y Yeung, and Andrew Y Ng. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 3361-3368. IEEE, 2011.
  18. Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998. https://doi.org/10.1109/5.726791
  19. David G Lowe. Distinctive image features from scale-invariant keypoints. International journal of computer vision, 60(2):91-110, 2004. https://doi.org/10.1023/B:VISI.0000029664.99615.94
  20. Tomas Pfister, Karen Simonyan, James Charles, and Andrew Zisserman. Deep convolutional neural networks for efficient pose estimation in gesture videos. In Computer Vision-ACCV 2014, pages 538-552. Springer, 2015.
  21. Marc'Aurelio Ranzato, Arthur Szlam, Joan Bruna, Michael Mathieu, Ronan Collobert, and Sumit Chopra. Video (language) modeling: a baseline for generative models of natural videos. CoRR, abs/1412.6604, 2014.
  22. Marc'Aurelio Ranzato, Arthur Szlam, Joan Bruna, Michael Mathieu, Ronan Collobert, and Sumit Chopra. Video (language) modeling: a baseline for generative models of natural videos. arXiv preprint arXiv:1412.6604, 2014.
  23. Mikel D Rodriguez, Javed Ahmed, and Mubarak Shah. Action mach: a spatio-temporal maximum average correlation height filter for action recognition. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1-8. IEEE, 2008.
  24. Christian Schuldt, Ivan Laptev, and Barbara Caputo. Recognizing human actions: a local svm approach. In Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on, volume 3, pages 32-36. IEEE, 2004.
  25. Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, and Yann LeCun. Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229, 2013.
  26. Karen Simonyan and Andrew Zisserman. Two-stream convolutional networks for action recognition in videos. In Advances in Neural Information Processing Systems, pages 568-576, 2014.
  27. Khurram Soomro, Amir Roshan Zamir, and Mubarak Shah. Ucf101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402, 2012.
  28. Nitish Srivastava, Elman Mansimov, and Ruslan Salakhutdinov. Unsupervised learning of video representations using LSTMs. arXiv preprint arXiv:1502.04681, 2015.
  29. Lin Sun, Kui Jia, Tsung-Han Chan, Yuqiang Fang, Gang Wang, and Shuicheng Yan. Dl-sfa: Deeply-learned slow feature analysis for action recognition. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 2625-2632. IEEE, 2014.
  30. Yi Sun, Yuheng Chen, Xiaogang Wang, and Xiaoou Tang. Deep learning face representation by joint identification-verification. In Advances in Neural Information Processing Systems, pages 1988-1996, 2014.
  31. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. arXiv preprint arXiv:1409.4842, 2014.
  32. Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, and Lior Wolf. Deepface: Closing the gap to human-level performance in face verification. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 1701-1708. IEEE, 2014.
  33. Jonathan J Tompson, Arjun Jain, Yann LeCun, and Christoph Bregler. Joint training of a convolutional network and a graphical model for human pose estimation. In Advances in Neural Information Processing Systems, pages 1799-1807, 2014.
  34. Alexander Toshev and Christian Szegedy. Deeppose: Human pose estimation via deep neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages 1653-1660. IEEE, 2014.
  35. Heng Wang, Muhammad Muneeb Ullah, Alexander Klaser, Ivan Laptev, and Cordelia Schmid. Evaluation of local spatio-temporal features for action recognition. In BMVC 2009-British Machine Vision Conference, pages 124-1. BMVA Press, 2009.
  36. Geert Willems, Tinne Tuytelaars, and Luc Van Gool. An efficient dense and scale-invariant spatio-temporal interest point detector. In Computer Vision-ECCV 2008, pages 650-663. Springer, 2008.
  37. Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, and Aude Oliva. Learning deep features for scene recognition using places database. In Advances in Neural Information Processing Systems, pages 487-495, 2014.
  38. WW Zhu, A Berndsen, EC Madsen, M Tan, IH Stairs, A Brazier, P Lazarus, R Lynch, P Scholz, K Stovall, et al. Searching for pulsars using image pattern recognition. The Astrophysical Journal, 781(2):117, 2014. https://doi.org/10.1088/0004-637X/781/2/117
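  39. 김지섭, 김은솔, 유낭웅, 정문식, 최현수, 장병탁. An integrated recognition system for human pose, action, and location in 2D images using a deep convolutional neural network. In Proceedings of the 2015 Korea Computer Congress (KCC2015), pages 846-848. Korean Institute of Information Scientists and Engineers, 2015.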