Deep Learning Technologies for Analysis of TV Drama Video Stories

TV 드라마 비디오 스토리 분석 딥러닝 기술

  • Published : 2017.01.30

Abstract

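To learn from video automatically and solve related problems, it is essential to have techniques that grasp high-level abstract concepts by learning from the basic components of video: visual, audio, and language information. Deep learning has recently made such techniques possible at a practical level, so it is now feasible to attempt the more challenging problems of video story analysis and understanding. This article introduces recent deep learning techniques applicable to the analysis of each video component, and examines case studies of TV drama story analysis built around deep learning.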
