Deep Learning Technologies for Analysis of TV Drama Video Stories

Nam, Jang-Gun; Kim, Jin-Hwa; Kim, Byeong-Hui; Jang, Byeong-Tak

Broadcasting and Media Magazine (방송과미디어)

Volume 22, Issue 1 / pp. 91-102 / 2017 / pISSN 2383-9708

The Korean Institute of Broadcast and Media Engineers (한국방송∙미디어공학회)

TV 드라마 비디오 스토리 분석 딥러닝 기술

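Nam, Jang-Gun (Seoul National University);
Kim, Jin-Hwa (Seoul National University);
Kim, Byeong-Hui (Seoul National University);
Jang, Byeong-Tak (Seoul National University)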

Published: 2017.01.30

Abstract

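To learn video information automatically and solve related problems, it is essential to have techniques that grasp high-level abstract concepts by learning from the image, audio, and language information that form the basic components of video. Because deep learning has recently made such techniques possible at a practical level, it has become feasible to attempt the more challenging problems of video story analysis and understanding. This article introduces recent deep learning techniques applicable to the analysis of each component of video, and examines a case study of TV drama story analysis built around deep learning.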
