Keyframe Extraction from Home Videos Using 5W and 1H Information

  • Jang, Cheolhun (Department of Computer Science and Engineering, Pohang University of Science and Technology (POSTECH)) ;
  • Cho, Sunghyun (Department of Computer Science and Engineering, Pohang University of Science and Technology (POSTECH)) ;
  • Lee, Seungyong (Department of Computer Science and Engineering, Pohang University of Science and Technology (POSTECH))
  • Received : 2013.05.15
  • Accepted : 2013.06.08
  • Published : 2013.06.10

Abstract

We propose a novel method for extracting keyframes from home videos based on 5W and 1H information (who, what, when, where, why, and how). Keyframe extraction is a form of video summarization that selects only the frames containing the important information in a video. Because a home video may cover a wide variety of topics, we cannot make topic-specific assumptions when extracting information. In addition, summarizing a home video requires analyzing human behavior, because people are its main subjects. In this paper, we extract 5W and 1H information by analyzing human faces, human behaviors, and global background information. Experimental results demonstrate that our technique extracts keyframes that are more similar to human selections than those of previous methods.

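In this paper, we present a method for extracting keyframes from home videos based on 5W and 1H information. Keyframe extraction summarizes a video by selecting only the specific frames considered important. Because home videos cover diverse topics, it is difficult to extract information under special assumptions, and because people are usually the center of such videos, summarization must focus on human behavior. We analyze human faces, human behaviors, and global background information to extract the key 5W and 1H information, which serves as a general, person-centered summarization criterion. In addition, we measure the blur size of every frame to estimate how much information each frame contains, and we select the frame containing the most information as the keyframe. A user study confirms that, when users select multiple keyframes from a home video, our results are more similar to the users' selections than those of previous methods.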
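
The blur-aware selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the blur estimator or how it is combined with the 5W and 1H information, so the sketch makes two assumptions: it uses the variance of a 4-neighbour Laplacian as a stand-in sharpness measure, and it multiplies that sharpness by a hypothetical per-frame `importance` weight. The names `laplacian_variance` and `select_keyframe` are illustrative and do not come from the paper.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Sharpness proxy: variance of a 4-neighbour Laplacian over interior pixels.

    `gray` is a 2D list of grayscale values, at least 3x3 in size.
    Blurry frames have weak edges, so the Laplacian response varies little.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    return pvariance(responses)


def select_keyframe(frames, importance):
    """Return the index of the frame with the highest combined score.

    `importance` is a hypothetical per-frame weight standing in for the
    5W and 1H information; the product with sharpness is an assumption.
    """
    scores = [weight * laplacian_variance(frame)
              for frame, weight in zip(frames, importance)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy frames: a smooth horizontal ramp (no edges) and a checkerboard (strong edges).
smooth = [[10 * x for x in range(5)] for _ in range(5)]
checker = [[255 * ((x + y) % 2) for x in range(5)] for y in range(5)]

best = select_keyframe([smooth, checker], [1.0, 1.0])
print(best)  # index 1, the checkerboard frame
```

With equal importance weights the sharper frame wins. In a fuller pipeline the weights would come from face, behavior, and background analysis, which this sketch does not attempt.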
