A Study on the Interactive Effect of Spoken Words and Imagery not Synchronized in Multimedia Surrogates for Video Gisting

  • Hyun-Hee Kim (Department of Library and Information Science, Myongji University)
  • Received : 2011.04.18
  • Accepted : 2011.05.11
  • Published : 2011.05.30


This study examines the interaction effect of unsynchronized spoken words and imagery in audio/image surrogates for video gisting. We conducted an experiment with 64 participants, hypothesizing that participants would understand the content of videos better when viewing combined audio/image surrogates than when using audio-only or image-only surrogates. The results showed that, overall, audio/image surrogates supported video gisting better than audio-only or image-only surrogates. However, the unsynchronized multimedia surrogates made it difficult for some participants to attend to both audio and images when the content presented in the two channels differed greatly.


Interactive Effect; Unsynchronized Multimedia Surrogate; Video Gisting


