
A Study on Shot Segmentation and Indexing of Language Education Videos by Content-based Visual Feature Analysis


  • 한희준 (Department of Library and Information Science, Graduate School, Kyonggi University)
  • Received : 2017.02.21
  • Accepted : 2017.03.13
  • Published : 2017.03.30

Abstract

As IT advances rapidly and smart devices become widely available, video has become the dominant audiovisual medium for delivering information. Video is now an indispensable element of information service content and is used in many forms, including one-way broadcasting over TV, interactive services over the Internet, and audiovisual lending in libraries. For video services delivered to smart devices over the Internet, providers want to minimize the effort and cost of processing the material they offer, while users, constrained by data usage, time, and place, want to access only the parts they need. It is therefore necessary to improve the usability of video by automatically grouping similar portions of the content and then summarizing and indexing them. This paper analyzes the content and characteristics of language education videos and proposes a method that automatically segments the shots composing a video and indexes detailed content information for each shot by combining visual features. Experiments with foreign-language lecture videos show high precision in semantic shot segmentation, confirming that the method can be applied effectively to summary services for language education videos.
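This record does not spell out which visual features or thresholds the authors use. Purely as a minimal sketch of the general idea named in the abstract (content-based visual features used first to detect shot boundaries and then to index each shot), the Python code below compares HSV color histograms of consecutive frames and opens a new shot when their similarity drops below a fixed threshold; the feature choice, the correlation measure, the threshold value, and the file name are illustrative assumptions, not the authors' method.

# Sketch only: histogram-based shot segmentation and per-shot indexing.
# Assumptions (not from the paper): OpenCV for decoding, HSV color histograms
# as the visual feature, histogram correlation with a fixed threshold for
# boundary detection, and the mean histogram of a shot as its index entry.
import cv2
import numpy as np

def frame_histogram(frame, bins=(8, 8, 8)):
    """Normalized HSV color histogram used as a simple visual feature."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, bins, [0, 180, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def segment_and_index(path, threshold=0.5):
    """Split a video into shots where consecutive-frame similarity drops
    below `threshold`, and index each shot by its mean feature vector."""
    cap = cv2.VideoCapture(path)
    shots, features = [], []          # (start_frame, end_frame) pairs, per-shot index
    prev_hist, start, idx = None, 0, 0
    shot_hists = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = frame_histogram(frame)
        if prev_hist is not None:
            sim = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if sim < threshold:       # abrupt visual change -> shot boundary
                shots.append((start, idx - 1))
                features.append(np.mean(shot_hists, axis=0))
                start, shot_hists = idx, []
        shot_hists.append(hist)
        prev_hist = hist
        idx += 1
    cap.release()
    if shot_hists:                    # close the final shot
        shots.append((start, idx - 1))
        features.append(np.mean(shot_hists, axis=0))
    return shots, features

# Hypothetical usage:
# shots, index = segment_and_index("lecture.mp4", threshold=0.5)
# print(len(shots), "shots detected")

The per-shot feature vectors stand in for the richer, combined visual descriptors described in the abstract; in practice any descriptor set could be averaged or concatenated per shot in the same way to build the index.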

