Video Scene Detection using Shot Clustering based on Visual Features

  • Received : 2012.04.06
  • Accepted : 2012.06.04
  • Published : 2012.06.30

Abstract

Video data is unstructured and has a complex structure. As efficient management and retrieval of video data become more important, video parsing based on the visual features of video content has been studied as a way to reconstruct video data into a meaningful structure. Early studies on video parsing focused on splitting video data into shots, but a shot boundary is a physical boundary, and detecting it does not consider the semantic association of video data. Recently, studies have actively pursued clustering methods that organize semantically associated video shots into video scenes, which are defined by semantic boundaries. Previous studies on video scene detection apply clustering algorithms based on a similarity measure between video shots that depends mainly on color features. However, correctly identifying a video shot or scene and detecting gradual transitions such as dissolve, fade, and wipe are difficult because the color features of video data contain noise and change abruptly when an unexpected object intervenes. To solve these problems, this paper proposes the Scene Detector by using Color histogram, corner Edge and Object color histogram (SDCEO), which detects video scenes by clustering similar shots that make up the same event, based on visual features including the color histogram, the corner edge, and the object color histogram. SDCEO is notable in that it uses the edge feature together with the color feature and, as a result, effectively detects gradual transitions as well as abrupt transitions. SDCEO consists of the Shot Bound Identifier and the Video Scene Detector. The Shot Bound Identifier comprises the Color Histogram Analysis step and the Corner Edge Analysis step.
In the Color Histogram Analysis step, SDCEO uses the color histogram feature to organize shot boundaries. The color histogram, which records the percentage of each quantized color among all pixels in a frame, is chosen for its good performance, as also reported in other work on content-based image and video analysis. To organize shot boundaries, SDCEO joins associated sequential frames into shot boundaries by measuring the similarity of the color histograms between frames. In the Corner Edge Analysis step, SDCEO identifies the final shot boundaries by using the corner edge feature: it detects associated shot boundaries by comparing the corner edge feature between the last frame of the previous shot boundary and the first frame of the next shot boundary. In the Key-frame Extraction step, SDCEO compares each frame with all other frames in the same shot boundary, measures their similarity using the Euclidean distance between histograms, and selects the frame most similar to all of them as the key-frame. The Video Scene Detector clusters associated shots that make up the same event by using the hierarchical agglomerative clustering method based on visual features including the color histogram and the object color histogram. After detecting video scenes, SDCEO organizes the final video scenes by repeating the clustering until the similarity distance between shot boundaries is less than the threshold h. We built a prototype of SDCEO and ran experiments against manually constructed baseline data; the results are satisfactory, with a precision of 93.3% for shot boundary detection and 83.3% for video scene detection.
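
The Color Histogram Analysis and Key-frame Extraction steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the number of quantization bins, the pixel-list frame representation, and the threshold value are all assumptions made for illustration, and the Corner Edge Analysis step is omitted.

```python
from math import sqrt


def color_histogram(frame, bins=4):
    """Fraction of pixels falling into each quantized RGB color.

    `frame` is assumed to be a list of (r, g, b) tuples with 0-255 channels.
    """
    hist = [0.0] * (bins ** 3)
    for r, g, b in frame:
        # Map each 0-255 channel to a bin index in 0..bins-1.
        qr, qg, qb = r * bins // 256, g * bins // 256, b * bins // 256
        hist[qr * bins * bins + qg * bins + qb] += 1.0
    return [count / len(frame) for count in hist]


def hist_distance(h1, h2):
    """Euclidean distance between two histograms of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def group_into_shots(histograms, threshold):
    """Join consecutive frames into one shot while neighbors stay similar.

    A new shot starts whenever the distance between adjacent frames
    reaches `threshold` (a hypothetical tuning parameter).
    """
    if not histograms:
        return []
    shots = [[0]]
    for i in range(1, len(histograms)):
        if hist_distance(histograms[i - 1], histograms[i]) < threshold:
            shots[-1].append(i)
        else:
            shots.append([i])
    return shots


def key_frame(shot, histograms):
    """Frame index with the smallest total distance to all frames in the shot."""
    return min(
        shot,
        key=lambda i: sum(hist_distance(histograms[i], histograms[j]) for j in shot),
    )
```

Given per-frame histograms, `group_into_shots(histograms, threshold)` returns lists of frame indices, and `key_frame(shot, histograms)` picks one representative index per shot.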

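Video data is unstructured, complex data. As structuring video data for efficient management and retrieval has grown in importance, research on detecting video scenes based on the visual features of the content has been actively pursued. Previous studies mainly tried to detect video scenes by clustering based on shot similarity evaluated using color information alone. However, because the color information of video data contains noise and changes abruptly when, for example, a particular object intervenes, considering color as the only feature makes it difficult to correctly identify video shots or scenes and to detect gradual transitions such as dissolves, fades, and wipes. To solve these problems, this paper proposes the Scene Detector by using Color histogram, corner Edge and Object color histogram (SDCEO), a method that detects video scenes by clustering semantically similar shots that make up the same event, based on visual features consisting of the frame color histogram, the corner edge, and the object color histogram. For shot boundary identification, SDCEO first merges associated consecutive frames into shot boundaries using each frame's color histogram in the Color Histogram Analysis step, and then, in the Corner Edge Analysis step, refines the shot boundaries by comparing the corner edge features of the first and last frames of the merged shots to identify the final shots. In the Key-frame Extraction step, the frame most similar to all frames is extracted as the key-frame representing each shot by comparing the similarity between frames within the shot. Then, for video scene detection, shots with a semantic association are grouped using a bottom-up hierarchical clustering method based on the frame visual features consisting of the color histogram and the object color histogram. We built a prototype of SDCEO and evaluated its effectiveness in experiments on three videos; the accuracy of shot boundary identification averaged 93.3% and the accuracy of video scene detection averaged 83.3%, which is satisfactory performance.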
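
The bottom-up (agglomerative) hierarchical clustering of shots can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the method as published: each shot is assumed to be represented by one feature vector (for example, its key-frame color histogram concatenated with an object color histogram), average linkage is an arbitrary choice, and the stopping rule (stop once the closest pair of clusters is at least `h` apart) is one possible reading of the threshold h mentioned in the abstract.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, features):
    """Mean pairwise distance between the shots of two clusters."""
    total = sum(euclidean(features[i], features[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_shots(features, h):
    """Merge the closest clusters repeatedly until none is closer than `h`.

    `features[i]` is the (assumed) feature vector of shot i; the result is a
    list of clusters, each a sorted list of shot indices (candidate scenes).
    """
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], features)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d >= h:
            # Assumed stopping rule: remaining clusters are too dissimilar.
            break
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters
```

This sketch does not enforce temporal adjacency between merged shots; a variant that merges only neighboring shots would be a further assumption beyond what the abstract states.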
