Bilayer Segmentation of Consistent Scene Images by Propagation of Multi-level Cues with Adaptive Confidence

Korean title (translated): Bilayer Segmentation of Consistent Scene Images through Adaptive Propagation of Multi-level Cues

  • Lee, Soo-Chahn (이수찬) (Automation and Systems Research Institute, Seoul National University) ;
  • Yun, Il-Dong (윤일동) (School of Digital Information Engineering, Hankuk University of Foreign Studies, Yongin Campus) ;
  • Lee, Sang-Uk (이상욱) (School of Electrical Engineering and Computer Science, Seoul National University)
  • Published : 2009.07.30

Abstract

Many methods have been proposed for segmenting single images or video, but few address multiple images with analogous content. These images, which we term consistent scene images, include concurrent images of a scene and collected images of a similar foreground, and may be used collectively to describe a scene or as input to multi-view stereo. In this paper, we present a method that segments these images with minimal user input, specifically the manual segmentation of one image, by iteratively propagating information via multi-level cues whose confidence adapts to the nature of the images. The propagated cues serve as the basis for computing multi-level potentials in a Markov random field (MRF) framework, and segmentation is performed by energy minimization. Both cues and potentials are classified as low-, mid-, or high-level according to whether they pertain to pixels, patches, or shapes. A major aspect of our approach is that mid-level cues are used to compute low- and mid-level potentials, and high-level cues to compute low-, mid-, and high-level potentials, thereby making use of the information inherent in each cue. Through this process, the proposed method attempts to maximize the amount of information both extracted and used, in order to maximize the consistency of the segmentation. We demonstrate the effectiveness of the proposed method on several sets of consistent scene images and compare it with results based only on mid-level cues [1].

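Korean abstract (translated): Many methods have been proposed for segmenting single images or video, but few segment multiple images of a similar scene at the same time. In this paper, we define consistent scene images as multiple images that were captured in succession at one location or that share a similar foreground object, and we propose a method that segments such images effectively with a small amount of user input. Specifically, after the user manually segments the first image, multi-level cues are propagated to the adjacent image with adaptive weights, based on the segmentation result and the characteristics of the images, and the proposed method segments adjacent images iteratively in this way. Segmentation is performed by energy minimization on a Markov random field. The propagated cues form the basis for defining the energy of each pixel, and they are termed low-, mid-, and high-level cues according to whether they derive from pixels, pixel patches, or the whole image. The energies defined from the propagated cues within the energy minimization framework are likewise defined at three levels: low, mid, and high. Through this process the propagated cues are used in as many ways as possible, which keeps the segmentation results consistent across the images. We evaluate the proposed method on various sets of consistent scene images and compare the results of the proposed multi-cue method with results that use only mid-level cues based on pixel patches.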
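
The energy-minimization idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the cost maps, the fixed confidence weights, the Potts smoothness term, and every function name below are assumptions made for illustration. Iterated conditional modes (ICM), a simple greedy minimizer, stands in for the minimizer used in the paper, which the abstract does not specify. The sketch combines low-, mid-, and high-level foreground costs into one per-pixel cost and then smooths the labeling on a 4-connected grid.

```python
from typing import Iterator, List, Sequence, Tuple

Grid = List[List[float]]
Labels = List[List[int]]  # 1 = foreground, 0 = background


def combine_cues(cost_maps: Sequence[Grid], weights: Sequence[float]) -> Grid:
    """Confidence-weighted sum of per-level foreground costs (each in [0, 1])."""
    h, w = len(cost_maps[0]), len(cost_maps[0][0])
    return [[sum(wt * m[y][x] for wt, m in zip(weights, cost_maps))
             for x in range(w)] for y in range(h)]


def neighbours(y: int, x: int, h: int, w: int) -> Iterator[Tuple[int, int]]:
    """Yield the 4-connected neighbours that lie inside the grid."""
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < h and 0 <= nx < w:
            yield ny, nx


def energy(labels: Labels, cost: Grid, smooth: float) -> float:
    """Unary cost of every label plus a Potts penalty per differing edge."""
    h, w = len(cost), len(cost[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            total += cost[y][x] if labels[y][x] == 1 else 1.0 - cost[y][x]
            # Count each edge once by looking only right and down.
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                total += smooth
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                total += smooth
    return total


def icm(cost: Grid, labels: Labels, smooth: float, max_sweeps: int = 10) -> Labels:
    """Greedy per-pixel updates (ICM) until no label changes."""
    h, w = len(cost), len(cost[0])
    labels = [row[:] for row in labels]  # work on a copy
    for _ in range(max_sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                local = []
                for lab in (0, 1):
                    unary = cost[y][x] if lab == 1 else 1.0 - cost[y][x]
                    pair = smooth * sum(labels[ny][nx] != lab
                                        for ny, nx in neighbours(y, x, h, w))
                    local.append(unary + pair)
                best = 1 if local[1] < local[0] else 0
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:
            break
    return labels


# Hypothetical foreground costs: the true foreground is the left two columns.
low = [[0.2, 0.1, 0.9, 0.8],   # pixel-level cue with two outliers:
       [0.1, 0.8, 0.9, 0.9],   # (row 1, col 1) looks like background,
       [0.2, 0.1, 0.8, 0.2],   # (row 2, col 3) looks like foreground
       [0.1, 0.2, 0.9, 0.9]]
mid = [[0.2, 0.2, 0.8, 0.8] for _ in range(4)]   # patch-level cue
high = [[0.3, 0.3, 0.7, 0.7] for _ in range(4)]  # shape-level cue
WEIGHTS = (0.6, 0.2, 0.2)  # assumed confidences, fixed here rather than adaptive
SMOOTH = 0.5               # assumed Potts penalty

cost = combine_cues([low, mid, high], WEIGHTS)
labels_init = [[1 if c < 0.5 else 0 for c in row] for row in cost]
labels_final = icm(cost, labels_init, SMOOTH)
for row in labels_final:
    print(row)
```

In this toy setting the thresholded labeling inherits both pixel-level outliers, and the smoothness term removes them because each outlier disagrees with most of its neighbours. In the paper the confidences are adaptive rather than fixed; the constant `WEIGHTS` tuple is only a placeholder for that mechanism.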

References

  1. Lu, L., Hager, G.: A nonparametric treatment for location/segmentation based visual tracking. In: CVPR07. (2007)
  2. Comaniciu, D., Meer, P.: Mean shift: A robust approach toward feature space analysis. PAMI 24 (2002) 603–619 https://doi.org/10.1109/34.1000236
  3. Felzenszwalb, P., Huttenlocher, D.: Efficient graph-based image segmentation. IJCV 59 (2004) 167–181 https://doi.org/10.1023/B:VISI.0000022288.19776.77
  4. Shi, J., Malik, J.: Normalized cuts and image segmentation. PAMI 22 (2000) 888–905 https://doi.org/10.1109/34.868688
  5. Boykov, Y., Funka-Lea, G.: Graph cuts and efficient n-d image segmentation. IJCV 70 (2006) 109–131 https://doi.org/10.1007/s11263-006-7934-5
  6. Rother, C., Kolmogorov, V., Blake, A.: "GrabCut": interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23 (2004) 309–314 https://doi.org/10.1145/1015706.1015720
  7. Kumar, M., Torr, P., Zisserman, A.: Obj cut. In: CVPR05. (2005) https://doi.org/10.1109/CVPR.2005.249
  8. Freedman, D., Zhang, T.: Interactive graph cut based segmentation with shape priors. In: CVPR05. (2005) I: 755–762 https://doi.org/10.1109/CVPR.2005.191
  9. Li, Y., Sun, J., Shum, H.Y.: Video object cut and paste. ACM Trans. Graph. 24 (2005) https://doi.org/10.1145/1073204.1073234
  10. Wang, J., Bhat, P., Colburn, R.A., Agrawala, M., Cohen, M.F.: Interactive video cutout. ACM Trans. Graph. 24 (2005) 585–594 https://doi.org/10.1145/1073204.1073233
  11. Kolmogorov, V., Criminisi, A., Blake, A., Cross, G., Rother, C.: Probabilistic fusion of stereo with color and contrast for bi-layer segmentation. IJCV 76 (2008) https://doi.org/10.1007/s11263-007-0070-z
  12. Sun, J., Zhang, W., Tang, X., Shum, H.: Background cut. In: ECCV06. (2006) II: 628–641
  13. Criminisi, A., Cross, G., Blake, A., Kolmogorov, V.: Bilayer segmentation of live video. In: CVPR06. (2006)
  14. Yin, P., Criminisi, A., Winn, J., Essa, I.: Tree-based classifiers for bilayer video segmentation. In: CVPR07. (2007)
  15. Boykov, Y., Kolmogorov, V.: An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. PAMI 26 (2004) 1124–1137 https://doi.org/10.1109/TPAMI.2004.60
  16. The PASCAL Visual Object Classes Challenge homepage: (http://www.pascal-network.org/challenges/voc/)
  17. Rother, C., Minka, T., Blake, A., Kolmogorov, V.: Cosegmentation of image pairs by histogram matching: Incorporating a global constraint into MRFs. In: CVPR06. (2006) I: 993–1000 https://doi.org/10.1109/CVPR.2006.91
  18. Campbell, N., Vogiatzis, G., Hernandez, C., Cipolla, R.: Automatic 3d object segmentation in multiple views using volumetric graph-cuts. In: BMVC07. (2007)
  19. Zheng, S., Tu, Z., Yuille, A.: Detecting object boundaries using low-, mid-, and high-level information. In: CVPR07. (2007)
  20. Kohli, P., Kumar, M., Torr, P.: P3 & beyond: Solving energies with higher order cliques. In: CVPR07. (2007)
  21. Li, S.Z.: Markov Random Field Modeling in Computer Vision. Springer-Verlag (1995)
  22. Rubner, Y., Tomasi, C., Guibas, L.: The earth mover's distance as a metric for image retrieval. IJCV 40 (2000) 99–121 https://doi.org/10.1023/A:1026543900054
  23. Kwon, D., Yun, I., Lee, S.: Efficient feature-based nonrigid registration of multi-phase liver CT volumes. In: BMVC08. (2008)
  24. Xu, C., Prince, J.: Snakes, shapes, and gradient vector flow. T-IP 7 (1998) 359–369 https://doi.org/10.1109/83.661186
  25. Chuang, Y.Y., Agarwala, A., Curless, B., Salesin, D.H., Szeliski, R.: Video matting of complex scenes. ACM Transactions on Graphics 21 (2002) 243–248 Special Issue of the SIGGRAPH 2002 Proceedings. https://doi.org/10.1145/566570.566572