A Depth-based Disocclusion Filling Method for Virtual Viewpoint Image Synthesis

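  • Ilkoo Ahn (Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST))
  • Changick Kim (Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST))
  • Received: 2011.09.01
  • Published: 2011.11.25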

Abstract

Research on 3D imaging and free-viewpoint video (FVV) has recently become very active. Free-viewpoint rendering, which lets a viewer move virtually between multi-view images captured by multiple cameras, is drawing attention because it can be applied in many fields. However, multi-view camera systems are costly and require large transmission bandwidth. An alternative is to generate virtual views from a single texture image and its corresponding depth image. A critical issue in generating virtual views is that regions occluded by foreground (FG) objects in the original view may become visible in the synthesized view. Filling these disocclusions (holes) in a visually plausible manner determines the quality of the synthesis result. This paper presents a method for filling the disocclusions that inevitably arise in virtual view rendering by means of depth-based inpainting. Patch-based non-parametric texture synthesis, which has shown excellent performance in texture synthesis, has two critical elements: determining which part to fill first (the priority) and determining which background patch to copy (the exemplar). In this work, a noise-robust filling priority using the structure tensor of the Hessian matrix is proposed. In addition, to select an appropriate background patch for the hole region, a patch-matching algorithm is proposed that identifies the background region from the depth map, excludes the foreground, and takes the epipolar line into account. The superiority of the proposed method over existing methods is shown through objective and subjective comparisons.
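The exemplar-selection idea in the abstract (copy only background patches, searched near the epipolar line) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a grayscale texture, a depth map in which larger values are farther from the camera, and a rectified horizontal camera shift, so that the epipolar line is approximated by a band of image rows. The function name `best_background_patch` and the parameters `half`, `row_band`, and `bg_margin` are hypothetical.

```python
import numpy as np

def best_background_patch(image, depth, hole, center,
                          half=1, row_band=0, bg_margin=5.0):
    """Return the (row, col) centre of the best source patch, or None.

    image : (H, W) float array, grayscale texture
    depth : (H, W) float array, larger = farther (assumption of this sketch)
    hole  : (H, W) bool array, True where the pixel is disoccluded
    center: centre of the target patch; it must lie at least `half` pixels
            from the border and the patch must contain a known pixel
    """
    h, w = image.shape
    r0, c0 = center
    win = (slice(r0 - half, r0 + half + 1), slice(c0 - half, c0 + half + 1))
    known = ~hole[win]                      # pixels usable for comparison
    target = image[win]
    # Treat the farthest known depth in the target patch as the background.
    bg_depth = depth[win][known].max()

    best, best_cost = None, np.inf
    # Search only rows near the target row (stand-in for the epipolar line).
    for r in range(max(half, r0 - row_band), min(h - half, r0 + row_band + 1)):
        for c in range(half, w - half):
            src = (slice(r - half, r + half + 1), slice(c - half, c + half + 1))
            if hole[src].any():
                continue                    # source must be fully known
            if (depth[src] < bg_depth - bg_margin).any():
                continue                    # source contains foreground
            # Sum of squared differences over the known target pixels only.
            cost = float((((image[src] - target) ** 2)[known]).sum())
            if cost < best_cost:
                best, best_cost = (r, c), cost
    return best

# Toy scene: background on the left, a two-column hole, foreground on the right.
row = np.array([10., 38., 30., 40., 0., 0., 40., 50., 90., 90.])
image = np.tile(row, (3, 1))
depth = np.full((3, 10), 100.0)
depth[:, 6:] = 20.0                         # near (foreground) object
hole = np.zeros((3, 10), dtype=bool)
hole[:, 4:6] = True

print(best_background_patch(image, depth, hole, center=(1, 4)))
```

In this toy scene the foreground patch centred at column 7 matches the known texture exactly, but the depth test rejects it, so the search falls back to the closest background patch. Setting `bg_margin=np.inf` disables the depth test and shows the foreground leaking into the hole, which is the artifact a depth-aware search is meant to avoid.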
