Video Coding Method Using Visual Perception Model based on Motion Analysis

  • Oh, Hyung-Suk (College of Electronics and Radio Engr. Kyung Hee University) ;
  • Kim, Won-Ha (College of Electronics and Radio Engr. Kyung Hee University)
  • Received : 2011.12.28
  • Accepted : 2012.03.21
  • Published : 2012.03.30

Abstract

We develop a video processing method that enables more advanced human-perception-oriented video coding. The proposed method jointly reflects rate-distortion-based optimization and human visual perception, which is affected by visual saliency, the limited spatio-temporal resolution of the eye, and the regional motion history. To capture these perceptual effects, we devise an online moving-pattern classifier based on the Hedge algorithm. We then embed an existing visual saliency model into the proposed moving patterns to establish a human visual perception model. To realize this model, we extend the conventional foveation filtering method. Whereas the conventional foveation filter only smooths less stimulating video signals, the proposed filter can locally smooth or enhance signals according to the human visual perception model without introducing artifacts. Thanks to this signal enhancement, the proposed filter more efficiently transfers the bandwidth saved on smoothed signals to the enhanced signals. Performance evaluation verifies that the proposed method maintains overall video quality while improving perceptual quality by 12% to 44%.
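The Hedge algorithm used for the online moving-pattern classifier is the classic multiplicative-weights scheme for prediction with expert advice. As an illustration only, here is a minimal sketch of such an update; the three "pattern experts" and their per-frame losses are purely hypothetical, and the paper's actual pattern definitions and loss function are not reproduced here:

```python
import math

def hedge_update(weights, losses, eta=0.5):
    """One Hedge round: exponentially down-weight each expert by its loss."""
    return [w * math.exp(-eta * l) for w, l in zip(weights, losses)]

def hedge_predict(weights):
    """Normalize weights into a probability distribution over experts."""
    total = sum(weights)
    return [w / total for w in weights]

# Toy run with three hypothetical moving-pattern experts
# (e.g. static / smooth / erratic -- the names are illustrative only).
weights = [1.0, 1.0, 1.0]
loss_stream = [
    [1.0, 0.0, 1.0],   # expert 1 matches this frame's motion best
    [1.0, 0.0, 0.5],
    [0.8, 0.1, 1.0],
]
for losses in loss_stream:
    weights = hedge_update(weights, losses)

probs = hedge_predict(weights)
# The consistently low-loss expert accumulates the highest probability.
assert probs[1] == max(probs)
```

The exponential update gives Hedge its standard regret guarantee against the best single expert, which is what makes it attractive for on-line classification without a training phase.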

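The extended foveation filter described in the abstract both smooths and enhances, steered by a per-pixel perception model. A minimal sketch of that idea, assuming a saliency map in [0, 1] and using a box blur plus unsharp masking as stand-ins for the paper's actual foveation kernels:

```python
import numpy as np

def box_blur(img, k=5):
    """Cheap smoothing kernel: k x k box average with edge padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def foveation_filter(img, saliency, boost=1.0):
    """Smooth low-saliency regions, enhance high-saliency ones.

    saliency in [0, 1]: 0 -> fully smoothed, 1 -> fully enhanced.
    """
    low = box_blur(img)
    high = img + boost * (img - low)      # unsharp masking
    return (1.0 - saliency) * low + saliency * high

rng = np.random.default_rng(0)
img = rng.random((32, 32))
sal = np.zeros((32, 32))
sal[8:24, 8:24] = 1.0                     # hypothetical salient region

out = foveation_filter(img, sal)
# Outside the salient region the output equals the blurred image.
assert np.allclose(out[:4, :4], box_blur(img)[:4, :4])
```

Blending smoothed and sharpened versions per pixel is what lets bits saved in the smoothed (low-saliency) regions be spent on the enhanced (high-saliency) ones when the filtered video is subsequently rate-distortion coded.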

References

  1. Final report from the Video Quality Experts Group on the validation of objective quality metrics for video quality assessment.
  2. K. Minoo and T. Q. Nguyen, "Perceptual video coding with H.264", IEEE Asilomar Conf. Signals, Syst. Comput., pp. 741-745, Nov. 2005.
  3. N. Jacobson and T. Q. Nguyen, "Video processing with scale-aware saliency: application to frame rate up-conversion", Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), pp. 1313-1316, May 2011.
  4. K. Shin, M. Stolte, and S. C. Chong, "The effect of spatial attention on invisible stimuli", Attention, Perception, & Psychophysics, Vol. 71, No. 7, pp. 1507-1513, 2009. https://doi.org/10.3758/APP.71.7.1507
  5. Y. Yeshurun and M. Carrasco, "Attention improves or impairs visual performance by enhancing spatial resolution", Nature, Vol. 396, pp. 72-75, Nov. 1998. https://doi.org/10.1038/23936
  6. Z. Wang, L. Lu, and A. C. Bovik, "Foveation scalable video coding with automatic fixation selection", IEEE Trans. Image Processing, Vol. 12, No. 2, pp. 243-254, Feb. 2003. https://doi.org/10.1109/TIP.2003.809015
  7. D. Walther and C. Koch, "Modelling attention to salient proto-objects", Neural Networks, Vol. 19, pp. 1395-1407, 2006. https://doi.org/10.1016/j.neunet.2006.10.001
  8. N. Jacobson, Y. Lee, V. Mahadevan, N. Vasconcelos, and T. Q. Nguyen, "A novel approach to FRUC using discriminant saliency and frame segmentation", IEEE Trans. Image Processing, Vol. 19, No. 11, pp. 2924-2934, Nov. 2010. https://doi.org/10.1109/TIP.2010.2050928
  9. J. Harel, C. Koch, and P. Perona, "Graph-based visual saliency", Advances in Neural Information Processing Systems, Vol. 19, pp. 545-552, 2007.
  10. Y. F. Ma and H. J. Zhang, "A model of motion attention for video skimming", ICIP, pp. 129-132, Sep. 2002.
  11. E. Shechtman, Y. Caspi, and M. Irani, "Space-time super-resolution", IEEE Trans. Pattern Anal. and Mach. Intell., Vol. 27, No. 4, pp. 531-545, Apr. 2005. https://doi.org/10.1109/TPAMI.2005.85
  12. D. Melcher and M. C. Morrone, "Spatiotopic temporal integration of visual motion across saccadic eye movements", Nature Neuroscience, Vol. 6, No. 8, pp. 877-881, Aug. 2003. https://doi.org/10.1038/nn1098
  13. W.-H. Kim, H.-S. Oh, and T.-H. Eom, "Video Processing based on Motion Analysis", IPIU 2012, Feb. 2012.
  14. N. Jacobson, Y. Freund, and T. Q. Nguyen, "Occlusion boundary detection using an online learning framework", Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), pp. 913-916, May 2011.
  15. J. Watkinson, The MPEG Handbook, Focal Press, 2004.
  16. H. D. Cheng, J. Chen, and J. Li, "Threshold selection based on fuzzy c-partition entropy approach", Pattern Recognition, Vol. 31, No. 7, pp. 857-870, 1998. https://doi.org/10.1016/S0031-3203(97)00113-1
  17. R. Gonzalez and R. E. Woods, Digital Image Processing, 2nd edition, Prentice Hall, 2002.
  18. D. Martin, D. Tal, and J. Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics", ICCV, Vol. 2, pp. 416-423, 2001.
  19. S. Aghagolzadeh and O. K. Ersoy, "Transform Image Enhancement", Opt. Eng., Vol. 31, pp. 614-626, Mar. 1992. https://doi.org/10.1117/12.56095
  20. H.264 Reference Software Version JM17.2[Online], Available: http://iphome.hhi.de/suehring/tml, 2010.
  21. N. D. Narvekar and L. J. Karam, "A no-reference image blur metric based on the cumulative probability of blur detection (CPBD)", IEEE Trans. Image Processing, Vol. PP, No. 99, pp. 1-7, 2011.
  22. Methodology for the Subjective Assessment of the Quality of Television Pictures 2002, ITU-R Recommendation BT.500-11, 2002.