Spatiotemporal Removal of Text in Image Sequences

A Spatiotemporal Method for Removing Text Regions from Video Images

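  • Chang Woo Lee (Department of Computer Engineering, Kyungpook National University) ;
  • Hyun Kang (Department of Computer Engineering, Kyungpook National University) ;
  • Keechul Jung (School of Media, Soongsil University) ;
  • Hang Joon Kim (Department of Computer Engineering, Kyungpook National University)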
  • Published : 2004.03.01

Abstract

Most multimedia data contain text that emphasizes the meaning of the data, provides additional explanation of the situation, or translates another language. However, such text makes the images difficult to reuse and distorts not only the original images but also their meaning. Accordingly, this paper proposes an approach based on support vector machines (SVMs) and spatiotemporal restoration for automatic text detection and removal in video sequences. Given two consecutive frames, text regions in the current frame are first detected by an SVM-based texture classifier. Second, the regions occluded by the detected text are restored in two stages: temporal restoration across consecutive frames and spatial restoration within the current frame. The input video sequence is classified according to text motion and background difference, and a temporal restoration scheme suited to each class is applied. This combination of temporal and spatial restoration shows potential for automatic detection and removal of objects of interest in various kinds of video sequences, and is applicable to tasks such as caption translation and replacement of indirect advertisements in videos.

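Among automated processing techniques for video data, which carry rich visual information, captions are often inserted to reinforce the information presented to viewers and to add supplementary information. Such captions sometimes hinder the reusability of the footage and damage the original image. To improve reusability and restore the original image, this paper proposes a method for automatic text detection and removal in video using support vector machines (SVMs) and spatiotemporal restoration. Taking two or more consecutive frames as input, the method detects text regions in the current frame with an SVM, removes them, and restores the original image they occluded through a two-stage approach: temporal restoration and spatial restoration. The proposed restoration method uses text motion information and the background difference between two frames to classify the video according to its characteristics, and applies the restoration method suited to each class. The proposed method covers a range of applications, including automatic detection and restoration not only of text but also of other objects of interest in various kinds of video.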
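
The two-stage restoration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes grayscale frames, a static background with no motion compensation, and text masks that have already been produced by some detector (the SVM texture classifier is not reproduced here). The function names `temporal_restore` and `spatial_restore` are hypothetical, and the spatial step uses plain neighbour averaging as a stand-in for whatever spatial restoration the paper applies.

```python
import numpy as np

def temporal_restore(cur, prev, cur_mask, prev_mask):
    """Fill text pixels of the current frame from the previous frame.

    Assumption: the background is static, so a pixel hidden by text now
    but visible in the previous frame can simply be copied over.
    Returns the partially restored frame and a mask of pixels that were
    covered by text in both frames and still need filling.
    """
    out = cur.astype(float).copy()
    fillable = cur_mask & ~prev_mask      # hidden now, visible before
    out[fillable] = prev[fillable]
    remaining = cur_mask & prev_mask      # hidden in both frames
    return out, remaining

def spatial_restore(img, mask, max_iters=500):
    """Fill masked pixels from known 4-neighbours, layer by layer.

    Assumption: simple neighbour averaging, used only as a placeholder
    for a proper inpainting scheme.
    """
    out = img.astype(float).copy()
    unknown = mask.copy()
    for _ in range(max_iters):
        if not unknown.any():
            break
        known = ~unknown
        vals = np.pad(np.where(known, out, 0.0), 1)   # zero-padded values
        cnt = np.pad(known.astype(float), 1)          # zero-padded validity
        # Sum and count of known neighbours (up, down, left, right).
        nsum = vals[:-2, 1:-1] + vals[2:, 1:-1] + vals[1:-1, :-2] + vals[1:-1, 2:]
        ncnt = cnt[:-2, 1:-1] + cnt[2:, 1:-1] + cnt[1:-1, :-2] + cnt[1:-1, 2:]
        ready = unknown & (ncnt > 0)      # unknown pixels on the boundary
        if not ready.any():
            break                         # nothing known to grow from
        out[ready] = nsum[ready] / ncnt[ready]
        unknown &= ~ready
    return out
```

A caller would run `temporal_restore` first and pass its `remaining` mask to `spatial_restore`, so that spatial filling is applied only where the previous frame offers no background information.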
