Object-based Compression of Thermal Infrared Images for Machine Vision

  • Received : 2021.09.06
  • Accepted : 2021.11.17
  • Published : 2021.11.30

Abstract

Advances in deep learning have driven remarkable progress in computer vision tasks such as image classification, object detection, object segmentation, and object tracking. Applications that combine deep learning with intelligent surveillance, robots, the Internet of Things, and autonomous vehicles are now being deployed in industry. Accordingly, efficient compression of video data is needed for machine consumption as well as for human consumption. In this paper, we propose an object-based compression method for thermal infrared images for machine vision. To achieve efficient compression while maintaining neural network performance, the input image is divided into an object part and a background part according to the neural network's object detection results and the object sizes, and the two parts are encoded at different compression ratios. Experimental results show that the proposed method achieves better compression efficiency than compressing the whole image with VVC, with a BD-rate of up to -19.83%.
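
The object/background separation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_object_background`, the `(x1, y1, x2, y2)` box format, and the use of a constant fill value for masked-out pixels are all assumptions made for illustration, and the object-size criterion mentioned in the abstract is not modeled. The sketch only shows how detection boxes could yield two images that might then be passed to an encoder at different compression ratios.

```python
import numpy as np


def split_object_background(image, boxes, fill_value=0):
    """Split an image into object and background parts (hypothetical helper).

    image: 2-D (grayscale thermal) or 3-D (H x W x C) NumPy array.
    boxes: iterable of (x1, y1, x2, y2) pixel coordinates, with x2 and y2
           exclusive. This box format is an assumption, not from the paper.
    Returns (object_part, background_part, mask).
    """
    height, width = image.shape[:2]
    mask = np.zeros((height, width), dtype=bool)

    # Mark every pixel covered by at least one detection box.
    for x1, y1, x2, y2 in boxes:
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(width, int(x2)), min(height, int(y2))
        if x2 > x1 and y2 > y1:
            mask[y1:y2, x1:x2] = True

    # Object part: keep pixels inside boxes, fill everything else.
    object_part = np.full_like(image, fill_value)
    object_part[mask] = image[mask]

    # Background part: keep pixels outside boxes, fill the box regions.
    background_part = image.copy()
    background_part[mask] = fill_value

    return object_part, background_part, mask


# Toy 4x4 single-channel image with one assumed detection box.
image = np.arange(16, dtype=np.uint8).reshape(4, 4)
object_part, background_part, mask = split_object_background(image, [(1, 1, 3, 3)])

# The two parts can be recombined into the original using the mask.
restored = np.where(mask, object_part, background_part)
```

In a pipeline of this kind, the object part would typically be encoded at higher quality than the background part. The codec settings are left out here because the abstract does not state them.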

Acknowledgement

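This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (Ministry of Science and ICT) in 2021 (No. 2021-0-00191, Video Coding Technology for Machines).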
