Suboptimal video coding for machines method based on selective activation of in-loop filter

  • Ayoung Kim (Division of Software, Yonsei University)
  • Eun-Vin An (Division of Software, Yonsei University)
  • Soon-heung Jung (Media Research Division, Electronics and Telecommunications Research Institute)
  • Hyon-Gon Choo (Media Research Division, Electronics and Telecommunications Research Institute)
  • Jeongil Seo (Department of Computer Engineering, Dong-A University)
  • Kwang-deok Seo (Division of Software, Yonsei University)

  • Received: 2023.03.09
  • Accepted: 2023.11.13
  • Published: 2024.06.20

Abstract

A conventional codec aims to increase compression efficiency for transmission and storage while maintaining video quality. However, as the number of platforms that rely on machine vision grows rapidly, codecs are needed that increase compression efficiency while preserving the accuracy of machine vision tasks. To this end, the Moving Picture Experts Group (MPEG) has begun standardizing video coding for machines (VCM), which aims to reduce bitrates while maintaining machine vision task accuracy. In-loop filters, in particular, have been developed to improve subjective quality and machine vision task accuracy. However, their high computational complexity limits the development of a high-performance VCM architecture. We analyze the effect of in-loop filters on VCM performance and propose a suboptimal VCM method based on the selective activation of in-loop filters. The proposed method reduces the computation time for video coding by approximately 5% with the enhanced compression model (ECM) and 2% with the Versatile Video Coding test model (VTM), while maintaining the machine vision accuracy and compression efficiency of the VCM architecture.
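The abstract does not give the selection rule itself, so the following is only a minimal sketch of what "selective activation" could look like as a decision procedure, not the authors' method. It assumes each in-loop filter configuration has already been measured for task accuracy, bitrate, and encoding time, and picks the fastest configuration whose accuracy and bitrate stay within a tolerance of the all-filters-on baseline. The names `FilterConfig`, `select_config`, and `time_saving_pct`, the tolerance values, and all numbers are hypothetical placeholders, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterConfig:
    """One measured encoder configuration (all fields are hypothetical)."""
    name: str             # label for which in-loop filters are enabled
    map_score: float      # machine vision accuracy, e.g. mAP in percent
    bitrate_kbps: float   # bitrate of the coded stream
    encode_time_s: float  # total coding time


def select_config(baseline, candidates,
                  max_map_drop=0.5, max_bitrate_increase_pct=1.0):
    """Return the fastest candidate that stays within the given tolerances.

    Falls back to the baseline if no candidate is both acceptable and faster.
    The tolerances are assumptions chosen for illustration only.
    """
    best = baseline
    for cand in candidates:
        map_drop = baseline.map_score - cand.map_score
        bitrate_increase_pct = (
            100.0 * (cand.bitrate_kbps - baseline.bitrate_kbps)
            / baseline.bitrate_kbps
        )
        acceptable = (map_drop <= max_map_drop
                      and bitrate_increase_pct <= max_bitrate_increase_pct)
        if acceptable and cand.encode_time_s < best.encode_time_s:
            best = cand
    return best


def time_saving_pct(baseline, chosen):
    """Relative reduction in coding time versus the baseline, in percent."""
    return (100.0 * (baseline.encode_time_s - chosen.encode_time_s)
            / baseline.encode_time_s)


# Made-up measurements, used only to exercise the functions above.
baseline = FilterConfig("all_filters_on", 80.0, 1000.0, 200.0)
candidates = [
    FilterConfig("one_filter_off", 79.8, 1005.0, 180.0),    # within tolerance
    FilterConfig("all_filters_off", 70.0, 1100.0, 150.0),   # accuracy drops too far
]
chosen = select_config(baseline, candidates)
saving = time_saving_pct(baseline, chosen)
```

With these placeholder inputs the second candidate is rejected because both its accuracy drop and its bitrate increase exceed the tolerances, so the faster but lossier configuration is not chosen. This is the "suboptimal" trade-off in miniature: a small, bounded loss is accepted in exchange for less computation.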

Acknowledgement

This research was supported by the Institute of Information and Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-00011, Video Coding for Machine) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2021R1F1A1048404).
