Acknowledgement
This research was supported by the Institute of Information and Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-00011, Video Coding for Machine) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2021R1F1A1048404).
References
- H. Kwon, S. Cheong, J. Choi, T. Lee, and J. Seo, Standardization trends in video coding for machines, Electron. Telecommun. Trends 35 (2020), 102-111.
- L. Duan, J. Liu, W. Yang, T. Huang, and W. Gao, Video coding for machines: a paradigm of collaborative compression and intelligent analytics, IEEE Trans. Image Process. 29 (2020), 8680-8695. https://doi.org/10.1109/TIP.2020.3016485
- S. Yang, Y. Hu, W. Yang, L. Duan, and J. Liu, Towards coding for human and machine vision: scalable face image coding, IEEE Trans. Multimedia 23 (2021), 2957-2971. https://doi.org/10.1109/TMM.2021.3068580
- Y. Zhang, M. Rafie, S. Liu, and C. Hollmann, BoG report on video coding for machines, M58352, ISO/IEC JTC1/SC29/WG2, 2021.
- Y. Zhang, C. Rosewarne, S. Liu, and C. Hollmann, Call for evidence on video coding for machines, N00215, ISO/IEC JTC1/SC29/WG2, 2022.
- C. Hollmann, S. Liu, M. Rafie, and Y. Zhang, Call for proposals on video coding for machines, N00191, ISO/IEC JTC1/SC29/WG2, 2022.
- Z. Liu, S. Liu, W. Gao, and C. Hollmann, Common test conditions and evaluation methodology for video coding for machines, N00192, ISO/IEC JTC1/SC29/WG2, 2022.
- I. Krasin, T. Duerig, N. Alldrin, V. Ferrari, S. Abu-El-Haija, A. Kuznetsova, H. Rom, J. Uijlings, S. Popov, S. Kamali, M. Malloci, J. Pont-Tuset, A. Veit, S. Belongie, V. Gomes, A. Gupta, C. Sun, G. Chechik, D. Cai, Z. Feng, D. Narayanan, and K. Murphy, OpenImages: a public dataset for large-scale multi-label and multi-class image classification, 2017, Available at: https://storage.googleapis.com/openimages/web/index.html [last accessed September 2022].
- Teledyne FLIR, Free Teledyne FLIR thermal dataset for algorithm training, Available at: https://www.flir.com/oem/adas/adas-dataset-form/ [last accessed September 2022].
- W. Gao, X. Xu, M. Qin, and S. Liu, An open dataset for video coding for machines standardization, (IEEE International Conference on Image Processing-ICIP, Bordeaux, France), 2022, pp. 4008-4012.
- H. Choi, E. Hosseini, S. Ranjbar Alvar, R. Cohen, and I. Bajic, SFU-HW-objects-v1: object labelled dataset on raw video sequences, 2020, Available at: https://doi.org/10.17632/hwm673bv4m.1 [last accessed September 2022].
- M. Rafie, Y. Zhang, and S. Liu, Evaluation framework for video coding for machines, N00104, ISO/IEC JTC1/SC29/WG2, 2021.
- W. Gao, X. Xu, and S. Liu, Investigation of VVC codec for video coding for machine, M56681, ISO/IEC JTC1/SC29/WG2, 2021.
- K. Fischer, C. Herglotz, and A. Kaup, On intra video coding and in-loop filtering for neural object detection networks, (IEEE International Conference on Image Processing, Virtual), 2020, pp. 1147-1151.
- S. Wang, C. Lin, and C. Lin, A study on impact of coding tools on machine vision performance and visual quality, M56867, ISO/IEC JTC1/SC29/WG2, 2021.
- J. Chen, Y. Ye, and S. Kim, Test model 12 for versatile video coding (VTM 12), N00032, ISO/IEC JTC1/SC29/WG5, 2021.
- M. Coban, F. L. Leannec, K. Naser, and J. Strom, Algorithm description of enhanced compression model 4 (ECM 4), N00115, ISO/IEC JTC1/SC29/WG5, 2022.
- ISO/IEC 23090-3:2022, Information technology - coded representation of immersive media - part 3: versatile video coding, 2022.
- B. Bross, Y. Wang, Y. Ye, S. Liu, J. Chen, G. J. Sullivan, and J. R. Ohm, Overview of the versatile video coding (VVC) standard and its applications, IEEE Trans. Circuits Syst. Video Technol. 31 (2021), 3736-3764. https://doi.org/10.1109/TCSVT.2021.3101953
- M. Karczewicz, N. Hu, J. Taquet, C. Chen, K. Misra, K. Andersson, P. Yin, T. Lu, E. Francois, and J. Chen, VVC in-loop filters, IEEE Trans. Circuits Syst. Video Technol. 31 (2021), 3907-3925. https://doi.org/10.1109/TCSVT.2021.3072297
- W. Chien, J. Boyce, Y. Chen, R. Chernyak, K. Choi, R. Hashimoto, Y. Huang, H. Jang, R. Liao, and S. Liu, JVET AHG report: tool reporting procedure (AHG13), JVET-T0013, Joint Video Experts Team (JVET) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29, 2020.
- M. Karczewicz, Y. Ye, L. Zhang, B. Bross, X. Li, K. Naser, and H. Yang, JVET AHG report: enhanced compression beyond VVC capability (AHG12), JVET-Z0012, Joint Video Experts Team (JVET) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29, 2022.
- S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: towards real-time object detection with region proposal networks, IEEE Trans. Pattern Anal. Mach. Intell. 39 (2017), 1137-1149. https://doi.org/10.1109/TPAMI.2016.2577031
- K. He, G. Gkioxari, P. Dollar, and R. Girshick, Mask R-CNN, (IEEE International Conference on Computer Vision, Venice, Italy), 2017, pp. 2961-2969.