3D Object Detection via Multi-Scale Feature Knowledge Distillation

Se-Gwon Cheon; Hyuk-Jin Shin; Seung-Hwan Bae

doi:10.9708/jksci.2024.29.10.035

Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지)

Volume 29, Issue 10, pp. 35-45, 2024
pISSN 1598-849X, eISSN 2383-9945

Korean Society of Computer Information (한국컴퓨터정보학회)

Se-Gwon Cheon (Vision & Learning Lab, Dept. of Electrical and Computer Engineering, Inha University) ;
Hyuk-Jin Shin (Vision & Learning Lab, Dept. of Electrical and Computer Engineering, Inha University) ;
Seung-Hwan Bae (Vision & Learning Lab, Dept. of Electrical and Computer Engineering, Inha University)

Received : 2024.07.16
Accepted : 2024.10.04
Published : 2024.10.31

https://doi.org/10.9708/jksci.2024.29.10.035

Abstract

In this paper, we propose Multi-Scale Feature Knowledge Distillation for 3D Object Detection (M3KD), which extracts knowledge from the teacher model and transfers it to the student model through multi-scale feature maps. To achieve this, we minimize the L2 loss between the feature maps at each pyramid level of the student model and the corresponding feature maps of the teacher model, so that the student model mimics the teacher model's backbone information, which improves the overall accuracy of the student model. We also apply class logits knowledge distillation, as used in image classification, so that the student model mimics the classification logits of the teacher model, which guides it toward higher detection accuracy. On the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset, our M3KD student model achieves a 30% inference speed improvement over the teacher model. In addition, our method achieves an average improvement of 1.08% in 3D mean Average Precision (mAP) across all classes and difficulty levels compared to the baseline student model. Furthermore, when integrated with recent knowledge distillation methods such as PKD and SemCKD, our approach yields an additional 0.42% and 0.52% improvement in 3D mAP, respectively.

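In this study, we propose M3KD (Multi-Scale Feature Knowledge Distillation for 3D Object Detection), a multi-scale feature knowledge distillation method for 3D object detection aimed at model compression: it extracts 3D object information from the teacher model's output feature maps and distills it to fit the student model's multi-scale feature maps. During distillation, M3KD uses an L2 loss between the multi-scale feature maps of the student and teacher models to reduce the difference in feature map values, so that the student model mimics the teacher model's backbone and its overall accuracy improves. It also applies the class logits knowledge distillation used in image classification tasks, so that the student model mimics the teacher model's class classification logits and its detection accuracy improves. To demonstrate the effectiveness of the proposed M3KD, we conducted experiments on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset, where the trained student model achieved a 30% inference speed improvement over the teacher model. In terms of accuracy, we confirmed an average 3D mAP (Mean Average Precision) improvement of 1.08% over the baseline student model across all classes and all difficulty levels. Moreover, when the proposed method was additionally applied to the recent knowledge distillation methods PKD and SemCKD, it achieved 0.42% and 0.52% higher accuracy (3D mAP), respectively.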
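
The two distillation terms described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. It assumes that each pyramid level is given as a flat list of floats, that student and teacher feature maps already have the same size at every level (a real detector might need an adaptation layer first), and that the logit term takes the commonly used temperature-softened KL form. The function names, the `temperature` value, and the weights `feat_weight` and `logit_weight` are hypothetical and do not come from the paper.

```python
import math


def l2_feature_loss(student_levels, teacher_levels):
    """Sum over pyramid levels of the mean squared difference between
    student and teacher feature maps (each level is a flat list of floats)."""
    if len(student_levels) != len(teacher_levels):
        raise ValueError("student and teacher need the same number of levels")
    total = 0.0
    for s, t in zip(student_levels, teacher_levels):
        if len(s) != len(t):
            raise ValueError("feature maps at a level must have the same size")
        total += sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)
    return total


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def logit_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened class probabilities, scaled by T^2.
    The temperature value is an assumption, not taken from the paper."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_t, p_s) if pt > 0.0)
    return kl * temperature ** 2


def distillation_loss(student_levels, teacher_levels,
                      student_logits, teacher_logits,
                      feat_weight=1.0, logit_weight=1.0, temperature=2.0):
    """Weighted sum of the feature term and the logit term.
    Both weights are hypothetical hyperparameters."""
    feat = l2_feature_loss(student_levels, teacher_levels)
    logit = logit_kd_loss(student_logits, teacher_logits, temperature)
    return feat_weight * feat + logit_weight * logit


# Toy example: two pyramid levels and three classes.
student_levels = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0]]
teacher_levels = [[0.0, 1.0, 2.0, 5.0], [1.0, 3.0]]
feature_term = l2_feature_loss(student_levels, teacher_levels)
logit_term = logit_kd_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

In training, such a term would be added to the student's ordinary detection loss. Averaging within each level before summing keeps high-resolution levels from dominating simply because they contain more elements, which is one possible design choice among several.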

Acknowledgement

This work was supported in part by National Research Foundation of Korea (NRF) grants funded by the Korea government (MSIT) (No. NRF-2022R1C1C1009208) and by the Ministry of Education (No. 2022R1A6A1A03051705), and in part by Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (No. 2022-0-00448/RS-2022-II220448: Deep Total Recall, 30%; No. RS-2022-00155915: Artificial Intelligence Convergence Innovation Human Resources Development (Inha University)).

References

Mao Jiageng, "3D object detection for autonomous driving: A comprehensive survey," International Journal of Computer Vision, Vol. 131, No. 8, pp. 1909-1963, August 2023. DOI: 10.1007/S11263-023-01790-1
He Kaiming, "Deep residual learning for image recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, Las Vegas, NV, USA, June 2016. DOI: 10.1109/CVPR.2016.90
Zhou Shengchao, "UniDistill: A Universal Cross-Modality Knowledge Distillation Framework for 3D Object Detection in Bird's-Eye View," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5116-5125, Vancouver, BC, Canada, June 2023. DOI: 10.1109/CVPR52729.2023.00495
Chong Zhiyu, "Monodistill: Learning spatial features for monocular 3d object detection," arXiv preprint arXiv:2201.10830, 2022.
Zeng Jia, "Distilling Focal Knowledge from Imperfect Expert for 3D Object Detection," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 992-1001, Vancouver, BC, Canada, June 2023. DOI: 10.1109/CVPR52729.2023.00102
Chen Defang, "Knowledge distillation with the reused teacher classifier," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11923-11932, New Orleans, LA, USA, June 2022. DOI: 10.1109/CVPR52688.2022.01163
G. Hinton, O. Vinyals, and J. Dean, "Distilling the Knowledge in a Neural Network," arXiv, March 2015. DOI: 10.48550/arXiv.1503.02531
Andreas Geiger, Philip Lenz, and Raquel Urtasun, "Are we ready for autonomous driving? the KITTI vision benchmark suite," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3354-3361, Providence, RI, USA, June 2012. DOI: 10.1109/CVPR.2012.6248074
Cao Weihan, "Pkd: General distillation framework for object detectors via pearson correlation coefficient," Advances in Neural Information Processing Systems 35, pp. 15394-15406, New Orleans, LA, USA, November 2022.
Wang Can, "SemCKD: Semantic calibration for cross-layer knowledge distillation," IEEE Transactions on Knowledge and Data Engineering, Vol. 35, No. 8, pp. 6305-6319, June 2023. DOI: 10.1109/TKDE.2022.3171571
Seung-Hwan Bae, "Deformable Part Region Learning and Feature Aggregation Tree Representation for Object Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 45, pp. 10817-10834, September 2023. DOI: 10.1109/TPAMI.2023.3268864
Seung-Hwan Bae, "Deformable part Region Learning for object detection," Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36, No. 1, pp. 95-103, 2022. DOI: 10.1609/AAAI.V36I1.19883
Seong-Ho Lee, and Seung-Hwan Bae, "AFI-GAN: Improving feature interpolation of feature pyramid networks via adversarial training for object detection," Pattern Recognition, Vol. 138, pp. 1-14, June 2023. DOI: 10.1016/J.PATCOG.2023.109365
Lang Alex H, "Pointpillars: Fast encoders for object detection from point clouds," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697-12705, Long Beach, CA, USA, June 2019. DOI: 10.1109/CVPR.2019.01298
Brazil, Garrick, and Xiaoming Liu, "M3d-rpn: Monocular 3d region proposal network for object detection," Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9286-9295, Seoul, Korea, October 2019. DOI: 10.1109/ICCV.2019.00938
Shi Xuepeng, "Geometry-based distance decomposition for monocular 3d object detection," Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 15152-15161, Montreal, QC, Canada, October 2021. DOI: 10.1109/ICCV48922.2021.01489
Lin Tsung-Yi, "Feature pyramid networks for object detection," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117-2125, Honolulu, HI, USA, July 2017. DOI: 10.1109/CVPR.2017.106
Ren Shaoqing, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 39, No. 6, pp. 1137-1149, June 2017. DOI: 10.1109/TPAMI.2016.2577031
Xiaozhi Chen, Kaustav Kundu, Ziyu Zhang, Huimin Ma, Sanja Fidler, and Raquel Urtasun, "Monocular 3d object detection for autonomous driving," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2147-2156, Las Vegas, NV, USA, June 2016. DOI: 10.1109/CVPR.2016.236
