A Study Analyzing the Characteristics and Strengths of Transformer-Based Models in Object Detection Tasks

A Survey on Vision Transformers for Object Detection Task

  • Received: 2022.10.12
  • Reviewed: 2022.11.21
  • Published: 2022.12.31


Transformers are among the most widely known deep learning models; they have achieved great success in natural language processing and have also shown good performance in computer vision. In this survey, we categorize transformer-based models for computer vision, particularly for object detection tasks, and perform comprehensive comparative experiments to understand the characteristics of each model. We then evaluate the models, subdivided into three groups (standard transformers, transformers with key point attention, and transformers that add attention with coordinates), by comparing their object detection accuracy and real-time performance. For the performance comparison, we use two metrics: frames per second (FPS) and mean average precision (mAP). Finally, through various experiments, we confirm the trends and relationships between object detection accuracy and real-time performance across several transformer models.



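This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) with funding from the Korea government (Ministry of Science and ICT) in 2022 (No.2021-0-00994, Sustainable and Robust Autonomous Driving AI Education/Development Integrated Platform, and No.RS-2022-00167194, Trustworthy AI for Mission Critical Systems).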


