Acknowledgement
This work was supported by Dongseo University, "Dongseo Cluster Project" Research Fund of 2023 (DSU-20230004).
References
- C. Wang, A. Bochkovskiy, and H. M. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," Jul. 2022. DOI: https://doi.org/10.48550/arXiv.2207.02696
- J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. DOI: https://doi.org/10.48550/arXiv.1506.02640
- J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," Apr. 2018. DOI: https://doi.org/10.48550/arXiv.1804.02767
- A. Bochkovskiy, C. Wang, and H. M. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," Apr. 2020. DOI: https://doi.org/10.48550/arXiv.2004.10934
- N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, "End-to-end object detection with transformers," in European Conference on Computer Vision (ECCV), Springer, 2020, pp. 213-229. DOI: https://doi.org/10.48550/arXiv.2005.12872
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," in Advances in Neural Information Processing Systems (NeurIPS), 2017. DOI: https://doi.org/10.48550/arXiv.1706.03762
- A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., "An image is worth 16x16 words: Transformers for image recognition at scale," Oct. 2020. DOI: https://doi.org/10.48550/arXiv.2010.11929
- Z. Chen, Y. Duan, W. Wang, J. He, T. Lu, J. Dai, and Y. Qiao, "Vision Transformer Adapter for Dense Predictions," in Proceedings of the 11th International Conference on Learning Representations (ICLR), Feb. 2023. DOI: https://doi.org/10.48550/arXiv.2205.08534
- H. Q. Nguyen et al., "VinDr-CXR: An Open Dataset of Chest X-rays with Radiologist's Annotations," Jan. 2022. DOI: https://doi.org/10.48550/arXiv.2012.15029
- J. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. DOI: https://doi.org/10.48550/arXiv.1703.10593
- R. Zhang, P. Isola, and A. A. Efros, "Colorful image colorization," in European Conference on Computer Vision (ECCV), 2016. DOI: https://doi.org/10.48550/arXiv.1603.08511
- L. A. Gatys, A. S. Ecker, and M. Bethge, "Image style transfer using convolutional neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- M. Tan and Q. V. Le, "EfficientDet: Scalable and efficient object detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. DOI: https://doi.org/10.48550/arXiv.1911.09070
- J. Hosang, R. Benenson, P. Dollar, and B. Schiele, "Learning non-maximum suppression," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. DOI: https://doi.org/10.48550/arXiv.1705.02950
- D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," in Proceedings of the 3rd International Conference on Learning Representations (ICLR), 2015. DOI: https://doi.org/10.48550/arXiv.1409.0473