Acknowledgement
This work was supported by Seokyeong University in 2022 and 2023.
References
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, "Attention Is All You Need," 31st Conference on Neural Information Processing Systems (NIPS 2017), 2017. DOI: 10.48550/arXiv.1706.03762
- Chih-Yang Lin, Yi-Cheng Chiu, Hui-Fuang Ng, Timothy K. Shih, Kuan-Hung Lin, "Global-and-Local Context Network for Semantic Segmentation of Street View Images," Sensors, Vol.20, No.10, 2020. DOI: 10.3390/s20102907
- Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo, "Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows," Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.10012-10022, 2021.
- Jinpeng Li, Yichao Yan, Shengcai Liao, Xiaokang Yang, Ling Shao, "Local-to-Global Self-Attention in Vision Transformers," 2021. DOI: 10.48550/arXiv.2107.04735
- Nikolas Ebert, Didier Stricker, Oliver Wasenmuller, "PLG-ViT: Vision Transformer with Parallel Local and Global Self-Attention," Sensors, Vol.23, No.7, 2023. DOI: 10.3390/s23073447
- B. Yang, J. Li, D. F. Wong, L. S. Chao, X. Wang, Z. Tu, "Context-Aware Self-Attention Networks," Proceedings of the AAAI Conference on Artificial Intelligence, 2019. DOI: 10.48550/arXiv.1902.05766
- Ali Hassani, Steven Walton, Jiachen Li, Shen Li, Humphrey Shi, "Neighborhood Attention Transformer," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.6185-6194, 2023. DOI: 10.48550/arXiv.2204.07143
- Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby, "An Image Is Worth 16×16 Words: Transformers for Image Recognition at Scale," ICLR, 2021. DOI: 10.48550/arXiv.2010.11929