Deep Image Retrieval using Attention and Semantic Segmentation Map

Minjung Yoo;Eunhye Jo;Byoungjun Kim;Sunok Kim;

doi:10.5909/JBE.2023.28.2.230

Journal of Broadcast Engineering

Vol. 28, No. 2, pp. 230-237, 2023
pISSN 1226-7953, eISSN 2287-9137

The Korean Institute of Broadcast and Media Engineers

Minjung Yoo (Korea Aerospace University) ;
Eunhye Jo (Korea Aerospace University) ;
Byoungjun Kim (Korea Aerospace University) ;
Sunok Kim (Korea Aerospace University)

Received: 2023.01.20
Reviewed: 2023.03.30
Published: 2023.03.30

https://doi.org/10.5909/JBE.2023.28.2.230

Abstract

Autonomous driving is a core technology of the Fourth Industrial Revolution and can be applied to a wide range of platforms, including cars, drones, and robots. Localization, which determines the position of an object or user using GPS, sensors, and maps, is one of the key technologies for realizing autonomous driving. Localization is possible with sensors such as GPS or LiDAR, but this approach requires mounting expensive and heavy equipment, and precise localization is difficult in places with radio interference, such as underground spaces or tunnels. To address these limitations, this paper proposes an image retrieval method that takes color images acquired with a low-cost vision camera as input and uses an attention network together with semantic segmentation maps.
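The abstract does not describe the network itself, so the following is only a minimal sketch of the general idea of retrieval-based localization, not the authors' implementation. It assumes that local image features are pooled into one global descriptor using attention weights, that a segmentation-derived mask suppresses regions treated as unreliable (for example, moving objects), and that database images are ranked by cosine similarity. The function names `pool_descriptor` and `retrieve`, the mask semantics, and the toy data are all hypothetical.

```python
import math


def pool_descriptor(features, attention, static_mask):
    """Pool local features into one L2-normalized global descriptor.

    features:    list of local feature vectors, one per spatial location
    attention:   list of non-negative attention weights, one per location
    static_mask: list of 0/1 flags; 1 keeps a location, 0 drops it
                 (assumed here to come from a semantic segmentation map)
    """
    dim = len(features[0])
    pooled = [0.0] * dim
    for vec, a, m in zip(features, attention, static_mask):
        w = a * m  # a location contributes only if it is kept by the mask
        for i in range(dim):
            pooled[i] += w * vec[i]
    norm = math.sqrt(sum(x * x for x in pooled))
    return [x / norm for x in pooled] if norm > 0 else pooled


def retrieve(query, database):
    """Return database keys sorted by cosine similarity to the query.

    Descriptors are assumed to be L2-normalized, so the dot product
    equals the cosine similarity.
    """
    def score(key):
        return sum(q * d for q, d in zip(query, database[key]))
    return sorted(database, key=score, reverse=True)


# Toy database: two places, each with two local features of dimension 2.
db = {
    "place_a": pool_descriptor([[1.0, 0.0], [0.0, 1.0]], [0.9, 0.1], [1, 1]),
    "place_b": pool_descriptor([[0.0, 1.0], [1.0, 0.0]], [0.9, 0.1], [1, 1]),
}

# Query: the second location has a large feature but is masked out.
query = pool_descriptor([[1.0, 0.1], [0.0, 5.0]], [0.8, 0.9], [1, 0])
ranking = retrieve(query, db)
print(ranking)
```

In this toy example the masked location would otherwise dominate the query descriptor, which illustrates why a segmentation map could help retrieval; the location of the best-ranked database image would then serve as the position estimate.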

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (NRF-2021R1C1C2005202).
