Efficient Semantic Segmentation Using Wavelet-transform

  • Taeg-Hyun An (Superintelligence Creative Research Laboratory, Electronics and Telecommunications Research Institute)
  • Jeong Dan Choi (Superintelligence Creative Research Laboratory, Electronics and Telecommunications Research Institute)
  • Received : 2024.09.30
  • Reviewed : 2024.10.17
  • Published : 2024.10.31

Abstract

Semantic segmentation, along with object detection, is widely used by autonomous vehicles to perceive their surrounding environment. Because autonomous driving operates with limited equipment and resources, lightweight and fast networks are preferred. In this paper, we propose an efficient semantic segmentation method based on the wavelet transform. First, the wavelet transform separates the input image into high-frequency and low-frequency components; different feature maps are then extracted from each component, and the distinct information is merged appropriately. When the proposed method was applied to the Cityscapes dataset with a lightweight baseline network suitable for autonomous driving, a 2.2% performance improvement was achieved with only a 0.2% increase in parameters. We expect this algorithm to enable more stable and accurate perception of the surrounding environment.
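
As a rough illustration of the pipeline described in the abstract, the PyTorch sketch below applies a single-level Haar wavelet decomposition to an input image, extracts separate feature maps from the low-frequency and high-frequency sub-bands, and fuses the two streams. It is a minimal sketch, not the authors' actual network: the Haar filters, module names, channel widths, and concatenation-based fusion are illustrative assumptions.

```python
# Minimal sketch of wavelet-based dual-branch feature extraction (illustrative
# only; not the paper's architecture). A single-level 2D Haar transform splits
# the image into one low-frequency (LL) and three high-frequency (LH, HL, HH)
# sub-bands, each group feeds its own small feature extractor, and the two
# feature maps are fused by concatenation and a 1x1 convolution.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HaarWavelet2D(nn.Module):
    """Single-level 2D Haar DWT as a fixed, grouped, stride-2 convolution."""

    def __init__(self, in_channels: int):
        super().__init__()
        ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])     # low-pass (approximation)
        lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])   # horizontal detail
        hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])   # vertical detail
        hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])   # diagonal detail
        kernels = torch.stack([ll, lh, hl, hh])                   # (4, 2, 2)
        weight = kernels.repeat(in_channels, 1, 1).unsqueeze(1)   # (4*C, 1, 2, 2)
        self.register_buffer("weight", weight)
        self.in_channels = in_channels

    def forward(self, x):
        # Grouped convolution applies all four Haar filters to every channel.
        out = F.conv2d(x, self.weight, stride=2, groups=self.in_channels)
        out = out.view(x.size(0), self.in_channels, 4, out.size(2), out.size(3))
        low = out[:, :, 0]                       # LL sub-band,  (B, C, H/2, W/2)
        high = out[:, :, 1:].flatten(1, 2)       # LH/HL/HH,     (B, 3C, H/2, W/2)
        return low, high


class WaveletDualBranch(nn.Module):
    """Separate feature extractors for low/high-frequency sub-bands, then fusion."""

    def __init__(self, in_channels: int = 3, feat_channels: int = 32):
        super().__init__()
        self.dwt = HaarWavelet2D(in_channels)
        self.low_branch = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, 3, padding=1),
            nn.BatchNorm2d(feat_channels), nn.ReLU(inplace=True))
        self.high_branch = nn.Sequential(
            nn.Conv2d(in_channels * 3, feat_channels, 3, padding=1),
            nn.BatchNorm2d(feat_channels), nn.ReLU(inplace=True))
        self.fuse = nn.Conv2d(feat_channels * 2, feat_channels, 1)

    def forward(self, x):
        low, high = self.dwt(x)
        merged = torch.cat([self.low_branch(low), self.high_branch(high)], dim=1)
        return self.fuse(merged)                 # half-resolution fused features


if __name__ == "__main__":
    # Cityscapes-like input (here 512x1024 for brevity).
    features = WaveletDualBranch()(torch.randn(1, 3, 512, 1024))
    print(features.shape)                        # torch.Size([1, 32, 256, 512])
```

In a full segmentation network, such a fused feature map would replace or augment an early encoder stage, with a decoder restoring full resolution for per-pixel class prediction.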

Keywords

Funding

This research was supported by the Ministry of Land, Infrastructure and Transport (MOLIT) and the Korea Agency for Infrastructure Technology Advancement (KAIA) (Project No. RS-2021-KA161756, Project Title: Development of Real-Time Demand-Responsive Autonomous Public Transportation Mobility Service Technology).

References

  1. An, T. H., Kang, J. and Min, K. W.(2023), "Network adaptation for color image semantic segmentation", IET Image Processing, vol. 17, no. 10, pp.2972-2983.
  2. Antonini, M., Barlaud, M., Mathieu, P. and Daubechies, I.(1992), "Image coding using wavelet transform", IEEE Transactions on Image Processing, vol. 1, no. 2, pp.205-220.
  3. Azimi, S. M., Fischer, P., Korner, M. and Reinartz, P.(2018), "Aerial LaneNet: Lane-marking semantic segmentation in aerial imagery using wavelet-enhanced cost-sensitive symmetric fully convolutional neural networks", IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 5, pp.2920-2938.
  4. Badrinarayanan, V., Kendall, A. and Cipolla, R.(2017), "SegNet: A deep convolutional encoder-decoder architecture for image segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, pp.2481-2495.
  5. Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K. and Yuille, A. L.(2017), "DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp.834-848.
  6. Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R. and Schiele, B.(2016), "The Cityscapes dataset for semantic urban scene understanding", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3213-3223.
  7. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T. and Houlsby, N.(2021), "An image is worth 16x16 words: Transformers for image recognition at scale", International Conference on Learning Representations (ICLR).
  8. Hoffman, J., Tzeng, E., Park, T., Zhu, J. Y., Isola, P., Saenko, K. and Darrell, T.(2018), "CyCADA: Cycle-consistent adversarial domain adaptation", Proceedings of the 35th International Conference on Machine Learning (ICML), pp.1989-1998.
  9. Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T. and Adam, H.(2017), "MobileNets: Efficient convolutional neural networks for mobile vision applications", arXiv preprint arXiv:1704.04861.
  10. Kang, J., Han, S. J., Kim, N. and Min, K. W.(2021), "ETLi: Efficiently annotated traffic LiDAR dataset using incremental and suggestive annotation", ETRI Journal, vol. 43, no. 4, pp.630-639.
  11. Kingma, D. P. and Ba, J. L.(2015), "Adam: A method for stochastic optimization", Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, California, pp.1-15.
  12. Krizhevsky, A., Sutskever, I. and Hinton, G. E.(2012), "ImageNet classification with deep convolutional neural networks", Advances in Neural Information Processing Systems, vol. 25.
  13. Liao, Y., Xie, J. and Geiger, A.(2022), "KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2D and 3D", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, pp.3292-3310.
  14. Long, J., Shelhamer, E. and Darrell, T.(2015), "Fully convolutional networks for semantic segmentation", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3431-3440.
  15. Pan, H., Hong, Y., Sun, W. and Jia, Y.(2022), "Deep dual-resolution networks for real-time and accurate semantic segmentation of traffic scenes", IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 3, pp.3448-3460.
  16. Paszke, A., Chaurasia, A., Kim, S. and Culurciello, E.(2016), "ENet: A deep neural network architecture for real-time semantic segmentation", arXiv preprint arXiv:1606.02147.
  17. Romera, E., Alvarez, J. M., Bergasa, L. M. and Arroyo, R.(2018), "ERFNet: Efficient residual factorized convnet for real-time semantic segmentation", IEEE Transactions on Intelligent Transportation Systems, vol. 19, no. 1, pp.263-272.
  18. Zhang, X., Zhou, X., Lin, M. and Sun, J.(2018), "ShuffleNet: An extremely efficient convolutional neural network for mobile devices", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.6848-6856.
  19. Zhao, C., Xia, B., Chen, W., Guo, L., Du, J., Wang, T. and Lei, B.(2021), "Multi-scale wavelet network algorithm for pediatric echocardiographic segmentation via hierarchical feature guided fusion", Applied Soft Computing, vol. 107, 107386.
  20. Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y. and Yu, G.(2021), "Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers", Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.6881-6890.