Building Detection by Convolutional Neural Network with Infrared Image, LiDAR Data and Characteristic Information Fusion

  • Cho, Eun Ji (Dept. of Geoinformation Engineering, Sejong University)
  • Lee, Dong-Cheon (Dept. of Environment, Energy & Geoinformatics, Sejong University)
  • Received : 2020.11.19
  • Accepted : 2020.12.11
  • Published : 2020.12.31

Abstract

Object recognition, detection, and instance segmentation based on DL (Deep Learning) have been used in various applications, and mainly optical images are used as training data for DL models. The major objective of this paper is object segmentation and building detection by utilizing multimodal datasets, in addition to optical images, to train the Detectron2 model, one of the improved R-CNN (Region-based Convolutional Neural Network) models. For the implementation, infrared aerial images and LiDAR (Light Detection And Ranging) data were used, and additional features were generated: edges extracted from the images, and Haralick features, which represent statistical texture information, derived from the LiDAR data. The performance of DL models depends not only on the amount and characteristics of the training data but also on the fusion method, especially for multimodal data. Segmenting objects and detecting buildings by applying hybrid fusion, a mixed method of early fusion and late fusion, improved the building detection rate by 32.65% compared to training with optical images only. The experiments demonstrated the complementary effect of training on multimodal data with distinct characteristics and of the fusion strategy.
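
As an illustration of the feature-generation step described above, the following is a minimal sketch (not the authors' code) of deriving an edge channel from the infrared image with the Canny (1986) detector and per-pixel Haralick (GLCM) texture measures from a LiDAR-derived raster. The file names, window size, thresholds, and quantization level are assumptions; the GLCM functions use the scikit-image (0.19+) spelling.

```python
import cv2
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Hypothetical input rasters: an infrared ortho band and a LiDAR-derived
# surface raster, both read as 8-bit grayscale.
ir = cv2.imread("ir_ortho.tif", cv2.IMREAD_GRAYSCALE)
dsm = cv2.imread("lidar_dsm.tif", cv2.IMREAD_GRAYSCALE)

# Edge channel: Canny (1986) detector; thresholds are illustrative.
edges = cv2.Canny(ir, 50, 150)

def haralick_maps(img, win=15, levels=32):
    """Per-pixel GLCM contrast and homogeneity (Haralick et al., 1973).

    Brute-force sliding window for clarity; far too slow for production.
    """
    # Quantize to `levels` gray values to keep the co-occurrence matrix small.
    q = (img.astype(np.float32) / 256.0 * levels).astype(np.uint8)
    h, w = q.shape
    contrast = np.zeros((h, w), np.float32)
    homogeneity = np.zeros((h, w), np.float32)
    r = win // 2
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = q[y - r:y + r + 1, x - r:x + r + 1]
            glcm = graycomatrix(patch, distances=[1],
                                angles=[0, np.pi / 2],
                                levels=levels, symmetric=True, normed=True)
            contrast[y, x] = graycoprops(glcm, "contrast").mean()
            homogeneity[y, x] = graycoprops(glcm, "homogeneity").mean()
    return contrast, homogeneity

contrast, homogeneity = haralick_maps(dsm)

# Early fusion: stack raw and derived channels into one multi-band input.
fused = np.dstack([ir, edges, contrast, homogeneity]).astype(np.float32)
```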
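
The paper trains Detectron2 on such data; a hypothetical fine-tuning setup is sketched below. The chosen config file, dataset names, annotation paths, and hyperparameters are illustrative rather than the paper's. Note that Detectron2's standard loader expects 3-channel images, so the fused multi-band input above would additionally require a custom dataset mapper, omitted here.

```python
import os
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer

# COCO-format building annotations assumed; paths are placeholders.
register_coco_instances("buildings_train", {}, "train.json", "train_images/")
register_coco_instances("buildings_val", {}, "val.json", "val_images/")

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("buildings_train",)
cfg.DATASETS.TEST = ("buildings_val",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1   # single class: building
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 3000

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```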
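
Hybrid fusion mixes early fusion (channel stacking, as in the first sketch) with late fusion (combining the outputs of separately trained models). One simple late-fusion scheme, shown below purely as an assumed illustration of how such a merge could work, pools the detections of two models and removes duplicates with cross-model non-maximum suppression.

```python
import torch
from torchvision.ops import nms

def late_fuse(boxes_a, scores_a, boxes_b, scores_b, iou_thr=0.5):
    """Merge detections from two models via cross-model NMS.

    boxes_* are (N, 4) xyxy tensors; scores_* are (N,) confidence tensors.
    """
    boxes = torch.cat([boxes_a, boxes_b])
    scores = torch.cat([scores_a, scores_b])
    keep = nms(boxes, scores, iou_thr)  # drop lower-scoring duplicates
    return boxes[keep], scores[keep]
```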

References

  1. Audebert, N., Le Saux, B., and Lefevre, S. (2018), Beyond RGB: Very high resolution urban remote sensing with multimodal deep networks, ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 140, pp. 20-32. https://doi.org/10.1016/j.isprsjprs.2017.11.011
  2. Canny, J. (1986), A computational approach to edge detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-8, No. 6, pp. 679-698. https://doi.org/10.1109/TPAMI.1986.4767851
  3. Cramer, M. (2010), The DGPF test on digital aerial camera evaluation - Overview and test design, Photogrammetrie, Fernerkundung, Geoinformation, Vol. 2, pp. 73-82. https://doi.org/10.1127/1432-8364/2010/0041
  4. Girshick, R. (2015), Fast R-CNN, Proceedings of IEEE International Conference on Computer Vision (ICCV) 2015, 13-16 December, Santiago, Chile, pp. 1440-1448.
  5. Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2016), Region-based convolutional networks for accurate object detection and segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 38, No. 1, pp. 1-16.
  6. Haralick, R., Shanmugam, K., and Dinstein, I. (1973), Textural features for image classification, IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-3, No. 6, pp. 610-621. https://doi.org/10.1109/TSMC.1973.4309314
  7. He, K., Gkioxari, G., Dollar, P., and Girshick, R. (2017), Mask R-CNN, Proceedings of IEEE International Conference on Computer Vision (ICCV) 2017, 22-29 October, Venice, Italy, pp. 2980-2988.
  8. He, K., Zhang, X., Ren, S., and Sun, J. (2016), Deep residual learning for image recognition, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016, pp. 770-778.
  9. Kim, H., Lee, J., Bae, K., and Eo, Y. (2018), Application research on obstruction area detection of building wall using R-CNN technique, Journal of Cadastre & Land InformatiX, Vol. 48, No. 2, pp. 213-225. (in Korean with English abstract) https://doi.org/10.22640/LXSIRI.2018.48.2.213
  10. Lee, D., Cho, E., and Lee, D.C. (2018), Evaluation of building detection from aerial images using region-based convolutional neural network for deep learning, Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, Vol. 36, No. 6, pp. 469-481. (in Korean with English abstract) https://doi.org/10.7848/KSGPC.2018.36.6.469
  11. Lee, D., Cho, E., and Lee, D.C. (2019), Semantic classification of DSM using convolutional neural network, Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, Vol. 37, No. 6, pp. 435-444. (in Korean with English abstract) https://doi.org/10.7848/KSGPC.2019.37.6.435
  12. Lee, D. and Lee, D.C. (2016), Key point extraction from LiDAR data for 3D modeling, Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, Vol. 34, No. 5, pp. 479-493. (in Korean with English abstract) https://doi.org/10.7848/ksgpc.2016.34.5.479
  13. Ren, S., He, K., Girshick, R., and Sun, J. (2017), Faster R-CNN: Towards real-time object detection with region proposal networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 39, No. 6, pp. 1137-1149. https://doi.org/10.1109/TPAMI.2016.2577031
  14. Wu, Y., Kirillov, A., Massa, F., Lo, W.Y., and Girshick, R. (2019), Detectron2, GitHub, https://github.com/facebookresearch/detectron2, (last date accessed: 1 December 2020).
  15. Yoo, E. and Lee, D.C. (2016), Determination of physical footprints of buildings with consideration of terrain surface using LiDAR data, Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, Vol. 34, No. 5, pp. 503-514. (in Korean with English abstract) https://doi.org/10.7848/ksgpc.2016.34.5.503