Assessing Techniques for Advancing Land Cover Classification Accuracy through CNN and Transformer Model Integration

  • Woo-Dam SIM (Department of Forest Management, Division of Forest Sciences, College of Forest and Environmental Sciences, Kangwon National University) ;
  • Jung-Soo LEE (Department of Forest Management, Division of Forest Sciences, College of Forest and Environmental Sciences, Kangwon National University)
  • Received : 2024.03.08
  • Accepted : 2024.03.15
  • Published : 2024.03.31

Abstract

This research aimed to construct models with various structures based on the Transformer module and to perform land cover classification, thereby examining the applicability of the Transformer module. The U-Net model, which has a CNN structure, was selected as the base model for land cover classification, and a total of four deep learning models were constructed by combining its encoder and decoder parts with the Transformer module. To evaluate generalization performance, training of each deep learning model was repeated 10 times under the same conditions. The evaluation of classification accuracy showed that Model D, which uses the Transformer module in both the encoder and the decoder, achieved the highest accuracy, with an average overall accuracy of approximately 89.4% and an average Kappa coefficient of about 73.2%. In terms of training time, the CNN-based model was the most efficient; however, using Transformer-based models improved classification accuracy by an average of 0.5% in terms of the Kappa coefficient. In future work, the models should be refined by considering variables such as hyperparameter tuning and image patch size when combining the Transformer module with CNN models. A problem common to all models during land cover classification was the difficulty of detecting small-scale objects. To reduce this misclassification, the use of higher-resolution input data should be examined, together with the integration of multidimensional data that includes terrain and texture information.
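The four variants described above differ only in whether each stage of the U-Net-style encoder-decoder is built from convolutional blocks or Transformer blocks. The sketch below is a minimal, hypothetical illustration of that design in PyTorch; the class names, channel sizes, block depths, and the labelling of Models A through C are assumptions made here for illustration, not details taken from the paper (the abstract only confirms that Model D uses the Transformer module in both the encoder and the decoder).

```python
# Minimal sketch (PyTorch assumed; names and sizes are illustrative,
# not the authors' implementation) of mixing CNN and Transformer blocks
# in a U-Net-style encoder-decoder for per-pixel land cover classification.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Two 3x3 convolutions, as in a plain U-Net stage."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class TransformerBlock(nn.Module):
    """Treat each spatial position as a token and apply self-attention."""
    def __init__(self, in_ch, out_ch, heads=4):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, out_ch, 1)          # match channel width
        layer = nn.TransformerEncoderLayer(d_model=out_ch, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):
        x = self.proj(x)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C)
        tokens = self.encoder(tokens)
        return tokens.transpose(1, 2).reshape(b, c, h, w)


class HybridUNet(nn.Module):
    """Encoder and decoder stages can independently be CNN or Transformer,
    giving four combinations. Only Model D (Transformer in both parts) is
    confirmed by the abstract; the other labels here are illustrative."""
    def __init__(self, n_classes, enc="cnn", dec="cnn", chs=(32, 64)):
        super().__init__()
        Enc = ConvBlock if enc == "cnn" else TransformerBlock
        Dec = ConvBlock if dec == "cnn" else TransformerBlock
        self.enc1, self.enc2 = Enc(3, chs[0]), Enc(chs[0], chs[1])
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear",
                              align_corners=False)
        self.dec1 = Dec(chs[1] + chs[0], chs[0])         # skip-connection concat
        self.head = nn.Conv2d(chs[0], n_classes, 1)      # per-pixel class scores

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.head(d1)                             # (B, n_classes, H, W)


# Example: Model D, i.e. Transformer blocks in both encoder and decoder.
model_d = HybridUNet(n_classes=7, enc="transformer", dec="transformer")
logits = model_d(torch.randn(1, 3, 64, 64))
```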

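The reported figures, average overall accuracy and average Kappa coefficient over 10 repeated training runs, are standard per-pixel agreement measures. The short example below, assuming scikit-learn and synthetic labels (both assumptions made here, not the authors' evaluation code), shows how the two metrics are computed from a reference map and a predicted map; averaging such results over repeated runs yields figures comparable to those quoted above.

```python
# Sketch of the two reported metrics, overall accuracy and Cohen's kappa,
# computed from flattened per-pixel labels (synthetic data for illustration).
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 7, size=(256, 256)).ravel()    # reference land cover map
y_pred = y_true.copy()
flip = rng.random(y_true.size) < 0.1                    # simulate ~10% misclassification
y_pred[flip] = rng.integers(0, 7, size=int(flip.sum()))

oa = accuracy_score(y_true, y_pred)                     # overall accuracy
kappa = cohen_kappa_score(y_true, y_pred)               # chance-corrected agreement
print(f"OA = {oa:.3f}, Kappa = {kappa:.3f}")
```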

Acknowledgement

This research was supported by the National Institute of Forest Science project "Development of Agriculture and Forestry Satellite Fusion Products for Forest Resource Evaluation and Monitoring" (Project No.: FM 0103-2021-04-2023).
