Investigating the Effects of Training Image Dataset's Size and Specificity on Visual Scene Understanding AI in Construction

  • Jinwoo Kim (Department of Civil and Environmental Engineering, Hanyang University) ;
  • Seokho Chi (Department of Civil and Environmental Engineering, Seoul National University)
  • Received : 2024.08.09
  • Accepted : 2024.10.22
  • Published : 2024.12.30

Abstract

Visual scene understanding AI, a pivotal factor for digital transformation and robotic automation in construction, has primarily been researched under the hypothesis that the more training images, the higher the model performance. Alternatively, one can hypothesize that prioritizing activity-specific training images tailored to each construction phase is more critical than merely enlarging the dataset. This consideration is particularly important in dynamic construction environments, where visual characteristics change significantly across construction phases, from earthmoving, foundation, and superstructure to finishing activities. Against this background, we investigate the effects of a training image dataset's size and specificity on visual scene understanding AI in construction. We build an all-in-one, universal training image dataset as well as activity-specific datasets, varying the number of training images. We then train vision-based worker detection models on each dataset and assess their performance in activity-specific, dynamic test environments. We analyze the optimal performance achieved in each test environment and how model performance varies with dataset size over the entire test phase. Our findings will help scientifically validate these two hypotheses and lay a solid foundation for building and updating a training image dataset when developing a visual scene understanding AI model for dynamic construction sites.
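The core of the experimental design above is sampling two training sets of matched size from the same image pool: a universal set drawn evenly across all construction phases and an activity-specific set drawn from a single target phase. A minimal sketch of that sampling step is shown below; the function name, data layout, and phase labels are illustrative assumptions, not the authors' actual pipeline, and the downstream detector training and evaluation are omitted.

```python
import random
from collections import defaultdict

# Construction phases assumed from the abstract's description.
PHASES = ["earthmoving", "foundation", "superstructure", "finishing"]

def build_training_sets(images, target_phase, n, seed=0):
    """Sample two training sets of (up to) n images from the same pool.

    images: list of (image_id, phase) pairs labeled by construction phase.
    Returns (universal, specific):
      universal -- n images drawn evenly across all phases
                   (n is assumed divisible by the number of phases);
      specific  -- n images drawn only from target_phase.
    """
    rng = random.Random(seed)
    by_phase = defaultdict(list)
    for img_id, phase in images:
        by_phase[phase].append(img_id)

    # Universal set: an equal share of images from every phase.
    per_phase = n // len(PHASES)
    universal = []
    for phase in PHASES:
        universal += rng.sample(by_phase[phase], per_phase)

    # Activity-specific set: images from the target phase only.
    specific = rng.sample(by_phase[target_phase], n)
    return universal, specific
```

Sweeping `n` over a range of dataset sizes and training one detector per (size, specificity) pair would then reproduce the size-versus-specificity comparison the study describes.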

Funding

This work was supported by the research fund of Hanyang University (HY-202400000001265).

References

  1. 권영주·문성호 (2023), "Development of a Deep Learning Model for Road Crack Detection Based on Drone-Captured Image Data", LHI Journal, 14(2): 125~135.
  2. 염준호 (2023), "Automatic Extraction of Vehicles within Bounding Boxes in Drone Images Using an Unsupervised SVM Classification Technique", LHI Journal, 14(4): 95~102.
  3. Assadzadeh, A., M. Arashpour, I. Brilakis, T. Ngo and E. Konstantinou (2022), "Vision-based Excavator Pose Estimation Using Synthetically Generated Datasets with Domain Randomization", Automation in Construction, 134: 104089.
  4. Braun, A. and A. Borrmann (2019), "Combining Inverse Photogrammetry and BIM for Automated Labeling of Construction Site Images for Machine Learning", Automation in Construction, 106: 102879.
  5. Ding, Y. and X. Luo (2024), "A Virtual Construction Vehicles and Workers Dataset with Three-Dimensional Annotations", Engineering Applications of Artificial Intelligence, 133: 107964.
  6. Ding, Y., M. Liu and X. Luo (2022), "Safety Compliance Checking of Construction Behaviors Using Visual Question Answering", Automation in Construction, 144: 104580.
  7. Duan, R., H. Deng, M. Tian, Y. Deng and J. Lin (2022), "SODA: A Large-scale Open Site Object Detection Dataset for Deep Learning in Construction", Automation in Construction, 142: 104499.
  8. Kim, J., and S. Chi (2019), "Action Recognition of Earthmoving Excavators Based on Sequential Pattern Analysis of Visual Features and Operation Cycles", Automation in Construction, 104: 255~264.
  9. Kim, J., and S. Chi (2022), "Graph Neural Network-Based Propagation Effects Modeling for Detecting Visual Relationships among Construction Resources", Automation in Construction, 141: 104443.
  10. Kim, J., D. Kim, S. Lee and S. Chi (2023), "Hybrid DNN Training Using both Synthetic and Real Construction Images to Overcome Training Data Shortage", Automation in Construction, 149: 104771.
  11. Kim, J., J. Hwang, I. Jeong, S. Chi, J. O. Seo and J. Kim (2024), "Generalized Vision-Based Framework for Construction Productivity Analysis Using a Standard Classification System", Automation in Construction, 165: 105504.
  12. Kim, J., S. Chi and J. Seo (2018), "Interaction Analysis for Vision-Based Activity Identification of Earthmoving Excavators And Dump Trucks", Automation in Construction, 87: 297~308.
  13. Lee, J. G., J. Hwang, S. Chi and J. Seo (2022), "Synthetic Image Dataset Development for Vision-Based Construction Equipment Detection", Journal of Computing in Civil Engineering, 36(5): 04022020.
  14. Luo, X., H. Li, Y. Yu, C. Zhou and D. Cao (2020), "Combining Deep Features and Activity Context to Improve Recognition of Activities of Workers in Groups", Computer-Aided Civil and Infrastructure Engineering, 35(9): 965~978.
  15. Mahmood, B., S. Han and J. Seo (2022), "Implementation Experiments on Convolutional Neural Network Training Using Synthetic Images for 3D Pose Estimation of an Excavator on Real Images", Automation in Construction, 133: 103996.
  16. Park, M., D. Q. Tran, J. Bak and S. Park (2023), "Small and Overlapping Worker Detection at Construction Sites", Automation in Construction, 151: 104856.
  17. Roberts, D., W. T. Calderon, S. Tang and M. Golparvar-Fard (2020), "Vision-Based Construction Worker Activity Analysis Informed by Body Posture", Journal of Computing in Civil Engineering, 34(4): 04020017.
  18. Torres Calderon, W., D. Roberts and M. Golparvar-Fard (2021), "Synthesizing Pose Sequences from 3D Assets for Vision-Based Activity Analysis", Journal of Computing in Civil Engineering, 35(1): 04020052.
  19. Wang, M., G. Yao, Y. Yang, Y. Sun, M. Yan and R. Deng (2023), "Deep Learning-Based Object Detection for Visible Dust and Prevention Measures on Construction Sites", Developments in the Built Environment, 16: 100245.
  20. Wang, Q., H. Liu, W. Peng, C. Tian and C. Li (2024), "A Vision-Based Approach for Detecting Occluded Objects in Construction Sites", Neural Computing and Applications, 36(18): 10825~10837.
  21. Xiao, B. and S.-C. Kang (2021), "Development of an Image Data Set of Construction Machines for Deep Learning Object Detection", Journal of Computing in Civil Engineering, 35(2): 05020005.
  22. Xiong, R. and P. Tang (2021), "Machine Learning Using Synthetic Images for Detecting Dust Emissions on Construction Sites", Smart and Sustainable Built Environment, 10(3): 487~503.
  23. An, X., L. Zhou, Z. Liu, C. Wang, P. Li and Z. Li (2021), "Dataset and Benchmark for Detecting Moving Objects in Construction Sites", Automation in Construction, 122: 103482.
  24. Yan, X., H. Zhang and H. Li (2020), "Computer Vision-Based Recognition of 3D Spatial Relationship between Moving Objects for Monitoring Struck-By Accidents", Computer-Aided Civil and Infrastructure Engineering, 35(9): 1023~1038.
  25. Yan, X., H. Zhang, Y. Wu, C. Lin and S. Liu (2023), "Construction Instance Segmentation (CIS) Dataset for Deep Learning-Based Computer Vision", Automation in Construction, 156: 105083.
  26. Yang, M., C. Wu, Y. Guo, R. Jiang, F. Zhou, J. Zhang and Z. Yang (2023), "Transformer-Based Deep Learning Model and Video Dataset for Unsafe Action Identification in Construction Projects", Automation in Construction, 146: 104703.
  27. Ding, Y. and X. Luo (2023.5.29), "Monocular 2D Camera-based Proximity Monitoring for Human-Machine Collision Warning on Construction Sites", arXiv, https://arxiv.org/abs/2305.17931.