Text mining-based Data Preprocessing and Accident Type Analysis for Construction Accident Analysis

건설사고 분석을 위한 텍스트 마이닝 기반 데이터 전처리 및 사고유형 분석

  • Yoon, Young Geun (Department of Safety Engineering, Incheon National University)
  • Lee, Jae Yun (Department of Safety Engineering, Incheon National University)
  • Oh, Tae Keun (Department of Safety Engineering, Incheon National University)
  • Received: 2022.01.13
  • Accepted: 2022.02.19
  • Published: 2022.04.30

Abstract

Construction accidents are difficult to prevent because several different types of activities occur simultaneously. Current accident analysis methods report only the number of occurrences for one or two variables, and safety measures that focus solely on individual variables have not reduced accidents. Even when accident data are analyzed to establish appropriate safety measures, significant results are difficult to derive because of the large number of data variables, elements, and qualitative records. To simplify the analysis and approach this complex problem logically, this study used data preprocessing techniques, such as latent class cluster analysis (LCCA) and predictor importance, to identify the most influential variables. The correlation was then analyzed using an alluvial flow diagram consisting of seven variables and fourteen elements based on the accident data. The alluvial diagram analysis, using the reduced variables and elements, made it possible to classify accident trends into four categories. The findings of this study demonstrate that complex and diverse construction accident data can yield relevant analysis results that assist in the prevention of accidents.
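As a rough illustration of the alluvial flow step described above, the following minimal sketch counts how many records pass between each pair of adjacent categorical variables, which is the quantity an alluvial diagram draws as band widths. The records, variable names, and the helper `alluvial_links` are hypothetical and chosen only for illustration; they are not the study's data, variables, or implementation.

```python
from collections import Counter

# Hypothetical toy records; variable names and values are illustrative
# assumptions, not data from the study.
RECORDS = [
    {"work_type": "formwork", "accident_type": "fall", "injury": "fracture"},
    {"work_type": "formwork", "accident_type": "fall", "injury": "fracture"},
    {"work_type": "formwork", "accident_type": "struck_by", "injury": "bruise"},
    {"work_type": "rebar", "accident_type": "fall", "injury": "bruise"},
]

# Order in which the variables would appear as axes of the diagram (assumed).
AXES = ["work_type", "accident_type", "injury"]


def alluvial_links(records, axes):
    """Count the records flowing between each pair of adjacent axes.

    Keys are (left_axis, left_value, right_axis, right_value) tuples and
    values are the number of records sharing that pair of elements.
    """
    links = Counter()
    for rec in records:
        for left, right in zip(axes, axes[1:]):
            links[(left, rec[left], right, rec[right])] += 1
    return links


LINKS = alluvial_links(RECORDS, AXES)

# List the flows, widest band first.
for (left, lval, right, rval), n in LINKS.most_common():
    print(f"{left}={lval} -> {right}={rval}: {n}")
```

The widest flows in such a table point to the combinations of elements that co-occur most often, which is the kind of trend an alluvial diagram makes visible once the variables and elements have been reduced.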

Acknowledgement

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2021R1I1A205091211).
