Application of Deep Learning and Optical Character Recognition Technology to Automate Classification and Database of Borehole Log for Ground Stability Investigation of Abandoned Mines

  • Hosang Han (한호상; Department of Energy and Mineral Resources Engineering, Kangwon National University)
  • Jangwon Suh (서장원; Department of Energy Resources and Chemical Engineering, Kangwon National University)
  • Received : 2024.09.06
  • Accepted : 2024.10.03
  • Published : 2024.10.29

Abstract

Boring logs record geomaterial and subsurface structure information and are essential for evaluating ground stability in abandoned mine areas. However, because they come in varied formats and are maintained in analog form, extracting useful information from them is time-consuming, costly, and prone to human error. This study therefore develops an algorithm to efficiently manage and analyze boring log data, provided in PDF format, from ground investigations of abandoned mines. First, the EfficientNet deep learning model was used to classify the boring logs into five types, achieving a classification accuracy of 1.00. Next, optical character recognition (OCR) and PDF text extraction were each applied to extract text from every type of boring log. OCR frequently misrecognized the text, whereas PDF text extraction recovered it with very high accuracy. A database structure was then established, and the extracted text was reorganized according to this schema and written as structured data in spreadsheet form. The results suggest an effective approach to managing boring logs as part of the transition to digital mining, and the structured data derived from legacy boring logs are expected to be readily usable for machine learning analysis.
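The final step described in the abstract, reorganizing extracted text according to a schema and writing it out as spreadsheet-style structured data, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record fields (`borehole_id`, `depth_from_m`, `depth_to_m`, `description`), the line layout of the sample text, and the regular expressions are all hypothetical, since the abstract does not specify the schema or the log layouts. The input is assumed to be text that has already been extracted from a PDF.

```python
import csv
import io
import re
from dataclasses import asdict, dataclass, fields


@dataclass
class LayerRecord:
    """One row of the hypothetical schema: a single depth interval in a borehole."""
    borehole_id: str
    depth_from_m: float
    depth_to_m: float
    description: str


# Assumed layout of a header line, e.g. "Hole No: BH-1" (hypothetical).
ID_PATTERN = re.compile(r"^\s*Hole\s*No\.?\s*:?\s*(\S+)", re.IGNORECASE)
# Assumed layout of a layer line, e.g. "0.0 - 3.5 Fill soil" (hypothetical).
LAYER_PATTERN = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*[-~]\s*(\d+(?:\.\d+)?)\s+(.+?)\s*$"
)


def parse_boring_log(text: str) -> list[LayerRecord]:
    """Turn already-extracted boring log text into schema-conforming records."""
    borehole_id = "UNKNOWN"
    records = []
    for line in text.splitlines():
        id_match = ID_PATTERN.match(line)
        if id_match:
            borehole_id = id_match.group(1)
            continue
        layer_match = LAYER_PATTERN.match(line)
        if layer_match:
            top, bottom, description = layer_match.groups()
            records.append(
                LayerRecord(borehole_id, float(top), float(bottom), description)
            )
    return records


def to_csv(records: list[LayerRecord]) -> str:
    """Serialize records as CSV text, one row per depth interval."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(LayerRecord)],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Made-up sample standing in for text extracted from one boring log.
SAMPLE_TEXT = """Hole No: BH-1
0.0 - 3.5 Fill soil
3.5 - 12.0 Weathered rock
12.0 - 14.2 Cavity (mined-out)
"""

if __name__ == "__main__":
    print(to_csv(parse_boring_log(SAMPLE_TEXT)))
```

Since the study classifies logs into five types before extraction, each type would presumably need its own parsing rules. A single pair of patterns is used here only to keep the sketch short.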

Acknowledgement

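This research was supported by the Energy & Mineral Resources Development Association of Korea (해외자원개발협회) with funding from the Korean government (Ministry of Trade, Industry and Energy) in 2021 (Data Science-Based Oil and Gas Exploration Consortium).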
