Pilot Experiment for Named Entity Recognition of Construction-related Organizations from Unstructured Text Data

  • Baek, Seungwon (Department of Civil and Environmental Engineering, Yonsei University)
  • Han, Seung H. (Department of Civil and Environmental Engineering, Yonsei University)
  • Jung, Wooyong (Department of Nuclear Power Plant Engineering, KEPCO International Nuclear Graduate School)
  • Kim, Yuri (Department of Nuclear Power Plant Engineering, KEPCO International Nuclear Graduate School)
  • Published : 2022.06.20

Abstract

This study aims to develop a Named Entity Recognition (NER) model that automatically identifies construction-related organizations in news articles. News articles were collected using a web crawling technique, and construction-related organizations were labeled in a total of 1,000 articles. A Bidirectional Encoder Representations from Transformers (BERT) model was used to recognize clients, constructors, consultants, engineers, and others. In this pilot experiment, the best average F1 score for NER was 0.692. The results are expected to support the establishment of international business strategies by enabling timely information to be collected and analyzed automatically.
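
To make the reported metric concrete, the following is a minimal sketch of how an average F1 score over organization classes could be computed. The abstract does not state the tagging scheme or the averaging method, so the sketch assumes BIO tags, exact-match entity spans, and an unweighted (macro) average across classes. The label names (CLIENT, CONSTRUCTOR, CONSULTANT) and the function names are hypothetical and serve only as illustration; this is not the authors' evaluation code.

```python
from collections import defaultdict


def extract_spans(tags):
    """Return a set of (label, start, end) spans from a BIO tag sequence."""
    spans = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        # Close the open span unless this tag continues it.
        if label is not None and tag != "I-" + label:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def per_class_f1(gold_tags, pred_tags):
    """Entity-level F1 per class; a prediction counts only on an exact span match."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for span in pred:
        counts[span[0]]["tp" if span in gold else "fp"] += 1
    for span in gold - pred:
        counts[span[0]]["fn"] += 1
    scores = {}
    for label, c in counts.items():
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        scores[label] = 2 * c["tp"] / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores (assumed averaging method)."""
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical six-token sentence with gold and predicted tags.
gold = ["B-CLIENT", "I-CLIENT", "O", "B-CONSTRUCTOR", "O", "B-CONSULTANT"]
pred = ["B-CLIENT", "I-CLIENT", "O", "B-CONSTRUCTOR", "I-CONSTRUCTOR", "O"]

scores = per_class_f1(gold, pred)
print(scores, macro_f1(scores))
```

In this toy case the client span matches exactly, the constructor span is predicted with a wrong boundary, and the consultant is missed. Under exact-match scoring only the client class earns credit, which illustrates why boundary errors on long organization names can pull the average F1 down.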

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2020R1A2C1012739 and NRF-2022R1A2C1012018).