Aviation Safety Mandatory Report Topic Prediction Model using Latent Dirichlet Allocation (LDA)

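  • 김준환 (Aviation Data Division, Korea Institute of Aviation Safety Technology)
  • 백현진 (Aviation Data Division, Korea Institute of Aviation Safety Technology)
  • 전성진 (Aviation Data Division, Korea Institute of Aviation Safety Technology)
  • 최영재 (Aviation Data Division, Korea Institute of Aviation Safety Technology)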
  • Received : 2023.07.07
  • Accepted : 2023.07.19
  • Published : 2023.09.30

Abstract

Safety data plays a key role in improving safety performance, in the aviation industry as in other industries. By analyzing safety data such as aviation safety reports (text data), hazards can be identified and removed before they lead to a tragic accident. However, the raw (natural language) data collected from each site must first be pre-processed before it can be used in a proactive or predictive safety management system. As air traffic volume increases, the amount of accumulated data is also rising, so there are clear limits to analyzing the data manually. This paper proposes a topic prediction model for aviation safety mandatory reports. The prediction accuracy of the proposed model was verified using actual aviation safety mandatory report data. The model is meaningful in that it not only effectively supports the current analysis of aviation safety mandatory reports but can also be applied to the various data produced in the aviation safety field in the future.

Acknowledgement

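This research was carried out as part of the project "Development of Big Data-Based Aviation Safety Management Technology and Platform Construction" (20BDAS-B158275-01) of the Korea Agency for Infrastructure Technology Advancement, and the authors are grateful for this support.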
