Maritime Safety Tribunal Ruling Analysis using SentenceBERT

SentenceBERT 모델을 활용한 해양안전심판 재결서 분석 방법에 대한 연구 (A Study on a Method for Analyzing Maritime Safety Tribunal Rulings Using the SentenceBERT Model)

  • Bori Yoon / 윤보리 (Department of Industrial Management Big Data Engineering, Dong-eui University)
  • SeKil Park / 박세길 (Maritime Digital Transformation Research Center, Korea Research Institute of Ships & Ocean Engineering)
  • Hyerim Bae / 배혜림 (Industrial Data Science & Engineering Major, Department of Industrial Engineering, Pusan National University, Busan)
  • Sunghyun Sim / 심성현 (Department of Industrial Management Big Data Engineering, Dong-eui University)
  • Received : 2023.10.19
  • Accepted : 2023.12.29
  • Published : 2023.12.31

Abstract

The global surge in maritime traffic has increased the number of ship collisions, causing significant economic, environmental, physical, and human losses. Maritime accidents rarely have a single cause: they typically arise from a combination of crew judgment errors, negligence, complex navigation routes, weather conditions, and technical defects in the vessels. Because each incident carries its own nuances and contextual information, analyzing accident records requires a methodology that can capture the deeper meaning and context of sentences. Accordingly, this study used the SentenceBERT model to analyze maritime safety tribunal decisions from the past 20 years in the jurisdiction of the Busan Regional Maritime Safety Tribunal, which contain data on ship collision incidents. The analysis identified keywords that may point to the main causes of these incidents, and a cluster analysis based on the frequency of specific keywords was performed and visualized. The results are intended to serve as foundational data for identifying accident causes in advance and for developing collision prevention and response strategies.
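To make the clustering step described in the abstract more concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each ruling is reduced to a vector of keyword counts and groups those vectors with a plain k-means loop. The keyword list, the toy ruling texts, and the function names (`keyword_vector`, `cosine_similarity`, `kmeans`) are all hypothetical. In the study itself, sentence representations come from SentenceBERT; the count vectors here are a stand-in so the example runs with the standard library only.

```python
import math
from collections import Counter

# Hypothetical keyword vocabulary and toy ruling summaries (not the paper's data).
KEYWORDS = ["lookout", "fog", "speed", "radar", "anchor"]
RULINGS = [
    "negligent lookout and excessive speed before the collision",
    "lookout neglected while speed was not reduced",
    "dense fog and radar not monitored in fog",
    "fog reduced visibility and radar plotting was omitted",
]

def keyword_vector(text, keywords=KEYWORDS):
    """Count how often each keyword appears in one ruling text."""
    counts = Counter(text.lower().split())
    return [float(counts[k]) for k in keywords]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def kmeans(vectors, k, iterations=20):
    """Lloyd's k-means with Euclidean distance; returns one label per vector.

    Initial centroids are chosen deterministically (first vector, then the
    vector farthest from the centroids chosen so far) so the toy run is
    reproducible. This initialization is an assumption of the sketch.
    """
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

vectors = [keyword_vector(t) for t in RULINGS]
labels = kmeans(vectors, k=2)
print(labels)  # prints [0, 0, 1, 1]
```

In this toy run the two lookout/speed rulings fall into one cluster and the two fog/radar rulings into the other. Swapping the count vectors for sentence embeddings would leave the clustering loop unchanged, since it only needs numeric vectors of equal length.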

Acknowledgement

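This work was supported by a National Research Foundation of Korea grant funded by the Korean government (Ministry of Science and ICT) in 2023 (No. RS-2023-00218913), and by the Korea Research Institute of Ships & Ocean Engineering basic project "Development of Open Platform Technology for Smart Maritime Safety and Enterprise Support," funded by the Ministry of Oceans and Fisheries (1525014880, PES4880).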
