A Study on the Establishment of Comparison System between the Statement of Military Reports and Related Laws

A Study on Building a Comparison System between Sentences Appearing in Military Reports and Related Laws

  • Jung, Jiin (Department of Industrial Engineering, Yonsei University) ;
  • Kim, Mintae (Department of Industrial Engineering, Yonsei University) ;
  • Kim, Wooju (Department of Industrial Engineering, Yonsei University)
  • Received : 2020.08.11
  • Accepted : 2020.09.20
  • Published : 2020.09.30

Abstract

The Ministry of National Defense pursues the Defense Acquisition Program to build strong defense capabilities and spends more than 10 trillion won annually on defense improvement. Because the Defense Acquisition Program bears directly on national security and on the lives and property of the people, it must be carried out transparently and efficiently by experts. However, the excessive diversification of the laws and regulations governing the program has made it difficult for many working-level officials to carry it out smoothly. Many officials reportedly discover relevant regulations they were unaware of only once their work is already under way. In addition, statutory statements related to the Defense Acquisition Program can cause serious problems if even a single expression in a sentence is wrong. Despite this, little effort has gone into building a sentence comparison system that could correct such errors in real time. This paper therefore proposes an implementation plan for a "Comparison System between the Statement of Military Reports and Related Laws." The system uses a Siamese Network-based artificial neural network, a model from the field of natural language processing (NLP), to measure the similarity between sentences likely to appear in Defense Acquisition Program documents and the related statutory provisions, to determine and classify the risk of illegality, and to inform users of the result. Several artificial neural network models (Bi-LSTM, Self-Attention, D_Bi-LSTM) were trained on 3,442 pairs of an "Original Sentence" (a sentence as written in the actual statute) and an "Edited Sentence" (a modified sentence derived from an "Original Sentence").
Among the many statutes related to the Defense Acquisition Program, the Defense Acquisition Program Act, the Enforcement Rule of the Defense Acquisition Program Act, and the Enforcement Decree of the Defense Acquisition Program Act were selected. The "Original Sentence" set consists of 83 provisions that actually appear in these statutes, chosen as the main clauses that working-level officials encounter most often in their work. For each of these clauses, the "Edited Sentence" set contains 30 to 50 similar sentences representing modified forms likely to appear in a military report. The edited sentences were created by modifying the original sentences according to 12 rules, and edited sentences were produced in proportion to the number of rules applied. In 1:1 sentence similarity experiments, the models classified each "Edited Sentence" as legal or illegal relative to its "Original Sentence" with considerable accuracy. However, because the "Edited Sentence" data used for training consist of statutory sentences modified according to the 12 rules, it cannot be assumed that models trained only on the "Original Sentence" and "Edited Sentence" data will effectively classify sentences that appear in actual military reports; the dataset may not be broad enough for the models to handle new, unseen sentences. To address this limitation, the models were re-evaluated on an additional 120 newly written sentences that more closely resemble those in actual military reports while remaining related to the original sentences. The models achieved performance above a certain level even though they had been trained only on the "Original Sentence" and "Edited Sentence" data.
If the training data are improved and expanded with sentences that appear in actual reports so that the models can learn sufficiently, the models should classify sentences from military reports as legal or illegal more reliably. Based on the experimental results, this study confirms the feasibility and value of building a "Real-Time Automated Comparison System Between Military Documents and Related Laws." The approach can identify which clause, among the many in the related laws, is most similar to a sentence appearing in a Defense Acquisition Program-related military report. This in turn helps determine whether the content of the report sentence carries a risk of illegality when compared with that clause.

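In the military, defense capability improvement projects (hereafter, the Defense Acquisition Program) must be carried out transparently and efficiently, yet the excessive diversification of the related laws and regulations makes it difficult for many working-level officials to carry out the program smoothly. Moreover, the statutory sentences that these officials handle in various documents can cause serious problems if even a single expression is wrong, but little effort has gone into building a sentence comparison system to correct such errors in real time. This paper therefore proposes a plan for building a "Comparison System between the Statement of Military Reports and Related Laws" that uses a Siamese Network-based artificial neural network model from the field of natural language processing (NLP) to compare the similarity between sentences likely to appear in Defense Acquisition Program documents and the related statutory provisions, to determine and classify the risk of illegality, and to inform the user of the result. Using a self-built dataset of 3,442 pairs of an "Original Sentence" (a sentence that appears in the actual statutes) and an "Edited Sentence" (a modified sentence derived from an "Original Sentence"), we trained several artificial neural network models (Bi-LSTM, Self-Attention, D_Bi-LSTM). In 1:1 sentence similarity experiments, the models classified the risk of illegality of each "Edited Sentence" relative to its "Original Sentence" with considerably high accuracy. Because the "Edited Sentence" data used for training are statutory sentences modified according to fixed rules, it is difficult to conclude that models trained only on these data effectively classify sentences that appear in actual military reports. To compensate for this limitation, we wrote an additional 120 new sentences that are closer in form to those in actual military reports and related to the original sentences, and evaluated the models on them; even the models trained only on the "Original Sentence" and "Edited Sentence" data achieved performance above a certain level. In conclusion, this study confirms the feasibility of building a "Real-Time Automated Comparison System Between Military Documents and Related Laws" that identifies which clause of which related statute is most similar to a given sentence in a Defense Acquisition Program-related military report and determines the risk of illegality by comparing the sentence with that clause.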
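
The comparison flow described in the abstract (encode a report sentence and each statutory clause with the same encoder, find the most similar clause, then judge the risk of illegality) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Where the paper trains neural models (Bi-LSTM, Self-Attention, D_Bi-LSTM) on sentence pairs, this sketch substitutes a toy character n-gram encoder, cosine similarity, and a fixed threshold. The names `encode`, `similarity`, `most_similar_clause`, `check_sentence`, `CLAUSES`, and the threshold value 0.9 are all hypothetical, and the example clauses are invented rather than quoted from any statute.

```python
import math
from collections import Counter


def encode(sentence, n=2):
    """Toy shared encoder: character n-gram counts.

    Assumption: stands in for a trained Bi-LSTM or self-attention encoder.
    Both inputs pass through this one function, which is the Siamese idea.
    """
    text = sentence.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def similarity(sentence_a, sentence_b):
    """Cosine similarity between the two encodings (0 when either is empty)."""
    u, v = encode(sentence_a), encode(sentence_b)
    dot = sum(count * v[gram] for gram, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar_clause(report_sentence, clauses):
    """Return (clause_id, score) for the clause closest to the report sentence."""
    return max(
        ((cid, similarity(report_sentence, text)) for cid, text in clauses.items()),
        key=lambda pair: pair[1],
    )


def check_sentence(report_sentence, clauses, threshold=0.9):
    """Flag a sentence whose score against its closest clause falls below the
    threshold. The threshold is an assumed value, not taken from the paper."""
    clause_id, score = most_similar_clause(report_sentence, clauses)
    return {"clause": clause_id, "score": score, "at_risk": score < threshold}


# Invented example clauses for illustration only; not actual statutory text.
CLAUSES = {
    "clause-A": "The contract shall be concluded through open competitive bidding.",
    "clause-B": "The test and evaluation plan shall be approved before production begins.",
}

if __name__ == "__main__":
    report = "The contract may be concluded without competitive bidding."
    print(check_sentence(report, CLAUSES))
```

A surface-level encoder like this one cannot tell a harmless rephrasing from a change such as "shall" to "may" that alters the legal meaning, which is the kind of distinction the trained models in the paper are meant to capture.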
