Tax Judgment Analysis and Prediction using NLP and BiLSTM

  • Lee, Yeong-Keun (Dept. of Computer Engineering, Kongju National University) ;
  • Park, Koo-Rack (Dept. of Computer Science & Engineering, Kongju National University) ;
  • Lee, Hoo-Young (Dept. of Computer Engineering, Kongju National University)
  • Received : 2021.08.31
  • Reviewed : 2021.09.20
  • Published : 2021.09.28

Abstract

Research on legal services that apply artificial intelligence to make the legal domain, which is difficult for the general public, easier to understand and more predictable is gaining importance. This study proposes a system that collects decision information from the Tax Tribunal, builds a model through data processing and self-training, and predicts answers matching a user's query. The proposed model gathers tax decision texts by web crawling, extracts useful data through natural language processing, and generates word vectors by applying the FastText algorithm, an extension of Word2Vec, to the optimized output. A total of 11,103 decisions issued from 2017 to 2019 were collected and classified, and a prediction program based on BiLSTM, a recurrent neural network (RNN) technique, was built and trained, achieving 70% accuracy. The approach is expected to be applicable to a variety of legal systems, and further research is needed to improve both the efficiency of its application and its accuracy.
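The following is a minimal sketch of the pipeline outlined in the abstract, not the authors' implementation: subword-aware FastText word vectors feeding a BiLSTM binary classifier. The toy corpus, the label scheme (1 = claim upheld, 0 = claim dismissed), the sequence length, and all hyperparameters are hypothetical placeholders; real input would be tokenized Tax Tribunal decision texts gathered by web crawling.

```python
# Sketch (assumed, not the authors' code): FastText embeddings + BiLSTM classifier.
import numpy as np
from gensim.models import FastText
from tensorflow.keras import layers, models

# Toy stand-in corpus; labels are hypothetical (1 = upheld, 0 = dismissed).
train_tokens = [
    ["tax", "appeal", "granted", "refund"],
    ["tax", "appeal", "dismissed", "penalty"],
] * 8
train_labels = [1, 0] * 8

# 1) Train subword-aware FastText embeddings on the tokenized corpus.
ft = FastText(sentences=train_tokens, vector_size=100, window=5,
              min_count=1, epochs=10)

# 2) Turn each document into a fixed-length sequence of word vectors.
MAX_LEN = 50
DIM = ft.wv.vector_size

def to_matrix(tokens):
    vecs = [ft.wv[t] for t in tokens[:MAX_LEN]]
    vecs += [np.zeros(DIM, dtype="float32")] * (MAX_LEN - len(vecs))
    return np.asarray(vecs, dtype="float32")

X = np.stack([to_matrix(t) for t in train_tokens])
y = np.asarray(train_labels, dtype="float32")

# 3) BiLSTM classifier over the embedded sequences.
model = models.Sequential([
    layers.Input(shape=(MAX_LEN, DIM)),
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, validation_split=0.25, epochs=3, batch_size=8)
```

The abstract does not specify whether pre-trained vectors are fed directly to the network or fine-tuned through a trainable embedding layer; the sketch passes the FastText vectors straight into the BiLSTM for simplicity.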

Keywords
