The Prediction of Export Credit Guarantee Accident using Machine Learning

기계학습을 이용한 수출신용보증 사고예측

  • Cho, Jaeyoung (Korea Trade Insurance Corporation) ;
  • Joo, Jihwan (School of Management Engineering, College of Business, KAIST) ;
  • Han, Ingoo (College of Business, KAIST)
  • Received : 2021.01.05
  • Accepted : 2021.03.08
  • Published : 2021.03.31

Abstract

The Korean government recently announced a range of policies to develop the big-data and artificial intelligence fields, including the disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for policy finance in Korea and is strongly committed to supporting export companies through various programs. Nevertheless, few business models based on big-data analysis have been realized so far. Against this background, this paper aims to develop a new business model for ex-ante prediction of the likelihood of an insurance accident on an export credit guarantee. We apply machine learning models to internal data from KSURE, which supports export companies in Korea, and compare the performance of the predictive models: Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, researchers have sought better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. Research on predicting financial distress or bankruptcy originated with Smith(1930), Fitzpatrick(1932), and Merwin(1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis and is still widely used in both research and practice. It uses five key financial ratios to predict the probability of bankruptcy within the next two years. Ohlson(1980) introduces a logit model to address some limitations of previous models. Furthermore, Elmer and Borowski(1988) develop and examine a rule-based, automated system that conducts the financial analysis of savings and loans.
Since the 1980s, researchers in Korea have also studied the prediction of financial distress or bankruptcy. Kim(1987) analyzes financial ratios and develops a prediction model. Han et al.(1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques, including artificial neural networks. Yang(1996) introduces multiple discriminant analysis and a logit model. Kim and Kim(2001) use artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely with diverse models such as Random Forest or SVM. One major distinction between our research and previous work is that we examine the predicted probability of default for each sample case, rather than only the classification accuracy of each model over the entire sample. Most predictive models in this paper achieve a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy, 71.1%, and the Logit model the lowest, 69%. However, these results are open to multiple interpretations. In a business context, more emphasis must be placed on minimizing Type II error, which causes more harmful operating losses for the guarantee company. We therefore also compare classification accuracy after splitting the predicted probability of default into ten equal intervals. Examined by interval, the Logit model has the highest accuracy, 100%, for the 0~10% interval of predicted probability of default, but a relatively low accuracy of 61.5% for the 90~100% interval.
In contrast, Random Forest, XGBoost, LightGBM, and DNN give more desirable results: they show higher accuracy in both the 0~10% and 90~100% intervals of predicted probability of default, and lower accuracy around a predicted probability of 50%. As for the distribution of samples across intervals, both LightGBM and XGBoost place a relatively large number of samples in the 0~10% and 90~100% intervals. Although the Random Forest model has an advantage in classification accuracy on a small number of cases, LightGBM or XGBoost could be the more desirable model because they classify a large number of cases into the two extreme intervals, even allowing for their relatively low classification accuracy. Considering the importance of Type II error and overall prediction accuracy, XGBoost and DNN show superior performance. Random Forest and LightGBM come next with good results, and logistic regression shows the worst performance. However, each predictive model has a comparative advantage under different evaluation criteria. For instance, the Random Forest model shows almost 100% accuracy for samples predicted to have a high probability of default. Taken together, a more comprehensive ensemble that combines multiple machine learning classifiers through majority voting could be constructed to maximize overall performance.
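
The majority-voting idea mentioned at the end of the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the three example prediction lists, and the rule that ties count as an accident (chosen here to stay conservative about Type II errors) are all assumptions made for illustration.

```python
def majority_vote(predictions_per_model):
    """Combine 0/1 predictions from several classifiers by majority vote.

    predictions_per_model: list of equal-length lists, one per model,
    where 1 = accident predicted and 0 = no accident predicted.
    Assumption: a tie is resolved toward 1 (accident).
    """
    n_models = len(predictions_per_model)
    combined = []
    # zip(*...) yields, for each sample, the tuple of votes from all models
    for votes in zip(*predictions_per_model):
        combined.append(1 if 2 * sum(votes) >= n_models else 0)
    return combined


# Hypothetical predictions from three models for five guarantees
model_a = [1, 0, 1, 0, 0]
model_b = [1, 1, 0, 0, 0]
model_c = [0, 1, 1, 0, 1]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

With an odd number of models, ties cannot occur and the tie rule has no effect. A weighted vote, or averaging predicted probabilities instead of hard labels, would be alternative designs.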

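In August 2020, the government selected a total of 20 tasks across five major areas, based on the capabilities of each public institution, as a plan to strengthen the role of public institutions in support of the Korean New Deal. Through various policies, such as improving public services with big data and artificial intelligence and opening up the high-quality data held by public institutions, it is making various efforts to produce the outcomes of the Korean New Deal early and to maximize them. Among these institutions, the Korea Trade Insurance Corporation (KSURE), a public policy-finance institution, operates a number of programs to support domestic export companies, but it has not yet made active use of the big data it holds. To predict export credit guarantee accidents at KSURE in advance, this study applied machine learning models to the corporation's internal data and compared predictive performance across the models. The predictive models were logistic (Logit) regression, Random Forest, XGBoost, LightGBM, and a deep neural network. As evaluation criteria, in addition to prediction accuracy over the entire sample, the per-sample accident probabilities were divided into intervals, and accuracy was compared between samples predicted with high probability and those predicted with low probability. Prediction accuracy over the entire sample was around 70% for each model, and a detailed analysis of individual samples by accident-probability interval showed prediction accuracy of 90~100% in the two extreme probability intervals (0~20% and 80~100%), demonstrating the models' practical applicability. When the importance of Type II error and overall prediction accuracy are considered together, XGBoost and the deep neural network were evaluated as the best models. Random Forest and LightGBM were next best, and the logistic regression model showed the lowest performance. As a case of analyzing KSURE's big data with machine learning models to improve operational efficiency, this study is expected to encourage more active analysis and use of big data in practice through machine learning and related techniques.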
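
The interval-level evaluation described in the abstract, in which predicted accident probabilities are split into equal intervals and accuracy is computed within each, might look roughly like the sketch below. The function name `accuracy_by_interval`, the 0.5 classification threshold, and the sample data are assumptions for illustration and are not taken from the paper.

```python
def accuracy_by_interval(probs, labels, n_bins=10, threshold=0.5):
    """Classification accuracy within equal-width probability intervals.

    probs:  predicted accident probabilities in [0, 1]
    labels: actual outcomes (1 = accident, 0 = no accident)
    Returns {bin_index: (n_samples, accuracy)} for non-empty bins only.
    Assumption: a sample is classified as an accident when p >= threshold.
    """
    hits = [0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        pred = 1 if p >= threshold else 0
        counts[b] += 1
        hits[b] += int(pred == y)
    return {b: (counts[b], hits[b] / counts[b])
            for b in range(n_bins) if counts[b]}


# Hypothetical predicted probabilities and actual outcomes
probs = [0.05, 0.08, 0.45, 0.55, 0.92, 0.97]
labels = [0, 0, 1, 1, 1, 0]

result = accuracy_by_interval(probs, labels)
for b, (n, acc) in result.items():
    print(f"{b * 10}~{(b + 1) * 10}%: n={n}, accuracy={acc:.2f}")
```

Reporting the sample count next to the accuracy matters here, because an interval with very few samples can show a high accuracy that says little about how many guarantees the model can actually sort into the extreme intervals.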

References

  1. Ahn, H., K. Kim, and I. Han, "Purchase Prediction Model using the Support Vector Machine," Journal of Intelligence and Information Systems, Vol.11, No.2 (2005), 69-81.
  2. Ahn, S. M., "Deep Learning Architectures and Applications," Journal of Intelligence and Information Systems, Vol.22, No.2 (2016), 127-142. https://doi.org/10.13088/jiis.2016.22.2.127
  3. Altman, E. I., "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy," Journal of Finance, Vol.23, No.4 (1968), 589-609. https://doi.org/10.1111/j.1540-6261.1968.tb00843.x
  4. Beaver, W. H., "Financial ratios as predictors of failure," Journal of Accounting Research, Vol.4 (1966), 71-111. https://doi.org/10.2307/2490171
  5. Cha, S. J. and J. S. Kang, "Corporate Default Prediction Model using Deep Learning Time Series Algorithm, RNN and LSTM," Journal of Intelligence and Information Systems, Vol.24, No.4 (2018), 1-32. https://doi.org/10.13088/JIIS.2018.24.4.001
  6. Chen, T. and C. Guestrin, "XGBoost: A Scalable Tree Boosting System," Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016), 785-794.
  7. Elmer, P. J. and D. M. Borowski, "An Expert System Approach to Financial Analysis: The Case of S&L Bankruptcy," Financial Management, Vol.17, No.3 (1988), 66-76. https://doi.org/10.2307/3666073
  8. Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative Adversarial Nets," In Advances in Neural Information Processing Systems, (2014), 2672-2680.
  9. Gumpert, A., H. Li, A. Moxnes, N. Ramondo, and F. Tintelnot, "The Life-Cycle Dynamics of Exporters and Multinational Firms," Journal of International Economics, Vol.126 (2020), 1-38.
  10. Han, I. G., Y. S. Kwon, and K. C. Lee, "Development of Intelligent Corporate Credit Evaluation System," Korean Management Review, Vol.24, No.4 (1995), 91-118.
  11. Han, I. G., H. K. Cho, and K. S. Shin, "The Hybrid System for Credit Rating," Journal of the Korean Operations Research and Management Science Society, Vol.22, No.3 (1997), 163-173.
  12. Ke, G., Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T. Liu, "LightGBM: A Highly Efficient Gradient Boosting Decision Tree," 31st Conference on Neural Information Processing Systems, (2017), CA, USA.
  13. Kim, K. W., "Symptoms and Predictions of Business Failure based on Financial Ratios," Korean Management Review, Vol.16, No.2 (1987), 263-316.
  14. Kim, J. R., S. B. Kim, and J. H. Nam, "The Performance Measurement of Credit Guarantee and Methods of Improvement of Its System," Korea Review of Applied Economics, Vol.16, No.2 (2014), 33-64.
  15. Kim, Y. T. and M. H. Kim, "An Artificial Neural Network Model for Business Failure Prediction," Korean Journal of Accounting Research, Vol.6, No.1 (2001), 275-294.
  16. Krugman, P., Pricing to Market when the Exchange Rate Changes, MIT Press, Massachusetts, 1986.
  17. Kwon, H. K., D. K. Lee, and M. S. Shin, "Dynamic Forecasts of Bankruptcy with Recurrent Neural Network Model," Journal of Intelligence and Information Systems, Vol.23, No.3 (2017), 139-153. https://doi.org/10.13088/jiis.2017.23.3.139
  18. Lee, H. J., "A Study on Prediction Model of Peer-to-Peer (P2P) Social Lending Debtor using Deep Learning Technique," Journal of Digital Contents Society, Vol.20, No.7 (2019), 1409-1416. https://doi.org/10.9728/dcs.2019.20.7.1409
  19. Ohlson, J. A., "Financial Ratios and the Probabilistic Prediction of Bankruptcy," Journal of Accounting Research, Vol.18, No.1 (1980), 109-131. https://doi.org/10.2307/2490395
  20. Schmidhuber, J., "Deep Learning in Neural Networks: An Overview," Neural Networks, Vol.61 (2015), 85-117. https://doi.org/10.1016/j.neunet.2014.09.003
  21. Seo, C. S. and B. K. Lee, "A Study on the Optimal Credit Guarantee Fund Operation Model: Focused on Local Credit Guarantee Foundations," Korea Trade Review, Vol.31, No.5 (2006), 197-217.
  22. Shin, S. H., H. J. Lee, and J. J. Ahn, "A Study on Initial Price Change Prediction of IPO Shares using Non-financial Information," Journal of the Korean Data And Information Science Society, Vol.29, No.2 (2018), 425-439. https://doi.org/10.7465/jkdi.2018.29.2.425
  23. Yoon, J. M., "Effectiveness Analysis of Credit Card Default Risk with Deep Learning Neural Network," Journal of Money and Finance, Vol.33, No.1 (2019), 151-183. https://doi.org/10.21023/JMF.33.1.5