Corporate Bankruptcy Prediction Model using Explainable AI-based Feature Selection

  • Gundoo Moon (Department of MIS, Graduate School, Dongguk University_Seoul)
  • Kyoung-jae Kim (Department of MIS, Business School, Dongguk University_Seoul)
  • Received : 2023.05.24
  • Accepted : 2023.06.12
  • Published : 2023.06.30

Abstract

A corporate insolvency prediction model is a key tool for objectively monitoring a company's financial condition. It provides timely warnings, enables prompt responses, and supports management strategies that reduce bankruptcy risk and improve performance. Investors and financial institutions use such models to minimize financial losses. As interest in applying artificial intelligence (AI) to corporate insolvency prediction has grown, the topic has been studied extensively, and demand has risen for explainable AI models that emphasize interpretability and reliability. SHAP (SHapley Additive exPlanations) is widely adopted and has performed well in various applications, but its computational cost and processing time grow with the number of variables, which limits its scalability. This study proposes a new variable selection approach that reduces the number of variables by averaging SHAP values computed on bootstrapped data subsets rather than on the entire dataset. The approach aims to improve computational efficiency while maintaining strong predictive performance. Random forest, XGBoost, and C5.0 models are trained on the selected, highly interpretable variables, and their classification accuracy is compared with that of an ensemble model built through soft voting, which is intended as the high-performance design. The study uses data from 1,698 Korean light-industry companies and applies bootstrapping to create distinct data groups. Logistic regression is used to compute SHAP values for each data group, and these values are averaged to obtain the final SHAP values. The proposed model is intended to improve interpretability while achieving strong predictive performance.
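The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic, the number of bootstrap groups, the group size, and the top-2 cutoff are hypothetical, and all function names are invented for this example. To stay dependency-free, it relies on the closed-form SHAP value of a linear model under a feature-independence assumption (in log-odds units, the contribution of feature j is w_j times the deviation of x_j from its mean) instead of calling the `shap` package. The `soft_vote` helper only shows the averaging step of soft voting; it assumes the positive-class probabilities come from models that have already been trained.

```python
import math
import random


def fit_logistic(X, y, lr=0.5, epochs=150):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def mean_abs_shap(X, w):
    """Mean |SHAP| per feature for a linear model: phi_ij = w_j * (x_ij - mean_j)."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [sum(abs(w[j] * (row[j] - means[j])) for row in X) / n for j in range(d)]


def bootstrap_shap_importance(X, y, n_groups=5, group_size=100, seed=0):
    """Average mean |SHAP| over several bootstrapped data groups."""
    rng = random.Random(seed)
    d = len(X[0])
    averaged = [0.0] * d
    for _ in range(n_groups):
        # Sample one data group with replacement (a bootstrap subset).
        idx = [rng.randrange(len(X)) for _ in range(group_size)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        w, _ = fit_logistic(Xb, yb)
        for j, value in enumerate(mean_abs_shap(Xb, w)):
            averaged[j] += value / n_groups
    return averaged


def soft_vote(prob_lists, threshold=0.5):
    """Average positive-class probabilities across models, then threshold."""
    avg = [sum(ps) / len(ps) for ps in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in avg]


# Synthetic data (assumption): features 0 and 1 drive the label, 2 and 3 are noise.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(300)]
y = [1 if 2.0 * r[0] - 1.5 * r[1] + data_rng.gauss(0, 0.5) > 0 else 0 for r in X]

importance = bootstrap_shap_importance(X, y)
selected = sorted(range(4), key=lambda j: importance[j], reverse=True)[:2]
print("averaged mean |SHAP|:", [round(v, 3) for v in importance])
print("selected features:", sorted(selected))
```

On this synthetic data the two informative features (indices 0 and 1) should rank highest. Because each model is fitted on a subset rather than the full dataset, the cost of each SHAP computation depends on the group size, which is the efficiency argument the abstract makes. The selected columns would then be passed to the downstream classifiers, whose predicted probabilities `soft_vote` averages.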
