Parameter estimation for the imbalanced credit scoring data using AUC maximization

Parameter estimation for low default rate data using AUC optimization

  • Hong, C.S. (Department of Statistics, Sungkyunkwan University)
  • Won, C.H. (Department of Statistics, Sungkyunkwan University)
  • Received : 2015.11.02
  • Accepted : 2016.01.05
  • Published : 2016.02.29

Abstract

For binary classification models, we consider a risk score that is a function of a linear score and estimate the coefficients of that linear score. There are two representative estimation methods: obtaining maximum likelihood estimates (MLEs) from a logistic model, and maximizing the AUC. Estimates from the AUC approach perform better than the logistic-model MLEs in general situations where the data do not satisfy the logistic assumption. This paper considers imbalanced, low-default-rate data of the kind commonly encountered in credit assessment models, where the default class contains far fewer observations than the non-default class, and applies the AUC approach to such data. Modified logit functions are used as link functions to generate imbalanced data in which the proportion of defaults is markedly lower than that of non-defaults. For these low-default-probability imbalanced data, the coefficient estimates obtained by the AUC approach are found to be equal to or better than those obtained from logistic models.
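The AUC approach described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' procedure: the function names (`empirical_auc`, `simulate`, `fit_by_auc`), the two standard-normal covariates, the plain logit link with a strongly negative intercept to produce a low default rate, and the grid search used in place of a numerical optimizer. Because the AUC is unchanged when the score is multiplied by a positive constant, the first coefficient is fixed at 1 and only the ratio of the second coefficient to the first is estimated.

```python
import math
import random


def empirical_auc(scores, labels):
    """Mann-Whitney estimate of AUC: the share of (default, non-default)
    pairs in which the default has the higher score; ties count as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def simulate(n, beta, intercept, rng):
    """Draw n observations with two N(0, 1) covariates and a logit link.
    A strongly negative intercept gives a low default rate (assumed setup)."""
    xs, ys = [], []
    for _ in range(n):
        x = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        eta = intercept + beta[0] * x[0] + beta[1] * x[1]
        p_default = 1.0 / (1.0 + math.exp(-eta))
        xs.append(x)
        ys.append(1 if rng.random() < p_default else 0)
    return xs, ys


def fit_by_auc(xs, ys, grid):
    """Fix the first coefficient at 1 and pick the second coefficient
    from `grid` that maximizes the empirical AUC of x1 + b * x2."""
    best_b, best_auc = None, -1.0
    for b in grid:
        scores = [x[0] + b * x[1] for x in xs]
        auc = empirical_auc(scores, ys)
        if auc > best_auc:
            best_b, best_auc = b, auc
    return best_b, best_auc


if __name__ == "__main__":
    rng = random.Random(2016)
    # True coefficient ratio beta2 / beta1 = 0.5; intercept -4 keeps defaults rare.
    xs, ys = simulate(1500, (1.0, 0.5), -4.0, rng)
    grid = [i / 10 for i in range(-10, 21)]
    b_hat, auc_hat = fit_by_auc(xs, ys, grid)
    print("default rate:", sum(ys) / len(ys))
    print("estimated beta2/beta1:", b_hat, "AUC:", round(auc_hat, 4))
```

The empirical AUC is a step function of the coefficients, so gradient-based optimizers do not apply directly; a grid search or a derivative-free method is the usual choice, and the grid here is only the simplest option for a single free parameter. Comparing `b_hat` with the coefficient ratio from a fitted logistic model over repeated simulations would mirror the kind of comparison the abstract reports.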
