Detecting Credit Loan Fraud Based on Individual-Level Utility

  • Received : 2012.12.16
  • Accepted : 2012.12.18
  • Published : 2012.12.31

Abstract

As financial institutions offer a rapidly growing number of credit loan products, the number of fraudulent transactions is also rising quickly. To manage financial risk successfully, these institutions should therefore tighten loan qualification requirements and strengthen their ability to detect credit loan fraud proactively. When building a classification model to detect credit loan fraud, the utility of the classification results (i.e., the benefits of correct predictions and the costs of incorrect predictions) matters more than the classification accuracy rate. The objective of this paper is to propose a new approach to building a classification model for detecting credit loan fraud based on individual-level utility. Experimental results show that the model yields higher utility than fraud detection models that do not take the individual-level utility concept into account. They also show that the individual-level utility computed by the model is more accurate than the mean-level utility computed by other models, from both the opportunity utility and cash flow perspectives. We examine the experimental results from both perspectives in several ways.

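As the number of credit loan products developed by financial institutions increases, the number of fraudulent transactions is also increasing rapidly. Therefore, to manage financial risk successfully, financial institutions must strengthen loan approval screening and increase their ability to detect credit loan fraud in advance. In building a classification model to detect credit loan fraud, the utility of the classification results (i.e., the benefit of correct classification and the cost of misclassification) is more important than classification accuracy. This study aims to build a classification model for detecting credit loan fraud based on individual-level utility. Through a range of experiments, we show that the proposed model provides higher utility than models not based on individual-level utility, and more accurate utility than models based on mean utility, from both the opportunity utility and cash flow perspectives. We examine the experimental results obtained from these two perspectives from various angles.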
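To make the distinction between accuracy, mean-level utility, and individual-level utility concrete, the following is a minimal sketch and not the model proposed in the paper. Everything in it is an assumption made for illustration: the `Applicant` fields, the utility definition (interest earned on a legitimate loan, principal lost on a fraudulent one, zero when a loan is rejected), the fraud probabilities, and the toy amounts. The abstract does not specify how the paper defines opportunity utility or cash flow, so neither is modelled here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Applicant:
    p_fraud: float    # assumed output of some fraud classifier
    principal: float  # hypothetical loss if an approved loan is fraudulent
    interest: float   # hypothetical gain if an approved loan is legitimate
    is_fraud: bool    # true label, used only for evaluation


# Toy data, invented for this sketch.
applicants: List[Applicant] = [
    Applicant(0.30, 1_000.0, 100.0, False),
    Applicant(0.04, 50_000.0, 2_000.0, True),
    Applicant(0.02, 1_000.0, 100.0, False),
    Applicant(0.10, 500.0, 200.0, False),
]

mean_principal = sum(a.principal for a in applicants) / len(applicants)
mean_interest = sum(a.interest for a in applicants) / len(applicants)


def approve_by_threshold(a: Applicant) -> bool:
    # Accuracy-oriented rule: flag as fraud only when p_fraud >= 0.5.
    return a.p_fraud < 0.5


def approve_by_mean_utility(a: Applicant) -> bool:
    # Expected utility of approval using the same average amounts for everyone.
    return (1 - a.p_fraud) * mean_interest - a.p_fraud * mean_principal > 0


def approve_by_individual_utility(a: Applicant) -> bool:
    # Expected utility of approval using this applicant's own amounts.
    return (1 - a.p_fraud) * a.interest - a.p_fraud * a.principal > 0


def realized_utility(a: Applicant, approve: bool) -> float:
    # Rejected loans yield zero in this simplified setting.
    if not approve:
        return 0.0
    return -a.principal if a.is_fraud else a.interest


def total_utility(rule: Callable[[Applicant], bool]) -> float:
    return sum(realized_utility(a, rule(a)) for a in applicants)


def accuracy(rule: Callable[[Applicant], bool]) -> float:
    # A rejection is treated as a "fraud" prediction.
    correct = sum((not rule(a)) == a.is_fraud for a in applicants)
    return correct / len(applicants)


for name, rule in [
    ("threshold", approve_by_threshold),
    ("mean-level utility", approve_by_mean_utility),
    ("individual-level utility", approve_by_individual_utility),
]:
    print(f"{name}: accuracy={accuracy(rule):.2f}, utility={total_utility(rule):.1f}")
```

On this toy data the threshold rule and the individual-level rule reach the same accuracy, yet only the individual-level rule declines the single large fraudulent loan, so their total utilities differ widely. The mean-level rule misjudges that loan because averaging hides its unusually large principal.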
