• Title/Summary/Keyword: credit scoring

Mitigating Data Imbalance via Ensembled Data Augmentation: An Explainable Credit Scoring Models (데이터 증강 기법의 앙상블을 통한 레이블 불균형 해소: 설명 가능한 신용평가 모델을 중심으로)

  • Ji-Young Chung;So-Yeon Lee;Ye-Lin Yong;Min-Jun Kim
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.483-486 / 2023
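  • The financial sector has recently seen growing attention to the black-box problem caused by the complexity of predictive models and to financial regulation. Accordingly, the financial industry emphasizes reliability and transparency, and research on explainable models is especially active in credit scoring. The field also stresses the data imbalance problem, in which a model fails to learn the minority class sufficiently and may overfit to the majority class. This problem becomes more pronounced when the Type II error must be minimized, and it has become a critical issue in personal credit scoring, where customers with low loan repayment capacity must be identified as fully as possible. In this paper, we use an attention mechanism to improve model explainability and to help interpret the analysis results. Furthermore, we experiment with a total of five data augmentation techniques, including SMOTE, GAN, and ADASYN, and confirm that ensembling them substantially improves classification accuracy on the minority class label.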
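As a rough illustration of the kind of minority-class augmentation the abstract mentions, the sketch below implements only a simplified SMOTE-style interpolation. It is not the authors' method or their five-technique ensemble; the function name `smote_like`, the parameter `k`, and the sample points are hypothetical.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Hypothetical minority-class feature vectors (e.g. defaulting borrowers).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_like(minority, n_new=6)
```

Because each synthetic point lies on a segment between two real minority samples, it stays inside the region the minority class already occupies, which is the basic idea behind SMOTE.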

Risk Analysis of Household Debt in Korea: Using Micro CB Data (개인CB 자료를 이용한 우리나라 가계의 부채상환위험 분석)

  • Hahm, Joon-Ho;Kim, Jung In;Lee, Young Sook
    • KDI Journal of Economic Policy / v.32 no.4 / pp.1-34 / 2010
  • We conduct a comprehensive risk analysis of household debt in Korea for the first time using the whole sample credit bureau (CB) data of 2.2 million individual debtors. After analysing debt service capacity profiles of debtor groups classified by the borrower characteristics such as income, age, occupation, credit scoring, and the type of creditor business companies, we investigate the impact of interest rate and income changes on debt service-to-income ratios (DTIs) and default rates of respective debtor groups. Empirical results indicate that debt service burdens are relatively high for low income wage earners, high income self-employed, low income capital and card loan holders, and high income mutual savings loan holders. We also find that debtors from multiple financial companies are particularly weak in their debt service capacity. The scenario analysis indicates that financial companies, with the current level of capital buffers, may be able to absorb negative consequences arising from the increase in DTIs and loan default rates if the interest rate and income changes remain modest. However, the negative consequences may fall disproportionately on non-bank financial companies such as capital, credit card, and mutual savings banks, whose debtors' DTIs are already high. We also find that the refinancing risk of household debt is relatively high in Korea as more than half of household mortgage debts are bullet loans. As the DTIs of mortgage loan holders are already high, under the current DTI regulation, mortgage loans may not be readily refinanced especially when the interest rate rises. Disruptions in mortgage loan refinancing may put downward pressure on housing prices, which may in turn magnify refinancing risk under the current loan-to-value (LTV) regulation. 
Overall, our analysis suggests that more effective monitoring of household debt risk requires combining existing surveillance schemes based on macro aggregate indicators with more comprehensive and detailed risk analyses based on micro individual data.
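The scenario analysis described above can be illustrated with a small worked example. This is a minimal sketch, assuming an interest-only (bullet) loan so that annual debt service is simply principal times rate; the functions `dti` and `stressed_dti` and all figures are hypothetical and are not taken from the paper.

```python
def dti(principal, annual_rate, annual_income):
    """Debt service-to-income ratio for an interest-only (bullet) loan."""
    return principal * annual_rate / annual_income

def stressed_dti(principal, annual_rate, annual_income,
                 rate_shock=0.0, income_shock=0.0):
    """DTI after an additive interest-rate shock and a relative income shock."""
    return dti(principal,
               annual_rate + rate_shock,
               annual_income * (1 + income_shock))

# Hypothetical borrower: 100,000 principal, 5% rate, 40,000 annual income.
base = dti(100_000, 0.05, 40_000)
# Scenario: rate rises by 2 percentage points, income falls by 10%.
stressed = stressed_dti(100_000, 0.05, 40_000,
                        rate_shock=0.02, income_shock=-0.10)
print(f"baseline DTI: {base:.3f}, stressed DTI: {stressed:.3f}")
```

In this toy case the two shocks together raise the DTI from about 12.5% to about 19.4%, which shows why borrowers whose DTIs are already high are the most exposed.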

Comparison of data mining methods with daily lens data (데일리 렌즈 데이터를 사용한 데이터마이닝 기법 비교)

  • Seok, Kyungha;Lee, Taewoo
    • Journal of the Korean Data and Information Science Society / v.24 no.6 / pp.1341-1348 / 2013
  • Various data mining techniques have been applied to classification problems in database marketing, credit scoring, and market forecasting. In this paper, we compare techniques such as bagging, boosting, LASSO, random forest, and support vector machine (SVM) on daily lens transaction data. The classical techniques, decision tree and logistic regression, are used too. The experiment shows that the random forest has a slightly smaller misclassification rate and standard error than the other methods. The SVM performs well in terms of misclassification rate but poorly in terms of standard error. Taking model interpretation and computing time into consideration, we conclude that the LASSO gives the best result.

Domain Knowledge Incorporated Local Rule-based Explanation for ML-based Bankruptcy Prediction Model (머신러닝 기반 부도예측모형에서 로컬영역의 도메인 지식 통합 규칙 기반 설명 방법)

  • Soo Hyun Cho;Kyung-shik Shin
    • Information Systems Review / v.24 no.1 / pp.105-123 / 2022
  • Thanks to the remarkable success of artificial intelligence (AI) techniques, new possibilities for applying them to real-world problems have emerged. One prominent application is the bankruptcy prediction model, which is often used as a basic knowledge base for credit scoring models in the financial industry. As a result, there has been extensive research on how to improve the prediction accuracy of such models. However, despite their impressive performance, machine learning (ML)-based models are difficult to deploy because of their intrinsic opacity, especially when the field requires or values an explanation of the result obtained by the model. The financial domain is one of the areas where explanation matters to stakeholders such as domain experts and customers. In this paper, we propose a novel approach that incorporates financial domain knowledge into local rule generation to provide instance-level explanations for the bankruptcy prediction model. The results show that the proposed method successfully selects and classifies the extracted rules based on their feasibility and the information they convey to users.