Utility of Synthetic Data in Finances: An Application of Online P2P Lending Loan Default Analysis (금융업의 합성 데이터 유용성 분석: 온라인 P2P 대출연체 분석을 중심으로)

  • 송민채
    • 한국IT서비스학회지 / Vol. 23, No. 4 / pp. 55-70 / 2024
  • To promote AI applications in the financial industry, the financial sector has recently turned its attention to synthetic data technology. Synthetic data is generated using a purpose-built mathematical model or algorithm, with the aim of solving a set of data science tasks. This study evaluates the utility of synthetic data by analyzing heterogeneous tabular data that comprises discrete, categorical, and continuous variables and is imbalanced, as is common in the financial sector. The TGAN and CTGAN models are applied as synthetic data generation techniques because they account for the characteristics of tabular data. Evaluated in terms of resemblance and machine learning efficiency, the utility of the TGAN-generated data is confirmed to be high, while the quality of the CTGAN-generated data is relatively poor. This difference is attributed in particular to the generation of categorical variables, which suggests that how a synthetic data generation model handles categorical variables is a major factor in determining the utility of the generated synthetic data.
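
The abstract names resemblance as one axis of utility and points to categorical variables as the weak spot. As a rough illustration only, the sketch below compares the category frequencies of a real and a synthetic column using total variation distance. The metric choice, the function names, and the loan-status values are all assumptions made for this example; the abstract does not say which resemblance measures the study used.

```python
from collections import Counter


def category_frequencies(values):
    """Return the relative frequency of each category in a column."""
    if not values:
        raise ValueError("column must contain at least one value")
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def total_variation_distance(real_values, synthetic_values):
    """Total variation distance between two categorical columns.

    0.0 means identical category frequencies; 1.0 means no shared categories.
    """
    real_freq = category_frequencies(real_values)
    synth_freq = category_frequencies(synthetic_values)
    categories = set(real_freq) | set(synth_freq)
    return 0.5 * sum(
        abs(real_freq.get(c, 0.0) - synth_freq.get(c, 0.0)) for c in categories
    )


# Hypothetical, made-up loan-status columns (not data from the study).
real_status = ["paid"] * 8 + ["default"] * 2
synthetic_status = ["paid"] * 6 + ["default"] * 4

distance = total_variation_distance(real_status, synthetic_status)
print(f"TVD = {distance:.2f}")
```

A lower distance indicates that the synthetic column reproduces the real category proportions more closely. For an imbalanced target such as loan default, a per-column check like this might be one way to see whether a generator over- or under-produces the minority class.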