• Title/Summary/Keyword: Quality Engineering


The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.83-102 / 2021
  • The government recently announced various policies to develop the big-data and artificial intelligence fields, giving the public greater access to high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea and is strongly committed to backing export companies through various systems. Nevertheless, few business models based on big-data analysis have been realized so far. Against this background, this paper aims to develop a new business model that can be applied to ex-ante prediction of the likelihood of a credit guarantee insurance accident. We use internal data from KSURE, which supports export companies in Korea, and apply machine learning models. We then compare the performance of the predictive models, including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, many researchers have sought better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. Research on predicting financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis and is still widely used in both research and practice. It is a score model that uses five key financial ratios to predict the probability of bankruptcy within the next two years. Ohlson (1980) introduces a logit model to address some limitations of previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system that conducts the financial analysis of savings and loans. 
Since the 1980s, researchers in Korea have also studied the prediction of financial distress or bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques, including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and a logit model. In addition, Kim and Kim (2001) use artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely with diverse models such as Random Forest or SVM. One major distinction between our research and previous research is that we examine the predicted probability of default for each sample case, rather than only the classification accuracy of each model over the entire sample. Most predictive models in this paper show a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy of 71.1% and the Logit model the lowest accuracy of 69%. However, these results are open to multiple interpretations. In the business context, more emphasis has to be placed on minimizing type 2 error, which causes more harmful operating losses for the guaranty company. Thus, we also compare classification accuracy after splitting the predicted probability of default into ten equal intervals. When we examine the classification accuracy for each interval, the Logit model has the highest accuracy of 100% for the 0~10% interval of the predicted probability of default, but a relatively low accuracy of 61.5% for the 90~100% interval. 
On the other hand, Random Forest, XGBoost, LightGBM, and DNN give more desirable results, since they show higher accuracy for both the 0~10% and 90~100% intervals of the predicted probability of default and lower accuracy around 50%. Regarding the distribution of samples across intervals, both the LightGBM and XGBoost models place a relatively large number of samples in the 0~10% and 90~100% intervals. Although the Random Forest model has an advantage in classification accuracy for a small number of cases, LightGBM or XGBoost could be a more desirable model, since they classify a large number of cases into the two extreme intervals of the predicted probability of default, even allowing for their relatively low classification accuracy. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance. Random Forest and LightGBM come next with good results, while logistic regression shows the worst performance. However, each predictive model has a comparative advantage under different evaluation standards. For instance, the Random Forest model shows almost 100% accuracy for samples expected to have a high probability of default. Collectively, we can construct more comprehensive ensemble models that combine multiple machine learning classifiers and use majority voting to maximize overall performance.
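
The interval-wise evaluation and the majority voting mentioned in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `accuracy_by_interval` and `majority_vote`, the 0.5 decision threshold, the tie rule, and the toy data are all assumptions introduced here.

```python
def accuracy_by_interval(y_true, y_prob, n_bins=10, threshold=0.5):
    """Classification accuracy within equal-width intervals of predicted probability.

    Returns {interval_index: (n_cases, accuracy)}; index 0 is 0~10%, index 9 is 90~100%
    when n_bins is 10. Empty intervals are omitted.
    """
    counts = [0] * n_bins
    hits = [0] * n_bins
    for label, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 is placed in the last interval.
        b = min(int(p * n_bins), n_bins - 1)
        predicted = 1 if p >= threshold else 0  # assumed cut-off, not from the paper
        counts[b] += 1
        hits[b] += int(predicted == label)
    return {b: (counts[b], hits[b] / counts[b]) for b in range(n_bins) if counts[b]}


def majority_vote(predictions_per_model):
    """Combine 0/1 predictions from several models by strict majority.

    predictions_per_model holds one equal-length list of labels per model.
    Ties (possible with an even number of models) fall back to 0 here; that
    tie rule is an assumption of this sketch.
    """
    return [1 if 2 * sum(votes) > len(votes) else 0
            for votes in zip(*predictions_per_model)]


# Toy data, made up for illustration only (1 = default, 0 = no default).
y_true = [0, 0, 1, 1, 1, 0]
y_prob = [0.05, 0.08, 0.95, 0.55, 0.45, 0.92]

for b, (n, acc) in sorted(accuracy_by_interval(y_true, y_prob).items()):
    print(f"{b * 10}~{(b + 1) * 10}%: n={n}, accuracy={acc:.2f}")

print(majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]]))
```

Reporting the number of cases alongside the accuracy for each interval reflects the abstract's point that a model can look accurate in the extreme intervals while placing only a few samples there.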

Soil Physical Properties of Arable Land by Land Use Across the Country (토지이용별 전국 농경지 토양물리적 특성)

  • Cho, H.R.;Zhang, Y.S.;Han, K.H.;Cho, H.J.;Ryu, J.H.;Jung, K.Y.;Cho, K.R.;Ro, A.S.;Lim, S.J.;Choi, S.C.;Lee, J.I.;Lee, W.K.;Ahn, B.K.;Kim, B.H.;Kim, C.Y.;Park, J.H.;Hyun, S.H.
    • Korean Journal of Soil Science and Fertilizer / v.45 no.3 / pp.344-352 / 2012
  • Soil physical properties determine soil quality in terms of root growth, infiltration, and water- and nutrient-holding capacity. Although monitoring soil physical properties is important for sustainable agricultural production, few studies have addressed it. This study was conducted to investigate the soil physical properties of arable land by land use across the country. Plastic film house soils, upland soils, orchard soils, and paddy soils were surveyed from 2008 to 2011 for depth of topsoil, bulk density, hardness, soil texture, and organic matter. The average physical properties were as follows. In plastic film house soils, the depth of topsoil was 16.2 cm. For the topsoils, hardness was 9.0 mm, bulk density was 1.09 Mg $m^{-3}$, and organic matter content was 29.0 g $kg^{-1}$. For the subsoils, hardness was 19.8 mm, bulk density was 1.32 Mg $m^{-3}$, and organic matter content was 29.5 g $kg^{-1}$. In upland soils, the depth of topsoil was 13.3 cm. For the topsoils, hardness was 11.3 mm, bulk density was 1.33 Mg $m^{-3}$, and organic matter content was 20.6 g $kg^{-1}$. For the subsoils, hardness was 18.8 mm, bulk density was 1.52 Mg $m^{-3}$, and organic matter content was 13.0 g $kg^{-1}$. By crop type, soil physical property values were high in soils of the deep-rooted and short-rooted vegetable groups and low in soils of the leafy vegetable group. In orchard soils, the depth of topsoil was 15.4 cm. For the topsoils, hardness was 16.1 mm, bulk density was 1.25 Mg $m^{-3}$, and organic matter content was 28.5 g $kg^{-1}$. For the subsoils, hardness was 19.8 mm, bulk density was 1.41 Mg $m^{-3}$, and organic matter content was 15.9 g $kg^{-1}$. In paddy soils, the depth of topsoil was 17.5 cm. For the topsoils, hardness was 15.3 mm, bulk density was 1.22 Mg $m^{-3}$, and organic matter content was 23.5 g $kg^{-1}$. 
For the subsoils, hardness was 20.3 mm, bulk density was 1.47 Mg $m^{-3}$, and organic matter content was 17.5 g $kg^{-1}$. By land use, the average bulk density increased in the order plastic film house soils < paddy soils < orchard soils < upland soils. Topsoil bulk density was mainly distributed in the range 1.0~1.25 Mg $m^{-3}$. Subsoil bulk density was mostly distributed above 1.50 Mg $m^{-3}$ for upland and paddy soils, in 1.35~1.50 Mg $m^{-3}$ for orchard soils, and in 1.0~1.50 Mg $m^{-3}$ for plastic film house soils. By soil textural family, bulk density was lower in clayey soils and higher in fine silty and sandy soils. Soil physical properties and the distribution of topography differed by land use and crop type. Therefore, land use and crop type need to be considered for appropriate soil management.