• Title/Summary/Keyword: Random Over Sampling Examples

Application of Random Over Sampling Examples(ROSE) for an Effective Bankruptcy Prediction Model (효과적인 기업부도 예측모형을 위한 ROSE 표본추출기법의 적용)

  • Ahn, Cheolhwi;Ahn, Hyunchul
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.525-535 / 2018
  • If one class occurs far more frequently than the others in a classification problem, a data imbalance problem arises, which distorts machine learning. Corporate bankruptcy prediction often suffers from data imbalance because the proportion of insolvent companies is generally very low, whereas the proportion of solvent companies is very high. Mitigating this problem requires a proper sampling technique. Until now, oversampling techniques, which adjust the class distribution of a data set by sampling the minority class with replacement, have been widely used. However, they carry a risk of overfitting. Against this background, this study proposes applying the ROSE (Random Over Sampling Examples) technique, introduced by Menardi and Torelli in 2014, to effective corporate bankruptcy prediction. The ROSE technique creates new learning samples by synthesizing them from the existing training samples, so it leads to better prediction accuracy of the classifiers while avoiding the risk of overfitting. Specifically, our study proposes combining the ROSE method with SVM (support vector machine), which is regarded as one of the best binary classifiers. We applied the proposed method to a real-world bankruptcy prediction case from a major Korean bank and compared its performance with that of other sampling techniques. Experimental results showed that ROSE improved the prediction accuracy of SVM in bankruptcy prediction compared to the other techniques, with statistical significance. These results suggest that ROSE can be a good alternative for resolving data imbalance in prediction problems in social science areas beyond bankruptcy prediction.
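The sampling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the commonly described form of ROSE as a smoothed bootstrap: pick a class with equal probability, pick one observed example of that class, and draw a synthetic row from a Gaussian kernel centred on it. The function name `rose_sample`, the normal-reference bandwidth rule, and the toy data are all assumptions made for illustration.

```python
import random
import statistics


def rose_sample(X, y, n_new, seed=0):
    """Return n_new synthetic rows and labels drawn ROSE-style.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    d = len(X[0])
    labels = sorted(set(y))

    # Group the training rows by class label.
    rows_by_class = {c: [r for r, lab in zip(X, y) if lab == c] for c in labels}

    # Per-class, per-feature Gaussian bandwidths from a normal-reference rule
    # (an assumption here; another bandwidth choice could be substituted).
    bandwidths = {}
    for c, rows in rows_by_class.items():
        n_c = len(rows)
        factor = (4.0 / ((d + 2) * n_c)) ** (1.0 / (d + 4))
        bandwidths[c] = [
            factor * (statistics.stdev(col) if n_c > 1 else 0.0)
            for col in zip(*rows)
        ]

    X_new, y_new = [], []
    for _ in range(n_new):
        c = rng.choice(labels)                   # each class equally likely
        seed_row = rng.choice(rows_by_class[c])  # one observed example
        # Perturb the example with Gaussian noise centred on it.
        X_new.append([rng.gauss(v, h) for v, h in zip(seed_row, bandwidths[c])])
        y_new.append(c)
    return X_new, y_new


# Toy imbalanced data: 20 "solvent" (0) rows and 4 "insolvent" (1) rows.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(20)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(4)]
y = [0] * 20 + [1] * 4

X_bal, y_bal = rose_sample(X, y, n_new=1000)
```

Because the synthetic rows are perturbed rather than copied, the balanced set contains no exact duplicates of the minority examples, which is the property the abstract credits with lowering the overfitting risk of plain oversampling. The resulting `X_bal` and `y_bal` could then be passed to any classifier, such as the SVM used in the study.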

The Relationship Between Social Security Network and Security Life Satisfaction in Community Residents: Scale Development and Application of Social Security Network (사회안전망과 지역사회주민의 안전생활만족의 관계: 사회안전망 척도개발과 적용)

  • Kim, Chan-Sun
    • The Journal of the Korea Contents Association / v.14 no.6 / pp.108-118 / 2014
  • The purpose of this study is to develop a scale for measuring the social security network, verify its validity and reliability, and apply it to investigate its relationship with security life satisfaction. The study population was general residents of Seoul in 2013, and a total of 203 cases drawn by stratified cluster random sampling were analyzed. The scale was developed through document research, conceptual definition, drafting of the survey, an expert conference, a preliminary test and the main survey, and verification of the validity and reliability of the survey. An expert conference took place to verify the validity of the survey, and six factors were extracted through exploratory factor analysis: crime prevention design, street CCTV facilities, volunteer neighborhood patrol, local government security education, police public peace service, and private security service. The collected data were analyzed in line with the aim of the study using SPSSWIN 18.0, applying frequency analysis, F test, factor analysis, reliability analysis, correlation analysis, and multiple regression analysis. The conclusions are as follows. First, the validity of the social security network scale is very high. The factors constituting the social security network were found to be crime prevention design, street CCTV facilities, volunteer neighborhood patrol, local government security education, police public peace services, and private security services, and the crime prevention design factor was found to have the greatest explanatory power. Second, the reliability of the social security network scale is very high. The correlations between the items and their factors, and between the items and the overall social security network, were very high, and internal consistency showed a Cronbach's $\alpha$ value of over 0.865. Third, the establishment of a social security network had the biggest effect on people in their forties. Thus, when crime prevention design, street CCTV facilities, local government security education, and police public peace services are systematically established, the social anxiety of citizens is reduced.
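For readers unfamiliar with the internal-consistency figure reported above, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from item scores, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy responses are hypothetical. The study itself used SPSSWIN 18.0, and its data are not reproduced here.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (one list per survey item).

    Hypothetical helper for illustration; each column holds one score per respondent.
    """
    k = len(items)
    # Sample variance of each item across respondents.
    item_variances = [statistics.variance(col) for col in items]
    # Total score per respondent, then the variance of those totals.
    totals = [sum(scores) for scores in zip(*items)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy responses: three items answered identically by four respondents,
# so the items are perfectly consistent with one another.
toy_items = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]
alpha = cronbach_alpha(toy_items)
```

With perfectly consistent items, as in this toy case, alpha reaches its maximum of 1. Values closer to 1 indicate that the items measure the same underlying construct more consistently.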