Improving the Error Back-Propagation Algorithm for Imbalanced Data Sets

  • Oh, Sang-Hoon (Department of Information Communication Engineering, Mokwon University)
  • Received : 2012.04.20
  • Accepted : 2012.06.11
  • Published : 2012.06.28

Abstract

Imbalanced data sets are difficult to classify because most classifiers are developed on the assumption that class distributions are well balanced. To improve the error back-propagation algorithm for classifying imbalanced data sets, a new error function is proposed. The error function controls weight updating according to the class to which each training sample belongs. As a result, samples in the minority class have a greater chance of being classified, while samples in the majority class have a smaller chance. The proposed method is compared with the two-phase, threshold-moving, and target node methods through simulations on a mammography data set, and it attains the best results.
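The abstract does not give the form of the proposed error function, so the following is only a minimal sketch of the general idea of class-dependent weight updating, not the method of the paper. It assumes a single sigmoid output unit, a toy one-dimensional data set, and a simple gain that scales the output error signal by the class of the sample. The names `class_weighted_delta`, `minority_gain`, and `majority_gain`, and all numeric values, are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def class_weighted_delta(target, output, minority_gain=4.0, majority_gain=1.0):
    """Output-node error signal scaled by the class of the training sample.

    target is 1 for the minority class and 0 for the majority class.
    The gains are hypothetical illustration values, not taken from the paper.
    """
    gain = minority_gain if target == 1 else majority_gain
    return gain * (target - output)


def train(samples, minority_gain, epochs=300, lr=0.01, seed=0):
    """Train one sigmoid unit y = sigmoid(w*x + b) with online updates."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, t in data:
            y = sigmoid(w * x + b)
            delta = class_weighted_delta(t, y, minority_gain=minority_gain)
            # A larger gain means a larger weight update for that class.
            w += lr * delta * x
            b += lr * delta
    return w, b


def make_imbalanced_data(n_majority=200, n_minority=10, seed=1):
    """Toy 1-D data: overlapping Gaussians with far fewer minority samples."""
    rng = random.Random(seed)
    majority = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_majority)]
    minority = [(rng.gauss(2.0, 1.0), 1) for _ in range(n_minority)]
    return majority + minority


def minority_recall(w, b, samples):
    """Fraction of minority samples whose output exceeds 0.5."""
    minority = [x for x, t in samples if t == 1]
    hits = sum(1 for x in minority if sigmoid(w * x + b) > 0.5)
    return hits / len(minority)


data = make_imbalanced_data()
plain = train(data, minority_gain=1.0)      # both classes updated equally
weighted = train(data, minority_gain=20.0)  # minority updates emphasized
print("minority recall, equal gains:", minority_recall(*plain, data))
print("minority recall, larger minority gain:", minority_recall(*weighted, data))
```

With equal gains the unit tends to favor the majority class. Raising the minority gain moves the decision boundary toward the majority side, which trades some majority-class accuracy for minority-class detection, in line with the effect the abstract describes.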
