ConvXGB: A new deep learning model for classification problems based on CNN and XGBoost

  • Thongsuwan, Setthanun (Advanced Artificial Intelligence (AAI) Research Laboratory, Department of Computer Science, King Mongkut's Institute of Technology Ladkrabang) ;
  • Jaiyen, Saichon (Advanced Artificial Intelligence (AAI) Research Laboratory, Department of Computer Science, King Mongkut's Institute of Technology Ladkrabang) ;
  • Padcharoen, Anantachai (Department of Mathematics, Faculty of Science, Rambhai Barni Rajabhat University) ;
  • Agarwal, Praveen (Department of Mathematics, Anand International College of Engineering)
  • Received : 2019.10.08
  • Accepted : 2020.04.07
  • Published : 2021.02.25

Abstract

We describe a new deep learning model, Convolutional eXtreme Gradient Boosting (ConvXGB), for classification problems, based on convolutional neural nets and Chen et al.'s XGBoost. In addition to image data, ConvXGB supports general classification problems through a data preprocessing module. ConvXGB consists of several stacked convolutional layers that learn features of the input automatically, followed by XGBoost in the last layer to predict the class labels. The ConvXGB model is simplified by reducing the number of parameters under appropriate conditions, since it is not necessary to re-adjust the weight values in a back propagation cycle. Experiments on several data sets from the UCI Repository, including images and general data sets, showed that our model handled the classification problems, for all the tested data sets, slightly better than CNN and XGBoost alone and was sometimes significantly better.
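The pipeline outlined in the abstract (preprocess, convolve, then boost) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the zero-padding reshape standing in for the preprocessing module, the single convolutional layer with fixed random 2x2 filters, the hypothetical helper names, and the hyperparameters passed to `XGBClassifier`. The filters are never updated, which mirrors the abstract's remark that weights need not be re-adjusted by back propagation.

```python
import math
import random


def to_square_grid(features):
    """Zero-pad a non-empty flat feature vector and reshape it into a square grid.

    Hypothetical stand-in for the preprocessing module mentioned in the abstract.
    """
    side = math.isqrt(len(features) - 1) + 1  # ceil(sqrt(n)) for n >= 1
    padded = list(features) + [0.0] * (side * side - len(features))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


def conv2d_relu(grid, kernel):
    """'Valid' 2D cross-correlation of a square grid with a square kernel, then ReLU."""
    k = len(kernel)
    out_size = len(grid) - k + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            s = sum(grid[i + a][j + b] * kernel[a][b]
                    for a in range(k) for b in range(k))
            row.append(max(0.0, s))
        out.append(row)
    return out


def make_kernels(n_kernels, size, seed=0):
    """Fixed random filters; they are never trained in this sketch."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]
            for _ in range(n_kernels)]


def conv_features(sample, kernels):
    """Flatten the feature maps of every kernel into one feature vector."""
    grid = to_square_grid(sample)
    feats = []
    for kernel in kernels:
        fmap = conv2d_relu(grid, kernel)
        feats.extend(v for row in fmap for v in row)
    return feats


def fit_xgboost_head(feature_rows, labels):
    """Train XGBoost on the convolutional features (requires numpy and xgboost)."""
    import numpy as np
    from xgboost import XGBClassifier
    model = XGBClassifier(n_estimators=50, max_depth=3)  # illustrative settings
    model.fit(np.asarray(feature_rows), np.asarray(labels))
    return model


kernels = make_kernels(n_kernels=4, size=2)
sample = [5.1, 3.5, 1.4, 0.2, 2.3]  # made-up tabular record with 5 attributes
features = conv_features(sample, kernels)
print(len(features))  # 4 kernels x (2 x 2) feature map = 16
```

In use, `conv_features` would be applied to every training record and the resulting rows passed to `fit_xgboost_head` together with the class labels; a deeper model would stack further `conv2d_relu` stages before the boosting step.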

Acknowledgement

S. Thongsuwan and S. Jaiyen would like to thank the Thailand Research Fund (TRF) for its support under grant number RTA6080013.

References

  1. Y. Guo, X. Jia, D. Paull, Effective sequential classifier training for multitemporal remote sensing image classification, IEEE Trans. Image Process. (2018), https://doi.org/10.1109/TIP.2018.2808767.
  2. Y. Wang, S. Liu, C. Chen, B. Zeng, A hierarchical approach for rain or snow removing in a single color image, IEEE Trans. Image Process. 26 (8) (2017) 3936-3950, https://doi.org/10.1109/TIP.2017.2708502.
  3. M.M. Mironczuk, J. Protasiewicz, A recent overview of the state-of-the-art elements of text classification, Expert Syst. Appl. 106 (2018) 36-54, https://doi.org/10.1016/j.eswa.2018.03.058. http://www.sciencedirect.com/science/article/pii/S095741741830215X.
  4. D. Isa, L.H. Lee, V.P. Kallimani, R. RajKumar, Text document preprocessing with the bayes formula for classification using the support vector machine, IEEE Trans. Knowl. Data Eng. 20 (9) (2008) 1264-1272, https://doi.org/10.1109/TKDE.2008.76.
  5. H. Sadreazami, A. Mohammadi, A. Asif, K.N. Plataniotis, Distributed-graph-based statistical approach for intrusion detection in cyber-physical systems, IEEE Trans. Signal Info. Process. Over Networks 4 (1) (2018) 137-147, https://doi.org/10.1109/TSIPN.2017.2749976.
  6. M.H. Ali, B.A.D.A. Mohammed, A. Ismail, M.F. Zolkipli, A new intrusion detection system based on fast learning network and particle swarm optimization, IEEE Access 6 (2018) 20255-20261, https://doi.org/10.1109/ACCESS.2018.2820092.
  7. M. Adam, E.Y. Ng, J.H. Tan, M.L. Heng, J.W. Tong, U.R. Acharya, Computer aided diagnosis of diabetic foot using infrared thermography: a review, Comput. Biol. Med. 91 (2017) 326-336, https://doi.org/10.1016/j.compbiomed.2017.10.030. http://www.sciencedirect.com/science/article/pii/S0010482517303566.
  8. H. Muller, D. Unay, Retrieval from and understanding of large-scale multi-modal medical datasets: A review, IEEE Trans. Multimed. 19 (9) (2017) 2093-2104, https://doi.org/10.1109/TMM.2017.2729400.
  9. A. Care, F.A. Ramponi, M.C. Campi, A new classification algorithm with guaranteed sensitivity and specificity for medical applications, IEEE Control Syst. Lett. 2 (3) (2018) 393-398, https://doi.org/10.1109/LCSYS.2018.2840427.
  10. H. Zhu, X. Liu, R. Lu, H. Li, Efficient and privacy-preserving online medical prediagnosis framework using nonlinear SVM, IEEE J. Biomed. Health Info. 21 (3) (2017) 838-850, https://doi.org/10.1109/JBHI.2016.2548248.
  11. W. Hu, B. Wu, P. Wang, C. Yuan, Y. Li, S. Maybank, Context-dependent random walk graph kernels and tree pattern graph matching kernels with applications to action recognition, IEEE Trans. Image Process. 27 (10) (2018) 5060-5075, https://doi.org/10.1109/TIP.2018.2849885.
  12. A.A. Adewuyi, L.J. Hargrove, T.A. Kuiken, An analysis of intrinsic and extrinsic hand muscle EMG for improved pattern recognition control, IEEE Trans. Neural Syst. Rehabil. Eng. 24 (4) (2016) 485-494, https://doi.org/10.1109/TNSRE.2015.2424371.
  13. Y. Geng, Y. Ouyang, O.W. Samuel, S. Chen, X. Lu, C. Lin, G. Li, A robust sparse representation based pattern recognition approach for myoelectric control, IEEE Access 6 (2018) 38326-38335, https://doi.org/10.1109/ACCESS.2018.2851282.
  14. E. Tu, N. Kasabov, J. Yang, Mapping temporal variables into the neucube for improved pattern recognition, predictive modeling, and understanding of stream data, IEEE Trans. Neural Networks Learning Syst. 28 (6) (2017) 1305-1317, https://doi.org/10.1109/TNNLS.2016.2536742.
  15. L.D.W. Thomas, A. Leiponen, Big data commercialization, IEEE Eng. Manag. Rev. 44 (2) (2016) 74-90, https://doi.org/10.1109/EMR.2016.2568798.
  16. F. Liang, W. Yu, D. An, Q. Yang, X. Fu, W. Zhao, A survey on big data market: Pricing, trading and protection, IEEE Access 6 (2018) 15132-15154, https://doi.org/10.1109/ACCESS.2018.2806881.
  17. N. Chawla, D. Davis, Bringing big data to personalized healthcare: A patient-centered framework, J. Gen. Intern. Med. 28 (suppl 3) (2013) 1-7, https://doi.org/10.1007/s11606-013-2455-8.
  18. N. Kruger, P. Janssen, S. Kalkan, M. Lappe, A. Leonardis, J. Piater, A.J. Rodriguez-Sanchez, L. Wiskott, Deep hierarchies in the primate visual cortex: What can we learn for computer vision? IEEE Trans. Pattern Anal. Mach. Intell. 35 (8) (2013) 1847-1871, https://doi.org/10.1109/TPAMI.2012.272.
  19. A. Brunetti, D. Buongiorno, G.F. Trotta, V. Bevilacqua, Computer vision and deep learning techniques for pedestrian detection and tracking: A survey, Neurocomputing 300 (2018) 17-33, https://doi.org/10.1016/j.neucom.2018.01.092. http://www.sciencedirect.com/science/article/pii/S092523121830290X.
  20. J. Thevenot, M.B. Lopez, A. Hadid, A survey on computer vision for assistive medical diagnosis from faces, IEEE J. Biomed. Health Info. 22 (5) (2018) 1497-1511, https://doi.org/10.1109/JBHI.2017.2754861.
  21. T. Young, D. Hazarika, S. Poria, E. Cambria, Recent Trends in Deep Learning Based Natural Language Processing. ArXiv e-prints arXiv:1708.02709.
  22. J. Choo, S. Liu, Visual Analytics for Explainable Deep Learning. ArXiv e-prints arXiv:1804.02527.
  23. J. Lemley, S. Bazrafkan, P. Corcoran, Deep learning for consumer devices and services: Pushing the limits for machine learning, artificial intelligence, and computer vision, IEEE Consumer Electron. Mag. 6 (2) (2017) 48-56, https://doi.org/10.1109/MCE.2016.2640698.
  24. Y. Bengio, A. Courville, P. Vincent, Representation learning: A review and new perspectives, IEEE Trans. Pattern Anal. Mach. Intell. 35 (8) (2013) 1798-1828, https://doi.org/10.1109/TPAMI.2013.50.
  25. H.A. Gohel, H. Upadhyay, L. Lagos, K. Cooper, A. Sanzetenea, Predictive maintenance architecture development for nuclear infrastructure using machine learning, Nucl. Eng. Technol. (2019), https://doi.org/10.1016/j.net.2019.12.029. http://www.sciencedirect.com/science/article/pii/S1738573319306783.
  26. Y.D. Koo, Y.J. An, C.-H. Kim, M.G. Na, Nuclear reactor vessel water level prediction during severe accidents using deep neural networks, Nucl. Eng. Technol. 51 (3) (2019) 723-730, https://doi.org/10.1016/j.net.2018.12.019. http://www.sciencedirect.com/science/article/pii/S1738573318307861.
  27. K. Malik, M. Zbikowski, A. Teodorczyk, Detonation cell size model based on deep neural network for hydrogen, methane and propane mixtures with air and oxygen, Nucl. Eng. Technol. 51 (2) (2019) 424-431, https://doi.org/10.1016/j.net.2018.11.004. http://www.sciencedirect.com/science/article/pii/S1738573318305953.
  28. J. Park, S.-J. Han, N. Munir, Y.-T. Yeom, S.-J. Song, H.-J. Kim, S.-G. Kwon, MRPC eddy current flaw classification in tubes using deep neural networks, Nucl. Eng. Technol. 51 (7) (2019) 1784-1790, https://doi.org/10.1016/j.net.2019.05.011. http://www.sciencedirect.com/science/article/pii/S1738573319302414.
  29. Y. Lecun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86 (11) (1998) 2278-2324, https://doi.org/10.1109/5.726791.
  30. T. Chen, C. Guestrin, XGBoost: A Scalable Tree Boosting System, 2016. ArXiv e-prints arXiv:1603.02754.
  31. L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and Regression Trees, Wadsworth and Brooks, Monterey, CA, 1984.
  32. G.E. Hinton, Connectionist learning procedures, Artif. Intell. 40 (1) (1989) 185-234, https://doi.org/10.1016/0004-3702(89)90049-0. http://www.sciencedirect.com/science/article/pii/0004370289900490.
  33. C.-C. Chang, C.-J. Lin, LIBSVM: A library for support vector machines, ACM Trans. Intell. Syst. Technol. 2 (2011) 27:1-27:27, software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm.
  34. P. Murugan, Feed Forward and Backward Run in Deep Convolution Neural Network, 2019. ArXiv e-prints arXiv:1711.03278.
  35. D. Stutz, Understanding convolutional neural networks, Seminar report, Fakultat fur Mathematik, Informatik und Naturwissenschaften Lehr-und Forschungsgebiet Informatik VIII Computer Vision (2014).
  36. V. Papyan, Y. Romano, J. Sulam, M. Elad, Theoretical foundations of deep learning via sparse representations: A multilayer sparse model and its connection to convolutional neural networks, IEEE Signal Process. Mag. 35 (4) (2018) 72-89, https://doi.org/10.1109/MSP.2018.2820224.
  37. A. Asuncion, D. Newman, UCI Machine Learning Repository, 2007. http://www.ics.uci.edu/~mlearn/MLRepository.html.
  38. D. Dheeru, E. Karra Taniskidou, UCI Machine Learning Repository, 2017. http://archive.ics.uci.edu/ml.
  39. W.H. Wolberg, O.L. Mangasarian, Multisurface method of pattern separation for medical diagnosis applied to breast cytology, Proceed. Natl. Acad. Sci. U. S. A. 87 (23) (1991) 9193-9194, https://doi.org/10.1073/pnas.87.23.9193.
  40. K. Diaz-Chito, A. Hernandez-Sabate, A.M. Lopez, A reduced feature set for driver head pose estimation, Appl. Soft Comput. 45 (C) (2016) 98-107, https://doi.org/10.1016/j.asoc.2016.04.027.
  41. M.A. Little, P. Mcsharry, S. Roberts, D.A.E. Costello, I.M. Moroz, Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection, Biomed. Eng. Online 6 (2007) 23, https://doi.org/10.1186/1475-925X-6-23.
  42. M. Kamel, T. Ringsted, D. Ballabio, R. Todeschini, V. Consonni, Quantitative structure-activity relationship models for ready biodegradability of chemicals, J. Chem. Inf. Model. 53 (2013) 867-878, https://doi.org/10.1021/ci4000213.
  43. M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G.S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, X. Zheng, TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, 2015 software available from: tensorflow.org, http://tensorflow.org/.
  44. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: Machine learning in Python, J. Mach. Learn. Res. 12 (2011) 2825-2830.

Cited by

  1. Classification of Retail Products: From Probabilistic Ranking to Neural Networks vol.11, pp.9, 2021, https://doi.org/10.3390/app11094117
  2. Ultrasonic Assessment of Thickness and Bonding Quality of Coating Layer Based on Short-Time Fourier Transform and Convolutional Neural Networks vol.11, pp.8, 2021, https://doi.org/10.3390/coatings11080909
  3. Development of Subsurface Geological Cross-Section from Limited Site-Specific Boreholes and Prior Geological Knowledge Using Iterative Convolution XGBoost vol.147, pp.9, 2021, https://doi.org/10.1061/(asce)gt.1943-5606.0002583
  4. Polyhedral separation via difference of convex (DC) programming vol.25, pp.19, 2021, https://doi.org/10.1007/s00500-021-05758-6
  5. Development and Validation of an Efficient MRI Radiomics Signature for Improving the Predictive Performance of 1p/19q Co-Deletion in Lower-Grade Gliomas vol.13, pp.21, 2021, https://doi.org/10.3390/cancers13215398
  6. Mathematical model on the effects of conductor thickness on the centre frequency at 28 GHz for the performance of microstrip patch antenna using air substrate for 5G application vol.60, pp.6, 2021, https://doi.org/10.1016/j.aej.2021.04.050