Enhancing Heart Disease Prediction Accuracy through Soft Voting Ensemble Techniques

  • Byung-Joo Kim (Dept. EE, Youngsan University)
  • Received : 2024.06.30
  • Reviewed : 2024.07.13
  • Published : 2024.08.31

Abstract

We investigate ensemble learning, specifically soft voting, as a means of improving heart disease prediction accuracy. Our study combines Logistic Regression, an SVM with an RBF kernel, and Random Forest in a soft voting ensemble, and we show that the ensemble outperforms each individual model in diagnosing heart disease. Our contributions are threefold: we apply normalization and optimization techniques to a well-curated dataset, conduct a comparative analysis of the machine learning models, and demonstrate the superior performance of the soft voting ensemble in medical diagnosis. We evaluate the models on accuracy, precision, recall, F1 score, and area under the ROC curve (AUC). The results indicate that the soft voting ensemble achieves higher accuracy and robustness in heart disease prediction than the individual classifiers. These findings advance the application of machine learning in medical diagnostics and have implications for the early detection and management of heart disease, potentially contributing to better patient outcomes and more efficient allocation of healthcare resources.
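As a minimal illustration of the soft voting idea named in the abstract, the sketch below averages the class probabilities produced by several models and picks the class with the highest average, then contrasts the result with a plain majority (hard) vote. It is not the paper's implementation: the function names `soft_vote` and `hard_vote`, the equal default weights, and the three probability vectors are all assumptions made up for illustration.

```python
from collections import Counter


def soft_vote(probas, weights=None):
    """Average per-class probabilities across models and return the argmax.

    probas  : list of per-model probability vectors, e.g. [[p_no, p_yes], ...]
    weights : optional per-model weights (assumed equal when omitted)
    """
    if weights is None:
        weights = [1.0] * len(probas)
    total = sum(weights)
    n_classes = len(probas[0])
    # Weighted mean of each class probability over all models.
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged


def hard_vote(probas):
    """Majority vote over each model's own most likely class, for comparison."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in probas]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical [P(no disease), P(disease)] outputs for a single patient.
patient = [
    [0.55, 0.45],  # stand-in for Logistic Regression (made-up values)
    [0.60, 0.40],  # stand-in for SVM with RBF kernel (made-up values)
    [0.10, 0.90],  # stand-in for Random Forest (made-up values)
]

soft_label, soft_probs = soft_vote(patient)
hard_label = hard_vote(patient)
print("soft vote:", soft_label, [round(p, 3) for p in soft_probs])
print("hard vote:", hard_label)
```

In this made-up case two models lean weakly toward "no disease" while one is strongly confident of "disease", so the majority vote and the probability average disagree. Soft voting takes each model's confidence into account, whereas hard voting counts only the predicted labels.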
