Exploring Predictive Models for Student Success in National Physical Therapy Examination: Machine Learning Approach

  • Bokyung Kim (Dept. of Physical Therapy, Changshin University) ;
  • Yeonseop Lee (Dept. of Physical Therapy, Daewon University) ;
  • Jang-hoon Shin (Industry-Academy Cooperation Foundation, Sahmyook University) ;
  • Yusung Jang (Dept. of Physical Therapy, Gangdong University) ;
  • Wansuk Choi (Dept. of Physical Therapy, Kyungwoon University)
  • Received : 2024.08.23
  • Accepted : 2024.09.27
  • Published : 2024.10.31

Abstract

This study aims to assess the effectiveness of machine learning models in predicting the pass rates of physical therapy students in national exams. Traditional grade prediction methods primarily rely on past academic performance or demographic data. However, this study employed machine learning and deep learning techniques to analyze mock test scores with the goal of improving prediction accuracy. Data from 1,242 students across five Korean universities were collected and preprocessed, followed by analysis using various models. Models, including those generated and fine-tuned with the assistance of ChatGPT-4, were applied to the dataset. The results showed that H2OAutoML (GBM2) performed the best with an accuracy of 98.4%, while TabNet, LightGBM, and RandomForest also demonstrated high performance. This study demonstrates the exceptional effectiveness of H2OAutoML (GBM2) in predicting national exam pass rates and suggests that these AI-assisted models can significantly contribute to medical education and policy.

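For readers unfamiliar with the workflow summarized in the abstract, the sketch below shows how a pass/fail classifier for mock exam data could be set up with H2O AutoML, whose default search space includes the GBM models reported here. This is a minimal illustration under assumed inputs, not the authors' code: the file name mock_exam_scores.csv, the column name "passed", and all settings are hypothetical.

```python
import h2o
from h2o.automl import H2OAutoML

h2o.init()

# Load per-student mock exam scores with a binary pass/fail label
# (hypothetical file and column names).
frame = h2o.import_file("mock_exam_scores.csv")
target = "passed"
frame[target] = frame[target].asfactor()  # treat the label as a categorical outcome

# Hold out 20% of students for evaluation.
train, test = frame.split_frame(ratios=[0.8], seed=42)

# Let AutoML search over candidate models (GBM, random forest variants, etc.).
aml = H2OAutoML(max_models=20, seed=42, sort_metric="AUC")
aml.train(y=target, training_frame=train)

# Inspect the leaderboard and evaluate the leading model on the held-out set.
print(aml.leaderboard.head())
perf = aml.leader.model_performance(test)
print(perf.accuracy())  # [[threshold, accuracy]] at the accuracy-maximizing threshold
```

Separate models such as TabNet, LightGBM, or RandomForest would be fit outside of AutoML on the same train/test split and compared on the same held-out metric.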

Acknowledgement

This research was supported by a Research Grant of Kyungwoon University in 2024.
