• Title/Summary/Keyword: artificial intelligence-based model

Search Result 1,231, Processing Time 0.024 seconds

Analyzing Mathematical Performances of ChatGPT: Focusing on the Solution of National Assessment of Educational Achievement and the College Scholastic Ability Test (ChatGPT의 수학적 성능 분석: 국가수준 학업성취도 평가 및 대학수학능력시험 수학 문제 풀이를 중심으로)

  • Kwon, Oh Nam;Oh, Se Jun;Yoon, Jungeun;Lee, Kyungwon;Shin, Byoung Chul;Jung, Won
    • Communications of Mathematical Education
    • /
    • v.37 no.2
    • /
    • pp.233-256
    • /
    • 2023
  • This study conducted foundational research to derive ways to use ChatGPT in mathematics education by analyzing ChatGPT's responses to questions from the National Assessment of Educational Achievement (NAEA) and the College Scholastic Ability Test (CSAT). ChatGPT, a generative artificial intelligence model, has gained attention in various fields, and there is a growing demand for its use in education as the number of users rapidly increases. To the best of our knowledge, there are very few reported cases of educational studies utilizing ChatGPT. In this study, we analyzed ChatGPT 3.5 responses to questions from the three-year National Assessment of Educational Achievement and the College Scholastic Ability Test, categorizing them based on the percentage of correct answers, the accuracy of the solution process, and types of errors. The correct answer rates for ChatGPT in the National Assessment of Educational Achievement and the College Scholastic Ability Test questions were 37.1% and 15.97%, respectively. The accuracy of ChatGPT's solution process was calculated as 3.44 for the National Assessment of Educational Achievement and 2.49 for the College Scholastic Ability Test. Errors in solving math problems with ChatGPT were classified into procedural and functional errors. Procedural errors referred to mistakes in connecting expressions to the next step or in calculations, while functional errors were related to how ChatGPT recognized, judged, and outputted text. This analysis suggests that relying solely on the percentage of correct answers should not be the criterion for assessing ChatGPT's mathematical performance, but rather a combination of the accuracy of the solution process and types of errors should be considered.