• Title/Summary/Keyword: scoring descriptive assessment

Automatic scoring of mathematics descriptive assessment using random forest algorithm (랜덤 포레스트 알고리즘을 활용한 수학 서술형 자동 채점)

  • Inyong Choi;Hwa Kyung Kim;In Woo Chung;Min Ho Song
    • The Mathematical Education / v.63 no.2 / pp.165-186 / 2024
  • Despite growing attention to artificial intelligence-based automated scoring technology as a means of supporting the introduction of descriptive items in school environments and large-scale assessments, there is a noticeable lack of foundational research in mathematics compared to other subjects. This study developed an automated scoring model for two descriptive items in first-year middle school mathematics using the Random Forest algorithm, evaluated its performance, and explored ways to enhance this performance. The accuracy of the final models for the two items ranged from 0.95 to 1.00 and from 0.73 to 0.89, respectively, which is relatively high compared to automated scoring models in other subjects. We found that strategically selecting the number of evaluation categories, taking into account the amount of data, is crucial for the effective development and performance of automated scoring models. Additionally, text preprocessing by mathematics education experts proved effective in improving both the performance and interpretability of the automated scoring model. Selecting a vectorization method that matches the characteristics of the items and data was identified as one way to enhance model performance. Furthermore, we confirmed that oversampling is a useful way to supplement performance when practical limitations hinder balanced data collection. To enhance educational utility, further research is needed on how to use the feature importance derived from the Random Forest-based automated scoring model to generate useful information for teaching and learning, such as feedback. This study is significant as foundational research on automated scoring of descriptive items in mathematics, and various follow-up studies are needed through close collaboration between AI experts and mathematics education experts.
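
The oversampling step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which oversampling method the authors used, so random duplication of minority-class responses is assumed here. The function name `random_oversample`, the sample responses, and the score labels are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class.

    Assumption: plain random oversampling; the study's actual method
    is not stated in the abstract.
    """
    rng = random.Random(seed)

    # Group the samples by their score category.
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical student responses and score categories (not from the study).
responses = ["x = 3", "x = 3 because 2x = 6", "x = 2", "x = 3, checked", "no answer"]
scores = [2, 2, 0, 2, 0]

balanced_responses, balanced_scores = random_oversample(responses, scores)
print(Counter(balanced_scores))
```

In practice, oversampling of this kind is applied only to the training split, so that duplicated responses do not leak into the data used to evaluate the model.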

Exploring automatic scoring of mathematical descriptive assessment using prompt engineering with the GPT-4 model: Focused on permutations and combinations (프롬프트 엔지니어링을 통한 GPT-4 모델의 수학 서술형 평가 자동 채점 탐색: 순열과 조합을 중심으로)

  • Byoungchul Shin;Junsu Lee;Yunjoo Yoo
    • The Mathematical Education / v.63 no.2 / pp.187-207 / 2024
  • In this study, we explored the feasibility of automatically scoring descriptive assessment items with GPT-4-based ChatGPT by comparing its scoring results with those of teachers. For this purpose, three descriptive items from the permutation and combination unit for first-year high school students were selected from the KICE (Korea Institute for Curriculum and Evaluation) website. Items 1 and 2 had only one problem-solving strategy, while Item 3 had more than two strategies. Two teachers, each with over eight years of educational experience, graded answers from 204 students, and their scores were compared with the results from GPT-4-based ChatGPT. Various techniques, such as Few-Shot-CoT, SC, structured, and iterative prompting, were used to construct the scoring prompts, which were then entered into GPT-4-based ChatGPT. The scoring results for Items 1 and 2 showed a strong correlation between the teachers' and GPT-4's scoring. For Item 3, which involved multiple problem-solving strategies, the student answers were first classified by strategy using prompts entered into GPT-4-based ChatGPT. Scoring prompts tailored to each strategy type were then applied, and these results also showed a strong correlation with the teachers' scoring. These findings confirmed the potential of GPT-4 models with prompt engineering to assist teachers in scoring. The limitations of the study and directions for future research are also presented.

A Study on Descriptive Assessment of Mathematics in Russia's Unified State Examination (러시아의 국가통합시험에서 수학교과의 서술형 평가 연구)

  • Han, Inki;Shin, Vladimir
    • Journal of Science Education / v.46 no.1 / pp.121-149 / 2022
  • Descriptive assessment is a meaningful assessment method for the problem-solving, reasoning, and communication abilities emphasized in the mathematics curriculum. In Korea, as performance assessment has been emphasized since the 7th mathematics curriculum, descriptive assessment has been conducted in schools as a method of performance assessment. However, descriptive assessment has not been introduced in the university scholastic ability test for various reasons. Considering that descriptive assessment is emphasized in the mathematics classroom and has sufficient educational value, a serious discussion on implementing it in the university scholastic ability test is necessary. In this study, we analyzed the descriptive assessment in the mathematics section of Russia's Unified State Examination (USE), which corresponds to Korea's university scholastic ability test. Through a literature review, we investigated how the mathematics examination problems in the USE were structured and which mathematical abilities the examination required. In particular, the outer structure of the problems was analyzed with a focus on the mathematics problems of the USE 2021, and the scoring method for the descriptive problems was also analyzed. The results of this study are expected to inform discussion of the possibility of introducing descriptive assessment in the Korean university scholastic ability test.

A Study on Development of Balanced Performance Assessment Tasks for Primary School Mathematics -Focused on 1, 2 Stage in the Primary School- (균형 있는 초등수학과 수행평가 과제 개발에 대한 연구 - 1, 2단계를 중심으로 -)

  • 정영옥
    • School Mathematics / v.3 no.2 / pp.325-354 / 2001
  • The study aims to develop balanced performance assessment tasks for primary school mathematics that can be implemented easily in primary schools. To this end, I suggest types of performance assessment tasks and a framework of assessment standards for balanced performance assessment, and describe the procedures for developing tasks and rubrics. The task types are journal writing, problem posing, constructed task, and descriptive task. In the framework of assessment standards, I suggest holistic scoring with four levels, classified according to the overall degree of excellence of students' performance with respect to the criteria of implication, reasoning, accuracy, and communication. I also analyse children's responses to the task “make a beautiful pattern” and present its assessment rubric and anchor papers for each level to illustrate the process of developing a rubric for holistic scoring. To reflect the viewpoints of children and their parents on the tasks, the responses in self-assessment and parent assessment are analysed. Finally, methods of implementing the assessment tasks and related considerations are discussed.

The defects of questions of descriptive assessment in elementary school mathematics and the suggestions for its improvement -focusing on the questions produced by Gyeonggi Provincial Office of Education (초등 수학과 서술형 평가문항의 문제점과 개선방안 -경기도 교육청 창의.서술형 평가 문항을 중심으로-)

  • Chang, Suchin;Kim, Soomi
    • Journal of Elementary Mathematics Education in Korea / v.18 no.2 / pp.297-318 / 2014
  • This study is designed to help elementary school teachers gain insight into writing or choosing questions for descriptive assessment in mathematics. To this end, we analyzed 30 descriptive mathematics questions produced by the Gyeonggi Provincial Office of Education in 2011 and 2012, together with 3rd to 6th grade students' papers from 2 elementary schools in Gyeonggi Province, as marked by their teachers in charge. The analysis focuses on errors in students' answers and teachers' marking that arise not from their own mistakes but from defects in the questions themselves. As a result, 7 problematic cases were identified and reorganized into 3 categories as follows: i) cases that do not serve the unique purpose of descriptive assessment, ii) cases that raise problems of fairness in grading, and iii) cases that lead students in an erroneous direction.

Critical Thinking Disposition, Problem Solving Process, and Simulation-Based Assessment of Clinical Competence of Nursing Students in Pediatric Nursing (간호대학생의 비판적 사고성향, 문제해결과정 정도 및 아동간호 시뮬레이션 기반 임상수행능력)

  • Kim, Sunghee;Nam, Hyuna;Kim, Miok
    • Child Health Nursing Research / v.20 no.4 / pp.294-303 / 2014
  • Purpose: The purpose of this study was to identify the correlations among critical thinking disposition, problem solving process, and simulation-based assessment of clinical competence, based on a survey of college nursing students. Methods: In this descriptive correlation study, data for 214 nursing students were analyzed using t-tests and Pearson correlation coefficients. Results: Critical thinking disposition, problem solving process, and simulation-based assessment of clinical competence averaged $3.76{\pm}0.46$ (out of 5), $3.67{\pm}0.47$ (out of 5), and $1.51{\pm}0.17$ (out of 2), respectively. A significant difference in scores for simulation-based assessment of clinical competence was found between the high-scoring and low-scoring groups in critical thinking disposition. A significant positive correlation was found between critical thinking disposition and nursing assessment, a sub-domain of clinical competence. Conclusion: The results suggest that success in simulation-based learning requires critical thinking disposition in nursing students, and that this disposition plays a positive role in nursing assessment, which evaluates the patient's status in a complex situation. Simulation-based learning programs help assess students' levels of clinical judgement and performance and identify their strengths and weaknesses, so that the instructor can evaluate and improve the current teaching method.

The Relationship among Knowledge of the SBAR, Attitudes towards SBAR and Critical Thinking Disposition for Nursing Students (SBAR 사용능력, SBAR 이용인식 및 비판적 사고성향 간의 관계)

  • Lee, Oi Sun;Noh, Yoon Goo
    • Journal of Digital Convergence / v.17 no.9 / pp.213-220 / 2019
  • The purpose of this study was to identify the relationship among knowledge of the SBAR (Situation-Background-Assessment-Recommendation), attitudes towards SBAR, and critical thinking disposition of nursing students. Subjects were 101 associate nursing students (3rd). The data were collected using a self-report questionnaire from August 31 to October 26, 2018. Data were analyzed by descriptive statistics, t-test, ANOVA, and Pearson's correlation using SPSS Win 23. The mean scores were 3.26 for knowledge of the SBAR, 3.31 for attitudes towards SBAR, and 3.50 for critical thinking disposition. Knowledge of the SBAR (r=.46, p<.001) and attitudes towards SBAR (r=.23, p=.023) were significantly positively correlated with critical thinking disposition in nursing students. Therefore, to increase the critical thinking disposition of nursing students, it is necessary to develop a program that increases their knowledge of the SBAR and improves their attitudes towards SBAR.

Evaluating the Primary Care Quality of a Public Health Center in a Rural Area (농촌 지역 보건소 일차의료의 질 평가)

  • Byeon, Young-Kwan;Choi, Yong-Jun
    • Journal of agricultural medicine and community health / v.42 no.1 / pp.24-35 / 2017
  • Objectives: This study aimed to evaluate the primary care quality of a public health center in a rural area using the Korean Primary Care Assessment Tool (KPCAT). It also examined some methodological issues in applying the KPCAT and interpreting its results. Methods: Seventy-nine patients who had visited their doctor more than four times responded to the KPCAT questionnaire. Descriptive statistics and a radar chart were used to analyze the data. The sign test was used to test the difference in KPCAT scores between methods of scoring the "don't know" option. Results: The median and interquartile range of the public health center's KPCAT scores were forty-five and sixteen points, respectively. Only the median of the first contact domain reached the expected value of seventy-five points. The proportions of those who scored under the expected value were under fifty percent in two of four comprehensiveness items, all three coordinating function items, two of five personalized items, and all four family/community orientation items. There were some methodological issues, including how to score the "don't know" option and how to ensure consistency of the response scale. Conclusions: There was much room to improve the primary care quality of the rural public health center. In particular, improvement is needed in the domains of coordinating function and family/community orientation. We also hope that methodological improvement of the KPCAT contributes to more valid and reliable primary care assessment.