A study on the application of the M2PL-Q model for analyzing assessment data considering both content and cognitive domains: An analysis of TIMSS 2019 mathematics data

  • Kim, Rae Yeong (Seoul National University)
  • Hwang, Su Bhin (Graduate School of Seoul National University)
  • Lee, Seul Gi (Graduate School of Seoul National University)
  • Yoo, Yun Joo (Seoul National University)
  • Received: 2024.08.26
  • Accepted: 2024.09.26
  • Published: 2024.09.30

Abstract

This study proposes a method for analyzing mathematics assessment data that integrates both content and cognitive domains, using the multidimensional two-parameter logistic model with a Q-matrix (M2PL-Q; da Silva et al., 2019). The method was applied to the TIMSS 2019 eighth-grade mathematics assessment data. The results show that the M2PL-Q model estimates students' ability levels across both domains and captures how the abilities in the two domains are interrelated. The model also estimates item characteristics separately for the content and cognitive domains, revealing that the influence of each domain on problem solving can vary across items. This study is significant in that it offers a comprehensive analytical approach incorporating both content and cognitive domains, which have traditionally been analyzed separately. Using the estimated ability levels for individual diagnosis makes it possible to identify students' strengths and weaknesses in specific content and cognitive areas and to support targeted learning interventions. Furthermore, taking the detailed characteristics of each item into account and using items appropriately for the context and purpose of an assessment can enhance the validity and efficiency of assessments and yield more accurate diagnoses of students' ability levels.
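
For readers who want to experiment with this class of model, the sketch below shows one way an M2PL-Q-style model can be specified in JAGS through the R2jags interface, both of which appear in the reference list (Plummer, 2017; Su & Yajima, 2020). In the compensatory M2PL-Q, the probability that student i answers item j correctly is modeled as P(Y_ij = 1) = logistic(sum_k q_jk * a_jk * theta_ik + d_j), so the Q-matrix entries q_jk determine which content and cognitive dimensions enter each item. This is a minimal illustration under assumed inputs, not the analysis code used in the study: Y, Q, and all other names are placeholders, and a real analysis would add identification constraints, convergence diagnostics, and handling of TIMSS's matrix-sampled missing responses.

    # Minimal M2PL-Q sketch with R2jags. Assumes Y is an N x J matrix of 0/1
    # responses and Q is a J x K indicator matrix linking items to the K
    # content/cognitive dimensions. All names are illustrative placeholders.
    library(R2jags)

    m2plq_model <- function() {
      for (j in 1:J) {
        d[j] ~ dnorm(0, 0.25)              # item intercept (easiness)
        for (k in 1:K) {
          a[j, k] ~ dlnorm(0, 1)           # positive discrimination parameter
          aq[j, k] <- Q[j, k] * a[j, k]    # Q-matrix zeroes out unused dimensions
        }
      }
      Omega[1:K, 1:K] ~ dwish(R0[, ], df)  # precision prior on ability dimensions
      for (i in 1:N) {
        theta[i, 1:K] ~ dmnorm(mu0[], Omega[, ])   # correlated domain abilities
        for (j in 1:J) {
          logit(p[i, j]) <- inprod(aq[j, ], theta[i, ]) + d[j]
          Y[i, j] ~ dbern(p[i, j])
        }
      }
    }

    # Y and Q must be supplied by the user; everything else is derived.
    K <- ncol(Q)
    jags_data <- list(Y = Y, Q = Q, N = nrow(Y), J = ncol(Y), K = K,
                      mu0 = rep(0, K), R0 = diag(K), df = K + 1)
    fit <- jags(data = jags_data, inits = NULL,
                parameters.to.save = c("theta", "a", "d", "Omega"),
                model.file = m2plq_model, n.chains = 2, n.iter = 5000)
    print(fit)

Placing a Wishart prior on the ability precision matrix lets the correlations between the domain abilities be estimated rather than fixed, which matches the study's interest in how abilities in the content and cognitive domains are interrelated.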

References

  1. Ministry of Education (2022). Mathematics curriculum. Notification of Ministry of Education No. 2022-33 [Vol 8].
  2. Kwon, J. R. (2024). Analysis of the trend of mathematical achievement of students according to school grade change in TIMSS. Communications of Mathematical Education, 38(2), 121-144. https://doi.org/10.7468/jksmee.2024.38.2.121
  3. Park, J. H., & Kim, S. (2015). The analysis of characteristic achievement of TIMSS 2011 G8 high-performing countries according to the mathematics cognitive attributes. Journal of Educational Research in Mathematics, 25(3), 303-321.
  4. Sang, K. A., Kim, K. H., Park, S. W., Jeon, S. K., Park, M. M., & Lee, J. W. (2020). An international comparative study on the trend of mathematical and scientific achievement: TIMSS 2019 (RRE 2020-10). Korea Institute of Curriculum and Evaluation.
  5. Song, M. Y., & Kim, S. H. (2007). Investigating the hierarchical nature of content and cognitive domains in the mathematics curriculum for Korean middle school students via assessment items. School Mathematics, 9(2), 223-240.
  6. Lee, K. H., Yoo, Y. J., & Tak, B. (2021). Towards data-driven statistics education: An exploration of restructuring the mathematics curriculum. School Mathematics, 23(3), 361-386. https://doi.org/10.29275/sm.2021.09.23.3.361
  7. Rim, H., Kim, S. K., & Park, J. H. (2018). Development of assessment framework and items of NAEA considering the math competencies of the 2015 revised mathematics curriculum. School Mathematics, 20(1), 65-82. https://doi.org/10.29275/sm.2018.03.20.1.65
  8. Tak, B. (2018). An analysis on classifying and representing data as statistical literacy: Focusing on elementary mathematics curriculum for 1st and 2nd grades. Journal of Elementary Mathematics Education in Korea, 22(3), 221-240.
  9. Han, C., & Park, M. (2015). A comparison study on mathematics assessment frameworks: Focusing on NAEP 2015, TIMSS 2015 and PISA 2015. The Mathematics Education, 54(3), 261-282. https://doi.org/10.7468/mathedu.2015.54.3.261
  10. Ackerman, T. A. (1994). Using multidimensional item response theory to understand what items and tests are measuring. Applied Measurement in Education, 7(4), 255-278. https://doi.org/10.1207/s15324818ame0704_1
  11. Anderson, L. W., Krathwohl, D. R., Airasian, P. W., Cruikshank, K. A., & Wittrock, M. C. (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom's taxonomy of educational objectives. Longman.
  12. Arikan, S. (2015). Construct validity of TIMSS 2011 mathematics cognitive domains for Turkish students. International Online Journal of Educational Sciences, 7(1), 29-44.
  13. Balfaqeeh, A., Mansour, N., & Forawi, S. (2022). Factors influencing students' achievements in the content and cognitive domains in TIMSS 4th grade science and mathematics in the United Arab Emirates. Education Sciences, 12(9), 618. https://doi.org/10.3390/educsci12090618
  14. Buck, G. (1994). The appropriacy of psychometric measurement models for testing second language listening comprehension. Language Testing, 11(2), 145-170. https://doi.org/10.1177/026553229401100204
  15. da Silva, M. A., Liu, R., Huggins-Manley, A. C., & Bazan, J. L. (2019). Incorporating the Q-matrix into multidimensional item response theory models. Educational and Psychological Measurement, 79(4), 665-687. https://doi.org/10.1177/0013164418814898
  16. Dancey, C. P., & Reidy, J. (2017). Statistics without maths for psychology. Pearson.
  17. Delil, A. (2019). How fifth graders are assessed through central exams in Turkey: A comparison with TIMSS 2019 Assessment Framework. International Online Journal of Educational Sciences, 11(3), 222-234.
  18. Embretson, S. E., & Reise, S. P. (2000). Item response theory as model-based measurement. In S. E. Embretson & S. P. Reise, Item response theory for psychologists (pp. 158-186). Lawrence Erlbaum Associates.
  19. Fishbein, B., Foy, P., & Tyack, L. (2020). Reviewing the TIMSS 2019 achievement item statistics. In M. O. Martin, M. von Davier, & I. V. S. Mullis (Eds.), Methods and procedures: TIMSS 2019 technical report (pp. 10.1-10.70). TIMSS & PIRLS International Study Center, Boston College. https://timssandpirls.bc.edu/timss2019/methods/chapter-10.html
  20. Foy, P., Fishbein, B., von Davier, M., & Yin, L. (2020). Implementing the TIMSS 2019 scaling methodology. In M. O. Martin, M. von Davier, & I. V. S. Mullis (Eds.), Methods and procedures: TIMSS 2019 technical report (pp. 12.1-12.146). TIMSS & PIRLS International Study Center, Boston College. https://timssandpirls.bc.edu/timss2019/methods/chapter-12.html
  21. George, A. C., & Robitzsch, A. (2018). Focusing on interactions between content and cognition: a new perspective on gender differences in mathematical sub-competencies. Applied Measurement in Education, 31(1), 79-97. https://doi.org/10.1080/08957347.2017.1391260
  22. Gierl, M. J., Bisanz, J., Bisanz, G. L., & Boughton, K. A. (2003). Identifying content and cognitive skills that produce gender differences in mathematics: A demonstration of the multidimensionality-based DIF analysis paradigm. Journal of Educational Measurement, 40(4), 281-306.
  23. Harks, B., Klieme, E., Hartig, J., & Leiss, D. (2014). Separating cognitive and content domains in mathematical competence. Educational Assessment, 19(4), 243-266. https://doi.org/10.1080/10627197.2014.964114
  24. Jang, Y. J. (2022). Reliability and validity evidence of diagnostic methods: Comparison of diagnostic classification models and item response theory-based methods [Unpublished doctoral dissertation, University of Minnesota].
  25. Moore, D. (1992). Teaching statistics as a respectable subject. In F. Gordon & S. Gordon (Eds.), Statistics for the twenty-first century (pp. 14-25). The Mathematical Association of America.
  26. Mullis, I. V. S., & Martin, M. O. (2017). TIMSS 2019 assessment frameworks. Boston College, TIMSS & PIRLS International Study Center.
  27. Mullis, I. V. S., Martin, M. O., Foy, P., Kelly, D. L., & Fishbein, B. (2020). TIMSS 2019 international results in mathematics and science. Boston College, TIMSS & PIRLS International Study Center.
  28. Natesan, P., Nandakumar, R., Minka, T., & Rubright, J. D. (2016). Bayesian prior choice in IRT estimation using MCMC and variational Bayes. Frontiers in Psychology, 7, 1422. https://doi.org/10.3389/fpsyg.2016.01422
  29. Niss, M. (2003). Mathematical competencies and the learning of mathematics: The Danish KOM project. In A. Gagatsis & S. Papastavridis (Eds.), Mediterranean Conference on Mathematical Education (pp. 115-124). Athens, Greece: Hellenic Mathematical Society and Cyprus Mathematical Society.
  30. Novikasari, I. (2016). The improvement of mathematics content knowledge on elementary school teacher candidates in problem based learning-models. International Journal of Education and Research, 4(17), 153-162.
  31. OECD. (2023). PISA 2022 assessment and analytical framework. OECD Publishing. https://doi.org/10.1787/19963777
  32. Plummer, M. (2017). JAGS Version 4.3.0 user manual. Retrieved from https://sourceforge.net/projects/mcmc-jags/files/Manuals/4.x/
  33. Shu, T., Luo, G., Luo, Z., Yu, X., Guo, X., & Li, Y. (2023). An explicit form with continuous attribute profile of the partial mastery DINA model. Journal of Educational and Behavioral Statistics, 48(5), 573-602.
  34. Su, Y. S., & Yajima, M. (2020). R2jags: Using R to run 'JAGS'. R package version 0.6-1. Retrieved from https://CRAN.R-project.org/package=R2jags
  35. Tatsuoka, K. K. (1983). Rule-space: An approach for dealing with misconceptions based on item response theory. Journal of Educational Measurement, 20(4), 345-354.
  36. von Davier, M. (2020). TIMSS 2019 scaling methodology: Item response theory, population models, and linking across modes. In M. O. Martin, M. von Davier, & I. V. S. Mullis (Eds.), Methods and procedures: TIMSS 2019 technical report (pp. 11.1-11.25). TIMSS & PIRLS International Study Center, Boston College. https://timssandpirls.bc.edu/timss2019/methods/chapter-11.html
  37. Wu, M., & Adams, R. (2006). Modelling mathematics problem solving item responses using a multidimensional IRT model. Mathematics Education Research Journal, 18(2), 93-113.
  38. Young, J. W., Cho, Y., Ling, G., Cline, F., Steinberg, J., & Stone, E. (2008). Validity and fairness of state standards-based assessments for English language learners. Educational Assessment, 13, 170-192.
  39. Zhang, J. (2004). Comparison of unidimensional and multidimensional approaches to IRT parameter estimation (ETS Research Report 04-44). Educational Testing Service.