Investigation of Various Reliability Indices of Pre-service Mathematics Teachers' Teaching Aptitude and Personality Test based on Setting Cut Scores

예비수학교사의 교직 적성·인성 검사에서 분할점수 변화에 따른 다양한 신뢰도 탐색

  • Received : 2018.01.15
  • Accepted : 2018.02.22
  • Published : 2018.02.28

Abstract

This study examines the relative influence of each error source and investigates the measurement conditions needed to obtain satisfactory values on multiple reliability coefficients for the teaching aptitude and personality test for pre-service teachers. Participants were 33 students enrolled in mathematics education at a graduate school of education in the Seoul metropolitan area from 2013 to 2017. The main results were as follows. First, the estimated variance component for the residual was the largest, followed by items nested within domains, graduate students, the interaction of graduate students with domains, and domains. Second, two designs reached acceptable reliability levels when dependability coefficients, cut-score dependability coefficients, adjusted dependability coefficients, and standard errors of measurement were considered jointly: a total of 96 items (12 domains with 8 items each) with a cut score of 598, and the original 210 items (14 domains with 15 items each) with cut scores of 615 or 716. Third, the larger the deviation between the arithmetic mean and the cut score, the higher the reliability coefficients of the test results. Finally, the study suggests how practitioners can apply generalizability theory to criterion-referenced tests and outlines directions for future research based on its limitations.
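The relationship between the cut score and the reliability indices can be illustrated with a minimal sketch. It assumes a persons x (items nested in domains) design with random items and domains, which matches the error sources listed in the abstract, and uses the standard generalizability-theory formulas for the dependability coefficient and the cut-score dependability coefficient (Brennan & Kane, 1977). The variance components, the mean, and the cut scores below are hypothetical placeholders on a mean-score scale, not the study's estimates, and the function names are illustrative only.

```python
import math

def absolute_error_variance(var, n_domains, n_items):
    """Absolute error variance sigma^2(Delta) for a p x (i:h) random design."""
    return (var["h"] / n_domains
            + var["i:h"] / (n_domains * n_items)
            + var["ph"] / n_domains
            + var["pi:h,e"] / (n_domains * n_items))

def phi(var, n_domains, n_items):
    """Dependability coefficient: person variance over person plus absolute error."""
    abs_err = absolute_error_variance(var, n_domains, n_items)
    return var["p"] / (var["p"] + abs_err)

def phi_lambda(var, n_domains, n_items, mean, cut):
    """Cut-score dependability coefficient for cut score `cut`.

    Uses the population form with the squared distance (mean - cut)^2;
    the small-sample bias correction for the observed mean is omitted.
    """
    abs_err = absolute_error_variance(var, n_domains, n_items)
    signal = var["p"] + (mean - cut) ** 2
    return signal / (signal + abs_err)

def sem(var, n_domains, n_items):
    """Absolute standard error of measurement on the mean-score scale."""
    return math.sqrt(absolute_error_variance(var, n_domains, n_items))

# Hypothetical variance components (NOT taken from the study).
components = {"p": 0.10, "h": 0.02, "i:h": 0.15, "ph": 0.05, "pi:h,e": 0.60}

# Compare hypothetical cut scores at increasing distance from a mean of 4.0.
for cut in (4.0, 3.5, 3.0):
    value = phi_lambda(components, n_domains=14, n_items=15, mean=4.0, cut=cut)
    print(f"cut={cut}: phi(lambda)={value:.3f}")
```

When the cut score equals the mean, the cut-score dependability coefficient reduces to the ordinary dependability coefficient, and it rises as the cut score moves away from the mean. This is consistent with the third finding reported above.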
