Verification of the Educational Goal of the Reading Area in the Korean SAT through Natural Language Processing Techniques

  • Lee, Soomin (Department of Computer Science and Engineering, Korea University) ;
  • Kim, Gyeongmin (Department of Computer Science and Engineering, Korea University) ;
  • Lim, Heuiseok (Department of Computer Science and Engineering, Korea University)
  • Received : 2021.10.07
  • Accepted : 2022.01.20
  • Published : 2022.01.28

Abstract

The main educational goal of the reading area, which accounts for a substantial share of the Korean language section of the College Scholastic Ability Test (Korean SAT), is to evaluate whether a given text has been fully understood. Whether the questions attached to a passage can be answered from that passage alone is therefore closely tied to the area's educational goal. In this study, we bring together the field of education and deep learning for the first time to test whether this goal actually holds in practice. We construct a dataset from the reading area of the Korean SAT by extracting and refining each passage together with the multiple sentence pairs that accompany it, and we frame the problem as a binary classification task in NLP: deciding whether a statement is appropriate (T) or not appropriate (F) in light of the given passage. When the language models were given only the passages in the dataset, most of them exceeded the human performance of 59.2% F1 score, with KoELECTRA reaching 62.49%. We also show that a structural limit of the language models can be eased by adjusting the data preprocessing.
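To make the task framing concrete, the sketch below pairs a passage with one candidate statement and scores it with a 2-way sequence-classification head on KoELECTRA. This is a minimal sketch, not the authors' released code: the checkpoint name (the public KoELECTRA checkpoint from reference 10), the maximum length, the label convention, and the example strings are illustrative assumptions.

```python
# Minimal sketch of the T/F passage-statement classification task described
# in the abstract. NOT the authors' code: checkpoint name, max length, and
# label convention are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "monologg/koelectra-base-v3-discriminator"  # public KoELECTRA (ref. 10)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# Attaches a freshly initialized 2-way classification head on the encoder;
# it must be fine-tuned on the extracted pairs before predictions mean anything.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

passage = "..."    # a reading-area passage from the exam
statement = "..."  # one answer-choice statement about that passage

# Encode the (passage, statement) pair as a single input. Encoder input is
# capped (512 tokens here), so long passages are truncated -- the kind of
# structural limit the abstract says can be eased by adjusting preprocessing.
inputs = tokenizer(passage, statement,
                   truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 2): scores for [F, T]

pred = logits.argmax(dim=-1).item()
print("appropriate (T)" if pred == 1 else "not appropriate (F)")
```

In this setup the reported 62.49% F1 for KoELECTRA would correspond to a model whose classification head has been fine-tuned on the passage-statement pairs extracted from the exam; the untrained head above only illustrates the input and output shapes of the task.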

Acknowledgement

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-0-01405) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2021R1A6A1A03045425).

References

  1. M. H. Roh. (2011). Reading: the concept and important issues of education. KEDI Research Paper, 43(3), 1-43.
  2. Ministry of Education. (2015, January). Korean Language Curriculum. 2015 Revised Curriculum, 5, 1-178.
  3. S. H. Kim. (2014). Analysis of Students' Recognition of National Scholastic Aptitude Test for University Admission: With Focus on the 'Korean Language Section'. Journal of CheongRam Korean Language Education, 49, 135-164.
  4. S. Y. Ryu. (2019). Critical Examination About CSAT Korean Language and Its Developmental Directions: Toward the Recovery of the Nature of the CSAT Evaluation. New Language Education, 121, 353-380.
  5. R. Sennrich, B. Haddow & A. Birch. (2016). Improving Neural Machine Translation Models with Monolingual Data. Association for Computational Linguistics, 1, 86-96.
  6. A. Vaswani et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 6000-6010.
  7. D. Powers, D. Escoffery & M. Duchnowski. (2015). Validating Automated Essay Scoring: A (Modest) Refinement of the "Gold Standard". Applied Measurement in Education, 28(2), 130-142. https://doi.org/10.1080/08957347.2014.1002920
  8. SKT Brain. (2019). KoBERT. Github Repository. https://github.com/SKTBRain/KoBERT
  9. K. Clark, M. Luong, Q. Le & C. Manning. (2020). ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv preprint arXiv:2003.10555.
  10. J. W. Park. (2020). KoELECTRA: Pretrained ELECTRA Model for Korean. Github Repository. https://github.com/monologg/KoELECTRA
  11. J. B. Lee. (2021). KcELECTRA: Korean comments ELECTRA. Github Repository. https://github.com/Beomi/KcELECTRA
  12. J. Wei & K. Zou. (2019). EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. Association for Computational Linguistics, 1, 6382-6388. DOI: 10.18653/v1/D19-1670
  13. P. Rajpurkar, J. Zhang, K. Lopyrev & P. Liang. (2016). SQuAD: 100,000+ Questions for Machine Comprehension of Text. EMNLP, 1, 2383-2392. DOI: 10.18653/v1/d16-1264
  14. Y. Y. Yang, S. W. Kang & J. Y. Seo. (2019). Improved Machine Reading Comprehension Using Data Validation for Weakly Labeled Data. IEEE Access, 8, 5667-5677. DOI: 10.1109/ACCESS.2019.2963569
  15. Y. R. Lee et al. (2009, July). Analysis of SAT and ACT. Seoul: KICE.
  16. C. Fellbaum. (2005). WordNet and wordnets. Oxford: Elsevier.
  17. J. H. Moon, H. C. Cho & E. J. Park. (2020). Revisiting Round-Trip Translation for Quality Estimation. http://arxiv.org/abs/2004.13937