A BERT-Based Automatic Scoring Model of Korean Language Learners' Essay

  • Lee, Jung Hee (Dept. of Korean Language Education as a Second Language, Kyung Hee University) ;
  • Park, Ji Su (Dept. of Computer Science and Engineering, Jeonju University) ;
  • Shon, Jin Gon (Dept. of Computer Science, Korea National Open University)
  • Received : 2021.09.13
  • Accepted : 2021.11.29
  • Published : 2022.04.30

Abstract

This research applies a pre-trained bidirectional encoder representations from transformers (BERT) language model to predict the writing scores of foreign learners of Korean. A corpus of 586 midterm and final exam answers written by foreign learners at the Intermediate 1 level was collected; because the model is pre-trained, it performs consistently even on small datasets. The data were pre-processed, the model was fine-tuned, and the results were produced as score predictions, from which the difference between each predicted and actual score was calculated. The model demonstrated an accuracy of 95.8%, indicating that the prediction results were strong overall; hence, the tool is suitable for automatically scoring Korean written test answers, including those with grammatical errors, produced by foreign learners. These results are particularly meaningful in that the data consist of written text produced by foreign learners rather than native speakers.
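The abstract compares predicted scores against the scores assigned by human raters and reports an overall accuracy figure. The sketch below shows one way such an agreement metric could be computed; the tolerance-based definition, the `scoring_accuracy` function name, and the sample scores are illustrative assumptions, not the paper's actual metric or data.

```python
def scoring_accuracy(predicted, actual, tolerance=0.5):
    """Fraction of essays whose predicted score falls within
    `tolerance` points of the human-assigned score.

    This is an assumed agreement metric for illustration; the paper
    reports 95.8% accuracy but may define the metric differently.
    """
    if len(predicted) != len(actual):
        raise ValueError("score lists must be the same length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(actual)


# Hypothetical predicted vs. rater scores (not from the paper's corpus).
pred = [4.5, 3.0, 5.0, 2.5]
gold = [4.5, 3.5, 5.0, 3.5]
print(scoring_accuracy(pred, gold))  # prints 0.75 (3 of 4 within 0.5 points)
```

In practice the predictions would come from a fine-tuned BERT model's output head, and a stricter or looser tolerance would be chosen to match the scoring rubric's granularity.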
