Search | Korea Science

Difficulty-adjustable Phrase-level Cloze Question Generation System (난이도 조절 가능한 어구 단위 빈칸 추론 문항 생성 시스템)

Seokhoon Kang;Gary Geunbae Lee
- Annual Conference on Human and Language Technology / 2023.10a / pp.113-118 / 2023
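Language models based on deep learning are used in many fields, and in education there has been a steady demand for automatically generating test items. Cloze (fill-in-the-blank inference) items, and phrase-level cloze items in particular, are widely used for learning and assessment, yet research on generating them automatically is relatively scarce. This study therefore proposes a difficulty-adjustable phrase-level cloze question generation system based on masked language modeling (MLM). The system generates distractors by deleting important phrases from the passage according to the attention information of the answer-generation model, and it can produce easier or harder distractors by adjusting the proportion of phrases deleted. In the evaluation, the proposed system's distractors were up to 28.3% less similar to the correct answer than those of existing approaches. In addition, depending on the difficulty setting, easy distractors were 15.1% less similar to the correct answer than hard ones, meaning that they were further from it in meaning.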
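The ratio-controlled deletion step described in this abstract can be illustrated with a minimal sketch. It assumes that each phrase already has an importance score; in the paper these come from the answer-generation model's attention, whereas the scores below are made-up numbers. The function name `delete_important_phrases` and the example passage are hypothetical, and the step that turns the shortened passage into a distractor is not detailed in the abstract and is left out.

```python
def delete_important_phrases(phrases, scores, ratio):
    """Drop the highest-scoring fraction `ratio` of phrases, keeping order."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    # Number of phrases to delete, rounded down.
    n_delete = int(len(phrases) * ratio)
    # Phrase indices sorted by descending importance score.
    ranked = sorted(range(len(phrases)), key=lambda i: scores[i], reverse=True)
    to_delete = set(ranked[:n_delete])
    return [p for i, p in enumerate(phrases) if i not in to_delete]


phrases = ["The committee", "postponed the vote", "because of", "a missing report"]
scores = [0.10, 0.45, 0.05, 0.40]  # hypothetical attention weights
reduced = delete_important_phrases(phrases, scores, ratio=0.5)
```

Changing `ratio` changes how much of the important context is removed, which is the knob the abstract associates with distractor difficulty.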

Design of Mutant-based Practical Test Problem Generator for Programming Education (프로그래밍 학습을 위한 뮤턴트 기반의 실습 문항 생성기의 구조 설계)

Kwak, Yong-Sub;Lee, Sunghee;Lee, Woo Jin
- Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.649-652 / 2017
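In programming education, hands-on practice is a very important process that complements theoretical knowledge by having students write source code themselves. Most programming curricula therefore include practical training. However, assessing learning achievement through practical training consumes a great deal of time and money, so many educational institutions operate automatic assessment systems to make evaluation more efficient. Automatic assessment systems are effective at evaluating students' practice results accurately and quickly. The practice problems needed for practical training, however, are in most cases created manually by teachers, which incurs considerable cost in personnel and time. Research on automating problem generation is under way to address this, but it is still at an early stage and has many limitations, such as being unable to generate new problems, which makes it difficult to apply. This paper therefore proposes a method for generating a variety of problems by transforming a single problem, and designs the architecture of a practice problem generator for programming that supports this method.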
https://doi.org/10.3745/PKIPS.y2017m04a.649

Algorithm Generating Item Response Data Based on Multidimensional Item Response Theory (다차원 문항반응이론에 기반한 문항 응답 데이터 생성 알고리즘)

Kim, ByoungWook;Lee, WonGyu
- Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.526-528 / 2014
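The purpose of this paper is to develop an algorithm that generates examinees' item response data for simulation, based on a multidimensional item response theory model. The algorithm reads the parameters of the items that make up the test and, for each dimension, generates normally distributed random variables representing the examinees' ability levels. Using the multidimensional item response theory model, it then computes the probability that each examinee answers each item correctly. This probability is compared with a uniformly distributed random number that determines the examinee's response. If the probability is greater than the random number, the examinee is taken to have answered correctly; otherwise the examinee is taken to have answered incorrectly. The program allows the number of examinees and the number of items to be adjusted. The algorithm is expected to contribute to simulation studies in educational measurement that use learners' item response data under multidimensional item response theory.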
https://doi.org/10.3745/PKIPS.y2014m04a.526
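The generation procedure in this abstract can be sketched roughly as follows. The abstract does not name the specific model, so this sketch assumes a compensatory multidimensional two-parameter logistic form, P = 1 / (1 + exp(-(a·θ + d))), with independent standard-normal abilities. The function name `simulate_responses`, the item parameterization, and the example item values are all hypothetical.

```python
import math
import random


def simulate_responses(items, n_examinees, n_dims, seed=None):
    """Simulate 0/1 item responses under an assumed compensatory M2PL model.

    items: list of (a, d) pairs, where a is a list of n_dims discrimination
           parameters and d is an intercept (assumed parameterization).
    Returns a list of n_examinees rows, each a list of 0/1 responses.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_examinees):
        # One standard-normal ability value per dimension.
        theta = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
        row = []
        for a, d in items:
            # Probability of a correct response: logistic(a . theta + d).
            z = sum(ak * tk for ak, tk in zip(a, theta)) + d
            p = 1.0 / (1.0 + math.exp(-z))
            # Correct if the probability exceeds a uniform random number.
            row.append(1 if p > rng.random() else 0)
        data.append(row)
    return data


# Hypothetical parameters for three items measured on two dimensions.
items = [([1.2, 0.3], 0.0), ([0.5, 1.0], -0.5), ([0.8, 0.8], 1.0)]
responses = simulate_responses(items, n_examinees=5, n_dims=2, seed=42)
```

The number of examinees and the number of items are plain arguments here, mirroring the adjustable settings mentioned in the abstract.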

The utility of digital evaluation based on automatic item generation in mathematics: Focusing on the CAFA system (수학교과에서 자동문항생성 기반의 디지털 평가 활용 방안: CAFA 시스템을 중심으로)

Kim, Sungyeun
- The Mathematical Education / v.61 no.4 / pp.581-595 / 2022
The purpose of this study is to specify the procedure for building item models based on ontology models using automatic item generation in mathematics through the CAFA system, and to explore the generated item instances. As an illustration, an item model was designed as part of a formative assessment based on content characteristics (concepts and calculations) and process characteristics (application), using the representative values and measures of dispersion in 9th-grade mathematics and following the evaluation criteria for the achievement standards. The item types generated from one item model were a best answer type, a correct answer type, a combined-response type, an incomplete statement type, a negative type, a true-false type, and a matching type. It was found that HTML, Google Charts, TTS, figures, videos, and so on can be used as media. The implications of digital evaluation based on automatic item generation were discussed with respect to students, pre-service teachers, general teachers, and special education, and the limitations of this study and future research directions were presented.
https://doi.org/10.7468/mathedu.2022.61.4.581

Automatic Generating Technique of Questions about Filling in the Blanks for Programming Education (프로그래밍 교육을 위한 빈 칸 채우기 문항 자동생성 기법)

Lee, Sunghee;Kim, Deok Yeop;Lee, Woo Jin
- Proceedings of the Korea Information Processing Society Conference / 2018.05a / pp.187-190 / 2018
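Automatic grading systems have recently been used in programming education to assess students' learning achievement quickly and accurately. They are effective there because programming education involves hands-on practice in which students write code themselves in order to understand the theoretical knowledge acquired in lectures. At present, most of the practice problems needed for such exercises must be created by the instructor. In particular, for exercises in which students read example source code based on the lecture content and write the appropriate code in a blank, the instructor has the additional task of specifying which part of the example code becomes the blank. Because the blanks in such fill-in-the-blank problems are generally fixed, students can easily share answers. To prevent this, the instructor must create additional fill-in-the-blank problems with similar content. However, most automatic grading systems either do not support this or require the instructor to specify the blanks manually. This paper therefore proposes an automatic generation technique for fill-in-the-blank problems that addresses these issues and presents an example of its application.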
https://doi.org/10.3745/PKIPS.y2018m05a.187

Design of Iterative Learning Contents and Items Generation System based on SCORM (SCORM 기반 반복 학습 콘텐츠 및 문항 생성 시스템 설계)

Baek, Yeong-Tae;Lee, Se-Hoon;Jeong, Jae-Cheul
- Journal of the Korea Society of Computer and Information / v.14 no.2 / pp.201-209 / 2009
According to previous research on online evaluation in e-learning content, generating test questions for formative or achievement tests using a database as an item pool takes too much time and effort. Furthermore, it is hard to measure learners' accomplishment for each unit through the overall tests provided by existing e-learning content. In this paper, to cope efficiently with these problems, the item pool based on Item Form was transformed into the Interaction Data Model in the Run-Time Environment of SCORM 2004. The content for the math concepts and principles that students learn in regular classrooms was developed in accordance with SCORM. In addition, a Confidence Factor Function was used to measure learners' accomplishment objectively through the items automatically generated by the LMS (Learning Management System).
https://doi.org/10.9708/jksci.2009.14.2.201

A Measure for Improvement in Quality of Association Rules in the Item Response Dataset (문항 응답 데이터에서 문항간 연관규칙의 질적 향상을 위한 도구 개발)

Kwak, Eun-Young;Kim, Hyeoncheol
- The Journal of Korean Association of Computer Education / v.10 no.3 / pp.1-8 / 2007
In this paper, we introduce a new measure called surprisal that estimates the informativeness of transactional instances and attributes in an item response dataset and improves the quality of association rules. To do this, we construct an artificial dataset, first eliminate noisy and uninformative data using surprisal, and then generate association rules between items. We compare the association rules obtained after surprisal-based pruning with those obtained after support-based pruning and from the original, unpruned dataset. Experimental results show that surprisal-based pruning significantly improves the quality of association rules in question item response datasets.
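The abstract does not define the surprisal measure. As a rough illustration only, the sketch below assumes the standard information-theoretic surprisal, -log2(p), applied to the empirical frequency of each response value, and scores an instance by summing the surprisal of its values. The paper's actual measure may differ. The function names and the response data are hypothetical.

```python
import math
from collections import Counter


def value_surprisal(column):
    """Map each value in a column to -log2 of its empirical frequency."""
    counts = Counter(column)
    n = len(column)
    return {v: -math.log2(c / n) for v, c in counts.items()}


def instance_surprisal(rows):
    """Sum, for each row, the surprisal of its value in every column."""
    columns = list(zip(*rows))
    tables = [value_surprisal(col) for col in columns]
    return [sum(t[v] for t, v in zip(tables, row)) for row in rows]


# Hypothetical responses: rows are examinees, columns are items (1 = correct).
rows = [(1, 1, 0), (1, 1, 0), (1, 0, 0), (1, 1, 1)]
scores = instance_surprisal(rows)
```

Under this assumed definition, an item that every examinee answers identically contributes zero surprisal, while rows containing rare responses receive higher scores. A pruning step could then apply a threshold to these scores before rule mining.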

A Question Example Generation System for Multiple Choice Tests by utilizing Concept Similarity in Korean WordNet (한국어 워드넷에서의 개념 유사도를 활용한 선택형 문항 생성 시스템)

Kim, Young-Bum;Kim, Yu-Seop
- The KIPS Transactions:PartA / v.15A no.2 / pp.125-134 / 2008
We implemented a system that can suggest example sentences for multiple-choice tests while taking the students' level into account. To build the system, we designed an automatic sentence generation method that makes it possible to control the difficulty of questions. Proper evaluation with multiple-choice tests requires a question pool of adequate size, which in turn calls for a system that can quickly generate numerous and varied questions and their example sentences. In this paper, we designed an automatic generation method using a linguistic resource called WordNet. For automatic generation, we first extract keywords from existing sentences through morphological analysis and then suggest candidate terms whose meanings are similar to the keywords in the Korean WordNet space. To suggest candidate terms, we transformed the existing Korean WordNet scheme into a new scheme in order to construct a concept similarity matrix. The similarity degree between concepts ranges from 0, representing a synonym relationship, to 9, representing unconnected concepts. Using this degree, we can control the difficulty of newly generated questions. We used two methods to evaluate the semantic similarity between two concepts. The first considers only the distance between the two concepts, and the second additionally considers the positions of the two concepts in the Korean WordNet space. With these methods, we can build a system that helps instructors more easily generate new questions and example sentences, with varied content and difficulty, from existing sentences.
https://doi.org/10.3745/KIPSTA.2008.15-A.2.125
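The first, distance-only similarity method mentioned in this abstract might look roughly like the sketch below. It assumes that the degree is the shortest path length between two concepts through a shared ancestor, capped at 9, with 9 also assigned to unconnected concepts. The paper's actual scheme and matrix construction are not given in the abstract. The toy hierarchy and all function names are hypothetical and do not come from Korean WordNet.

```python
MAX_DEGREE = 9  # degree assigned to unconnected concepts, per the abstract

# Hypothetical toy hierarchy: child -> parent (not actual Korean WordNet data).
PARENT = {
    "dog": "canine", "wolf": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal", "mammal": "animal",
    "rock": "mineral",
}


def ancestors(concept):
    """Return {ancestor: steps up from concept}, including the concept itself."""
    depth, result = 0, {}
    while concept is not None:
        result[concept] = depth
        concept = PARENT.get(concept)
        depth += 1
    return result


def similarity_degree(a, b):
    """Shortest path length between a and b via a shared ancestor, capped at 9."""
    up_a, up_b = ancestors(a), ancestors(b)
    shared = set(up_a) & set(up_b)
    if not shared:
        return MAX_DEGREE
    return min(MAX_DEGREE, min(up_a[c] + up_b[c] for c in shared))


def pick_candidates(keyword, vocabulary, degree):
    """Select replacement candidates at exactly the requested degree."""
    return [w for w in vocabulary
            if w != keyword and similarity_degree(keyword, w) == degree]


candidates = pick_candidates("dog", ["wolf", "cat", "rock"], degree=2)
```

Requesting a different `degree` yields candidates that are semantically closer to or further from the keyword, which is the lever the abstract uses to control question difficulty.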

A Sentence Generation System for Multiple Choice Test with Automatic Control of Difficulty Degree (난이도 자동제어가 구현된 객관식 문항 생성 시스템)

Kim, Young-Bum;Kim, Yu-Seop
- Proceedings of the Korea Information Processing Society Conference / 2007.05a / pp.1404-1407 / 2007
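This paper devises a method for automatically generating multiple-choice items according to difficulty and proposes a system that can present items in varied and dynamic forms suited to the learner's level. To this end, keywords are extracted from a given sentence through morphological analysis, and for each keyword, candidate words with similar meanings are suggested according to the hierarchical structure of WordNet. When suggesting semantically similar candidate words, a measure of similarity between words in WordNet is used so that the difficulty of the generated items can be adjusted to the level the user wants. The semantic similarity of words can range from level 0, meaning synonyms, to level 9, at which almost no similarity can be found, and adjusting this level adjusts the overall difficulty of the item. Two methods were implemented to measure the semantic similarity of candidate words. The first considers only the distance between the two words in WordNet, and the second additionally considers the weight the two words carry in WordNet. With this approach, item writers can more easily produce problems or items with more varied content and difficulty based on previously written questions, which reduces the cost of item writing.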
https://doi.org/10.3745/PKIPS.y2007m05a.1404

Exploring the Reliability of an Assessment based on Automatic Item Generation Using the Multivariate Generalizability Theory (다변량일반화가능도 이론을 적용한 자동문항생성 기반 평가에서의 신뢰도 탐색)

Jinmin Chung;Sungyeun Kim
- Journal of Science Education / v.47 no.2 / pp.211-224 / 2023
The purpose of this study is to suggest how to investigate the reliability of the assessment, which consists of items generated by automatic item generation using empirical example data. To achieve this, we analyzed the illustrative assessment data by applying the multivariate generalizability theory, which can reflect the design of responding to different items for each student and multiple error sources in the assessment score. The result of the G-study showed that, in most designs, the student effect corresponding to the true score of the classical test theory was relatively large after residual effects. In addition, in the design where the content domain was fixed, the ranking of students did not change depending on the item types or items. Similarly, in the design where the item format was fixed, the difficulty showed little variation depending on the content domains. The result of the D-study indicated that the original assessment data achieved a sufficient level of reliability. It was also found that higher reliability than the original assessment data could be obtained by reducing the number of items in the content domains of operation, geometry, and probability and statistics, or by assigning higher weights to the domains of letters and formulas, and function. The efficient measurement conditions presented in this study are limited to the illustrative assessment data. However, the method applied in this study can be utilized to determine the reliability and to find efficient measurement conditions for the various assessment situations using automatic item generation based on measurement traits.
https://doi.org/10.21796/jse.2023.47.2.211

Search Result 77, Processing Time 0.021 seconds