• Title/Summary/Keyword: sentence level

Search Result 223, Processing Time 0.03 seconds

Graph Reasoning and Context Fusion for Multi-Task, Multi-Hop Question Answering (다중 작업, 다중 홉 질문 응답을 위한 그래프 추론 및 맥락 융합)

  • Lee, Sangui;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.10 no.8 / pp.319-330 / 2021
  • Multi-task, multi-hop question answering has recently been studied extensively in open-domain natural language question answering. In this paper, we propose a novel deep neural network model that uses hierarchical graphs to answer such multi-task, multi-hop questions effectively. The proposed model extracts different levels of contextual information from multiple paragraphs using hierarchical graphs and graph neural networks, and then uses this information to predict the answer type, supporting sentences, and answer spans simultaneously. Experiments on the HotpotQA benchmark dataset show the high performance and positive effects of the proposed model.

Probing Semantic Relations between Words in Pre-trained Language Model (사전학습 언어모델의 단어간 의미관계 이해도 평가)

  • Oh, Dongsuk;Kwon, Sunjae;Lee, Chanhee;Lim, Heuiseok
    • Annual Conference on Human and Language Technology / 2020.10a / pp.237-240 / 2020
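  • Pre-trained language models have achieved high performance on a variety of natural language processing tasks. However, because they learn only contextual information within sentences, they are limited in inferring semantic relations between words. Recently, various probing tests have been carried out to determine how well pre-trained language models understand such relations. These tests are effective for analyzing the strengths and weaknesses of language models and suggest new directions for building models that understand human language more accurately. In this paper, we carry out three tasks that evaluate how well BERT (Bidirectional Encoder Representations from Transformers), a representative pre-trained language model, understands semantic relations between words. First, we analyze the IsA relation, which expresses hypernym and hyponym relations between words. Second, we analyze the PartOf relation, such as that between 'car' and 'gear shift'. Finally, we analyze the HasA relation, such as that between 'bird' and 'wing'. The results show that the BERTbase model performs poorly on most of the inference tasks, while the BERTlarge model outperforms BERTbase.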
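
One common way to probe a masked language model for relations such as IsA, PartOf, and HasA is to fill a cloze template and check whether the expected word appears among the top-ranked predictions. The sketch below illustrates that idea only; the abstract does not describe the actual prompts or metric, so the templates, `make_probe`, and `precision_at_k` are all hypothetical, and the ranked predictions are assumed to come from some external model.

```python
# Hypothetical cloze templates; the paper's real prompts are not given in the abstract.
TEMPLATES = {
    "IsA": "A {subject} is a kind of [MASK].",
    "PartOf": "A [MASK] is part of a {subject}.",
    "HasA": "A {subject} has a [MASK].",
}


def make_probe(relation, subject):
    """Build a cloze sentence for one relation and one subject word."""
    return TEMPLATES[relation].format(subject=subject)


def precision_at_k(ranked_predictions, gold, k=1):
    """Fraction of probes whose gold word is among the top-k predictions.

    ranked_predictions: one list of candidate words per probe, best first
                        (assumed to be produced by a masked language model).
    gold: the expected word for each probe.
    """
    if not gold:
        return 0.0
    hits = sum(1 for preds, g in zip(ranked_predictions, gold) if g in preds[:k])
    return hits / len(gold)


if __name__ == "__main__":
    print(make_probe("HasA", "bird"))
    # Made-up model outputs, for illustration only.
    preds = [["wing", "beak"], ["wheel", "engine"]]
    print(precision_at_k(preds, ["wing", "engine"], k=1))
```

Comparing such a score across model sizes is one way to make a BERTbase versus BERTlarge comparison concrete.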


The Effects of Gender Cue and Antecedent Case on the Immediacy of Pronominal Resolution (대명사의 성별단서와 선행어 격이 참조해결의 즉각성에 미치는 효과)

  • Lee, Jaeho
    • Korean Journal of Cognitive Science / v.4 no.1 / pp.51-86 / 1993
  • The purpose of this study was to investigate on-line comprehension processing in pronoun resolution. The two constraints investigated were the gender cue of the pronoun and the case of the antecedent. Using an antecedent probe recognition task, Experiment 1 investigated the effects of gender cues and antecedent cases on probe recognition time. There were no significant effects of the variables employed. This result suggests the possibility of immediate antecedent assignment, depending on the degree to which syntactic constraints are satisfied. Experiment 2, using the same antecedent probe recognition task, measured the differences in primed activation level between antecedents and non-antecedents over time-course intervals from 0 to 250 msec. An effect of gender cues was obtained across the 0-250 msec time-course conditions. This indicates that gender cues can determine the assignment of the proper antecedent for a pronoun. In Experiment 3, only subject-case pronouns were used: unambiguous gender cues were given, and time-course intervals of 250 and 750 msec were employed. A significant interaction of antecedent case with probe condition was obtained. Taken together, the results suggest that gender cues are powerful constraints on pronoun resolution.

Restoring Omitted Sentence Constituents in Encyclopedia Documents Using Structural SVM (Structural SVM을 이용한 백과사전 문서 내 생략 문장성분 복원)

  • Hwang, Min-Kook;Kim, Youngtae;Ra, Dongyul;Lim, Soojong;Kim, Hyunki
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.131-150 / 2015
  • Omission of noun phrases in obligatory cases is common in Korean and Japanese sentences but is not observed in English. In encyclopedia texts, an argument of a predicate is omitted more readily when it can be filled with a noun phrase co-referential with the title. The omitted noun phrase is called a zero anaphor or zero pronoun. Encyclopedias such as Wikipedia are a major source for information extraction by intelligent application systems such as information retrieval and question answering systems. However, omitted noun phrases degrade the quality of information extraction. This paper addresses the problem of developing a system that can restore omitted noun phrases in encyclopedia documents. The problem is very similar to zero anaphora resolution, one of the important problems in natural language processing. A noun phrase in the text that can be used for restoration is called an antecedent. An antecedent must be co-referential with the zero anaphor. Whereas in zero anaphora resolution the candidate antecedents are only noun phrases in the same text, in our problem the title is also a candidate. In our system, the first stage detects the zero anaphor. In the second stage, antecedent search is carried out over the candidates. If antecedent search fails, the third stage attempts to use the title as the antecedent. The main characteristic of our system is its use of a structural SVM to find the antecedent. The noun phrases that appear in the text before the position of the zero anaphor make up the search space. The main technique in previous work is to perform binary classification on every noun phrase in the search space. The noun phrase classified as an antecedent with the highest confidence is selected as the antecedent.
However, in this paper we propose viewing antecedent search as the problem of assigning antecedent indicator labels to a sequence of noun phrases. In other words, sequence labeling is employed for antecedent search in the text. We are the first to suggest this idea. To perform sequence labeling, we suggest using a structural SVM that receives a sequence of noun phrases as input and returns a sequence of labels as output. Each output label takes one of two values: one indicating that the corresponding noun phrase is the antecedent, and the other indicating that it is not. The structural SVM we used is based on the modified Pegasos algorithm, which applies subgradient descent to the optimization problem. To train and test our system, we selected a set of Wikipedia texts and constructed an annotated corpus that provides gold-standard answers, such as zero anaphors and their possible antecedents. Training examples are prepared from the annotated corpus and used to train the SVMs and test the system. For zero anaphor detection, sentences are parsed by a syntactic analyzer, and omitted subject or object cases are identified. The performance of our system therefore depends on that of the syntactic analyzer, which is a limitation. When no antecedent is found in the text, our system tries to use the title to restore the zero anaphor. This step is based on binary classification using a regular SVM. In the experiment, our system achieved F1 = 68.58%. This suggests that a state-of-the-art system can be developed with our technique. Future work that enables the system to use semantic information is expected to yield a significant performance improvement.
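
For readers unfamiliar with Pegasos, the sketch below shows the basic subgradient update for an ordinary binary linear SVM, which is the kind of classifier the abstract mentions for the title-restoration step. It is not the authors' modified structural variant, whose details the abstract does not give. The function names, the toy feature vectors, the absence of a bias term, and the deterministic pass over the data (standard Pegasos samples examples at random) are all simplifying assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def pegasos_train(xs, ys, lam=0.1, epochs=50):
    """Train a linear SVM without a bias term using Pegasos-style steps.

    xs: list of feature vectors; ys: labels in {+1, -1}.
    lam: regularization strength; the step size at step t is 1 / (lam * t).
    """
    w = [0.0] * len(xs[0])
    t = 0
    for _ in range(epochs):
        for x, y in zip(xs, ys):  # deterministic pass; an assumption for reproducibility
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * dot(w, x)          # margin under the current weights
            w = [(1.0 - eta * lam) * wi for wi in w]  # shrink (regularization term)
            if margin < 1.0:                # hinge loss is active: add subgradient step
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if dot(w, x) >= 0 else -1


if __name__ == "__main__":
    # Toy, linearly separable data (hypothetical feature vectors).
    xs = [(2.0, 2.0), (3.0, 1.0), (-2.0, -1.0), (-1.0, -3.0)]
    ys = [1, 1, -1, -1]
    w = pegasos_train(xs, ys)
    print([predict(w, x) for x in xs])
```

A structural SVM for sequence labeling replaces the single label `y` with a whole label sequence and the hinge test with a search for the most violating sequence, but the shrink-then-step update has the same shape.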

Analysis of problems of current science textbooks perceived by teachers and students in view of learner-centered classroom (학습자 중심 수업 운영의 관점에서 초중등 교사와 학생이 본 현행 과학 교과서의 문제점 분석)

  • Yun, Eunjeong;Kwon, Sung Gi;Park, Yunebae
    • Journal of Science Education / v.39 no.3 / pp.404-417 / 2015
  • Students need to participate actively in class for education to be effective. This study treated science textbooks as a major factor influencing participation in science class and aimed to identify the problems of current science textbooks as tools for promoting student participation, along with ways to improve them. A questionnaire asking about the requirements for, and problems of, science textbooks for learner-centered instruction was developed and answered by 99 science teachers and 821 students. Students responded that current science textbooks lacked explanation, contained many difficult words and complex sentences, and were uninteresting. Teachers responded that current science textbooks contained too much content, were knowledge-centered, and lacked links to real life and a storyline. In conclusion, science textbooks that encourage student participation should strengthen links to real life, increase student activities, use words and sentences at a level appropriate for students, strengthen the storyline, and provide sufficient opportunities for students to check their own understanding.


Investigating Science-Talented Students' Understandings and Meaning Generation about the Earth Systems Based on Their Geological Field Trip Reports (야외지질답사 보고서에 나타난 과학영재학생들의 지구계 이해와 지구계 의미 생성 탐색)

  • Yu, Eun-Jeong;Lee, Sun-Kyung;Kim, Chan-Jong
    • Journal of the Korean earth science society / v.28 no.6 / pp.673-685 / 2007
  • The purpose of this study was to investigate the Earth Systems Understandings (Mayer, 1991) and Earth Systems meaning generation reported by science-talented students who participated in a geological field trip. Eight field trip reports (from 4 female and 4 male students) were randomly selected from the reports written by twenty eighth-grade students who joined the Shiwha-Lake field trip in Korea. A three-step program consisting of preparation, field trip, and summary was provided to the students to facilitate meaningful learning through outdoor learning activities. The seven Earth Systems Understandings and thematic types (Keys, 1999) were used to analyze the reports. The results indicated that aesthetic views of, and stewardship toward, the Earth, the most distinguishing characteristics of Earth Systems Education, were reflected in most of the reports. The results also showed that the students tried to represent their understandings through meaning extension, meaning enhancement, or meaning elaboration. Overall, many students used a 'knowledge-telling' process with a long list of observations and facts, whereas a few students used a higher-order 'knowledge-transforming' process, coordinating their findings with interpretations and reasoning in their writing.

Analysis of Interpretation Processes Through Readers' Thinking Aloud in Science-Related Line Graphs (과학관련 선 그래프를 해석하는 고등학생들의 발성사고 과정 분석)

  • Kim, Tae-Sun;Kim, Beom-Ki
    • Journal of The Korean Association For Science Education / v.25 no.2 / pp.122-132 / 2005
  • Graphing abilities are critical for understanding and conveying information in science. To what extent, then, are secondary students in science courses able to understand line graphs? To find clues about students' processes for interpreting the information in science-related line graphs, this study asks the following research question: when the thinking aloud method is used to study cognitive processes, do good and poor readers differ in the complexity of their interpretation? The study was designed to provide evidence for the hypothesis that good line graph readers use a specific interpretation process when reading and interpreting line graphs. With the aid of the thinking aloud method, we gained deeper insight into the interpretation processes of good and poor graph readers as they verified verbal statements about line graphs. High performing students tended to read much more information, and more trend-related information, than low performing students. The results support the assumption that high performing students have a differential line graph schema in addition to a general graph schema. High performing students also tended to think aloud much more metacognitively than low performing students. They verbalized a larger quantity of information from line graphs than low performing students, and more trend-related sentences than value-related sentences. The differences in interpretation processes between good and poor graph readers have implications for instructional practice as well as for test development and validation. Teaching students to read and interpret graphs flexibly and skillfully is a particular challenge for anyone seriously concerned with good education for students who live in a technological society.

Development and validation of Speech Range Profile task (발화범위 프로파일 과제 개발 및 타당성 검증)

  • Kim, Jaeock;Lee, Seung Jin
    • Phonetics and Speech Sciences / v.11 no.3 / pp.77-87 / 2019
  • The study aimed to develop a Speech Range Profile (SRP) task and to examine and validate its clinical application. Forty-five participants aged 18-29 years without voice disorders were assessed with both the SRP and the Voice Range Profile (VRP), and the results were compared. The authors developed the "Fire!" paragraph as an SRP task comprising 14 sentences that include all Korean spoken phonemes and sentence types. To compare SRP and VRP results, the participants read the paragraph (reading) and counted from 21 to 30 (counting) as SRP tasks, and produced the vowel /a/ from low to high frequencies (gliding) and a shortened form of the VRP as VRP tasks. $F0_{max}$, $F0_{min}$, $F0_{range}$, $I_{max}$, $I_{min}$, and $I_{range}$ were measured for each task and compared, showing that $F0_{max}$, $F0_{min}$, $F0_{range}$, $I_{max}$, and $I_{range}$ did not differ between reading and gliding. $I_{min}$ had the lowest value in counting. It is concluded that the newly developed SRP task, reading the "Fire" paragraph, can yield a maximum phonation range similar to that found by VRP. Therefore, voice evaluation can be expected to be performed effectively in a relatively short time by applying SRP with the "Fire" paragraph, a functional utterance task, in place of VRP, which may be difficult to measure over a long session or in cases of severe voice disorders.
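
The six measures named in the abstract can be computed from per-frame pitch and intensity tracks along the following lines. This is a minimal sketch under stated assumptions: the abstract does not specify units or the extraction tool, so the function name `range_profile`, the use of 0 Hz to mark unvoiced frames, and the extra semitone version of the F0 range are all hypothetical choices.

```python
import math


def range_profile(f0_hz, intensity_db):
    """Summarize F0 and intensity extremes and ranges for one task.

    f0_hz: per-frame fundamental frequency in Hz; frames with 0 are
           treated as unvoiced and ignored (an assumption).
    intensity_db: per-frame intensity in dB.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced or not intensity_db:
        raise ValueError("need at least one voiced frame and one intensity frame")
    f0_max, f0_min = max(voiced), min(voiced)
    i_max, i_min = max(intensity_db), min(intensity_db)
    return {
        "F0_max": f0_max,
        "F0_min": f0_min,
        "F0_range_hz": f0_max - f0_min,
        # Semitone range: 12 semitones per doubling of frequency.
        "F0_range_st": 12.0 * math.log2(f0_max / f0_min),
        "I_max": i_max,
        "I_min": i_min,
        "I_range": i_max - i_min,
    }


if __name__ == "__main__":
    # Made-up frames for illustration only.
    profile = range_profile([0.0, 100.0, 150.0, 200.0], [55.0, 70.0, 62.0])
    print(profile)
```

Running the same function on the reading, counting, and gliding recordings would give one comparable summary per task.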

The influences of speech rate, utterance length and sentence complexity of disfluency in preschool children who stutter and children who do not stutter (문장 따라말하기에서 말속도, 발화길이 및 통사적 복잡성에 따른 말더듬 아동과 일반아동의 비유창성 비교)

  • Kim, Yesul;Sim, Hyunsub
    • Phonetics and Speech Sciences / v.13 no.1 / pp.53-64 / 2021
  • According to the Demand and Capacity Model (DCM), external and internal environments influence the disfluency of children who stutter (CWS). This study investigated the effects of simultaneous changes in motoric and linguistic demands on CWS and children who do not stutter (CWNS). Participants were CWS and CWNS aged 4-6 years. A sentence imitation task with changes in speech rate, utterance length, and sentence complexity was used to examine their effects on children's disfluency. When utterance length changed, CWS showed more disfluency than CWNS regardless of utterance length, and when speech rate changed, CWS showed more disfluency than CWNS at the fast speech rate. When both utterance length and speech rate changed, CWS showed more disfluency than CWNS at the fast speech rate for both utterance lengths. When sentence complexity changed, CWS showed more disfluency than CWNS in complex sentences. Changes in speech rate, utterance length, and sentence complexity affect disfluency in CWS, especially when they are exposed to faster, longer, and more complex sentences. This indicates that CWS are more vulnerable than CWNS to demands for fast and complex speech motor control and language processing. The study therefore suggests that parents and therapists consider both speech rate and utterance length when talking with CWS.

Study on the development of automatic translation service system for Korean astronomical classics by artificial intelligence - Focused on system analysis and design step (천문 고문헌 특화 인공지능 자동번역 서비스 시스템 개발 연구 - 시스템 요구사항 분석 및 설계 위주)

  • Seo, Yoon Kyung;Kim, Sang Hyuk;Ahn, Young Sook;Choi, Go-Eun;Choi, Young Sil;Baik, Hangi;Sun, Bo Min;Kim, Hyun Jin;Lee, Sahng Woon
    • The Bulletin of The Korean Astronomical Society / v.44 no.2 / pp.62.2-62.2 / 2019
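  • Korea has a large body of historical astronomical records, spanning from the Three Kingdoms period to the late Joseon era, a record-keeping tradition that is rare worldwide; however, few of these Classical Chinese texts have been translated, so their scholarly use remains limited. Translating the Classical Chinese sentences of old documents depends on manual work by specialists, so it takes a long time and tends to be inefficient relative to the investment. Artificial intelligence, which has recently been applied in many fields, offers an alternative, and an automatic translator can be a useful scholarly tool even if it only produces first-draft translations. The Korea Astronomy and Space Science Institute, jointly with the Institute for the Translation of Korean Classics, is taking part in the 2019 Information and Communication Technology-based public service promotion project run by the National Information Society Agency, and aims to develop an automatic translation model for old documents based on artificial neural network machine learning. The goal of this study is to develop an automatic translation model using AI machine learning techniques specialized for the historical astronomy domain and to offer it as a service. The work is divided into four main development tasks. First, a 'corpus' serving as AI training data is built: the Classical Chinese source text of old documents is paired with its Korean translation to extract at least 60,000 data items optimized for training. Second, the extracted training corpus is applied to various AI machine learning techniques to produce an automatic translation model specialized for the domain of astronomical classics. Third, a cloud-based service platform for institutions is built, on which each participating institution can derive and use a domain-specialized model, based on the automatic translation model, for the old documents it holds. Fourth, to open the translator to the public, a cloud-based automatic translation service is built and delivered through the web and a mobile messenger. Based on the analysis and definition of system requirements, design is in progress or partly complete, and implementation is under way. Performance evaluation will later be divided into evaluation of the translation model and testing of the application system. The quality of the translation model can be measured numerically through automatic evaluation on a test set and human evaluation by experts. Application system testing applies the tests defined for each development stage of the software methodology. This study is significant in that historical astronomy is the first pilot case for a platform to spread AI automatic translation. That is, by building the system in the cloud, it is the first academic field to secure, at relatively low initial cost, a highly usable automatic translator for Classical Chinese sentences as research infrastructure. Scholarly activity in historical astronomy that makes use of this tool can be expected to grow.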
