• Title/Summary/Keyword: Language Comprehension

KorPatELECTRA : A Pre-trained Language Model for Korean Patent Literature to improve performance in the field of natural language processing(Korean Patent ELECTRA)

  • Jang, Ji-Mo;Min, Jae-Ok;Noh, Han-Sung
    • Journal of the Korea Society of Computer and Information / v.27 no.2 / pp.15-23 / 2022
  • Natural language processing (NLP) is challenging in the patent field because of the linguistic specificity of patent literature, so there is an urgent need for research on a language model optimized for Korean patent literature. Recently, there have been continuous attempts in NLP to build pre-trained language models for specific domains in order to improve performance on tasks in those fields. Among them, ELECTRA is a pre-trained language model released by Google after BERT that uses a new method called RTD (Replaced Token Detection) to increase training efficiency. This paper proposes KorPatELECTRA, which is pre-trained on a large amount of Korean patent literature data. In addition, pre-training was optimized by preprocessing the training corpus according to the characteristics of patent literature and by applying a patent-specific vocabulary and tokenizer. To confirm its performance, KorPatELECTRA was tested on NER (Named Entity Recognition), MRC (Machine Reading Comprehension), and patent classification tasks using actual patent data, and it achieved the best performance on all three tasks compared with general-purpose language models.
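
The RTD objective mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual ELECTRA setup, in which a small generator replaces some input tokens and a discriminator labels every position as original or replaced. The function name `rtd_labels` and the example sentence are hypothetical and are not taken from the KorPatELECTRA paper.

```python
def rtd_labels(original, corrupted):
    """Return one label per token: 1 if the token was replaced, 0 if it is original.

    A position where the generator happens to reproduce the original token
    counts as original (label 0), because the comparison is on token identity.
    """
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Hypothetical example: one token ("cooked") was swapped by the generator.
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]
labels = rtd_labels(original, corrupted)
```

Because every position receives a label, the discriminator gets a training signal from the whole sequence rather than only from the masked positions, which is the efficiency gain the abstract refers to.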

A Study on the Comprehension of Texts with Korean Hangul, Chinese Hanja and Hangul.Hanja among Korean-Chinese children and adolescents (이중언어능력의 조선족 아동과 청소년의 한글, 한자, 한글.한자혼합문 형태의 덩이글 이해에 관한 연구)

  • Yoon, Hye-Kyung;ParkChoi, Hye-Won;Kwon, Oh-Seek
    • Korean Journal of Child Studies / v.30 no.2 / pp.15-28 / 2009
  • This study focused on the comprehension of texts written either in Korean script (Hangul) or Chinese script (Hanja). For this purpose, we measured reading time and correct responses in text comprehension tasks with 104 Korean-Chinese children who were either 10 or 19 years old. There was a main effect of script: the reading time for Hanja texts was shorter than that for Hangul or mixed Hangul-Hanja texts. However, the older subjects, who spent the same reading time on Hangul and Hanja texts, showed longer reading times for mixed Hangul-Hanja texts, revealing an interaction between age and script. The correct response rate on the comprehension task was highest for Hangul texts. The results are discussed in relation to independent dual language processing systems in Korean-Chinese bilinguals.

The Effects of Increased Processing Demands on the Sentence Comprehension of Korean-speaking Adults with Aphasia (지연된 자극 제시가 실어증 환자의 문장 이해에 미치는 영향: 반응정확도와 반응시간을 중심으로)

  • Choi, So-Young
    • Phonetics and Speech Sciences / v.4 no.2 / pp.127-134 / 2012
  • The purpose of this study is to present evidence for a particular processing approach based on the language-specific characteristics of Korean. To compare individuals' sentence-comprehension abilities, this study measured the accuracy and reaction times (RT) of 12 aphasic patients (AP) and 12 normal controls (NC) during a sentence-picture matching task. Four versions of a sentence were constructed with the two types of voice (active/passive) and two types of word order (agent-first/patient-first). To examine the effects of increased processing demand, picture stimuli were manipulated in such a way that they appeared immediately after the sentence was presented. As expected, the AP group showed higher error rates and longer RT for all conditions than the NC group. Furthermore, Korean speakers with aphasia performed above a chance level in sentence comprehension, even with passive sentences. Aphasics understood sentences more quickly and accurately when they were given in the active voice and with agent-first order. The patterns of the NC group were similar. These results confirm that Korean adults with aphasia do not completely lose their knowledge of sentence comprehension. When the processing demand was increased by delaying the picture stimulus onset, the effect of increased processing demands on RT was more pronounced in the AP than in the NC group. These findings fit well with the idea that the computational system for interpreting sentences is intact in aphasics, but its ability is compromised when processing demands increase.

Effective Method to Improve the Skills of Listening Comprehension: For Candidate(s) Who Prepare the DELF A2 (듣기 능력 향상을 위한 효율적 학습 방안: DELF A2 학습자를 대상으로)

  • JUNG, Il Young
    • Cross-Cultural Studies / v.30 / pp.125-165 / 2013
  • The purpose of this study is to find methods that allow learners to improve their listening comprehension skills. To do this, we have divided this article into three parts. In the first part, we analyze studies that focus on listening skills. In the second part, we address the steps in the process of studying listening comprehension. In the last part, we illustrate these steps with examples suited to different learner levels. More specifically, we applied questionnaires according to typological differences. Most teachers recognize the importance of listening in completing learners' language proficiency. Nevertheless, there are many difficulties in implementing effective methods for improving listening comprehension skills. In addition, there is a strong tendency to consider listening not as an autonomous field but as a part of oral proficiency. Moreover, we cannot ignore the importance of the method of application, because it can motivate learners both to concentrate on their studies and to participate voluntarily in the course. In this sense, professors and teaching staff can use the DELF examples to establish concrete goals for a listening course. It is difficult to confirm that the approach in this study is the most effective methodology, but we hope it may be useful for improving the listening comprehension skills of learners of French.

A Study of Morphological Errors in Aphasic Language

  • Kim, Heui-Beom
    • Speech Sciences / v.1 / pp.227-236 / 1997
  • How do aphasics deal with the inflectional marking that occurs in agglutinative languages like Korean? Korean speech repetition, comprehension, and production were studied in 3 Korean speakers with Broca's aphasia. As experimental materials, 100 easy sentences were chosen from 1st grade Korean elementary school textbooks on reading, writing, and listening, and two pictures were made for each sentence. This study examines the use of three kinds of inflectional marking: past tense, nominative case, and accusative case. The analysis focuses on whether each inflectional marking was performed well in tasks such as repetition, comprehension, and production. In addition, morphological errors involving each inflectional marking were analyzed in terms of markedness. In general, the aphasic subjects showed a clear preservation of the morphological aspects of their native language, so the view of Broca's aphasics as agrammatic could not be strongly supported. The results suggest that nominative case and accusative case are marked elements in Korean.

Production and Perception from Perspective of Focus

  • Noh, Bo-Kyung
    • Language and Information / v.6 no.1 / pp.105-121 / 2002
  • This paper investigates the effect of semantic argument structure on the comprehension and production of sentences by observing the prosodic realizations of English secondary predications. Specifically, the goal of this study is to show how the theories of predication, argument structure, and focus interact semantically to account for similarities and differences between English resultative and depictive predications. To address this issue, production and comprehension tests were performed. In the fixed focus domain (verb phrase), subjects were asked to utter and to comprehend ambiguous sentences in context monologues. The experimental results were generally consistent with general linguistic analyses: in the resultative constructions, secondary subject NPs tend to be accented, as in other argument-head constructions, while in the depictive constructions, secondary predicates tend to be accented, as in other adjunct-head constructions.

Adding New Information in DCS (DCS의 정보확장)

  • Lee, Chang-In
    • Annual Conference on Human and Language Technology / 1995.10a / pp.253-257 / 1995
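  • This paper describes the information-expansion process of DCS (Dynamic Comprehension System) through the addition of lexical information. That is, it shows how the system can be extended without supplementing the existing dictionary information. To respond flexibly to new linguistic information, the process by which new knowledge is learned between speaker and hearer is presented interactively through tree-structured submenus. The paper attempts to implement the process by which each unit network (nection) is added to the existing information network during the cognition of new knowledge.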

Relationships Among Language Ability, Foreign Language Learning Experience, and Metalinguistic Ability in Korean Preschool Children (유아의 모국어 능력, 외국어 경험 정도와 상위언어 능력간의 관계)

  • Han, You Me;Cho, Bok Hee
    • Korean Journal of Child Studies / v.20 no.3 / pp.199-216 / 1999
  • The 121 five-year-old Korean subjects of this study were divided into 3 groups based on their experience in learning a foreign language (English). A battery of tests was administered to measure spoken and written language ability and the 3 metalinguistic domains of phonological, semantic, and syntactic awareness. Spoken language ability was positively correlated with semantic and syntactic awareness. The relative importance of each metalinguistic domain varied with the level of written language development. Phonological awareness was the only predictor of decoding. Syntactic awareness and phonological awareness were significant variables in sentence comprehension. Metalinguistic ability was a better predictor of written language development than spoken language ability. Foreign language learning experience had an effect on syntactic awareness: low experience was superior to no experience, but high experience was not superior to low experience.

ORMN: A Deep Neural Network Model for Referring Expression Comprehension (ORMN: 참조 표현 이해를 위한 심층 신경망 모델)

  • Shin, Donghyeop;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.7 no.2 / pp.69-76 / 2018
  • Referring expressions are natural language constructions used to identify particular objects within a scene. In this paper, we propose a new deep neural network model for referring expression comprehension. The proposed model locates the region of the referred object in a given image by making use of the rich information about the referred object itself, the context object, and the relationship with the context object mentioned in the referring expression. In the proposed model, the object matching score and the relationship matching score are combined to compute the fitness score of each candidate region according to the structure of the referring expression sentence. To this end, the proposed model consists of four sub-networks: Language Representation Network (LRN), Object Matching Network (OMN), Relationship Matching Network (RMN), and Weighted Composition Network (WCN). We demonstrate that our model achieves state-of-the-art comprehension results on three referring expression datasets.
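
The score combination described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the combination formula, so a weighted sum is assumed here, and the weights are fixed hypothetical inputs rather than values produced by a network such as the WCN. The names `fitness_scores` and `best_region` are likewise hypothetical.

```python
def fitness_scores(object_scores, relation_scores, w_obj, w_rel):
    """Combine per-region matching scores into one fitness score per region.

    Assumption: a weighted sum. In the paper the weighting depends on the
    structure of the referring expression; here the weights are plain inputs.
    """
    if len(object_scores) != len(relation_scores):
        raise ValueError("need one object and one relationship score per region")
    return [w_obj * o + w_rel * r for o, r in zip(object_scores, relation_scores)]


def best_region(object_scores, relation_scores, w_obj=0.5, w_rel=0.5):
    """Return the index of the candidate region with the highest fitness score."""
    scores = fitness_scores(object_scores, relation_scores, w_obj, w_rel)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical scores for three candidate regions.
object_scores = [0.9, 0.4, 0.6]
relation_scores = [0.1, 0.8, 0.7]
chosen = best_region(object_scores, relation_scores)
```

With equal weights the third region wins because it scores reasonably on both criteria. With `w_rel=0.0`, as might suit an expression that mentions no context object, the first region would be chosen instead.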

HTML Tag Depth Embedding: An Input Embedding Method of the BERT Model for Improving Web Document Reading Comprehension Performance (HTML 태그 깊이 임베딩: 웹 문서 기계 독해 성능 개선을 위한 BERT 모델의 입력 임베딩 기법)

  • Mok, Jin-Wang;Jang, Hyun Jae;Lee, Hyun-Seob
    • Journal of Internet of Things and Convergence / v.8 no.5 / pp.17-25 / 2022
  • Recently, a massive amount of data has been generated as the number of edge devices increases. In particular, the number of raw, unstructured HTML documents has increased. Therefore, MRC (Machine Reading Comprehension), in which a natural language processing model finds the important information within an HTML document, is becoming more important. In this paper, we propose HTDE (HTML Tag Depth Embedding), a method that allows BERT to learn the depth of the HTML document structure. HTDE builds a tag stack from the HTML document for each BERT input token and then extracts the depth information. We then add an HTML embedding layer, which takes the depth of each token as input, to the input embedding step of BERT. Because HTDE identifies the HTML document structure through the relationships among surrounding tokens, it improves the accuracy of BERT on HTML documents. Finally, we demonstrate that the proposed method achieves higher accuracy than the conventional BERT input embedding.
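
The tag-stack step described in this abstract can be sketched with Python's standard `html.parser` module. This is a minimal illustration, not the authors' implementation: it assumes whitespace tokenization instead of BERT's WordPiece tokenizer, and the class `TagDepthParser`, the function `token_depths`, and the short list of void tags are hypothetical choices.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not stay on the stack.
# Assumption: this short list is enough for the illustration.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagDepthParser(HTMLParser):
    """Collects (token, depth) pairs, where depth is the current tag-stack size."""

    def __init__(self):
        super().__init__()
        self.stack = []  # tags that are currently open
        self.pairs = []  # (whitespace token, depth) in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        depth = len(self.stack)
        for token in data.split():
            self.pairs.append((token, depth))


def token_depths(html):
    """Return a list of (token, depth) pairs for the text in an HTML string."""
    parser = TagDepthParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


pairs = token_depths("<html><body><p>Hello <b>world</b><br>again</p></body></html>")
```

In a full model, each depth value would presumably index an additional embedding table whose output is combined with BERT's other input embeddings, as the abstract describes. That part is omitted here because it depends on the model code.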