• Title/Summary/Keyword: text linguistics

GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction

  • Oh, So-Yeon;Kim, Ji-Hyeon;Kim, Seo-Jin;Nam, Hee-Jo;Park, Hyun-Seok
    • Genomics & Informatics / v.16 no.3 / pp.75-77 / 2018
  • Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. A text corpus of this journal, annotated with various levels of linguistic information, would be a valuable resource, since information extraction requires syntactic, semantic, and higher levels of natural language processing. In this study, we publish a new corpus, GNI Corpus version 1.0, extracted and annotated from the full texts of Genomics & Informatics with an NLTK (Natural Language Toolkit)-based text mining script. This preliminary version of the corpus can be used as a training and test set for systems that serve a variety of functions in future biomedical text mining.

Citation-based Article Summarization using a Combination of Lexical Text Similarities: Evaluation with Computational Linguistics Literature Summarization Datasets

  • Kang, In-Su
    • Journal of the Korea Society of Computer and Information / v.24 no.7 / pp.31-37 / 2019
  • Citation-based article summarization creates a shortened text for an academic article that reflects the content of citing sentences, which contain others' thoughts about the target article. To address this problem, this study introduces an extractive summarization method based on a linear combination of several sentence salience scores. Each score represents the degree to which a candidate sentence reflects the content of the author's abstract, the readers' citing text, or the target article itself. In the current study, salience scores are obtained by computing surface-level textual similarities. Experiments on the CL-SciSumm datasets show that the proposed method matches or outperforms previous approaches in ROUGE evaluations against SciSumm-2017 human summaries and SciSumm-2016/2017 community summaries.
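
The scoring idea in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not say which similarity measures or weights were used, so Jaccard token overlap, the weight values, and the function names (`salience`, `summarize`) are all assumptions chosen for illustration.

```python
import re

def tokens(text):
    """Lowercased word set: a deliberately crude surface-level representation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def salience(sentence, abstract, citing_text, article, weights=(0.4, 0.4, 0.2)):
    """Linear combination of three similarity scores.

    The weights are hypothetical; in practice they would be tuned on data.
    """
    s = tokens(sentence)
    scores = (
        jaccard(s, tokens(abstract)),     # similarity to the author's abstract
        jaccard(s, tokens(citing_text)),  # similarity to the citing sentences
        jaccard(s, tokens(article)),      # similarity to the whole article
    )
    return sum(w * x for w, x in zip(weights, scores))

def summarize(sentences, abstract, citing_text, k=2):
    """Pick the k highest-scoring sentences and return them in document order."""
    article = " ".join(sentences)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: salience(sentences[i], abstract, citing_text, article),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]

if __name__ == "__main__":
    article_sentences = [
        "We propose a graph parser.",
        "The weather was nice.",
        "The parser improves accuracy.",
    ]
    print(summarize(
        article_sentences,
        abstract="We propose a graph parser",
        citing_text="their parser improves accuracy",
    ))
```

Returning the selected sentences in their original order keeps the extract readable; any other surface-level similarity (for example, cosine similarity over term counts) could replace `jaccard` without changing the structure.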

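Comparison of Cohesive Structures and Translation - Focusing on a Contrastive Analysis of Chinese and Korean Texts - (결속구조 비교와 번역 - 중한텍스트 대조분석을 중심으로)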

  • Park, Eun-Suk
    • 중국학논총 / no.71 / pp.107-129 / 2021
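  • In recent decades, translation studies has developed considerably by combining with linguistics, sociology, cultural studies, philosophy, and other disciplines. Linguistics and translation studies in particular have always been closely related. Since the 1960s, linguists have gradually moved beyond treating the sentence as the highest unit of linguistic analysis and widened their view to the text, which gave rise to text linguistics. Cohesion theory, an important topic in both linguistics and translation studies, has long been studied widely and in depth by linguists at home and abroad. However, as with many topics in contrastive language research, the comparison of cohesive devices across two languages has received little attention. This paper therefore takes a text-linguistic perspective and applies the theory of cohesion proposed by Halliday and Hasan to Chinese-Korean translation, carrying out a contrastive analysis. It also discusses the implications of contrastive analysis of Chinese and Korean texts for the practice and study of Chinese-Korean translation. Chapter 1 is the introduction, which describes the rise of text linguistics and its representative scholars at home and abroad. Chapter 2 examines the cohesion mechanisms of Chinese and Korean texts in two subsections, one on the definition of cohesion and one on its classification. Chapter 3 applies cohesion theory to Chinese and Korean news texts, contrasts the cohesion mechanisms of the two, and briefly discusses the application of cohesion theory in Chinese-Korean text translation.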

Interrelationship between Prior Knowledge and Language Proficiency in L2 Listening Comprehension

  • Chung, Hyun-Sook
    • Korean Journal of English Language and Linguistics / v.1 no.1 / pp.187-209 / 2001
  • This study attempts to supplement what is known about the influence of prior knowledge on second language listening comprehension. To do so, it examines the effect of prior knowledge and language proficiency on the ability of L2 listeners to understand texts. The purpose of the experiment was to determine the effect of topic familiarity on the L2 listening comprehension of subjects who varied in L2 listening proficiency. The subjects (N=117) were selected from a population of college students enrolled in the Departments of English and Business in Korea. English listening proficiency levels were designated on the basis of TOEFL listening scores. Subjects listened twice to each text (one more familiar, one less familiar). After each text, a ten-item objective test was administered to assess the subjects' comprehension of the information presented. The objective tests were analyzed using repeated measures analysis, and a post hoc test was conducted to identify the means that were significantly different. The study yielded the following results: (1) subjects with high prior knowledge comprehended texts significantly better than subjects with low prior knowledge; (2) the level of L2 listening proficiency had a significant effect on the L2 listening comprehension of texts, but there was no interaction between prior knowledge and the level of L2 listening proficiency.

Perspective Coherence in Simultaneous Interpreting - with Reference to German-Korean Interpreting - (동시통역과 시각적 응집성 - 독한 통역을 중심으로 -)

  • Ahn In-Kyoung
    • Koreanische Zeitschrift für Deutsche Sprachwissenschaft / v.9 / pp.169-193 / 2004
  • In simultaneous interpreting, if the syntactic structures of the source language and the target language are very different, interpreters have to wait before they can reformulate source text segments into a meaningful utterance in the target language. To avoid unduly increasing the memory load and to minimize pauses, interpreters inevitably adapt the target language structure to that of the source language. While such adaptation makes simultaneous interpreting possible, it damages the perspective coherence of the text. Discovering when perspective coherence is impaired, and how the problem can be relieved, will enable interpreters to enhance their performance. This paper analyzes the reasons for damage to perspective coherence by looking at examples of German-Korean simultaneous interpreting.

The Informative Support and Emotional Support Classification Model for Medical Web Forums using Text Analysis (의료 웹포럼에서의 텍스트 분석을 통한 정보적 지지 및 감성적 지지 유형의 글 분류 모델)

  • Woo, Jiyoung;Lee, Min-Jung;Ku, Yungchang
    • Journal of Information Technology Services / v.11 no.sup / pp.139-152 / 2012
  • In medical web forums, people share medical experience and information as patients and patients' families. Some people search for medical information written in non-expert language, and some offer words of comfort to those who are suffering from diseases. Medical web forums thus provide both informative support and emotional support. We propose a model that automatically classifies articles in a medical web forum as informative support or emotional support. We extract text features from forum articles using text mining techniques from a linguistic perspective and then apply supervised learning to classify texts into the informative support and emotional support types. For automatic classification we adopt the Support Vector Machine (SVM), naive Bayes, and decision tree classifiers. We apply the proposed model to the HealthBoards forum, which is one of the largest and most dynamic medical web forums.

Embedding with different levels for idiom disambiguation (관용표현 중의성 해소를 위한 다층위 임베딩 연구)

  • Park, Seo-Yoon;Kang, Ye-Jee;Kang, Hye-Rin;Jang, Yeon-Ji;Kim, Han-Saem
    • Annual Conference on Human and Language Technology / 2021.10a / pp.167-172 / 2021
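  • Many idiomatic expressions are ambiguous. A single expression can be interpreted in two or more ways depending on context, with either a literal meaning or an idiomatic meaning, so applying idioms of this type to natural language processing tasks without disambiguation causes problems. This study builds on the ambiguity of idioms and on the 'Idiom Principle', which holds that 'an idiom is already stored in the user's mind as a single token'. We built embeddings for idioms at three levels (surface form, simple single-token form, and stemmed single-token form) and carried out idiom classification experiments. The results confirm that training on single tokens after stemming has a significant effect on idiom disambiguation, compared with training on the surface form or on simple single tokens without stemming.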
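
The three input levels named in this abstract can be sketched on a toy English example. This is only an illustration under loud assumptions: the study worked on Korean and its actual preprocessing is not described here, `toy_stem` is a hypothetical suffix stripper standing in for a real stemmer, and `three_levels` is an invented helper name. The sketch shows only how the token sequences differ before any embedding is trained.

```python
def toy_stem(word):
    """Hypothetical suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def find_idiom(tokens, idiom_stems):
    """Return (start, end) of the first span whose stems match the idiom, else None."""
    n = len(idiom_stems)
    for i in range(len(tokens) - n + 1):
        if [toy_stem(t) for t in tokens[i:i + n]] == idiom_stems:
            return i, i + n
    return None

def three_levels(sentence, idiom):
    """Build the surface, single-token, and stemmed single-token views of a sentence."""
    tokens = sentence.lower().split()
    idiom_stems = [toy_stem(t) for t in idiom.lower().split()]
    span = find_idiom(tokens, idiom_stems)
    if span is None:
        # No idiom found: all three views are the plain token list.
        return {"surface": tokens, "single_token": tokens, "stemmed_single_token": tokens}
    start, end = span
    simple = "_".join(tokens[start:end])  # keeps the inflected surface form
    stemmed = "_".join(idiom_stems)       # one shared token for all inflections
    return {
        "surface": tokens,
        "single_token": tokens[:start] + [simple] + tokens[end:],
        "stemmed_single_token": tokens[:start] + [stemmed] + tokens[end:],
    }

if __name__ == "__main__":
    for level, toks in three_levels("He kicked the bucket yesterday", "kick the bucket").items():
        print(level, toks)
```

In this sketch, "kicked the bucket" and "kicks the bucket" collapse to the same stemmed token `kick_the_bucket`, whereas the simple single-token view keeps them apart; that shared token is one plausible reason a stemmed single-token level could give an embedding model more examples per idiom.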

A Cognitive Pragmatic Approach to Contextual Effects In Modern Korean Poetry (한국 현대시 텍스트의 맥락 효과에 관한 인지 화용론적 연구)

  • Lee, Hyonho
    • Korean Journal of Cognitive Science / v.4 no.2 / pp.5-28 / 1994
  • In this thesis we attempt to analyze modern Korean poetic texts in the frameworks of text linguistics and cognitive pragmatics. Both frameworks describe and explain human verbal communication in terms of cognitive information-processing procedures. By using the analytical devices provided by the seven standards of textuality, we can analyze any type of text, especially in terms of the cognitive operations underlying the processes of production and reception. The cognitive pragmatic framework claims that human ostensive-inferential communication is regulated by the Principle of Relevance. We claim that the relevance-based framework of pragmatics provides evidence and a rationale for the cognitive operations identified in the text linguistic framework. Poetic texts involve every kind of cognitive strategy and processing procedure underlying human verbal communication. So, if modern Korean poetic texts can be satisfactorily analyzed by text linguistics and cognitive pragmatics, then both frameworks are very useful tools for analyzing texts, and all other text types, which are less complicated than poetic texts, can also be analyzed by these frameworks. Researchers of poetry, and poets, are sensitive to poetic effects. They perceive more poeticity when reading poetic texts than ordinary readers do. However, these researchers or poets sometimes give different interpretations of a single poetic text. The interpretation of poetry cannot be arbitrary, because poets write poems with particular intentions and do not simply throw them out to be interpreted at random. This thesis suggests that the poeticity felt by the reader can be described and accounted for in a scientific way. In other words, text linguistics and cognitive pragmatics enable researchers of poetry to become objective in interpreting poetic texts. It will be clearly shown that we have to see poetic texts from a cognitive perspective, since they are by-products of cognitive processing performed by discourse participants.