• Title/Abstract/Keywords: sentence representation

58 results (processing time: 0.022 s)

Combining Sentimental Expression-level and Sentence-level Classifiers to Improve Subjective Sentence Classification

  • 강인호
    • 정보처리학회논문지B / Vol. 14B, No. 7 / pp.559-566 / 2007
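  • A subjective sentence is a sentence containing subjective content, from which the author's opinion about a product or an event can be learned. The subjective expressions that signal such content may be spread evenly across the whole sentence or may be found only in a limited region. For more accurate classification, therefore, it is necessary to use not only information that considers the whole sentence but also information from subjective or objective expression phrases that convey facts or sentiment. This study proposes a method that improves a subjective/objective sentence classifier by combining classification results obtained from the whole sentence with classification results obtained from sentimental expressions. Because a sentence may contain several expressions, multiple expression-level results are obtained, and these are combined with the sentence-level result using machine learning. In experiments, combining the two highest-valued expression-level results with the whole-sentence result yielded an accuracy of 79.7%, a 2.5% improvement.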
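The combination step described in this abstract can be sketched as a small feature-building routine. This is a minimal illustration, not the author's implementation: the function name `combined_features`, the padding value, and the assumption that each classifier emits a single real-valued score are all hypothetical. The abstract only states that the two highest expression-level results are combined with the sentence-level result by a learned model.

```python
from typing import List, Sequence


def combined_features(
    sentence_score: float,
    expression_scores: Sequence[float],
    k: int = 2,
    pad: float = 0.0,
) -> List[float]:
    """Build the input vector for a second-stage combiner.

    Hypothetical layout: [sentence-level score, top-k expression-level scores].
    Sentences with fewer than k expressions are padded with `pad` (an assumption).
    """
    # Keep only the k largest expression-level scores, highest first.
    top_k = sorted(expression_scores, reverse=True)[:k]
    # Pad so that every sentence yields a vector of the same length.
    top_k += [pad] * (k - len(top_k))
    return [sentence_score] + top_k


if __name__ == "__main__":
    # A sentence with three expression-level scores and one sentence-level score.
    print(combined_features(0.6, [0.2, 0.9, 0.7]))
```

The resulting fixed-length vectors could then be passed to any standard classifier; which learner the paper used for this step is not stated in the abstract.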

The Effect of Syntactic and Pragmatic Constraints on Sentential Representation and Memory Accessibility

  • 김성일;이재호
    • 인지과학 / Vol. 6, No. 2 / pp.97-116 / 1995
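  • This study examined how syntactic and pragmatic constraints affect the representation of, and memory access to, each constituent over time during the formation of a sentence representation. To separate syntactic from pragmatic constraints, the syntactic role of the constituents (subject, object) and their order of mention (first, second) were manipulated. To assess the representational strength of sentence constituents through ease of memory access, each sentence was presented segment by segment, after which recognition response times to a target word were measured. In Experiment 1, where the probe recognition delay was 255 ms, memory access was 28 ms faster for subjects than for objects and 28 ms faster for first-mentioned than for later-mentioned information. In Experiment 2, where the delay was lengthened to 1540 ms, there was no difference in memory access time between subjects and objects, while first-mentioned information was accessed 48 ms faster than later-mentioned information. Thus, both syntactic and pragmatic constraints have independent effects early in sentence representation formation, but after some time the effect of syntactic constraints disappears and only that of pragmatic constraints remains. These results support the theoretical position that the memory representation of a sentence is a process in which a mental model is gradually built through the convergent satisfaction of multiple constraints.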

A Linguistic Study on the Sentence Problems in 2015 Revised Elementary Mathematics Textbooks

  • 김영아;김성준
    • East Asian mathematical journal / Vol. 35, No. 2 / pp.115-139 / 2019
  • In problem-solving education, sentence problems are a tool for the comprehensive evaluation of mathematical ability. A sentence problem is a mathematical problem expressed in sentence form rather than as a purely numerical representation. Solving sentence problems, which mix mathematical terms with general language, requires not only computational ability but also the ability to understand the meaning of sentences. It is therefore important to analyze the syntactic elements of sentence problems from a linguistic perspective. The purpose of this study is to investigate the complexity of sentence problems in terms of sentence length, and the grammatical complexity of sentences in terms of sentence depth, by analyzing the 51 sentence problems presented in the 4th grade mathematics textbook (2015 revised curriculum). The results confirmed that the length and depth of sentences need to be examined more carefully in the teaching and learning of sentence problems. In elementary mathematics in particular, sentence problems require a linguistic understanding of the sentence, so syntactic elements need to be considered when developing and teaching sentence problems in mathematics textbooks.

Effectiveness of Fuzzy Graph Based Document Model

  • Aswathy M R;P.C. Reghu Raj;Ajeesh Ramanujan
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 8 / pp.2178-2198 / 2024
  • Graph-based document models are well suited to revealing inter-dependencies among unstructured text data. Natural language processing (NLP) systems that use such models as an intermediate representation have shown good performance. This paper proposes a novel fuzzy graph-based document model and demonstrates its effectiveness by applying fuzzy logic tools to text summarization. The proposed system accepts a text document as input and identifies some of its sentence-level features, namely sentence position, sentence length, numerical data, thematic word, proper noun, title feature, upper case feature, and sentence similarity. The fuzzy membership value of each feature is computed from the sentences. We also propose a novel algorithm to construct the fuzzy graph as an intermediate representation of the input document. The Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric is used to evaluate the model. An evaluation based on different quality metrics was also performed to verify the effectiveness of the model. The ANOVA test confirms the hypothesis that the proposed model improves summarizer performance by 10% compared with state-of-the-art summarizers that employ alternative intermediate representations of the input text.
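To make the idea of a fuzzy membership value for a sentence-level feature concrete, the sketch below uses a triangular membership function for sentence length and a linearly decaying one for sentence position. Both shapes, the function names, and the breakpoints are assumptions chosen for illustration. The abstract does not say which membership functions the authors use.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at or outside a and c, rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def position_membership(index: int, n_sentences: int) -> float:
    """Assumed position feature: earlier sentences get higher membership."""
    if n_sentences <= 1:
        return 1.0
    return 1.0 - index / (n_sentences - 1)


if __name__ == "__main__":
    # Hypothetical breakpoints: sentences of about 10 words count as fully "medium length".
    print(triangular(5, 0, 10, 20))
    # Third sentence (index 2) of a five-sentence document.
    print(position_membership(2, 5))
```

Membership values of this kind lie in [0, 1], which is what allows them to serve as fuzzy vertex or edge weights in a graph representation.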

Prosodic Annotation in a Thai Text-to-speech System

  • Potisuk, Siripong
    • 한국언어정보학회:학술대회논문집 / 한국언어정보학회 2007 Regular Conference / pp.405-414 / 2007
  • This paper describes preliminary work on the prosody modeling aspect of a text-to-speech system for Thai. Specifically, the model is designed to predict symbolic markers from text (i.e., prosodic phrase boundaries, accent, and intonation boundaries) and then to use these markers to generate pitch, intensity, and durational patterns for the synthesis module of the system. In this paper, a novel method for annotating the prosodic structure of Thai sentences based on a dependency representation of syntax is presented. The goal of the annotation process is to predict from text the rhythm of the input sentence when spoken according to its intended meaning. The encoding of the prosodic structure is established by minimizing speech disrhythmy while maintaining congruency with syntax. That is, each word in the sentence is assigned a prosodic feature called strength dynamic, which is based on the dependency representation of syntax. The assigned strength dynamics are then used to obtain rhythmic groupings in terms of a phonological unit called the foot. Finally, the foot structure is used to predict the durational pattern of the input sentence. This process has been tested on a set of ambiguous sentences that represent various structural ambiguities involving five types of compounds in Thai.

Prosody in Spoken Language Processing

  • Schafer Amy J.;Jun Sun-Ah
    • 한국음향학회:학술대회논문집 / 한국음향학회 2000 Summer Conference Proceedings, Vol. 19, No. 1 / pp.7-10 / 2000
  • Studies of prosody and sentence processing have demonstrated that prosodic phrasing can exhibit strong effects on processing decisions in English. In this paper, we tested Korean sentence fragments containing syntactically ambiguous Adj-N1-N2 strings in a cross-modal naming task. Four accentual phrasing patterns were tested: (a) the default phrasing pattern, in which each word forms an accentual phrase; (b) a phrasing biased toward N1 modification; (c) a phrasing biased toward complex-NP modification; and (d) a phrasing used with adjective focus. Patterns (b) and (c) are disambiguating phrasings; the other two are commonly found with both interpretations and are thus ambiguous. The results showed that naming times for items produced with prosody contradicting the semantic grouping were significantly longer than for items produced with either default or supporting prosody. We claim that, as in English, prosodic information in Korean is parsed into a well-formed prosodic representation during the early stages of processing. The partially constructed prosodic representation produces incremental effects on syntactic and semantic processing decisions and is retained in memory to influence reanalysis decisions.

Untangling Anaphoric Threads

  • 정소우
    • 한국언어정보학회지:언어와정보 / Vol. 8, No. 2 / pp.1-25 / 2004
  • This paper examines two different approaches to resolving a theoretical problem that the bottom-up version of Discourse Representation Theory of Kamp et al. (2003) faces in dealing with anaphoric relations between pronouns and their potential antecedents in conditional sentences where consequent clauses precede their corresponding conditional clauses. In one of the approaches, every element is processed in the order of occurrence, and conditional operators in a non-sentence-initial position cause the ongoing DR to split in two with the same index. The definition of accessibility is accordingly modified so that the right DR can be accessible from the left DR. In the other approach, a different type of discourse representation structure, $K \Leftarrow K$, is introduced, which allows us to resolve the target problem without modifying accessibility as proposed in Kamp et al. (2003). Compatibility of these two approaches with the bottom-up version of DRT is evaluated by examining their applicability to the analysis of quantified sentences where pronominal expressions precede generalized quantifiers.

On the Problems of English Intonation Representation in English Textbooks

  • 오세풍;장영수;이용재
    • 음성과학 / Vol. 8, No. 4 / pp.243-257 / 2001
  • English textbooks use three kinds of intonation representation: Trager & Smith's, the weak-strong method, and the audio-lingual method. Each has its merits and demerits, so no single one is sufficient to represent English intonation properly. Trager & Smith's representation is suited to showing holistic intonation itself, but it is not appropriate for representing downstep, declination, etc. The weak-strong representation shows the weak and strong points in a sentence well, but it is not consistent with intonation. Instead of these representations, some textbooks adopt the audio-lingual method, which gives students more chances to hear native speakers' intonation but offers no way to understand English intonation itself. English textbooks do not grade intonation by students' proficiency, and despite the variety of intonations they adopt only a few limited intonation models. Thus, it is necessary to review all kinds of intonation representation and to develop a more advanced and relevant representation of English intonation.

KI-HABS: Key Information Guided Hierarchical Abstractive Summarization

  • Zhang, Mengli;Zhou, Gang;Yu, Wanting;Liu, Wenfen
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 12 / pp.4275-4291 / 2021
  • With the unprecedented growth of textual information on the Internet, an efficient automatic summarization system has become an urgent need. Recently, the neural network models based on the encoder-decoder with an attention mechanism have demonstrated powerful capabilities in the sentence summarization task. However, for paragraphs or longer document summarization, these models fail to mine the core information in the input text, which leads to information loss and repetitions. In this paper, we propose an abstractive document summarization method by applying guidance signals of key sentences to the encoder based on the hierarchical encoder-decoder architecture, denoted as KI-HABS. Specifically, we first train an extractor to extract key sentences in the input document by the hierarchical bidirectional GRU. Then, we encode the key sentences to the key information representation in the sentence level. Finally, we adopt key information representation guided selective encoding strategies to filter source information, which establishes a connection between the key sentences and the document. We use the CNN/Daily Mail and Gigaword datasets to evaluate our model. The experimental results demonstrate that our method generates more informative and concise summaries, achieving better performance than the competitive models.

On the Effectiveness of the Special Token Cutoff Method for Korean Sentence Representation in Unsupervised Contrastive Learning

  • 한명수;정유현;채동규
    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리) / 한국정보과학회언어공학연구회 2023, 35th Conference on Hangul and Korean Language Information Processing / pp.491-496 / 2023
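  • Various contrastive learning methods are being studied to improve pretrained language models so that they yield high-quality sentence representations. However, most contrastive learning methods consider only the relationship within a sentence pair and are limited in capturing the degree of similarity between sentences, which undermines the fundamental objective of contrastive learning. Recent studies have therefore introduced a triplet loss function to capture the relative similarity of sentences and thereby improve contrastive learning performance. Many of these studies, however, target English pretrained language models, and verification and analysis of the effectiveness of the triplet loss for Korean unsupervised contrastive learning remain lacking. In this paper, we closely examine whether this methodology is also effective for Korean unsupervised contrastive learning and confirm its validity using a variety of evaluation metrics. We hope that our results contribute to future research on Korean sentence representation.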
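The triplet loss mentioned in this abstract can be illustrated with a short, dependency-free sketch. It uses the standard margin formulation, max(0, d(anchor, positive) - d(anchor, negative) + margin), with cosine distance. The choice of cosine distance, the margin value, and the toy vectors are assumptions for illustration. The abstract does not specify the paper's exact loss configuration or how triplets are constructed.

```python
import math
from typing import Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.5,  # hypothetical margin value
) -> float:
    """Zero once the positive is closer to the anchor than the negative by at least `margin`."""
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(0.0, gap + margin)


if __name__ == "__main__":
    # Toy 2-d "sentence embeddings": the positive matches the anchor, the negative is orthogonal.
    print(triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0]))
```

Unlike a pairwise objective, this loss depends on the difference between two distances, which is how it encodes the relative similarity of sentences.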
