• Title/Summary/Keyword: word context

Word Sense Disambiguation of Korean Verbs Using Weight Information from Context (가중치 정보를 이용한 한국어 동사의 의미 중의성 해소)

  • Lim, Soo-Jong;Park, Young-Ja;Song, Man-Suk
    • Annual Conference on Human and Language Technology / 1998.10c / pp.425-429 / 1998
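  • This paper proposes a word sense disambiguation model for Korean verbs that uses weight information extracted from context. A sense-determiner vector is built from the words that influence the sense of the ambiguous word in the sentence where it occurs, and a sense-entry vector is built from the words used in that word's sense entries in a dictionary. The sense of the target word is determined by computing the similarity between the two vectors. The similarity computation uses co-occurrence relations extracted from the dictionary, together with weight information based on distance and part-of-speech information extracted from the sentence containing the target word. Internal and external experiments were conducted on four Korean verbs. The internal experiment shows 84% accuracy and a 50% improvement over the baseline; the external experiment shows 75% accuracy and a 40% improvement over the baseline.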
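
The vector comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the 1/distance weighting, the `pos_weight` table, the cosine measure, and the English toy data are all assumptions made for illustration, and the dictionary co-occurrence component is left out.

```python
import math


def context_vector(tokens, target_index, pos_weight):
    """Build a word -> weight vector from the words around the target.

    Assumption: weight = POS weight / distance to the target word.
    """
    vec = {}
    for i, (word, pos) in enumerate(tokens):
        if i == target_index:
            continue
        weight = pos_weight.get(pos, 0.0) / abs(i - target_index)
        if weight > 0:
            vec[word] = vec.get(word, 0.0) + weight
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse word -> weight vectors."""
    dot = sum(w * v.get(word, 0.0) for word, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def disambiguate(tokens, target_index, sense_vectors, pos_weight):
    """Pick the sense whose dictionary vector is most similar to the context."""
    ctx = context_vector(tokens, target_index, pos_weight)
    return max(sense_vectors, key=lambda s: cosine(ctx, sense_vectors[s]))


# Hypothetical toy data: (word, part-of-speech) pairs and two sense entries.
tokens = [("she", "PRON"), ("run", "V"), ("a", "DET"), ("company", "N")]
sense_vectors = {
    "manage": {"company": 1.0, "business": 1.0},
    "move_fast": {"race": 1.0, "marathon": 1.0},
}
pos_weight = {"N": 1.0, "V": 0.5}

print(disambiguate(tokens, 1, sense_vectors, pos_weight))  # prints "manage"
```

Only the noun "company" receives a nonzero weight here, so the context vector overlaps with the "manage" entry alone.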

Literature of the Bittersweet: Kim Sung-ok and 1960s Korea

  • Kim, Daniel-H.
    • Lingua Humanitatis / v.2 no.1 / pp.175-212 / 2002
  • This paper centers on the erstwhile novelist Kim Seung-ok, focusing not only on his watershed works of the 1960s but also on his lesser-known works of the 1970s and on the circumstances of, and possible reasons for, his decision to quit writing in the 1980s. His works from the 60s address certain basic human contradictions and are in this respect timeless, yet these same works are also firmly grounded in the larger socio-cultural context of 1960s Korea. This article attempts to place the word firmly here sous rature. In New Critical terms, Kim's settings cannot be understood as anything but Korea, then and now. This characteristic is shared, however, with highly ideological literature that at times seems to want to beat the reader over the head with the problems and author-sponsored solutions of then and now. In order to understand Kim's paradoxical position in Korean literary history, one must view his works from within the context of the debate between pure and engagement literature.

Improving Recall for Context-Sensitive Spelling Correction Rules by Combining Rule-Generalization and Statistical Method (규칙의 일반화와 통계 방식을 결합한 한국어 문맥의존 철자오류 교정규칙의 재현율 향상)

  • Choi, Hyun-Soo;Kwon, Hyuk-Chul;Yoon, Aesun
    • Annual Conference on Human and Language Technology / 2014.10a / pp.18-23 / 2014
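  • A Korean spelling checker is a system that finds erroneous words in electronic Korean text and suggests replacement words to correct them. Errors fall broadly into two types: simple spelling errors and context-sensitive spelling errors. A context-sensitive spelling error is correct when viewed at the level of the individual word (eojeol) but wrong once the context is considered, which makes it very difficult to correct. Methods for correcting context-sensitive spelling errors divide broadly into rule-based methods and methods based on statistical information. Rule-based methods, by their nature, achieve very high precision but very low recall. This paper proposes combining the rule-generalization approach from the authors' earlier work with a statistical method that uses conditional probability, improving recall while maintaining precision.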
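
The statistical side of such a checker can be sketched roughly as follows. This is an assumption-laden illustration, not the paper's method: the context is reduced to a single neighbouring word, the confusion set and `threshold` are hypothetical, and the rule-generalization component is not modelled.

```python
from collections import Counter


def train(pairs):
    """Count (context_word, target_word) pairs observed in correct text."""
    joint = Counter(pairs)
    context_totals = Counter()
    for (ctx, _word), n in joint.items():
        context_totals[ctx] += n
    return joint, context_totals


def cond_prob(joint, context_totals, ctx, word):
    """Estimate P(word | ctx) from the counts; 0.0 for an unseen context."""
    total = context_totals[ctx]
    return joint[(ctx, word)] / total if total else 0.0


def correct(word, ctx, confusion_set, joint, context_totals, threshold=0.8):
    """Replace `word` only when another candidate is clearly more probable."""
    best = max(confusion_set,
               key=lambda w: cond_prob(joint, context_totals, ctx, w))
    if best != word and cond_prob(joint, context_totals, ctx, best) >= threshold:
        return best
    return word


# Hypothetical English toy counts standing in for a Korean corpus.
pairs = ([("over", "there")] * 9 + [("over", "their")]
         + [("lost", "their")] * 5)
joint, context_totals = train(pairs)
confusion_set = ("their", "there")

print(correct("their", "over", confusion_set, joint, context_totals))  # prints "there"
```

Requiring a high threshold before replacing a word is one way to keep precision high while still catching errors that no hand-written rule covers.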

Unseen Model Prediction using an Optimal Decision Tree (Optimal Decision Tree를 이용한 Unseen Model 추정방법)

  • Kim Sungtak;Kim Hoi-Rin
    • MALSORI / no.45 / pp.117-126 / 2003
  • Decision tree-based state tying has been proposed in recent years as the most popular approach for clustering the states of context-dependent hidden Markov models in speech recognition. The aim of state tying is to reduce the number of free parameters and to predict the state probability distributions of unseen models. However, when performing state tying, the size of the decision tree is very important for word-independent recognition. In this paper, we try to construct an optimized decision tree based on the average of the feature vectors in the state pool and the number of seen models. We observed that the proposed optimal decision tree is effective in predicting the state probability distributions of unseen models.

ON A SECURE BINARY SEQUENCE GENERATED BY A QUADRATIC POLYNOMIAL ON $\mathbb{Z}_{2^n}$

  • Rhee, Min-Surp
    • Journal of applied mathematics & informatics / v.29 no.1_2 / pp.247-255 / 2011
  • Invertible functions with a single cycle property have many cryptographic applications. The main context in which we study them in this paper is pseudorandom generation and stream ciphers. Some cryptographic applications need a generator that produces binary sequences with a sufficiently long period. A common way to increase the size of the state and extend the period of a generator is to run several generators with different periods in parallel and combine their outputs. In this paper we characterize a secure quadratic polynomial on $\mathbb{Z}_{2^n}$, which generates a binary sequence of period long enough and without consecutive elements.
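
The single cycle property mentioned in this abstract can be checked by brute force for small $n$. The sketch below is an illustration only: the function name and the example coefficients are assumptions, and it does not reproduce the paper's characterization or its security criteria.

```python
def has_single_cycle(a, b, c, n):
    """Check whether f(x) = a*x^2 + b*x + c (mod 2^n) has a single cycle.

    Starting from 0, f has a single cycle exactly when the orbit first
    returns to 0 after 2^n steps, having visited every residue once.
    """
    modulus = 1 << n
    x = 0
    for step in range(1, modulus + 1):
        x = (a * x * x + b * x + c) % modulus
        if x == 0:
            return step == modulus
    return False  # 0 was never revisited, so f is not a single-cycle map


# Hypothetical example coefficients: f(x) = 2x^2 + 3x + 1 on Z_16.
print(has_single_cycle(2, 3, 1, 4))  # prints True
# The identity map f(x) = x fixes 0, so its cycle through 0 has length 1.
print(has_single_cycle(0, 1, 0, 4))  # prints False
```

The loop costs $2^n$ steps, so it is only practical as a sanity check on small moduli; a closed-form condition on the coefficients is what makes large $n$ tractable.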

Design of Big Data Preference Analysis System (빅데이터 선호도 분석 시스템 설계)

  • Son, Sung Il;Park, Chan Khon
    • Journal of Korea Multimedia Society / v.17 no.11 / pp.1286-1295 / 2014
  • This paper proposes a way to improve the reliability of preference estimates drawn from user feedback by adding weighting factors to sentiment analysis, and to efficiently analyze users' sentiment in the large volume of data generated on Twitter. To address errors in earlier studies, it improves the recall and precision of sentiment determination by using a sentiment dictionary in which polarity is subdivided by sentiment strength, and by adding slang, new words, emoticons, and idiomatic expressions that are missing from the system dictionary. It takes context into account through conjunctive adverbs, which suits Korean, a language with relatively free word order. It also uses TF (term frequency), RT (retweet count), and follower count as weighting factors for preference, increasing the reliability of preference analysis by weighting 'a very emotional tweet', 'a tweet recognised by users', and 'an influential Twitter user'.

Acoustic Features of Phonatory Offset-Onset in the Connected Speech between a Female Stutterer and Non-Stutterers (연속구어 내 발성 종결-개시의 음향학적 특징 - 말더듬 화자와 비말더듬 화자 비교 -)

  • Han, Ji-Yeon;Lee, Ok-Bun
    • Speech Sciences / v.13 no.2 / pp.19-33 / 2006
  • The purpose of this paper was to examine the acoustic characteristics of the phonatory offset-onset mechanism in the connected speech of female adults with stuttering and with normal nonfluency. The phonatory offset-onset mechanism refers to the laryngeal articulatory gestures that are required to mark word boundaries in the phonetic contexts of connected speech. This mechanism comprised 7 patterns based on the speech spectrogram. The study examined acoustic features in the connected speech of female adults with stuttering (n=1) and with normal nonfluency (n=3). Speech tokens in V_V, V_H, and V_S contexts were selected for the analysis. Speech samples were recorded with Sound Forge, and the spectrographic analysis was conducted using Praat. Results revealed that the female speaker who stutters (block type) exhibited more laryngealization gestures in the V_V context. Laryngealization gestures were characterized more by a complete glottal stop or glottal fry in both the V_H and V_S contexts. The results are discussed from theoretical and clinical perspectives.

Gradient Reduction of $C_1$ in /pk/ Sequences

  • Son, Min-Jung
    • Speech Sciences / v.15 no.4 / pp.43-60 / 2008
  • Instrumental studies (e.g., aerodynamic, EPG, and EMMA) have shown that the first of two stops in sequence can sometimes be articulatorily reduced in time and space, either gradiently or categorically. The current EMMA study aims to examine possible factors, both linguistic (e.g., speech rate, word boundary, and prosodic boundary) and paralinguistic (e.g., natural context and repetition), that induce gradient reduction of $C_1$ in /pk/ cluster sequences. EMMA data were collected from five Seoul-Korean speakers. The results show that gradient reduction of lip aperture seldom occurs, being quite restricted in both speaker frequency and token frequency. The results also suggest that place assimilation is not a lexical process, implying that speakers have not fully developed this process to the point of phonologizing it at the abstract level.

Bracketing Input for Accurate Parsing

  • No, Yong-Kyoon
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.358-364 / 2007
  • Syntax parsers can benefit from speakers' intuition about constituent structures indicated in the input string in the form of parentheses. Focusing on languages like Korean, whose orthographic convention requires more than one word to be written without spaces, we describe an algorithm for passing the bracketing information across the tagger to the probabilistic CFG parser, together with one for heightening (or penalizing, as the case may be) probabilities of putative constituents as they are suggested by the parser. It is shown that two or three constituents marked in the input suffice to guide the parser to the correct parse as the most likely one, even with sentences that are considered long.

The Philosophy and Linguistics of Dao : the Ancient Chinese Philosophy and Language (도의 철학과 도의 언어학 -고대 중국의 철학과 언어-)

  • 정재현
    • Lingua Humanitatis / v.5 / pp.109-126 / 2003
  • The aim of this paper is to elucidate ancient Chinese philosophy and linguistics through the concept of the Dao. Ancient Chinese thought developed together with ancient Chinese theories of language and the linguistic features of Classical Chinese. The concept of the Dao served as an intermediary among them. The Dao that ancient Chinese philosophers sought has several characteristics: ethical normativity, wholeness, dynamicity, and non-reducibility. Linguistic studies also revealed them. The following linguistic features of Classical Chinese are the cause and/or the effect of such Dao-based philosophy and linguistics: no explicit subject-predicate sentential structure, no parts of speech, heavy reliance on word order and context for meaning determination, no explicit distinction between compound words and sentences, the pictographic or ideographic features of Chinese graphs, and the non-existence of a copula.
