• Title/Summary/Keyword: Lexicon

Ontology Construction and Its Application to Disambiguate Word Senses (온톨로지 구축 및 단어 의미 중의성 해소에의 활용)

  • Kang, Sin-Jae
    • The KIPS Transactions:PartB / v.11B no.4 / pp.491-500 / 2004
  • This paper presents an ontology construction method that uses various computational language resources, and an ontology-based word sense disambiguation method. To acquire a reasonably practical ontology, the Kadokawa thesaurus is extended by inserting additional semantic relations into its hierarchy; these are classified as case relations and other semantic relations. To disambiguate word senses, we first apply previously secured dictionary information to select the correct senses of some ambiguous words with high precision, and then use the ontology to disambiguate the remaining ambiguous words. The mutual information between concepts in the ontology is calculated before the ontology is used as knowledge for disambiguation. If mutual information is regarded as a weight between ontology concepts, the ontology can be treated as a graph with weighted edges, in which we locate the weighted path from one concept to another. In our practical machine translation system, the word sense disambiguation method achieved a 9% improvement in Korean translation over methods that do not use an ontology.
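
The idea of treating mutual information as edge weights can be illustrated with a small sketch. The abstract does not say how the weights are estimated or how the path is scored, so the following assumes pointwise mutual information (PMI) computed from co-occurrence counts and a Dijkstra search over cost = 1 / weight; the function names and the toy counts are hypothetical.

```python
import heapq
import math
from collections import defaultdict

def pmi(pair_count, count_a, count_b, total):
    """Pointwise mutual information: log2(P(a, b) / (P(a) * P(b)))."""
    return math.log2((pair_count * total) / (count_a * count_b))

def build_graph(concept_counts, pair_counts, total):
    """Undirected concept graph; only positive PMI scores become edges (assumption)."""
    graph = defaultdict(dict)
    for (a, b), n_ab in pair_counts.items():
        weight = pmi(n_ab, concept_counts[a], concept_counts[b], total)
        if weight > 0:
            graph[a][b] = weight
            graph[b][a] = weight
    return graph

def strongest_path(graph, source, target):
    """Dijkstra over cost = 1 / weight, so strongly associated edges are cheap."""
    queue = [(0.0, source, [source])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in seen:
                heapq.heappush(queue, (cost + 1.0 / weight, neighbor, path + [neighbor]))
    return math.inf, []

# Toy counts, invented purely for illustration.
counts = {"bank": 40, "money": 50, "river": 30, "water": 40}
pairs = {("bank", "money"): 20, ("bank", "river"): 5,
         ("river", "water"): 15, ("money", "water"): 1}
graph = build_graph(counts, pairs, total=1000)
cost, path = strongest_path(graph, "bank", "water")
print(path, round(cost, 3))
```

In a disambiguation setting, one might compare the path cost from each candidate sense of a word to the concepts of its context words and prefer the cheapest; that selection rule is likewise an assumption, not a detail given in the abstract.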

Performance and Limitations of a Korean Sentiment Lexicon Built on the English SentiWordNet (영어 SentiWordNet을 이용하여 구축한 한국어 감성어휘사전의 성능 평가와 한계 연구)

  • Shin, Donghyok;Kim, Sairom;Cho, Donghee;Nguyen, Minh Dieu;Park, Soongang;Eo, Keonjoo;Nam, Jeesun
    • 한국어정보학회:학술대회논문집 / 2016.10a / pp.189-194 / 2016
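  • As part of the MUSE project, which builds multilingual sentiment lexicons and sentiment-annotated corpora, this study examines the significance and limitations of building a Korean sentiment lexicon from SentiWordNet, a representative English sentiment lexicon. First, from the 117,659 entries of the English SentiWordNet, entries with a positive or negative score of 0.5 or higher were extracted and automatically translated with Google Translate. After untranslated and duplicate items were removed and linguistic experts classified the remainder by hand, 3,665 sentiment words were obtained. Even this set, however, contained many disease names and other items that can hardly be regarded as genuine sentiment words. When it was applied to corpora to identify sentiment words automatically, recall was only 47.4% (positive) and 37.7% (negative) on a restaurant corpus, and 55.2% and 32.4% on an IT corpus. The F-measure was likewise low: 62.3% and 38.5% for positive and negative on the restaurant corpus, and 65.5% and 44.6% on the IT corpus, indicating that a SentiWordNet-based lexicon is not adequate to serve as a sentiment lexicon. On this basis, the study argues that building a Korean sentiment lexicon requires a systematic approach that takes the linguistic properties of Korean into account, and it introduces the SELEX sentiment lexicon, which is currently being supplemented and extended on the basis of the Korean electronic dictionary DECO.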
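
The extraction step (keeping only entries whose positive or negative score is at least 0.5) can be sketched as follows. This is a minimal illustration that assumes each entry is already available as a `(term, positive_score, negative_score)` tuple; the function name and the sample entries are hypothetical, and parsing the actual SentiWordNet file is not shown.

```python
def select_polar_entries(entries, threshold=0.5):
    """Keep entries whose positive or negative score reaches the threshold."""
    selected = []
    for term, pos_score, neg_score in entries:
        if pos_score >= threshold:
            selected.append((term, "positive"))
        elif neg_score >= threshold:
            selected.append((term, "negative"))
    return selected

# Invented sample entries, for illustration only.
sample = [("good", 0.75, 0.0), ("bad", 0.0, 0.625), ("table", 0.0, 0.0)]
print(select_polar_entries(sample))
```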

A Method based on Ontology for detecting errors in the Software Design (온톨로지 기반의 소프트웨어 설계에러검출방법)

  • Seo, Jin-Won;Kim, Young-Tae;Kong, Heon-Tag;Lim, Jae-Hyun;Kim, Chi-Su
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.10 / pp.2676-2683 / 2009
  • This thesis aims to improve software product quality by raising the quality of the software design through a better error detection method. It builds on a software design method called MOA (Methodology for Object to Agents), which uses the ontology-based ODES (A Method based on Ontology for Detecting Errors in the Software Design) model as a common information model. A new form of error detection method is defined. The method is applied during the transformation from a UML model to an ODES model and uses the ODES model, an Inter-View Inconsistency Detection technique, and a combination of the ontological properties of a consistency framework and related rules. The transformation to the ODES model includes lexicon analysis and semantic analysis of the software design, using multiple mapping tables in the algorithm that generates ODES model instances.

An Adaptive Pruning Threshold Algorithm for the Korean Address Speech Recognition (한국어 주소 음성인식의 고속화를 위한 적응 프루닝 문턱치 알고리즘)

  • 황철준;오세진;김범국;정호열;정현열
    • The Journal of the Acoustical Society of Korea / v.20 no.7 / pp.55-62 / 2001
  • In this paper, we propose a new adaptive pruning algorithm that effectively reduces the search space during the recognition process. Because the maximum probabilities of neighboring frames are highly correlated, an efficient pruning threshold can be obtained from the maximum probabilities of previous frames. The main idea is to update the threshold at the current frame using a combination of the previous maximum probability and the hypothesis probabilities. Because the current threshold is obtained during the ongoing recognition process, the algorithm does not need any preliminary experiments to find threshold values, even when the recognition task changes. In addition, the adaptively selected threshold improves recognition speed under different environments. The proposed algorithm has been applied to a Korean address recognition system. Experimental results show that, while preserving recognition accuracy, the proposed algorithm reduces the search space by an average of 14.4% and 9.14% compared with the previous methods that use fixed and variable pruning threshold values, respectively.
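
The abstract describes the threshold update only as "a combination of previous maximum probability and hypotheses probabilities" and gives no formula. The sketch below is therefore one hypothetical instance of the idea, not the authors' rule: it mixes the previous frame's best log score with the mean of the current hypothesis scores, using an assumed mixing factor `alpha` and an assumed fixed beam for the first frame.

```python
def adaptive_prune(frames, alpha=0.5, initial_beam=5.0):
    """Prune hypotheses frame by frame with a threshold derived from the previous frame.

    frames: list of dicts mapping hypothesis id -> log score at that frame.
    alpha and initial_beam are illustrative parameters, not values from the paper.
    """
    previous_best = None
    survivors = []
    for scores in frames:
        best = max(scores.values())
        if previous_best is None:
            # No history yet: fall back to a fixed beam below the best score.
            threshold = best - initial_beam
        else:
            mean_score = sum(scores.values()) / len(scores)
            mixed = alpha * previous_best + (1 - alpha) * mean_score
            # Never set the threshold above the best score, so it always survives.
            threshold = min(best, mixed)
        survivors.append({h: s for h, s in scores.items() if s >= threshold})
        previous_best = best
    return survivors

# Invented log scores for two frames and three hypotheses.
frames = [
    {"a": -1.0, "b": -2.0, "c": -9.0},
    {"a": -2.0, "b": -2.5, "c": -10.0},
]
for kept in adaptive_prune(frames):
    print(sorted(kept))
```

Because the threshold is recomputed from the scores observed during decoding, no separate tuning run is needed when the task changes, which is the property the abstract emphasizes.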

Cliche Analysis for English-Korean Interpretation and Translation Training : Mainly on Shakespeare's Works Texts (영·한 통번역 교육을 위한 클리셰(cliche) 분석 : 셰익스피어 극 텍스트를 중심으로)

  • You, Seon-Young
    • The Journal of the Korea Contents Association / v.15 no.11 / pp.626-634 / 2015
  • The purpose of this study was to analyze cliches for English-Korean interpretation and translation training, with special reference to cliches in the texts of Shakespeare's works. The term 'idiom' is generally used for figurative expressions in place of the term 'cliche'. Thus, cliches must be reinterpreted within the lexicon of useful expressions. Cliches are often idioms. Idioms are figurative phrases with an implied meaning; the phrase is not to be taken literally. This causes difficulty in translation because the meaning may not be understood by people from another culture. Cliches are figurative or literal expressions that are overused. Consequently, cliches are distinguished from idioms by their transparent meanings. This study examined the cliches found in the texts of Shakespeare's works. Ultimately, anyone who wants to become an effective English learner, interpreter, or translator should be familiar with cliches and would do well to use them in English learning settings. I hope this study will be of some help in that effort.

A Test of Hierarchical Model of Bilinguals Using Implicit and Explicit Memory Tasks (이중언어자의 위계모형 검증 : 암묵기억과제와 외현기억과제의 효과)

  • 김미라;정찬섭
    • Korean Journal of Cognitive Science / v.9 no.1 / pp.47-60 / 1998
  • The study was designed to investigate implicit and explicit memory effects [...] representations of bilinguals. Hierarchical model of bilingual information processing [...] word naming and translation tasks in the context of semantically categorized or random [...]. In Experiments 1 and 2, bilinguals first viewed stimulus words and performed naming or translation [...] then implicit and explicit memory tasks. In Experiment 1, word recognition times (exp[...]) were significantly faster for the semantic category condition than for the random category condition [...] naming task and lexical decision task (implicit memory task) showed no difference in [...]. In Experiment 2, the naming task and the explicit memory task showed a categorization effect but [...] and the implicit memory task showed no categorization effect. These findings support the [...] which posits that memory representations of bilinguals are composed of two independent [...] and one common conceptual store.

From Representational Geography to Non-Representational Geography: Paradigm Shifts of Landscape Studies in Anglophone Cultural and Historical Geography (경관지리학에서 경치지리학(景致地理學)으로: 영미권 문화역사지리학 경관연구 패러다임의 전환)

  • Song, Wonseob
    • Journal of the Korean Geographical Society / v.50 no.3 / pp.305-323 / 2015
  • The main purpose of this paper is to explore the paradigm shifts of landscape studies in Anglophone cultural and historical geography. By analyzing the work of the Berkeley School in the 1950s and 1960s, the advance of humanistic geography in the 1970s, the revival of cultural geography in the 1980s ("new cultural geography"), and the recent development of non-representational geography, this paper demonstrates that the paradigms of landscape studies in Anglophone cultural and historical geography have changed. By foregrounding the concept of 'Affect' (a kind of 'spatio-bodily-magnetic relation') as the essence of non-representational geography, I provide an accessible way of understanding the implications of non-representational geography. In addition, by re-conceptualising non-representational geography based on Non-Representational Theory (NRT) as 'Kyung-Chi Jirihak' in the context of the Korean lexicon, the paper suggests what the directions of landscape studies in Korean cultural and historical geography should be and how they can be established amid these paradigm shifts.

The Effects of Linguistic Contrast and Conceptual Hierarchy on Children's Word Learning (언어대비(言語對比)와 개념(槪念)의 위계성(位階性)이 아동의 단어학습에 미치는 효과)

  • Kim, Eun Heui;Lee, Kwee Ok
    • Korean Journal of Child Studies / v.14 no.2 / pp.79-94 / 1993
  • The purpose of this study was (1) to investigate whether linguistic contrast helps children map a new word into a specific semantic domain when the new word is introduced, (2) to examine the existence of a hierarchy of domains into which children will place a new word, and (3) to examine whether children's existing lexicons affect how children map a new word. A total of 320 children from 3 to 6 years of age were drawn from Pusan, Korea. The children were divided into four age groups of 80 children each. Within each age group, children were randomly assigned to one of four groups: the linguistic contrast group exposed to color, the linguistic contrast group exposed to shape, a label group, and a control group. All of the children were tested for production and comprehension of the new word. The results of this study were as follows: (1) Linguistic contrast helped children learn the meaning of a new word. In particular, children aged 4 or older showed a significant effect of linguistic contrast; however, it was not sufficient to teach 3-year-olds the correct referent of a term. (2) There was a hierarchy of domains into which children mapped a new word. There was no significant effect of domain for 3-year-old children, but from 4 years of age children showed a preference for assuming that a new word referred to an object's shape rather than its color. (3) Children's existing lexicon had no effect on how children comprehended a new word.

Feature Weighting for Opinion Classification of Comments on News Articles (뉴스 댓글의 감정 분류를 위한 자질 가중치 설정)

  • Lee, Kong-Joo;Kim, Jae-Hoon;Seo, Hyung-Won;Rhyu, Keel-Soo
    • Journal of Advanced Marine Engineering and Technology / v.34 no.6 / pp.871-879 / 2010
  • In this paper, we present a system that classifies comments on a news article by user opinion, called polarity (positive or negative). The system is a kind of document classification system for comments and is based on machine learning techniques such as support vector machines. Unlike normal documents, comments have a body that can influence how their opinions are classified into polarities. We propose a feature weighting scheme for opinion classification that uses such characteristics of comments together with several resources. In our experiments, the weighting scheme proved useful for opinion classification of comments on Korean news articles. Korean character n-grams (bigrams or trigrams) also proved helpful for opinion classification of comments that include many Internet words or typos. In the future, we will apply this scheme to opinion analysis of comments on product reviews as well as news articles.
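
Character n-gram features of the kind mentioned above can be extracted with a few lines of code. This is a minimal sketch, not the authors' feature extractor: the function name is hypothetical, and removing whitespace before extraction is an assumption.

```python
from collections import Counter

def char_ngrams(text, n_values=(2, 3)):
    """Count character bigrams and trigrams after removing whitespace (assumption)."""
    compact = "".join(text.split())
    grams = Counter()
    for n in n_values:
        for i in range(len(compact) - n + 1):
            grams[compact[i:i + n]] += 1
    return grams

features = char_ngrams("좋아요")
print(features)
```

Counts like these could then be weighted and passed to a classifier such as a support vector machine. Because the features are built from characters rather than dictionary words, they tolerate the Internet words and typos that the abstract mentions.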

Ultrasonographic Features of Triple-Negative Breast Cancer: a Comparison with Other Breast Cancer Subtypes

  • Yang, Qi;Liu, Hong-Yan;Liu, Dan;Song, Yan-Qiu
    • Asian Pacific Journal of Cancer Prevention / v.16 no.8 / pp.3229-3232 / 2015
  • Background: Triple-negative breast cancer (TNBC) is known to be associated with aggressive biologic features and a poor clinical outcome. Therefore, early detection of TNBC without missed diagnosis is required to improve prognosis. Preoperative ultrasound features of TNBC may assist in early diagnosis as characteristics of the disease. Purpose: To retrospectively evaluate the sonographic features of TNBC compared with ER (+) cancers, which include HER2 (-) and HER2 (+) tumors, and with HER2 (+) cancers that are ER (-). Materials and Methods: From June 2012 through June 2014, the sonographic features of 321 surgically confirmed cancers, comprising ER (+) cancers (n=214), HER2 (+) cancers (n=66), and TNBC (n=41), were retrospectively reviewed by two ultrasound specialists in consensus. The preoperative ultrasound and clinicopathological features were compared between the three subtypes. In addition, all cases were analyzed using the morphologic criteria of the ACR BI-RADS lexicon. Results: Ultrasonographically, TNBC presented as microlobulated nodules without microcalcification (p=0.034). A lower incidence of ductal carcinoma in situ (p<0.001), invasive tumor size >2 cm (p=0.011), and BI-RADS category 4 (p<0.001) were significantly associated with TNBC. With regard to morphologic features, the 41 TNBC cases were most likely to appear ultrasonographically as masses with an irregular (70.7%) or microlobulated (48.8%) shape, circumscribed (17.1%) or indistinct (17.1%) margins, and parallel orientation (68.9%). In particular, microlobulated mass margins were more frequent in TNBC than in ER (+) (2.0%) and HER2 (+) (4.8%) cancers. Conclusions: TNBC has specific characteristics on sonograms. Ultrasonography may be useful to avoid missed diagnoses and false-negative cases of TNBC.