• Title/Summary/Keyword: 한국어 부사

Search Result 78, Processing Time 0.028 seconds

Verb Sense Disambiguation using Subordinating Case Information (종속격 정보를 적용한 동사 의미 중의성 해소)

  • Park, Yo-Sep;Shin, Joon-Choul;Ock, Cheol-Young;Park, Hyuk-Ro
    • The KIPS Transactions:PartB
    • /
    • v.18B no.4
    • /
    • pp.241-248
    • /
    • 2011
  • Homographs can have multiple senses. In order to understand the meaning of a sentence, it is necessary to identify which sense isused for each word in the sentence. Previous researches on this problem heavily relied on the word co-occurrence information. However, we noticed that in case of verbs, information about subordinating cases of verbs can be utilized to further improve the performance of word sense disambiguation. Different senses require different sets of subordinating cases. In this paper, we propose the verb sense disambiguation using subordinating case information. The case information acquire postposition features in Standard Korean Dictionary. Our experiment on 12 high-frequency verb homographs shows that adding case information can improve the performance of word sense disambiguation by 1.34%, from 97.3% to 98.7%. The amount of improvement may seem marginal, we think it is meaningful because the error ratio reduced to less than a half, from 2.7% to 1.3%.

Semantic Clustering of Predicates using Word Definition in Dictionary (사전 뜻풀이를 이용한 용언 의미 군집화)

  • Bae, Young-Jun;Choe, Ho-Seop;Song, Yoo-Hwa;Ock, Cheol-Young
    • Korean Journal of Cognitive Science
    • /
    • v.22 no.3
    • /
    • pp.271-298
    • /
    • 2011
  • The lexical semantic system should be built to grasp lexical semantic information more clearly. In this paper, we studied a semantic clustering of predicates that is one of the steps in building the lexical semantic system. Unlike previous studies that used argument of subcategorization(subject and object), selectional restrictions and interaction information of adverb, we used sense tagged definition in dictionary for the semantic clustering of predicate, and also attempted hierarchical clustering of predicate using the relationship between the generic concept and the specific concept. Most of the predicates in the dictionary were used for clustering. Total of 106,501 predicates(85,754 verbs, 20,747 adjectives) were used for the test. We got results of clustering which is 2,748 clusters of predicate and 130 recursive definition clusters and 261 sub-clusters. The maximum depth of cluster was 16 depth. We compared results of clustering with the Sejong semantic classes for evaluation. The results showed 70.14% of the cohesion.

  • PDF

A Model of Natural Language Information Retrieval Using Main Keywords and Sub-keywords (주 키워드와 부 키워드를 이용한 자연언어 정보 검색 모델)

  • Kang, Hyun-Kyu;Park, Se-Young
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.12
    • /
    • pp.3052-3062
    • /
    • 1997
  • An Information Retrieval (IR) is to retrieve relevant information that satisfies user's information needs. However a major role of IR systems is not just the generation of sets of relevant documents, but to help determine which documents are most likely to be relevant to the given requirements. Various attempts have been made in the recent past to use syntactic analysis methods for the generation of complex construction that are essential for content identification in various automatic text analysis systems. Unfortunately, it is known that methods based on syntactic understanding alone are not sufficiently powerful to Produce complete analyses of arbitrary text samples. In this paper, we present a document ranking method based on two-level ranking. The first level is used to retrieve the documents, and the second level to reorder the retrieved documents. The main keywords used in the first level can be defined as nouns and/or compound nouns that possess good document discrimination powers. The sub-keywords used in the second level can be also defined as adjectives, adverbs, and/or verbs that are not main keywords, and function words. An empirical study was conducted from a Korean encyclopedia with 23,113 entries and 161 Korean natural language queries collected by end users. 850% of the natural language queries contained sub-keywords. The two-level document ranking methods provides significant improvement in retrieval effectiveness over traditional ranking methods.

  • PDF

The Present Status and Outlook of Nano Technology (나노기술의 국내외 현황과 전망)

  • 김용태
    • Proceedings of the International Microelectronics And Packaging Society Conference
    • /
    • 2001.11a
    • /
    • pp.37-39
    • /
    • 2001
  • 21세기의 벽두부터 국내외적으로 활발히 논의되고 있는 나노기술에 대한 정의를 생각해보는 것으로부터 우리가 나아갈 방향을 살펴보고자 한다. 나노기술이란, 원자 하나 하나 혹은 분자단위의 조작을 통해 1~100nm정도의 범위 안에서 근본적으로 새로운 물질이나 구조체를 만들어 내는 기술을 말한다. 즉 앞으로 우리는 경험해 보지 못한 새로운 현상에 대한 이해를 할 수 있어야 하고, 새로운 물질 자체를 다룰 수 있는 방법이 우리가 해야 할 구체적인 일이 될 것이란 말이 된다. 뿐만 아니라 나노기술은 종래의 정보.통신.전자 분야에서 주로 추구하던 마이크로화와 달리 재료, 기계, 전자, 의학, 약학, 에너지, 환경, 화학, 생물학, 농학, 정보, 보안기술 등 과학기술 분야 전반을 위시하여 사회분야가지 새로운 인식과 철학적인 이해가 필요하게 되었다. 21세기를 맞은 인류가 나아갈 방향을 나노세계에 대한 도전으로 보아야 하며, 과학기술의 새로운 틀을 제공할 것 임에 틀림 없다. 그러나, 이와 같은 나노기술의 출발점을 살펴보면 VLSI기술로 통칭할 수 있는 마이크로전자소자 기술이란 점이다. 국내의 VLSI기술은 메모리기술이라고 해도 과언이 아닐 것이다. 문제는 종래의 메모리기술은 대규모 투자와 집중적인 인력양성을 통해서 세계 최고 수준에 도달 할 수 있었다. 그러나 여기까지 오는 동안 사식 우리는 선진국의 뒷꽁무니를 혼신의 힘을 다해 뒤쫓아 온 결과라고 보아도 틀리지 않는다. 즉, 앞선자를 보고 뒤쫓는 사람은 갈방향과 목표가 분명하므로 최선을 다하면 따라 잡을 수 있다. 그런데 나노기술은 앞선 사람이 없다는 점이 큰 차이이다 따라서 뒷껑무니를 쫓아가는 습성을 가지고는 개척해 나갈 수 없다는 점을 깨닫지 않으면 안된다. 그런 점에서 이 시간 나노기술의 국내외 현황을 살펴보고 우리가 어떻게 할 것인가를 생각해 보는데 의미가 있을 것이다.하여 분석한 결과 기존의 제한된 RICH-DP는 실시간 서비스에 대한 처리율이 낮아지며 서비스 시간이 보장되지 못했다. 따라서 실시간 서비스에 대한 새로운 제안된 기법을 제안하고 성능 평가한 결과 기존의 RICH-DP보다 성능이 향상됨을 확인 할 수 있었다.(actual world)에서 가상 관성 세계(possible inertia would)로 변화시켜서, 완수동사의 종결점(ending point)을 현실세계에서 가상의 미래 세계로 움직이는 역할을 한다. 결과적으로, IMP는 완수동사의 닫힌 완료 관점을 현실세계에서는 열린 미완료 관점으로 변환시키되, 가상 관성 세계에서는 그대로 닫힌 관점으로 유지 시키는 효과를 가진다. 한국어와 영어의 관점 변환 구문의 차이는 각 언어의 지속부사구의 어휘 목록의 전제(presupposition)의 차이로 설명된다. 본 논문은 영어의 지속부사구는 논항의 하위간격This paper will describe the application based on this approach developed by the authors in the FLEX EXPRIT IV n$^{\circ}$EP29158 in the Work-package "Knowledge Extraction & Data mining"where the information captured from digital newspapers is extracted and reused in tourist information context.terpolation performance of CNN was relatively better than NN.콩과 자연 콩이 성분 분석에서 차이를 나타내지 않았다는 점, 네 번째. 쥐를 통한 다양섭취 실험에서 아무런 이상 반응이 없었다는 점등의 결과를 기준으로 알레르기에 대한 개별 검사 없이 안전한

  • PDF

The Study of Developing Korean SentiWordNet for Big Data Analytics : Focusing on Anger Emotion (빅데이터 분석을 위한 한국어 SentiWordNet 개발 방안 연구 : 분노 감정을 중심으로)

  • Choi, Sukjae;Kwon, Ohbyung
    • The Journal of Society for e-Business Studies
    • /
    • v.19 no.4
    • /
    • pp.1-19
    • /
    • 2014
  • Efforts to identify user's recognition which exists in the big data are being conducted actively. They try to measure scores of people's view about products, movies and social issues by analyzing statements raised on Internet bulletin boards or SNS. So this study deals with the problem of determining how to find the emotional vocabulary and the degree of these values. The survey methods are using the results of previous studies for the basic emotional vocabulary and degree, and inferring from the dictionary's glosses for the extended emotional vocabulary. The results were found to have the 4 emotional words lists (vocabularies) as basic emotional list, extended 1 stratum 1 level list from basic vocabulary's glosses, extended 2 stratum 1 level list from glosses of non-emotional words, and extended 2 stratum 2 level list from glosses' glosses. And we obtained the emotional degrees by applying the weight of the sentences and the emphasis multiplier values on the basis of basic emotional list. Experimental results have been identified as AND and OR sentence having a weight of average degree of included words. And MULTIPLY sentence having 1.2 to 1.5 weight depending on the type of adverb. It is also assumed that NOT sentence having a certain degree by reducing and reversing the original word's emotional degree. It is also considered that emphasis multiplier values have 2 for 1 stratum and 3 for 2 stratum.

KMSCR: A system for managing knowledge assets of an IT consulting firm (IT 컨설팅 회사의 지적 자산 관리를 위한 지식관리시스템)

  • 김수연;황현석;서의호
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2001.06a
    • /
    • pp.233-239
    • /
    • 2001
  • 최근 대부분의 회사들은 업무를 수행하는데 필요한 지식과 노하우를 공유하고 재사용하기 위하여 지적 자산 관리의 중요성을 인식하고 있다. 특히 고도로 지식 집약적인 업종이라 할 수 있는 IT컨설팅 회사에서는 지적 자산의 관리가 다른 어떤 회사에서보다 큰 중요성을 가지게 된다. 컨설팅 회사에 있어서 검증이 완료된 지적 자산의 공유 및 지능적이면서도 신속한 검색은 컨설팅 서비스의 품질과 고객 만족에 직결되는 중요한 요소이다. 따라서 대부분의 컨설팅 회사들은 자사의 지식 자산을 관리하기 위하여 많은 노력을 기울이고 있다. 본 논문의 목적은 IT 컨설팅 회사예서 관리되는 다양한 형태의 지적 자산들을 중앙 관리하여 설친 고객 사이트에 흩어져 프로젝트를 수행하는 컨설턴트들이 공유할 수 있도록 함으로써 컨설팅 서비스의 생산성과 품질들 높이고자 하는데 있다 이를 위하여 건설팅 회사에서 관리되는 모든 지적 자산의 재고를 조사하여 모델링하고 이를 쉽게 저장하고 검색할 수 있는 시스템 아키텍처를 제안한다. 제안된 아키텍처를 NT 기반에서 Index server를 이용하여 시스템으로 구현하였다 (KMSCR: A Knowledge Management System for managing Consulting Resources). KMSCR에서는 컨설턴트가 찾고자 하는 검색어를 입력하면 다양한 포맷의 (.doc, .ppt, xls, .rtf, .txt, .html 등과 같은) 결과물을 관련성이 높은 순서대로 출력해 줌으로써 컨설팅 리소스를 효과적으로 재사용할 수 있도록 도와 준다. 또한 검색 시에는 미리 등록된 키워드 뿐 아니라 본문 내의 텍스트 검색까지 가능하게 함으로써 컨설팅 리소스에 대한 보다 효과적이고 효율적인 검색을 가능하게 한다.간을 성능 평가 인자로 하여 수행하였다. 논문에서 제한된 방법을 적용한 개선된 RICH-DP을 모의 실험을 통하여 분석한 결과 기존의 제한된 RICH-DP는 실시간 서비스에 대한 처리율이 낮아지며 서비스 시간이 보장되지 못했다. 따라서 실시간 서비스에 대한 새로운 제안된 기법을 제안하고 성능 평가한 결과 기존의 RICH-DP보다 성능이 향상됨을 확인 할 수 있었다.(actual world)에서 가상 관성 세계(possible inertia would)로 변화시켜서, 완수동사의 종결점(ending point)을 현실세계에서 가상의 미래 세계로 움직이는 역할을 한다. 결과적으로, IMP는 완수동사의 닫힌 완료 관점을 현실세계에서는 열린 미완료 관점으로 변환시키되, 가상 관성 세계에서는 그대로 닫힌 관점으로 유지 시키는 효과를 가진다. 한국어와 영어의 관점 변환 구문의 차이는 각 언어의 지속부사구의 어휘 목록의 전제(presupposition)의 차이로 설명된다. 본 논문은 영어의 지속부사구는 논항의 하위간격This paper will describe the application based on this approach developed by the authors in the FLEX EXPRIT IV n$^{\circ}$EP29158 in the Work-package "Knowledge Extraction & Data mining"where the information captured from digital newspapers is extracted and reused in tourist information context.terpolation performance of CNN was relatively

  • PDF

Building a Korean Sentiment Lexicon Using Collective Intelligence (집단지성을 이용한 한글 감성어 사전 구축)

  • An, Jungkook;Kim, Hee-Woong
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.2
    • /
    • pp.49-67
    • /
    • 2015
  • Recently, emerging the notion of big data and social media has led us to enter data's big bang. Social networking services are widely used by people around the world, and they have become a part of major communication tools for all ages. Over the last decade, as online social networking sites become increasingly popular, companies tend to focus on advanced social media analysis for their marketing strategies. In addition to social media analysis, companies are mainly concerned about propagating of negative opinions on social networking sites such as Facebook and Twitter, as well as e-commerce sites. The effect of online word of mouth (WOM) such as product rating, product review, and product recommendations is very influential, and negative opinions have significant impact on product sales. This trend has increased researchers' attention to a natural language processing, such as a sentiment analysis. A sentiment analysis, also refers to as an opinion mining, is a process of identifying the polarity of subjective information and has been applied to various research and practical fields. However, there are obstacles lies when Korean language (Hangul) is used in a natural language processing because it is an agglutinative language with rich morphology pose problems. Therefore, there is a lack of Korean natural language processing resources such as a sentiment lexicon, and this has resulted in significant limitations for researchers and practitioners who are considering sentiment analysis. Our study builds a Korean sentiment lexicon with collective intelligence, and provides API (Application Programming Interface) service to open and share a sentiment lexicon data with the public (www.openhangul.com). For the pre-processing, we have created a Korean lexicon database with over 517,178 words and classified them into sentiment and non-sentiment words. In order to classify them, we first identified stop words which often quite likely to play a negative role in sentiment analysis and excluded them from our sentiment scoring. In general, sentiment words are nouns, adjectives, verbs, adverbs as they have sentimental expressions such as positive, neutral, and negative. On the other hands, non-sentiment words are interjection, determiner, numeral, postposition, etc. as they generally have no sentimental expressions. To build a reliable sentiment lexicon, we have adopted a concept of collective intelligence as a model for crowdsourcing. In addition, a concept of folksonomy has been implemented in the process of taxonomy to help collective intelligence. In order to make up for an inherent weakness of folksonomy, we have adopted a majority rule by building a voting system. Participants, as voters were offered three voting options to choose from positivity, negativity, and neutrality, and the voting have been conducted on one of the largest social networking sites for college students in Korea. More than 35,000 votes have been made by college students in Korea, and we keep this voting system open by maintaining the project as a perpetual study. Besides, any change in the sentiment score of words can be an important observation because it enables us to keep track of temporal changes in Korean language as a natural language. Lastly, our study offers a RESTful, JSON based API service through a web platform to make easier support for users such as researchers, companies, and developers. Finally, our study makes important contributions to both research and practice. In terms of research, our Korean sentiment lexicon plays an important role as a resource for Korean natural language processing. In terms of practice, practitioners such as managers and marketers can implement sentiment analysis effectively by using Korean sentiment lexicon we built. Moreover, our study sheds new light on the value of folksonomy by combining collective intelligence, and we also expect to give a new direction and a new start to the development of Korean natural language processing.

Interpretation and Education of topic phrase shown on the Analects of Confucius (1) - Based on the examples of the 'DO' eventuality meaning phrase (논어 화제구의 해석과 교육(1) - '활동(DO)' 사건의미 표시구의 예를 중심으로)

  • 김종호
    • Journal of Sinology and China Studies
    • /
    • v.79
    • /
    • pp.245-275
    • /
    • 2019
  • The Chinese language is a topic-prominent language and the ancient form of the language is not an exception. This literature examines the theta roles and cases of the arguments by setting sentence structures regarding 'TopP', which marks the eventuality meaning of 'DO', from the Analects of Confucius. Summary of the analysis is as follows: 1. Like the contemporary Chinese, ancient Chinese has two topic types; one is a base-generated topic(so called dangling topic) and the other is a topic generated by the interior movement in the TP. 2. The sentence structure of TopP from the Analects of Confucius that marks the eventuality meaning of 'DO' can be divided into seven categories which are: T+S+V+O', 'T+ES+V+EO', 'T+ES+V+EO', 'T+S+V+EC', 'T+ES+V+C', 'T+ES+V+O+C', 'T+S+V+EO+C'. 3. The predicate(V-v) of TopP from the Analects of Confucius that marks the eventuality meaning of 'DO' has a meaning characteristics of [+ will] and [+ progressive]. 4. The subject of TopP from the Analects of Confucius that marks the eventuality meaning of 'DO' is assigned a thematic role of an , and the object is assigned a thematic role of a . 5. The complement of TopP from the Analects of Confucius that marks the eventuality meaning of 'DO' is composed of overt preposition head and its complements, which receives diverse thematic roles from the preposition. However, it does not have a direct selective relationship but an indirect relationship with the main verb of the sentence. 6. Every argument, which is the subject, object, complement, and adverb, of TopP from the Analects of Confucius that marks the eventuality meaning of 'DO' can theoretically be moved in front of the subject and be topicalized. 7. The topic phrase from the Analects of Confucius that marks the eventuality meaning of 'DO' is derived in the order of predicate(V-v), object, complement, adjunct, subject, and topic, but when interpreting it in Korean it is in inverse order, which is in the order of topic, subject, adjunct, complement, object and predicate(V-v).