• Title/Summary/Keyword: Lexicon

WellnessWordNet: A Word Net for Unconstrained Subjective Well-Being Monitoring Based on Unstructured Data and Contextual Polarity (웰니스워드넷: 비정형데이터와 상황적 긍부정성에 기반하여 주관적 웰빙 상태를 무구속적으로 모니터링하기 위한 워드넷 개발)

  • Song, Yeongeun;Nam, Suhyun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.1-21 / 2016
  • IT-based subjective well-being (SWB) services, a main part of wellness IT, should measure the SWB state of individuals in an unrestrained, cost-effective manner. The dictionaries for sentiment analysis available in the market may be useful for this purpose, but obtaining proper sentiment values using only words from the sentiment lexicon is impossible; therefore, a new dictionary including wellness vocabulary is needed. The existing sentiment dictionaries link only a single sentiment value to a single sentiment word, although sentiment values may vary depending on personal traits. In this study, we develop an extended version of the SenticNet sentiment dictionary dubbed WellnessWordNet. SenticNet is considered the best and most expressive among the already existing sentiment dictionaries. Using the information provided by SenticNet, we created a database including the wellness states (estimated values) of stress, depression, and anger to develop the WellnessWordNet system. The accuracy of the system was validated through actual tests with live subjects. This study is unique and unprecedented in that i) an extended sentiment dictionary, WellnessWordNet, is developed; ii) values for wellness state language are offered; and iii) different sentiment values, namely contextual polarity, for people of the same gender or age group are suggested.

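A.J. Toynbee's Theory of Civilization and the History of Libraries: Focusing on His View of the Renaissance and Book-Editing Activities (A.J. Toynbee의 문명론과 도서관의 역사 -Renaissance 관과 도서편집 활동을 중심으로-)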

  • 손연옥
    • Journal of Korean Library and Information Science Society / v.9 / pp.115-144 / 1982
  • In ordinary modern Western usage, 'the Renaissance' denotes the impact of the dead Hellenic civilization on Western Christendom, particularly the Italian literary and artistic movement in northern and central Italy in the late medieval period. A.J. Toynbee, however, examined the renaissance from a different point of view. In vol. IX of his great work "A Study of History", he established a theory of encounters between civilizations in space and in time; encounters in time are contacts between a civilization of the present and one of the past, or between a dead civilization and its infant successor, on the analogy of parenthood and sonship in the relation of Apparentation-and-Affiliation. His distinctive view of 'Renaissance' is an encounter between a grown-up civilization and the 'ghost' of its long-dead predecessor. Renaissances (the evocation of the ghost of a parent society) occur not only in the literary and artistic field but also in politics, law, science and philosophy, languages and literatures, visual arts, and religion. The main theme of this study is to examine the development of libraries and its historical meaning through Toynbee's literary renaissance. His renaissance of languages and literatures has three typical steps: first, to retrieve the dead literature's remains; second, to remaster their meaning; third, to reproduce them in counterfeits. The first and second steps involve collecting, editing, and annotating by compiling an anthology, thesaurus, lexicon, or encyclopedia; the third step mostly involves publishing imitations of the classics. Toynbee depicted the five outstanding representatives of literary renaissance who had appeared on the stage of history down to the time of writing: Assurbanipal, Constantine Porphyrogenitus, Yung Lo, K'ang Hsi, and Ch'ien Lung; the last four had all been emperors of imperia rediviva.
Examining these five emperors against the three steps of literary renaissance yields the following common results: 1. The emperors of imperia rediviva were interested in intellectual work and study, and were also deeply involved in collecting classics for ostensible reasons. 2. There was a strong political intention behind collecting materials, as an appeasement policy that diverted scholars' energies to an intellectual field. 3. Under the rulers of a resuscitated universal state, the literary renaissance was a product of the political plane, and the total size of the collections and works was huge. 4. Because sovereign power was strongly exercised, active censorship by distortion and elimination was inevitable. 5. A newly strained atmosphere arose between the grown-up civilization and its long-dead parent civilization whenever a book collection movement occurred. 6. Excessive adherence to the parent civilization led to imitation of classical works and stagnation of creative activity.

Analysis of Korean Spontaneous Speech Characteristics for Spoken Dialogue Recognition (대화체 연속음성 인식을 위한 한국어 대화음성 특성 분석)

  • 박영희;정민화
    • The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.330-338 / 2002
  • Spontaneous speech is ungrammatical and exhibits serious phonological variations, which makes recognition extremely difficult compared with read speech. In this paper, for conversational speech recognition, we analyze transcriptions of real conversational speech and classify its characteristics from a speech recognition perspective. Reflecting these features, we build a baseline system for conversational speech recognition. The classification consists of long silences, disfluencies, and phonological variations, each grouped by similar features. To deal with these characteristics, we first update the silence model and add a filled pause model and a garbage model; second, we add multiple phonetic transcriptions to the lexicon for the most frequent phonological variations. In our experiments, the baseline morpheme error rate (MER) is 31.65%; we obtain MER reductions of 2.08% for the silence and garbage models, 0.73% for the filled pause model, and 0.73% for phonological variations. Finally, we obtain 27.92% MER for conversational speech recognition, which will be used as a baseline for further study.

Song Themes and Variation of Yellow-throated Bunting (Emberiza elegans) (노랑턱멧새(Emberiza elegans)의 테마송과 변이)

  • Lee, Won-Ho;Kwon, Ki-Chung
    • Journal of Ecology and Environment / v.29 no.3 / pp.219-225 / 2006
  • To study song themes and variation in the Yellow-throated Bunting, we obtained and analyzed recordings from 45 males breeding in 16 deciduous forests in 6 provinces. We classified the 3,245 songs into a total of 164 song themes and 1,024 song variants on the basis of differences (lexicon) in 640 syllable compositions. Males had one to six song themes and averaged 3.5 themes. No males shared an identical song theme. Males had 5 to 14 syllables (ave. 9.4) in one song theme and effectively increased their repertoire size by changing the syllable composition (i.e., adding, deleting, or substituting one or more syllables) within a single song theme. The number of variants averaged 5.1 (range 1 to 31) per song theme. Individual variability was highest in the terminal elements of the song. In PCA, the 16 populations are clearly separated on Co. I based on shared syllables and on Co. II based on unique syllables. Similarity of songs based on shared syllables, measured by distance coefficients, showed a pattern of concordance with geography. Pairwise similarity declined with increasing distance among recording sites. The UPGMA tree divided the 16 geographical regions by syllable.

Monitoring Mood Trends of Twitter Users using Multi-modal Analysis method of Texts and Images (텍스트 및 영상의 멀티모달분석을 이용한 트위터 사용자의 감성 흐름 모니터링 기술)

  • Kim, Eun Yi;Ko, Eunjeong
    • Journal of the Korea Convergence Society / v.9 no.1 / pp.419-431 / 2018
  • In this paper, we propose a novel method for monitoring the mood trends of Twitter users by analyzing their daily tweets over a long period. To understand their tweets more accurately, we analyze all types of content in tweets, i.e., texts, emoticons, and images, and thus develop a multimodal sentiment analysis method. In the proposed method, two single-modal analyses are first performed to extract the users' moods hidden in texts and images: a lexicon-based and learning-based text classifier and a learning-based image classifier. Thereafter, the moods extracted by the respective analyses are combined into a tweet mood and aggregated into a daily mood. As a result, the proposed method generates a daily mood flow graph for each user, which allows us to monitor users' mood trends more intuitively. For evaluation, we perform two sets of experiments. First, we collect a data set of 40,447 items and evaluate our method by comparing it with state-of-the-art techniques. In these experiments, we demonstrate that the proposed multimodal analysis method outperforms other baselines and our own methods using only text-based tweets or only images. Furthermore, to evaluate the potential of the proposed method for monitoring users' mood trends, we tested it with 40 depressive users and 40 normal users. The results show that the proposed method can be used effectively to find depressed users.

A Comparative Study on the Korean and English Genderlect: Focused on Polite Expressions (한국어와 영어 성별어 비교연구: 공손표현과 관련하여)

  • Kim, Hyun Hyo
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.10 / pp.6527-6533 / 2015
  • It is generally accepted that men and women differ in linguistic communication style. Genderlect is a sociolinguistic term for the linguistic variety spoken by a specific gender. Several linguistic features are offered as evidence of genderlects: pitch, lexicon, intonation, grammar, and style. The purpose of this paper is to compare the characteristics of genderlect in English and Korean. To do so, I analyzed the scripts of an English-language movie, "Mrs. Doubtfire", and a Korean TV drama, "Oohlala couple". In "Mrs. Doubtfire", tension and laughter arose from the discrepancy between the way he looked (as a woman) and the way he spoke (like a man). The same is true of "Oohlala couple". In the language of "Mrs. Doubtfire", male speech characteristics were salient in nouns, while in "Oohlala couple" they were salient in verb forms, especially in honorific style, which shows a difference between Korean and English genderlect. Korean has special genderlect characteristics, with honorific speech styles realized in verb endings. In Korean, the highest honorific speech style, 'Habsho-che', is used in official situations, and men are more accustomed to it than women. When women have to use polite expressions, they must choose between the highest honorific style, 'Habsho-che', which loses the female characteristics, and the second highest honorific style, 'Haeyo-che', which keeps them.

Hemispheric Asymmetry in Processing Semantic Relationship Shown in Normals and Aphasic (형태소 공유 어휘의 심성 어휘집 표상 양식)

  • Jung, Jae-Bum;Lee, Hong-Jae;Moon, Young-Sun;Kim, Dong-Hyu;Pyun, Sung-Bum;Nam, Ki-Chun
    • Annual Conference on Human and Language Technology / 1999.10e / pp.359-367 / 1999
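  • Several explanations have been proposed for how words that share a morpheme are stored in the mental lexicon and how they are accessed. The first hypothesis is that morpheme-sharing words are all stored in the mental lexicon around the same root or stem. The second hypothesis is that an inflected word is not understood by analyzing it into its stem or root; instead, the inflected form is first looked up in the mental lexicon, and if a matching entry is found, comprehension of that inflected word phrase (eojeol) is complete; only when no matching entry exists in the mental lexicon is the form analyzed into its constituent morphemes, as a secondary process. The third hypothesis is that, depending on the word's part of speech, its frequency, the regularity of its inflection, and so on, an inflected word is understood either through analysis into constituent morphemes or through direct access to the inflected form. To determine which of these three hypotheses is correct, this study examined morphological representation and the comprehension process using Korean eojeols such as "먹은" and "쥐어". A primed lexical decision task was used for this purpose. Experiment 1 presented ambiguous eojeols such as "먹은", which can be interpreted either as the verb "먹다" (to eat) or as the noun "먹" (ink stick), as prime strings and examined how they affected recognition of target words related to each of the two meanings. If the eojeol is understood through analysis into the root or stem "먹", a facilitatory priming effect should appear in the conditions related to both meanings; if lexical access operates on the whole eojeol, a facilitatory priming effect should appear only in the condition related to the more frequent verb meaning. The results of Experiment 1 showed that both meanings were activated. That is, they support the hypothesis that eojeols such as "먹은" are understood through analysis into constituent morphemes and lexical access to those morphemes. Experiment 2, unlike Experiment 1, used eojeols such as "쥐어", which can be interpreted in only one sense, to examine whether analysis into constituent morphemes occurs even in this case (i.e., when the eojeol context restricts the word to a particular meaning). Unlike Experiment 1, the results of Experiment 2 showed a facilitatory priming effect only in the condition related to one meaning of the stem. In particular, when the SOA was 1000 msec in Experiment 2, activation of both meanings was observed; this result indicates that when the eojeol context restricts the word to a particular meaning, it is stored in the mental lexicon in its inflected form. In addition, patients with anomic aphasia showed the same morphological processing as normal subjects in the immediate priming task, but their subsequent processing differed from that of normal subjects. The results of Experiments 1 and 2 support the hypothesis that Korean eojeols are accessed lexically either through syntactic analysis or through the inflected form. It was also found that morphological processing in patients with anomic aphasia differs from that of normal subjects in the delayed priming task. If these results are correct, the Korean mental lexicon is organized by stems or roots, or by the inflected forms themselves, depending on the eojeol context.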

Reliability Analysis of VOC Data for Opinion Mining (오피니언 마이닝을 위한 VOC 데이타의 신뢰성 분석)

  • Kim, Dongwon;Yu, Song Jin
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.217-245 / 2016
  • The purpose of this study is to verify how 7 sentiment domains, extracted from social media through sentiment analysis, influence business performance. It consists of three phases. In phase I, we crawled 45,447 pieces of VOC (Voice of the Customer) on 26 auto companies from a car community, extracted the POS information, constructed the sentiment lexicon, and built the seven sentiment domains. In phase II, to ensure the reliability of the experimental data, we performed auto-correlation analysis and PCA. In phase III, we investigated how the 7 domains affect the market share of three major auto companies (GM, FCA, and VOLKSWAGEN) using linear regression analysis. The auto-correlation analysis confirmed the auto-correlation and the sequence of the sentiments, and the PCA results showed that the 7 sentiments are connected with positivity, negativity, and neutrality. From the linear regression analysis on model 1, we identified that the sentiment factors have a significant influence on actual market share. In particular, not only the positive and negative sentiment domains but also the neutral sentiment had a significant impact on auto market share. By applying the available data to the market and exploiting the auto-correlation between market-related information and sentiment, the findings can contribute substantially to other research on sentiment analysis as well as to actual business performance.

Design and Implementation of Thesaurus System for Geological Terms (지질용어 시소러스 시스템의 설계 및 구축)

  • Hwang, Jaehong;Chi, KwangHoon;Han, JongGyu;Yeon, Young Kwang;Ryu, Keun Ho
    • Journal of the Korean Association of Geographic Information Studies / v.10 no.2 / pp.23-35 / 2007
  • With the development of semantic web technologies in the area of information retrieval, the need for thesauri has recently been increasing along with internet lexicons. A thesaurus combines a classification with a lexicon; it is a topic map of a knowledge structure that expresses relations among concepts (terms) involved in human knowledge activities such as learning and research, using formally organized and controlled index terms to clarify the context of superordinate and subordinate concepts. However, although thesauri are regarded as essential tools for controlling and standardizing terms and for searching and processing information efficiently, there is no Korean thesaurus for geology. Building a thesaurus requires standardized and well-defined guidelines. Standardized guidelines enable efficient information management and help users find correct information easily and conveniently. The present study aimed to build a thesaurus system with terms used in geology. First, we surveyed related work on standardizing geological terms in Korea and other countries. Second, we defined geological topics in 15 areas and prepared a draft classification system for each topic. Third, based on the geological thesaurus classification system, we created the specification of the geological thesaurus. Lastly, we designed and implemented an internet-based geological thesaurus system using the specification.

Semantic Analysis of Color Terms in Chinese Neologisms: Focusing on Black, White, and Gray (중국어 신조어에 나타난 색채어 의미 분석 - 검은색, 흰색, 회색을 중심으로 -)

  • Lee, Myung-Ah;Han, Yong-su
    • Cross-Cultural Studies / v.47 / pp.241-260 / 2017
  • A multitude of neologisms has entered the lexicon of modern Chinese as a reflection of the changes modern Chinese society has undergone, and amid this trend a variety of color terms has emerged. However, these color-term neologisms are used somewhat differently from their roots. First, the achromatic color terms used in Chinese neologisms include black, white, and gray. The significance criteria generally used for these color-term neologisms only partially express their meaning in the modern Chinese language. Second, the usage frequency of the significance criteria of color terms in Chinese neologisms reveals a contrasting distribution between the terms for black and white. The color term "black" is most actively used in neologisms to connote its expanded meaning, followed by its basic meaning. The color term "white", by contrast, is most actively used to connote its basic meaning, followed by its expanded meaning. Third, among the achromatic color terms used in Chinese neologisms, black and gray exhibit expansion of meaning. For example, in neologisms the color term "black" is used to symbolize "in disaster areas" and "socially discriminated against," while "gray" is used to symbolize the "social aspect."