• Title/Summary/Keyword: lexicons


Building Korean Multi-word Expression Lexicons and Grammars Represented by Finite-State Graphs for FbSA of Cosmetic Reviews (화장품 후기글의 자질기반 감성분석을 위한 다단어 표현의 유한그래프 사전 및 문법 구축)

  • Hwang, Chang-Hoe;Yoo, Gwang-Hoon;Choi, Seong-Yong;Shin, Dong-Heouk;Nam, Jee-Sun
    • Annual Conference on Human and Language Technology / 2018.10a / pp.400-405 / 2018
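  • This study aims to present a methodology for building finite-state graph lexicons and grammars of the important multi-word expressions (MWEs) that occur in Korean cosmetics reviews, for feature-based sentiment analysis of a review corpus in that domain, and to evaluate the performance of the lexicons and grammars actually built. We formalize the lexico-syntactic characteristics of MWEs, long discussed as an important issue in natural language processing (NLP), as Local Grammar Graphs (LGGs). Applying the DECO Korean electronic dictionary to the cosmetics review corpus, we obtained lexical frequency statistics and, through linguistic analysis of them, classified a total of four subcategories of polarity MWEs (Polarity-MWE) and topic MWEs (Topic MWE). We also expanded the resources through double propagation, which iteratively applies the lexico-syntactic properties of the interrelations among modules. Testing the resulting large-scale MWE finite-state graph lexicon, DECO-MWE, yielded harmonic mean scores of 0.844 (Pol-MWE) and 0.742 (Top-MWE). These results confirm that the proposed methodology for building MWE language resources can be applied across diverse domains and will be an important resource for future feature-based sentiment analysis.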


Epistemologico-Historic Foundations of Linguistic Relativity (언어상대성 원칙의 역사 인식론적 토대 -문화 언어학을 위한 서설-)

  • 김성도
    • Lingua Humanitatis / v.2 no.1 / pp.7-42 / 2002
  • This paper reexamines ideas about linguistic relativity in light of renewed interest in the current theoretical climate. The original idea rests on the incommensurability of the semantic structures of different languages. On this view, language, thought, and culture are deeply interconnected, so that each language may have a distinctive world view associated with it. Throughout this work I adopt a historico-epistemological standpoint to dissect the conceptual structure of this principle. In the introduction I offer a justification for the choice of theme. Section 1 addresses some essential definitions of the linguistic principle and insists on the need to elaborate a typological spectrum of relativism and universalism. The second section marks some important landmarks of linguistic relativity from Plato to Humboldt via Condillac and Herder. I subdivide the relativity hypothesis into three interrelated theses. In the final section the epistemological structure of the linguistic principle is analysed in some detail through my exposition of the Sapir-Whorf hypothesis. By way of conclusion I present the work of Wierzbicka, who demonstrated that the lexicons of different languages suggest different conceptual universes. Rejecting analytical tools derived from the English language, she proposed instead a natural semantic metalanguage based on lexical universals, which is made up of universal semantic primitives. In this paper we attempted to construct a general problematic of linguistic relativity, focusing on the Sapir-Whorf hypothesis. We divided this problematic question into its ontological and epistemological dimensions. In particular, the ambivalence of Whorf's relativity is discussed in some detail. An archaeological survey of this subtle question of the relation between language, thinking, and culture is also provided (from Aristotle to Humboldt, via Condillac and Nietzsche). In conclusion, this investigation underlines the need to develop a cultural linguistics in order to enlarge the scope of contemporary linguistics.


The Effects of Linguistic Contrast and Conceptual Hierarchy on Children's Word Learning (언어대비(言語對比)와 개념(槪念)의 위계성(位階性)이 아동의 단어학습에 미치는 효과)

  • Kim, Eun Heui;Lee, Kwee Ok
    • Korean Journal of Child Studies / v.14 no.2 / pp.79-94 / 1993
  • The purpose of this study was (1) to investigate whether linguistic contrast helps children map a new word into a specific semantic domain when the word is introduced, (2) to examine whether there is a hierarchy of domains into which children will place a new word, and (3) to examine whether children's existing lexicons affect how they map a new word. A total of 320 children from 3 to 6 years of age were drawn from Pusan, Korea. The children were divided into four age groups of 80 children each. Within each age group, children were randomly assigned to one of four groups: a linguistic contrast group exposed to color, a linguistic contrast group exposed to shape, a label group, and a control group. All of the children were tested for production and comprehension of the new word. The results were as follows: (1) Linguistic contrast helped children learn the meaning of a new word. In particular, children aged 4 or older showed a significant effect of linguistic contrast; however, it was not sufficient to teach 3-year-olds the correct referent of a term. (2) There was a hierarchy of domains into which children mapped a new word. There was no significant effect of domain for 3-year-old children, but from 4 years of age children showed a preference for assuming that a new word referred to an object's shape rather than its color. (3) Children's existing lexicons had no effect on how they comprehended a new word.


A Proposed Role for Semiotics Methodology in Education of Comics Studies Majors (전공자 대상의 만화교육에 있어서 기호학적 방법론의 역할제안)

  • Kwon, Kyung-Min
    • Cartoon and Animation Studies / s.32 / pp.141-158 / 2013
  • Comics are a genre that conveys meaning through compositional arrangements of dialogue and images as well as through the flow of panels across a page. Communicating that meaning to readers through a combination of language and visual lexicons is the essential process of drawing comics, a process that in itself is significant. The semiotics of comics is a field of scholarship grounded in the broader discipline of semiotic theory, in which all the components of comics, both visual and verbal, are the subject of study and research. By adopting a semiotic approach we are able to objectively analyze and understand the symbolic, social, and ideological meanings embedded in the signs and sign processes expressed in comics. The fundamental pedagogical mission of teaching comics is to cultivate human nature through the study of theory as well as through the production and completion of original works that explore new modes of expression. Going further, interpreting those embedded meanings in the context of comics fosters effective and creative skills of expression that go beyond a mere fascination with the genre itself. In short, because the semiotic approach to understanding visual communication is the essence of teaching comics, we can expect that the act of reading and creating comics plays a significant role in understanding visual communication.

Sensory Drivers of Liking for Adlay (Coix lacryma-jobi) Tea (시판 율무차의 소비자 기호 유도 인자)

  • Gwak, Mi-Jin;Chung, Seo-Jin;Kim, Yang
    • Journal of the Korean Society of Food Culture / v.27 no.5 / pp.512-520 / 2012
  • This study investigated the sensory characteristics of adlay tea favored by Korean consumers and analyzed the drivers behind liking or disliking adlay tea. The six adlay tea products with the highest market share in South Korea were selected. Sensory properties of the six products were analyzed using generic descriptive analysis. Of these, four products were further selected for a consumer taste acceptance test. Sensory lexicons for adlay tea were developed by trained panelists, and the sensory characteristics of each adlay tea product were measured based on the perceived intensities of the attributes elicited from the samples. Frequent tea and coffee drinkers participated in the consumer taste acceptance test. Consumers rated their acceptance of each tea product on a 9-point hedonic scale and indicated their reasons for liking or disliking each product using the check-all-that-apply method. Analysis of variance, principal component analysis, frequency analysis, and correspondence analysis were used for statistical analysis. Twenty sensory attributes were developed to characterize the six adlay tea products. The results of the descriptive analysis showed that attributes such as viscosity, black soybean flavor, goso flavor, peanut flavor, seaweed flavor, green, and presence of chunks were key factors differentiating the adlay tea products. In the consumer taste test, roasted flavor, goso flavor, peanut flavor, and presence of chunks were positive drivers of liking for the adlay tea products, whereas seaweed and green flavors were negative attributes that drove consumers away.

A High-Speed Korean Morphological Analysis Method based on Pre-Analyzed Partial Words (부분 어절의 기분석에 기반한 고속 한국어 형태소 분석 방법)

  • Yang, Seung-Hyun;Kim, Young-Sum
    • Journal of KIISE:Software and Applications / v.27 no.3 / pp.290-301 / 2000
  • Most morphological analysis methods require repeated procedures of input character code conversion, segmentation and lemmatization of constituent morphemes, and filtering of candidate results by lexicon look-up, all of which cause run-time inefficiency. To alleviate this inefficiency, many systems have introduced the notion of 'pre-analysis' of words. However, a method based on a pre-analysis dictionary of surface forms also has a critical drawback in practical application, because the dictionary grows without bound in order to cover all words. This paper hybridizes the two extreme approaches to overcome the problems of both, and presents a method of morphological analysis based on pre-analysis of partial words. Under this hybrid scheme, most computational overheads, such as segmentation and lemmatization of morphemes, are shifted to the process of building the pre-analysis dictionaries, and run-time dictionary look-ups are greatly reduced, which improves the run-time performance of the system. Moreover, additional computing overheads such as input character code conversion can also be avoided, because the method relies on no graphemic processing.
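
    The hybrid idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two toy dictionaries, the head/tail split, the longest-head-first search, and the part-of-speech tags are all assumptions made for illustration. The point is only that segmentation and lemmatization are stored ahead of time for partial words, so run-time analysis reduces to dictionary look-ups.

```python
from typing import Dict, List, Optional, Tuple

Morphemes = List[Tuple[str, str]]  # (morpheme, part-of-speech tag)

# Hypothetical pre-analysis dictionaries for partial words.
# A real system would build these offline from a large corpus.
HEAD_DICT: Dict[str, Morphemes] = {
    "학교": [("학교", "NNG")],
    "먹": [("먹", "VV")],
}
TAIL_DICT: Dict[str, Morphemes] = {
    "": [],  # a word may consist of a head only
    "에서": [("에서", "JKB")],
    "었다": [("었", "EP"), ("다", "EF")],
}


def analyze(word: str) -> Optional[Morphemes]:
    """Return a stored analysis for `word`, or None if no split is covered."""
    # Try the longest head first so that longer stored units win.
    for cut in range(len(word), 0, -1):
        head, tail = word[:cut], word[cut:]
        if head in HEAD_DICT and tail in TAIL_DICT:
            # No run-time segmentation or lemmatization: just concatenate
            # the two pre-computed partial analyses.
            return HEAD_DICT[head] + TAIL_DICT[tail]
    return None
```

    Because only partial words are stored, a small number of entries can be recombined to cover many full words, which is the trade-off the abstract describes between full-word pre-analysis and fully rule-based analysis.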


The Application of Geography Markup Language(GML) to the Maritime Information

  • Oh, Se-Woong;Park, Jong-Min;Suh, Sang-Hyun
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / v.1 / pp.519-524 / 2006
  • This paper describes an application of map-based information presentation to maritime information, including navigation information. The work is motivated by the need to prepare maritime information representation and distribution for future-generation Web network technology. The work consists of map generation using GML and its application to maritime information. GML 3.0 became an adopted specification of the Open Geospatial Consortium (OGC) in January 2003, and is rapidly emerging as the world standard for the encoding, transport, and storage of all forms of geographic information. This paper looks at the application of GML to one of the more challenging areas of maritime information. Specific features of GML of interest to maritime information providers are discussed and then illustrated through a series of maritime information case studies. The first phase of the work consists of constructing a GML application schema to serve as a base map for maritime information. Maritime information is acquired from multiple sources, including standards documents, database schemas, lexicons, and collections of symbol definitions. The sources of GML ontological knowledge and the contribution of each source to the overall ontology are described in this paper. In the second phase, the prepared GML is used to create a prototype of the mixed maritime information as a base map for tagging documents within the maritime domain. An overview of this prototype is included. One application area for the information elements described here is the integrated retrieval of maritime information from diverse sources, ranging from Web sites to nautical chart databases and text documents.


Design and Implementation of Thesaurus System for Geological Terms (지질용어 시소러스 시스템의 설계 및 구축)

  • Hwang, Jaehong;Chi, KwangHoon;Han, JongGyu;Yeon, Young Kwang;Ryu, Keun Ho
    • Journal of the Korean Association of Geographic Information Studies / v.10 no.2 / pp.23-35 / 2007
  • With the development of semantic web technologies in the area of information retrieval, the need for thesauri has recently been increasing, along with the need for Internet lexicons. A thesaurus combines a classification with a lexicon. It is a topic map of a knowledge structure that expresses relations among the concepts (terms) involved in human knowledge activities such as learning and research, using formally organized and controlled index terms to clarify the context of superordinate and subordinate concepts. However, although thesauri are regarded as essential tools for controlling and standardizing terms and for searching and processing information efficiently, there is no Korean thesaurus for geology. Building a thesaurus requires standardized and well-defined guidelines, which enable efficient information management and help users find correct information easily and conveniently. The present study aimed to build a thesaurus system for the terms used in geology. To this end, we first surveyed related work on standardizing geological terms in Korea and other countries. Second, we defined geological topics in 15 areas and prepared a draft classification system for each topic. Third, based on the geological thesaurus classification system, we created the specification of the geological thesaurus. Lastly, we designed and implemented an Internet-based geological thesaurus system using the specification.
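
    The relation between superordinate and subordinate concepts mentioned in this abstract can be sketched as a small data structure. This is an illustrative assumption, not the system described in the paper: the class name `Thesaurus`, its methods, and the example terms are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class Thesaurus:
    """Minimal controlled vocabulary with broader/narrower term links."""

    def __init__(self) -> None:
        self.broader: Dict[str, Set[str]] = defaultdict(set)
        self.narrower: Dict[str, Set[str]] = defaultdict(set)

    def add_relation(self, narrower_term: str, broader_term: str) -> None:
        # Store the link in both directions so either can be queried.
        self.broader[narrower_term].add(broader_term)
        self.narrower[broader_term].add(narrower_term)

    def ancestors(self, term: str) -> List[str]:
        """All superordinate terms reachable from `term`, sorted by name."""
        seen: Set[str] = set()
        stack = [term]
        while stack:
            for parent in self.broader.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return sorted(seen)
```

    For example, after `add_relation("granite", "igneous rock")` and `add_relation("igneous rock", "rock")`, a search for "granite" could be widened to every term returned by `ancestors("granite")`.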


A Conceptual Analysis on Instructional Coaching, Instructional Supervision, and Instructional Consulting (수업코칭, 수업장학, 수업컨설팅에 대한 개념적 분석)

  • Lee, Eunhye;Park, Innwoo
    • 교육공학연구 / v.33 no.1 / pp.105-135 / 2017
  • The purpose of this study is to clarify conceptually the differences among instructional coaching, instructional supervision, and instructional consulting by analyzing the characteristics of each. These practices for instructional improvement share the fundamental objectives of improving instruction and developing teachers' instructional professionalism. However, each area has changed with social trends and the demands of the educational field, and each has created its own system of activities. To resolve the resulting mixed use of these terms, it is meaningful to distinguish the concepts, attributes, and areas of each activity. The specific research questions were: 1) What are the origins of coaching, supervision, and consulting? 2) How are instructional coaching, instructional supervision, and instructional consulting defined in existing research in Korea? 3) How can we conceptually distinguish instructional coaching, instructional supervision, and instructional consulting? Based on a review of existing studies, this study first investigated the conceptual origins and lexicons of coaching, supervision, and consulting, reviewed prior studies conducted in Korea on instructional coaching, instructional supervision, and instructional consulting, and summarized how each concept is defined by different researchers. Second, the study compared the concepts with one another in pairs. Finally, the existing definitions of instructional coaching, instructional supervision, and instructional consulting were analyzed to identify the inherent and common attributes of each concept. In conclusion, this study suggests that the concept of instructional consulting needs to be redefined to better reflect the characteristics of its activities, and that studies rethinking the relationship between instructional coaching and instructional supervision are needed.

Public Sentiment Analysis of Korean Top-10 Companies: Big Data Approach Using Multi-categorical Sentiment Lexicon (국내 주요 10대 기업에 대한 국민 감성 분석: 다범주 감성사전을 활용한 빅 데이터 접근법)

  • Kim, Seo In;Kim, Dong Sung;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.45-69 / 2016
  • Sentiment analysis of open Internet data is now actively performed for various purposes. As online communication channels become popular, companies try to capture public sentiment toward them from open online information sources. This research analyzes public sentiment toward the Korean top-10 companies using a multi-categorical sentiment lexicon. Whereas existing big-data studies of public sentiment classify sentiment into dimensions, this research classifies it into multiple categories. The dimensional structure has been commonly applied in sentiment analysis because it is academically established and has the clear advantage of capturing the degree of sentiment and the interrelation of the dimensions. However, it is not effective for measuring public sentiment, because human sentiment is too complex to be divided into a few dimensions, and ordinary people would need special training to express their feelings in dimensional terms. People do not divide their sentiment into dimensions, nor do they need psychological training in order to feel. They do not express their feelings along dimensions such as positive/negative or active/passive; rather, they express them as categories such as sadness, rage, and happiness. That is, a categorical approach to sentiment analysis is more natural than a dimensional one. Accordingly, this research proposes a multi-categorical sentiment structure as an alternative way to measure social sentiment from the public's point of view. This structure classifies sentiments the way ordinary people do, although it may involve some subjectivity. Nine categories are used: 'Sadness', 'Anger', 'Happiness', 'Disgust', 'Surprise', 'Fear', 'Interest', 'Boredom', and 'Pain'. To capture public sentiment toward the Korean top-10 companies, Internet news data about the companies covering the past 25 months were collected from a representative Korean portal site. Based on sentiment words extracted from previous research, we created a sentiment lexicon and analyzed the frequency of its words in the news data. The frequency of each sentiment category was calculated as a ratio of the total number of sentiment words, and the categories were ranked by these ratios. Sentiment comparisons among the top four companies, 'Samsung', 'Hyundai', 'SK', and 'LG', were visualized separately. Next, the research tested hypotheses on the usefulness of the multi-categorical sentiment lexicon, namely how effectively categorical sentiment can serve as a relative comparison index in cross-sectional and time-series analysis. To test the lexicon as a cross-sectional comparison index, pair-wise t-tests and a Duncan test were conducted. Two pairs of companies, 'Samsung' and 'Hanjin', and 'SK' and 'Hanjin', were chosen to test whether each categorical sentiment differs significantly within the pair. Because the 'Sadness' category has the largest vocabulary, it was chosen for the Duncan test of whether subgroups of the companies differ significantly. Five sentiment categories differed significantly between Samsung and Hanjin, and four between SK and Hanjin. In the 'Sadness' category, six significantly different subgroups were found. To test the lexicon as a time-series comparison index, the 'nut rage' incident at Hanjin was selected as an example case. The term frequencies of sentiment words in the month of the incident and in the month before it were compared. The sentiment categories were regrouped into positive and negative sentiment to determine whether the event actually had a negative impact on public sentiment toward the company. The difference in each category was visualized, and the change in the word list for the sentiment 'Rage' was shown for concreteness. The results showed a large before-and-after difference in the sentiment ordinary people felt toward the company. Both hypotheses were supported with statistical significance, and sentiment analysis in the business area using multi-categorical sentiment lexicons is therefore persuasive. This research implies that categorical sentiment analysis can be used as an alternative method to supplement dimensional sentiment analysis when gauging public sentiment in a business environment.
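
    The category-frequency step described in this abstract (each category's count taken as a ratio of all sentiment words found) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the mini-lexicon, the English tokens, and the function name `category_ratios` are hypothetical, and the study itself worked on Korean news text.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical mini-lexicon mapping a sentiment word to its category.
LEXICON: Dict[str, str] = {
    "sad": "Sadness",
    "grief": "Sadness",
    "furious": "Anger",
    "delighted": "Happiness",
}


def category_ratios(tokens: Iterable[str]) -> Dict[str, float]:
    """Share of each sentiment category among all lexicon words found."""
    counts = Counter(LEXICON[t] for t in tokens if t in LEXICON)
    total = sum(counts.values())
    if total == 0:
        return {}  # no sentiment words, so no ratios to report
    return {category: n / total for category, n in counts.items()}
```

    Normalizing by the total number of sentiment words, rather than by document length, makes the category shares comparable across companies with very different volumes of news coverage.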