• Title/Summary/Keyword: Sentiment Dictionary Construction

The Construction of a Domain-Specific Sentiment Dictionary Using Graph-based Semi-supervised Learning Method (그래프 기반 준지도 학습 방법을 이용한 특정분야 감성사전 구축)

  • Kim, Jung-Ho;Oh, Yean-Ju;Chae, Soo-Hoan
    • Science of Emotion and Sensibility / v.18 no.1 / pp.103-110 / 2015
  • A sentiment lexicon is an essential resource for expressing sentiment in a text or recognizing sentiment from a text. We propose a graph-based semi-supervised learning method for constructing a sentiment dictionary, that is, a set of sentiment lexicon entries. In particular, we focus on constructing a domain-specific sentiment dictionary. The proposed method builds a graph from the lexicon entries and the proximity among them, and the sentiment values of entries whose sentiment is already known are propagated to all entries on the graph. There are two typical types of sentiment lexicon entries, sentiment words and sentiment phrases; we build a separate graph for each type and infer the sentiment of all entries. To verify the proposed method, we constructed a sentiment dictionary specific to the movie domain and conducted sentiment classification experiments with it. The results show that classification with this dictionary outperforms classification with a typical general-purpose sentiment dictionary.
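
The propagation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the update rule (a weighted average of neighbor scores with seed words clamped), the function name `propagate_sentiment`, the edge weights, and the example words are all assumptions chosen for illustration.

```python
from collections import defaultdict


def propagate_sentiment(edges, seeds, iterations=50):
    """Spread seed scores over a weighted, undirected word graph.

    edges: iterable of (word_a, word_b, positive_weight) tuples.
    seeds: dict mapping words with known sentiment to scores in [-1, 1].
    """
    neighbors = defaultdict(dict)
    for a, b, weight in edges:
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    # Unlabeled words start neutral; seed words start at their known score.
    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbors.items():
            if node in seeds:
                updated[node] = seeds[node]  # keep known scores fixed
                continue
            total = sum(adjacent.values())
            updated[node] = sum(w * scores[n] for n, w in adjacent.items()) / total
        scores = updated
    return scores


# Hypothetical movie-domain graph; weights stand in for word proximity.
edges = [
    ("great", "touching", 0.8),
    ("touching", "moving", 0.9),
    ("moving", "slow", 0.1),
    ("slow", "tedious", 0.7),
    ("tedious", "boring", 0.9),
]
seeds = {"great": 1.0, "boring": -1.0}
scores = propagate_sentiment(edges, seeds)
for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{word:10s} {score:+.3f}")
```

Words strongly connected to a positive seed end up with positive scores, and the weak link between "moving" and "slow" keeps the two groups apart. A separate graph could be built the same way for phrases.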

Movie Rating Inference by Construction of Movie Sentiment Sentence using Movie comments and ratings (영화평과 평점을 이용한 감성 문장 구축을 통한 영화 평점 추론)

  • Oh, Yean-Ju;Chae, Soo-Hoan
    • Journal of Internet Computing and Services / v.16 no.2 / pp.41-48 / 2015
  • On movie review sites, movie ratings are determined by netizens' subjective judgment, so ratings and the opinions expressed in comments are often inconsistent. To solve this problem, this paper proposes sentiment sentence sets that affect movie evaluation and applies them to comments to infer ratings. The creation of sentiment sentence sets consists of two stages: construction of a sentiment word dictionary and creation of sentiment sentences for sentiment estimation. The sentiment word dictionary contains the sentiment words found in reviews and their polarities. Sentiment sentences combine movie-related nouns with predicates drawn from the sentiment word dictionary. To keep the polarity of the sentiment sentences consistent with the sentiment word dictionary, sentiment sentences whose polarity differs from that of the dictionary are removed. Comment scores are calculated by averaging the scores of the sentiment sentence elements. The experimental results show that the sentence scores from the sentiment sentence sets reflect the real opinions in comments more closely than the netizens' ratings do.

Sentiment Dictionary Construction Based on Reason-Sentiment Pattern Using Korean Syntax Analysis (한국어 구문분석을 활용한 이유-감성 패턴 기반의 감성사전 구축)

  • Woo Hyun Kim;Heejung Lee
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.4 / pp.142-151 / 2023
  • Sentiment analysis is a method used to comprehend feelings, opinions, and attitudes in text, and it is essential for evaluating consumer feedback and social media posts. However, creating sentiment dictionaries, which are necessary for this analysis, is complex and time-consuming because people express their emotions differently depending on the context and domain. In this study, we propose a new method for simplifying this procedure. We utilize syntax analysis of the Korean language to identify and extract sentiment words based on the Reason-Sentiment Pattern, which distinguishes between words expressing feelings and words explaining why those feelings are expressed, making it applicable in various contexts and domains. We also define sentiment words as those with clear polarity, even when used independently and exclude words whose polarity varies with context and domain. This approach enables the extraction of explicit sentiment expressions, enhancing the accuracy of sentiment analysis at the attribute level. Our methodology, validated using Korean cosmetics review datasets from Korean online shopping malls, demonstrates how a sentiment dictionary focused solely on clear polarity words can provide valuable insights for product planners. Understanding the polarity and reasons behind specific attributes enables improvement of product weaknesses and emphasis on strengths. This approach not only reduces dependency on extensive sentiment dictionaries but also offers high accuracy and applicability across various domains.

Construction and Evaluation of a Sentiment Dictionary Using a Web Corpus Collected from Game Domain (게임 도메인 웹 코퍼스를 이용한 감성사전 구축 및 평가)

  • Jeong, Woo-Young;Bae, Byung-Chull;Cho, Sung Hyun;Kang, Shin-Jin
    • Journal of Korea Game Society / v.18 no.5 / pp.113-122 / 2018
  • This paper describes an approach to building and evaluating a sentiment dictionary using a web corpus in the game domain. To build the dictionary, we collected vocabulary from game-related web documents on a domestic portal site using the Twitter Korean Processor. From the collected vocabulary, we selected words whose part-of-speech (POS) tags are verb or adjective and assigned a sentiment score to each selected word. To evaluate the constructed dictionary, we calculated precision, recall, and F1 score against Korean-SWN, which is based on the English SentiWordNet (SWN). The evaluation shows average F1 scores of 0.85 for adjectives and 0.77 for verbs.
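
The evaluation metric in this abstract can be made concrete with a short sketch. It assumes that both the constructed dictionary and the reference lexicon map words to a polarity label; the function name `f1_for_label` and the example words are hypothetical and do not come from the paper.

```python
def f1_for_label(predicted, reference, label):
    """Return (precision, recall, F1) for one polarity label.

    Only words present in both dictionaries are compared.
    """
    shared = predicted.keys() & reference.keys()
    tp = sum(1 for w in shared if predicted[w] == label and reference[w] == label)
    fp = sum(1 for w in shared if predicted[w] == label and reference[w] != label)
    fn = sum(1 for w in shared if predicted[w] != label and reference[w] == label)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical game-domain adjectives and a hypothetical reference lexicon.
predicted = {"fun": "pos", "buggy": "neg", "grindy": "pos", "smooth": "pos"}
reference = {"fun": "pos", "buggy": "neg", "grindy": "neg", "smooth": "pos"}

precision, recall, f1 = f1_for_label(predicted, reference, "pos")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of the three words predicted as positive are correct and no positive word is missed, so precision is 2/3, recall is 1, and F1 is 0.8.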

A Domain Adaptive Sentiment Dictionary Construction Method for Domain Sentiment Analysis (도메인 별 감성분석을 위한 도메인 맞춤형 감성사전 구축 기법)

  • Kim, Dahae;Cho, Taemin;Lee, Jee-Hyong
    • Proceedings of the Korean Society of Computer Information Conference / 2015.01a / pp.15-18 / 2015
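  • With the spread of SNS, the public actively expresses feelings and opinions about diverse domains such as products, services, and social issues. Accordingly, research that applies sentiment analysis to SNS data to predict phenomena such as product demand, TV ratings, and stock prices is being actively conducted. Sentiment analysis relies on a sentiment dictionary that specifies the part of speech, polarity, and sentiment score of each word. However, because the importance of the same word differs by domain, a sentiment dictionary that takes domain characteristics into account is needed. This study therefore proposes a method for constructing domain-adaptive sentiment dictionaries so that sentiment analysis can be performed more accurately according to the characteristics of each domain. For each domain, words that serve as important criteria in positive/negative evaluation were selected as domain sentiment words to build a list, and a domain sentiment score was newly defined according to the importance of each sentiment word. In the experiments, the sentiment dictionary suited to the evaluation domain outperformed both the sentiment dictionaries of other domains and a general-purpose sentiment dictionary. This confirms the effectiveness of the domain-adaptive sentiment dictionary construction method.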
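
One way to picture a domain-specific sentiment score is sketched below. The abstract does not give the scoring formula, so the normalized difference of document frequencies used here is an assumption, as are the function name `build_domain_dictionary`, the `min_count` and `threshold` parameters, and the sample reviews.

```python
from collections import Counter


def build_domain_dictionary(labeled_docs, min_count=2, threshold=0.5):
    """Select domain sentiment words and score them in [-1, 1].

    labeled_docs: iterable of (tokens, label) with label "pos" or "neg".
    A word's score is the normalized difference between the share of
    positive and negative documents that contain it (an assumed formula).
    """
    pos_counts, neg_counts = Counter(), Counter()
    n_pos = n_neg = 0
    for tokens, label in labeled_docs:
        if label == "pos":
            n_pos += 1
            pos_counts.update(set(tokens))
        else:
            n_neg += 1
            neg_counts.update(set(tokens))

    dictionary = {}
    for word in set(pos_counts) | set(neg_counts):
        if pos_counts[word] + neg_counts[word] < min_count:
            continue  # too rare to judge
        p = pos_counts[word] / n_pos if n_pos else 0.0
        n = neg_counts[word] / n_neg if n_neg else 0.0
        score = (p - n) / (p + n)
        if abs(score) >= threshold:
            dictionary[word] = score
    return dictionary


# Hypothetical movie reviews: "unpredictable" is praise in this domain.
reviews = [
    (["unpredictable", "plot", "great"], "pos"),
    (["unpredictable", "ending", "great"], "pos"),
    (["plot", "dull"], "neg"),
    (["dull", "ending", "slow"], "neg"),
]
domain_dictionary = build_domain_dictionary(reviews)
print(sorted(domain_dictionary.items()))
```

Words that appear equally often in both classes ("plot", "ending") are dropped, which mirrors the idea of keeping only words that matter for positive/negative evaluation in the domain.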

A Comparative Study between Stock Price Prediction Models Using Sentiment Analysis and Machine Learning Based on SNS and News Articles (SNS와 뉴스기사의 감성분석과 기계학습을 이용한 주가예측 모형 비교 연구)

  • Kim, Dongyoung;Park, Jeawon;Choi, Jaehyun
    • Journal of Information Technology Services / v.13 no.3 / pp.221-233 / 2014
  • As public interest in the stock market has grown with economic development, many studies have attempted to predict fluctuations in stock prices. Recently, many studies have applied scientific and technological methods among the various forecasting methods, and the data used for such studies are becoming more diverse. In this paper, we propose stock price prediction models that use sentiment analysis and machine learning based on news articles and SNS data to improve prediction accuracy. The proposed models are generated through a four-step process consisting of data collection, sentiment dictionary construction, sentiment analysis, and machine learning. News articles were collected from economy-related newspapers, and SNS data were collected from Twitter. The sentiment dictionary was built from the news articles among the collected data and is used for sentiment analysis. In the machine learning phase, we generate prediction models using various classification techniques and the data produced by sentiment analysis. After generating the prediction models, we conducted 10-fold cross-validation to measure their performance. The experimental results showed that accuracy exceeds 80% in a number of cases and that the F1 score is close to 0.8. These results represent a significant improvement over conventional research using opinion mining or data mining techniques.
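
The 10-fold cross-validation step mentioned in this abstract can be sketched as follows. The classifier here is a deliberately simple stand-in (a threshold on a daily sentiment score), not one of the paper's models, and the helper names `k_fold_splits`, `fit_threshold`, and `cross_validate` and the toy data are assumptions.

```python
def k_fold_splits(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def fit_threshold(scores, labels):
    """Stand-in model: midpoint between the mean scores of the two classes.

    Assumes both classes occur in the training data.
    """
    ups = [s for s, y in zip(scores, labels) if y == 1]
    downs = [s for s, y in zip(scores, labels) if y == 0]
    return (sum(ups) / len(ups) + sum(downs) / len(downs)) / 2


def cross_validate(scores, labels, k=10):
    """Return the mean accuracy over k folds."""
    accuracies = []
    for train, test in k_fold_splits(len(scores), k):
        threshold = fit_threshold([scores[i] for i in train], [labels[i] for i in train])
        correct = sum((scores[i] > threshold) == bool(labels[i]) for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# Toy data: daily sentiment score and whether the price rose (1) or fell (0).
daily_scores = [0.6 if i % 2 == 0 else -0.4 for i in range(20)]
price_rose = [1 if i % 2 == 0 else 0 for i in range(20)]
mean_accuracy = cross_validate(daily_scores, price_rose, k=10)
print(f"mean accuracy over 10 folds: {mean_accuracy:.2f}")
```

The toy data are perfectly separable, so the reported accuracy says nothing about real markets. With real data the samples would normally be shuffled or stratified before splitting.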

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.95-110 / 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published on various media is obviously a traditional example of unstructured text data. Every day, a large volume of new content is created, digitalized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information.
In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches have common limitations in that they do not consider the flexibility of sentiment polarity, that is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we have stated that sentiment polarity should be assigned, not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we presented an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model.
For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
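
The pipeline in this abstract (tag sentiment words, classify each article, predict the next day's direction) can be sketched in a few lines. The aggregation rule used here, predicting "up" when more than half of the day's articles are positive, is an assumption, as are the function names and the dictionary entries; the abstract does not specify them.

```python
def classify_article(tokens, dictionary):
    """Label an article by summing the sentiment values of its known words."""
    score = sum(dictionary.get(token, 0.0) for token in tokens)
    return "positive" if score > 0 else "negative"


def predict_direction(articles, dictionary):
    """Predict the next day's index direction from one day's articles.

    Assumed rule: "up" if more than half of the articles are positive.
    """
    positives = sum(classify_article(a, dictionary) == "positive" for a in articles)
    return "up" if positives / len(articles) > 0.5 else "down"


# Hypothetical stock-domain dictionary: word -> sentiment value.
stock_dictionary = {"surge": 1.0, "rally": 0.8, "plunge": -1.0, "default": -0.9}

todays_articles = [
    ["stocks", "surge", "rally"],
    ["bank", "default", "plunge"],
    ["rally", "continues"],
]
labels = [classify_article(a, stock_dictionary) for a in todays_articles]
direction = predict_direction(todays_articles, stock_dictionary)
print(labels, direction)
```

Swapping `stock_dictionary` for a general-purpose dictionary changes only the lookup table, which is what makes a comparison between domain-specific and general-purpose dictionaries straightforward to set up.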

A Study on the Polarity of Apartment Price News Using Big Data Analysis Method (빅데이터 분석기법을 활용한 아파트 가격 관련 뉴스 기사의 극성 분석)

  • Cho, Sang-Yeon;Hong, Eun-Pyo
    • Journal of Digital Convergence / v.17 no.9 / pp.47-54 / 2019
  • This study examines the polarity of news articles on apartment prices using opinion mining, which is widely used in big data analysis. The analysis used internet news articles posted on Naver in two years, 2012 and 2018. We proposed a sentiment analysis model and a method for constructing a topic-oriented sentiment dictionary. Applying the proposed model confirmed that, at times of rising apartment prices, media companies differed in their selection of social issues according to their leanings. We also found more favorable articles from media companies that share a similar sentiment with the government in power. In this paper, we proposed a sentiment analysis model that can be used in the real estate field and analyzed the polarity of unstructured data related to real estate. To extend the approach to other fields in the future, it is necessary to build sentiment dictionaries by theme and to collect diverse unstructured data over extended periods.

Construction of Onion Sentiment Dictionary using Cluster Analysis (군집분석을 이용한 양파 감성사전 구축)

  • Oh, Seungwon;Kim, Min Soo
    • Journal of the Korean Data Analysis Society / v.20 no.6 / pp.2917-2932 / 2018
  • Many studies have developed production prediction models to address the supply imbalance of onions, a vegetable closely tied to Korean cuisine. However, because onions can be stored, it is very difficult to resolve the supply imbalance by predicting production alone. This paper therefore aims to build a sentiment dictionary for predicting onion prices from internet articles, which are readily accessible in daily life and contain information about onion production and various price factors. We used articles about onions from 2012 to 2016 and compared four kinds of TF-IDF through document classification based on onion wholesale prices. Positive and negative words for price were classified by k-means clustering, DBSCAN (density-based spatial clustering of applications with noise), and GMM (Gaussian mixture model) clustering; GMM clustering yielded three meaningful dictionaries. To assess the validity of the resulting dictionaries, we applied them to a logistic regression on articles classified by price rise and fall, which achieved 85.7% accuracy.
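
The term weighting behind this abstract can be illustrated with a basic TF-IDF computation. The paper compares four TF-IDF variants without naming them here, so the plain form below (relative term frequency times the log of inverse document frequency) is only one common choice, and the function name `tf_idf` and the sample articles are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized articles about onion prices.
articles = [
    ["onion", "price", "rise", "shortage"],
    ["onion", "price", "drop", "surplus"],
    ["onion", "harvest", "surplus"],
]
weights = tf_idf(articles)
for doc_weights in weights:
    print({term: round(w, 3) for term, w in doc_weights.items()})
```

"onion" occurs in every article and so receives weight 0, while rarer terms such as "shortage" stand out. Vectors like these are the kind of input that clustering methods such as k-means or a Gaussian mixture model would group into positive and negative price words.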

Construction of Vietnamese SentiWordNet by using Vietnamese Dictionary (베트남어 사전을 사용한 베트남어 SentiWordNet 구축)

  • Vu, Xuan-Son;Park, Seong-Bae
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.745-748 / 2014
  • SentiWordNet is an important lexical resource supporting sentiment analysis in opinion mining applications. In this paper, we propose a novel approach to constructing a Vietnamese SentiWordNet (VSWN). SentiWordNet is typically generated from WordNet, in which each synset has numerical scores indicating its opinion polarities. Many previous studies obtained these scores by applying a machine learning method to WordNet. However, a Vietnamese WordNet was unfortunately not available at the time of writing. Therefore, we propose a method to construct VSWN from a Vietnamese dictionary rather than from WordNet. We show the effectiveness of the proposed method by automatically generating a VSWN with 39,561 synsets. The method is tested experimentally on 266 synsets in terms of positivity and negativity. It attains results competitive with English SentiWordNet, with differences of 0.066 and 0.052 for the positivity and negativity sets, respectively.