• Title/Summary/Keyword: emotion words (감정 단어)


The Characteristics of Malicious Comments: Comparisons of the Internet News Comments in Korean and English (악성 댓글의 특성: 한국어와 영어의 인터넷 뉴스 댓글 비교)

  • Kim, Young-il;Kim, Youngjun;Kim, Youngjin;Kim, Kyungil
    • The Journal of the Korea Contents Association / v.19 no.1 / pp.548-558 / 2019
  • As internet news comments have become widespread, malicious comments have proliferated and caused many social problems. Because writing reflects a person's mental state or traits, analyzing malicious comments makes it possible to infer the writers' mental states when they post internet news comments. In this study, we analyzed malicious comments by English and Korean speakers using LIWC and KLIWC. In both English and Korean, malicious comments scored higher than normal comments on sentences, word phrases, morphemes, word phrases per sentence, morphemes per sentence, positive emotion words, and cognitive process words, and lower than normal comments on third-person singular, adjectives, anger words, and emotional process words. This suggests that people are in a state in which they cannot control feelings such as anger and cannot think clearly when they write news comments. Therefore, service providers should consider ways for commenters to monitor their own writing as they compose comments, and ways to keep other users away from comments that contain many negative-emotion words. In addition, English and Korean malicious comments were found to be distinguished by authenticity. For greater objectivity, data gathered at various points in time are needed.
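The category scores reported in the abstract above come from dictionary-based word counting. A minimal sketch of that idea follows, assuming a tiny made-up lexicon; `TOY_LEXICON` and `category_rates` are hypothetical names, and the real LIWC and KLIWC dictionaries and their category definitions are not reproduced here.

```python
import re

# Hypothetical toy lexicon for illustration only; not the LIWC/KLIWC dictionaries.
TOY_LEXICON = {
    "posemo": {"good", "happy", "love"},
    "anger": {"hate", "angry", "stupid"},
}

def category_rates(text, lexicon=TOY_LEXICON):
    """Return, per category, the percentage of words in `text` found in that category."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {category: 0.0 for category in lexicon}
    return {
        category: 100.0 * sum(word in vocab for word in words) / len(words)
        for category, vocab in lexicon.items()
    }

rates = category_rates("I hate this stupid article")
```

Comparing such rates between malicious and normal comments is the kind of contrast the abstract describes; the statistical comparison itself is not shown.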

A Study on Fine-Tuning and Transfer Learning to Construct Binary Sentiment Classification Model in Korean Text (한글 텍스트 감정 이진 분류 모델 생성을 위한 미세 조정과 전이학습에 관한 연구)

  • JongSoo Kim
    • Journal of Korea Society of Industrial Information Systems / v.28 no.5 / pp.15-30 / 2023
  • Recently, generative models based on the Transformer architecture, such as ChatGPT, have been gaining significant attention. The Transformer architecture has been applied to various neural network models, including Google's BERT (Bidirectional Encoder Representations from Transformers) sentence generation model. This paper proposes a method for creating a binary text classification model that determines whether a comment in a Korean movie review is positive or negative. To this end, a pre-trained multilingual BERT sentence generation model is fine-tuned and transfer-learned using a new Korean training dataset. The pre-trained model is the multilingual BERT-Base model covering 104 languages, with 12 layers, 768 hidden units, 12 attention heads, and 110M parameters. To turn the pre-trained BERT-Base model into a text classification model, the input and output layers were fine-tuned, resulting in a new model with 178 million parameters. Using the fine-tuned model, with a maximum word count of 128, a batch size of 16, and 5 epochs, transfer learning is conducted with 10,000 training samples and 5,000 test samples. The result is a binary sentiment classification model for Korean movie reviews with an accuracy of 0.9582, a loss of 0.1177, and an F1 score of 0.81. Transfer learning with a dataset five times larger yielded a model with an accuracy of 0.9562, a loss of 0.1202, and an F1 score of 0.86.
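The abstract above reports both accuracy and an F1 score. As a minimal sketch of how the two metrics are derived from predicted and true labels, the function below computes them for a binary task. The labels are invented for illustration and are not the paper's data; `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented labels: 1 = positive review, 0 = negative review.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                         [1, 1, 0, 1, 0, 0, 0, 0])
```

Because F1 ignores true negatives while accuracy counts them, the two numbers can diverge on imbalanced data.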

Development of Emotional Word Collection System using Hash Tag of SNS (SNS의 해시태그를 이용한 감정 단어 수집 시스템 개발)

  • Lee, Jong-Hwa;Lee, Yun-Jae;Lee, Hyun-Kyu
    • The Journal of Information Systems / v.27 no.2 / pp.77-94 / 2018
  • Purpose: As the volume of data has grown enormous, finding the necessary information requires more effort. Curation, a term borrowed from the museum curator, is a service that helps people collect, share, and assign value to Internet content. On SNS, hashtags (#) are used to convey emotional vocabulary between users. Findings: Based on seven emotion sets ('Happy', 'Angry', 'Sad', 'Bad', 'Fearful', 'Surprised', 'Disgusted'), this study constructed 327 emotional seeds and used the autofill function of a web browser to collect 1.5 million emotional words from those seeds. The resulting emotional dictionary is considered meaningful as a tool for making emotional judgments from unstructured data.

Comparison and Analysis of Domestic and Foreign Sports Brands Using Text Mining and Opinion Mining Analysis (텍스트 마이닝과 오피니언 마이닝 분석을 활용한 국내외 스포츠용품 브랜드 비교·분석 연구)

  • Kim, Jae-Hwan;Lee, Jae-Moon
    • The Journal of the Korea Contents Association / v.18 no.6 / pp.217-234 / 2018
  • In this study, big data analysis was conducted on domestic and international sports goods brands. Text mining, TF-IDF, opinion mining, and interestity graph analyses were conducted using the social matrix program Textom and the fashion data analysis platform MISP. To examine recent perceptions of sports brands, the study period was limited to one year, from January 1, 2017 to December 31, 2017. The analysis confirmed, first, the products that represent each brand and, second, the marketing that represents each brand. Third, the common words extracted from each brand were identified. Fourth, the positive and negative emotions toward each brand were confirmed.

Comparison of Reliability and Validity of Three Korean Versions of the 20-Item Toronto Alexithymia Scale (TAS-20의 한국판 3종간의 신뢰도 및 타당도 비교)

  • Chung, Un-Sun;Rim, Hyo-Deog;Lee, Yang-Hyun;Kim, Sang-Heon
    • Korean Journal of Psychosomatic Medicine / v.11 no.1 / pp.77-88 / 2003
  • Objectives: The purpose of this study was to compare the reliability and validity of three Korean versions of the 20-item Toronto Alexithymia Scale and to identify the most reliable and valid Korean translation for both clinical and research purposes in Korea. The first was a Korean version of the 20-item Toronto Alexithymia Scale developed by Lee YH et al in 1996, designated TAS-20K(1996) in this study. This scale had a problem with one item owing to a cultural difference between Western and Korean culture regarding the word 'analyzing'. The second was the version of TAS-20K(1996) revised on that point by Lee YH et al in 1996 without validation, designated TAS-20K(2003) in this study. The third was a 23-item Korean version developed by Sin HG and Won HT in 1997, which differed somewhat from the 20-item Toronto Alexithymia Scale (TAS-20) in the total number of items, the content of some items, and the scoring method. This scale was designated S-TAS here. Methods: 408 medical students were tested with a single scale composed of all the distinct items from the three versions, randomly arranged. We evaluated goodness-of-fit and Cronbach $\alpha$ coefficients of the three scales for reliability. We used confirmatory factor analysis to compare validity. Results: TAS-20K(2003) showed better internal consistency than TAS-20K(1996), which implies that cultural differences should be considered in the Korean translation. Both TAS-20K(2003) and S-TAS replicated the three-factor structure and had adequate fit, good internal consistency, and acceptable validity. However, S-TAS had one item with poor item-factor correlation and did not show a high correlation between item 2 and factor 1, as before in 1997. Conclusion: Although S-TAS added 3 items and changed the content of two items, it did not show better reliability and validity than TAS-20K(2003). Therefore, TAS-20K(2003) is proposed as the Korean version of the 20-item Toronto Alexithymia Scale (TAS-20K) for international communication of alexithymia research results. It has good internal consistency and validity and maintains the original items, the same construct, and the same scoring method as the 20-item Toronto Alexithymia Scale.
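The abstract above uses Cronbach's $\alpha$ as its internal-consistency measure. A minimal sketch of the standard formula, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$ for $k$ items with item variances $s_i^2$ and total-score variance $s_T^2$, is shown below. The response matrix is invented for illustration, and `cronbach_alpha` is a hypothetical helper name, not the authors' code.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Transpose so each column holds one item's scores across respondents.
    item_columns = list(zip(*responses))
    item_variance_sum = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Invented data: three respondents answering three perfectly consistent items.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

The sketch uses sample variances throughout and assumes the total scores are not all identical, since that would make the denominator zero.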


Trend Analysis of Corona Virus(COVID-19) based on Social Media (소셜미디어에 나타난 코로나 바이러스(COVID-19) 인식 분석)

  • Yoon, Sanghoo;Jung, Sangyun;Kim, Young A
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.5 / pp.317-324 / 2021
  • This study deals with keywords from social media on domestic portal sites related to COVID-19, which is spreading widely. The data were collected between January 20 and August 15, 2020, and were divided into three stages. The precursor period is before COVID-19 started spreading widely between January 20 and February 17, the serious period denotes the spread in Daegu between February 18 and April 20, and the stable period is the decrease in numbers of confirmed infections up to August 15. The top 50 words were extracted and clustered based on TF-IDF. As a result of the analysis, the precursor period keywords corresponded to congestion of the Situation. The frequent keywords in the serious period were Nation and Infection Route, along with instability surrounding the Treatment of COVID-19. The most common keywords in all periods were infection, mask, person, occurrence, confirmation, and information. People's emotions are becoming more positive as time goes by. Cafes and blogs share text containing writers' thoughts and subjectivity via the internet, so they are the main information-sharing spaces in the non-face-to-face era caused by COVID-19. However, since selectivity and randomness in information delivery exists, a critical view of the information produced on social media is necessary.
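The abstract above extracts the top 50 words by TF-IDF. A minimal sketch of one common TF-IDF variant follows (term frequency normalized by document length, inverse document frequency as log(N/df), scores summed over documents). The abstract does not say which variant or tool was used, so this weighting, the toy token lists, and the name `top_tfidf_terms` are all assumptions.

```python
import math
from collections import Counter

def top_tfidf_terms(documents, k=50):
    """Rank terms by TF-IDF summed over all tokenized documents."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in documents:
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += tf * idf
    return scores.most_common(k)

# Invented, already-tokenized posts for illustration.
posts = [
    ["mask", "infection", "mask"],
    ["infection", "information"],
    ["infection", "confirmation", "mask"],
]
ranking = top_tfidf_terms(posts, k=2)
```

With this weighting, a word that occurs in every document receives a score of zero, which is why very common words can drop out of a TF-IDF ranking even when they are frequent.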

Development of Scaffolding Strategies Model by Information Search Process (ISP) (정보탐색과정(ISP)에 의한 스캐폴딩 전략 모형 개발)

  • Jeong-Hoon Lim
    • Journal of Korean Library and Information Science Society / v.54 no.1 / pp.143-165 / 2023
  • This study aims to propose a scaffolding strategy that can be applied to the information search process by using Kuhlthau's ISP model, which presented a design and implementation strategy for the mediation role in the learning process. To this end, the relevant literature was reviewed to categorize scaffolding strategies, and impressions were collected through student surveys after 150 middle school students in the Daejeon area took a project class to which the scaffolding strategy based on the ISP model was applied. The collected data were preprocessed into a form suitable for analysis so that word frequencies could be extracted, and topic analysis was performed using STM (Structural Topic Modeling). First, after determining the optimal number of topics and extracting topics for each stage of the ISP model, the extracted topics were classified into three types: cognitive domain-macro perspective, cognitive domain-micro perspective, and emotional domain perspective. In this process, we focused on cognitive verbs and emotional verbs among the words extracted through text mining, and presented a scaffolding strategy model related to each topic by reviewing representative document cases. Based on the results of this study, if an appropriate scaffolding strategy is provided at each ISP model stage, a positive effect on learners' self-directed task solving can be expected.

A Study on Personalized Emotion Recognition in Forest Healing Space - Focus on Subjective Qualitative Analysis and Bio-signal Measurement - (산림 치유 공간에서의 개인 감정 인지 효과에 관한 연구)

  • Lee, Yang-Woo;Seo, Yong-Mo;Lee, Jung-Nyun;Whang, Min-Cheol
    • Journal of Korea Entertainment Industry Association / v.13 no.2 / pp.57-65 / 2019
  • This study takes a scientific approach to psychological factors, such as emotional stability, among the various effects of forest resources. To this end, an experiment was conducted with subjects in a variety of forest healing spaces. The participants were students in their twenties, with an average age of 22±1.25 years. The subjects rated emotional words through a subjective sequence evaluation at different designated forest healing spots. In addition, their bio-signals were measured to capture the emotional states they actually perceived. BPM, SDNN, VLF, LF, HF, Amplitude, and PPI were used in the bio-signal reaction experiment. The results were analyzed statistically with the Friedman test and the Wilcoxon test. For subjective emotional vocabulary, the words 'good', 'clear', and 'uncomfortable' were statistically significant across the spots of the forest healing space. In addition, SDNN, HF, and Amplitude were statistically significant in the quantitative bio-signal measurements at each spot. Based on these results, we can suggest directions for application and strategic utilization plans for forest healing spots and the forest resource utilization field. The results can serve not only as a guide for facility users, in terms of the spatial facilities and physical requirements for emotion-based forest healing, but also as a basis for personalized emotional space design.
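The abstract above compares measurements across several forest healing spots with the Friedman test. A minimal sketch of the Friedman chi-square statistic is given below: each subject's values are ranked across conditions, and the statistic is computed from the per-condition rank sums. The ratings are invented, the function names are hypothetical, and the tie correction and p-value are omitted; in practice a statistics library (for example `scipy.stats.friedmanchisquare`) would normally be used instead.

```python
def rank_with_ties(values):
    """Rank values from 1..k, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for index in order[i:j + 1]:
            ranks[index] = average_rank
        i = j + 1
    return ranks

def friedman_statistic(ratings):
    """Friedman chi-square for n subjects (rows) by k conditions (columns)."""
    n, k = len(ratings), len(ratings[0])
    rank_sums = [0.0] * k
    for row in ratings:
        for condition, rank in enumerate(rank_with_ties(row)):
            rank_sums[condition] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Invented ratings: four subjects rating one emotional word at three spots.
statistic = friedman_statistic([[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]])
```

When every subject orders the conditions identically, as in this toy input, the statistic reaches its maximum of n(k-1).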

Analysis of the Time-dependent Relation between TV Ratings and the Content of Microblogs (TV 시청률과 마이크로블로그 내용어와의 시간대별 관계 분석)

  • Choeh, Joon Yeon;Baek, Haedeuk;Choi, Jinho
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.163-176 / 2014
  • Social media is becoming the platform for users to communicate their activities, status, emotions, and experiences to other people. In recent years, microblogs, such as Twitter, have gained in popularity because of their ease of use, speed, and reach. Compared to a conventional web blog, a microblog lowers users' efforts and investment for content generation by recommending shorter posts. There has been a lot of research into capturing social phenomena and analyzing the chatter of microblogs. However, measuring television ratings has been given little attention so far. Currently, the most common method to measure TV ratings uses an electronic metering device installed in a small number of sampled households. Microblogs allow users to post short messages, share daily updates, and conveniently keep in touch. In a similar way, microblog users interact with each other while watching television or movies, or visiting a new place. In measuring TV ratings, some features are significant during certain hours of the day, or days of the week, whereas these same features are meaningless during other time periods. Thus, the importance of features can change during the day, and a model capturing this time-sensitive relevance is required to estimate TV ratings. Therefore, modeling the time-related characteristics of features should be key when measuring TV ratings through microblogs. We show that capturing the time-dependency of features is vitally necessary for improving the accuracy of TV ratings measurement. To explore the relationship between the content of microblogs and TV ratings, we collected Twitter data using the Get Search component of the Twitter REST API from January 2013 to October 2013. There are about 300 thousand posts in our data set for the experiment. After excluding data such as advertising or promoted tweets, we selected 149 thousand tweets for analysis. The number of tweets reaches its maximum level on the broadcasting day and increases rapidly around the broadcasting time. This result stems from the characteristics of the public channel, which broadcasts the program at a predetermined time. From our analysis, we find that count-based features such as the number of tweets or retweets have a low correlation with TV ratings. This result implies that a simple tweet rate does not reflect the satisfaction with or response to the TV programs. Content-based features extracted from the content of tweets have a relatively high correlation with TV ratings. Further, some emoticons or newly coined words that are not tagged in the morpheme extraction process have a strong relationship with TV ratings. We find that there is a time-dependency in the correlation of features between the periods before and after the broadcasting time. Since the TV program is broadcast regularly at a predetermined time, users post tweets expressing their expectation for the program or disappointment over not being able to watch it. The highly correlated features before the broadcast are different from the features after the broadcast. This result shows that the relevance of words to TV programs can change according to the time of the tweets. Among the 336 words that fulfill the minimum requirements for candidate features, 145 words have their highest correlation before the broadcasting time, whereas 68 words reach their highest correlation after broadcasting. Interestingly, some words that express the impossibility of watching the program show a high relevance, despite containing a negative meaning. Understanding the time-dependency of features can be helpful in improving the accuracy of TV ratings measurement. This research provides a basis for estimating the response to or satisfaction with broadcast programs using the time dependency of words in Twitter chatter. More research is needed to refine the methodology for predicting or measuring TV ratings.
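The abstract above compares how strongly a word's frequency correlates with TV ratings before versus after the broadcast. A minimal sketch of that comparison follows. It assumes Pearson correlation, which the abstract does not specify, and uses invented counts and ratings; `pearson` and `strongest_window` are hypothetical helper names.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # A constant series carries no correlation information.
    return covariance / (spread_x * spread_y)

def strongest_window(counts_by_window, ratings):
    """Return the time window whose word counts correlate most with ratings."""
    correlations = {window: pearson(counts, ratings)
                    for window, counts in counts_by_window.items()}
    best = max(correlations, key=lambda window: abs(correlations[window]))
    return best, correlations

# Invented data: one word's per-episode tweet counts and the episode ratings.
ratings = [5.0, 6.0, 7.0, 8.0]
word_counts = {"before": [10, 20, 30, 40], "after": [3, 1, 4, 1]}
window, correlations = strongest_window(word_counts, ratings)
```

Repeating this per candidate word would give the kind of before/after split the abstract reports, though the authors' exact feature definitions are not stated.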

A Study on the expression and reader cognition of a Comics character (만화캐릭터의 표정과 독자 인지에 관한 연구)

  • Yoon, Jang-Won
    • Proceedings of the Korea Contents Association Conference / 2006.11a / pp.227-231 / 2006
  • In the 21st-century era of new media, comics and animation carry growing weight across all fields of art, along with growing importance in various visual media. In particular, as users' demands on an ever-developing visual culture increase, the field of sensitivity engineering has become keenly aware of the need for sensitivity design. Given South Korea's geographical position, Japanese comics and animation, among the most addictive elements of Japanese culture, have for years influenced a broad audience from youth upward, which calls for a systematic inquiry into the language of Korean comics. This research was conducted because we believed that sufficient study of these various situations is required. To study the expressions of cartoon characters, we divided the expressions of characters appearing in Japanese cartoons into the categories "happiness, anger, sadness, pleasure" and "fear, astonishment and dislike". Based on these categories, we extracted the minimum elements needed to express emotions in cartoons, prepared an image map relating them to the words that express human emotions, and on that basis built a tool for calculating how readers would read these expressions. The Japanese cartoon samples chosen for extracting the elements of expression were limited to published cartoons, and the study lays a foundation for future expression analysis of animation characters.
