• Title/Summary/Keyword: Text Sentiment

261 search results (processing time: 0.025 seconds)

User Experience Analysis and Management Based on Text Mining: A Smart Speaker Case (텍스트 마이닝 기반 사용자 경험 분석 및 관리: 스마트 스피커 사례)

  • Dine Yeon;Gayeon Park;Hee-Woong Kim
    • Information Systems Review / Vol. 22, No. 2 / pp.77-99 / 2020
  • A smart speaker is a device that uses artificial intelligence to provide an interactive voice-based service for searching and using various information and content, such as music, calendars, weather, and merchandise. Because AI technology delivers more sophisticated and optimized services as it accumulates data, early smart speaker manufacturers tried to build a platform through aggressive marketing. However, more than one third of users use their smart speakers less than once a month, and user satisfaction is only 49%. Accordingly, the need to strengthen the smart speaker user experience has emerged, both to acquire a large number of users and to enable continuous use. This study therefore analyzes the user experience of smart speakers and proposes a method for enhancing it. Based on the results of a two-stage analysis, we propose ways to enhance the user experience for each smart speaker model. Existing research on the smart speaker user experience has relied mainly on surveys and interviews, whereas this study collected actual review data written by users. This study also interpreted the analysis results in terms of smart speaker user experience dimensions. Its academic significance lies in developing these dimensions and using them to interpret the text mining results. Based on the results, we suggest strategies for enhancing the user experience to smart speaker manufacturers.

Generating Sponsored Blog Texts through Fine-Tuning of Korean LLMs (한국어 언어모델 파인튜닝을 통한 협찬 블로그 텍스트 생성)

  • Bo Kyeong Kim;Jae Yeon Byun;Kyung-Ae Cha
    • Journal of Korea Society of Industrial Information Systems / Vol. 29, No. 3 / pp.1-12 / 2024
  • In this paper, we fine-tuned KoAlpaca, a large-scale Korean language model, and implemented a blog text generation system utilizing it. Blogs on social media platforms are widely used as a marketing tool for businesses. We constructed training data of positive reviews through emotion analysis and refinement of collected sponsored blog texts and applied QLoRA for the lightweight training of KoAlpaca. QLoRA is a fine-tuning approach that significantly reduces the memory usage required for training, with experiments in an environment with a parameter size of 12.8B showing up to a 58.8% decrease in memory usage compared to LoRA. To evaluate the generative performance of the fine-tuned model, texts generated from 100 inputs not included in the training data produced on average more than twice the number of words compared to the pre-trained model, with texts of positive sentiment also appearing more than twice as often. In a survey conducted for qualitative evaluation of generative performance, responses indicated that the fine-tuned model's generated outputs were more relevant to the given topics on average 77.5% of the time. This demonstrates that the positive review generation language model for sponsored content in this paper can enhance the efficiency of time management for content creation and ensure consistent marketing effects. However, to reduce the generation of content that deviates from the category of positive reviews due to elements of the pre-trained model, we plan to proceed with fine-tuning using the augmentation of training data.

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / Vol. 28, No. 2 / pp.237-262 / 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. 
For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and each sentence in the training set was labeled manually after being read by a person. Then, based on the training set, a Support Vector Machine model was used to predict the tone of the sentences in the test set. Finally, the percentages of positive, neutral, and negative sentences in each prospectus were calculated from the model's predictions. To predict the price movement of an IPO stock, four machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that use quantitative variables from technical analysis together with prospectus tone variables show higher accuracy than models that use only quantitative variables. More specifically, prediction accuracy improved by 1.45 percentage points in the Random Forest model, 4.34 percentage points in the Artificial Neural Network model, and 5.07 percentage points in the Support Vector Machine model. Among the techniques tested, the Artificial Neural Network model using both quantitative variables and prospectus tone variables achieved the highest prediction accuracy, 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify whether the difference between the models was statistically significant. Comparing the model using only quantitative variables with the model using both quantitative variables and prospectus tone variables confirmed that predictive performance improved significantly at the 1% significance level.
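
The McNemar comparison mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the common continuity-corrected form of the test; the abstract does not say which variant the authors used, and the function name `mcnemar` and the counts in the usage note are hypothetical, not figures from the paper.

```python
import math


def mcnemar(b, c):
    """McNemar test with continuity correction (an assumed variant).

    b: cases the first model classified correctly and the second did not.
    c: cases the first model classified incorrectly and the second correctly.
    Returns (chi-square statistic, p-value) for 1 degree of freedom.
    """
    if b + c == 0:
        # No discordant pairs: the two models never disagree.
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

For example, `mcnemar(10, 25)` compares two models that disagree on 35 hypothetical test cases; a p-value below 0.01 would correspond to significance at the 1% level referred to above.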

A Study on Analysis of consumer perception of YouTube advertising using text mining (텍스트 마이닝을 활용한 Youtube 광고에 대한 소비자 인식 분석)

  • Eum, Seong-Won
    • Management & Information Systems Review / Vol. 39, No. 2 / pp.181-193 / 2020
  • This study analyzes consumer perception using text mining, a topic of recent interest. We analyzed consumers' perception of the Samsung Galaxy by analyzing consumer reviews of Samsung Galaxy YouTube ads. For the analysis, 1,819 consumer reviews of YouTube ads were extracted. Through data pre-processing, keywords related to the advertisements were extracted and classified into nouns, adjectives, and adverbs. Frequency analysis and sentiment analysis were then performed, and finally clustering was performed through CONCOR. The results are summarized as follows. First, the most frequently mentioned words were Galaxy Note (n = 217), Good (n = 135), Pen (n = 40), and Function (n = 29). The prominence of "Galaxy Note", "Good", "Pen", and "Function" suggests that consumers who saw the advertisements regard Samsung mobile phone products as functionally strong and view the Note's pen positively. In addition, the appearance of "Samsung Pay", "Innovation", "Design", and "iPhone" shows that Samsung's mobile phones are highly regarded for their innovative design and for the functionality of Samsung Pay. Second, in the sentiment analysis of the YouTube advertising, the share of positive sentiment (75.95%) was higher than that of negative sentiment (24.05%), which means that consumers perceive Samsung Galaxy mobile phones positively. In the sentiment keyword analysis, the positive keywords extracted included "good", "good", "innovative", "highest", "fast", and "pretty", and the negative keywords included "frightening", "I want to cry", "discomfort", "sorry", and "no". The implication of this study is as follows. Most existing studies of consumer perception of advertising have used quantitative analysis methods. This study departed from quantitative research methods for advertising and attempted to analyze consumer perception through qualitative research. This is expected to have a considerable influence on future research and to serve as a starting point for consumer perception research through qualitative research.

Intelligent VOC Analyzing System Using Opinion Mining (오피니언 마이닝을 이용한 지능형 VOC 분석시스템)

  • Kim, Yoosin;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / Vol. 19, No. 3 / pp.113-125 / 2013
  • Every company wants to know its customers' requirements and makes an effort to meet them. As a result, communication between customers and companies has become a core competitive factor in business, and its importance continues to grow. There are several strategies for finding customers' needs, but VOC (voice of the customer) is one of the most powerful communication tools, and gathering VOC through channels such as telephone, post, e-mail, and websites is highly meaningful. Consequently, almost every company gathers VOC and operates a VOC system. VOC is important not only to business organizations but also to public organizations such as governments, educational institutes, and medical centers, which need to raise public service quality and customer satisfaction. Accordingly, they build systems for gathering and analyzing VOC and use them to create new products and services and to improve existing ones. In recent years, innovations in the internet and ICT have created diverse channels, such as SNS, mobile, websites, and call centers, for collecting VOC data. Although a great deal of VOC data is collected through these channels, using it properly remains difficult, because VOC data consists of highly emotional content expressed in informal voice or text and because its volume is so large. Such unstructured big data is difficult for humans to store and analyze. Organizations therefore need a system that automatically collects, stores, classifies, and analyzes unstructured big VOC data. This study proposes an intelligent VOC analyzing system based on opinion mining that automatically classifies unstructured VOC data and determines both the polarity and the type of each VOC. The basis of the VOC opinion analyzing system, a domain-oriented sentiment dictionary, is then created, and the corresponding stages are presented in detail. The experiment was conducted with 4,300 VOC records collected from a medical website to measure the effectiveness of the proposed system; the records were also used to develop the sentiment dictionary by determining the domain-specific sentiment vocabulary and its polarity values in the medical domain. The experiment shows that positive terms such as "칭찬 (praise), 친절함 (kindness), 감사 (thanks), 무사히 (safely), 잘해 (doing well), 감동 (being moved), 미소 (smile)" have high positive opinion values, and negative terms such as "퉁명 (curt), 뭡니까 ("what is this?"), 말하더군요 ("so they said"), 무시하는 (dismissive)" have strong negative opinion values. These terms are in general use, and the experimental results suggest that their opinion polarity is highly reliable. Furthermore, the accuracy of the proposed VOC classification model was compared across thresholds, and the highest classification accuracy, 77.8%, was confirmed at an opinion classification threshold of -0.50. With the proposed intelligent VOC analyzing system, the opinion class and response priority of a VOC can be predicted in real time. Ultimately, the system is expected to catch customer complaints at an early stage and handle them quickly with fewer staff operating the VOC system, freeing up the human resources and time of the customer service department. Above all, this study is a new attempt to automatically analyze unstructured VOC data using opinion mining, and it shows that the system can be used to classify the positive or negative polarity of VOC opinions. It is expected to offer a practical VOC analysis framework for diverse uses, and the model could serve as a real VOC analyzing system once implemented. Despite these results and expectations, this study has several limitations. First of all, the sample data was collected only from a single hospital website, so the sentiment dictionary built from it may be biased toward that hospital and website. Therefore, future research should cover additional channels such as call centers and SNS, and other domains such as government, financial companies, and educational institutes.
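
As a rough illustration of dictionary-based polarity classification with a threshold, the following Python sketch scores a VOC text against a small sentiment dictionary. Everything numeric here is an assumption: the polarity values are invented for the example, the averaging rule and the substring matching (in place of morphological analysis) are simplifications, and the way the -0.50 threshold is applied is a guess at the general idea rather than the paper's procedure. Only the terms themselves come from the abstract.

```python
# Hypothetical polarity values in [-1, 1]; the paper's actual dictionary
# values are not given in the abstract.
SENTIMENT_DICT = {
    "칭찬": 0.9,
    "친절함": 0.8,
    "감사": 0.8,
    "퉁명": -0.9,
    "무시하는": -0.8,
}


def opinion_score(text, dictionary=SENTIMENT_DICT):
    """Average polarity of dictionary terms found in the text (0.0 if none match)."""
    # Substring matching stands in for proper Korean morphological analysis.
    hits = [value for term, value in dictionary.items() if term in text]
    return sum(hits) / len(hits) if hits else 0.0


def classify(text, threshold=-0.50):
    """Label a VOC 'negative' when its score is at or below the threshold."""
    return "negative" if opinion_score(text) <= threshold else "positive"
```

Under these assumptions, `classify("친절함에 감사드립니다")` yields "positive", while a complaint containing "퉁명" and "무시하는" falls below the threshold and is labeled "negative".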

A Study on the Enhancing Recommendation Performance Using the Linguistic Factor of Online Review based on Deep Learning Technique (딥러닝 기반 온라인 리뷰의 언어학적 특성을 활용한 추천 시스템 성능 향상에 관한 연구)

  • Dongsoo Jang;Qinglong Li;Jaekyeong Kim
    • Journal of Intelligence and Information Systems / Vol. 29, No. 1 / pp.41-63 / 2023
  • As the online e-commerce market grows, the need for recommender systems that can provide suitable products or services to customers is emerging. Recently, many studies have used the sentiment scores of online reviews to overcome the limitations of recommender systems that rely only on quantitative information. However, this approach is limited in extracting specific customer preference information from online reviews, which makes it difficult to improve recommendation performance. To address this limitation, this study proposes a novel recommendation methodology that applies deep learning and uses various linguistic factors in online reviews to learn customer preferences in detail. First, deep learning was used to learn nonlinear interactions in order to capture the complex interactions between customers and products. Second, to use online reviews effectively, the study drew on three linguistic factors that strongly influence customers' purchasing decisions: cognitive content, affective content, and linguistic style matching. To verify the proposed methodology, an experiment was conducted using online review data from Amazon.com, and the results confirmed the superiority of the proposed model. This study contributes to the theoretical and methodological aspects of recommender system research by proposing a methodology that effectively uses the characteristics of customer preferences expressed in online reviews.

Dedicatory Inscriptions on the Amitabha Buddha and Maitreya Bodhisattva Sculptures of Gamsansa Temple (감산사(甘山寺) 아미타불상(阿彌陁佛像)과 미륵보살상(彌勒菩薩像) 조상기(造像記)의 연구)

  • Nam, Dongsin
    • MISULJARYO - National Museum of Korea Art Journal / Vol. 98 / pp.22-53 / 2020
  • This paper analyzes the contents, characteristics, and historical significance of the dedicatory inscriptions (josanggi) on the Amitabha Buddha and the Maitreya Bodhisattva statues of Gamsansa Temple, two masterpieces of Buddhist sculpture from the Unified Silla period. In the first section, I summarize research results from the past century (divided into four periods), before presenting a new perspective and methodology that questions the pre-existing notion that the Maitreya Bodhisattva has a higher rank than the Amitabha Buddha. In the second section, through my own analysis of the dedicatory inscriptions, arrangement, and overall appearance of the two images, I assert that the Amitabha Buddha sculpture actually held a higher rank and greater significance than the Maitreya Bodhisattva sculpture. In the third section, for the first time, I provide a new interpretation of two previously undeciphered characters from the inscriptions. In addition, by comparing the sentence structures from the respective inscriptions and revising the current understanding of the author (chanja) and calligrapher (seoja), I elucidate the possible meaning of some ambiguous phrases. Finally, in the fourth section, I reexamine the content of both inscriptions, differentiating between the parts relating to the patron (josangju), the dedication (josang), and the prayers of the patrons or donors (balwon). In particular, I argue that the phrase "for my deceased parents" is not merely a general axiom, but a specific reference. To summarize, the dedicatory inscriptions can be interpreted as follows: when Kim Jiseong's parents died, they were cremated and he scattered most of their remains by the East Sea. But years later, he regretted having no physical memorial of them to which to pay his respects. Thus, in his later years, he donated his estate on Gamsan as alms and led the construction of Gamsansa Temple. 
He then commissioned the production of the two stone sculptures of Amitabha Buddha and Maitreya Bodhisattva for the temple, asking that they be sculpted realistically to reflect the actual appearance of his parents. Finally, he enshrined the remains of his parents in the sculptures through the hole in the back of the head (jeonghyeol). The Maitreya Bodhisattva is a standing image with a nirmanakaya, or "transformation Buddha," on the crown. As various art historians have pointed out, this iconography is virtually unprecedented among Maitreya images in East Asian Buddhist sculpture, leading some to speculate that the standing image is actually the Avalokitesvara. However, anyone who reads the dedicatory inscription can have no doubt that this image is in fact the Maitreya. To ensure that the sculpture properly embodied his mother (who wished to be reborn in Tushita Heaven with Maitreya Bodhisattva), Kim Jiseong combined the iconography of the Maitreya and Avalokitesvara (the reincarnation of compassion). Hence, Kim Jiseong's deep love for his mother motivated him to modify the conventional iconography of the Maitreya and Avalokitesvara. A similar sentiment can be found in the sculpture of Amitabha Buddha. To this day, any visitor to the temple who first looks at the sculptures from the front before reading the text on the back will be deeply touched by the filial love of Kim Jiseong, who truly cherished the memory of his parents.

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / Vol. 24, No. 2 / pp.1-19 / 2018
  • Because forecasting stock movements is an important issue both academically and practically, stock price prediction has been studied actively. Stock price forecasting research is classified by whether it uses structured or unstructured data, and more specifically into technical analysis, fundamental analysis, and media effect analysis. In the big data era, research that combines stock price prediction with big data is actively underway, and such research mainly relies on machine learning techniques applied to large amounts of data. In particular, methods that incorporate media effects have recently attracted attention, and studies that analyze online news to forecast stock prices are becoming mainstream. Previous studies that predict stock prices from online news mostly perform sentiment analysis of the news, build a separate corpus for each company, and construct a dictionary that predicts stock prices by recording responses according to past stock prices. Existing studies have therefore examined the impact of online news on individual companies. For example, stock movements of Samsung Electronics are predicted using only online news about Samsung Electronics. In addition, methods that consider influences among highly relevant companies have been studied recently. For example, stock movements of Samsung Electronics are predicted using news about Samsung Electronics and a highly related company such as LG Electronics. These studies examine the effect of news about a homogeneous industrial sector on an individual company, classifying homogeneous industries according to the Global Industrial Classification Standard. In other words, they assume that industries as divided by the Global Industrial Classification Standard are homogeneous. However, existing studies are limited in that they neither take into account influential companies with high relevance nor reflect heterogeneity within the same Global Industrial Classification Standard sector. Our examination of various sectors shows that some industrial sectors are not homogeneous groups. To overcome these limitations, our study proposes a methodology that applies k-means clustering to reflect the heterogeneous effects within an industrial sector that affect stock prices. Multiple Kernel Learning is mainly used to integrate data with various characteristics; it has several kernels, each of which receives different data and makes predictions from it. We used Multiple Kernel Learning to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices from financial news variables of the industrial group, divided into the target firm and the clusters obtained by k-means cluster analysis. To show that the proposed methodology is appropriate, experiments were conducted on three years of online news and stock prices. The results of this study are as follows. (1) We confirmed that information about the industrial sectors related to a target company also contains meaningful information for predicting the target company's stock movements, and that the machine learning algorithm has better predictive power when news about relevant companies and news about the target company are considered together. (2) It is important to vary the number of clusters used to predict stock movements according to the level of homogeneity in the industrial sector. In other words, when stock prices within an industrial sector are homogeneous, it is important either to use the relational effect at the level of the industry group without clustering or to use a small number of clusters. When stock prices within an industry group are heterogeneous, it is important to cluster the firms into groups. This study contributes by demonstrating that firms classified under the same Global Industrial Classification Standard sector are heterogeneous and by suggesting that relevance should be defined through machine learning and statistical analysis rather than simply by the Global Industrial Classification Standard. It also contributes by demonstrating the efficiency of a prediction model that reflects heterogeneity.
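
The k-means step described in this abstract can be sketched in plain Python as follows. This is a generic textbook k-means, not the authors' implementation: the feature vectors (here, short per-firm return series), the seeding rule (the first k points), and the fixed iteration count are all assumptions made for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on equal-length tuples; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids
```

For instance, calling `kmeans` with four hypothetical firms whose two-day returns are `(0.010, 0.020)`, `(-0.030, -0.025)`, `(0.012, 0.018)`, and `(-0.028, -0.030)` and `k=2` groups the first and third firms together and the second and fourth together. In the setting of the abstract, each resulting group within a sector would then supply its own news variables to a separate kernel.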

A Study on Gusadang Kim Nakhaeng's Writing for Ancestral Rites - Exploring the source of his appealing (구사당(九思堂) 김낙행(金樂行)의 제문(祭文) 연구(硏究) - 호소력의 근원에 대한 탐색 -)

  • Jeong, Si-youl
    • (The)Study of the Eastern Classic / No. 59 / pp.93-120 / 2015
  • The purpose of this study is to explore the source of the appeal of Gusadang Kim Nakhaeng's writing for ancestral rites. Gusadang was a Confucian scholar in Yeongnam during the 18th century and was praised for the scholarly virtues of jihaenghapil and silcheongunghaeng. Although Gusadang's writing for ancestral rites and the letters of his teacher Milam Lee Jaeui were even given the special name 'gujemilchal', there has been almost no research on Gusadang's writing for ancestral rites. This study therefore selects for discussion three of his pieces that are especially rich in emotional expression. Chapter 2, titled 'the Reconstruction of Memory in a Microscopic Perspective', presents the reason why Gusadang's writing for ancestral rites is recognized as work with strong appeal. Writing for ancestral rites starts from the existence of memories shared by the living and the dead. The writer's memory plays an important role in reconstructing in detail, within the ritual writing, anecdotes shared with the dead. Chapter 3, titled 'the Rhetorical Reconstruction of Elevated Sensitivity', examines the rhetorical devices needed in writing for ancestral rites. Proper rhetoric is needed to raise the dignity of the ritual writing and to arouse sympathy in readers. Although writing for ancestral rites is by its formal nature meant to express sadness, it should not end up as a mere outlet for emotion. Chapter 4 looks into 'the Descriptive Reconstruction of Lamenting Sentiment'. Description needs a clear focus if the gesture of the living toward someone who no longer exists in this world is to become an appealing story. While maintaining a distinct manner of description, Gusadang systematically organizes the noble character of the dead, their pitiable death, the precious bond of the past, and the longing of those left behind. Writing for ancestral rites is a genre that mourns a death and reproduces the sadness of the living through writing. For a text written in this way to work properly as ritual writing, it must have appeal. This study finds that the appeal that gives life to ritual writing is grounded in authenticity.

The Effect of Domain Specificity on the Performance of Domain-Specific Pre-Trained Language Models (도메인 특수성이 도메인 특화 사전학습 언어모델의 성능에 미치는 영향)

  • Han, Minah;Kim, Younha;Kim, Namgyu
    • Journal of Intelligence and Information Systems / Vol. 28, No. 4 / pp.251-273 / 2022
  • Recently, research on applying deep learning to text analysis has steadily continued. In particular, studies have actively sought to understand the meaning of words and to perform tasks such as summarization and sentiment classification with pre-trained language models trained on large datasets. However, existing pre-trained language models are limited in that they do not understand specific domains well. Therefore, in recent years, research has shifted toward creating language models specialized for a particular domain. Domain-specific pre-trained language models understand the knowledge of a particular domain better and show performance improvements on various tasks in that field. However, domain-specific further pre-training is expensive because corpus data for the target domain must be acquired. Furthermore, many cases have been reported in which the performance improvement after further pre-training is insignificant in some domains. It is therefore difficult to decide to develop a domain-specific pre-trained language model when it is unclear whether performance will improve substantially. In this paper, we present a way to estimate the expected performance improvement from further pre-training in a domain before actually performing it. Specifically, after selecting three domains, we measured the increase in classification accuracy obtained through further pre-training in each domain. We also developed new indicators that estimate the specificity of a domain based on the normalized frequency of the keywords used in it. Finally, we conducted classification using a pre-trained language model and domain-specific pre-trained language models for the three domains. As a result, we confirmed that the higher the domain specificity index, the greater the performance improvement through further pre-training.
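
The abstract does not define its domain specificity indicators, so the following Python sketch is only one plausible reading of "normalized frequency of the keywords used in each domain". The function `specificity_index`, its `top_n` parameter, and the formula (the average excess of the domain's top keywords over their frequency in a general corpus) are all hypothetical and should not be taken as the paper's indicator.

```python
from collections import Counter


def normalized_freq(tokens):
    """Map each token to its share of all tokens in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


def specificity_index(domain_tokens, general_tokens, top_n=3):
    """Hypothetical index: mean excess frequency of the domain's top keywords.

    Both token lists are assumed to be non-empty.
    """
    domain = normalized_freq(domain_tokens)
    general = normalized_freq(general_tokens)
    # Take the domain's most frequent tokens as its keywords.
    keywords = sorted(domain, key=domain.get, reverse=True)[:top_n]
    return sum(domain[k] - general.get(k, 0.0) for k in keywords) / len(keywords)
```

Under this reading, a corpus whose frequent keywords rarely appear in general text scores close to the keywords' own frequency, while a corpus that mirrors general text scores near zero; the abstract's finding would then correspond to larger gains from further pre-training for higher scores.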