• Title/Summary/Keyword: Sentiment mining

Search Result 239, Processing Time 1.873 seconds

Analysis of Regional Fertility Gap Factors Using Explainable Artificial Intelligence (설명 가능한 인공지능을 이용한 지역별 출산율 차이 요인 분석)

  • Dongwoo Lee;Mi Kyung Kim;Jungyoon Yoon;Dongwon Ryu;Jae Wook Song
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.47 no.1
    • /
    • pp.41-50
    • /
    • 2024
  • Korea is facing a significant problem with historically low fertility rates, which is becoming a major social issue affecting the economy, labor force, and national security. This study analyzes the factors contributing to the regional gap in fertility rates and derives policy implications. The government and local authorities are implementing a range of policies to address the issue of low fertility. To establish an effective strategy, it is essential to identify the primary factors that contribute to regional disparities. This study identifies these factors and explores policy implications through machine learning and explainable artificial intelligence. The study also examines the influence of media and public opinion on childbirth in Korea by incorporating news and online community sentiment, as well as sentiment fear indices, as independent variables. To establish the relationship between regional fertility rates and factors, the study employs four machine learning models: multiple linear regression, XGBoost, Random Forest, and Support Vector Regression. Support Vector Regression, XGBoost, and Random Forest significantly outperform linear regression, highlighting the importance of machine learning models in explaining non-linear relationships with numerous variables. A factor analysis using SHAP is then conducted. The unemployment rate, Regional Gross Domestic Product per Capita, Women's Participation in Economic Activities, Number of Crimes Committed, Average Age of First Marriage, and Private Education Expenses significantly impact regional fertility rates. However, the degree of impact of the factors affecting fertility may vary by region, suggesting the need for policies tailored to the characteristics of each region, not just an overall ranking of factors.

Exploring the Performance of Multi-Label Feature Selection for Effective Decision-Making: Focusing on Sentiment Analysis (효과적인 의사결정을 위한 다중레이블 기반 속성선택 방법에 관한 연구: 감성 분석을 중심으로)

  • Jong Yoon Won;Kun Chang Lee
    • Information Systems Review
    • /
    • v.25 no.1
    • /
    • pp.47-73
    • /
    • 2023
  • Management decision-making based on artificial intelligence(AI) plays an important role in helping decision-makers. Business decision-making centered on AI is evaluated as a driving force for corporate growth. AI-based on accurate analysis techniques could support decision-makers in making high-quality decisions. This study proposes an effective decision-making method with the application of multi-label feature selection. In this regard, We present a CFS-BR (Correlation-based Feature Selection based on Binary Relevance approach) that reduces data sets in high-dimensional space. As a result of analyzing sample data and empirical data, CFS-BR can support efficient decision-making by selecting the best combination of meaningful attributes based on the Best-First algorithm. In addition, compared to the previous multi-label feature selection method, CFS-BR is useful for increasing the effectiveness of decision-making, as its accuracy is higher.

Investigating Opinion Mining Performance by Combining Feature Selection Methods with Word Embedding and BOW (Bag-of-Words) (속성선택방법과 워드임베딩 및 BOW (Bag-of-Words)를 결합한 오피니언 마이닝 성과에 관한 연구)

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of Digital Convergence
    • /
    • v.17 no.2
    • /
    • pp.163-170
    • /
    • 2019
  • Over the past decade, the development of the Web explosively increased the data. Feature selection step is an important step in extracting valuable data from a large amount of data. This study proposes a novel opinion mining model based on combining feature selection (FS) methods with Word embedding to vector (Word2vec) and BOW (Bag-of-words). FS methods adopted for this study are CFS (Correlation based FS) and IG (Information Gain). To select an optimal FS method, a number of classifiers ranging from LR (logistic regression), NN (neural network), NBN (naive Bayesian network) to RF (random forest), RS (random subspace), ST (stacking). Empirical results with electronics and kitchen datasets showed that LR and ST classifiers combined with IG applied to BOW features yield best performance in opinion mining. Results with laptop and restaurant datasets revealed that the RF classifier using IG applied to Word2vec features represents best performance in opinion mining.

A Study on the Change of the View of Love using Text Mining and Sentiment Analysis (텍스트 마이닝과 감성 분석을 통한 연애관의 변화 연구 : <공항가는 길>과 <이번 주 아내가 바람을 핍니다>를 중심으로)

  • Kim, Kyung-Ae;Ku, Jin-Hee
    • Journal of Digital Convergence
    • /
    • v.15 no.2
    • /
    • pp.285-294
    • /
    • 2017
  • In this study, change of the view of love was analyzed by big data analysis in TV drama of married person's love. Two dramas were selected for analysis with opposite theme of love story. The sympathy of audience for the one month period from the end of the drama was analyzed by text mining and sentiment analysis. In particular, changes in the meaning of home meaning are identified. Home is not 'a place where a husband and wife play a social role', but 'a place where they can share real sympathy and one can be happy'. If individuals are not happy, they need to break their homes. In this study, the current divorce rate and the question regarding the matter should be considered. But based on Google Trends, in Korean society, interest in marriage were still higher than romance. It means that people prefer to 'a love to get marriage' in Korean modern society, than 'love for love affair'. It seems to be reflection of cognition change, marriage should be based on true love. This study is expected to be applied to the study of trend change through social media.

Crafting a Quality Performance Evaluation Model Leveraging Unstructured Data (비정형데이터를 활용한 건축현장 품질성과 평가 모델 개발)

  • Lee, Kiseok;Song, Taegeun;Yoo, Wi Sung
    • Journal of the Korea Institute of Building Construction
    • /
    • v.24 no.1
    • /
    • pp.157-168
    • /
    • 2024
  • The frequent occurrence of structural failures at building construction sites in Korea has underscored the critical role of rigorous oversight in the inspection and management of construction projects. As mandated by prevailing regulations and standards, onsite supervision by designated supervisors encompasses thorough documentation of construction quality, material standards, and the history of any reconstructions, among other factors. These reports, predominantly consisting of unstructured data, constitute approximately 80% of the data amassed at construction sites and serve as a comprehensive repository of quality-related information. This research introduces the SL-QPA model, which employs text mining techniques to preprocess supervision reports and establish a sentiment dictionary, thereby enabling the quantification of quality performance. The study's findings, demonstrating a statistically significant Pearson correlation between the quality performance scores derived from the SL-QPA model and various legally defined indicators, were substantiated through a one-way analysis of variance of the correlation coefficients. The SL-QPA model, as developed in this study, offers a supplementary approach to evaluating the quality performance of building construction projects. It holds the promise of enhancing quality inspection and management practices by harnessing the wealth of unstructured data generated throughout the lifecycle of construction projects.

The Analysis on the Relationship between Firms' Exposures to SNS and Stock Prices in Korea (기업의 SNS 노출과 주식 수익률간의 관계 분석)

  • Kim, Taehwan;Jung, Woo-Jin;Lee, Sang-Yong Tom
    • Asia pacific journal of information systems
    • /
    • v.24 no.2
    • /
    • pp.233-253
    • /
    • 2014
  • Can the stock market really be predicted? Stock market prediction has attracted much attention from many fields including business, economics, statistics, and mathematics. Early research on stock market prediction was based on random walk theory (RWT) and the efficient market hypothesis (EMH). According to the EMH, stock market are largely driven by new information rather than present and past prices. Since it is unpredictable, stock market will follow a random walk. Even though these theories, Schumaker [2010] asserted that people keep trying to predict the stock market by using artificial intelligence, statistical estimates, and mathematical models. Mathematical approaches include Percolation Methods, Log-Periodic Oscillations and Wavelet Transforms to model future prices. Examples of artificial intelligence approaches that deals with optimization and machine learning are Genetic Algorithms, Support Vector Machines (SVM) and Neural Networks. Statistical approaches typically predicts the future by using past stock market data. Recently, financial engineers have started to predict the stock prices movement pattern by using the SNS data. SNS is the place where peoples opinions and ideas are freely flow and affect others' beliefs on certain things. Through word-of-mouth in SNS, people share product usage experiences, subjective feelings, and commonly accompanying sentiment or mood with others. An increasing number of empirical analyses of sentiment and mood are based on textual collections of public user generated data on the web. The Opinion mining is one domain of the data mining fields extracting public opinions exposed in SNS by utilizing data mining. There have been many studies on the issues of opinion mining from Web sources such as product reviews, forum posts and blogs. In relation to this literatures, we are trying to understand the effects of SNS exposures of firms on stock prices in Korea. Similarly to Bollen et al. [2011], we empirically analyze the impact of SNS exposures on stock return rates. We use Social Metrics by Daum Soft, an SNS big data analysis company in Korea. Social Metrics provides trends and public opinions in Twitter and blogs by using natural language process and analysis tools. It collects the sentences circulated in the Twitter in real time, and breaks down these sentences into the word units and then extracts keywords. In this study, we classify firms' exposures in SNS into two groups: positive and negative. To test the correlation and causation relationship between SNS exposures and stock price returns, we first collect 252 firms' stock prices and KRX100 index in the Korea Stock Exchange (KRX) from May 25, 2012 to September 1, 2012. We also gather the public attitudes (positive, negative) about these firms from Social Metrics over the same period of time. We conduct regression analysis between stock prices and the number of SNS exposures. Having checked the correlation between the two variables, we perform Granger causality test to see the causation direction between the two variables. The research result is that the number of total SNS exposures is positively related with stock market returns. The number of positive mentions of has also positive relationship with stock market returns. Contrarily, the number of negative mentions has negative relationship with stock market returns, but this relationship is statistically not significant. This means that the impact of positive mentions is statistically bigger than the impact of negative mentions. We also investigate whether the impacts are moderated by industry type and firm's size. We find that the SNS exposures impacts are bigger for IT firms than for non-IT firms, and bigger for small sized firms than for large sized firms. The results of Granger causality test shows change of stock price return is caused by SNS exposures, while the causation of the other way round is not significant. Therefore the correlation relationship between SNS exposures and stock prices has uni-direction causality. The more a firm is exposed in SNS, the more is the stock price likely to increase, while stock price changes may not cause more SNS mentions.

A Study on Social Media Sentiment Analysis for Exploring Public Opinions Related to Education Policies (교육정책관련 여론탐색을 위한 소셜미디어 감정분석 연구)

  • Chung, Jin-Myeong;Yoo, Ki-Young;Koo, Chan-Dong
    • Informatization Policy
    • /
    • v.24 no.4
    • /
    • pp.3-16
    • /
    • 2017
  • With the development of social media services in the era of Web 2.0, the public opinion formation site has been partially shifted from the traditional mass media to social media. This phenomenon is continuing to expand, and public opinions on government polices created and shared on social media are attracting more attention. It is particularly important to grasp public opinions in policy formulation because setting up educational policies involves a variety of stakeholders and conflicts. The purpose of this study is to explore public opinions about education-related policies through an empirical analysis of social media documents on education policies using opinion mining techniques. For this purpose, we collected the education policy-related documents by keyword, which were produced by users through the social media service, tokenized and extracted sentimental qualities of the documents, and scored the qualities using sentiment dictionaries to find out public preferences for specific education policies. As a result, a lot of negative public opinions were found regarding the smart education policies that use the keywords of digital textbooks and e-learning; while the software education policies using coding education and computer thinking as the keywords had more positive opinions. In addition, the general policies having the keywords of free school terms and creative personality education showed more negative public opinions. As much as 20% of the documents were unable to extract sentiments from, signifying that there are still a certain share of blog posts or tweets that do not reflect the writers' opinions.

Sentiment Analysis of Product Reviews to Identify Deceptive Rating Information in Social Media: A SentiDeceptive Approach

  • Marwat, M. Irfan;Khan, Javed Ali;Alshehri, Dr. Mohammad Dahman;Ali, Muhammad Asghar;Hizbullah;Ali, Haider;Assam, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.3
    • /
    • pp.830-860
    • /
    • 2022
  • [Introduction] Nowadays, many companies are shifting their businesses online due to the growing trend among customers to buy and shop online, as people prefer online purchasing products. [Problem] Users share a vast amount of information about products, making it difficult and challenging for the end-users to make certain decisions. [Motivation] Therefore, we need a mechanism to automatically analyze end-user opinions, thoughts, or feelings in the social media platform about the products that might be useful for the customers to make or change their decisions about buying or purchasing specific products. [Proposed Solution] For this purpose, we proposed an automated SentiDecpective approach, which classifies end-user reviews into negative, positive, and neutral sentiments and identifies deceptive crowd-users rating information in the social media platform to help the user in decision-making. [Methodology] For this purpose, we first collected 11781 end-users comments from the Amazon store and Flipkart web application covering distant products, such as watches, mobile, shoes, clothes, and perfumes. Next, we develop a coding guideline used as a base for the comments annotation process. We then applied the content analysis approach and existing VADER library to annotate the end-user comments in the data set with the identified codes, which results in a labelled data set used as an input to the machine learning classifiers. Finally, we applied the sentiment analysis approach to identify the end-users opinions and overcome the deceptive rating information in the social media platforms by first preprocessing the input data to remove the irrelevant (stop words, special characters, etc.) data from the dataset, employing two standard resampling approaches to balance the data set, i-e, oversampling, and under-sampling, extract different features (TF-IDF and BOW) from the textual data in the data set and then train & test the machine learning algorithms by applying a standard cross-validation approach (KFold and Shuffle Split). [Results/Outcomes] Furthermore, to support our research study, we developed an automated tool that automatically analyzes each customer feedback and displays the collective sentiments of customers about a specific product with the help of a graph, which helps customers to make certain decisions. In a nutshell, our proposed sentiments approach produces good results when identifying the customer sentiments from the online user feedbacks, i-e, obtained an average 94.01% precision, 93.69% recall, and 93.81% F-measure value for classifying positive sentiments.

An Intelligent Recommendation System by Integrating the Attributes of Product and Customer in the Movie Reviews (영화 리뷰의 상품 속성과 고객 속성을 통합한 지능형 추천시스템)

  • Hong, Taeho;Hong, Junwoo;Kim, Eunmi;Kim, Minsu
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.1-18
    • /
    • 2022
  • As digital technology converges into the e-commerce market across industries, online transactions have activated, and the use of online has increased. With the recent spread of infectious diseases such as COVID-19, this market flow is accelerating, and various product information can be provided to customers online. Providing a variety of information provides customers with various opportunities but causes difficulties in decision-making. The recommendation system can help customers to make a decision more effectively. However, the previous research on recommendation systems is limited to only quantitative data and does not reflect detailed factors of products and customers. In this study, we propose an intelligent recommendation system that quantifies the attributes of products and customers by applying text mining techniques to qualitative data based on online reviews and integrates the existing objective indicators of total star rating, sentiment, and emotion. The proposed integrated recommendation model showed superior performance to the overall rating-oriented recommendation model. It expects the new business value to be created through the recommendation result reflecting detailed factors of products and customers.

Importance-Performance Analysis for Korea Mobile Banking Applications: Using Google Playstore Review Data (국내 모바일 뱅킹 애플리케이션에 대한 이용자 중요도-만족도 분석(IPA): 구글 플레이스토어 리뷰 데이터를 활용하여)

  • Sohui, Kim;Moogeon, Kim;Min Ho, Ryu
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.27 no.6
    • /
    • pp.115-126
    • /
    • 2022
  • The purpose of this study is to try to IPA(Importance-Performance Analysis) by applying text mining approaches to user review data for korea mobile banking applications, and to derive priorities for improvement. User review data on mobile banking applications of korea commercial banks (Kookmin Bank, Shinhan Bank, Woori Bank, Hana Bank), local banks (Gyeongnam Bank, Busan Bank), and Internet banks (Kakao Bank, K-Bank, Toss) that gained from Google playstore were used. And LDA topic modeling, frequency analysis, and sentiment analysis were used to derive key attributes and measure the importance and satisfaction of each attribute. Result, although 'Authorizing service', 'Improvement of Function', 'Login', 'Speed/Connectivity', 'System/Update' and 'Banking Service' are relatively important attributes when users use mobile banking applications, their satisfaction is not at the average level, indicating that improvement is urgent.