• Title/Summary/Keyword: Text data

A Study on the Tangibility and Intangibility Value Contents Influence Factor of Jongmyo Shrine Using Text Mining Analysis (텍스트 마이닝 분석을 활용한 종묘의 유·무형 콘텐츠 영향요인 연구)

  • Park, Eun Soo;Kim, Ji Eun
    • Korea Science and Art Forum / v.22 / pp.169-183 / 2015
  • As times change rapidly, the culture that represents an era is becoming more subdivided and complex. Because of this cultural diversity, the influences, causes, and characteristics that could once be understood within a single field centered on space can no longer be understood from the viewpoint of one field alone, and it has become difficult to predict and respond to future change. With the development of information and knowledge delivery systems, the various cultural contents that form a space are continually created and lost, yet many aspects cannot be explained or understood from a single point of view. To examine this situation, this study aims to derive the tangible and intangible value factors that have influenced Jongmyo Shrine, a traditional space of historical significance designated by UNESCO in February 1995, to analyze the key factors that shaped the space, and to consider the importance of the related factors. The unstructured data technique applied as the method of analysis in this study can be regarded as a new means of value judgment and a new viewpoint for interpreting the space. This study is therefore a new attempt to provide a framework for interpreting Korea's various traditional spaces and culture, past and present, from multiple angles.

An Exploratory Study of Generative AI Service Quality using LDA Topic Modeling and Comparison with Existing Dimensions (LDA토픽 모델링을 활용한 생성형 AI 챗봇의 탐색적 연구 : 기존 AI 챗봇 서비스 품질 요인과의 비교)

  • YaeEun Ahn;Jungsuk Oh
    • Journal of Service Research and Studies / v.13 no.4 / pp.191-205 / 2023
  • Artificial Intelligence (AI), especially in the domain of text-generative services, has witnessed a significant surge, with forecasts indicating that the AI-as-a-Service (AIaaS) market will reach a valuation of $55.0 billion by 2028. This research set out to explore the quality dimensions characterizing synthetic text media software, with a focus on four key players in the industry: ChatGPT, Writesonic, Jasper, and Anyword. Drawing on a dataset of over 4,000 reviews sourced from a software evaluation platform, the study employed the Latent Dirichlet Allocation (LDA) topic modeling technique using the Gensim library. This process grouped the data into 11 distinct topics. Subsequent analysis compared these topics against established AI service quality dimensions, specifically AICSQ and AISAQUAL. Notably, the reviews predominantly emphasized dimensions like availability and efficiency, while others that have been underscored in prior literature, such as anthropomorphism, were absent. This observation is attributed to the inherent nature of the AI services reviewed, which lean more towards semantic understanding than direct user interaction. The study acknowledges inherent limitations, mainly potential biases stemming from the single review source and the specific nature of the reviewer demographic. Possible future research includes gauging the real-world implications of these quality dimensions for user satisfaction and examining more deeply how individual dimensions might affect overall ratings.
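
For readers unfamiliar with LDA, the idea of grouping reviews into topics can be sketched without any library. The block below is a toy collapsed Gibbs sampler, which is a different inference method from the online variational Bayes used by Gensim's `LdaModel`, and it is not the authors' pipeline. The sample reviews, the hyperparameters `alpha` and `beta`, the iteration count, and the choice of two topics are all assumptions made for illustration.

```python
import random

def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                              # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Report the three most frequent words of each topic.
    topics = []
    for k in range(num_topics):
        ranked = sorted(range(vocab_size), key=lambda wid: topic_word[k][wid], reverse=True)
        topics.append([vocab[wid] for wid in ranked[:3]])
    return topics, doc_topic

# Hypothetical, already tokenized reviews.
reviews = [
    "fast reliable always available fast".split(),
    "available fast uptime reliable".split(),
    "content quality tone writing quality".split(),
    "writing tone content templates".split(),
]
topics, doc_topic = fit_lda(reviews, num_topics=2)
for k, words in enumerate(topics):
    print(f"topic {k}: {words}")
```

Each topic is summarized by its most frequent words, which an analyst then labels by hand and compares against existing quality dimensions. On a real corpus the number of topics would be chosen by inspection or a coherence measure rather than fixed in advance.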

Comparing Corporate and Public ESG Perceptions Using Text Mining and ChatGPT Analysis: Based on Sustainability Reports and Social Media (텍스트마이닝과 ChatGPT 분석을 활용한 기업과 대중의 ESG 인식 비교: 지속가능경영보고서와 소셜미디어를 기반으로)

  • Jae-Hoon Choi;Sung-Byung Yang;Sang-Hyeak Yoon
    • Journal of Intelligence and Information Systems / v.29 no.4 / pp.347-373 / 2023
  • As ESG (Environmental, Social, and Governance) management grows in significance as a driver of sustainable growth, this study examines and compares ESG trends and interrelationships from both corporate and societal viewpoints. Employing a combination of Latent Dirichlet Allocation (LDA) topic modeling and Semantic Network Analysis (SNA), we analyzed sustainability reports alongside corresponding social media datasets. Additionally, an in-depth examination of social media content was conducted using Joint Sentiment Topic Modeling (JST), further enriched by SNA. Complementing the text mining analysis with the assistance of ChatGPT, this study identified 25 different ESG topics. It highlighted differences between companies, which aim to avoid risks and build trust, and the general public, whose concerns are more diverse and include investment options and working conditions. Key terms like 'greenwashing,' 'serious accidents,' and 'boycotts' show that many people doubt how companies handle ESG issues. The findings lay the foundation for a plan that serves key ESG stakeholders, including businesses, government agencies, customers, and investors. This study also provides guidance for creating more trustworthy and effective ESG strategies, helping to direct the discussion on ESG effectiveness.
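
Semantic network analysis typically starts from a keyword co-occurrence network. The sketch below shows one common way to build such a network and rank terms by weighted degree. The abstract does not say how the authors built theirs, so the tokenized posts, the function names, and the choice of document-level co-occurrence are assumptions.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(documents):
    """Count how many documents each pair of distinct terms shares."""
    edges = Counter()
    for tokens in documents:
        # sorted(set(...)) gives every unordered pair one canonical key.
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges

def weighted_degree(edges):
    """Sum the edge weights attached to each term (a simple centrality)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree

# Hypothetical tokenized social media posts.
posts = [
    ["esg", "greenwashing", "boycott"],
    ["esg", "greenwashing", "investment"],
    ["esg", "investment", "working-conditions"],
]
edges = cooccurrence_edges(posts)
degree = weighted_degree(edges)
print(edges.most_common(3))
print(degree.most_common(3))
```

Terms with a high weighted degree sit at the center of the discussion, and heavily weighted edges indicate terms that tend to be mentioned together.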

An Analysis of the Support Policy for Small Businesses in the Post-Covid-19 Era Using the LDA Topic Model (LDA 토픽 모델을 활용한 포스트 Covid-19 시대의 소상공인 지원정책 분석)

  • Kyung-Do Suh;Jung-il Choi;Pan-Am Choi;Jaerim Jung
    • Journal of Industrial Convergence / v.22 no.6 / pp.51-59 / 2024
  • The purpose of this paper is to suggest government policies that are of practical help to small business owners in pandemic situations such as COVID-19. To this end, news articles were crawled using the keywords "COVID-19 Support for Small Businesses", "The Impact of Small Businesses by Response System to COVID-19 Infectious Diseases", and "COVID-19 Small Business Economic Policy". Keyword frequency analysis and word cloud analysis were then performed on the articles, and major issues were identified through LDA topic modeling. The LDA topic modeling produced the following topic labels: for the support policy for small business owners, government cash support and financial support; for the impact of the COVID-19 response system on small business owners, a government-led quarantine system and an individual-led quarantine system; and for COVID-19 economic policy, policies for small business owners to cope with the economic crisis and achieve self-sustainability. Based on these topic labels, the study aims to provide basic data that helps small business owners understand damage reduction policies and policies for enhancing market competitiveness in future pandemic situations.
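
The keyword frequency step described in this abstract can be sketched as follows. The headlines, the stopword list, and the simple regular-expression tokenizer are assumptions for illustration. Korean news text would normally need a morphological analyzer rather than a regex, and nothing here reproduces the authors' crawler or preprocessing.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "for", "of", "to", "and", "a", "in", "as"}

def keyword_frequencies(articles, top_n=5):
    """Return the top_n most frequent non-stopword tokens across articles."""
    counts = Counter()
    for text in articles:
        tokens = re.findall(r"[a-z0-9][a-z0-9\-]*", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)

# Hypothetical crawled headlines.
articles = [
    "Government expands cash support for small businesses",
    "Small businesses ask for financial support during quarantine",
    "Quarantine rules ease as cash support continues",
]
top_keywords = keyword_frequencies(articles, top_n=3)
print(top_keywords)
```

A word cloud is a visual rendering of the same counts, with each word drawn at a size proportional to its frequency.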

Exploring the Nature of Cybercrime and Countermeasures: Focusing on Copyright Infringement, Gambling, and Pornography Crimes (사이버 범죄의 특성과 대응방안 연구: 저작권 침해, 도박, 음란물 범죄를 중심으로)

  • Ilwoong Kang;Jaehui Kim;So-Hyun Lee;Hee-Woong Kim
    • Knowledge Management Research / v.25 no.2 / pp.69-94 / 2024
  • With the development of cyberspace and its increasing interaction with our daily lives, cybercrime has been steadily increasing in recent years and has become more prominent as a serious social problem. Notably, the "four major malicious cybercrimes" - cyber fraud, cyber financial crime, cyber sexual violence, and cyber gambling - have drawn significant attention. In order to minimize the damage of cybercrime, it's crucial to delve into the specifics of each crime and develop targeted prevention and intervention strategies. Yet, most existing research relies on indirect data sources like statistics, victim testimonials, and public opinion. This study seeks to uncover the characteristics and factors of cybercrime by directly interviewing suspects involved in 'copyright infringement', 'gambling' related to illicit online content, and 'pornography crime'. Through coding analysis and text mining, the study aims to offer a more in-depth understanding of cybercrime dynamics. Furthermore, by suggesting preventative and remedial measures, the research aims to equip policymakers with vital information to reduce the repercussions of this escalating digital threat.

A Study on Sentiment Score of Healthcare Service Quality on the Hospital Rating (의료 서비스 리뷰의 감성 수준이 병원 평가에 미치는 영향 분석)

  • Jee-Eun Choi;Sodam Kim;Hee-Woong Kim
    • Information Systems Review / v.20 no.2 / pp.111-137 / 2018
  • Considering the increase in health insurance benefits and the aging of the baby boomer generation, healthcare spending in 2020 is expected to account for 20% of US GDP. As the healthcare industry develops, competition among the medical services of hospitals intensifies, and hospitals' need to manage the quality of their medical services increases. In addition, interest in online reviews of hospitals has grown as such reviews have become a tool for predicting hospital quality. Consumers tend to consult online reviews even when choosing healthcare service providers, and they evaluate service quality online afterwards. This study aims to analyze the effect of the sentiment score of healthcare service quality on hospital rating using Yelp hospital reviews. It first classifies a large amount of text data collected online into the five service quality dimensions of the SERVQUAL theory. The sentiment scores of reviews are then derived for each SERVQUAL dimension, and an econometric analysis is conducted to determine the effects of the sentiment scores of the five service quality dimensions on hospital reviews. The results shed light on ways of managing online hospital reputation that can benefit managers in the healthcare and medical industry.
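
To make the idea of a sentiment score per SERVQUAL dimension concrete, here is a minimal lexicon-based sketch. The five dimension names are those of SERVQUAL. The keyword sets, the positive and negative word lists, and the sentence-level scoring rule are hypothetical stand-ins, since the abstract does not describe the authors' classifier or sentiment tool.

```python
import re

# Hypothetical keyword sets used to route a sentence to SERVQUAL dimensions.
SERVQUAL_KEYWORDS = {
    "tangibles": {"clean", "room", "facility", "parking"},
    "reliability": {"diagnosis", "accurate", "billing"},
    "responsiveness": {"wait", "waiting", "prompt", "quickly"},
    "assurance": {"doctor", "knowledgeable", "skilled"},
    "empathy": {"caring", "listened", "rude"},
}
# Hypothetical sentiment lexicon.
POSITIVE = {"great", "clean", "caring", "prompt", "accurate", "knowledgeable"}
NEGATIVE = {"terrible", "rude", "dirty", "slow", "long", "wrong"}

def sentence_sentiment(tokens):
    """Score in [-1, 1]: (positive - negative) / number of sentiment words."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / max(1, pos + neg)

def dimension_scores(review):
    """Average sentence sentiment for each dimension the review mentions."""
    collected = {}
    for sentence in re.split(r"[.!?]+", review):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        if not tokens:
            continue
        score = sentence_sentiment(tokens)
        for dimension, keywords in SERVQUAL_KEYWORDS.items():
            if keywords & set(tokens):
                collected.setdefault(dimension, []).append(score)
    return {dim: sum(vals) / len(vals) for dim, vals in collected.items()}

review = ("The room was clean and the facility was great. "
          "The wait was terrible and staff were rude.")
scores = dimension_scores(review)
print(scores)
```

In a study of this kind, the per-dimension scores would then serve as explanatory variables in a regression on the star rating. That econometric step is not shown here.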

A Topic Modeling Approach to the Analysis of Seniors' Happiness and Unhappiness in Korea (토픽 모델링 기반 한국 노인의 행복과 불행 이슈 분석)

  • Dong ji Moon;Dine Yon;Hee-Woong Kim
    • Information Systems Review / v.20 no.2 / pp.139-161 / 2018
  • As Korea has become one of the most aged countries in the world, successful aging has emerged as an important issue for individuals as well as for society. This study aims to determine not only the factors behind Korean seniors' happiness and unhappiness but also the means to enhance their happiness and deal with their unhappiness. We collected news articles related to the happiness and unhappiness of seniors using nine keywords based on Alderfer's ERG Theory. We then applied a topic modeling technique, Latent Dirichlet Allocation, to examine the main issues underlying seniors' happiness and unhappiness. Based on the analysis, we investigated the conditions of happiness and unhappiness by inspecting the topics for each keyword. We also conducted a detailed analysis based on the main factors from the topic modeling. We proposed specific ways to increase seniors' happiness and overcome their unhappiness from the perspectives of government, corporations, families, and other social welfare organizations. This study identifies the major factors that affect the happiness and unhappiness of seniors. Specific methods to boost happiness and relieve unhappiness are suggested from the additional analysis.

The Usability Evaluation of Kiosks for Individuals with Low Vision (저시력 시각장애인의 키오스크 사용성 평가 연구)

  • Kyounghoon Kim;Yumi Kim;Sumin Baeck;Jeong Hyeun Ko
    • Journal of the Korean Society for Information Management / v.41 no.3 / pp.331-358 / 2024
  • In the rapid digital transformation era, kiosks have become a common element in daily life. However, their widespread deployment has introduced new challenges for socially marginalized groups, including individuals with disabilities and the elderly. This study aims to evaluate the usability of kiosks for individuals with low vision and propose improvement strategies. The study was conducted with eight low-vision university students from A University in Gyeongsangbuk-do and four non-disabled university students from Daegu. Usability was assessed through experiments involving a self-service certificate issuance kiosk and a fast-food restaurant kiosk, using Jakob Nielsen's five usability evaluation criteria: learnability, efficiency, memorability, error prevention, and satisfaction. The results revealed that individuals with low vision faced significant difficulties with small text size, low contrast, no physical buttons, and lack of screen zoom functionality. To address these issues, the study recommends enhancements such as increasing text size and contrast, incorporating physical buttons, adding zoom functionality, ensuring consistent UI design, and providing auditory feedback. This study provides foundational data for enhancing information accessibility for individuals with low vision. It offers critical insights into kiosk design and policy recommendations, thereby contributing to the mitigation of the digital divide.

Customer Behavior Prediction of Binary Classification Model Using Unstructured Information and Convolution Neural Network: The Case of Online Storefront (비정형 정보와 CNN 기법을 활용한 이진 분류 모델의 고객 행태 예측: 전자상거래 사례를 중심으로)

  • Kim, Seungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.221-241 / 2018
  • Deep learning has recently been attracting attention. The deep learning technique applied in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and in AlphaGo is the Convolutional Neural Network (CNN). A CNN divides the input image into small sections, recognizes their partial features, and combines them to recognize the whole. Deep learning technologies are expected to bring many changes to our lives, but so far their applications have been limited to image recognition and natural language processing. The use of deep learning techniques for business problems is still at an early research stage. If their performance is proven, they can be applied to traditional business problems such as marketing response prediction, fraudulent transaction detection, and bankruptcy prediction. It is therefore a meaningful experiment to assess whether business problems can be solved with deep learning, based on the case of online shopping companies, which hold big data, make customer behavior relatively easy to identify, and offer high utilization value. In online shopping in particular, the competitive environment is changing rapidly and becoming more intense, so analyzing customer behavior to maximize profit is becoming more and more important. In this study, we propose a 'CNN model of Heterogeneous Information Integration' as a way to improve the predictive power of customer behavior in online shopping enterprises. 
To propose a model that optimizes performance, namely one that learns from a convolutional neural network with a multi-layer perceptron structure by combining structured and unstructured information, we consider three architectural components ('heterogeneous information integration', 'unstructured information vector conversion', and 'multi-layer perceptron design'), evaluate the performance of each architecture, and confirm the proposed model based on the results. The target variables for predicting customer behavior are defined as six binary classification problems: re-purchaser, churn, frequent shopper, frequent refund shopper, high amount shopper, and high discount shopper. To verify the usefulness of the proposed model, we conducted experiments using actual transaction, customer, and VOC (voice of customer) data from a specific online shopping company in Korea. The data cover 47,947 customers who registered at least one VOC in January 2011 (one month). These customers' profiles, a total of 19 months of transaction data from September 2010 to March 2012, and the VOCs posted during that month are used. The experiment is divided into two stages. In the first stage, we evaluate the three architectures that affect the performance of the proposed model and select optimal parameters. In the second stage, we evaluate performance with the proposed model. Experimental results show that the proposed model, which combines structured and unstructured information, is superior to NBC (Naïve Bayes classification), SVM (support vector machine), and ANN (artificial neural network). These results are significant in showing that unstructured information contributes to predicting customer behavior and that CNN can be applied to business problems as well as to image recognition and natural language processing problems. 
The experiments confirm that CNN is more effective at understanding and interpreting contextual meaning in text VOC data. It is also significant that this empirical research, based on the actual data of an e-commerce company, shows that very meaningful information for predicting customer behavior can be extracted from VOC data written directly by customers in text form. Finally, the various experiments suggest that the proposed model provides useful information for future research on parameter selection and performance.

Increasing Accuracy of Stock Price Pattern Prediction through Data Augmentation for Deep Learning (데이터 증강을 통한 딥러닝 기반 주가 패턴 예측 정확도 향상 방안)

  • Kim, Youngjun;Kim, Yeojeong;Lee, Insun;Lee, Hong Joo
    • The Journal of Bigdata / v.4 no.2 / pp.1-12 / 2019
  • As Artificial Intelligence (AI) technology develops, it is being applied to various fields such as image, voice, and text. AI has shown good results in certain areas, and researchers have tried to use it to predict the stock market as well. Predicting the stock market is known to be a difficult problem because the market is affected by various factors such as the economy and politics. In the field of AI, there are attempts to predict the ups and downs of stock prices by studying stock price patterns with various machine learning techniques. This study suggests a way of predicting stock price patterns based on the Convolutional Neural Network (CNN). A CNN classifies images by extracting features from them through convolutional layers. This study therefore classifies candlestick images made from stock data in order to predict patterns. The study has two objectives. The first, referred to as Case 1, is to predict patterns from images made with same-day stock price data. The second, referred to as Case 2, is to predict the next day's stock price patterns from images produced with daily stock price data. In Case 1, two data augmentation methods, random modification and Gaussian noise, are applied to generate more training data, and the generated images are used to fit the model. Given that deep learning requires a large amount of data, this study suggests a method of data augmentation for candlestick images. It also compares the accuracies obtained with Gaussian-noise images and across different classification problems. All data in this study were collected through the OpenAPI provided by DaiShin Securities. Case 1 has five labels depending on the pattern: up with up closing, up with down closing, down with up closing, down with down closing, and staying. 
The images in Case 1 are created by removing the last candle (-1candle), the last two candles (-2candles), or the last three candles (-3candles) from 60-minute, 30-minute, 10-minute, and 5-minute candle charts. In a 60-minute candle chart, each candle in the image carries 60 minutes of information: an open price, a high price, a low price, and a close price. Case 2 has two labels, up and down. For Case 2, images were generated from the 60-minute, 30-minute, 10-minute, and 5-minute candle charts without removing any candle. Given the nature of stock data, moving the candles in the images is suggested in place of existing data augmentation techniques. The amount by which the candles are moved is defined as the modified value. The average difference in closing prices between candles was 0.0029, so 0.003, 0.002, 0.001, and 0.00025 are used as modified values in this study. The number of images was doubled after data augmentation. For the Gaussian noise, the mean was 0 and the variance was 0.01. For both Case 1 and Case 2, the model is based on VGG-Net16, which has 16 layers. Among the 60-minute, 30-minute, 10-minute, and 5-minute candle charts, 10-minute -1candle showed the best accuracy, so 10-minute images were used for the rest of the experiment in Case 1. The images with three candles removed were selected for data augmentation and the application of Gaussian noise. 10-minute -3candle resulted in 79.72% accuracy. The accuracy of the images with a modified value of 0.00025 and 100% of candles changed was 79.92%. Applying Gaussian noise raised the accuracy to 80.98%. In Case 2, 60-minute candle charts predicted the next day's patterns with 82.60% accuracy. In sum, this study is expected to contribute to further studies on predicting stock price patterns from images, and it provides a possible method for data augmentation of stock data.
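
The two augmentation ideas in this abstract can be sketched in plain Python. The abstract does not specify how a candle is "moved", so the sketch assumes a relative vertical shift of all four prices by the modified value, with a random sign. It also assumes that the Gaussian noise is added to pixel intensities normalized to the range 0 to 1. The function names, the sample candles, and the sample image are hypothetical.

```python
import random

def shift_candles(candles, modified_value, changed_ratio=1.0, rng=None):
    """Shift each selected candle's (open, high, low, close) up or down
    by a relative amount equal to modified_value (assumed interpretation)."""
    if rng is None:
        rng = random.Random(0)
    shifted = []
    for o, h, l, c in candles:
        if rng.random() < changed_ratio:
            factor = 1 + rng.choice((-1, 1)) * modified_value
            shifted.append((o * factor, h * factor, l * factor, c * factor))
        else:
            shifted.append((o, h, l, c))
    return shifted

def add_gaussian_noise(pixels, mean=0.0, variance=0.01, rng=None):
    """Add Gaussian noise to a 2D grid of intensities and clip to [0, 1]."""
    if rng is None:
        rng = random.Random(0)
    std = variance ** 0.5  # variance 0.01 corresponds to a standard deviation of 0.1
    return [
        [min(1.0, max(0.0, p + rng.gauss(mean, std))) for p in row]
        for row in pixels
    ]

# Hypothetical 10-minute candles: (open, high, low, close).
candles = [(100.0, 101.0, 99.5, 100.5), (100.5, 102.0, 100.0, 101.5)]
charts = [candles]
# Keeping the originals and adding one shifted copy of each doubles the data.
augmented_charts = charts + [shift_candles(chart, 0.00025) for chart in charts]

# Hypothetical 2x3 grayscale chart image with values in [0, 1].
image = [[0.0, 0.5, 1.0], [0.2, 0.8, 0.4]]
noisy_image = add_gaussian_noise(image)
print(augmented_charts[1])
print(noisy_image)
```

Multiplying all four prices of a candle by the same factor preserves its shape, so the label assigned to the original chart can reasonably be reused for the augmented copy.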
