• Title/Summary/Keyword: Text mining analysis


Analysis of CSR·CSV·ESG Research Trends - Based on Big Data Analysis - (CSR·CSV·ESG 연구 동향 분석 - 빅데이터 분석을 중심으로 -)

  • Lee, Eun Ji;Moon, Jaeyoung
    • Journal of Korean Society for Quality Management / v.50 no.4 / pp.751-776 / 2022
  • Purpose: This paper analyzes research trends on CSR, CSV, and ESG and presents the resulting implications, applying two big data techniques, text analysis and visual analysis (overall, by field, and by year), to data collected from previous studies on these topics. Methods: To collect the analysis data, deep learning was used in the integrated search of the Academic Research Information Service (www.riss.kr) with "CSR", "CSV", and "ESG" as search terms; the Korean abstracts and keywords of the retrieved papers were scraped and organized in Excel. In the final step, the resulting 2,847 CSR papers, 395 CSV papers, and 555 ESG papers were analyzed in R x64 4.0.2 and RStudio using text mining, one of the big data analysis techniques, with word clouds for visualization. Results: Research on CSR, CSV, and ESG was somewhat slow before 2010 but increased rapidly through 2019. Studies were concentrated in the fields of social science, art and physical education, and engineering. The keywords 'corporate', 'social', and 'responsibility' appeared frequently, and the word cloud analysis gave similar results. In the frequent-keyword and word cloud analyses by field and by year, the keywords by year were similar to the overall keywords, whereas some differences appeared between fields. Conclusion: Government and expert support for CSR, CSV, and ESG should be promoted, and research on technology-based strategies is needed. Future work should approach these topics from various angles; research that takes the environment or energy into account is expected to yield broader implications.
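The keyword-frequency step that feeds a word cloud can be sketched as follows. The study itself used R and RStudio; this Python version is only an illustration, and the sample abstracts, the stopword list, and the function name `keyword_frequencies` are all assumptions rather than the authors' data or code.

```python
import re
from collections import Counter

# Hypothetical stand-ins for scraped abstracts; the study's RISS data is not reproduced here.
abstracts = [
    "Corporate social responsibility and corporate value",
    "ESG and social responsibility of corporate governance",
]

# A tiny illustrative stopword list (an assumption, not the study's list).
STOPWORDS = {"and", "of", "the"}


def keyword_frequencies(texts):
    """Count lowercase word tokens across all texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


freqs = keyword_frequencies(abstracts)
print(freqs.most_common(3))
```

A word cloud simply scales each word's display size by these counts, so the frequency table carries the same information as the visualization.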

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to analyze big social media data more rigorously. Although social media text analysis algorithms have improved, previous approaches still have limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies have added grammatical factors to the feature sets used to train classification models. The other applies semantic analysis to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to handle a wider range of semantic features that were underestimated in existing sentiment analysis. The result of Word2Vec is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that, among the related words extracted, words expressing some emotion about the keyword are three times more numerous with Word2Vec than with co-occurrence analysis. The difference stems from Word2Vec's vectorization of semantic features. It is therefore possible to say that the Word2Vec algorithm can catch hidden related words that traditional analysis has not found. In addition, part-of-speech (POS) tagging for Korean is used to detect adjectives as "emotional words". The emotional words extracted from the text are then converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the "emotional word" in the sentence.
The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords, professor, prosecutor, and doctor, because these keywords carry rich public emotion and opinion. Data were collected in advance to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The text data total about 100,000 (Professor: 25,720, Doctor: 35,110, Prosecutor: 43,225) and were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used in text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding the Emotion Trigger can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say that a word extracted by Emotion Trigger processing has a significant causal relationship with the emotional word in a sentence. Future work will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger include Twitter posts, which have a number of distinct features that this study did not deal with. These features will be considered in further study.
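The co-occurrence baseline that the study compares against Word2Vec can be sketched in a few lines. This is not the authors' Java implementation and it does not use Word2Vec: it only counts nouns that share a sentence with an adjective treated as the emotional word. The pre-tagged English sentences, the tag names, and the function `trigger_candidates` are hypothetical; a real pipeline would obtain tags from a Korean POS tagger.

```python
from collections import Counter

# Hypothetical pre-tagged sentences: (word, part-of-speech) pairs.
tagged_sentences = [
    [("professor", "NOUN"), ("angry", "ADJ"), ("misappropriation", "NOUN")],
    [("doctor", "NOUN"), ("angry", "ADJ"), ("rebate", "NOUN")],
    [("professor", "NOUN"), ("angry", "ADJ"), ("recruitment", "NOUN")],
    [("prosecutor", "NOUN"), ("sad", "ADJ"), ("sponsor", "NOUN")],
]


def trigger_candidates(sentences, emotion_word):
    """Count nouns that share a sentence with the given adjective."""
    counts = Counter()
    for sentence in sentences:
        adjectives = {word for word, pos in sentence if pos == "ADJ"}
        if emotion_word in adjectives:
            counts.update(word for word, pos in sentence if pos == "NOUN")
    return counts


angry_triggers = trigger_candidates(tagged_sentences, "angry")
print(angry_triggers.most_common())
```

Co-occurrence of this kind only finds nouns that literally appear next to the emotional word, which is the gap the abstract says vector-based similarity is meant to close.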

Analysis of domestic and foreign research trends of Tricholoma matsutake using text mining techniques

  • Choi, Ah Hyeon;Kang, Jun Won
    • Korean Journal of Agricultural Science / v.48 no.3 / pp.505-514 / 2021
  • Among non-timber forest products, Tricholoma matsutake is a high value-added item. Many countries, including Korea, China, and Japan, are conducting research and technology development to increase artificial cultivation and productivity. However, production of T. matsutake is declining due to global warming, abnormal temperatures, and pine tree pest problems. It is therefore necessary to identify trends in domestic and foreign research on T. matsutake and to pursue preemptive research and development that preserves its genetic resources and increases its productivity. Based on correlations among the high-frequency keywords, keywords on the microbial clusters of T. matsutake were observed mainly in Korea. In China, the main focus has been pharmacological studies on the ingredients of T. matsutake. In Japan, the main focus has been on preserving the genetic diversity and species of T. matsutake. Thus, future domestic studies will require pharmacological research on its ingredients as well as work on its genetic diversity and species conservation. In addition, unlike in China and Japan, genetic keywords did not appear at high frequency in Korea. Korea will therefore need to proceed with research using modern molecular biology techniques.

Keyword Analysis of Arboretums and Botanical Gardens Using Social Big Data

  • Shin, Hyun-Tak;Kim, Sang-Jun;Sung, Jung-Won
    • Journal of People, Plants, and Environment / v.23 no.2 / pp.233-243 / 2020
  • This study collected social big data, which is used in various fields, from the past 9 years and examined the patterns of major keywords related to arboretums and botanical gardens, to serve as basic data for establishing future operational strategies for them. A total of 6,245,278 cases were collected: 4,250,583 from blogs (68.1%), 1,843,677 from online cafes (29.5%), and 151,018 from knowledge search engines (2.4%). After refining for valid data, 1,223,162 cases were selected for analysis. We used the big data program Textom to derive keywords for arboretums and botanical gardens through text mining analysis. As a result, we identified keywords such as 'travel', 'picnic', 'children', 'festival', 'experience', 'Garden of Morning Calm', 'program', 'recreation forest', 'healing', and 'museum'. The keyword analysis showed that 'healing', 'tree', 'experience', 'garden', and 'Garden of Morning Calm' received high public interest. We conducted word cloud analysis by extracting high-frequency keywords from a total of 6,245,278 titles on social media. The results showed that arboretums and botanical gardens were perceived as spaces for relaxation and leisure, with keywords such as 'travel', 'picnic', and 'recreation', and that people had high interest in educational aspects, with keywords such as 'experience' and 'field trip'. The demand for rest and leisure space, education, and things to see and enjoy in arboretums and botanical gardens has increased compared with the past. Therefore, future operation strategies should include differentiation and specialization, such as plant collection strategies, exhibition planning, and programs.
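The source shares quoted in the abstract can be rechecked from the reported counts. The snippet below is only an arithmetic sketch; the variable names are assumptions, and the numbers are those stated in the abstract.

```python
# Counts reported in the abstract above.
source_counts = {
    "blogs": 4_250_583,
    "online cafes": 1_843_677,
    "knowledge search engine": 151_018,
}

total = sum(source_counts.values())

# Share of each source as a percentage, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in source_counts.items()}

# Share of collected cases that remained after refining for valid data.
valid_share = round(100 * 1_223_162 / total, 1)

print(total)
print(shares)
print(valid_share)
```

The three source counts add up to the stated total, and the rounded shares agree with the percentages given in the abstract.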

The User Perception in ASMR Marketing Content through Social Media Text-Mining: ASMR Product Review Content vs ASMR How-to Content (텍스트 마이닝을 활용한 ASMR 콘텐츠 분야에 따른 소비자 인식 및 구전효과 차이점 분석: ASMR 제품리뷰 및 ASMR How-to 콘텐츠 중심으로)

  • Tran, Hung Chuong;Choi, Jae Won
    • The Journal of Information Systems / v.30 no.4 / pp.1-20 / 2021
  • Purpose: Autonomous Sensory Meridian Response (ASMR) is rapidly growing in popularity and increasingly appears in marketing. ASMR is growing fast not only in TV commercials but also in one-person media, and many brands and social media influencers use ASMR in their marketing content. The purpose of this study is to measure consumers' perceptions of the products in ASMR marketing content and to compare the communication effect of ASMR content creators between product review and how-to content within the same Macro tier of influencers, that is, YouTubers with 10,000-100,000 subscribers. Design/methodology/approach: ASMRtists who create product review content and how-to content were selected, and text comments were collected from 200 tech-device review and beauty-fashion videos. A total of 52,833 text comments were analyzed by applying the LDA topic modeling algorithm and social network analysis. Findings: The results show that ASMR is effective at capturing viewers' attention with ASMR triggers. In the tech-device review field, ASMR viewers also focus on the product, such as its performance and purchase. However, many topics relate to reactions to the ASMR sound, triggers, and relaxation. In the beauty-fashion field, viewers' topics mainly concern reactions to the ASMR trigger and responses to the ASMRtist, while other topics concern makeup and fashion, products, and purchase. According to the LDA results, many ASMR viewers comment that they feel more comfortable when watching marketing content that uses ASMR. This shows that ASMR marketing content performs well in terms of the viewing experience, so applying ASMR can attract greater consumer intention. The social network analysis showed that product review ASMRtists have higher communication effectiveness than how-to ASMRtists in the same tier. As an influencer marketing strategy, this study provides information for establishing an efficient advertising strategy using influencers who create ASMR content.

A Study of the Consumer Major Perception of Packaging Using Big Data Analysis -Focusing on Text Mining and Semantic Network Analysis- (빅데이터 분석을 통한 패키징에 대한 소비자의 주요 인식 조사 -텍스트 마이닝과 의미연결망 분석을 중심으로-)

  • Kang, Wook-Geon;Ko, Eui-Suk;Lee, Hak-Rae;Kim, Jai-neung
    • Journal of the Korea Convergence Society / v.9 no.4 / pp.15-22 / 2018
  • The purpose of this study is to investigate consumer perception of packaging using big data analysis. This study uses text mining to extract meaningful words from text and semantic network analysis to analyze connectivity and propagation trends. Data were collected separately for the keywords 'packaging (Korean)' and 'packaging (English)'. This study visualized the word network structure of the two keywords and classified the words into four groups of similar meaning through CONCOR analysis. Each group was named based on the words that constitute it. These groups represent the major categories of consumers' perception of packaging. In particular, the words 'cosmetics' and 'design' show high frequency and high centrality. It can therefore be expected that packaging design is perceived as important in the cosmetics industry. This study predicts consumers' perception of packaging and can therefore serve as a basis for future research and industry development.
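The notion of word centrality in a semantic network can be illustrated with degree centrality on a small co-occurrence graph. This sketch does not implement CONCOR, and the tokenized documents are hypothetical; the abstract does not say which centrality measure the authors used, so degree centrality is an assumption chosen for simplicity.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical tokenized documents; the study's collected data is not reproduced here.
documents = [
    ["packaging", "design", "cosmetics"],
    ["packaging", "cosmetics", "gift"],
    ["packaging", "design", "recycling"],
]

# Build an undirected network: two words are linked if they share a document.
neighbors = defaultdict(set)
for tokens in documents:
    for a, b in combinations(sorted(set(tokens)), 2):
        neighbors[a].add(b)
        neighbors[b].add(a)

# Degree centrality: number of neighbors divided by the maximum possible (n - 1).
n = len(neighbors)
degree_centrality = {word: len(links) / (n - 1) for word, links in neighbors.items()}

for word, score in sorted(degree_centrality.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.2f}")
```

A word that co-occurs with many distinct words scores close to 1, which is the sense in which a term can have both high frequency and high centrality.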

Comparison of responses to issues in SNS and Traditional Media using Text Mining -Focusing on the Termination of Korea-Japan General Security of Military Information Agreement(GSOMIA)- (텍스트 마이닝을 이용한 SNS와 언론의 이슈에 대한 반응 비교 -"한일군사정보보호협정(GSOMIA) 종료"를 중심으로-)

  • Lee, Su Ryeon;Choi, Eun Jung
    • Journal of Digital Convergence / v.18 no.2 / pp.277-284 / 2020
  • Text mining is a representative method of big data analysis that extracts meaningful information from large amounts of unstructured text data. Social media such as Twitter generate hundreds of thousands of data items per second and act as one-person media that instantly and directly express public opinions and ideas. The traditional media deliver information, criticize society, and form public opinion. In this context, we compare the responses of SNS with the responses of the media on the termination of the Korea-Japan GSOMIA (General Security of Military Information Agreement), one of the domestic issues in the second half of 2019. Data from 201,728 tweets and 20,698 newspaper articles were analyzed by sentiment analysis, association keyword analysis, and cluster analysis. As a result, SNS tends to respond positively to this issue, and the media tends to react negatively. In the association keyword analysis, SNS shows positive views on domestic issues, with keywords such as "destruction, decision, we," while the media shows negative views on external issues, with keywords such as "disappointment, regret, concern". SNS is faster and more powerful than the media in studying or shaping social trends and opinions, rather than in delivering information. This can complement the role of the media in reflecting public perception.
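A lexicon-based comparison of two corpora gives a rough idea of how such a sentiment contrast can be computed. The abstract does not specify the sentiment method used, so this is only a sketch under assumptions: the word lists, the sample texts, and the function names are hypothetical and in English for readability.

```python
# Hypothetical sentiment lexicon and sample texts, for illustration only.
POSITIVE = {"support", "welcome", "good"}
NEGATIVE = {"disappointment", "regret", "concern"}

tweets = ["we support the decision", "good decision"]
articles = ["officials express regret and concern", "disappointment over the decision"]


def sentiment_score(text):
    """Return the number of positive matches minus negative matches in one text."""
    tokens = text.lower().split()
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def mean_sentiment(texts):
    """Average the per-text sentiment scores of a corpus."""
    return sum(sentiment_score(t) for t in texts) / len(texts)


sns_mean = mean_sentiment(tweets)
media_mean = mean_sentiment(articles)
print(sns_mean, media_mean)
```

Comparing the two means is the simplest way to state that one source leans positive and the other negative; real analyses would also need morphological analysis for Korean and a validated lexicon.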

A Study on Educational Data Mining for Public Data Portal through Topic Modeling Method with Latent Dirichlet Allocation (LDA기반 토픽모델링을 활용한 공공데이터 기반의 교육용 데이터마이닝 연구)

  • Seungki Shin
    • Journal of The Korean Association of Information Education / v.26 no.5 / pp.439-448 / 2022
  • This study searches for education-related datasets provided by public data portals and examines, through classification with topic modeling methods, what types of data they contain. From the public data portal, 3,072 file datasets in the education field were collected based on the portal's classification system. Text mining analysis was performed using the LDA-based topic modeling method, with stopword processing and data pre-processing for each dataset. The datasets pre-classified as education by the data portal mostly provided program information and student-support notifications. In contrast, the datasets collected by searching for education generally covered the characteristics of educational programs and support information for the disabled, parents, the elderly, and children, from the perspective of lifelong education. The results of this analysis indicate that providing sufficient educational information through the public data portal would better support students' data science-based decision-making and problem-solving skills.
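LDA topic modeling with stopword removal can be sketched with a small collapsed Gibbs sampler. This is a toy illustration, not the study's implementation: the dataset titles, the stopword list, the hyperparameters (`alpha`, `beta`, `n_iter`), and the function `lda_gibbs` are all assumptions, and practical work would normally rely on an established library.

```python
import random

# Hypothetical dataset titles and stopword list, for illustration only.
titles = [
    "school program information for students",
    "student support notification for school",
    "lifelong education program for the elderly",
    "education support for parents and children",
]
STOPWORDS = {"for", "the", "and"}

docs = [[w for w in t.split() if w not in STOPWORDS] for t in titles]
vocab = sorted({w for d in docs for w in d})
word_id = {w: i for i, w in enumerate(vocab)}


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns topic-word and doc-topic counts."""
    rng = random.Random(seed)
    V = len(vocab)
    topic_word = [[0] * V for _ in range(n_topics)]
    doc_topic = [[0] * n_topics for _ in docs]
    topic_total = [0] * n_topics
    assignments = []
    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            topic_word[z][word_id[w]] += 1
            doc_topic[d][z] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, z = word_id[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                topic_word[z][v] -= 1
                doc_topic[d][z] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + V * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                topic_word[z][v] += 1
                doc_topic[d][z] += 1
                topic_total[z] += 1
    return topic_word, doc_topic


topic_word, doc_topic = lda_gibbs(docs)
for k, row in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda v: -row[v])[:3]
    print(k, [vocab[v] for v in top])
```

The printed top words per topic are what an analyst would inspect and label; with a corpus this small the topics are not meaningful, and the sketch is meant only to show the mechanics.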

Analysis of the ESG Research Trend : Focusing on SCOPUS DB (ESG 주요 연구 동향 분석: SCOPUS DB를 중심으로)

  • Kyoo-Sung Noh
    • Journal of Digital Convergence / v.21 no.2 / pp.9-16 / 2023
  • The purpose of this study is to analyze research trends on ESG (Environmental, Social, and Governance) and to present a direction for companies and investors to use ESG information. To this end, text mining, one of the techniques for mining unstructured data, was used for analysis. Abstracts of papers published from January 2014 to February 2023 were collected from the SCOPUS database; the most common subject area was Economics, Econometrics and Finance. The United States and China published the most ESG papers, and Korea ranked sixth in the world. This study is meaningful in that it analyzed the main research trends of ESG using text mining techniques such as LDA and topic modeling. It confirmed that ESG research is being conducted in various fields rather than in one specific field, and it differs from previous studies in that it analyzed various influencing factors and ripple effects of ESG.

Association Analysis of Reactive Oxygen Species-Hypertension Genes Discovered by Literature Mining

  • Lim, Ji Eun;Hong, Kyung-Won;Jin, Hyun-Seok;Oh, Bermseok
    • Genomics & Informatics / v.10 no.4 / pp.244-248 / 2012
  • Oxidative stress, which results in excessive production of reactive oxygen species (ROS), is one of the fundamental mechanisms in the development of hypertension. In the vascular system, ROS have physiological and pathophysiological roles in vascular remodeling and endothelial dysfunction. In this study, ROS-hypertension-related genes were collected with biological literature-mining tools, such as SciMiner and gene2pubmed, in order to identify genes that would cause hypertension through ROS. Single nucleotide polymorphisms (SNPs) located within these gene regions were then examined statistically for their association with hypertension in 6,419 Korean individuals, and pathway enrichment analysis was performed using the associated genes. The 2,945 SNPs of 237 ROS-hypertension genes were analyzed, and 68 genes were significantly associated with hypertension (p < 0.05). The most significant SNP was rs2889611 within MAPK8 ($p = 2.70 \times 10^{-5}$; odds ratio, 0.82; confidence interval, 0.75 to 0.90). This study demonstrates that a text mining approach combined with association analysis may be useful for identifying candidate genes that cause hypertension through ROS or oxidative stress.
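An odds ratio with its confidence interval, as reported for the top SNP, can be computed from a 2x2 table with the standard Wald formula. This sketch is not the study's analysis, which likely used a regression model on genotype data; the counts below are hypothetical, and the 95% level (`z=1.96`) is an assumption because the abstract does not state the level.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: cases with / without the allele; c, d: controls with / without it.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
or_value, lower, upper = odds_ratio_ci(a=300, b=700, c=400, d=600)
print(f"OR = {or_value:.2f}, CI = ({lower:.2f}, {upper:.2f})")
```

An odds ratio below 1 whose interval excludes 1, as in the abstract's 0.82 (0.75 to 0.90), indicates that the allele is associated with lower odds of the outcome.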