• Title/Summary/Keyword: Keyword co-word analysis


Text Mining of Wood Science Research Published in Korean and Japanese Journals

  • Eun-Suk JANG
    • Journal of the Korean Wood Science and Technology / v.51 no.6 / pp.458-469 / 2023
  • Text mining techniques provide valuable insights into research information across various fields. In this study, text mining was used to identify research trends in wood science from 2012 to 2022, with a focus on representative journals published in Korea and Japan. Abstracts from Journal of the Korean Wood Science and Technology (JKWST, 785 articles) and Journal of Wood Science (JWS, 812 articles), obtained from the SCOPUS database, were analyzed using word frequency (specifically, term frequency-inverse document frequency) and co-occurrence network analysis. Both journals showed a significant occurrence of words related to the physical and mechanical properties of wood. Furthermore, words related to wood species native to each country and their respective timber industries frequently appeared in both journals. CLT was a common keyword for engineered wood materials in Korea and Japan. In addition, the keywords "MDF," "MUF," and "GFRP" were ranked in the top 50 in Korea. Research on wood anatomy was inferred to be more active in Japan than in Korea. Co-occurrence network analysis showed that words related to the physical and structural characteristics of wood were organically related to wood materials.
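The two measures named in this abstract, TF-IDF weighting and word co-occurrence counts, can be illustrated with a minimal sketch. It is not the author's pipeline: the toy documents are hypothetical, the TF-IDF variant (relative term frequency times log of inverse document frequency, no smoothing) is one common choice among several, and documents are assumed to be non-empty and already tokenized.

```python
import math
from collections import Counter
from itertools import combinations


def tf_idf(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


def co_occurrence(docs):
    """Count how many documents each unordered term pair appears in together."""
    pairs = Counter()
    for doc in docs:
        pairs.update(combinations(sorted(set(doc)), 2))
    return pairs


# Hypothetical toy abstracts, already tokenized.
docs = [
    ["wood", "density", "strength"],
    ["wood", "clt", "strength"],
    ["wood", "anatomy", "cell"],
]
weights = tf_idf(docs)
edges = co_occurrence(docs)
```

A term that occurs in every document (here "wood") receives a weight of zero, which is why TF-IDF is preferred over raw frequency for surfacing distinctive terms. The pair counts in `edges` can serve as edge weights of a co-occurrence network.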

A Study on the Intelligence Information System's Research Identity Using the Keywords Profiling and Co-word Analysis (주제어 프로파일링 및 동시출현분석을 통한 지능정보시스템 연구의 정체성에 관한 연구)

  • Yoon, Seong Jeong;Kim, Min Yong
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.139-155 / 2016
  • The purpose of this study is to identify the research identity of the Korea Intelligent Information Systems Society by applying keyword profiling and co-word analysis to keywords collected from studies published in the most recent three years (2014 to 2016). To understand the research identity of intelligent information systems, we establish the relative position of the society's research by also collecting the keywords and research methodologies of two similar societies, The Korea Society of Management Information Systems and the Korea Association of Information Systems. The Korea Intelligent Information Systems Society focuses on four research areas: artificial intelligence/data mining, intelligent Internet, knowledge management, and optimization techniques. We therefore analyze research trends in representative journals for these four areas. For data-related journals, we investigate the keywords and research methodologies of the Korean Society for Big Data Service and the Korean Journal of Big Data. Through this research, we identify recent research trends from research keywords and compare the research methodologies and analysis tools used. This makes it possible to determine the position and orientation of current research in the Korea Intelligent Information Systems Society. As a result, this study reveals the research areas that only the Korea Intelligent Information Systems Society pursues, which establishes its legitimacy and identity. This research can therefore suggest specific future research areas for intelligent information systems. Furthermore, we predict the possibility of convergence between similar research areas and the Korea Intelligent Information Systems Society from an overall ecosystem perspective.

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung;Bae, Junghwan;Han, Namgi;Song, Min
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.69-92 / 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to analyze big social media data more rigorously. Although social media text analysis algorithms have improved, previous approaches still have limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies have added grammatical factors to the feature sets used to train classification models. The other applies semantic analysis to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, an extension of neural network algorithms, to handle more extensive semantic features that were underestimated in existing sentiment analysis. The result of the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that Word2Vec extracts three times as many related words expressing emotion about the keyword as co-occurrence analysis does. The difference comes from Word2Vec's vectorization of semantic features. Word2Vec can therefore capture hidden related words that traditional analysis has not found. In addition, part-of-speech (POS) tagging for Korean is used to detect adjectives as "emotional words". The emotional words extracted from the text are converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each may have a causal relationship with the "emotional word" in the sentence. The process of extracting these trigger factors of emotional words is named "Emotion Trigger" in this study. As a case study, the datasets were collected by searching with three keywords (professor, prosecutor, and doctor), because these keywords carry rich public emotion and opinion. Data collection was carried out in advance to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin hae-chul sky hospital, drinking and plastic surgery, rebate), and Prosecutor (lewd behavior, sponsor). The size of the text data is about 100,000 (Professor: 25,720; Doctor: 35,110; Prosecutor: 43,225), and the data were gathered from news, blogs, and Twitter to reflect various levels of public emotion in the text analysis. Gephi (http://gephi.github.io) was used for visualization, and all programs used for text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Triggers can detect hidden connections to public emotion that existing methods cannot detect. Finally, the approach used in this study could be generalized regardless of the type of text data. The limitation of this study is that it is hard to say that a word extracted by Emotion Trigger processing has a significant causal relationship with the emotional word in a sentence. A future study will clarify the causal relationship between emotional words and the words extracted by Emotion Trigger by comparing them with manually tagged relationships. Furthermore, the text data used in Emotion Trigger are Twitter data, which have a number of distinct features that we did not deal with in this study. These features will be considered in further study.

Web Site Keyword Selection Method by Considering Semantic Similarity Based on Word2Vec (Word2Vec 기반의 의미적 유사도를 고려한 웹사이트 키워드 선택 기법)

  • Lee, Donghun;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.23 no.2 / pp.83-96 / 2018
  • Extracting keywords that represent documents is very important because the keywords can be used for automated services such as document search, classification, and recommendation systems, as well as for quickly conveying document information. However, when keywords are extracted based on the frequency of words appearing in web site documents or on graph algorithms based on word co-occurrence, two problems arise: the web page structure may introduce various words that are unrelated to the topic, and the limited performance of Korean tokenizers makes it difficult to extract semantic keywords. In this paper, we propose a method that selects candidate keywords based on semantic similarity to address both the failure to extract semantic keywords and the poor accuracy of Korean tokenizer analysis. Finally, a filtering process removes inconsistent keywords to extract the final semantic keywords. Experimental results on real web pages of small businesses show that the proposed method improves performance by 34.52% over a keyword selection technique based on statistical similarity. This confirms that considering semantic similarity between words and removing inconsistent keywords improves the performance of extracting keywords from documents.
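The idea of filtering candidate keywords by semantic similarity can be sketched as follows. This is an illustration only, not the method proposed in the paper: the three-dimensional vectors are hypothetical stand-ins for Word2Vec embeddings, and the rule (keep a candidate when its mean cosine similarity to the other candidates reaches a threshold of 0.5) is an assumption chosen for simplicity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_keywords(vectors, threshold=0.5):
    """Keep candidates whose mean similarity to the other candidates meets the threshold."""
    kept = []
    for word, vec in vectors.items():
        others = [cosine(vec, v) for w, v in vectors.items() if w != word]
        if others and sum(others) / len(others) >= threshold:
            kept.append(word)
    return kept


# Hypothetical embeddings for candidate keywords from a bakery web site;
# "login" stands for a page-structure word unrelated to the topic.
vectors = {
    "bakery": [0.9, 0.1, 0.0],
    "bread": [0.8, 0.2, 0.1],
    "cake": [0.7, 0.3, 0.0],
    "login": [0.0, 0.1, 0.9],
}
keywords = filter_keywords(vectors)
```

In this toy data the three topic words point in nearly the same direction and reinforce one another, while the structural word is nearly orthogonal to them and falls below the threshold.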

Microplastics Intellectual Network Analysis based on Bigdata (빅데이터 기반한 미세플라스틱 지적네트워크 분석)

  • Kim, Younghee;Chang, Kwanjong
    • Journal of Convergence for Information Technology / v.12 no.4 / pp.239-259 / 2022
  • Since 2019, research on microplastics has been actively conducted around the world, so analyzing the differences between domestic and foreign microplastics research can serve as a milestone in setting the direction of domestic research. In this study, microplastics papers were extracted from KCI and WoS, and the differences between domestic and foreign studies were analyzed using big data-based network analysis methods: author keyword co-occurrence analysis, thesis co-citation analysis, and author co-citation analysis. The analysis of research topics confirmed that Korea needs additional studies on effects on the human body and on the treatment of microplastics in daily life. The analysis of thesis citation depth, which examines the quality of research, found that Korea still lags, at 1.39 compared with 2.25 overseas. In the analysis of the composition of the joint research front, where various researchers participate and share information, 3 of the 22 clusters in Korea are star-type, whereas all 19 overseas clusters have a mesh structure, confirming that information flow and sharing are insufficient in specific research fields in Korea. These results confirm the need to expand the research topics of microplastics, improve the quality of research, and improve the research promotion system in which various researchers participate. In addition, if an automation program is developed based on topic modeling, it will be possible to build a system capable of real-time analysis.

A Convergence Study of the Research Trends on Stress Urinary Incontinence using Word Embedding (워드임베딩을 활용한 복압성 요실금 관련 연구 동향에 관한 융합 연구)

  • Kim, Jun-Hee;Ahn, Sun-Hee;Gwak, Gyeong-Tae;Weon, Young-Soo;Yoo, Hwa-Ik
    • Journal of the Korea Convergence Society / v.12 no.8 / pp.1-11 / 2021
  • The purpose of this study was to analyze the trends and characteristics of 'stress urinary incontinence' research through word frequency analysis and to model the relationships among words using word embedding. Abstract data of 9,868 papers containing abstracts in PubMed's MEDLINE were extracted using a Python program. Then, through frequency analysis, the 10 most frequent keywords were selected. The similarity of words related to the keywords was analyzed with the Word2Vec machine learning algorithm. The locations and distances of words were visualized using the t-SNE technique, and the groups were classified and analyzed. The number of studies related to stress urinary incontinence has increased rapidly since the 1980s. The keywords used most frequently in the abstracts were 'woman', 'urethra', and 'surgery'. Through Word2Vec modeling, words such as 'female', 'urge', and 'symptom' were among the words that showed the highest relevance to the keywords in the study on stress urinary incontinence. In addition, through the t-SNE technique, keywords and related words could be classified into three groups focusing on symptoms, anatomical characteristics, and surgical interventions of stress urinary incontinence. This study is the first to examine trends in stress urinary incontinence-related studies using keyword frequency analysis and word embedding of abstracts. The results of this study can be used as a basis for future researchers to select the subject and direction of research related to stress urinary incontinence.

Research Trends Analysis on ESG Using Unsupervised Learning

  • Woo-Ryeong YANG;Hoe-Chang YANG
    • The Journal of Economics, Marketing and Management / v.11 no.3 / pp.47-66 / 2023
  • Purpose: The purpose of this study is to identify trends in ESG research by domestic and overseas researchers to date, and to present research directions and clues for applying ESG to Korean companies and for ESG practice by comparing the derived topics. Research design, data and methodology: As of October 20, 2022, a search for the keyword 'ESG' in 'scienceON' yielded 341 domestic papers with English abstracts and 1,173 overseas papers. For the analysis, word frequency analysis, word co-occurrence frequency analysis, BERTopic, LDA, and OLS regression analysis were performed using Python 3.7 to confirm trends for each topic. Results: Word frequency analysis showed that words such as management, company, performance, and value were commonly used in both domestic and overseas papers. The top 20 words included activity and responsibility in domestic papers, and sustainability, impact, and development in overseas papers. Analysis of word co-occurrence frequency confirmed that domestic papers were mainly associated with words such as company, management, and activity, and overseas papers with words such as investment, sustainability, and performance. Topic modeling derived three topics for domestic papers, including one named 'ESG from the corporate perspective', and a total of seven topics for overseas papers, including one named 'sustainable investment'. In the annual trend analysis, no topic showed a relatively increasing or decreasing tendency, confirming that all topics were neutral. Conclusions: The results of this study confirmed that, although it is desirable that domestic papers have recently begun research on consumers, their subject diversity is lower than that of overseas papers. Therefore, it is suggested that future research needs to approach various topics such as forecasting future risks related to ESG and corporate evaluation methods.
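The annual trend analysis mentioned in this abstract regresses a per-topic measure on the year and reads the sign of the slope. A minimal sketch of that idea follows. It is not the authors' code: the topic names and yearly shares are hypothetical, and only the OLS slope is computed, whereas a full analysis would also test whether the slope differs significantly from zero before labeling a topic as increasing, decreasing, or neutral.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line y = a + b*x (simple OLS regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical share of papers assigned to each topic per year.
years = [2018, 2019, 2020, 2021, 2022]
topic_share = {
    "rising_topic": [0.10, 0.15, 0.20, 0.25, 0.30],
    "flat_topic": [0.20, 0.20, 0.20, 0.20, 0.20],
}
slopes = {topic: ols_slope(years, share) for topic, share in topic_share.items()}
```

A positive slope suggests a topic gaining attention over time, a negative slope a declining one, and a slope near zero a neutral one.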

An Investigation on Digital Humanities Research Trend by Analyzing the Papers of Digital Humanities Conferences (디지털 인문학 연구 동향 분석 - Digital Humanities 학술대회 논문을 중심으로 -)

  • Chung, EunKyung
    • Journal of the Korean Society for Library and Information Science / v.55 no.1 / pp.393-413 / 2021
  • Digital humanities, which creates new and innovative knowledge by combining digital information technology with humanities research problems, can be seen as a representative multidisciplinary field of study. To investigate the intellectual structure of the digital humanities field, co-author and keyword co-word network analyses were performed on a total of 441 papers presented at the Digital Humanities Conference in the last two years (2019 and 2020). The author and keyword analyses show that authors from Europe, North America, and, in East Asia, Japan and China are active. Through the co-author network, 11 disconnected sub-networks are identified, which can be seen as a result of closed co-authoring activities. Through keyword analysis, 16 sub-subject areas are identified: machine learning, pedagogy, metadata, topic modeling, stylometry, cultural heritage, network, digital archive, natural language processing, digital library, twitter, drama, big data, neural network, virtual reality, and ethics. These results imply that a diverse variety of digital information technologies play a major role in the digital humanities. In addition, high-frequency keywords can be classified into humanities-based keywords, digital information technology-based keywords, and convergence keywords. The dynamics of the growth and development of digital humanities can be represented by these combinations of keywords.
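Identifying disconnected sub-networks in a co-author network amounts to finding the connected components of a graph in which authors are nodes and shared papers create edges. The sketch below shows one way to do this with a depth-first search; the author labels and paper lists are hypothetical, and the study itself may have used dedicated network analysis software.

```python
from collections import defaultdict


def components(papers):
    """Return the connected components (sets of authors) of a co-author network."""
    graph = defaultdict(set)
    for authors in papers:
        for a in authors:
            # Link each author to every co-author on the same paper.
            graph[a].update(b for b in authors if b != a)
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        result.append(group)
    return result


# Hypothetical author lists, one per paper.
papers = [["A", "B"], ["B", "C"], ["D", "E"], ["F"]]
groups = components(papers)
```

Here authors A, B, and C form one sub-network through the shared author B, D and E form a second, and the single-author paper by F forms a third.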

Topic Modeling Analysis of Beauty Industry using BERTopic and LDA

  • YANG, Hoe-Chang;LEE, Won-Dong
    • The Journal of Economics, Marketing and Management / v.10 no.6 / pp.1-7 / 2022
  • Purpose: The purpose of this study is to identify the research trends of degree papers related to the beauty industry and to provide information that can contribute to the development of the domestic beauty industry and to the direction of research on the beauty industry. Research design, data and methodology: This study used 154 academic papers and 189 academic papers with English abstracts out of 299 academic papers. All of these papers were found by searching for the keyword "beauty industry" in ScienceON on August 15, 2022. For the analysis, BERTopic and LDA (Latent Dirichlet Allocation) analyses were conducted using Python 3.7. In addition, OLS regression analysis was conducted to understand the annual increasing or decreasing trend of each derived topic. Results: Word frequency analysis showed that satisfaction, management, behavior, and service occurred frequently. In the word co-occurrence frequency analysis, 'service', 'satisfaction', and 'customer' were frequently associated with program and relationship. Topic modeling derived six topics: 'Beauty shop', 'Health education', 'Cosmetics', 'Customer satisfaction', 'Beauty education', and 'Beauty business'. The trend analysis of each topic confirmed that 'Beauty education' and 'Health education' are receiving more attention over time. Conclusions: Future studies must address the extreme polarization between the structure of the small beauty industry and beauty stores. Furthermore, research should explore various ways to improve the performance of internal personnel. Ways to maximize product capabilities, such as competitive cosmetics and brands, also need attention.

Comparative Analysis of News Big Data related to SARS-CoV, MERS-CoV, and SARS-CoV-2 (COVID-19)

  • Woo, Jae-Hyun
    • Journal of the Korea Society of Computer and Information / v.26 no.8 / pp.91-101 / 2021
  • This paper aims to draw implications for preparing for the post-corona era in the health and policy fields as the world experiences the COVID-19 pandemic. The purpose of this study is to analyze the news and trends of media companies through temporal analysis of three infectious diseases, SARS-CoV, MERS-CoV, and SARS-CoV-2 (COVID-19), for which the domestic infectious disease prevention system was active throughout the first year of the outbreak. To this end, using 'Big Kinds', the news analysis program of the Korea Press Foundation, the number of news articles per year was counted for the period when each infectious disease had an impact on Korea, and major trends were visualized and analyzed with word clouds. The analysis showed that the number of articles related to infectious diseases peaked when the World Health Organization (WHO) declared a warning and when (suspected) confirmed cases occurred. According to keyword and word cloud analysis, 'infectious disease outbreak and major epidemic areas', 'prevention authorities', and 'disease information and confirmed patient information' were the main common features, and differences among the three infectious diseases were also derived. In addition, the current status of the infodemic was identified by performing word cloud analysis on uncertain information. The results of this study are significant in that, drawing on previously experienced infectious diseases, they identify the roles that the health authorities and the media should prioritize in the event of a new disease epidemic, as well as areas to be reorganized.