• Title/Summary/Keyword: co-word network analysis

Analysis of News Agenda Using Text mining and Semantic Network Analysis: Focused on COVID-19 Emotions (텍스트 마이닝과 의미 네트워크 분석을 활용한 뉴스 의제 분석: 코로나 19 관련 감정을 중심으로)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.47-64 / 2021
  • The global spread of COVID-19 has not only affected many parts of daily life but has also had a huge impact on many areas, including the economy and society. As the number of confirmed cases and deaths increases, medical staff and the public are reported to be experiencing psychological problems such as anxiety, depression, and stress. The collective tragedy that accompanies the epidemic raises fear and anxiety, which are known to cause enormous disruption to the behavior and psychological well-being of many people. Long-term negative emotions can reduce people's immunity and upset their physical balance, so it is essential to understand people's psychological state during COVID-19. This study suggests a method of monitoring media news that reflects the current situation, in which the prolonged COVID-19 pandemic calls for psychological as well as physical quarantine. It also shows how a simpler method of analyzing social media networks can be applied to such cases. The aim of this study is to assist health policymakers in fast and complex decision-making processes. News plays a major role in setting the policy agenda. Among the various major media, news headlines are considered important in communication science as a summary of the core content that the media want to convey to their audiences. The news data used in this study were collected using "Bigkinds", which was created by integrating big data technology. With the collected news data, keywords were classified through text mining, and the relationships between words were visualized through semantic network analysis of the keywords. Text mining was performed with the KrKwic program, a Korean semantic network analysis tool, and word frequencies were calculated to identify keywords.
The frequency of words appearing in the keywords of articles related to COVID-19 emotions was checked and visualized in a word cloud. 'China', 'anxiety', 'situation', 'mind', 'social', and 'health' appeared frequently in relation to COVID-19 emotions. In addition, UCINET, a specialized social network analysis program, was used to analyze degree (connection) centrality and to perform cluster analysis, and the resulting graph was visualized using NetDraw. The analysis of degree centrality showed that the most central keywords in the keyword network were 'psychology', 'COVID-19', 'blue', and 'anxiety'. The network of co-occurrence frequencies among the keywords appearing in news headlines was visualized as a graph. The thickness of each line in the graph is proportional to the co-occurrence frequency, so a pair of words that frequently appear together is connected by a thick line. The 'COVID-blue' pair is drawn with the thickest line, and the 'COVID-emotion' and 'COVID-anxiety' pairs are drawn with relatively thick lines. 'Blue' in relation to COVID-19 means depression, which confirms that COVID-19 and depression are keywords that deserve attention now. The research methodology used in this study makes it possible to measure social phenomena and changes quickly while reducing costs. By analyzing news headlines, we were able to identify people's feelings and perceptions on issues related to COVID-19 depression and to identify the main agendas to be analyzed by deriving important keywords. By presenting and visualizing the subjects and important keywords related to COVID-19 emotions at a glance, the study can give medical policy managers a variety of perspectives when identifying and researching the phenomenon.
The results are expected to serve as basic data for support, treatment, and service development for psychological quarantine issues related to COVID-19.
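
The headline co-occurrence network and the connection (degree) centrality described in this abstract can be illustrated with a minimal sketch. The study itself used KrKwic, UCINET, and NetDraw. The Python below is only an assumed stand-in, the `headlines` data is hypothetical, and the normalization by (n - 1) is one common convention rather than something the abstract specifies.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(headlines):
    """Count how often each unordered keyword pair shares a headline."""
    edges = Counter()
    for keywords in headlines:
        # sorted(set(...)) removes duplicates and fixes the order within a pair
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Normalized degree: number of distinct neighbors divided by (n - 1)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    return {word: len(nb) / (n - 1) for word, nb in neighbors.items()}


# Hypothetical, already tokenized headlines (not data from the study)
headlines = [
    ["COVID-19", "blue", "anxiety"],
    ["COVID-19", "blue", "psychology"],
    ["COVID-19", "anxiety"],
    ["psychology", "health"],
]

edges = cooccurrence_edges(headlines)   # edge weight = co-occurrence frequency
centrality = degree_centrality(edges)
```

In a drawing of this graph, the edge weight in `edges` would set the line thickness, so a pair that appears together often is shown with a thicker line.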

Bibliometric Analysis on Health Information-Related Research in Korea (국내 건강정보관련 연구에 대한 계량서지학적 분석)

  • Jin Won Kim;Hanseul Lee
    • Journal of the Korean Society for Information Management / v.41 no.1 / pp.411-438 / 2024
  • This study aims to identify and comprehensively review health information-related research trends using bibliometric analysis. To this end, 1,193 papers related to "health information" published from 2002 to 2023 were collected from the Korea Citation Index (KCI) database and analyzed from several angles: research trends by period, academic fields, intellectual structure, and keyword changes. The results indicated that the number of papers related to health information continued to increase but has been decreasing since 2021. The main academic fields of health information-related research included "biomedical engineering," "preventive medicine/occupational environmental medicine," "law," "nursing," "library and information science," and "interdisciplinary research." Moreover, a co-word analysis was performed to understand the intellectual structure of research related to health information. Applying the parallel nearest neighbor clustering (PNNC) algorithm to identify the structure and clusters of the derived network yielded four clusters and their 17 subgroups, centered on two conglomerates: "medical engineering perspective on health information" and "social science perspective on health information." An inflection point analysis was conducted to track the timing of change in academic fields and keywords, and common changes were observed between 2010 and 2011. Finally, a strategy diagram was derived from the average publication year and word frequency, and high-frequency keywords were divided into "promising," "growth," and "mature." Unlike previous studies that mainly focused on content analysis, this study is meaningful in that it viewed the research area related to health information from an integrated perspective using various bibliometric methods.
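
The two axes of the strategy diagram mentioned above, word frequency and average publication year, can be computed as in the following minimal sketch. The `papers` data is hypothetical. The abstract does not state the thresholds that separate "promising," "growth," and "mature" keywords, so the sketch stops at the coordinates and does not assign those labels.

```python
from collections import defaultdict


def keyword_coordinates(papers):
    """Return {keyword: (frequency, average publication year)}."""
    years = defaultdict(list)
    for year, keywords in papers:
        # count a keyword once per paper, even if it is listed twice
        for keyword in set(keywords):
            years[keyword].append(year)
    return {kw: (len(ys), sum(ys) / len(ys)) for kw, ys in years.items()}


# Hypothetical (year, author keywords) records, not data from the study
papers = [
    (2005, ["health information", "library"]),
    (2015, ["health information", "mobile health"]),
    (2021, ["mobile health", "privacy"]),
]

coords = keyword_coordinates(papers)
```

Plotting each keyword at its `(frequency, average year)` point gives the raw diagram. How the plane is then partitioned is a separate design decision.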

A Study on the Structures and Characteristics of National Policy Knowledge (국가 정책지식의 구조와 특성에 관한 연구)

  • Lee, Ji-Sue;Chung, Young-Mee
    • Journal of Information Management / v.41 no.2 / pp.1-30 / 2010
  • This study analyzed research output in the dominant research areas of 19 national research institutions. Policy knowledge produced by the institutions during the past 5 years mainly concerned 10 policies dealing with economic and social issues. Similarities between the research subjects of the institutions were displayed by MDS mapping. The study also identified the issue attention cycles of 5 selected policies and examined the correlation between the issue attention cycles and the yields of policy knowledge. The knowledge structure of each policy was mapped using co-word analysis and Ward's clustering. It was also found that institutions performing research on similar subjects demonstrated citation preferences for each other.

A Bibliometric Approach for Department-Level Disciplinary Analysis and Science Mapping of Research Output Using Multiple Classification Schemes

  • Gautam, Pitambar
    • Journal of Contemporary Eastern Asia / v.18 no.1 / pp.7-29 / 2019
  • This study describes an approach for comparative bibliometric analysis of scientific publications related to (i) individual or several departments comprising a university, and (ii) broader integrated subject areas using multiple disciplinary schemes. It uses a custom dataset of scientific publications (ca. 15,000 articles and reviews, published during 2009-2013 and recorded in the Web of Science Core Collection) with author affiliations to the research departments of a comprehensive university that are dedicated to science, technology, engineering, mathematics, and medicine (STEMM). The dataset was first subjected to department-level and discipline-level analyses using the newly available KAKEN-L3 classification (based on the MEXT/JSPS Grants-in-Aid system), hierarchical clustering and correspondence analysis to decipher the major departmental and disciplinary clusters, and visualization of the department-discipline relationships using two-dimensional stacked bar diagrams. The next step involved the creation of subsets covering integrated subject areas and a comparative analysis of departmental contributions to a specific area (medical, health and life science) using several disciplinary schemes: Essential Science Indicators (ESI) 22 research fields, SCOPUS 27 subject areas, OECD Frascati 38 subordinate research fields, and KAKEN-L3 66 subject categories. To illustrate the effective use of science mapping techniques, the same subset for the medical, health and life science area was subjected to network analyses of keyword co-occurrences, bibliographic coupling of the publication sources, and co-citation of sources in the reference lists. The science mapping approach demonstrates ways to extract information on the prolific research themes, the journals most frequently used for publishing research findings, and the knowledge base underlying the research activities covered by the publications concerned.
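
The bibliographic coupling and co-citation relations named above can be made concrete with a minimal sketch. It is shown at the level of individual papers and references for simplicity, whereas the study applies both relations to publication sources. The `refs_by_paper` data and identifiers are hypothetical.

```python
from collections import Counter
from itertools import combinations


def bibliographic_coupling(refs_by_paper):
    """Coupling strength of two papers = number of references they share."""
    strength = {}
    for a, b in combinations(sorted(refs_by_paper), 2):
        shared = len(set(refs_by_paper[a]) & set(refs_by_paper[b]))
        if shared:
            strength[(a, b)] = shared
    return strength


def cocitation(refs_by_paper):
    """Co-citation count of two references = number of papers citing both."""
    counts = Counter()
    for refs in refs_by_paper.values():
        for r1, r2 in combinations(sorted(set(refs)), 2):
            counts[(r1, r2)] += 1
    return counts


# Hypothetical citing papers and their reference lists
refs_by_paper = {
    "P1": ["R1", "R2", "R3"],
    "P2": ["R2", "R3"],
    "P3": ["R4"],
}

coupling = bibliographic_coupling(refs_by_paper)
cocited = cocitation(refs_by_paper)
```

Coupling links the citing items (P1 and P2 share two references), while co-citation links the cited items (R2 and R3 are cited together twice).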

Identifying Top K Persuaders Using Singular Value Decomposition

  • Min, Yun-Hong;Chung, Ye-Rim
    • Journal of Distribution Science / v.14 no.9 / pp.25-29 / 2016
  • Purpose - Finding the top K persuaders in a consumer network is an important problem in marketing. Recently, a new method of computing persuasion scores, interpreted as a fixed point or stable distribution for given persuasion probabilities, was proposed, and the top K persuaders are chosen according to the computed scores. This research proposes a new definition of persuasion scores that relaxes some conditions on the matrix of probabilities, together with a method for identifying the top K persuaders based on the defined scores. Research design, data, and methodology - The top K persuaders are computed by singular value decomposition (SVD) of the matrix that represents persuasion probabilities between entities. Results - A test on a randomly generated instance shows that the proposed method is essentially different from that of the previous study, which shares a similar idea. Conclusions - The proposed method is shown to be valid with respect to both theoretical analysis and empirical testing. However, the method is limited to the category of persuasion scores that rely on the matrix form of persuasion probabilities. In addition, the strength of the method should be evaluated through additional experiments, e.g., using real instances, different benchmark methods, efficient numerical methods for SVD, and other decomposition methods such as NMF.
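
The abstract does not say how persuasion scores are read off the decomposition, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes that entry `P[i][j]` is the probability that entity i persuades entity j, and that entities are ranked by the magnitude of their component in the leading left singular vector, which is obtained here by power iteration on P Pᵀ. The matrix `P` is hypothetical.

```python
import math


def leading_left_singular_vector(P, iterations=200):
    """Approximate the leading left singular vector of P by power iteration."""
    n, m = len(P), len(P[0])
    u = [1.0] * n
    for _ in range(iterations):
        # v = P^T u, then u = P v: one power-iteration step on P P^T
        v = [sum(P[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [sum(P[i][j] * v[j] for j in range(m)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in u))
        if norm == 0:
            return u  # zero matrix: no direction to report
        u = [x / norm for x in u]
    return u


def top_k(P, k):
    """Indices of the k entities with the largest assumed scores."""
    u = leading_left_singular_vector(P)
    return sorted(range(len(u)), key=lambda i: abs(u[i]), reverse=True)[:k]


# Hypothetical persuasion probabilities between three entities
P = [
    [0.0, 0.9, 0.8],
    [0.1, 0.0, 0.2],
    [0.1, 0.3, 0.0],
]

ranking = top_k(P, 2)
```

A library routine such as a full SVD would normally replace the hand-written iteration. The loop is spelled out only to keep the sketch dependency-free.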

A Study on Analysis of Research Trends and Intellectual Structure of Cataloging Field (목록 분야 연구동향 및 지적구조 분석)

  • Lee, Ji Won
    • Journal of the Korean Society for Information Management / v.36 no.4 / pp.279-300 / 2019
  • This study aims to analyze and demonstrate the research trends and intellectual structure of the cataloging field in the 2000s and 2010s through co-word analysis. The cataloging field had firmly established its own research area, and many differences were found between the research trends and intellectual structures of the 2000s and 2010s. First, the average number of articles decreased by 4.2 in the 2010s compared to the 2000s, but the number of author keywords was not significantly different. Only 22.2% of keywords appeared more than three times in both periods, while 77.8% of keywords appeared more than three times in only one period. Second, in terms of intellectual structure, the 2000s, represented by three-level clusters, formed a more complex network than the 2010s, represented by two-level clusters. Third, an examination of the changes in the characteristics of each cluster showed that some research topics changed little, while many others became more active, were subdivided, or declined. The results of this study are meaningful in that they make it possible to grasp visually the intellectual structure of the cataloging field along with its trends over time, and to prepare for related education and research by anticipating future directions.

Exploring the Key Technologies on Next Production Innovation (4차 산업혁명 차세대 생산혁신 기술 탐색: 키워드 네트워크를 중심으로)

  • Lee, Suchul;Ko, Mihyun
    • Journal of the Korea Convergence Society / v.9 no.9 / pp.199-207 / 2018
  • This study aims to analyze Next Production Revolution (NPR) technologies through an evidence-based keyword network in order to cope with the change in production paradigm called the Fourth Industrial Revolution (4IR). For the analysis, a total of 441 papers related to NPR or 4IR were extracted, and the NPR technology network was constructed from the co-occurrence relationships among the author keywords of these papers. Based on the NPR technology network, we explored key technologies through analysis of centrality and keyword groups. As a result, technologies such as 'digital twin' and 'modeling and simulation', which discover insights by connecting the virtual and physical worlds in real time and reflect them in design and process, were identified as key technologies.

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.105-122 / 2019
  • Dimensionality reduction is one of the methods for handling big data in text mining. When reducing dimensionality, we should consider the density of the data, which has a significant influence on the performance of sentence classification. Higher-dimensional data requires many computations, which can eventually lead to high computational cost and overfitting in the model. Thus, a dimension reduction process is necessary to improve the performance of the model. Diverse methods have been proposed, ranging from merely reducing noise in the data, such as misspellings or informal text, to incorporating semantic and syntactic information. In addition, the representation and selection of text features affect the performance of the classifier in sentence classification, which is one of the fields of Natural Language Processing. The common goal of dimension reduction is to find a latent space that is representative of the raw data in the observation space. Existing methods utilize various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition to these algorithms, word embeddings, which learn low-dimensional vector space representations of words that can capture semantic and syntactic information from data, are also utilized. To improve performance, recent studies have suggested methods in which the word dictionary is modified according to the positive and negative scores of pre-defined words. The basic idea of this study is that similar words have similar vector representations. Once the feature selection algorithm identifies words that are not important, we assume that words similar to those words also have no impact on sentence classification. This study proposes two ways to achieve more accurate classification: conducting selective word elimination under specific rules and constructing word embeddings based on Word2Vec.
To select words of low importance from the text, we use the information gain algorithm to measure importance and cosine similarity to search for similar words. First, we eliminate words that have comparatively low information gain values from the raw text and form the word embedding. Second, we additionally select words that are similar to the words with low information gain values and build the word embedding. Finally, the filtered text and word embeddings are applied to two deep learning models: a Convolutional Neural Network and an Attention-Based Bidirectional LSTM. This study uses customer reviews from Kindle on Amazon.com, IMDB, and Yelp as datasets and classifies each dataset using the deep learning models. Reviews that received more than five helpful votes and whose ratio of helpful votes was over 70% were classified as helpful reviews. Yelp shows only the number of helpful votes. We extracted 100,000 reviews with more than five helpful votes by random sampling from 750,000 reviews. Minimal preprocessing, such as removing numbers and special characters from the text, was applied to each dataset. To evaluate the proposed methods, we compared their performance with that of Word2Vec and GloVe word embeddings that used all the words. We showed that one of the proposed methods performs better than the embeddings with all the words. Removing unimportant words improves performance; however, removing too many words lowers it. Future research should consider diverse preprocessing methods and an in-depth analysis of word co-occurrence to measure similarity values among words. In addition, we applied the proposed method only with Word2Vec. Other embedding methods such as GloVe, fastText, and ELMo can be combined with the proposed methods, making it possible to identify the possible combinations of word embedding methods and elimination methods.
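
The two elimination steps described in this abstract, removing words with low information gain and then words whose vectors are similar to them, can be sketched as follows. This is a simplified illustration, not the authors' implementation. The documents, labels, 2-D `vectors`, and both thresholds are hypothetical, and in the study the vectors would come from a trained Word2Vec model.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())


def information_gain(docs, labels, word):
    """Information gain of the binary feature 'word occurs in the document'."""
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    conditional = sum(
        len(part) / len(labels) * entropy(part) for part in (present, absent) if part
    )
    return entropy(labels) - conditional


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def words_to_remove(docs, labels, vectors, ig_threshold, sim_threshold):
    """Step 1: low-information-gain words. Step 2: words similar to them."""
    vocab = set().union(*docs)
    low = {w for w in vocab if information_gain(docs, labels, w) < ig_threshold}
    similar = set()
    for w in vocab - low:
        if w not in vectors:
            continue
        if any(s in vectors and cosine(vectors[w], vectors[s]) >= sim_threshold
               for s in low):
            similar.add(w)
    return low, similar


# Hypothetical tokenized reviews (as sets), labels, and toy word vectors
docs = [{"great", "the", "book"}, {"awful", "the", "novel"},
        {"great", "a", "read"}, {"awful", "a", "book"}]
labels = [1, 0, 1, 0]
vectors = {"book": [1.0, 0.0], "novel": [0.95, 0.1], "great": [0.0, 1.0],
           "awful": [0.1, -1.0], "read": [0.3, 0.8]}

low, similar = words_to_remove(docs, labels, vectors,
                               ig_threshold=0.1, sim_threshold=0.9)
```

In this toy data, "book" carries no class information, and "novel" is dropped in the second step only because its vector is close to that of "book".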

Keyword networks in RJCC research - A co-word analysis and clustering - (RJCC 연구 키워드 네트워크 - 동시출현단어분석과 군집분석 -)

  • Seo, Hyun-Jin;Choi, Yeong-Hyeon;Oh, Seung-Taek;Lee, Kyu-Hye
    • The Research Journal of the Costume Culture / v.27 no.3 / pp.193-205 / 2019
  • A trend analysis of research articles in a field of knowledge is significant because, by observing change over time, it can help reveal the structural characteristics of the field and the future direction of research. We identified the structural characteristics and trends in text data (keywords) gathered from research articles, which is in itself an important task in various research areas. The titles and keywords were crawled from research articles published from 2016 to 2018 in the Research Journal of the Costume Culture (RJCC), one of the representative Korean journals in the field of clothing and textiles. After extracting data comprising English titles and keywords from 195 published articles, we transformed the data into a 1-mode matrix. We used measures from network analysis (i.e., link, strength, and degree centrality) to evaluate meaningful patterns and trends in research on clothing and textiles. NodeXL was used to visualize the semantic network. This study observed changes in clothing and textiles research trends. In addition to covering the core areas of the field, the subjects of research have diversified with every passing year and have evolved in a developmental direction. The most studied area in articles published by the RJCC was fashion retailing/consumer psychology, while aesthetic/historic and fashion industry/policy studies were covered to a more limited extent. We observed that most of the studies reflecting the identity of the RJCC share subject keywords to a significant extent.
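
The transformation into a 1-mode matrix mentioned above can be sketched as the projection of an article-by-keyword (2-mode) incidence matrix onto a keyword-by-keyword matrix. This is an assumed reading of the step, written in plain Python with hypothetical `articles` data. The study itself used NodeXL for the network measures and visualization.

```python
def one_mode_matrix(articles):
    """Project article-keyword data onto a keyword x keyword matrix."""
    vocab = sorted({kw for kws in articles for kw in kws})
    index = {kw: i for i, kw in enumerate(vocab)}
    # 2-mode incidence matrix: rows = articles, columns = keywords
    incidence = [[0] * len(vocab) for _ in articles]
    for row, kws in zip(incidence, articles):
        for kw in kws:
            row[index[kw]] = 1
    # 1-mode matrix = incidence^T x incidence
    # off-diagonal cells hold co-occurrence counts, the diagonal holds frequencies
    size = len(vocab)
    matrix = [[sum(r[i] * r[j] for r in incidence) for j in range(size)]
              for i in range(size)]
    return vocab, matrix


def degree(matrix, i):
    """Number of distinct keywords that co-occur with keyword i."""
    return sum(1 for j, weight in enumerate(matrix[i]) if j != i and weight > 0)


# Hypothetical keyword lists for three articles
articles = [
    ["fashion", "consumer"],
    ["fashion", "retailing", "consumer"],
    ["costume", "history"],
]

vocab, matrix = one_mode_matrix(articles)
```

Each off-diagonal cell can serve as link strength, and `degree` counts the links of a keyword.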

Research on Brand Value Dimensions of Employers: Based on Online Reviews by the Employees

  • XU, Meng
    • The Journal of Asian Finance, Economics and Business / v.9 no.10 / pp.215-225 / 2022
  • This study investigates employees' online reviews, conducts in-depth text topic mining, summarizes the dimensions of employer brand value, and seeks effective ways to build employer brands from a multi-dimensional perspective. The study takes samples of employer reviews, filters keywords according to term frequency-inverse document frequency (TF-IDF), builds a network of reviews that contain the same keywords, explores the communities in it, and summarizes the theme dimensions. It also makes a dynamic comparison and analysis of the employer brand value dimensions of different industries and enterprises. The study shows that the themes found through community exploration can be summarized into 11 dimensions of employer brand value, and that these dimensions differ significantly across industries and among enterprises within the same industry. The attention paid to the employer brand value dimensions also changes significantly over time. Various industries pay increasing attention to the dimensions of work intensity and career development, while employers pay steady attention to the dimension of welfare benefits. The findings of this study suggest that seeking the heterogeneity of employer brand resources from the multi-dimensional differences and changes is an effective way to improve the competitiveness of enterprises in the human capital market.
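
The keyword filtering step described above can be illustrated with a minimal TF-IDF sketch. The abstract does not give the exact weighting variant or cutoff, so the formula (relative term frequency times the natural log of N/df), the rule of keeping the k highest-scoring terms, and the `reviews` data are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(reviews):
    """Return one {term: tf-idf} dict per tokenized review."""
    n = len(reviews)
    # document frequency: in how many reviews each term occurs
    df = Counter(term for review in reviews for term in set(review))
    scores = []
    for review in reviews:
        tf = Counter(review)
        scores.append({term: (count / len(review)) * math.log(n / df[term])
                       for term, count in tf.items()})
    return scores


def top_keywords(reviews, k):
    """Keep the k terms with the highest tf-idf score in any review."""
    best = {}
    for row in tfidf(reviews):
        for term, score in row.items():
            best[term] = max(score, best.get(term, 0.0))
    # ties are broken alphabetically so the result is deterministic
    return sorted(best, key=lambda term: (-best[term], term))[:k]


# Hypothetical tokenized employee reviews
reviews = [
    ["overtime", "salary", "company"],
    ["salary", "benefits", "company"],
    ["overtime", "promotion", "company"],
]

keywords = top_keywords(reviews, 2)
```

A term such as "company" that occurs in every review gets a score of zero and is filtered out, which is the purpose of the inverse document frequency factor.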