• Title/Summary/Keyword: keyword frequency analysis (키워드빈도분석)


Social Perception of Disaster Safety Education for Migrant Youth based on Big Data (빅데이터를 통해 바라본 이주배경청소년 재난안전교육에 대한 사회적 인식)

  • Ying Jin;Sang Jeong
    • Journal of the Society of Disaster Information / v.20 no.2 / pp.462-469 / 2024
  • Purpose: This study analyzes data on disaster safety education for migrant youth and examines the corresponding social perceptions. Method: Data on disaster safety education for migrant youth were collected and analyzed using Textom and Ucinet. The data were retrieved from portal websites for 2016 to 2023 using the keywords 'migrant youth + disaster + safety education'. Result: 'Education' (306) had the highest frequency, followed by 'safety' (287), 'school' (97), 'society' (85), and 'support' (77). The keywords with the highest degree centrality, closeness centrality, and betweenness centrality were 'education', 'safety', and 'society'. 'Family' ranked higher in betweenness centrality than in frequency, degree centrality, or closeness centrality, indicating that 'family' plays a significant mediating role in the network of disaster safety education for migrant youth. Conclusion: By clarifying social awareness of disaster safety education for migrant youth, the findings can inform policies and strategies for disaster safety education that consider the unique vulnerabilities of migrant youth in disaster situations.
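
The mediating role that betweenness centrality captures can be illustrated with a minimal sketch. The keyword network below is a hypothetical toy graph, not the study's data, and the `betweenness` helper is an illustrative pure-Python version of Brandes' algorithm, not the Ucinet routine the authors used.

```python
from collections import deque

# Hypothetical toy co-occurrence network (NOT the study's data):
# two keyword clusters that are joined only through 'family'.
EDGES = [
    ("education", "safety"), ("education", "school"), ("safety", "school"),
    ("support", "society"), ("support", "community"), ("society", "community"),
    ("family", "education"), ("family", "support"),
]

def build_adjacency(edges):
    """Turn an undirected edge list into {node: set(neighbors)}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def betweenness(adj):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                    # accumulate dependencies back toward s
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair of endpoints was counted twice.
    return {v: c / 2 for v, c in score.items()}

adj = build_adjacency(EDGES)
degree = {node: len(neighbors) for node, neighbors in adj.items()}
bc = betweenness(adj)
for node in sorted(adj, key=bc.get, reverse=True):
    print(f"{node:10s} degree={degree[node]} betweenness={bc[node]:.1f}")
```

In this toy graph 'family' has only two links yet the highest betweenness, because every shortest path between the two clusters passes through it.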

Analysis on Trends in Plogging Culture and Professional Sports Using BIG KINDS Analysis (빅카인즈 분석을 활용한 플로깅 문화와 프로스포츠 분야의 동향 분석)

  • Gyu-Min, Na;Kyung-A, Oh
    • Journal of the Korean Applied Science and Technology / v.40 no.5 / pp.1072-1080 / 2023
  • This study analyzes major keywords and social phenomena related to 'plogging' in the sports field. The analysis used BIG Kinds, the news analysis system provided by the Korea Press Promotion Foundation. The analysis period is 2018 to 2022, and 42 of the 5,148 news articles collected were used in the final analysis. Frequency analysis, relationship map analysis, and related word analysis were performed, with the following results. First, the frequency analysis of 'plogging' in the sports field identified keywords such as 'Jeju', 'Players', 'World Marathon', 'SSG', and 'Lee Bong-ju'. Second, the relationship analysis identified keywords such as 'COVID-19', 'national representative', 'elite', 'masters', and 'COVID'. Third, the related word analysis identified keywords such as 'synthesized words', 'volunteer activities', 'masters untact', 'Jeju', and 'athletes'. In domestic professional sports, plogging was shown to be actively used for environmental activities and professional team promotion, in keeping with the carbon neutrality pursued by international sports organizations.

A Methodology for Extracting Shopping-Related Keywords by Analyzing Internet Navigation Patterns (인터넷 검색기록 분석을 통한 쇼핑의도 포함 키워드 자동 추출 기법)

  • Kim, Mingyu;Kim, Namgyu;Jung, Inhwan
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.123-136 / 2014
  • Recently, online shopping has further developed as the use of the Internet and a variety of smart mobile devices becomes more prevalent. The increase in the scale of such shopping has led to the creation of many Internet shopping malls. Consequently, there is a tendency for increasingly fierce competition among online retailers, and as a result, many Internet shopping malls are making significant attempts to attract online users to their sites. One such attempt is keyword marketing, whereby a retail site pays a fee to expose its link to potential customers when they insert a specific keyword on an Internet portal site. The price related to each keyword is generally estimated by the keyword's frequency of appearance. However, it is widely accepted that the price of keywords cannot be based solely on their frequency because many keywords may appear frequently but have little relationship to shopping. This implies that it is unreasonable for an online shopping mall to spend a great deal on some keywords simply because people frequently use them. Therefore, from the perspective of shopping malls, a specialized process is required to extract meaningful keywords. Further, the demand for automating this extraction process is increasing because of the drive to improve online sales performance. In this study, we propose a methodology that can automatically extract only shopping-related keywords from the entire set of search keywords used on portal sites. We define a shopping-related keyword as a keyword that is used directly before shopping behaviors. In other words, only search keywords that direct the search results page to shopping-related pages are extracted from among the entire set of search keywords. A comparison is then made between the extracted keywords' rankings and the rankings of the entire set of search keywords. Two types of data are used in our study's experiment: web browsing history from July 1, 2012 to June 30, 2013, and site information. 
The experimental dataset came from a website ranking service and the largest portal site in Korea. The original sample dataset contains 150 million transaction logs. First, portal sites are selected, and the search keywords used on those sites are extracted. Search keywords can be extracted easily by simple parsing. The extracted keywords are ranked by frequency. The experiment uses approximately 3.9 million search results from Korea's largest search portal site, from which a total of 344,822 search keywords were extracted. Next, using web browsing history and site information, the shopping-related keywords were selected from the entire set of search keywords, yielding 4,709 shopping-related keywords. For performance evaluation, we compared the hit ratios of all the search keywords with those of the shopping-related keywords. To do so, we extracted 80,298 search keywords from several Internet shopping malls and chose the top 1,000 as a set of true shopping keywords. We measured the precision, recall, and F-score of the entire keyword set and of the shopping-related keywords. The F-score was calculated as the harmonic mean of precision and recall. The precision, recall, and F-score of the shopping-related keywords derived by the proposed methodology were higher than those of the entire keyword set. This study proposes a scheme that obtains shopping-related keywords in a relatively simple manner: we could extract them simply by examining transactions whose next visit is a shopping mall. The resulting shopping-related keyword set is expected to be a useful asset for the many shopping malls that participate in keyword marketing. Moreover, the proposed methodology can easily be applied to constructing keyword sets for other specialized areas as well as shopping.
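
The evaluation described above (precision, recall, and an F-score defined as their harmonic mean) can be sketched as follows. The keyword sets are hypothetical examples, not the study's data, and `precision_recall_f` is an illustrative helper name.

```python
def precision_recall_f(extracted, truth):
    """Score a set of extracted keywords against a reference set of true keywords."""
    extracted, truth = set(extracted), set(truth)
    hits = len(extracted & truth)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(truth) if truth else 0.0
    # F-score as the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Hypothetical keyword sets for illustration only.
true_shopping = {"sneakers", "laptop", "coupon", "backpack", "jacket"}
extracted = {"sneakers", "laptop", "coupon", "weather"}

p, r, f = precision_recall_f(extracted, true_shopping)
print(f"precision={p:.2f} recall={r:.2f} F={f:.3f}")
```

Three of the four extracted keywords are in the reference set, so precision is 3/4 and recall is 3/5; the harmonic mean penalizes whichever of the two is lower.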

A Study on the Intellectual Structure of Metadata Research by Using Co-word Analysis (동시출현단어 분석에 기반한 메타데이터 분야의 지적구조에 관한 연구)

  • Choi, Ye-Jin;Chung, Yeon-Kyoung
    • Journal of the Korean Society for Information Management / v.33 no.3 / pp.63-83 / 2016
  • As the use of information resources produced in various media and forms has increased, metadata has become increasingly crucial as an information organization tool for describing those resources. The purpose of this study is to analyze and demonstrate the intellectual structure of the metadata field through co-word analysis. The data set was collected from journals registered in the Core Collection of the Web of Science citation database for the period from January 1, 1998 to July 8, 2016. Bibliographic data for 727 journal articles were collected using a Topic search with the query word 'metadata'. From these 727 articles, 410 articles with author keywords were selected, and after data preprocessing, 1,137 author keywords were extracted. Finally, 37 keywords with a frequency of more than 6 were selected for analysis. Network analysis was conducted to demonstrate the intellectual structure of the metadata field. As a result, 2 domains and 9 clusters were derived, the intellectual relations among keywords in the metadata field were visualized, and keywords with high global and local centrality were identified. Six clusters from cluster analysis were shown on a multidimensional scaling map, and a knowledge structure was proposed based on the correlations among the keywords. The results are expected to aid understanding of the intellectual structure of the metadata field through visualization and to guide new approaches in metadata-related studies.
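
The first steps of a co-word analysis (filtering author keywords by frequency, then counting how often the surviving keywords appear together in the same article) can be sketched as below. The article keyword lists and the `MIN_FREQ` cutoff are hypothetical and much smaller than the study's data; `coword_counts` is an illustrative helper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author-keyword lists, one list per article (not the study's data).
ARTICLES = [
    ["metadata", "dublin core", "interoperability"],
    ["metadata", "digital library", "dublin core"],
    ["metadata", "ontology", "interoperability"],
    ["digital library", "metadata"],
]
MIN_FREQ = 2  # illustrative cutoff; the study applied its own frequency threshold

def coword_counts(articles, min_freq):
    """Return (keyword frequencies, co-occurrence counts for the kept keywords)."""
    # Count each keyword once per article.
    freq = Counter(kw for kws in articles for kw in set(kws))
    kept = {kw for kw, n in freq.items() if n >= min_freq}
    pairs = Counter()
    for kws in articles:
        # Sort so each unordered pair always has the same key.
        for a, b in combinations(sorted(set(kws) & kept), 2):
            pairs[(a, b)] += 1
    return freq, pairs

freq, pairs = coword_counts(ARTICLES, MIN_FREQ)
for (a, b), n in pairs.most_common():
    print(f"{a} + {b}: {n}")
```

The resulting pair counts form the weighted edge list on which network measures such as centrality or clustering would then be computed.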

Investigating Topics of Incivility Related to COVID-19 on Twitter: Analysis of Targets and Keywords of Hate Speech (트위터에서의 COVID-19와 관련된 반시민성 주제 탐색: 혐오 대상 및 키워드 분석)

  • Kim, Kyuli;Oh, Chanhee;Zhu, Yongjun
    • Journal of the Korean Society for Information Management / v.39 no.1 / pp.331-350 / 2022
  • This study aims to understand topics of incivility related to COVID-19 by analyzing Twitter posts that include COVID-19-related hate speech. To this end, a total of 63,802 tweets created between December 1st, 2019, and August 31st, 2021, covering three targets of hate speech (region and public facilities, groups of people, and religion), were analyzed. Frequency analysis, dynamic topic modeling, and keyword co-occurrence network analysis were used to explore topics and keywords. 1) Results of frequency analysis revealed that hate against regions and public facilities showed a relatively increasing trend while hate against specific groups of people and religion showed a relatively decreasing trend. 2) Results of dynamic topic modeling analysis showed keywords for each of the three targets of hate speech. Keywords for region and public facilities included "Daegu, Gyeongbuk local hate", "interregional hate", and "public facility hate"; groups of people included "China hate", "virus spreaders", and "outdoor activity sanctions"; and religion included "Shincheonji", "Christianity", "religious infection", "refusal of quarantine", and "places visited by confirmed cases". 3) Similarly, results of keyword co-occurrence network analysis revealed keywords for the three targets: region and public facilities (Corona, Daegu, confirmed cases, Shincheonji, Gyeongbuk, region); specific groups of people (Coronavirus, Wuhan pneumonia, Wuhan, China, Chinese, People, Entry, Banned); and religion (Corona, Church, Daegu, confirmed cases, infection). This study attempted to capture public opinion on incivility related to COVID-19 by identifying domestic COVID-19 hate targets and keywords using social media. In particular, it is meaningful in that it captures public opinion on incivility topics and hate emotions expressed on social media by applying data mining techniques to COVID-19-related hate, which had not been attempted in previous studies. In addition, the results have practical implications in that they can serve as basic data for establishing systems and policies for cultural communication measures in preparation for the post-COVID-19 era.

A Study on Analysis of the Trend of Blockchain by Key Words Network Analysis (키워드 네트워크 분석 방법을 활용한 블록체인 트렌드 분석에 관한 연구)

  • Cho, Seong-Hwan
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.5 / pp.550-555 / 2018
  • This study aims to identify and compare the contents and keywords used in articles about blockchain applications in various industries. Text mining and semantic network analysis, as methods of keyword network analysis, were used to analyze articles containing the terms 'finance', 'energy', and 'logistics', which the media and government frequently mention as areas where blockchain technologies can be applied. Data were collected from 43,093 articles published from January 2017 through July 2018. Data crawling was carried out using Python BeautifulSoup, and data cleaning was performed to eliminate redundancies among the three terms. Text mining and semantic network analysis were then performed using Textom and UCInet to analyze the network between keywords. The results showed that the three terms were similar with respect to 'technology', but differed in the contents of 'government policy' and 'industry' issues. In addition, the frequencies and centralities of these terms differed.

An Investigation on Digital Humanities Research Trend by Analyzing the Papers of Digital Humanities Conferences (디지털 인문학 연구 동향 분석 - Digital Humanities 학술대회 논문을 중심으로 -)

  • Chung, EunKyung
    • Journal of the Korean Society for Library and Information Science / v.55 no.1 / pp.393-413 / 2021
  • Digital humanities, which creates new and innovative knowledge by combining digital information technology with humanities research problems, is a representative multidisciplinary field of study. To investigate the intellectual structure of the digital humanities field, co-author and keyword co-word network analyses were performed on a total of 441 papers from the last two years (2019, 2020) of the Digital Humanities Conference. The author and keyword analysis shows active participation by authors from Europe and North America, and by Japanese and Chinese authors in East Asia. The co-author network contains 11 disconnected sub-networks, which can be seen as a result of closed co-authoring activities. Keyword analysis identifies 16 sub-subject areas: machine learning, pedagogy, metadata, topic modeling, stylometry, cultural heritage, network, digital archive, natural language processing, digital library, twitter, drama, big data, neural network, virtual reality, and ethics. These results imply that a diverse variety of digital information technologies play a major role in the digital humanities. In addition, high-frequency keywords can be classified into humanities-based keywords, digital information technology-based keywords, and convergence keywords. The dynamics of the growth and development of digital humanities can be represented in these combinations of keywords.

A Study on the Knowledge Transfer of Korean Nano-biotechnology based on SCI Citation Analysis (SCI 인용분석을 통한 우리나라 바이오나노 분야의 지식이전 행태 연구)

  • Lee, Hyejin;Lee, Choon-Shil
    • Proceedings of the Korean Society for Information Management Conference / 2012.08a / pp.121-124 / 2012
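  • To identify literature-information-based services that support R&D planning, this study analyzed knowledge transfer behavior based on the keywords of papers that cited Korean bio-nano papers published in SCI journals. The cited papers were those with the highest citation counts among papers published in 2004, 2006, and 2008 in the fields of bioprocessing, biosensors, and biosystems. Keyword analysis of the citing papers showed that knowledge was transferred from bioprocessing to biosensors and environment-related fields, from biosensors to olfactory sensor applications and immune enhancement applications, and from biosystems to drug delivery and cancer diagnosis.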


A Study on the Optimal Search Keyword Extraction and Retrieval Technique Generation Using Word Embedding (워드 임베딩(Word Embedding)을 활용한 최적의 키워드 추출 및 검색 방법 연구)

  • Jeong-In Lee;Jin-Hee Ahn;Kyung-Taek Koh;YoungSeok Kim
    • Journal of the Korean Geosynthetics Society / v.22 no.2 / pp.47-54 / 2023
  • In this paper, we propose a technique for optimal search keyword extraction and retrieval for news article classification. The proposed technique was verified using the example of identifying trends related to North Korean construction. BigKinds, a representative Korean media platform, was used to select sample articles and extract keywords. The extracted keywords were vectorized using word embedding, and the similarity between them was then examined through cosine similarity. In addition, words with a similarity of 0.5 or higher were clustered based on the top 10 frequencies. Following the BigKinds search form, keywords inside a cluster were joined with 'OR' and clusters were joined with 'AND'. In-depth analysis confirmed that meaningful articles appropriate for the original purpose were extracted. This paper is significant in that it enables the classification of news articles suited to a user's specific purpose without modifying the existing classification system and search form.
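
The pipeline described above (cosine similarity between keyword vectors, grouping at a 0.5 threshold, then 'OR' inside a cluster and 'AND' between clusters) might look roughly like the sketch below. The three-dimensional vectors are made-up stand-ins for real word embeddings, grouping by connected components is an assumption since the abstract does not state the clustering rule, and the parenthesized query string is an illustrative format, not a confirmed BigKinds syntax.

```python
import math

# Hypothetical embeddings; real vectors would come from a trained embedding model.
VECTORS = {
    "construction": [0.9, 0.1, 0.0],
    "building":     [0.8, 0.2, 0.1],
    "Pyongyang":    [0.1, 0.9, 0.1],
    "Wonsan":       [0.0, 0.8, 0.3],
}
THRESHOLD = 0.5  # similarity cutoff quoted in the abstract

def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cluster(vectors, threshold):
    """Group words into connected components of the 'similarity >= threshold' graph."""
    clusters = []
    for word, vec in vectors.items():
        # Existing clusters that contain at least one sufficiently similar word.
        linked = [c for c in clusters
                  if any(cosine(vec, vectors[other]) >= threshold for other in c)]
        merged = [w for c in linked for w in c] + [word]
        clusters = [c for c in clusters if c not in linked] + [merged]
    return clusters

def build_query(clusters):
    """Join words with OR inside a cluster and clusters with AND."""
    return " AND ".join("(" + " OR ".join(c) + ")" for c in clusters)

clusters = cluster(VECTORS, THRESHOLD)
query = build_query(clusters)
print(query)
```

With these toy vectors the two construction terms and the two place names fall into separate clusters, so the query requires at least one term from each group.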

A Suggestion for Spatiotemporal Analysis Model of Complaints on Officially Assessed Land Price by Big Data Mining (빅데이터 마이닝에 의한 공시지가 민원의 시공간적 분석모델 제시)

  • Cho, Tae In;Choi, Byoung Gil;Na, Young Woo;Moon, Young Seob;Kim, Se Hun
    • Journal of Cadastre & Land InformatiX / v.48 no.2 / pp.79-98 / 2018
  • The purpose of this study is to suggest a model for analyzing the spatio-temporal characteristics of civil complaints about the officially assessed land price based on big data mining. Specifically, the underlying reasons for the civil complaints were sought from a spatio-temporal perspective rather than in institutional factors, and a model was suggested for monitoring trends in the occurrence of such complaints. The official documents of 6,481 civil complaints about the officially assessed land price in Jung-gu, Incheon Metropolitan City, over the period from 2006 to 2015 were collected along with their temporal and spatial properties and used for the analysis. Frequencies of major key words were examined using a text mining method. Correlations among major key words were studied through social network analysis. By calculating term frequency (TF) and term frequency-inverse document frequency (TF-IDF), which correspond to the weights of key words, we identified the major key words associated with the occurrence of civil complaints about the officially assessed land price. The spatio-temporal characteristics of the civil complaints were then examined through hot spot analysis based on the Getis-Ord $G_i^*$ statistic. It was found that the characteristics of civil complaints about the officially assessed land price were changing, forming clusters that are linked spatio-temporally. Using text mining and social network analysis, we found that the reasons for the occurrence of civil complaints about the officially assessed land price could be identified quantitatively from natural language. TF and TF-IDF, the weights of key words, can be used as main explanatory variables for analyzing the spatio-temporal characteristics of these civil complaints, since these statistics differ over time and across regions.
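
The TF and TF-IDF keyword weights mentioned above can be sketched as follows. The tokenized complaint texts are hypothetical, and the weighting uses one common variant (raw term count multiplied by the natural log of N/df); the abstract does not say which variant the authors used.

```python
import math
from collections import Counter

# Hypothetical tokenized complaint texts (not the study's documents).
DOCS = [
    ["land", "price", "tax", "increase"],
    ["land", "price", "road", "development"],
    ["land", "price", "tax", "burden", "tax"],
]

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * ln(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term frequency within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

weights = tf_idf(DOCS)
for i, doc_weights in enumerate(weights):
    top = max(doc_weights, key=doc_weights.get)
    print(f"doc {i}: top term = {top!r} ({doc_weights[top]:.3f})")
```

Terms that occur in every document, such as 'land' and 'price' here, receive a weight of zero, which is why TF-IDF highlights the terms that distinguish one complaint from another.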