• Title/Summary/Keyword: news big data

An Analysis of News Report Characteristics on Archives & Records Management for the Press in Korea: Based on 1999~2018 News Big Data (뉴스 빅데이터를 이용한 우리나라 언론의 기록관리 분야 보도 특성 분석: 1999~2018 뉴스를 중심으로)

  • Han, Seunghee
    • Journal of the Korean Society for Information Management / v.35 no.3 / pp.41-75 / 2018
  • The purpose of this study is to analyze how the Korean media have covered archives & records management, based on time-series analysis. For the period from January 1999 to June 2018, 4,680 news articles on archives & records management topics were extracted from BigKinds. To examine the characteristics of this coverage, the study analyzed differences in press coverage by period, subject, and type of media. In addition, word-frequency-based content analysis and semantic network analysis were conducted to investigate the content characteristics of the coverage. The results showed differences in both the amount and the content of news coverage by period, subject, and type of media. The amount of news coverage began to increase after the Presidential Records Management Act was enacted in 2007, and the largest amount of news was reported in 2013. Daily newspapers and financial newspapers reported the largest amount of news. During the first 10 years after 1999, news topics formed around issues arising from the application and diffusion of the concept of archives & records management. Since the enactment of the Presidential Records Management Act, however, archives & records management has become a major factor in political and social issues, and a large amount of political and social news has been reported.

COVID-19 News Analysis Using News Big Data : Focusing on Topic Modeling Analysis (뉴스 빅데이터를 활용한 코로나19 언론보도 분석 :토픽모델링 분석을 중심으로)

  • Kim, Tae-Jong
    • The Journal of the Korea Contents Association / v.20 no.5 / pp.457-466 / 2020
  • The purpose of this study is to use news big data on the recently spreading COVID-19 to identify the main agendas formed in society and how they change through the media, and to suggest directions for future reporting. To this end, 47,816 news items reported from December 31, 2019 to March 11, 2020 were divided into four periods based on the four stages of the crisis warning for infectious diseases, and a total of 20 topics were derived. Based on the results of the topic modeling analysis, this study proposes the following. First, it is necessary to refrain from provocative expressions such as "anxiety" and "fear" and to use neutral and objective reporting terms. Second, more in-depth and contextual news production is required, moving away from simple event reporting. Third, it is necessary to prepare detailed crisis communication manuals for each situation related to infectious diseases. Fourth, reports are needed that focus on citizen-led efforts to overcome the crisis. This research has academic significance as the first paper to analyze news big data on COVID-19 using topic modeling analysis, and policy significance in that it can serve as a basis for developing national crisis communication policy.

Developing and Evaluating Damage Information Classifier of High Impact Weather by Using News Big Data (재해기상 언론기사 빅데이터를 활용한 피해정보 자동 분류기 개발)

  • Su-Ji, Cho;Ki-Kwang Lee
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.3 / pp.7-14 / 2023
  • The importance of impact-based forecasting has recently increased as the socio-economic impacts of severe weather have emerged. Because news articles contain unstructured information closely related to people's lives, this study developed and evaluated a binary classification algorithm for snowfall damage information by text mining media articles. We collected news articles from 2009 to 2021 that contain 'heavy snow' in the body text and labelled whether each article corresponds to a specific damage field, such as car accidents. To develop the classifier, we proposed a probability-based classifier built on the ratio of two conditional probabilities, defined in this study as the I/O Ratio. During construction, we also adopted an n-gram approach to account for the contextual meaning of each keyword. The accuracy of the classifier was 75%, supporting the feasibility of applying news big data to impact-based forecasting. We expect the performance of the classifier to improve in further research as more varied training data accumulate. The results can readily be extended by applying the same methodology to other disasters. Furthermore, they can help reduce the social and economic damage of high-impact weather by supporting the establishment of an integrated meteorological decision support system.
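
A ratio-of-conditional-probabilities classifier of the kind this abstract describes might be sketched as follows. The abstract does not give the exact definition of the I/O Ratio, so everything here is an assumption made for illustration: the ratio is taken as P(n-gram | damage article) / P(n-gram | non-damage article) with add-one smoothing, n-grams are unigrams plus bigrams, and an article is flagged when the summed log-ratios are positive. The function names and the toy articles are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def train(docs, labels, n_max=2):
    """Count, per class, how many articles contain each n-gram."""
    inside, outside = Counter(), Counter()
    n_in = sum(labels)
    n_out = len(labels) - n_in
    for tokens, label in zip(docs, labels):
        target = inside if label else outside
        target.update(set(ngrams(tokens, n_max)))
    return inside, outside, n_in, n_out


def log_ratio(gram, model):
    """Log of the assumed ratio P(gram | damage) / P(gram | no damage)."""
    inside, outside, n_in, n_out = model
    p_in = (inside[gram] + 1) / (n_in + 2)    # add-one smoothing (assumption)
    p_out = (outside[gram] + 1) / (n_out + 2)
    return math.log(p_in / p_out)


def classify(tokens, model, n_max=2):
    """Flag an article as damage-related if its summed log-ratios are positive."""
    inside, outside, _, _ = model
    seen = [g for g in set(ngrams(tokens, n_max)) if g in inside or g in outside]
    return sum(log_ratio(g, model) for g in seen) > 0


# Toy, made-up training articles (1 = reports car-accident damage).
docs = [
    ["heavy", "snow", "car", "accident"],
    ["heavy", "snow", "road", "collision", "accident"],
    ["heavy", "snow", "festival", "opens"],
    ["heavy", "snow", "ski", "resort", "opens"],
]
labels = [1, 1, 0, 0]
model = train(docs, labels)
```

With this toy model, `classify(["snow", "car", "accident"], model)` flags the article and `classify(["ski", "resort", "opens"], model)` does not. N-grams never seen in training are skipped so that they do not bias the score.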

A Study on the Meaning of The First Slam Dunk Based on Text Mining and Semantic Network Analysis

  • Kyung-Won Byun
    • International Journal of Advanced Smart Convergence / v.12 no.1 / pp.164-172 / 2023
  • In this study, we identify how 'The First Slam Dunk', which is gaining popularity as a sports-based cartoon, is perceived, through big data analysis of social media channels, and provide basic data for the development of various content in the sports industry. Social big data were collected from news provided on the Naver and Google sites. Data were collected from January 1, 2023 to February 15, 2023, with reference to the release date of 'The First Slam Dunk' in Korea. In total, 2,106 Naver news items and 1,019 Google news items were collected. TF and TF-IDF were analyzed through text mining of these data, and semantic network analysis was then conducted for 60 keywords. Big data analysis programs such as Textom and UCINET were used for the social big data analysis, and NetDraw was used for visualization. Considering TF and TF-IDF, the most frequent keyword was 'The First Slam Dunk', which appeared 4,079 times, followed by 'Slam Dunk', 'Movie', 'Premiere', 'Animation', 'Audience', and 'Box-Office'. Based on these results, 60 high-frequency keywords were extracted, after which semantic metrics and centrality analysis were conducted. Finally, a total of six clusters (competing movie, cartoon, passion, premiere, attention, Box-Office) were formed through CONCOR analysis. Based on this analysis of the semantic network of 'The First Slam Dunk', basic data for planning the development of sports content were provided.
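
The TF and TF-IDF step mentioned in this abstract can be illustrated with a minimal sketch. The study used Textom, whose exact weighting formula is not stated here, so the common textbook form is assumed: TF is the total count of a word across the corpus, and TF-IDF is TF multiplied by log(N / df), where N is the number of documents and df is the number of documents containing the word. The function name and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def tf_and_tfidf(docs):
    """Return corpus-level TF and TF-IDF for tokenized documents.

    Assumes TF-IDF = TF * log(N / df); other tools may weight differently.
    """
    n_docs = len(docs)
    tf = Counter()   # total occurrences of each word
    df = Counter()   # number of documents containing each word
    for tokens in docs:
        tf.update(tokens)
        df.update(set(tokens))
    tfidf = {word: tf[word] * math.log(n_docs / df[word]) for word in tf}
    return tf, tfidf


# Made-up tokenized news items, for illustration only.
docs = [
    ["slam", "dunk", "movie"],
    ["slam", "dunk", "premiere"],
    ["movie", "audience", "audience"],
]
tf, tfidf = tf_and_tfidf(docs)
```

In this toy corpus, 'slam' and 'audience' have the same TF, but 'audience' gets the higher TF-IDF because it is concentrated in a single document. This is why the two measures can rank keywords differently.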

Comparing Social Media and News Articles on Climate Change: Different Viewpoints Revealed

  • Kang Nyeon Lee;Haein Lee;Jang Hyun Kim;Youngsang Kim;Seon Hong Lee
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.11 / pp.2966-2986 / 2023
  • Climate change is a constant threat to human life, and it is important to understand the public perception of this issue. Previous studies examining climate change have been based on limited survey data. In this study, the authors used big data such as news articles and social media data, selected with specific keywords related to climate change. Using these natural language data, topic modeling was performed for discourse analysis regarding climate change based on various topics. In addition, before topic modeling, sentiment analysis was applied to discover the differences between discourses on climate change. Through this approach, discourses of positive and negative tendencies were classified. As a result, it was possible to identify the tendency of each document by extracting keywords for the classified discourse. This study aims to show that topic modeling is a useful methodology for exploring discourse on platforms with big data. Moreover, the reliability of the study was increased by performing topic modeling in consideration of objective indicators (i.e., coherence score, perplexity). Theoretically, based on the social amplification of risk framework (SARF), this study demonstrates that the diffusion of the agenda of climate change in public news media leads to personal anxiety and fear on social media.

Understanding the Food Hygiene of Cruise through the Big Data Analytics using the Web Crawling and Text Mining

  • Shuting, Tao;Kang, Byongnam;Kim, Hak-Seon
    • Culinary Science and Hospitality Research / v.24 no.2 / pp.34-43 / 2018
  • The objective of this study was to gain a general, text-based understanding of how cruise food hygiene is perceived, through big data analytics. For this purpose, data were collected by searching the keywords "food hygiene, cruise" on Google web pages and news from October 1, 2015 to October 1, 2017 (two years). The data were collected and processed with SCTM, a data collecting and processing program, yielding 899 KB of text, approximately 20,000 words. For the data analysis, UCINET 6.0, packaged with the visualization tool NetDraw, was used. The analysis showed that words such as 'jobs' and 'news' had high frequency, while the rankings by centrality (Freeman's degree centrality and eigenvector centrality) and proximity differed from the ranking by frequency. In the CONCOR analysis, four segments were created: "food hygiene group", "person group", "location related group", and "brand group". This diagnosis of food hygiene in the cruise industry through big data is expected to provide useful implications for both academic research and practical application.

Interpretation of the place discourse of Deoksugung Doldam-gil through News Big Data (뉴스 빅데이터를 통한 덕수궁 돌담길의 장소 담론 해석)

  • Sung, Ji-Young;Kim, Sung-Kyun
    • Journal of Digital Contents Society / v.18 no.5 / pp.923-932 / 2017
  • Based on the metadata of BIGKinds, a news big data system, this study analyzed trends in mass-media news coverage of Deoksugung Doldam-gil by major field and topic. In addition, we interpreted the place discourse of Deoksugung Doldam-gil that has formed in the contemporary period by analyzing the BIGKinds data together with the content and context of related reports. The analysis showed that coverage of Deoksugung Doldam-gil appeared mostly in the 'Culture' field, in news related to 'Cooking_Travel', 'Exhibition_Performance', and 'Broadcasting Entertainment'. Deoksugung Doldam-gil was categorized as a pedestrian-friendly street, a cultural and artistic street, and a historical street, and its spatial discourse was interpreted using the related news content.

Strategies for the Development of Watermelon Industry Using Unstructured Big Data Analysis

  • LEE, Seung-In;SON, Chansoo;SHIM, Joonyong;LEE, Hyerim;LEE, Hye-Jin;CHO, Yongbeen
    • The Journal of Industrial Distribution & Business / v.12 no.1 / pp.47-62 / 2021
  • Purpose: The purpose of this study was to examine strategies for developing the watermelon industry using unstructured big data analysis. Specifically, the study examined changes in issues and in consumers' perceptions of watermelon using big data and social network analysis, and on that basis investigated ways to strengthen the competitiveness of the watermelon industry. Methodology: Data were collected from Naver (blog, news) and Daum (blog, news) with TEXTOM 4.5, and the analysis period was divided into 2015-2016, 2017-2018, and 2019-2020 in order to track changes in issues and in consumers' perceptions of watermelon and the watermelon industry. For the data analysis, TEXTOM 4.5 was used to conduct keyword frequency analysis, word cloud analysis, and extraction of metrics data. UCINET 6.0 and its NetDraw function were used to find the connection structure of words, to visualize the network relations, and to cluster the words. Results: The top keywords related to watermelon were 'the stalk end of a watermelon', 'E-mart', 'Haman', 'Gochang', and 'Lotte Mart' (news: 2015-2016); 'apple watermelon', 'Haman', 'E-mart', 'Gochang', and 'Mudeungsan watermelon' (news: 2017-2018); 'E-mart', 'apple watermelon', 'household', 'chobok', and 'donation' (news: 2019-2020); 'watermelon salad', 'taste', 'the heat', 'baby', and 'effect' (blog: 2015-2016); 'taste', 'watermelon juice', 'method', 'watermelon salad', and 'baby' (blog: 2017-2018); and 'taste', 'effect', 'watermelon juice', 'method', and 'apple watermelon' (blog: 2019-2020). The results of the frequency and TF-IDF analyses are presented. In the CONCOR analysis, four clusters appeared for each data set. Conclusions: Based on the results, the authors discuss strategies and policies for boosting the watermelon industry, the limitations of this study, and future research directions. The results will help prioritize strategies and policies for boosting watermelon consumption and contribute to improving the competitiveness of the watermelon industry in Korea. The study is also expected to serve as an important basis for future agricultural big data studies and to offer watermelon producers and policy-makers practical points for crafting tailor-made marketing strategies.

Analysis of the Empirical Effects of Contextual Matching Advertising for Online News

  • Oh, Hyo-Jung;Lee, Chang-Ki;Lee, Chung-Hee
    • ETRI Journal / v.34 no.2 / pp.292-295 / 2012
  • Beyond the simple keyword matching methods in contextual advertising, we propose a rich contextual matching (CM) model adopting a classification method for topic targeting and a query expansion method for semantic ad matching. This letter reports on an investigation into the empirical effects of the CM model by comparing the click-through rates (CTRs) of two practical online news advertising systems. Based on the evaluation results from over 100 million impressions, we prove that the average CTR of our proposed model outperforms that of a traditional model.

Exploring the leading indicator and time series analysis on the diffusion of big data in Korea (빅데이터 확산에 대한 선행 데이터 탐색 및 국내 확산 과정의 시계열 분석)

  • Choi, Jin;Kim, YoungJun
    • Journal of Technology Innovation / v.26 no.4 / pp.57-97 / 2018
  • Big data has spread rapidly across various industries since 2010. We analyzed the general characteristics of big data through time-series analysis of the initial stage of its diffusion and investigated differences in diffusion characteristics across industries. By analyzing papers, patents, news data, and Google Trends with "Big Data" as the keyword, we searched for data that act as a leading indicator and confirmed that trends in news and Google Trends preceded those in papers and patents by two years. We used Google Trends to compare the timing of introduction in Korea, the US, Japan, and China, and used news data to quantify the diffusion process in eight main industries in Korea. Through this study, we present an empirical research method for examining how a general technology spreads across several industry sectors, and we identify where the differences in the diffusion speed of big data across Korean industries originated. Because the method applies to the diffusion of technologies other than big data and tracks the spread of technology keywords within a specific country, it can be used to analyze technologies that developing countries introduce from abroad. For companies, this approach shows which path is effective for launching and spreading new technologies.
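
One simple way to look for a leading indicator of the kind described in this abstract is a lagged correlation: shift the candidate leading series forward by k periods and see which k aligns it best with the lagging series. The abstract does not state how the two-year lead was measured, so this is only a sketch under that assumption. The function names and the yearly counts below are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def best_lead(leading, lagging, max_lag):
    """Return the lag (in periods) at which `leading` best matches `lagging`.

    For each lag k, leading[t] is paired with lagging[t + k].
    """
    scores = {}
    for lag in range(max_lag + 1):
        xs = leading[:len(leading) - lag]
        ys = lagging[lag:]
        scores[lag] = pearson(xs, ys)
    return max(scores, key=scores.get), scores


# Hypothetical yearly counts: `papers` repeats `news` two years later.
news = [1, 3, 2, 6, 4, 9, 5, 12]
papers = [0, 0, 1, 3, 2, 6, 4, 9]
lag, scores = best_lead(news, papers, max_lag=3)
```

On this toy data the best lag is 2, because `papers` was built as a two-period delayed copy of `news`. Real series would need detrending or differencing first, since two rising trends correlate strongly at almost any lag.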