• Title/Summary/Keyword: news big data


CoAID+ : COVID-19 News Cascade Dataset for Social Context Based Fake News Detection (CoAID+ : 소셜 컨텍스트 기반 가짜뉴스 탐지를 위한 COVID-19 뉴스 파급 데이터)

  • Han, Soeun;Kang, Yoonsuk;Ko, Yunyong;Ahn, Jeewon;Kim, Yushim;Oh, Seongsoo;Park, Heejin;Kim, Sang-Wook
    • KIPS Transactions on Software and Data Engineering / v.11 no.4 / pp.149-156 / 2022
  • During the COVID-19 pandemic, fake news and misinformation related to COVID-19 have caused serious confusion in society. To detect such fake news accurately, social context-based methods have been widely studied. These methods detect fake news based on the social context, which indicates how a news article propagates over social media (e.g., Twitter). Most existing COVID-19 datasets gathered for fake news detection, however, contain only news content information and not the associated social context information. Social context-based detection methods therefore cannot be applied to them, which is a major obstacle for fake news detection research. To address this issue, we collect social context information from Twitter based on CoAID, a COVID-19 news content dataset built for fake news detection, thereby building CoAID+, which includes both the news content information and its social context information. The CoAID+ dataset can be used with a variety of social context-based fake news detection methods and could thus help revitalize this research area. Finally, through a comprehensive analysis of the CoAID+ dataset from various perspectives, we present some interesting features capable of differentiating real and fake news.
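The social context of a news article is often modeled as a propagation cascade: a tree rooted at the first post, with retweets or replies as children. The following is a minimal sketch of computing simple cascade features (size, depth, maximum breadth). The edge-list input, the function name `cascade_stats`, and the toy tweet IDs are assumptions made for illustration; they are not the CoAID+ schema or the authors' analysis code.

```python
from collections import defaultdict

def cascade_stats(root, edges):
    """Return (size, depth, max_breadth) of a propagation tree.

    edges: iterable of (parent_id, child_id) pairs, e.g. tweet -> retweet.
    The input format is hypothetical, chosen only for this sketch.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    size, depth, max_breadth = 0, 0, 0
    level = [root]          # nodes at the current distance from the root
    seen = {root}
    while level:
        size += len(level)
        max_breadth = max(max_breadth, len(level))
        next_level = []
        for node in level:
            for child in children.get(node, []):
                if child not in seen:   # guard against duplicate edges
                    seen.add(child)
                    next_level.append(child)
        if next_level:
            depth += 1
        level = next_level
    return size, depth, max_breadth

# Hypothetical toy cascade: t0 is the source tweet.
toy_edges = [("t0", "t1"), ("t0", "t2"), ("t1", "t3"), ("t3", "t4")]
toy_stats = cascade_stats("t0", toy_edges)  # (size, depth, max_breadth)
```

Features of this kind could then be compared between cascades of real and fake news articles.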

COVID-19 Discourse and Social Welfare Intervention through Online News Big Data: Focusing on the Elderly Living Alone (온라인 뉴스 빅데이터를 통한 코로나 19 담론과 사회복지 개입방안: 독거노인을 중심으로)

  • Yeo, Jiyoung
    • 한국노년학 / v.41 no.3 / pp.353-371 / 2021
  • The purpose of this study is to inform social welfare policy making by revealing the discourse on social intervention and response, based on big data about elderly people living alone during the COVID-19 pandemic. Keyword analysis, network analysis, and topic analysis were used to explore how news media have portrayed the challenges facing older individuals and how central and local governments as well as private organizations have responded to them. The results are as follows. First, networks (degree, closeness, betweenness) formed around region, delivery, society, support, and vulnerability, suggesting an increased demand for economic assistance and social support as well as stronger service delivery systems. Second, the key topics derived included "establishing public delivery systems", "establishing local networks", "managing the care gap", "establishing a private economic support system", and "establishing a service organization system". Based on these results, the study presents a discourse on the organic roles of government, communities, and the private sector, and draws policy and practical implications by proposing how to intervene for elderly people living alone in disaster situations such as COVID-19.

'Korean Wave' News Analysis Using News Big Data ('한류' 경향에 관한 국내 언론 기사 빅데이터 분석 연구)

  • Hwang, Seo-I;Park, Jeong-Bae
    • Journal of Korea Entertainment Industry Association / v.14 no.5 / pp.1-14 / 2020
  • This study applied agenda-setting theory to examine the 'Korean Wave' and its meaning in Korean society from 2000 to 2019. For this purpose, a total of 197,992 newspaper articles that reported on 'Korean Wave' issues were analyzed using topic modeling and semantic network analysis. The results are as follows. First, the term 'Korean Wave' appeared in the Korean press mainly in connection with Korea-related regions, culture, and the economy. Second, a total of 9 topics related to Korean Wave issues appeared: 'broadcast', 'export', 'domestic and foreign affairs', 'education', 'beauty and fashion', 'music and performance', 'tourism', 'media (platform)', and 'region'. Lastly, the Korean Wave was mainly discussed in the cultural and economic areas. In addition, it clustered into five characteristics: 'cultural hallyu', 'business hallyu', 'education', 'environment', and 'geography'.

A study on the rental Hanbok - Focusing on the market and consumer changes between 2006 and 2016 - (대여 한복에 대한 연구 - 2006년과 2016년 시장과 소비자 변화를 중심으로 -)

  • Shim, Joonyoung
    • The Research Journal of the Costume Culture / v.25 no.3 / pp.405-418 / 2017
  • The purpose of this study is to investigate changes in the rental Hanbok market and its consumers over the past 10 years. The study analyzed internet news about rental Hanbok and conducted in-depth interviews. The results provide basic data that can be used to understand the rental Hanbok market. The results are as follows. First, the rental Hanbok market has expanded and consists of two types of rental Hanbok: ceremonial and experiential. Experiential Hanbok is new but makes up a large part of the rental Hanbok market. It did not exist until 2007, but it accounted for more than 60% of internet news about rental Hanbok in 2016. Second, there is a significant difference in consumer behavior between the two types of rental Hanbok. Ceremonial Hanbok showed consistent consumer behavior between 2006 and 2016: consumers seek benefits such as TPO (occasion suitability), economy, exhibition, trendiness, and exceptionality through renting Hanbok. Consumers of experiential Hanbok, on the other hand, are motivated by uniqueness, conformity, and sharing memories. Based on these results, different sets of information reveal the unique features of the two types of rental Hanbok. New designs and marketing strategies also need to be developed for them.

A Study on the Change of Smart City's Issues and Perception : Focus on News, Blog, and Twitter (스마트도시의 이슈와 인식변화에 관한 연구 : 뉴스, 블로그, 트위터 자료를 중심으로)

  • Jang, Hwan-Young
    • Journal of Cadastre & Land InformatiX / v.49 no.2 / pp.67-82 / 2019
  • The purpose of this study is to analyze the issues and perceptions of smart cities. First, using a big data analysis platform, big data analysis on smart cities was conducted to derive keywords by year, word clouds, and the frequency of smart city keywords over time. Second, trends and flows by area were analyzed by reclassifying the major keywords of each year based on meta-keywords. Third, the flow of emotional perception of smart cities and the major emotional keywords were derived. According to the analysis, while U-City in the past was mostly centered on creating infrastructure for new towns, recent smart cities focus on sustainable urban construction led by citizens. In addition, while infrastructure, service, and technology were emphasized in the past, management and methodology have been emphasized recently, and positive perception of smart cities has been growing. The study could be used as basic data on the past, present, and future of smart cities in Korea at a time when smart city services are being built across the country.

A Study on Monitoring Method of Citizen Opinion based on Big Data : Focused on Gyeonggi Local Currency (Gyeonggi Money) (빅데이터 기반 시민의견 모니터링 방안 연구 : "경기지역화폐"를 중심으로)

  • Ahn, Soon-Jae;Lee, Sae-Mi;Ryu, Seung-Ei
    • Journal of Digital Convergence / v.18 no.7 / pp.93-99 / 2020
  • Text mining is a big data analysis method that extracts meaningful information from large-scale unstructured text data. In this study, text mining was used to monitor citizens' opinions on the policies and systems being implemented. We collected 5,108 newspaper articles and 748 online cafe posts related to 'Gyeonggi Local Currency' and performed frequency analysis, TF-IDF analysis, association analysis, and word tree visualization analysis. The results showed that many news articles covered the purpose of introducing the local currency, the benefits provided, and the method of use, whereas content related to the actual use of the local currency appeared in the online cafe posts. News articles acted as an information provider promoting the local currency in order to revitalize it, while online cafe posts consisted of the opinions of citizens who use the local currency. SNS and text mining are expected to help activate various policies, not only local currency.
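As an illustration of the TF-IDF analysis named above, the following minimal sketch scores terms by term frequency multiplied by an unsmoothed inverse document frequency, log(N / df). This is one common TF-IDF variant; the abstract does not state which weighting the authors used. The function name `tf_idf` and the toy documents are assumptions for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: score} dict per document.

    Uses tf = count / document length and idf = log(N / df), an unsmoothed
    variant chosen for this sketch.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))   # count each term once per document

    scores = []
    for tokens in docs:
        term_counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores

# Hypothetical, already-tokenized toy documents.
toy_docs = [
    ["local", "currency", "benefit"],
    ["local", "currency", "store", "store"],
    ["local", "policy"],
]
toy_scores = tf_idf(toy_docs)
```

A term that appears in every document (here "local") receives a score of zero, which is why TF-IDF highlights terms that distinguish one document, or one source such as cafe posts, from the rest.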

An analysis of public perception on Artificial Intelligence(AI) education using Big Data: Based on News articles and Twitter (빅데이터 분석을 통해 본 AI교육에 대한 사회적 인식: 뉴스기사와 트위터를 중심으로)

  • Lee, Sang-Soog;Yoo, Inhyeok;Kim, Jinhee
    • Journal of Digital Convergence / v.18 no.6 / pp.9-16 / 2020
  • The purpose of this study is to understand the public needs for AI education, which is actively promoted and supported by the current government. To this end, 11 metropolitan news articles and Twitter posts regarding AI education posted from January 1, 2018 to December 31, 2019 were collected. Word frequency analysis using the TF (Term Frequency) method and topic modeling using the LDA (Latent Dirichlet Allocation) method were then conducted. The topics of the news articles concerned macro-level policy support, such as 'training female manpower in the AI field' and 'curriculum reform of university and K-12', whereas the topics of the Twitter posts reflected more detailed social perceptions of future society, such as future competencies and pedagogical methods, including 'coexistence with intelligent robots', 'coding education', and 'humane education competence development'. The findings are expected to inform the composition and management of AI curricula as well as the basic framework of human resources development in future industries.

Study on the Analysis of National Paralympics by Utilizing Social Big Data Text Mining (소셜 빅데이터 텍스트 마이닝을 활용한 전국장애인체육대회 분석 연구)

  • Kim, Dae kyung;Lee, Hyun Su
    • 한국체육학회지인문사회과학편 / v.55 no.6 / pp.801-810 / 2016
  • The purpose of the study was to conduct text mining on keywords related to the National Paralympics and to provide fundamental information that could be used to change the perception of disability among people without disabilities and to promote the social participation of people with and without disabilities in the National Paralympics. Social big data regarding the National Paralympics were retrieved from news articles and blog postings identified through the search engines Naver, Daum, and Google. The data were then analysed using R version 3.3.1. The analysis techniques were word cloud analysis, correlation analysis, and social network analysis. The results were as follows. First, the news was mainly related to game results, sports events, team participation, and the host venues of the 33rd to 36th National Paralympics. Second, search results about the 33rd to 36th National Paralympics from Naver, Daum, and Google were similar to one another. Third, the keywords National Paralympics, sports for the disabled, and sports showed high closeness centrality. Further, degree centrality and betweenness centrality were associated with keywords such as sports for all, participation, research, development, sports-disabled, research-disabled, sports for all-participation, disabled-participation, sports for all-disabled, and host-paralympics.
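To make the centrality measures concrete, the following minimal sketch computes degree centrality and closeness centrality on a small undirected keyword co-occurrence graph using breadth-first search. The study itself used R; this Python version, the function names, and the toy edges (loosely based on keywords in the abstract) are assumptions for illustration and do not reproduce the study's data.

```python
from collections import defaultdict, deque

def build_graph(edges):
    """Build an undirected adjacency map from (keyword, keyword) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n <= 1:
        return {v: 0.0 for v in graph}
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """(reachable nodes) / (sum of shortest-path lengths), via BFS per node."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

# Hypothetical co-occurrence edges, not taken from the study's data.
toy_graph = build_graph([
    ("paralympics", "sports"),
    ("paralympics", "disabled"),
    ("paralympics", "host"),
    ("sports", "participation"),
])
toy_degree = degree_centrality(toy_graph)
toy_closeness = closeness_centrality(toy_graph)
```

In this toy graph the hub keyword has both the highest degree and the highest closeness, since it is directly or nearly directly connected to every other keyword.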

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People nowadays create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This is an unprecedented phenomenon, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Because SNS Big Data covers the whole of society, discovering the trend of an issue in it can serve as an important new source of value. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) it provides the topic keyword set that corresponds to the daily ranking; (2) it visualizes the daily time-series graph of a topic over the duration of a month; (3) it shows the importance of a topic through a treemap based on a scoring system and frequency; (4) it visualizes the daily time-series graph of a keyword retrieved by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process unrefined and unstructured data. In addition, such analysis requires recent big data technology to rapidly process large amounts of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data, because Hadoop is designed to scale up from a single node to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as its visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to arbitrary data; interaction with the data is easy, and the library is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets collected in Korea during March 2013.
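The daily keyword ranking idea can be sketched in a few lines. Note that this sketch swaps in plain word counting after stop-word removal; it is not the topic modeling technique, the Hadoop pipeline, or the MongoDB storage described in the abstract. The function name `daily_top_keywords`, the tiny English stop-word list, and the toy tweets are all assumptions for illustration.

```python
from collections import Counter, defaultdict

# Illustrative stop-word list only; a real system would use a fuller,
# language-appropriate list together with noun extraction.
STOP_WORDS = {"the", "a", "is", "of", "on", "in", "and", "to"}

def daily_top_keywords(tweets, k=3):
    """tweets: iterable of (date_str, text) pairs.

    Returns {date: [(word, count), ...]} with the k most frequent
    non-stop-words per day.
    """
    counts = defaultdict(Counter)
    for day, text in tweets:
        tokens = [w for w in text.lower().split() if w not in STOP_WORDS]
        counts[day].update(tokens)
    return {day: counter.most_common(k) for day, counter in counts.items()}

# Hypothetical toy tweets.
toy_tweets = [
    ("2013-03-01", "the election debate is on"),
    ("2013-03-01", "election results in"),
    ("2013-03-02", "baseball season opening"),
]
toy_ranking = daily_top_keywords(toy_tweets, k=1)
```

Tracking how a keyword's daily count changes across dates gives the kind of time series that the system's graphs would visualize.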

Study of major issues and trends facing ports, using big data news: From 1991 to 2020 (뉴스 빅데이터를 활용한 항만이슈 변화연구 : 1991~2020)

  • Yoon, Hee-Young
    • Journal of Korea Port Economic Association / v.37 no.1 / pp.159-178 / 2021
  • This study analyzed issues and trends related to ports using 86,611 news articles from the 30 years between 1991 and 2020, collected with BIGKinds, a big data news analysis service. The analysis was based on the keyword analysis, word cloud, and relationship diagram analysis offered by BIGKinds. The issues and trends for ports over the last 30 years are summarized as follows. First, during Phase 1 (1991-2000), individual ports such as Busan, Incheon, and Gwangyang tried to strengthen their own competitiveness. During Phase 2 (2001-2010), efforts were made to gain more professional and specialized port management capabilities by establishing the Busan Port Authority in 2004, the Incheon Port Authority in 2005, and the Ulsan Port Authority in 2007. During Phase 3 (2011-2020), the promotion of future-oriented, eco-friendly, and smart ports was the major issue. Efforts to reduce particulate matter and pollutants produced by ports accelerated, and attempts to build smart ports driven by port automation and digitalization also intensified. Lastly, because the maritime sector was severely hit by the unexpected shock of the COVID-19 pandemic in 2020, a detailed analysis of trends and issues in 2019 and 2020 was conducted to examine the impact of the pandemic on the maritime industry. It was found that the shipping and port industries experienced more drastic changes than ever while trying to prepare for a post-pandemic era and promoting future-oriented ports. This study makes policy suggestions based on its analysis of port-related news articles and trends. Based on these findings, further studies on enhancing port competitiveness and devising sustainable development strategies are expected to follow through comparative analysis of port issues in different countries, thereby advancing academic research on ports.