• Title/Summary/Keyword: news data

A Research on Developing a Card News System based on News Generation Algorithm (알고리즘 기반의 개인화된 카드뉴스 생성 시스템 연구)

  • Kim, Dongwhan;Lee, Sanghyuk;Oh, Jonghwan;Kim, Junsuk;Park, Sungmin;Choi, Woobin;Lee, Joonhwan
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.2
    • /
    • pp.301-316
    • /
    • 2020
  • Algorithm journalism refers to the practice of automated news generation using algorithms that produce human-sounding narratives. It is known for its strength in automating repetitive tasks through rapid and accurate data analysis, and has been actively used in news domains such as sports and finance. In this paper, we propose an interactive card news system that generates personalized articles on the 2018 local elections. The system consists of modules that collect and analyze election data, generate texts and images, and allow users to specify their interests in the local elections. When a user selects regions, election types, candidate names, and political parties of interest, the system generates card news accordingly. In the study, we examined how personalized card news is evaluated in comparison with text and card news articles written by human journalists, and derived implications for the potential use of algorithms in reporting political events.

The Representation of Cancer Risk by Korean Health Journalism: Comparing the Crude Rates of 10 Cancers to the Amount of Cancer News in the Three Major Newspapers(1990-2010) (10대암 조발생률과 신문 보도량의 비교: 3대 일간지 보도(1990년~2010년)를 중심으로)

  • Ju, Youngkee;Jeong, Da-Eun;You, Myoungsoon
    • Korean Journal of Health Education and Promotion
    • /
    • v.30 no.5
    • /
    • pp.201-210
    • /
    • 2013
  • Objectives: The public relies on the news media to understand health risks. To examine the surveillance function of Korean health journalism, this study compared the rank order of the 10 most frequently diagnosed cancers with that of the 10 cancers most frequently covered by three major Korean newspapers. Methods: News stories published between 1999 and 2010 by the Chosun-Ilbo, Joong-Ang-Ilbo, and Dong-A-Ilbo were examined. Data on cancer incidence were collected from the epidemiological data published by a governmental public health institution. To compare the crude rates with the amount of news coverage, rank-order correlation tests and regression analyses were employed. Results: The rank-order correlation coefficient declined despite an increase in the overall number of cancer news stories. The correlation was no longer significant after 2006. Large differences between the rank order of the crude rate and that of the amount of news coverage were observed for breast, uterine, thyroid, and gallbladder/biliary cancers. Finally, the three newspapers' coverage did not follow the changes in stomach, lung, liver, and uterine cervix cancer. The crude-rate ranks of these four cancers were falling, signifying a decline in their relative danger. Conclusions: The news media's customization of news content and the negative bias in journalism are suggested as possible influences on the news media's inaccurate representation of cancer risk.
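
The rank-order correlation test mentioned in the abstract can be illustrated with a minimal Python sketch of Spearman's coefficient. The abstract does not say which rank correlation statistic was used, so Spearman is an assumption here, and the incidence and story counts below are hypothetical numbers made up for illustration, not data from the study.

```python
def ranks(values):
    """Return 1-based ranks (largest value gets rank 1); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical crude rates and news story counts for five cancer types.
crude_rate = [60.1, 45.3, 30.2, 22.8, 10.5]
story_count = [120, 300, 80, 90, 40]
rho = spearman(crude_rate, story_count)
print(round(rho, 3))
```

A coefficient near 1 would mean the newspapers' coverage ranking closely tracks the incidence ranking; values drifting toward 0 correspond to the weakening correspondence the abstract reports.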

An Analysis of News Media Coverage of the QRcode: Based on 2008-2023 News Big Data (QR코드에 대한 언론 보도 경향: 2008-2023년 뉴스 빅데이터 분석)

  • Sunjeong Kim;Jisu Lee
    • Journal of the Korean Society for Information Management
    • /
    • v.41 no.2
    • /
    • pp.269-294
    • /
    • 2024
  • This study analyzed the news media coverage of QRcodes in Korea over a 16-year period (2008 to 2023). A total of 13,335 articles were extracted from the Korea Press Foundation's BigKinds. Quantitative and content analyses were conducted on the news frames. The results indicated that the quantity of news coverage has increased. The greatest quantity of news coverage was observed in 2020, and the most frequently discussed topic in the news was 'IT_Science'. The keyword analysis indicated that the primary words were 'QRcode', 'smartphone', 'service', 'application', and 'payment'. The news media primarily focused on the QRcode's ability to provide instant access and on its recognition technology. This study demonstrates that advanced information and communication technologies and the increased prevalence of mobile devices have led to a rise in the use of QRcodes. Furthermore, QRcodes have become a significant information medium in contemporary society.

Wrapper-based Economy Data Collection System Design And Implementation (래퍼 기반 경제 데이터 수집 시스템 설계 및 구현)

  • Piao, Zhegao;Gu, Yeong Hyeon;Yoo, Seong Joon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2015.05a
    • /
    • pp.227-230
    • /
    • 2015
  • To analyze and predict economic trends, it is necessary to collect specific economic news and stock data. A typical web crawler analyzes page content, collects documents, and extracts URLs automatically. There are also crawlers that collect only documents on a particular topic. To collect economic news from a particular website, we need a crawler that directly analyzes the site's structure and gathers data from it; that is, a wrapper-based web crawler. In this paper, we design and implement a crawler wrapper for a big-data-based economic news analysis system. With the wrapper-based crawler, we collect stock data and sales data from the USA auto market since 2000, as well as economic news data from the USA and South Korea. The crawler determines how frequently the data on each site is updated and collects it periodically. We remove duplicate data and build a structured data set for subsequent analysis. Noise data such as advertising and public relations content is removed first.
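
As a rough illustration of the wrapper idea described above, the following Python sketch extracts fields from a page using site-specific rules and then drops duplicate records. It is not the authors' implementation: the class names (`headline`, `article-body`, `ad`), the sample page, and the hash-based duplicate check are all assumptions chosen for the example, and only the standard library is used.

```python
import hashlib
from html.parser import HTMLParser


class NewsWrapper(HTMLParser):
    """Site-specific wrapper: keeps text only from elements whose class is in `fields`."""

    def __init__(self, fields):
        super().__init__()
        self.fields = fields  # maps a CSS class on the site to an output field name
        self.record = {name: "" for name in fields.values()}
        self._current = None  # output field currently being captured
        self._tag = None      # tag name of the element that opened the capture
        self._depth = 0       # nesting depth of same-named tags inside the capture

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if self._current is None and css_class in self.fields:
            self._current, self._tag, self._depth = self.fields[css_class], tag, 1
        elif self._current is not None and tag == self._tag:
            self._depth += 1

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.record[self._current] += data


def extract(html, fields):
    """Apply the wrapper to one page and normalize whitespace in each field."""
    parser = NewsWrapper(fields)
    parser.feed(html)
    parser.close()
    return {name: " ".join(text.split()) for name, text in parser.record.items()}


def deduplicate(records):
    """Keep the first record for each distinct (title, body) content hash."""
    seen, unique = set(), []
    for rec in records:
        content = rec["title"] + "\n" + rec["body"]
        key = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical page layout and wrapper rules for one news site.
FIELDS = {"headline": "title", "article-body": "body"}
PAGE = """
<html><body>
  <div class="ad">Buy now!</div>
  <h1 class="headline">Auto sales rise</h1>
  <div class="article-body"><p>Sales rose in May.</p></div>
</body></html>
"""

record = extract(PAGE, FIELDS)
print(record)
print(len(deduplicate([record, dict(record)])))
```

Because the wrapper only reads the elements named in `FIELDS`, the advertising block outside them never enters the structured record, which is one simple way to realize the noise removal the abstract mentions.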

Predicting Stock Prices Based on Online News Content and Technical Indicators by Combinatorial Analysis Using CNN and LSTM with Self-attention

  • Sang Hyung Jung;Gyo Jung Gu;Dongsung Kim;Jong Woo Kim
    • Asia Pacific Journal of Information Systems
    • /
    • v.30 no.4
    • /
    • pp.719-740
    • /
    • 2020
  • The stock market changes continuously as new information emerges, affecting the judgments of investors. Online news articles have traditionally served as a window through which investors learn about information that affects the stock market. This paper proposes new ways to utilize online news articles together with technical indicators. The suggested hybrid model consists of three models. First, a self-attention-based convolutional neural network (CNN) model, considered to be better at interpreting the semantics of long texts, uses news content as input. Second, a self-attention-based bidirectional long short-term memory (bi-LSTM) neural network model for short texts uses news titles as input. Third, a bi-LSTM model, considered to be better at analyzing context information and time series, uses 19 technical indicators as input. We used news articles from the previous day and technical indicators from the past seven days to predict the share price of the next day. An experiment was performed with Korean stock market data and news articles from 33 top companies over three years. In this experiment, our proposed model performed better than previous approaches, which have mainly focused on news titles. This paper demonstrates that news titles and content should be treated differently for superior stock price prediction.
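
The input alignment described in this abstract (previous-day news, seven days of indicators, next-day price) can be sketched in Python as below. The function name, the list-based data layout, and the exact day indexing are assumptions for illustration; the abstract does not give these details, and the neural models themselves are not shown.

```python
def build_samples(news_by_day, indicators_by_day, prices, window=7):
    """Pair each target day t with news from day t-1 and indicators from days t-window .. t-1."""
    samples = []
    for t in range(window, len(prices)):
        samples.append({
            "news": news_by_day[t - 1],                      # previous day's articles
            "indicators": indicators_by_day[t - window:t],   # past `window` days of indicators
            "target_price": prices[t],                       # price to predict
        })
    return samples


# Hypothetical 10-day series; each day has one headline and two indicator values.
news = [[f"headline for day {d}"] for d in range(10)]
indicators = [[float(d), float(d) * 0.5] for d in range(10)]
prices = [100.0 + d for d in range(10)]

samples = build_samples(news, indicators, prices)
print(len(samples), samples[0]["target_price"])
```

Keeping the news and indicator windows strictly before the target day avoids leaking next-day information into the inputs.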

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.93-107
    • /
    • 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach to user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After gathering unstructured news article data, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed from access patterns derived from web transaction logs. The double two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence and compare these with the clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build a multi-layered two-mode network, after which we compared the issue-clustering results from SAS with those of network analysis. The experimental dataset came from a website ranking site and the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles for 5,000 panelists over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results, obtained with the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), are consistent with the results from SAS clustering. In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support for decision-making in companies because it enriches user-related data with unstructured textual data. 
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
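
The network-merging phase can be illustrated with a small Python sketch. The abstract does not state how the two two-mode networks are combined; multiplying the user-article matrix by the article-issue matrix is one common way to merge networks that share a mode, and it is used here as an assumption. The matrices are hypothetical toy data.

```python
def merge_two_mode(user_article, article_issue):
    """Multiply a users x articles matrix by an articles x issues matrix.

    Entry [u][i] of the result counts the articles read by user u that carry issue i.
    """
    n_issues = len(article_issue[0])
    return [
        [sum(row[a] * article_issue[a][i] for a in range(len(row))) for i in range(n_issues)]
        for row in user_article
    ]


# Hypothetical toy networks: 2 users, 3 articles, 2 issues.
user_article = [
    [1, 1, 0],  # user 0 read articles 0 and 1
    [0, 1, 1],  # user 1 read articles 1 and 2
]
article_issue = [
    [1, 0],     # article 0 is about issue 0
    [1, 0],     # article 1 is about issue 0
    [0, 1],     # article 2 is about issue 1
]

user_issue = merge_two_mode(user_article, article_issue)
print(user_issue)
```

The resulting user-issue matrix plays the role of the quasi-network: its rows can be fed to a clustering step such as PAM, which is not shown here.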

Developing and Evaluating Damage Information Classifier of High Impact Weather by Using News Big Data (재해기상 언론기사 빅데이터를 활용한 피해정보 자동 분류기 개발)

  • Su-Ji, Cho;Ki-Kwang Lee
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.46 no.3
    • /
    • pp.7-14
    • /
    • 2023
  • Recently, the importance of impact-based forecasting has increased as the socio-economic impacts of severe weather have emerged. Because news articles contain unstructured information closely related to people's lives, this study developed and evaluated a binary classification algorithm for snowfall damage information using text mining of media articles. We collected news articles from 2009 to 2021 that contain 'heavy snow' in the body text and labelled whether each article corresponds to specific damage fields such as car accidents. To develop the classifier, we proposed a probability-based classifier based on the ratio of two conditional probabilities, which is defined as the I/O Ratio in this study. During the construction process, we also adopted the n-gram approach to consider the contextual meaning of each keyword. The accuracy of the classifier was 75%, supporting the possibility of applying news big data to impact-based forecasting. We expect the performance of the classifier to improve in further research as more varied training data is accumulated. The result of this study can be readily extended by applying the same methodology to other disasters in the future. Furthermore, the result of this study can reduce the social and economic damage of high impact weather by supporting the establishment of an integrated meteorological decision support system.
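
The abstract says only that the classifier is based on the ratio of two conditional probabilities over keywords and n-grams. The Python sketch below shows one possible reading of that idea, not the paper's definition of the I/O Ratio: it assumes the ratio compares how often a feature appears in damage-related articles versus other articles, uses add-one smoothing, and sums log ratios over unigrams and bigrams. The training sentences are invented for the example.

```python
import math
from collections import Counter


def features(text, n=2):
    """Unigrams plus n-grams of a whitespace-tokenized, lowercased text."""
    tokens = text.lower().split()
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return set(tokens + grams)


def train(docs):
    """docs: list of (text, label); label is True for damage-related articles."""
    inside, outside = Counter(), Counter()
    n_in = n_out = 0
    for text, label in docs:
        if label:
            n_in += 1
            inside.update(features(text))
        else:
            n_out += 1
            outside.update(features(text))
    return inside, outside, n_in, n_out


def log_ratio(feature, model):
    """Log of P(feature | damage) / P(feature | no damage), with add-one smoothing (assumed)."""
    inside, outside, n_in, n_out = model
    p_in = (inside[feature] + 1) / (n_in + 2)
    p_out = (outside[feature] + 1) / (n_out + 2)
    return math.log(p_in / p_out)


def classify(text, model):
    """Label an article as damage-related when the summed log ratios are positive."""
    return sum(log_ratio(f, model) for f in features(text)) > 0


# Hypothetical labelled articles that all mention heavy snow.
training = [
    ("heavy snow caused car accident on highway", True),
    ("car accident reported after heavy snow", True),
    ("heavy snow expected tomorrow morning", False),
    ("heavy snow festival opens this weekend", False),
]
model = train(training)
print(classify("car accident during heavy snow", model))
print(classify("heavy snow festival tomorrow", model))
```

Features that occur equally often in both groups, such as 'heavy snow' here, contribute a log ratio of zero, so the decision rests on the terms that actually separate damage reports from other coverage.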

Effects of Anchors' Reputation and Brand Equity Evaluation of TV News Program on the Continuous Watching Intention : Focusing on KBS, JTBC, YTN TV News (TV 뉴스 프로그램의 앵커 평판과 브랜드 자산 평가가 지속적 시청 의도에 미치는 영향 : KBS, JTBC, YTN 뉴스를 중심으로)

  • Ha, Dong-Keun;Ahn, Seo-Jin
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.9
    • /
    • pp.91-101
    • /
    • 2018
  • This research verified the effects of anchors' reputation and news brand equity evaluation on continuous watching intention for general news channels such as KBS, JTBC, and YTN. Data were collected from 539 adults nationwide who watched news on each channel, and hierarchical regression analysis was conducted to analyze the impact of the anchors' reputation and news brand equity evaluation factors. The results were as follows. First, for KBS, continuous watching intention was higher among viewers who were male, less educated, and more conservative, and when viewing frequency, anchor awareness, news awareness, and news preference were higher. Second, for JTBC, continuous watching intention was higher among more progressive viewers and when viewing frequency, anchor confidence, news awareness, news preference, and evaluation of news quality were higher. Third, for YTN, continuous watching intention was higher when viewing frequency, anchor confidence, anchor attraction, news preference, and evaluation of news quality were higher.

Fake News Detection for Korean News Using Text Mining and Machine Learning Techniques (텍스트 마이닝과 기계 학습을 이용한 국내 가짜뉴스 예측)

  • Yun, Tae-Uk;Ahn, Hyunchul
    • Journal of Information Technology Applications and Management
    • /
    • v.25 no.1
    • /
    • pp.19-32
    • /
    • 2018
  • Fake news is defined as news articles that are intentionally and verifiably false and could mislead readers. The spread of fake news may provoke anxiety, chaos, fear, or irrational decisions among the public. Thus, detecting fake news and preventing its spread has become a very important issue in our society. However, due to the huge amount of fake news produced every day, it is almost impossible for humans to identify it. In this context, researchers have tried to develop automated fake news detection methods using Artificial Intelligence techniques over the past years. Unfortunately, no prior study has proposed an automated fake news detection method for Korean news. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news content to be analyzed is converted into quantified values using various text mining techniques (Topic Modeling, TF-IDF, and so on). In the second step, classifiers are trained using the values produced in step 1. As classifiers, machine learning techniques such as multiple discriminant analysis, case-based reasoning, artificial neural networks, and support vector machines can be applied. To validate the effectiveness of the proposed method, we collected 200 Korean news articles from Seoul National University's FactCheck (http://factcheck.snu.ac.kr), which provides detailed analysis reports from about 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.
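
Step 1 of the method, turning text into quantified values, can be illustrated with a minimal TF-IDF computation in Python. This is a sketch under simplifying assumptions: tokens are split on whitespace (Korean text would normally need morphological analysis), the weighting is plain term frequency times log(N/df), and the example sentences are invented. The classifiers of step 2 are not shown.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return (vocabulary, vectors) with weight = term frequency * log(N / document frequency)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vocab = sorted(doc_freq)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term]) if total else 0.0
            for term in vocab
        ])
    return vocab, vectors


# Hypothetical headlines standing in for news content.
docs = [
    "minister denies secret deal",
    "secret deal confirmed by minister",
    "weather stays mild this week",
]
vocab, vectors = tfidf(docs)
print(vocab)
print([round(w, 3) for w in vectors[0]])
```

Each row of `vectors` is a fixed-length numeric representation of one article, which is the form of input that classifiers such as support vector machines or neural networks expect.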

Dynamics in Election News Making: An Exploratory Study (선거보도의 역동성에 대한 탐색적 연구)

  • Lee, Han Soo
    • Korean Journal of Legislative Studies
    • /
    • v.27 no.3
    • /
    • pp.155-188
    • /
    • 2021
  • This study examines dynamics in election news making. It is important to understand when and how news media produce election news in order to grasp news making and voting behavior. The news media sometimes make election news by focusing on issues and policies. Often they frame elections as a game and focus on election strategies while covering elections. This article argues that, as the election period progresses, the number of policy news stories tends to decrease while the frequency of strategic news stories is likely to increase. Also, TV and newspapers show distinctive patterns of election news making. In order to examine these arguments, this study categorizes election news stories produced during the 2020 Korean congressional elections into policy and strategic news stories and constructs daily time-series data from them. The results of structural break and regression analyses partially support the arguments.