• Title/Summary/Keyword: News Article Classification

Research on Multi-faceted News Article Classification Models Classifying Subjects, Geographies and Genres (심층 주제, 지역, 장르를 모두 분류할 수 있는 다면적 뉴스 기사 자동 분류 모델 연구)

  • Hyojin Lee;SungPil Choi
    • Journal of the Korean Society for Library and Information Science / v.58 no.3 / pp.65-89 / 2024
  • This study developed a model that classifies news articles by topic, genre, and region using a Korean pre-trained language model. To this end, a new news article classification system was designed with reference to the classification systems of domestic media outlets. The topic and genre classification models were implemented as hierarchical classification models that link main categories and subcategories, and their performance was compared with that of an integrated category model. The evaluation showed that the hierarchical model categorized ambiguous or overlapping categories more precisely than the integrated category model. For regional classification, a model was built to classify articles into 18 categories; because regional news articles clearly reflect regional characteristics in their text, this model achieved high performance. The study demonstrates the effectiveness of classifying news articles from multiple perspectives (topic, genre, and region) and suggests the potential for a multi-dimensional news article classification service that meets user needs.
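The hierarchical idea in this abstract, choosing a main category first and then a subcategory only among that category's children, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `TAXONOMY` dictionary, its keywords, and the keyword-count scorer are hypothetical stand-ins, and the study itself used a Korean pre-trained language model rather than keyword matching.

```python
from collections import Counter

# Hypothetical two-level taxonomy with toy English keywords.
# The study's real category system and model are not reproduced here.
TAXONOMY = {
    "politics": {
        "election": ["vote", "candidate", "ballot"],
        "diplomacy": ["summit", "treaty", "ambassador"],
    },
    "economy": {
        "markets": ["stock", "index", "shares"],
        "trade": ["export", "tariff", "shipping"],
    },
}


def score(tokens, keywords):
    """Count how many times any of the keywords occurs in the tokens."""
    counts = Counter(tokens)
    return sum(counts[k] for k in keywords)


def classify_hierarchical(text):
    """Return (main, sub): pick the main category, then a subcategory within it."""
    tokens = text.lower().split()
    # Stage 1: score each main category by the pooled score of its subcategories.
    main = max(
        TAXONOMY,
        key=lambda m: sum(score(tokens, kws) for kws in TAXONOMY[m].values()),
    )
    # Stage 2: only the children of the chosen main category compete.
    sub = max(TAXONOMY[main], key=lambda s: score(tokens, TAXONOMY[main][s]))
    return main, sub


print(classify_hierarchical("Export tariff talks hit shipping shares"))
```

Restricting the second stage to the children of the chosen main category is what separates this from a flat ("integrated") classifier, in which every subcategory would compete at once.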

FAGON: Fake News Detection Model Using Grammatical Transformation on Deep Neural Network

  • Seo, Youngkyung;Han, Seong-Soo;Jeon, You-Boo;Jeong, Chang-Sung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.10 / pp.4958-4970 / 2019
  • As technology advances, the amount of fake news is increasing for various reasons, such as political issues and exaggerated advertising. However, there has been very little research on fake news detection, especially research that uses grammatical transformation on a deep neural network. In this paper, we present a new fake news detection model called FAGON (Fake news detection model using Grammatical transformation On deep Neural network), which efficiently determines whether a proposition is true or false for a given article by learning grammatical transformations on a neural network. In particular, our model focuses on the Korean language. It consists of two modules: a sentence generator and a classification module. The former, trained on grammatical transformation, generates multiple sentences that have the same meaning as the proposition but different grammar. The latter classifies the proposition as true or false by training on vectors generated from each sentence of the article and from the multiple sentences produced by the former module. We show that our model is designed to detect fake news effectively by exploiting various grammatical transformations and a suitable classification structure.

Arabic Stock News Sentiments Using the Bidirectional Encoder Representations from Transformers Model

  • Eman Alasmari;Mohamed Hamdy;Khaled H. Alyoubi;Fahd Saleh Alotaibi
    • International Journal of Computer Science & Network Security / v.24 no.2 / pp.113-123 / 2024
  • Stock market news sentiment analysis (SA) aims to identify the attitudes that stock news on official platforms expresses toward companies' stocks. It supports sound investment decisions and analysts' evaluations. However, research on Arabic SA is limited compared with research on English SA because of the complexity of the Arabic language and its limited corpora. This paper develops a sentiment classification model to predict the polarity of Arabic stock news in microblogs. It also aims to extract the reasons that lead to a polarity categorization, expressed as the main economic causes or aspects based on semantic unity. To this end, the paper presents an Arabic SA approach based on the logistic regression model and the Bidirectional Encoder Representations from Transformers (BERT) model. The proposed model classifies articles as positive, negative, or neutral. It was trained on data collected from an official Saudi stock market article platform, which were then preprocessed and labeled. In addition, the economic reasons behind the articles were investigated by semantic unit and divided into seven economic aspects to highlight the polarity of the articles. The supervised BERT model obtained 88% article classification accuracy based on SA, and the unsupervised mean Word2Vec encoder obtained 80% economic-aspect clustering accuracy. Predicting the polarity of Arabic stock market news and its economic reasons would benefit the stock SA field.

A Study of Housing Environment Problems through the Daily newspapers ( I ) - The Change of a type of the Dong-A daily papers (1920~1990) - (일간지를 통해 본 주거환경문제의 연구 ( I ) - 동아일보 (1920년~1990년) 기사 유형의 변천 -)

  • 신경주
    • Journal of the Korean housing association / v.2 no.2 / pp.41-53 / 1991
  • This study discusses changes in housing environment problems from the early 1900s to the present, with the aim of finding solutions to serious housing environment problems. The documentary research method was used. The content analysis covered articles (N = 1,129) on the housing environment published in the Dong-A Daily from 1920 (its first edition) to December 31, 1990. The study examined changes over time in the total number of articles, the importance of articles (measured by the number of columns), the classification of article subjects, and the number of articles per subject. On the basis of these data, a chronological classification of changes in housing environment problems over 70 years was produced. The overall results are expected to supply accurate information about the housing environment to people, give them the opportunity to take part in protecting the housing environment themselves, and contribute to solving housing environment problems. In future work, I plan an in-depth analysis of article content by subject.

Analysis of Shipping and Logistics News Articles using Topic Modeling (토픽모델링을 활용한 해운물류 뉴스 분석)

  • Hee-Young Yoon;Il-Youp Kwak
    • Korea Trade Review / v.46 no.4 / pp.61-76 / 2021
  • This study focuses on three logistics-related news outlets (Logistics Newspaper, Korea Shipping Gazette, and Korea Shipping Newspaper) in order to present changes in logistics issues, centering on COVID-19, which has recently had the greatest impact worldwide. For data collection, two years of news articles from 2019 and 2020 (title, article, content, date, article classification, article URL) were gathered by web crawling the homepages of these three representative logistics-related media companies, using Python's BeautifulSoup and requests modules. The data were analyzed with basic statistical analysis, Latent Dirichlet Allocation (LDA) for topic modeling, and Scattertext. The results were as follows. First, among the three logistics-related news outlets, the Korea Shipping Newspaper was the most active. Second, topic modeling with LDA identified eight logistics-related topics, and the keywords and significant issues of each topic were presented. Third, the keywords were visualized with Scattertext. This is the first study to present changes in the logistics field by focusing on articles from representative logistics-related media in 2019 and 2020. In particular, 2019 and 2020 can be divided into the periods before and after the outbreak of COVID-19, which has had a great impact not only on the logistics field but also on our lives as a whole. Future work requires a multi-faceted approach, such as comparative studies of logistics issues between countries or implications drawn from long-term time-series articles.

Controversy and Guideline Suggestion Surrounding Fake News in the Digital Media Age (가짜뉴스(Fake News) 현황분석을 통해 본 디지털매체 시대의 쟁점과 뉴스콘텐츠 제작 가이드라인)

  • Kwon, Mahnwoo;Jun, Yong Woo;Im, Hajin
    • Journal of Korea Multimedia Society / v.18 no.11 / pp.1419-1426 / 2015
  • The border between news and advertising is disappearing. Traditional journalism held that the editorial section deals with news and the advertising section handles commercial messages, but this distinction is now meaningless. Today's news consumers do not separate advertising content from non-advertising content. In Korea, the production of fake news or paid news pages is becoming a social problem. Fake news uses various camouflages to pass as real news. This paper descriptively analyzes Korean fake news cases and suggests guidelines for publishing news. We analyzed three major newspaper websites from July to September 2014. These three newspapers publish section pages every day that contain fake or sponsored news. In total, more than one thousand articles were selected for content analysis. We coded the number of fake news items, the day of the week, the rate of sponsored news, the average number of fake news items published per page, the conformity between news and advertising, and the type of fake news. We also coded the number of sponsored news articles in daily sections. Our method compared advertising content with news articles. As a result, 24.8% of news articles were found to have been published for advertising sponsors. Advertorials or fake news were sometimes placed on the same pages on the same day. We coded the conformity between advertising and news content: 60.9% of fake news items matched their sponsors. PR-style fake news was the most common type, and advertising-style fake news the least common.

Feature Weighting for Opinion Classification of Comments on News Articles (뉴스 댓글의 감정 분류를 위한 자질 가중치 설정)

  • Lee, Kong-Joo;Kim, Jae-Hoon;Seo, Hyung-Won;Rhyu, Keel-Soo
    • Journal of Advanced Marine Engineering and Technology / v.34 no.6 / pp.871-879 / 2010
  • In this paper, we present a system that classifies comments on a news article by user opinion, called polarity (positive or negative). The system is a kind of document classification system for comments and is based on machine learning techniques such as the support vector machine. Unlike ordinary documents, comments are attached to a body text that can influence how their opinions are classified into polarities. We propose a feature weighting scheme for opinion classification that uses such characteristics of comments together with several resources. Our experiments show that the weighting scheme is useful for opinion classification of comments on Korean news articles. Korean character n-grams (bigrams or trigrams) also prove helpful for opinion classification of comments that contain many Internet words or typos. In the future, we will apply this scheme to opinion analysis of comments on product reviews as well as news articles.
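As a rough illustration of the character n-gram features mentioned in this abstract, the sketch below extracts character bigrams and trigrams from a comment and counts them. The function names are hypothetical, and the paper's actual feature weighting scheme is not reproduced; the counts would simply be the raw input to a classifier such as a support vector machine.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of the text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_features(comment, sizes=(2, 3)):
    """Count character n-grams of each requested size (bigrams and trigrams by default)."""
    features = Counter()
    for n in sizes:
        features.update(char_ngrams(comment, n))
    return features


# Character n-grams need no word segmentation, so a misspelled or
# slang word still shares most of its n-grams with the standard form.
print(char_ngrams("good", 2))
print(ngram_features("goood").most_common(3))
```

Because Python strings index by code point, the same functions work on precomposed Hangul syllables without modification.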

An Experimental Study on Automatic Summarization of Multiple News Articles (복수의 신문기사 자동요약에 관한 실험적 연구)

  • Kim, Yong-Kwang;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.23 no.1 s.59 / pp.83-98 / 2006
  • This study proposes a template-based method for automatically summarizing multiple news articles using the semantic categories of sentences. First, the semantic categories for the core information to be included in a summary are identified from a training set of documents and their summaries. Then, cue words for each slot of the template are selected for later classification of news sentences into the relevant slots. When a news article is input, its event/accident category is identified, and key sentences are extracted from the article and filled into the relevant slots. The template, filled with simple sentences rather than the original long sentences, is used to generate a summary of the event/accident. In the user evaluation of the generated summaries, essential information extraction achieved a recall of 54.1% and a precision of 58.1%, with a redundancy ratio of 11.6%.

Developing and Evaluating Damage Information Classifier of High Impact Weather by Using News Big Data (재해기상 언론기사 빅데이터를 활용한 피해정보 자동 분류기 개발)

  • Su-Ji, Cho;Ki-Kwang Lee
    • Journal of Korean Society of Industrial and Systems Engineering / v.46 no.3 / pp.7-14 / 2023
  • The importance of impact-based forecasting has recently increased as the socio-economic impacts of severe weather have emerged. Because news articles contain unstructured information closely related to people's lives, this study developed and evaluated a binary classification algorithm for snowfall damage information by text mining media articles. We collected news articles from 2009 to 2021 that contained 'heavy snow' in the body text and labelled whether each article corresponded to specific damage fields, such as car accidents. To develop the classifier, we proposed a probability-based classifier built on the ratio of two conditional probabilities, defined in this study as the I/O Ratio. During construction, we also adopted an n-gram approach to take the contextual meaning of each keyword into account. The accuracy of the classifier was 75%, which supports the possibility of applying news big data to impact-based forecasting. We expect the performance of the classifier to improve in further research as more varied training data accumulate. The results can readily be extended by applying the same methodology to other disasters. Furthermore, they can help reduce the social and economic damage of high-impact weather by supporting the establishment of an integrated meteorological decision support system.
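The abstract does not give the exact definition of its I/O Ratio, so the following is only a sketch of one plausible ratio-of-conditional-probabilities classifier over word n-grams. The smoothing, the log-sum scoring rule, the zero threshold, and all function names are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def word_ngrams(text, n=2):
    """Return overlapping word n-grams (bigrams by default) of the text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train_ratios(docs, labels, n=2):
    """Map each n-gram to log(P(gram | in-class) / P(gram | out-of-class)).

    Probabilities are document frequencies with add-one smoothing
    (an assumption; the paper's estimator is not stated in the abstract).
    """
    in_counts, out_counts = Counter(), Counter()
    n_in = sum(labels)
    n_out = len(labels) - n_in
    for doc, label in zip(docs, labels):
        grams = set(word_ngrams(doc, n))
        (in_counts if label else out_counts).update(grams)
    vocab = set(in_counts) | set(out_counts)
    return {
        g: math.log(
            ((in_counts[g] + 1) / (n_in + 2)) / ((out_counts[g] + 1) / (n_out + 2))
        )
        for g in vocab
    }


def predict(doc, ratios, n=2):
    """Label a document as in-class when its summed log-ratios are positive."""
    total = sum(ratios.get(g, 0.0) for g in set(word_ngrams(doc, n)))
    return total > 0


# Toy training data: 1 = article reports damage, 0 = it does not.
docs = [
    "heavy snow caused car accident on highway",
    "heavy snow car accident injured two",
    "heavy snow festival opens this weekend",
    "heavy snow forecast for ski resort",
]
labels = [1, 1, 0, 0]
ratios = train_ratios(docs, labels)
print(predict("heavy snow led to car accident", ratios))
```

In this toy data the bigram "heavy snow" appears in every article, so its ratio is 1 and it contributes nothing, while "car accident" appears only in damage articles and pushes the score positive. This is the sense in which n-grams add context beyond the single keyword used to collect the articles.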

Prediction of Stock Returns from News Article's Recommended Stocks Using XGBoost and LightGBM Models

  • Yoo-jin Hwang;Seung-yeon Son;Zoon-ky Lee
    • Journal of the Korea Society of Computer and Information / v.29 no.2 / pp.51-59 / 2024
  • This study examines the relationship between the release of news and individual stock returns. Investors use a variety of information sources to maximize stock returns when establishing investment strategies. News companies publish articles based on analysts' stock recommendation reports, which enhances the reliability of the information. Defining the release of a stock-recommendation news article as an event, we examine its economic impact and propose a binary classification model that predicts the stock return 10 days after the event. XGBoost and LightGBM models are applied, achieving accuracies of 75% and 71%, respectively. In addition, after categorizing the recommended stocks by listed market (KOSPI/KOSDAQ) and market capitalization (Big/Small), the study verifies differences in model accuracy across the four sub-datasets. Finally, by conducting SHAP (Shapley Additive exPlanations) analysis, we identify the key variables in each model, reinforcing the interpretability of the models.