• Title/Summary/Keyword: news articles

Semantic Pre-training Methodology for Improving Text Summarization Quality (텍스트 요약 품질 향상을 위한 의미적 사전학습 방법론)

  • Mingyu Jeon;Namgyu Kim
    • Smart Media Journal / v.12 no.5 / pp.17-27 / 2023
  • Automatic text summarization, which condenses a document into only the information meaningful to users, has been studied steadily in recent years. In particular, research has centered on summarization with the Transformer, an artificial neural network model. Among these studies, the GSG method, which trains a model through sentence-level masking, has received the most attention. However, traditional GSG is limited in that it selects the sentences to mask based on the degree of token overlap rather than on the meaning of each sentence. To improve summarization quality, this study therefore proposes SbGSG (Semantic-based GSG), a methodology that selects the sentences to be masked by considering their meaning. In an experiment using 370,000 news articles and 21,600 summaries and reports, the proposed SbGSG outperformed traditional GSG in terms of ROUGE and BERTScore.
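The token-overlap selection that this abstract attributes to traditional GSG can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the whitespace tokenizer, the `ratio` parameter, and the `<mask>` token are all assumptions made for the example. The abstract does not say how SbGSG measures sentence meaning, so the sketch only exposes a `score` parameter where a semantic similarity function could be substituted.

```python
from collections import Counter


def unigram_f1(candidate, reference):
    """ROUGE-1-style F1: clipped unigram overlap between two token lists."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def select_gap_sentences(sentences, ratio=0.3, score=unigram_f1):
    """Return sorted indices of the sentences to mask.

    Each sentence is scored against the rest of the document; the
    top-scoring ones are chosen. `score` is a hypothetical hook: a
    semantic variant would pass a meaning-based function here instead.
    """
    tokenized = [s.lower().split() for s in sentences]
    scores = []
    for i, sent in enumerate(tokenized):
        rest = [tok for j, other in enumerate(tokenized) if j != i for tok in other]
        scores.append(score(sent, rest))
    k = max(1, round(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def mask_document(sentences, indices, mask_token="<mask>"):
    """Build the (masked input, target) pair used for pre-training."""
    masked = [mask_token if i in indices else s for i, s in enumerate(sentences)]
    targets = [sentences[i] for i in indices]
    return " ".join(masked), " ".join(targets)


doc = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "stocks fell sharply today",
]
picked = select_gap_sentences(doc, ratio=0.34)
model_input, target = mask_document(doc, picked)
```

In this toy document the unrelated third sentence shares no tokens with the rest, so it is never selected. A sentence that paraphrases the document with different words would be missed in the same way, which is the limitation the abstract describes.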

Summarization of Korean Dialogues through Dialogue Restructuring (대화문 재구조화를 통한 한국어 대화문 요약)

  • Eun Hee Kim;Myung Jin Lim;Ju Hyun Shin
    • Smart Media Journal / v.12 no.11 / pp.77-85 / 2023
  • Since COVID-19, communication through online platforms has increased, leading to an accumulation of massive amounts of conversational text data. As summarizing this text to extract meaningful information has grown in importance, deep learning-based abstractive summarization has been actively researched. However, compared with structured texts such as news articles, conversational data often contains missing or transformed information, so its unique characteristics must be considered from multiple perspectives. In particular, vocabulary omissions and unrelated expressions in a conversation can hinder effective summarization. Therefore, in this study, we restructured dialogues in light of the characteristics of Korean conversational data, fine-tuned a pre-trained KoBART-based text summarization model, and improved dialogue summarization performance through a refinement step that removes redundant elements from the summary. We combined two restructuring methods: reordering sentences based on the order of utterances, and extracting a central speaker and restructuring the conversation around that speaker. As a result, the ROUGE-1 score improved by about 4 points. This study demonstrates that a restructuring approach that considers the characteristics of dialogue is effective in enhancing Korean dialogue summarization performance.

Exploring the phenomenon of veganphobia in vegan food and vegan fashion (비건 음식과 비건 패션에서 나타난 비건포비아 현상에 대한 탐구)

  • Yeong-Hyeon Choi;Sangyung Lee
    • The Research Journal of the Costume Culture / v.32 no.3 / pp.381-397 / 2024
  • This study investigates the negative perceptions (veganphobia) held by consumers toward vegan diets and fashion and aims to foster a genuine acceptance of ethical veganism in consumption. The textual data were web-crawled from Korean online posts, including news articles, blogs, forums, and tweets, containing keywords such as "contradiction," "dilemma," "conflict," "issues," "vegan food," and "vegan fashion" from 2013 to 2021. Data analysis was conducted through text mining, network analysis, and clustering analysis using Python and NodeXL. The analysis revealed distinct negative perceptions regarding vegan food. Key issues included the perception of hypocrisy among vegetarians, associations with specific political leanings, conflicts between environmental and animal rights, and contradictions between views on companion animals and livestock. Regarding the vegan fashion industry, the eco-friendliness of material selection and design processes was seen as the pivotal factor shaping negative attitudes. Furthermore, the study identified a shared negative perception regarding vegan food and vegan fashion. This negativity was characterized by confusion and conflicts between animal and environmental rights, biased perceptions linked to specific political affiliations, perceived self-righteousness among vegetarians, and general discomfort toward them. These factors collectively contributed to a broader negative perception of vegan consumption. In conclusion, this study is significant in understanding the complex perceptions and attitudes that consumers hold toward vegan food and fashion. The insights gained from this research can aid in the design of more effective campaign strategies aimed at promoting vegan consumerism, ultimately contributing to a more widespread acceptance of ethical veganism in society.

An Analysis of Fishing Village Tourism Issues Reported in Korea Media (국내 언론에 보도된 어촌관광 이슈의 변동 분석)

  • Ji-Yeong Ko;Chae-wan Lee
    • Journal of the Korean Society of Marine Environment & Safety / v.30 no.4 / pp.299-307 / 2024
  • Fishing villages, the focus of this study, are turning to fishing village tourism to create a new income base and sustain their communities. This is because the extraordinary nature of the fishing village space creates new value in line with the function of tourism; however, the related policies are inadequate relative to the importance of fishing village tourism. Therefore, this study aims to analyze Korean society's interest in fishing village tourism and the manner in which this issue has changed over time. Using the news analysis system BigKinds, we systematically collected and analyzed articles on fishing village tourism reported in the domestic media. The results showed that social interest in fishing village tourism and government policy support increased over time, suggesting that fishing village tourism is an important strategy that could revitalize local economies and prevent the disappearance of fishing villages.

Investigating Dynamic Mutation Process of Issues Using Unstructured Text Analysis (비정형 텍스트 분석을 활용한 이슈의 동적 변이과정 고찰)

  • Lim, Myungsu;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.1-18 / 2016
  • Owing to the extensive use of Web media and the development of the IT industry, a large amount of data has been generated, shared, and stored. Nowadays, various types of unstructured data such as image, sound, video, and text are distributed through Web media. Therefore, many attempts have been made in recent years to discover new value through an analysis of these unstructured data. Among these types of unstructured data, text is recognized as the most representative means for users to express and share their opinions on the Web. In this sense, demand for obtaining new insights through text analysis is steadily increasing. Accordingly, text mining is increasingly being used for different purposes in various fields. In particular, issue tracking is being widely studied not only in the academic world but also in industry because it can be used to extract various issues from text such as news and SNS (Social Network Services) and to analyze the trends of these issues. Conventionally, issue tracking is used to identify major issues sustained over a long period of time through topic modeling and to analyze the detailed distribution of documents involved in each issue. However, because conventional issue tracking assumes that the content composing each issue does not change throughout the entire tracking period, it cannot represent the dynamic mutation process of detailed issues that can be created, merged, divided, and deleted between these periods. Moreover, because only keywords that appear consistently throughout the entire period can be derived as issue keywords, concrete issue keywords such as "nuclear test" and "separated families" may be concealed by more general issue keywords such as "North Korea" in an analysis over a long period of time. This implies that many meaningful but short-lived issues cannot be discovered by conventional issue tracking.
Note that detailed keywords are preferable to general keywords because the former can be clues for providing actionable strategies. To overcome these limitations, we performed an independent analysis on the documents of each detailed period. We generated an issue flow diagram based on the similarity of each issue between two consecutive periods. The issue transition pattern among categories was analyzed by using the category information of each document. In this study, we then applied the proposed methodology to a real case of 53,739 news articles. We derived an issue flow diagram from the articles. We then proposed the following useful application scenarios for the issue flow diagram presented in the experiment section. First, we can identify an issue that actively appears during a certain period and promptly disappears in the next period. Second, the preceding and following issues of a particular issue can be easily discovered from the issue flow diagram. This implies that our methodology can be used to discover the association between inter-period issues. Finally, an interesting pattern of one-way and two-way transitions was discovered by analyzing the transition patterns of issues through category analysis. Thus, we discovered that a pair of mutually similar categories induces two-way transitions. In contrast, one-way transitions can be recognized as an indicator that issues in a certain category tend to be influenced by other issues in another category. For practical application of the proposed methodology, high-quality word and stop word dictionaries need to be constructed. In addition, not only the number of documents but also additional meta-information such as the read counts, written time, and comments of documents should be analyzed. A rigorous performance evaluation or validation of the proposed methodology should be performed in future works.
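The idea of linking issues across two consecutive periods by similarity can be sketched as below. This is a hedged illustration only: the abstract does not state which similarity measure or threshold the authors used, so cosine similarity over keyword weights, the `threshold` value, and the function names `issue_flow` and `short_lived` are assumptions, and the keyword weights are made-up toy data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two {keyword: weight} dictionaries."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def issue_flow(periods, threshold=0.3):
    """Link issues in period t to similar issues in period t + 1.

    `periods` is a list (one entry per period) of
    {issue_name: {keyword: weight}} dictionaries, e.g. topics obtained
    by analyzing each period's documents independently.
    """
    edges = []
    for t in range(len(periods) - 1):
        for src, src_kw in periods[t].items():
            for dst, dst_kw in periods[t + 1].items():
                sim = cosine(src_kw, dst_kw)
                if sim >= threshold:
                    edges.append((t, src, dst, round(sim, 3)))
    return edges


def short_lived(periods, edges):
    """Issues that have no successor in the following period."""
    linked = {(t, src) for t, src, _, _ in edges}
    return [
        (t, name)
        for t in range(len(periods) - 1)
        for name in periods[t]
        if (t, name) not in linked
    ]


# Toy data: two periods with hypothetical issues and keyword weights.
periods = [
    {"A": {"nuclear": 1.0, "test": 1.0}, "B": {"family": 1.0, "reunion": 1.0}},
    {"C": {"nuclear": 1.0, "sanction": 1.0}, "D": {"economy": 1.0}},
]
edges = issue_flow(periods)
vanished = short_lived(periods, edges)
```

Here issue A flows into issue C because they share a keyword, while issue B has no successor, which corresponds to the first application scenario (an issue that appears in one period and disappears in the next).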

A Qualitative Study on the Forces that Influence the Article Production of Local Newspapers Focus on the Article Production of Gwangjudream (지역신문 기사생산에 영향을 미치는 요인에 대한 질적 연구 "광주드림" 기사생산을 중심으로)

  • Her, Jin-Ah;Lee, Oh-Hyeon
    • Korean journal of communication and information / v.46 / pp.449-484 / 2009
  • Gwangjudream, although a free newspaper, is said to fulfill the role a local press should play at a time when other local papers do not. This study aims to reveal the forces that influence the article production of Gwangjudream, and to examine the interrelations among them, using participant observation and in-depth interviews. In doing so, it ultimately seeks to provide a deeper understanding of the present circumstances and problems of local papers and an opportunity to consider concrete ways to improve them. The study reveals five forces that primarily influence the article production of Gwangjudream: 1) as a historical force, keeping the spirit of the first publication, which looks forward to playing the role a local press should play; 2) as an individual force, the habitus of its members, which is critical of mainstream society and culture; 3) as an organizational force, a non-hierarchical culture and the independence of editorial rights; 4) as a habitual force, the rejection of the beat system; and 5) as an economic force, the power of sponsors, financial weakness, and the competition to attract subscribers. The historical and individual forces serve as fundamental conditions, and the organizational and habitual forces as practical conditions, for producing articles; together they foster articles that relate to citizens' everyday lives, reflect locality, and criticize and monitor the government and other public offices. However, the economic force creates conditions that weaken these characteristics of Gwangjudream. The results of this study call into question the perspective that attributes local newspapers' failure to play their proper role as a local press mainly to their economic weakness.

Corona Blue and Leisure Activities : Focusing on Korean Case (코로나 블루와 여가 활동 : 한국 사례를 중심으로)

  • Sa, Hye Ji;Lee, Won Sang;Lee, Bong Gyou
    • Journal of Internet Computing and Services / v.22 no.2 / pp.109-121 / 2021
  • As the global COVID-19 pandemic is prolonged, the Corona Blue phenomenon, a term that combines COVID-19 and "blue," is intensifying. The purpose of this study is to analyze the current trend of Corona Blue in consideration of the possibility of increasing mental illness and the need for countermeasures, especially after COVID-19. This study examined the relationship between stress and leisure activities before and after COVID-19 using two research methods: topic modeling of Corona Blue news articles, and a questionnaire on stress and the help obtained from leisure activities. First, a total of 363 news articles were analyzed through topic modeling based on newspaper articles from January 2020, when COVID-19 was upgraded to the "border" stage, until September, when social distancing was strengthened to stage 2.5 in Korea. As a result, a total of 28 topics were extracted, and similar topics were grouped into 7 groups: mental-demic, generational spread, causes of depression acceleration, increased fatigue, attitude to coping with long-term wars, changes in consumption, and efforts to overcome depression. Second, the SPSS statistical program was used to analyze the level of stress change according to leisure activities before/after COVID-19 and the main help according to leisure activities. As a result, it was confirmed that the average difference in stress reduction according to participation in leisure activities before COVID-19 was larger than after COVID-19. Also, leisure activities were found to be effective in stress relief even after COVID-19. In addition, whereas the main help from leisure activities before COVID-19 was relaxation and recharging through physical and social activities, after COVID-19 psychological roles such as a change of mood through nature, outdoor activities, or intellectual activities were found to play a large part.
As such, in this study, it was confirmed that understanding the current status of Corona Blue and coping with leisure in extreme stress situations has a positive effect. It is expected that this research can serve as a basis for preparing realistic and desirable leisure policies and countermeasures to overcome Corona Blue.

Analysis of Social Trends for Electric Scooters Using Dynamic Topic Modeling and Sentiment Analysis (동적 토픽 모델링과 감성 분석을 활용한 전동킥보드에 대한 사회적 동향 분석)

  • Kyoungok, Kim;Yerang, Shin
    • KIPS Transactions on Software and Data Engineering / v.12 no.1 / pp.19-30 / 2023
  • The electric scooter (e-scooter), a popular micro-mobility vehicle, has seen rapidly increasing use in many cities. In South Korea, the use of e-scooters has greatly increased since companies launched e-scooter sharing services in a few large cities, starting with Seoul in 2018. However, the use of e-scooters is still controversial because of issues such as parking and safety. Since perceptions of a means of transportation affect mode choice, it is necessary to track trends concerning electric scooters in order to promote their use. Hence, this study aimed to analyze the trends related to e-scooters. For this purpose, we analyzed news articles related to e-scooters published from 2014 to 2020, using dynamic topic modeling to extract issues and sentiment analysis to investigate how the degree of positive and negative opinion in news articles had changed. Topic modeling extracted three topics, namely micro-mobility technologies, shared e-scooter services, and regulations for micro-mobility, and the proportion of the regulation topic increased as shared e-scooter services expanded in recent years. In addition, the top positive words included quick, enjoyable, and easy, whereas the top negative words included threat, complaint, and illegal, which implies that people are satisfied with the convenience of e-scooters or e-scooter sharing services, but that safety and parking issues should be addressed for micro-mobility services to become more active. In conclusion, this study identified how issues and social trends related to e-scooters have changed and determined the issues that need to be addressed. Moreover, the research framework using dynamic topic modeling and sentiment analysis is expected to be helpful in determining social trends in various areas.

Mapping Categories of Heterogeneous Sources Using Text Analytics (텍스트 분석을 통한 이종 매체 카테고리 다중 매핑 방법론)

  • Kim, Dasom;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.193-215 / 2016
  • In recent years, the proliferation of diverse social networking services has led users to use many mediums simultaneously depending on their individual purpose and taste. Besides, while collecting information about particular themes, they usually employ various mediums such as social networking services, Internet news, and blogs. However, in terms of management, each document circulated through diverse mediums is placed in different categories on the basis of each source's policy and standards, hindering any attempt to conduct research on a specific category across different kinds of sources. For example, documents containing content on "Application for a foreign travel" can be classified into "Information Technology," "Travel," or "Life and Culture" according to the peculiar standard of each source. Likewise, with different viewpoints of definition and levels of specification for each source, similar categories can be named and structured differently in accordance with each source. To overcome these limitations, this study proposes a plan for conducting category mapping between different sources with various mediums while maintaining the existing category system of the medium as it is. Specifically, by re-classifying individual documents from the viewpoint of diverse sources and storing the result of such a classification as extra attributes, this study proposes a logical layer by which users can search for a specific document from multiple heterogeneous sources with different category names as if they belong to the same source. Besides, by collecting 6,000 articles of news from two Internet news portals, experiments were conducted to compare accuracy among sources, supervised learning and semi-supervised learning, and homogeneous and heterogeneous learning data. 
It is particularly interesting that in some categories, classifying accuracy of semi-supervised learning using heterogeneous learning data proved to be higher than that of supervised learning and semi-supervised learning, which used homogeneous learning data. This study has the following significances. First, it proposes a logical plan for establishing a system to integrate and manage all the heterogeneous mediums in different classifying systems while maintaining the existing physical classifying system as it is. This study's results particularly exhibit very different classifying accuracies in accordance with the heterogeneity of learning data; this is expected to spur further studies for enhancing the performance of the proposed methodology through the analysis of characteristics by category. In addition, with an increasing demand for search, collection, and analysis of documents from diverse mediums, the scope of the Internet search is not restricted to one medium. However, since each medium has a different categorical structure and name, it is actually very difficult to search for a specific category insofar as encompassing heterogeneous mediums. The proposed methodology is also significant for presenting a plan that enquires into all the documents regarding the standards of the relevant sites' categorical classification when the users select the desired site, while maintaining the existing site's characteristics and structure as it is. This study's proposed methodology needs to be further complemented in the following aspects. First, though only an indirect comparison and evaluation was made on the performance of this proposed methodology, future studies would need to conduct more direct tests on its accuracy. That is, after re-classifying documents of the object source on the basis of the categorical system of the existing source, the extent to which the classification was accurate needs to be verified through evaluation by actual users. 
In addition, the accuracy in classification needs to be increased by making the methodology more sophisticated. Furthermore, an understanding is required that the characteristics of some categories that showed a rather higher classifying accuracy of heterogeneous semi-supervised learning than that of supervised learning might assist in obtaining heterogeneous documents from diverse mediums and seeking plans that enhance the accuracy of document classification through its usage.
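The logical layer described in this abstract, in which documents from one source are re-classified under another source's category system and the result is stored as an extra attribute, might look roughly like the sketch below. The tiny naive Bayes classifier is a stand-in chosen for brevity, not the supervised or semi-supervised learners the study evaluated, and the class name, the `category_in_a` attribute, and the sample documents are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Minimal multinomial naive Bayes with add-one smoothing."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(docs, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Source A: documents labeled with A's own categories (toy data).
source_a = [
    ("smartphone app software update", "IT"),
    ("laptop chip software release", "IT"),
    ("flights hotel beach tour", "Travel"),
    ("hotel tour visa flights", "Travel"),
]
model = TinyNaiveBayes().fit(
    [text for text, _ in source_a], [label for _, label in source_a]
)

# Source B keeps its own category; A's view is stored as an extra attribute.
source_b = [{"text": "cheap flights and hotel deals", "category": "Life and Culture"}]
for doc in source_b:
    doc["category_in_a"] = model.predict(doc["text"])
```

Because the original `category` field is left untouched, each source's existing classification system is preserved, and a query over A's category names can still retrieve documents that originated in B.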

Korea National College of Agriculture and Fisheries in Naver News by Web Crolling : Based on Keyword Analysis and Semantic Network Analysis (웹 크롤링에 의한 네이버 뉴스에서의 한국농수산대학 - 키워드 분석과 의미연결망분석 -)

  • Joo, J.S.;Lee, S.Y.;Kim, S.H.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research / v.23 no.2 / pp.71-86 / 2021
  • This study was conducted to find information on the university's image from words related to 'Korea National College of Agriculture and Fisheries (KNCAF)' in Naver News. For this purpose, word frequency analysis, TF-IDF evaluation, and semantic network analysis were performed on articles collected by web crawling. In the word frequency analysis, 'agriculture', 'education', 'support', 'farmer', 'youth', 'university', 'business', 'rural', and 'CEO' were important words. In the TF-IDF evaluation, the key words were 'farmer', 'drone', 'agricultural and livestock food department', 'Jeonbuk', 'young farmer', 'agriculture', 'Chonju', 'university', 'device', and 'spreading'. In the semantic network analysis, the bigrams showed high correlations in the order of 'youth' - 'farmer', 'digital' - 'agriculture', 'farming' - 'settlement', 'agriculture' - 'rural', and 'digital' - 'turnover'. When the importance of keywords was evaluated with five centrality indices, 'agriculture' ranked first. The keywords in second place were 'farmers' (Cc, Cb), 'education' (Cd, Cp), and 'future' (Ce). Spearman's rank correlation coefficient between centrality indices showed that the rankings were most similar between degree centrality and PageRank centrality. In terms of word frequency, the important words in the KNCAF articles of Naver News were 'agriculture', 'education', 'support', 'farmer', and 'youth'. However, in the evaluation that includes document frequency, words such as 'farmer', 'drone', 'Ministry of Agriculture, Food and Rural Affairs', 'Jeonbuk', and 'young farmers' were found to be key words. For the centrality analysis considering network connectivity between words, Cd and Cp were suitable for evaluation. The words with strong centrality were 'agriculture', 'education', 'future', 'farmer', 'digital', 'support', and 'utilization'.
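The contrast drawn here between raw word frequency and TF-IDF, together with the bigram counts that feed a semantic network, can be illustrated with a short sketch. The weighting formula (total term frequency times log(N / document frequency)) is one common TF-IDF variant and is an assumption, since the abstract does not state which variant was used; the tokenized documents are invented toy data.

```python
import math
from collections import Counter


def word_frequency(docs):
    """Total occurrences of each word across all documents."""
    return Counter(tok for doc in docs for tok in doc)


def tf_idf(docs):
    """Corpus-level TF-IDF: total term frequency times log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    term_freq = word_frequency(docs)
    return {w: term_freq[w] * math.log(n_docs / doc_freq[w]) for w in term_freq}


def bigram_counts(docs):
    """Counts of adjacent word pairs, usable as edge weights in a word network."""
    pairs = Counter()
    for doc in docs:
        pairs.update(zip(doc, doc[1:]))
    return pairs


# Toy tokenized articles (hypothetical).
docs = [
    ["youth", "farmer", "education"],
    ["youth", "farmer", "drone"],
    ["agriculture", "education", "youth"],
]
freq = word_frequency(docs)
scores = tf_idf(docs)
pairs = bigram_counts(docs)
```

In this toy corpus 'youth' is the most frequent word but receives a TF-IDF score of zero because it appears in every document, whereas the rarer 'drone' scores higher. This mirrors how the frequency and TF-IDF keyword lists in the abstract differ.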