• Title/Summary/Keyword: news article (뉴스기사)

A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami;Kim, Jaeseok;Kim, Gi-Nam;Heo, Jong-Uk;On, Byung-Won;Kang, Mijung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.1-23 / 2013
  • To discover significant social issues that a modern society urgently needs to solve, such as unemployment, economic crisis, and social welfare, researchers in the existing approach usually collect opinions from experts and scholars through online or offline surveys. However, this method is not always effective. Because of cost, a large number of survey replies is seldom gathered, and in some cases it is hard to find experts who deal with a specific social issue. Thus, the sample set is often small and may be biased. Furthermore, several experts may reach totally different conclusions about the same social issue, because each expert has a subjective point of view and a different background. In this case, it is considerably hard to figure out what the current social issues are and which of them are really important. To overcome the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects keywords representing social issues and problems from about 1.3 million news articles published by about 10 major domestic news outlets in Korea from June 2009 until July 2012. Our proposed system consists of (1) collecting news articles and extracting their text, (2) identifying only the news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of our proposed matching algorithm is to best match paragraphs to each topic. 
Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which has relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 as "Unemployment Problem". In this example, it is non-trivial to understand what happened regarding the unemployment problem in our society. In other words, by looking only at social keywords, we have no idea of the detailed events occurring in our society. To tackle this matter, we develop a matching algorithm that computes the probability value of a paragraph given a topic, relying on (i) topic terms and (ii) their probability values. For instance, given a set of text documents, we segment each text document into paragraphs. In the meantime, using LDA, we extract a set of topics from the text documents. Based on our matching process, each paragraph is assigned to the topic it best matches. As a result, each topic has several best-matched paragraphs. Furthermore, suppose there is a topic (e.g., Unemployment Problem) and its best-matched paragraph (e.g., Up to 300 workers lost their jobs in XXX company at Seoul). In this case, we can grasp the detailed information behind the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, most researchers will be able to detect social issues easily and quickly. 
Through this prototype system, we have detected various social issues appearing in our society, and our experimental results show the effectiveness of the proposed methods. Note that our proof-of-concept system is also available at http://dslab.snu.ac.kr/demo.html.
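
The paragraph-to-topic matching step described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the authors' formula, so the scoring rule here (the average topic-term probability over a paragraph's tokens) is only an assumption chosen for simplicity. The function names `score_paragraph` and `best_topic` and the second topic, "Social Welfare", are hypothetical; Topic1 and its label come from the abstract's own example.

```python
def score_paragraph(tokens, topic):
    """Average probability of the topic's terms over the paragraph's tokens.

    `topic` maps each topic term to its probability. Tokens that are not
    topic terms contribute 0. This rule is an illustrative assumption,
    not the matching algorithm from the paper.
    """
    if not tokens:
        return 0.0
    return sum(topic.get(token, 0.0) for token in tokens) / len(tokens)


def best_topic(tokens, topics):
    """Return the label of the topic with the highest score for the paragraph."""
    return max(topics, key=lambda label: score_paragraph(tokens, topics[label]))


# Topic1 is taken from the abstract; "Social Welfare" is a made-up second topic.
topics = {
    "Unemployment Problem": {"unemployment": 0.4, "layoff": 0.3, "business": 0.3},
    "Social Welfare": {"welfare": 0.5, "pension": 0.3, "care": 0.2},
}

paragraph = "up to 300 workers faced layoff and unemployment in seoul".split()
print(best_topic(paragraph, topics))  # prints "Unemployment Problem"
```

In practice the topic-term probabilities would come from a fitted LDA model, and Korean text would first go through the morphological analysis mentioned in step (3).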

Go to Vietnam and You Will See the Truth About Coffee (베트남에 가면 커피의 진실이 보인다)

  • Korea Vending Machine Manufacturers Association
    • Vending industry / v.7 no.1 s.19 / pp.88-89 / 2007
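  • Vietnam's recent economic growth has been truly dazzling. Average annual growth of 7.6% has continued since 2000, recalling Korea's own period of rapid growth. Vietnam, which became the 150th member of the WTO in January 2007, shows a growth pattern that appears to benchmark China's success. With abundant capital and labor as its resources, the fast-growing Vietnamese economy has nearly limitless potential. In such a market, the vending machine sector also deserves attention. The unchecked growth of the Vietnamese economy suggests that the vending machine industry has high latent growth potential as well. In particular, given that more than 60% of Vietnam's economically active population is in their thirties or younger, a vending machine culture could form and develop quickly. Moreover, Vietnam is the world's second-largest coffee exporter. It is the country from which Korea imports the most instant coffee, and most of the raw material used in vending machine coffee is Vietnamese coffee. The Vietnamese coffee market is characterized by production and export on such a scale that it can be said to dominate the low-priced coffee market. With so much production, coffee drinking has become an everyday habit within the country, and people drink coffee almost as they drink water. Seen this way, there is also ample room for the coffee vending machine market to expand. To properly gauge the potential of Vietnam's vending machine market, it is important to understand the state and culture of Vietnam's coffee industry. The article "Go to Vietnam and You Will See the Truth About Coffee" (베트남에 가면 커피의 진실이 보인다), provided by Korea Policy Briefing (국정브리핑) of the Government Information Agency (국정홍보처), is of great help in meeting this need. We thank the agency's Policy News Team for actively cooperating in providing the material, and we reprint the full text of the article.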

Keyword trends analysis related to the aviation industry during the Covid-19 period using text mining (텍스트마이닝을 활용한 Covid-19 기간 동안의 항공산업 관련 키워드 트렌드 분석)

  • Choi, Donghyun;Song, Bomi;Park, Dahyeon;Lee, Sungwoo
    • Journal of Korea Society of Industrial Information Systems / v.27 no.2 / pp.115-128 / 2022
  • The purpose of this study is to conduct a keyword trend analysis, using news article data, of the impact of Covid-19 on the aviation industry. Related articles were extracted around the keyword "Airline" for the six-month periods before and after the Covid-19 outbreak, and topic modeling (LDA) was then performed. In this way, the main topics that emerge during an epidemic such as Covid-19 were extracted. The results are expected to serve as primary data for predicting the impact on the aviation industry when an event like Covid-19 occurs.

Factors Affecting Behavioral Intention to Use the Integrated Platform of Newspaper (신문사 통합형 플랫폼 사용의도에 영향을 미치는 요인)

  • Kim, Daewon;Kim, Min Sung;Kim, Seongcheol
    • The Journal of the Korea Contents Association / v.15 no.4 / pp.122-138 / 2015
  • This research examines factors affecting behavioral intention to use an integrated platform (IP), one of the business models for paid digital content, based on the extended Unified Theory of Acceptance and Use of Technology (UTAUT2). IP is a service designed to charge for digital content. It combines a digitized edition of the newspaper, premium or behind-the-scenes news, and additional information. The results show that effort expectancy, social influence, and hedonic motivation have a positive relationship with behavioral intention to use IP (BI). Satisfaction with the newspaper shows a negative relationship with BI, which means that the newspaper and IP can substitute for each other. Newspaper subscribers tend to have higher BI than non-subscribers. Performance expectancy, gender, age, and satisfaction with the website do not have a significant effect on BI.

A Content Analysis of International News in the North Korean Newspapers: The Relationships Between Foreign Policy of the Labor Party and Propaganda of Foreign Affairs (북한신문의 국제뉴스 내용분석: 당 대외정책과 국제정세선전의 관계를 중심으로)

  • Kim, Kyung-Mo
    • Korean journal of communication and information / v.31 / pp.9-49 / 2005
  • This research analyzes the content of international news in North Korean newspapers to examine the relationship between the foreign policy of the Labor Party and propaganda on foreign affairs. The results show that international news in the Rodong shinmun and Minjoo Chosun, which serve as ideological apparatuses, reflects recent changes in the foreign policy of the Kim Jung-Il regime and aims at successful control and mobilization of the people. However, no difference in coverage between the two newspapers was found, which means that state control of the press is routinized. The relationship between political power and the press in North Korea is discussed from the perspective of international communication.

Wrapper-based Economy Data Collection System Design And Implementation (래퍼 기반 경제 데이터 수집 시스템 설계 및 구현)

  • Piao, Zhegao;Gu, Yeong Hyeon;Yoo, Seong Joon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.05a / pp.227-230 / 2015
  • To analyze and predict economic trends, it is necessary to collect specific economic news and stock data. A typical Web crawler analyzes page content, collects documents, and extracts URLs automatically. There are also crawlers that collect only documents on a particular topic. To collect economic news from a particular Web site, we need a crawler that directly analyzes the site's structure and gathers data from it; that is, a wrapper-based web crawler is required. In this paper, we design a crawler wrapper for a big-data-based economic news analysis system and implement it to collect data. With the wrapper-based crawler, we collect stock data and sales data from the USA auto market since 2000, as well as economic news data from the USA and South Korea. The crawler determines how frequently the data on each site is updated and collects updates periodically. We remove duplicate data and build a structured data set for subsequent analysis, first removing noise such as advertising and public relations material.

Investigation and Analysis of Dark Patterns in Advertisements of News Websites (뉴스 사이트별 다크패턴(Dark Patterns) 광고 실태조사 및 분석)

  • Jun-Young Han;Sang-Jun Yeon;Jun-Hyoung Oh
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.3 / pp.515-525 / 2024
  • Dark patterns are intentionally deceptive design techniques that online service providers use to hide necessary information, preventing users from taking desired actions or luring them into unintended behaviors. In this study, we analyzed the prevalence of dark patterns such as banners, advertorials, pop-ups, and video ads, and their impact on users, across the top 200 news websites worldwide. The analysis found minimal correlation between banner ads and user bounce rates or unique visitors. By region, the moving banner on the main screen and the moving banner on the headline news screen were most frequently observed in South America, while the fixed banner on the headline news screen was most commonly observed in Asia. All other categories were predominantly observed in Europe, making European websites the most diverse and abundant in dark patterns.

An Analysis of News Media Coverage of the QRcode: Based on 2008-2023 News Big Data (QR코드에 대한 언론 보도 경향: 2008-2023년 뉴스 빅데이터 분석)

  • Sunjeong Kim;Jisu Lee
    • Journal of the Korean Society for Information Management / v.41 no.2 / pp.269-294 / 2024
  • This study analyzed news media coverage of QRcodes in Korea over a 16-year period (2008 to 2023). A total of 13,335 articles were extracted from the Korea Press Foundation's BigKinds. A quantitative analysis and a content analysis of the news frames were conducted. The results indicated that the quantity of news coverage has increased. The greatest quantity of news coverage was observed in 2020, and the most frequently discussed topic in the news was 'IT_Science'. The keyword analysis indicated that the primary words were 'QRcode', 'smartphone', 'service', 'application', and 'payment'. The news media primarily focused on the QRcode's ability to provide instant access and on its recognition technology. This study demonstrates that advanced information and communication technologies and the increased prevalence of mobile devices have led to a rise in the use of QRcodes. Furthermore, QRcodes have become a significant information medium in contemporary society.

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.123-138 / 2017
  • Since the stock market is driven by the expectations of traders, studies have been conducted to predict stock price movements by analyzing various sources of text data. Research has addressed not only the relationship between text data and fluctuations in stock prices, but also the trading of stocks based on news articles and social media responses. Studies that predict stock price movements have also applied classification algorithms to a term-document matrix, in the same way as other text mining approaches. Because a document contains many words, it is better to select the words that contribute more when building a term-document matrix. Based on word frequency, words with too low a frequency or importance are removed. Words are also selected according to their contribution, by measuring the degree to which a word contributes to correctly classifying a document. The basic idea of constructing a term-document matrix has been to collect all the documents to be analyzed and to select and use the words that influence the classification. In this study, we analyze the documents for each individual stock and select the words that are irrelevant to all categories as neutral words. We extract the words around each selected neutral word and use them to generate the term-document matrix. The underlying idea is that stock movement is less related to the presence of the neutral words themselves, and that the words surrounding a neutral word are more likely to affect stock price movements. The generated term-document matrix is then applied to an algorithm that classifies stock price fluctuations. In this study, we first removed stop words and selected neutral words for each stock. From the selected words, we then excluded words that also appear in news articles for other stocks. 
Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used the first three months of news articles as training data and applied the remaining month of news articles to the model to predict the stock price movements of the next day. We used SVM, Boosting, and Random Forest to build models and predict stock price movements. The stock market was open for a total of 80 days during the four months (2016/02/01 ~ 2016/05/31); the initial 60 days were used as the training set and the remaining 20 days as the test set. The neutral-word-based algorithm proposed in this study showed better classification performance than the word selection method based on sparsity. In summary, this study predicted stock price movements by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with that of the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also news on other stocks to determine the words to extract. In other words, it removed not only the words that appeared in both the rise and fall categories but also the words that commonly appeared in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. The limitations of this study are that stock price prediction was set up only as a classification of rise and fall, and that the experiment was conducted only on the top ten stocks. The 10 stocks used in the experiment do not represent the entire stock market. In addition, it is difficult to show investment performance, because stock price fluctuation and profit rate may differ. 
Therefore, further research should use more stocks and predict returns through trading simulation.
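
The step of extracting the words around each neutral word can be illustrated with a minimal Python sketch. The abstract does not state how many surrounding words are kept, so the `window` parameter and its default of 2 are assumptions, and the function name `context_words` and the example sentence are hypothetical. The neutral words themselves are dropped, following the abstract's point that they carry little signal about price movement.

```python
def context_words(tokens, neutral_words, window=2):
    """Collect the words within `window` positions of each neutral word.

    The neutral words are excluded from the result. The window size is
    an illustrative assumption, not a value reported in the paper.
    """
    picked = []
    for i, token in enumerate(tokens):
        if token in neutral_words:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            picked.extend(t for t in tokens[start:end] if t not in neutral_words)
    return picked


tokens = "the company announced record profit this quarter".split()
print(context_words(tokens, {"announced"}))
# prints ['the', 'company', 'record', 'profit']
```

The collected context words would then form the vocabulary of the term-document matrix that is passed to a classifier such as SVM or Random Forest.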

Trends of South Korea's Informatization and Libraries' Role Based on Newspaper Big Data (신문 빅데이터를 바탕으로 본 국내 정보화의 경향과 도서관의 역할)

  • Na, Kyoungsik;Lee, Jisu
    • The Journal of the Korea Contents Association / v.18 no.9 / pp.14-33 / 2018
  • The purpose of this study is to analyze informatization trends in Korea using objective newspaper data on informatization and libraries for the period from 1998 to 2017 from four newspapers: KyoungHyang Newspaper, Kookmin Ilbo, Hankyoreh and Hankookilbo. Based on an analysis of metadata and related words using BIGKinds, a news big data system, this study presents simple frequency and classification analyses of the keywords 'information', 'informatization' and 'library'. Based on the results, we analyzed the tendency of informatization in the media by comparing it with the 'Information White Paper', a publication of government agencies, and with the research topics of four academic journals in the Library and Information Science field. This study interprets informatization trends as seen in the media, and it is meaningful in that it analyzes newspaper big data, which is long-term time series data. Based on the results, implications for the growth and development of libraries alongside domestic informatization are suggested. Further studies are expected to create a basic framework for developing library informatization policy.