• Title/Summary/Keyword: news paper articles

A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami;Kim, Jaeseok;Kim, Gi-Nam;Heo, Jong-Uk;On, Byung-Won;Kang, Mijung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.1-23 / 2013
  • To discover significant social issues such as unemployment, economic crisis, and social welfare that urgently need to be solved in modern society, researchers in the existing approach usually collect opinions from professional experts and scholars through online or offline surveys. However, such a method is not always effective. Because of the expense, a large number of survey replies is seldom gathered. In some cases, it is also hard to find professionals dealing with specific social issues. Thus, the sample set is often small and may be biased. Furthermore, several experts may reach totally different conclusions about the same social issue because each expert has a subjective point of view and a different background. In this case, it is considerably hard to figure out what the current social issues are and which of them are really important. To overcome the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects keywords representing social issues and problems from about 1.3 million news articles published by about 10 major domestic news outlets in Korea from June 2009 until July 2012. Our proposed system consists of (1) collecting news articles and extracting their texts, (2) identifying only news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of our proposed matching algorithm is to best match paragraphs to each topic.
Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which has relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 "Unemployment Problem". In this example, it is non-trivial to understand what happened to the unemployment problem in our society. In other words, looking only at social keywords, we have no idea of the detailed events occurring in our society. To tackle this matter, we develop a matching algorithm that computes the probability value of a paragraph given a topic, relying on (i) topic terms and (ii) their probability values. For instance, given a set of text documents, we segment each text document into paragraphs. Meanwhile, using LDA, we extract a set of topics from the text documents. Based on our matching process, each paragraph is assigned to the topic it best matches. Finally, each topic has several best-matched paragraphs. Suppose, for example, that there are a topic (e.g., Unemployment Problem) and its best-matched paragraph (e.g., "Up to 300 workers lost their jobs in XXX company at Seoul"). In this case, we can grasp the detailed information behind the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, most researchers will be able to detect social issues easily and quickly.
Through this prototype system, we have detected various social issues appearing in our society and have also shown the effectiveness of our proposed methods in our experimental results. Note that our proof-of-concept system is available at http://dslab.snu.ac.kr/demo.html.
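
The paragraph-to-topic matching described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a paragraph is scored by the average log-probability of its tokens under a topic's term distribution, with a small floor probability for terms the topic does not list. The English tokenizer, the `floor` value, and the "Social Welfare" topic are hypothetical; only Topic1 comes from the abstract.

```python
import math
import re

def tokenize(text):
    """Lowercase and keep alphabetic runs (a stand-in for Korean lexical analysis)."""
    return re.findall(r"[a-z]+", text.lower())

def paragraph_score(paragraph, topic, floor=1e-6):
    """Average log-probability of the paragraph's tokens under the topic.

    Tokens missing from the topic get `floor` (an assumed smoothing value).
    """
    tokens = tokenize(paragraph)
    if not tokens:
        return float("-inf")
    return sum(math.log(topic.get(t, floor)) for t in tokens) / len(tokens)

def best_topic(paragraph, topics):
    """Return the label of the topic with the highest score for this paragraph."""
    return max(topics, key=lambda name: paragraph_score(paragraph, topics[name]))

topics = {
    # Topic1 from the abstract, with its human-assigned label.
    "Unemployment Problem": {"unemployment": 0.4, "layoff": 0.3, "business": 0.3},
    # Hypothetical second topic, added only so there is something to compare.
    "Social Welfare": {"welfare": 0.5, "pension": 0.3, "care": 0.2},
}
paragraphs = [
    "Layoff notices spread as unemployment rose.",
    "The pension reform expands welfare and care services.",
]
assignments = {p: best_topic(p, topics) for p in paragraphs}
for paragraph, label in assignments.items():
    print(label, "<-", paragraph)
```

Averaging over tokens keeps long paragraphs from being penalized simply for their length; other normalizations would be equally plausible.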

Predicting the Popularity of Post Articles with Virtual Temperature in Web Bulletin (웹게시판에서 가상온도를 이용한 게시글의 인기 예측)

  • Kim, Su-Do;Kim, So-Ra;Cho, Hwan-Gue
    • The Journal of the Korea Contents Association / v.11 no.10 / pp.19-29 / 2011
  • A blog provides commentary, news, or content on a particular subject. An important part of many blogs is their interactive format. Sometimes a heated debate arises on a topic, and an article becomes a political or sociological issue. In this paper, we propose a method to predict the popularity of an article in advance. First, we use hit count as a factor to predict the popularity of an article. We define the saturation point and derive a model that predicts the hit count at the saturation point from the correlation coefficient between the early hit count and the hit count at the saturation point. Finally, we predict the virtual temperature of an article using 4 types (explosive, hot, warm, cold). We can predict the virtual temperature of Internet discussion articles from the hit count at the saturation point with more than 70% accuracy, exploiting only the first 30 minutes' hit count. In the hot, warm, and cold categories, we achieve more than 86% accuracy from 30 minutes' hit count and more than 90% accuracy from 70 minutes' hit count.
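
As a rough illustration of the idea in this abstract, the sketch below fits a least-squares line from early hit counts to saturation hit counts and maps the prediction to a temperature label. The abstract does not give the model's form or the category boundaries, so the linear fit, the threshold values, and the sample data are all assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def temperature(hits, thresholds=((10000, "explosive"), (3000, "hot"), (500, "warm"))):
    """Map a predicted saturation hit count to a label (thresholds are hypothetical)."""
    for limit, label in thresholds:
        if hits >= limit:
            return label
    return "cold"

# Made-up training data: hits in the first 30 minutes vs. hits at saturation.
early_hits = [10, 40, 120, 400]
saturation_hits = [100, 400, 1200, 4000]

slope, intercept = fit_line(early_hits, saturation_hits)
predicted = slope * 200 + intercept  # a new article with 200 early hits
label = temperature(predicted)
print(round(predicted), label)
```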

Seasonal analysis of Beach-related Issues using Local Newspaper Articles and Topic Modeling (지역신문기사 자료와 토픽모델링을 이용한 해변 관련 계절별 현안분석)

  • Yoo, Mu-Sang;Jeong, Su-Yeon;Kim, Geon-Hu;Sohn, Chul
    • Journal of the Korean Regional Science Association / v.34 no.4 / pp.19-34 / 2018
  • The purpose of this study is to analyze seasonal issues using local newspaper articles containing the keyword 'beach' from 2004 to 2017. Topic modeling and time series regression analysis were performed using open source programs. Topic modeling yielded 35 topics in spring, 47 topics in summer, 36 topics in autumn, and 35 topics in winter. The common themes were 'beaches', 'festivals and events', 'accident and environmental issues', 'tourism', 'development and sale', 'administration and policy', and 'weather'. Time series regression analysis found 5 Hot-Topics and 2 Cold-Topics among the 35 spring topics, 6 Hot-Topics and 3 Cold-Topics among the 47 summer topics, 4 Hot-Topics and 3 Cold-Topics among the 36 autumn topics, and 3 Hot-Topics and 3 Cold-Topics among the 35 winter topics. For each season, topics that fall into neither the Hot-Topic nor the Cold-Topic category are classified as Neutral-Topics. The study suggests that for places such as beaches, whose uses differ by season, seasonal topic modeling of regional issues will yield more useful results and enable more detailed diagnosis.

The Diffusion of Internet of Things: Forecasting Technologies and Company Strategies using Qualitative and Quantitative Approach (사물인터넷의 확산: 정성적·정량적 기법을 이용한 기술 및 기업 전략 예측)

  • Lee, Saerom;Jahng, Jungjoo
    • The Journal of Society for e-Business Studies / v.20 no.4 / pp.19-39 / 2015
  • The Internet of Things (IoT) is expected to provide efficiency and convenience in human life by integrating the Internet into the things that we use in our daily lives. IoT can not only create new businesses but also bring great changes to our lives thanks to its various technical applications, such as defining relationships among things or automating the use of technology by analyzing usage patterns. This study uses qualitative interviews with experts to predict the changes that IoT technology is expected to bring to our lives. In addition, this paper analyzes news articles about the Internet of Things in Korea using text-network analysis. This study also discusses the factors that need to be considered to put IoT to successful use in business contexts.

The 'Be Slow' Movement and Its Impact on the Current Fashion (최근 국내외 패션에 나타난 느리게 살기 운동의 영향)

  • 김윤희
    • Journal of the Korean Society of Costume / v.52 no.6 / pp.165-179 / 2002
  • This paper begins with the thesis that the so-called 'Be Slow' Movement has affected not only the contemporary lifestyle but also the current fashion trend in the West as well as in Korea. The influence of the 'Be Slow' Movement on the everyday life of Western and Korean society is documented by recent books, news reports, and many articles from various kinds of mass media and fashion magazines since the year 2000. The results of this study can be summarized as follows. First, the 'Be Slow' Movement is a new cultural phenomenon, very different from those of the past century. It has emerged very recently, and it could affect the lifestyle of its followers for a long period of time. Second, the influence of the 'Be Slow' Movement on everyday life can be witnessed in many behavioral choices, such as the preference for organic food and natural cooking and the preference for rural life and a green patch of land for housing. Some aspects of child rearing and long-term life planning are also under the influence of the 'Be Slow' Movement. In a way, the lifestyle proposed by the 'Be Slow' Movement is somewhat similar to that of the 'Bobos'. Third, the influence of the 'Be Slow' Movement on the current fashion trend can be observed in the appreciation of time-consuming labour and the increased use of D.I.Y. clothing. The higher value placed on fashion goods with handcrafted parts and on scarce luxury items is a good example of this influence. One can say that the 'Be Slow' Movement is not a retrogression but a re-creation of time and space in which to be grateful for one's life. Thus, it is not anti-technology but a commercialism with technology that aims to enhance the quality of life and to place people at the center of production and consumption. Consequently, one may say that the 'Be Slow' Movement is an appropriate and affluent way of living.

A study on Technology Push-based Future Weapon System and Core Technology Derivation Methodology (빅데이터분석기반의 기술주도형 미래 국방무기체계 및 핵심기술 도출 방법연구)

  • Kang, Hyunkyu;Park, Yongjun;Park, Jaehun
    • Journal of Korean Society for Quality Management / v.46 no.2 / pp.225-242 / 2018
  • Purpose: Recent trends show that big data analysis is becoming central to identifying promising and emerging future technologies. Accordingly, analyzing defense-related data in sources such as journals, articles, and news will provide crucial clues for predicting and identifying core future technologies that can be used to develop creative and unprecedented future weapon systems that could change warfare. Methods: To identify technology fields that are closely related to the 4th industrial revolution and recent technology development trends, the process includes environmental analysis, text mining, and a military applicability survey. After the identification of core technologies that are militarily applicable, future weapon systems based on these technologies, as well as their operation concepts, are suggested. Results: The study identifies 73 important trends, from which 11 mega trends are derived. These mega trends can be expressed by 13 promising technology fields. From these technology fields, 248 promising future technologies are identified. Further assessment then leads to the selection of 63 core technologies from the pool. These are named "future defense technologies" and become the bases for 40 future weapon systems that the military can use. Conclusion: Predicting future technologies using text mining analysis has been attempted by various organizations across the globe, especially in fields related to the 4th industrial revolution. However, its application in the defense industry is unprecedented. Therefore, this study is meaningful in that it not only enables military personnel to see promising future technologies that can be utilized for future weapon system development, but also helps one predict future defense technologies using the method introduced in the paper.

Hot Topic Prediction Scheme Using Modified TF-IDF in Social Network Environments (소셜 네트워크 환경에서 변형된 TF-IDF를 이용한 핫 토픽 예측 기법)

  • Noh, Yeonwoo;Lim, Jongtae;Bok, Kyoungsoo;Yoo, Jaesoo
    • KIISE Transactions on Computing Practices / v.23 no.4 / pp.217-225 / 2017
  • Recently, interest in predicting hot topics has grown significantly as it has become more important to find and analyze meaningful information in the large amount of data flowing through social networking services. Existing hot topic detection schemes do not consider temporal properties, so they are not suitable for predicting hot topics that arise rapidly in a changing society. This paper proposes a hot topic prediction scheme that uses a modified TF-IDF in social networking environments. The modified TF-IDF extracts a candidate set of keywords that are momentarily issued. The proposed scheme then calculates hot topic prediction scores by assigning weights that consider user influence and professionality to the extracted candidate keywords. The superiority of the proposed scheme is shown by comparing it to an existing detection scheme. In addition, to show whether it predicts hot topics correctly, we evaluate its quality with Korean news articles from Naver.
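
The abstract does not spell out how the TF-IDF is modified, so the following is only one plausible reading, not the paper's formula: each time window is treated as a document, term frequency is taken from the latest window, and inverse document frequency is computed across all windows. Terms that appear in every window score zero, while terms that appear only recently score high. The window contents are invented, and the user-influence weighting is left out.

```python
import math
from collections import Counter

def window_tfidf(windows):
    """Score each term of the latest window by tf (latest window) * idf (over windows)."""
    latest = Counter(windows[-1])
    total_terms = sum(latest.values())
    n_windows = len(windows)
    scores = {}
    for term, count in latest.items():
        # Number of windows containing the term; at least 1 because it is in the latest.
        doc_freq = sum(1 for window in windows if term in window)
        scores[term] = (count / total_terms) * math.log(n_windows / doc_freq)
    return scores

# Hypothetical keyword streams for three consecutive time windows.
windows = [
    ["weather", "traffic"],
    ["weather", "election"],
    ["weather", "earthquake", "earthquake"],
]
scores = window_tfidf(windows)
candidate = max(scores, key=scores.get)
print(candidate, scores)
```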

Corporate Social Responsibility (CSR) of Small Enterprises in Hospitality and Tourism Industry (환대관광산업 소규모기업 사회적 책임활동(CSR): 회사 홈페이지 커뮤니케이션 분석을 중심으로)

  • Ahn, Young-Joo
    • Journal of Distribution Science / v.15 no.7 / pp.73-83 / 2017
  • Purpose - The purpose of this paper is to explore the CSR activities of small enterprises in the hospitality and tourism industry in South Korea. Since previous research on CSR activities has focused largely on large enterprises while small enterprises have received relatively little attention, this study aims to explore the characteristics of small enterprises in the hospitality and tourism industry and their CSR activities. Research design, data, and methodology - The population of interest for this study was the social enterprises registered with the Korea Social Enterprise Promotion Agency (2016), whose registry was used to verify which enterprises hold a social enterprise certification. From 1672 companies in total, the sampling frame was a database of 117 companies in the hospitality and tourism industry. This study investigates social enterprises' CSR activities as reported on the companies' official websites (e.g., company reports, magazines, news articles, and interviews). The websites of the selected enterprises were analyzed for CSR activities using quantitative content analysis. All of the CSR activities in small social enterprises were classified into six dimensions based on stakeholder theory. Results - The findings of this study describe the characteristics of the 117 small social enterprises and their specific CSR initiatives. A total of eight main business lines were identified: 1) fair travel, 2) leisure/sports, 3) accommodation/camping, 4) medical tourism, 5) exhibitions/art events/cultural events, 6) leisure activities for vulnerable social groups, 7) Korean traditional culture, and 8) ecotourism/agricultural tourism. The CSR initiatives were classified into six dimensions: 1) environment, 2) employment, 3) multicultural families and vulnerable social groups, 4) local community, 5) economic prosperity, and 6) product.
Conclusions - This study revealed distinctive examples of CSR initiatives by small enterprises in the hospitality and tourism industry. Small social enterprises participate in CSR activities mainly related to their own business lines. Moreover, these enterprises are more closely embedded in local community development, job creation and education for local residents and vulnerable social groups, and traditional heritage preservation. The findings of this study provide theoretical and practical implications and can enrich the CSR literature on small enterprises in the hospitality and tourism industry.

An Automatic Classification System of Korean Documents Using Weight for Keywords of Document and Word Cluster (문서의 주제어별 가중치 부여와 단어 군집을 이용한 한국어 문서 자동 분류 시스템)

  • Hur, Jun-Hui;Choi, Jun-Hyeog;Lee, Jung-Hyun;Kim, Joong-Bae;Rim, Kee-Wook
    • The KIPS Transactions:PartB / v.8B no.5 / pp.447-454 / 2001
  • Automatic document classification is a method that assigns unlabeled documents to existing classes. It can be applied to the classification of newsgroup articles and web documents, and to producing more precise information retrieval results by learning from users. In this paper, we use a weighted Bayesian classifier that weights the keywords of a document to improve classification accuracy. If the system cannot classify a document properly because the document has too few words to serve as features, it uses relevant word clusters to supplement the features of the document. The clusters are built by automatic word clustering from a corpus. As a result, the proposed system outperformed an existing classification system in classification accuracy on Korean documents.
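
A keyword-weighted Bayesian classifier of the kind this abstract mentions could look roughly like the sketch below. It is a generic multinomial naive Bayes with Laplace smoothing in which tokens marked as document keywords contribute their log-likelihood with a larger weight. The weighting scheme, the `keyword_weight` value, and the toy training data are assumptions, and the word-cluster fallback from the abstract is not implemented.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (label, tokens). Returns class counts, per-class term counts, vocabulary."""
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for label, tokens in docs:
        class_docs[label] += 1
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, term_counts, vocab

def classify(tokens, keywords, model, keyword_weight=2.0):
    """Pick the class with the highest weighted log-posterior."""
    class_docs, term_counts, vocab = model
    n_docs = sum(class_docs.values())
    scores = {}
    for label, n in class_docs.items():
        total = sum(term_counts[label].values())
        score = math.log(n / n_docs)  # log prior
        for t in tokens:
            # Keywords count more than ordinary tokens (assumed weighting).
            weight = keyword_weight if t in keywords else 1.0
            p = (term_counts[label][t] + 1) / (total + len(vocab))  # Laplace smoothing
            score += weight * math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

# Toy training set, invented for the example.
model = train([
    ("sports", ["match", "goal", "team"]),
    ("economy", ["market", "stock", "team"]),
])
predicted = classify(["team", "goal"], keywords={"goal"}, model=model)
print(predicted)
```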

Color Recommendation for Text Based on Colors Associated with Words

  • Liba, Saki;Nakamura, Tetsuaki;Sakamoto, Maki
    • Journal of Korea Society of Industrial Information Systems / v.17 no.1 / pp.21-29 / 2012
  • In this paper, we propose a new method to select colors representing the meaning of text contents based on the cognitive relation between words and colors. Our method builds on a previous study revealing the existence of words that are crucial for estimating the colors associated with the meaning of text contents. Using the associative probability of each color with a given word and the strength of color association of the word, we estimate the probability of colors associated with a given text. The goal of this study is to propose a system that recommends cognitively plausible colors for the meaning of the input text. To build a versatile and efficient database for our system, two psychological experiments were conducted using news site articles. In experiment 1, we collected 498 words that participants chose as having a strong association with color. Subsequently, in experiment 2, we investigated which color was associated with each word. In addition to those data, we employed estimated values of the strength of color association and of the associated colors for the words included in a very large corpus of newspapers (approximately 130,000 words), based on the similarity between words obtained by Latent Semantic Analysis (LSA). Therefore, our method allows us to select colors for a large variety of words or sentences. Finally, we verified that our system succeeded in proposing cognitively plausible colors associated with the meaning of the input text by comparing the correct colors answered by participants with the colors estimated by our method. Our system is expected to be of use in various situations such as data visualization, information retrieval, and art or web page design.
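
One simple way to combine the two quantities named in this abstract is a strength-weighted average of the per-word color distributions. The abstract does not state the actual combination rule, so this formula is an assumption, and the words, probabilities, and strengths below are invented for illustration.

```python
def text_color_distribution(words, color_given_word, strength):
    """Estimate P(color | text) as a strength-weighted average of P(color | word).

    Words without color data are ignored (an assumed policy).
    """
    totals = {}
    weight_sum = 0.0
    for word in words:
        if word not in color_given_word:
            continue
        s = strength.get(word, 0.0)
        weight_sum += s
        for color, p in color_given_word[word].items():
            totals[color] = totals.get(color, 0.0) + s * p
    if weight_sum == 0:
        return {}
    return {color: value / weight_sum for color, value in totals.items()}

# Hypothetical association data.
color_given_word = {
    "sea": {"blue": 0.8, "green": 0.2},
    "fire": {"red": 0.9, "orange": 0.1},
}
strength = {"sea": 0.75, "fire": 0.25}

distribution = text_color_distribution(["the", "sea", "fire"], color_given_word, strength)
recommended = max(distribution, key=distribution.get)
print(recommended, distribution)
```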