• Title/Summary/Keyword: news data


Detecting Spam Data for Securing the Reliability of Text Analysis (텍스트 분석의 신뢰성 확보를 위한 스팸 데이터 식별 방안)

  • Hyun, Yoonjin;Kim, Namgyu
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.2 / pp.493-504 / 2017
  • Recently, the tremendous amount of unstructured text data distributed through news, blogs, and social media has gained much attention from researchers and practitioners, as this data contains abundant information about consumers' opinions. However, as the usefulness of text data grows, so do attempts to profit by distorting text data, whether maliciously or not. This increase in spam text data not only burdens users seeking useful information with a large amount of inappropriate information, but also damages the reliability of information and of information providers. Therefore, efforts must be made to improve the reliability of information and the quality of analysis results by detecting and removing spam data in advance. To this end, spam detection has been actively studied in areas such as opinion spam detection, spam e-mail detection, and web spam detection. In this study, we introduce the core concepts and current research trends of spam detection and propose a methodology for detecting spam tags on blogs, as one of the challenging attempts to improve the reliability of blog information.

MPIL: Market prediction through image learning of unstructured and structured data (비정형, 정형 데이터의 이미지 학습을 활용한 시장예측)

  • Lee, Yoon Seon;Lee, Ju Hong;Choi, Bum Ghi;Song, Jae Won
    • Smart Media Journal / v.10 no.2 / pp.16-21 / 2021
  • Financial time series analysis plays a very important economic and social role in modern society and affects global development, but heavy noise and uncertainty make financial time series prediction a difficult research topic. In this paper, we propose a market prediction method (MPIL) that converts unstructured and structured data into images. For market prediction, it analyzes n days of SNS and news data (unstructured data), converts the market data (structured data) into an image with the GADF algorithm, and makes an ultra-short-term prediction of the price on day n+1 through image learning. MPIL achieves an average accuracy of 56%, higher than the 50% average accuracy of a model that predicts the market with an LSTM using sentiment analysis, as in existing market forecasting.
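
The GADF step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used Gramian Angular Difference Field definition (rescale the series to [-1, 1], take the arccosine of each point, then fill cell (i, j) with the sine of the angle difference). The abstract does not give the paper's exact preprocessing, so the function name `gadf` and the sample prices are hypothetical.

```python
import math

def gadf(series):
    """Gramian Angular Difference Field of a 1-D series (pure-Python sketch)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every point.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp against floating-point drift before taking arccos.
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    # Cell (i, j) holds sin(phi_i - phi_j).
    return [[math.sin(a - b) for b in phi] for a in phi]

# Hypothetical closing prices for n = 5 days (not data from the paper).
prices = [100.0, 101.5, 99.8, 102.3, 103.1]
image = gadf(prices)  # 5 x 5 matrix that can be rendered as an image
```

The resulting matrix is antisymmetric with a zero diagonal, so an n-day window always yields an n x n image regardless of the price scale.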

A study on the User Experience at Unmanned Checkout Counter Using Big Data Analysis (빅데이터를 활용한 편의점 간편식에 대한 의미 분석)

  • Kim, Ae-sook;Ryu, Gi-hwan;Jung, Ju-hee;Kim, Hee-young
    • The Journal of the Convergence on Culture Technology / v.8 no.4 / pp.375-380 / 2022
  • The purpose of this study is to identify consumers' perception of, and the meaning they attach to, convenience store convenience food by using big data. For this study, news, blogs, cafes, Q&A posts (Knowledge iN and Tip), and web documents on NAVER and Daum were analyzed, with 'convenience store convenience food' as the search keyword. The analysis period was the three years from January 1, 2019 to December 31, 2021. For data collection and analysis, frequency and matrix data were extracted using TEXTOM, and network analysis and visualization were conducted using the NetDraw function of the UCINET 6 program. As a result, convenience store convenience foods were clustered into health, diversity, convenience, and economy according to consumers' selection attributes. The results are expected to serve as a basis for developing new convenience menus that pursue convenience, based on the meaning consumers attach to convenience store convenience foods, such as appropriate prices, discount coupons, and events.

Volatility of Export Volume and Export Value of Gwangyang Port (광양항의 수출물동량과 수출액의 변동성)

  • Mo, Soo-Won;Lee, Kwang-Bae
    • Journal of Korea Port Economic Association / v.31 no.1 / pp.1-14 / 2015
  • The standard GARCH model, which imposes symmetry on the conditional variance, tends to fail to capture some important features of the data. This paper therefore introduces models that capture asymmetric effects: the EGARCH model and the GJR model. We provide a systematic comparison of volatility models, focusing on the asymmetric effect of news on volatility. Specifically, three diagnostic tests are provided: the sign bias test, the negative size bias test, and the positive size bias test. The paper shows significant evidence of a GARCH-type process in the data, as indicated by the Ljung-Box Q statistic on the squared residuals. The estimated unconditional density function of the squared residuals is clearly skewed to the left and markedly leptokurtic compared with the standard normal distribution. Volatility clustering is also clearly reinforced by the plot of the squared residuals of export volume and export value. The unconditional variance of both export volume and export value indicates that large shocks of either sign tend to be followed by large shocks, and small shocks of either sign by small shocks. The estimated news impact curve of the GARCH model for export volume also suggests that $h_t$ is overestimated for large negative and positive shocks. The conditional variance equation of the GARCH model for export volume contains two parameters, $\alpha$ and $\beta$, that are insignificant, indicating that the GARCH model is a poor characterization of the conditional variance of export volume. The conditional variance equation of the EGARCH model for export value, however, shows a positive sign for the parameter $\delta$, contrary to our expectation, while in the GJR model $\alpha$ and $\beta$ are insignificant and $\delta$ is only marginally significant. This indicates that the asymmetric volatility models are a poor characterization of the conditional variance of export value. It is concluded that the asymmetric EGARCH and GJR models are appropriate for explaining the volatility of export volume, while the symmetric standard GARCH model is good at capturing the volatility of export value.
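
To make the notion of a news impact curve concrete, the sketch below compares a symmetric GARCH(1,1) curve with a GJR curve. It assumes the textbook parameterizations (the abstract does not state the paper's exact equations): lagged variance is held at a fixed level `sigma2`, GARCH gives $h_t = \omega + \beta\sigma^2 + \alpha\varepsilon_{t-1}^2$, and GJR adds $\delta\varepsilon_{t-1}^2$ only when the shock is negative. All parameter values are hypothetical, not estimates from the paper.

```python
def garch_nic(eps, omega, alpha, beta, sigma2):
    """News impact curve of GARCH(1,1): next-period variance as a function
    of the shock eps, with lagged variance fixed at sigma2."""
    return omega + beta * sigma2 + alpha * eps ** 2

def gjr_nic(eps, omega, alpha, beta, delta, sigma2):
    """News impact curve of GJR: negative shocks receive the extra weight delta."""
    leverage = delta if eps < 0 else 0.0
    return omega + beta * sigma2 + (alpha + leverage) * eps ** 2

# Hypothetical parameters chosen only for illustration.
params = dict(omega=0.1, alpha=0.1, beta=0.8, sigma2=1.0)
shocks = [-2.0, -1.0, 0.0, 1.0, 2.0]
garch_curve = [garch_nic(e, **params) for e in shocks]
gjr_curve = [gjr_nic(e, delta=0.1, **params) for e in shocks]
```

With a positive `delta`, the GJR curve is steeper on the negative side, while the GARCH curve treats shocks of equal size identically regardless of sign. This is the asymmetry that the sign bias and size bias tests probe.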

A Study on the Privacy Awareness through Bigdata Analysis (빅데이터 분석을 통한 프라이버시 인식에 관한 연구)

  • Lee, Song-Yi;Kim, Sung-Won;Lee, Hwan-Soo
    • Journal of Digital Convergence / v.17 no.10 / pp.49-58 / 2019
  • In the era of the 4th industrial revolution, the development of information technology has brought various benefits, but it has also increased social interest in privacy issues. As the possibility of privacy violations through big data increases, academic discussion of privacy management has become active. While privacy has traditionally been defined at various levels as a basic human right, most recent research is concerned only with information privacy, that is, online privacy protection. This limited discussion can distort both the theoretical concept and actual perception, making academic and social consensus on the concept of privacy more difficult. In this study, we analyze the concept of privacy as it appears on the internet, based on 12,000 news articles from a portal site over the past year, and compare the theoretical concept with the socially accepted concept. This empirical approach is expected to provide an understanding of the changing concept of privacy and a research direction for conceptualizing privacy in the current situation.

Assessment of Public Awareness on Invasive Alien Species of Freshwater Ecosystem Using Conservation Culturomics (보전문화체학 접근방식을 통한 생태계교란 생물인 담수 외래종의 대중인식 평가)

  • Park, Woong-Bae;Do, Yuno
    • Journal of Wetlands Research / v.23 no.4 / pp.364-371 / 2021
  • Public awareness of alien species can vary by generation, period, or specific events associated with these species. An understanding of public awareness is important for the management of alien species because differences in public awareness can affect the establishment and implementation of management plans. We analyzed digital texts on social media platforms, news articles, and internet search volumes used in conservation culturomics to understand public interest and sentiment regarding alien freshwater species. The number of tweets, the number of news articles, and the relative search volume for 11 freshwater alien species were extracted to determine public interest. Additionally, the trend over time, seasonal variability, and repetition period of these data were confirmed. We also calculated sentiment scores and analyzed public sentiment in the collected data using sentiment analysis based on text mining techniques. The American bullfrog, nutria, bluegill, and largemouth bass drew relatively more public interest than other species. Some species showed patterns in the number of Twitter posts, media coverage, and internet searches that repeated at specific periods. The text mining results showed that most people held negative sentiments toward alien freshwater species. In particular, negative sentiments increased over the years after alien species were designated as ecologically disturbing species.

An Analysis on Media Trends in Public Agency for Social Service Applying Text Mining (텍스트 마이닝을 적용한 사회서비스원 언론보도기사 분석)

  • Park, Hae-Keung;Youn, Ki-Hyok
    • Journal of Internet of Things and Convergence / v.8 no.2 / pp.41-48 / 2022
  • This study empirically explores which issues, that is, which social perceptions, have formed around the public social service agency (hereafter SSA), using mass media coverage of the SSA. The study is meaningful in that it identifies the overall social perception and trend of the SSA through public opinion. To extract media trend data, the big data analysis system Textom was used to collect data from the representative portals Naver News and Daum News. The collected texts numbered 1,299 in 2020 and 1,410 in 2021, for a total of 2,709. As a result of the analysis, first, the most frequent words were 'SSA', 'establishment', and 'operation'. Second, the N-gram analysis showed that the word pairs directly related to the SSA were 'SSA and public', 'SSA and opening', 'SSA and launch', 'SSA and Department Director', 'SSA and Staff', and 'SSA and Caregiver', among others. Third, the TF-IDF analysis and word network analysis, similar to the word frequency and N-gram results, yielded 'establishment', 'operation', 'public', 'launch', 'provided', 'opened', 'Holding', and 'Care'. Based on these results, the study suggests strengthening the emergency care support group, commercializing it in detail, and stabilizing jobs.
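
The N-gram and TF-IDF steps named in the abstract can be sketched as follows. This is only an illustration under stated assumptions: it uses adjacent word pairs (bigrams) and a basic TF-IDF weighting (term frequency times the log of inverse document frequency). The weighting actually applied by Textom is not described in the abstract, and the tokenized documents below are hypothetical.

```python
import math
from collections import Counter

def bigrams(tokens):
    """Adjacent word pairs (N-grams with N = 2)."""
    return list(zip(tokens, tokens[1:]))

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores

# Hypothetical tokenized articles (not data from the study).
docs = [
    ["ssa", "establishment", "operation"],
    ["ssa", "launch", "public"],
    ["ssa", "caregiver", "staff", "operation"],
]
pair_counts = Counter(pair for doc in docs for pair in bigrams(doc))
scores = tf_idf(docs)
```

Under this weighting, a term that appears in every document, such as 'ssa' here, scores zero, which is one reason a TF-IDF ranking can differ from a raw frequency ranking.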

Media Use during the Sewol Ferry Disaster and Post Traumatic Stress Disorder (미디어 이용과 외상 후 스트레스 장애(PTSD): 세월호 사건을 중심으로)

  • Park, Nohil;Chang, Seok-Hwan;Jeong, JiYeon
    • Journal of Digital Contents Society / v.19 no.4 / pp.673-683 / 2018
  • The Sewol Ferry accident was a disaster that caused the Korean people serious mental shock, far beyond the generally perceived aftermath of a catastrophe. The purpose of this study is to examine the relationship between vicarious disaster experience through media and the post-traumatic stress disorder (PTSD) symptoms of media users in relation to the accident. Responses were collected from 417 people, consisting of college, middle, and high school students and adults in a metropolitan area, over 12 days from April 28, 2014, right after the accident. The results showed that the level of PTSD among social media users was higher than that among users of traditional media (newspapers or TV news). Also, the amount of use of disaster news information and of social media was positively correlated with PTSD. This study demonstrates possible mechanisms of media-mediated psychological trauma from a disaster, provides empirical data on them, and aims to facilitate further research.

Strategy of CSR Storytelling with the application of Greimas Actantial Model -focusing on Hyundai Motor Company's CSR website (그레마스 행위소 모델을 통해 본 기업의 CSR스토리텔링 전략-현대자동차 CSR홈페이지를 중심으로)

  • HONG, Sook-Yeong
    • Journal of Digital Convergence / v.14 no.10 / pp.119-128 / 2016
  • This study aims to understand the CSR story strategies that corporations use, focusing on how the stories on Hyundai Motor Company's CSR website are composed. When the website was analyzed in terms of interactivity, ease of use, recency, and informativity, the interactive dialogue feature was largely lacking. The data were not up to date, so recency was lacking as well. However, because an information-sharing feature was provided, information spread quickly. Applying Greimas' Actantial Model to the 'CSR News' section, which conveys news about corporate philanthropic activities, showed that the CSR strategies were authenticity, consistency, and flexibility. In CSR storytelling, a corporation should not rely only on its existing executives and staff members but should, if possible, use new icons, including civic organizations and the youth, and perform its role as a supporter. At the same time, a corporation must build strategic storytelling around new values and engage in systematic corporate philanthropic activities that meet the needs of the times.

Business Strategies of Japanese Newspaper Companies: Top-level Managers' Perception and Reacting toward the Newspaper Crisis (일본 신문기업의 경영전략 연구: 경영진의 위기인식과 대응책을 중심으로)

  • Kim, Young-Ju;Jung, Jae-Min
    • Korean journal of communication and information / v.51 / pp.65-86 / 2010
  • This paper reviews the problems and opportunities of the Japanese newspaper industry, focusing in particular on the causes of the crisis and the strategies for overcoming it. To answer these questions, we conducted in-depth interviews with top-level managers of major newspaper companies in Japan. Secondary data were gathered to complement the interviews. Top managers mainly attributed the crisis to young readers' alienation from newspapers due to the changing media environment, and to structural factors such as increasing production and distribution costs and diminishing appeal to advertisers. They are implementing several strategies to overcome the crisis: developing newspaper content for young readers; packaging customized online and offline advertising; extending news distribution via online and mobile platforms; diversifying into other sectors such as real estate; enhancing newspaper brand equity; and collaborating with rival newspaper companies in operating news sites and in joint printing and distribution. Lessons from the Japanese newspaper industry may be useful starting points for Korean newspaper companies struggling with diminishing circulation and advertising revenues.
