• Title/Summary/Keyword: Big data modeling

Development of a Prediction Model for Advertising Effects of Celebrity Models using Big data Analysis (빅데이터 분석을 통한 유명인 모델의 광고효과 예측 모형 개발)

  • Kim, Yuna;Han, Sangpil
    • Journal of the Korea Convergence Society / v.11 no.8 / pp.99-106 / 2020
  • The purpose of this study is to determine whether image similarity between celebrities and brands on social network services is a determinant that predicts advertising effectiveness. To this end, an advertising effect prediction model for celebrity-endorsed advertising was created, and its validity was verified through machine learning, a big data analysis technique. First, the celebrity-brand image similarity, which was used as an independent variable, was quantified with social big data using association network theory. Second, a multiple regression model that used data representing advertising effects as a dependent variable was repeatedly fitted to generate an advertising effect prediction model. The accuracy of the prediction model was determined by comparing the prediction results with survey outcomes. As a result, the validity of the predictive modeling of advertising effects was confirmed, since the model achieved the classification accuracy of 75% that serves as the criterion for judging validity. This study suggests a new methodological alternative and direction for big data-based modeling research through a celebrity-brand image similarity structure based on social network theory and effect prediction modeling by machine learning.
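The validation step described above (comparing predictions with survey outcomes against a 75% criterion) can be sketched as follows. This is only an illustration: the labels are made-up placeholder data, the function name is hypothetical, and treating the criterion as "meets or exceeds 75%" is an assumption, since the abstract does not spell out the comparison.

```python
def classification_accuracy(predicted, observed):
    """Return the share of cases where the predicted label equals the observed one."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(predicted)


# Criterion quoted in the abstract; ">=" is an assumption about how it is applied.
VALIDITY_THRESHOLD = 0.75

# Hypothetical labels: 1 = advertisement judged effective, 0 = not effective.
predicted = [1, 0, 1, 1, 0, 1, 0, 0]  # model output (placeholder data)
observed = [1, 0, 0, 1, 0, 1, 1, 0]   # survey outcome (placeholder data)

accuracy = classification_accuracy(predicted, observed)
is_valid = accuracy >= VALIDITY_THRESHOLD
print(f"accuracy={accuracy:.2f}, meets criterion: {is_valid}")
```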

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People now create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Discovering the trend of an issue in SNS Big Data can serve as an important new source for creating value, because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) provide the topic keyword set that corresponds to the daily ranking; (2) visualize the daily time-series graph of a topic over a month; (3) show the importance of a topic through a treemap based on the score system and frequency; (4) visualize the daily time-series graph of keywords retrieved by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to rapidly process a large amount of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single-node computing to thousands of machines.
Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schema or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to arbitrary data; interaction with the data is easy, and the library is useful for managing a real-time data stream with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
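The first TITS function (a daily keyword ranking after stop-word removal) can be illustrated with a small sketch. It is not the authors' implementation: the stop-word list, the whitespace tokenizer, the sample tweets, and the function name are all assumptions, and the noun extraction and topic modeling steps mentioned in the abstract are omitted.

```python
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (an assumption, not the list used by TITS).
STOP_WORDS = {"the", "a", "is", "on", "in", "of", "and", "to"}


def daily_keyword_ranking(tweets, top_n=2):
    """Rank keywords per day by frequency after removing stop words.

    `tweets` is an iterable of (day, text) pairs; whitespace tokenization
    stands in for real morphological analysis.
    """
    counts = defaultdict(Counter)
    for day, text in tweets:
        tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
        counts[day].update(tokens)
    return {
        day: [word for word, _ in counter.most_common(top_n)]
        for day, counter in counts.items()
    }


# Placeholder tweets, not data from the study.
tweets = [
    ("2013-03-01", "the election is on the news"),
    ("2013-03-01", "election debate in the news"),
    ("2013-03-02", "a baseball game and a baseball rally"),
]

ranking = daily_keyword_ranking(tweets)
for day, keywords in sorted(ranking.items()):
    print(day, keywords)
```

A production system would replace the tokenizer with a Korean morphological analyzer and feed the cleaned tokens to a topic model rather than ranking raw frequencies.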

Topic Analysis Using Big Data Related to 'Blockchain usage': Focused on Newspaper Articles ('블록체인 활용' 관련 빅데이터를 활용한 토픽 분석: 신문기사를 중심으로)

  • Kim, Sungae;Jun, Soojin
    • Journal of Industrial Convergence / v.18 no.1 / pp.73-78 / 2020
  • To analyze the main topics related to the use of blockchain technology, a topic modeling technique was applied to 'Blockchain Technology Utilization' big data drawn from newspaper articles. To this end, topics were extracted from 15,537 articles in 21 newspapers published between 2013, when newspaper articles on the use of blockchain technology first appeared, and 2019, and were analyzed by period. The analysis showed that articles related to the utilization of blockchain technology have increased exponentially since 2015 and have focused on the IT_science and economics sections. Keywords related to cryptocurrency, bitcoin, and virtual currency were weighted highly, although the weights differed by period. Blockchain technology, which had focused on financial transactions, gradually expanded to big data, the Internet of Things, and artificial intelligence. Corporate topics shifted accordingly, expanding from banks handling financial transactions into various fields, with a focus on large and global companies. The study showed the main topics in newspaper articles related to the use of blockchain technology and how these topics changed over time.

Verification of firefighters' heuristics through big data analysis (빅데이터 분석을 통한 소방관의 경험법칙 검증 및 화재예방 활용)

  • Park, Sohyun;Park, Jeong-Hoon;Shin, Eun-Ji;Shin, Dongil
    • Journal of the Korean Institute of Gas / v.24 no.2 / pp.50-55 / 2020
  • The heuristics that firefighters accumulate through field activities were reviewed through big data analysis of fire occurrences in Gyeonggi-do, and quantitative modeling was used to examine how they can support fire prevention activities tailored to time, day, and target. Widely shared empirical rules were collected through direct interviews with firefighters. Among them, the rule of thumb that "Friday is the most fire-prone" is considered the most important in terms of fire monitoring and prediction. A comparative big data analysis was conducted, covering the number of fires and the damage that occurred in Gyeonggi-do in 2018. Furthermore, fire occurrence patterns by region, day of the week, time of day, and building type were derived. With empirical rules validated through this research, relatively inexperienced firefighters can also make decisions by relying on refined quantitative predictive modeling and on empirical rules that include local-government and time-based factors reflecting fire occurrence big data.

An analysis of the change in media's reports and attitudes about face masks during the COVID-19 pandemic in South Korea: a study using Big Data latent dirichlet allocation (LDA) topic modelling (빅데이터 LDA 토픽 모델링을 활용한 국내 코로나19 대유행 기간 마스크 관련 언론 보도 및 태도 변화 분석)

  • Suh, Ye-Ryoung;Koh, Keumseok Peter;Lee, Jaewoo
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.5 / pp.731-740 / 2021
  • This study collected news media big data related to face masks during the three waves of the COVID-19 pandemic in Korea and analyzed it using LDA topic modeling. The results empirically show that media reports focused on mask production and distribution policies in the first wave and on mandatory mask wearing in the second wave. In contrast, reports on trivial, gossipy events made up more of the media coverage in the second and third waves. The findings imply that Korea's governmental interventions to address the shortage of face masks and to regulate mask wearing succeeded in a relatively short time. The study also reports that there may have been relatively few science-based news reports, such as those on the effectiveness of face masks or on different filter types. This study exemplifies how big data analysis can be applied to evaluate and enhance public health communication.

News Big Data Analysis of 'Tap Water Larvae' Using Topic Modeling Analysis (토픽 모델링을 활용한 '수돗물 유충' 뉴스 빅데이터 분석)

  • Lee, Su Yeon;Kim, Tae-Jong
    • The Journal of the Korea Contents Association / v.20 no.11 / pp.28-37 / 2020
  • This study was conducted to propose measures to improve crisis response to environmental issues by analyzing news big data on the 'tap water larvae' situation and identifying related major keywords and topics. To accomplish this, 1,975 news reports on 'tap water larvae' published between July 13 and August 31, 2020 were divided into three periods and analyzed using topic modeling techniques. The analysis produced 15 topics for each period. According to the results, the 'tap water larvae' incident, as reported in the media, is divided into occurrence, diffusion, and rectification stages. The government's response and civilians' risk consciousness and reactions could also be seen. Based on the results, the following measures to respond to environmental risk are proposed. First, it is necessary to explore the various contexts intertwined with the 'tap water larvae' incident at its core and to develop responsiveness to environmental problems through education that forms integrated views. Second, an environmental monitoring role must be established, and environmental information produced with civilian participation must be shared through internet communities. Third, environmental communicators who provide and communicate fast and accurate environmental information must be cultivated and deployed. This study, as the first in Korea to use topic modeling analysis based on big data related to 'tap water larvae', has academic significance in that it empirically and systematically analyzed environmental issues that appear as unstructured data. It also has political significance, as it suggests ways to improve environmental education and communication.

Big Data Analysis of Busan Civil Affairs Using the LDA Topic Modeling Technique (LDA 토픽모델링 기법을 활용한 부산시 민원 빅데이터 분석)

  • Park, Ju-Seop;Lee, Sae-Mi
    • Informatization Policy / v.27 no.2 / pp.66-83 / 2020
  • Local issues that occur in cities typically garner great attention from the public. While local governments strive to resolve these issues, it is often difficult to eliminate them all, which leads to complaints. In tackling these issues, it is imperative for local governments to use big data to identify the nature of complaints and proactively provide solutions. This study applies the LDA topic modeling technique to analyze trends and patterns in complaints filed online. To this end, 9,625 online complaints submitted to the city of Busan from 2015 to 2017 were analyzed, and 20 topics were identified. From these topics, key topics were singled out, and through analysis of quarterly weighting trends, four "hot" topics (Bus stops, Taxi drivers, Praises, and Administrative handling) and four "cold" topics (CCTV installation, Bus routes, Park facilities including parking, and Festivities issues) were highlighted. The study conducted big data analysis to identify trends and patterns in civil affairs and makes an academic contribution by encouraging follow-up research. Moreover, the text mining technique used for complaint analysis can be applied to other projects requiring big data processing.
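One common way to separate "hot" from "cold" topics is to fit a linear trend to each topic's weight over time and read the sign of the slope. The sketch below illustrates that idea only; the abstract does not state how the study computed its quarterly trends, so the regression approach is an assumption, and the weights are invented placeholder values attached to two topic names from the abstract.

```python
from statistics import mean


def trend_slope(weights):
    """Ordinary least-squares slope of weights against period index 0..n-1."""
    n = len(weights)
    if n < 2:
        raise ValueError("need at least two periods to estimate a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(weights)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, weights))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def label_topics(quarterly_weights):
    """Label each topic 'hot' (rising), 'cold' (falling), or 'flat'."""
    labels = {}
    for topic, weights in quarterly_weights.items():
        slope = trend_slope(weights)
        labels[topic] = "hot" if slope > 0 else "cold" if slope < 0 else "flat"
    return labels


# Placeholder quarterly topic weights, not values from the study.
quarterly_weights = {
    "Bus stops": [0.04, 0.05, 0.06, 0.08],
    "CCTV installation": [0.09, 0.07, 0.06, 0.04],
}

labels = label_topics(quarterly_weights)
print(labels)
```

In practice the slope would usually be paired with a significance test so that small fluctuations are not labeled as trends.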

Analysis of the supportive care needs of the parents of preterm children in South Korea using big data text-mining: Topic modeling

  • Park, Ji Hyeon;Lee, Hanna;Cho, Haeryun
    • Child Health Nursing Research / v.27 no.1 / pp.34-42 / 2021
  • Purpose: The purpose of this study was to identify the supportive care needs of parents of preterm children in South Korea using text data from a portal site. Methods: In total, 628 online newspaper articles and 1,966 social network service posts published between January 1 and December 31, 2019 were analyzed. The procedures in this study were conducted in the following order: keyword selection, data collection, morpheme analysis, keyword analysis, and topic modeling. Results: The term "yirundung-yi", which is a native Korean word referring to premature infants, was confirmed to be a useful term for parents. The following four topics were identified as the supportive care needs of parents of preterm children: 1) a vague fear of caring for a baby upon imminent neonatal intensive care unit discharge, 2) real-world difficulties encountered while caring for preterm children, 3) concerns about growth and development problems, and 4) anxiety about possible complications. Conclusion: Supportive care interventions for parents of preterm children should include general parenting methods for babies. A team composed of multidisciplinary experts must support the individual growth and development of preterm children and manage the complications of prematurity using highly accessible media.

Real-time private consumption prediction using big data (빅데이터를 이용한 실시간 민간소비 예측)

  • Seung Jun Shin;Beomseok Seo
    • The Korean Journal of Applied Statistics / v.37 no.1 / pp.13-38 / 2024
  • As economic uncertainties have increased recently due to COVID-19, there is a growing need to quickly grasp private consumption trends that directly reflect the economic situation of private economic entities. This study proposes a method of estimating private consumption in real-time by comprehensively utilizing big data as well as existing macroeconomic indicators. In particular, it is intended to improve the accuracy of private consumption estimation by comparing and analyzing various machine learning methods that are capable of fitting ultra-high-dimensional big data. As a result of the empirical analysis, it has been demonstrated that when the number of covariates including big data is large, variables can be selected in advance and used for model fit to improve private consumption prediction performance. In addition, as the inclusion of big data greatly improves the predictive performance of private consumption after COVID-19, the benefit of big data that reflects new information in a timely manner has been shown to increase when economic uncertainty is high.
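The idea of selecting variables in advance, before fitting a model on many covariates, can be sketched with a simple marginal-correlation screen. This is one possible pre-selection rule chosen for illustration; the abstract does not say which selection method the authors used, and the covariate names and values below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def screen_covariates(covariates, target, keep):
    """Keep the `keep` covariates with the largest absolute correlation to the target."""
    ranked = sorted(
        covariates,
        key=lambda name: abs(pearson(covariates[name], target)),
        reverse=True,
    )
    return ranked[:keep]


# Placeholder series, not data from the study.
consumption = [1.0, 2.0, 3.0, 4.0, 5.0]
covariates = {
    "card_spending": [1.1, 1.9, 3.2, 3.9, 5.1],  # moves with the target
    "search_index": [5.0, 4.1, 2.9, 2.2, 1.0],   # moves against the target
    "noise": [2.0, -1.0, 2.0, -1.0, 2.0],        # unrelated to the target
}

selected = screen_covariates(covariates, consumption, keep=2)
print(selected)
```

The retained covariates would then be passed to whichever forecasting model is being compared.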

Developing a Solution to Improve Road Safety Using Multiple Deep Learning Techniques

  • Humberto, Villalta;Min gi, Lee;Yoon Hee, Jo;Kwang Sik, Kim
    • International Journal of Internet, Broadcasting and Communication / v.15 no.1 / pp.85-96 / 2023
  • The number of traffic accidents caused by wet or icy road surface conditions is rising every year. Car crashes in such road conditions can increase fatalities and serious injuries. Historical data (from 2016 to 2020) on weather-related traffic accidents show that fatality rates are fairly high in Korea. This calls for accurate prediction and identification of hazardous road conditions. In this study, a forecasting model is developed to predict the likelihood of traffic accidents on roads affected by weather and road surface conditions. Multiple deep learning algorithms, including AlexNet and a 2D-CNN, are employed. Data from orthophoto images, automatic weather systems, automated synoptic observing systems, and road surfaces are used for training and testing. The orthophoto images are pre-processed before being used as input data for the modeling process. The procedure involves image segmentation techniques as well as the Z-Curve index. Results indicate acceptable prediction performance: 65% for dry, 46% for moist, and 33% for wet road conditions. The overall accuracy of the model is 53%. The findings of the study may contribute to developing comprehensive measures for enhancing road safety.