• Title/Summary/Keyword: news data

Search Result 888, Processing Time 0.035 seconds

Social Media Rumors in Bangladesh

  • Al-Zaman, Md. Sayeed;Sife, Sifat Al;Sultana, Musfika;Akbar, Mahbuba;Ahona, Kazi Taznahel Sultana;Sarkar, Nandita
    • Journal of Information Science Theory and Practice / v.8 no.3 / pp.77-90 / 2020
  • This study analyzes N=181 social media rumors from Bangladesh to identify their most popular themes, sources, and aims. The results show that social media rumors have seven popular themes: political, health & education, crime & human rights, religious, religiopolitical, entertainment, and other. Online media and mainstream media are the two main sources of social media rumors, which have three tentative aims: positive, negative, and unknown. The major findings are: political rumors dominate social media, but their share is decreasing, while religion-related rumors are increasing; most social media rumors are negative and emerge from online media, and social media itself is the dominant online source of social media rumors; and most health-related rumors are negative and surge during crisis periods such as the COVID-19 pandemic. The paper acknowledges limitations in its data collection period, data source, and data analysis. It also offers a few research directions and explains the contributions of its results to academia and policymaking.

Performance of speech recognition unit considering morphological pronunciation variation (형태소 발음변이를 고려한 음성인식 단위의 성능)

  • Bang, Jeong-Uk;Kim, Sang-Hun;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.10 no.4 / pp.111-119 / 2018
  • This paper proposes a method to improve speech recognition performance by extracting various pronunciations of pseudo-morpheme units from an eojeol-unit corpus and generating a new recognition unit that accounts for pronunciation variation. In the proposed method, we first align the pronunciations of the eojeol units and the pseudo-morpheme units, and then expand the pronunciation dictionary by extracting new pronunciations of the pseudo-morpheme units from the pronunciations of the eojeol units. We then propose a new pronunciation-dependent recognition unit, obtained by tagging the extracted phoneme symbols according to the pseudo-morpheme units. The proposed units and their extended pronunciations are incorporated into the lexicon and language model of the speech recognizer. Performance is evaluated using a Korean speech recognizer with a trigram language model trained on a 100-million pseudo-morpheme corpus and an acoustic model trained on 445 hours of multi-genre broadcast speech data. The proposed method is shown to achieve a relative word error rate reduction of 13.8% on the news-genre evaluation data and 4.5% on the total evaluation data.
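
The relative reductions quoted above can be illustrated with a small worked example. This is only a sketch: the abstract does not report absolute word error rates, so the error counts below are hypothetical values chosen so that the relative reduction comes out at 13.8%.

```python
def word_error_rate(substitutions, deletions, insertions, reference_words):
    """WER = (S + D + I) / N, where N is the number of reference words."""
    return (substitutions + deletions + insertions) / reference_words


def relative_reduction(baseline_wer, proposed_wer):
    """Fraction of the baseline error rate removed by the proposed system."""
    return (baseline_wer - proposed_wer) / baseline_wer


# Hypothetical counts, NOT taken from the paper.
baseline_wer = word_error_rate(600, 200, 200, 10_000)  # 1000 errors -> 0.1000
proposed_wer = word_error_rate(520, 171, 171, 10_000)  # 862 errors  -> 0.0862

reduction = relative_reduction(baseline_wer, proposed_wer)
print(f"relative WER reduction: {reduction:.1%}")
```

A relative reduction is measured against the baseline error rate, so the same 13.8% figure corresponds to different absolute gains depending on how high the baseline is.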

A Study on the Awareness of Artificial Intelligence Development Ethics based on Social Big Data (소셜 빅데이터 기반 인공지능 개발윤리 인식 분석)

  • Kim, Marie;Park, Seoha;Roh, Seungkook
    • Journal of Engineering Education Research / v.25 no.3 / pp.35-44 / 2022
  • Artificial intelligence is a core technology in the era of digital transformation, and as the technology advances and is used in more industries, its influence is growing in various fields, raising social, ethical, and legal issues. It is therefore time to raise social awareness of the ethics of artificial intelligence as a preventive measure, alongside improving the laws and institutional systems related to artificial intelligence development. In this study, we analyzed unstructured data, typically text such as online news articles and comments, to gauge the degree of social awareness of the ethics of artificial intelligence development. The analysis showed that the public tended to concentrate on specific issues such as "Human," "Robot," and "President" in 2018 to 2019, while in 2020 to 2021 it was interested in the use of personal information and gender conflicts.

Governance of A Public Platform Project in the Context of Digital Transformation Focusing on the 'Special Delivery' (공공플랫폼 구축사업의 거버넌스: 경기도 배달플랫폼 '배달특급'의 사례를 중심으로)

  • Seo, Jeongone
    • Journal of Information Technology Services / v.21 no.5 / pp.15-28 / 2022
  • Recently, government agencies have been actively adopting the platform model as a means of public policy. However, existing studies on public platforms are few and have focused on user experiences or on the possibility of public use of the platform model. Research is now keenly needed on building governance structures and exploiting the network effects of a platform after the platform model has been adopted in the public sector. This study aims to ignite academic dialogue on the governance of public platforms in the context of digital transformation. It focuses on the case of 'Special Delivery,' a public delivery app established by Gyeonggi-do. To analyze the characteristics of the public platform and its governance structure, data were collected from press releases, policy reports, and news articles. The data were analyzed using the frame of Hagiu's platform design factors and Ansell & Gash's collaborative governance model. The analysis showed 1) incompleteness in the value trade-off accounting, which was designed for a platform business based on general cost-benefit analysis, and 2) a closed governance structure that limits direct participation of diverse user groups (i.e., service providers and customers) in order to enhance providers' utility by preventing customers' excessive online activities. The results provide theoretical and policy implications for designing a strategy to account for value trade-offs and a functioning governance structure for public platforms.

Data Analysis Research to Analyze the Cause of Low Birth Rate (저출산 원인 확인을 위한 데이터 분석연구)

  • Lee, Jeongwon;Lee, Choong Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.496-498 / 2021
  • In Korea, the total population grew steadily on the back of the high fertility rate before 1980, but since the mid-1980s the fertility rate has fallen sharply and is now below the population replacement level. A low birth rate in a region is not simply the result of voluntary rejection of childbirth; its causes need to be found by identifying the structural causes in the local community from various angles. We collected local Internet news and data from a representative local online cafe where many mothers participate, based on the budget area, which has a very low fertility rate among various areas. Factors inhibiting childbirth were analyzed using the frequency of co-occurring words that became issues in relation to population decline, low birth rate, and child-rearing welfare.
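
Counting the frequency of co-occurring words, as described above, might look like the following minimal sketch. The tokenized documents are hypothetical English stand-ins, and counting each word pair at most once per document is an assumption; the abstract does not describe the authors' actual tokenization or counting window.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for every unordered word pair, how many documents contain both words."""
    counts = Counter()
    for tokens in documents:
        # Deduplicate and sort so each pair has one canonical (a, b) ordering.
        unique_words = sorted(set(tokens))
        for pair in combinations(unique_words, 2):
            counts[pair] += 1
    return counts


# Hypothetical tokenized posts, for illustration only.
docs = [
    ["childcare", "cost", "housing"],
    ["housing", "cost", "jobs"],
    ["childcare", "cost"],
]

pair_counts = cooccurrence_counts(docs)
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```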


Conflict Analysis in Construction Project with Unstructured Data: A Case Study of Jeju Naval Base Project in South Korea

  • Baek, Seungwon;Han, Seung Heon;Lee, Changjun;Jang, Woosik;Ock, Jong Ho
    • International conference on construction engineering and project management / 2017.10a / pp.291-296 / 2017
  • Infrastructure development as a national project suffers from social conflict, which is one of the main risks to be managed. Social conflicts have a negative impact not only on social integration but also on the national economy, as they incur enormous social costs to resolve. Against this backdrop, this study analyzes social conflict using articles published by online news media, based on web-crawling and natural language processing (NLP) techniques. As an illustrative case, the Jeju Naval Base (JNB) project, one of the representative conflict cases in South Korea, is analyzed. A total of 21,788 articles are identified, and representative keywords are extracted for each year. Additionally, a comparative analysis is conducted between the extracted keywords and actual events that occurred during the project. The authors explain actual events in the JNB project based on the words extracted for each year. This study contributes to analyzing social conflict and extracting meaningful information from unstructured data.


The 2018 US Midterm Elections and the Latino Voting: Diversity and Change (미국의 2018년 중간선거와 라티노 투표: 다양성과 변화)

  • Lee, Byung-Jae
    • Korean Journal of Legislative Studies / v.25 no.1 / pp.5-44 / 2019
  • The 2018 midterm elections were considered a referendum on the Trump presidency, especially because the Latino community has felt that the anti-immigration, anti-Latino policies of the Trump administration are harmful to the community. News media and pundits predicted a boost in Latino turnout and positive effects on Democratic candidates at all levels. The purpose of this paper is to provide an overview of Latino demographics and Latino public opinion and to analyze the election results with exit poll data and actual aggregate data. The data analysis shows that, compared to the 2014 midterm elections, Latino turnout and support for Democratic candidates actually increased in most counties and precincts, and that this is more salient in areas with a heavy Latino concentration.

Prediction of infectious diseases using multiple web data and LSTM (다중 웹 데이터와 LSTM을 사용한 전염병 예측)

  • Kim, Yeongha;Kim, Inhwan;Jang, Beakcheol
    • Journal of Internet Computing and Services / v.21 no.5 / pp.139-148 / 2020
  • Infectious diseases have long plagued mankind, and predicting and preventing them has been a major challenge. For this reason, various studies have been conducted to predict infectious diseases. Most of the early studies relied on epidemiological data from the Centers for Disease Control and Prevention (CDC); the problem was that the data provided by the CDC was updated only once a week, making it difficult to predict the number of disease outbreaks in real time. With the emergence of various Internet media due to the recent development of IT, studies have been conducted to predict the occurrence of infectious diseases from web data, and most of the studies we surveyed used a single source of web data. However, disease forecasting from a single web data source has the disadvantage that it is difficult to collect large amounts of training data and to make accurate predictions for recent outbreaks such as "COVID-19". We therefore aim to demonstrate experimentally that LSTM models using multiple web data sources predict the occurrence of infectious diseases more accurately than models using a single web data source, and to suggest models suitable for predicting infectious diseases. In this experiment, we predicted the occurrence of "Malaria" and "Epidemic-parotitis" using single web data models and the model we propose. A total of 104 weeks of news, SNS, and search query data were collected, of which 75 weeks were used as training data and 29 weeks as verification data. When the verification data were predicted with our proposed model and with the single web data models, the proposed model showed the highest Pearson correlation coefficients, 0.94 and 0.86, and the lowest RMSE, 0.19 and 0.07.
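
The two evaluation metrics named above, and the 75/29-week split, can be sketched as follows. This does not reproduce the authors' LSTM pipeline: the chronological ordering of the split is an assumption, and the case counts and predictions are hypothetical numbers used only to exercise the functions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


# 104 weekly observations, split 75 / 29 (assumed chronological).
weeks = list(range(104))
train_weeks, test_weeks = weeks[:75], weeks[75:]

# Hypothetical observed and predicted weekly values, for illustration only.
observed = [0.2, 0.4, 0.5, 0.3, 0.6]
predicted = [0.25, 0.35, 0.55, 0.30, 0.50]

print("Pearson r:", round(pearson(observed, predicted), 3))
print("RMSE:", round(rmse(observed, predicted), 3))
```

A higher correlation and a lower RMSE both indicate predictions that track the observed series more closely, which is how the abstract compares the proposed model against the single-source models.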

A Study of Perception of Golfwear Using Big Data Analysis (빅데이터를 활용한 골프웨어에 관한 인식 연구)

  • Lee, Areum;Lee, Jin Hwa
    • Fashion & Textile Research Journal / v.20 no.5 / pp.533-547 / 2018
  • The objective of this study is to examine the perception of golfwear and related trends based on major keywords and associated words related to golfwear, utilizing big data. For this study, data containing the keywords 'Golfwear' and 'Golf clothes' were collected from blogs, Jisikin and Tips, news articles, and web cafés on two of the most commonly used search engines (Naver & Daum). Frequency and matrix data were extracted through Textom for the period from January 1, 2016 to December 31, 2017. From the matrix created by Textom, degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality were calculated and analyzed using Netminer 4.0. The analysis found that the keyword 'brand' ranked highest in web visibility, followed by 'woman', 'size', 'man', 'fashion', 'sports', 'price', 'store', 'discount', and 'equipment' in the top 10 frequency rankings. For the centrality calculations, only the top 30 keywords were included because the density was extremely high due to the high frequency of co-occurring keywords. The centrality calculations showed that the top-ranked keywords were similar to those in the frequency ranking of the raw data. When the frequency was adjusted by subtracting 100 and 500 words, the results differed: keywords that ranked low in the frequency analysis, such as J. Lindberg, ranked high, and the rankings changed across all centrality calculations. These findings provide a basis for marketing strategies and for ways to increase awareness and web visibility of golfwear brands.
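
Of the four centrality measures listed above, degree centrality is the simplest to illustrate. The sketch below uses a hypothetical keyword co-occurrence network and the common normalization by n - 1; it is not the Netminer 4.0 computation, whose exact normalization may differ.

```python
def degree_centrality(adjacency):
    """Normalized degree centrality: number of neighbors divided by (n - 1)."""
    n = len(adjacency)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adjacency.items()}


# Hypothetical undirected co-occurrence network: an edge means the two
# keywords appeared together in at least one document.
keyword_network = {
    "brand": {"woman", "size", "price"},
    "woman": {"brand", "size"},
    "size": {"brand", "woman"},
    "price": {"brand"},
}

centrality = degree_centrality(keyword_network)
ranking = sorted(centrality, key=centrality.get, reverse=True)
print(ranking[0], centrality[ranking[0]])
```

In a very dense network nearly every keyword is connected to every other, so degree centrality values bunch together near 1, which is consistent with the abstract's decision to restrict the calculation to the top 30 keywords.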

A Study on Development of Data Broadcasting Service Using RSS on Web 2.0 Environment (웹 2.0 환경에서 RSS를 활용한 데이터방송 서비스 구현에 대한 연구)

  • Jang, Yun-Yong;Yim, Hyun-Jeong;Lim, Soon-Bum
    • Journal of Korea Multimedia Society / v.12 no.5 / pp.664-676 / 2009
  • As data broadcasting has become available, diverse content can now be provided through digital TV, IPTV, and DMB; yet at present there is not enough killer content to satisfy users. On the web, by contrast, the content market has grown greatly with the advent of Web 2.0, which aims at user-centered services. The ideas and technologies of Web 2.0, if grafted onto data broadcasting, are expected to contribute to the vitalization of content. This paper proposes a method of using RSS as a concrete example of applying Web 2.0 to data broadcasting. Accordingly, we have developed a system for producing data services for terrestrial DMB data broadcasting that uses RSS at the authoring stage for news and similar content, for which providing the latest information is important. For IPTV, we have developed a data broadcasting application that lets users select the RSS feeds they want, as well as a creation system that uses RSS at the authoring stage. It is anticipated that applying RSS through this system will simplify the authoring process, make it easy to provide the latest information from the web, and thereby make it possible to provide diverse services to users.
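
Reading news items from an RSS feed at the authoring stage, as described above, could be sketched as follows. This is not the authors' system: the feed content, the `latest_headlines` helper, and the `limit` parameter are hypothetical, and only Python's standard `xml.etree.ElementTree` parser is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document standing in for a real news feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>Headline one</title><link>http://example.com/1</link></item>
    <item><title>Headline two</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def latest_headlines(rss_text, limit=5):
    """Return up to `limit` (title, link) pairs in the order the feed lists them."""
    root = ET.fromstring(rss_text)
    items = root.findall("./channel/item")
    return [(item.findtext("title"), item.findtext("link")) for item in items[:limit]]


headlines = latest_headlines(SAMPLE_RSS)
for title, link in headlines:
    print(title, link)
```

Because the feed already carries structured titles and links, an authoring tool can refresh the broadcast data service by re-reading the feed rather than re-entering each item by hand.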
