• Title/Summary/Keyword: twitter data


A Design of Smart Retweet Supporting the Efficient Information Transfer (효과적인 정보전달을 지원하는 스마트 리트윗의 설계)

  • Jeong, Do-Seong;Cho, Dae-Soo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.05a / pp.252-255 / 2011
  • As smartphones spread and the constraints on data communication diminish, Twitter and Facebook have become subjects of growing interest. Whereas Facebook users must obtain the other party's consent to form a relationship, Twitter's procedure is relatively simple, so its information ripple effect is excellent. Twitter has moved beyond a simple social networking service (SNS) to become one of the popular media, and one of its most powerful features is the retweet. A retweet lets a user send a tweet they sympathize with to their own subscribers, so information can spread quickly. In this paper, we propose the smart retweet, in which the system actively extends the existing retweet. Realizing the smart retweet requires additional criteria for determining the destination of the information. The destination is determined based on the region where the tweet was generated or the local information mentioned in the tweet. The smart retweet is expected to increase the speed and scope of information transmission.

Unspecified Event Detection System Based on Contextual Location Name on Twitter (트위터에서 문맥상 지역명을 기반으로 한 불특정 이벤트 탐지 시스템)

  • Oh, Pyeonghwa;Yim, Junyeob;Yoon, Jinyoung;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering / v.3 no.9 / pp.341-348 / 2014
  • The spread of smartphones has improved web accessibility and led to a rapid increase in users on social network platforms. Many research projects are under way to detect events using Twitter, because its open network gives it a powerful influence on the dissemination of information and it is a representative service that generates more than 500 million tweets a day on average. However, existing event detection studies have used the TFIDF algorithm without considering the various conditions of tweets, and some of them detected only predefined events. In this paper, we propose the RTFIDF VT algorithm, a modification of TFIDF that reflects the features of Twitter. We also verified the optimal ranges of TF and DF for detecting events through experiments. Finally, we suggest a system that extracts result sets of places and related keywords at a given time using the RTFIDF VT algorithm and the validated ranges of TF and DF.
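
For orientation, the baseline that this abstract modifies can be sketched as plain TF-IDF over tweets. This is a minimal sketch of the standard algorithm only, not the RTFIDF VT variant, whose details the abstract does not give. The function name `tf_idf` and the sample tweets are hypothetical, and each tweet is assumed to count as one document.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document using plain TF-IDF."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


# Hypothetical tweets (illustration only), each treated as one document.
tweets = [
    "fire near gangnam station",
    "traffic jam near gangnam",
    "concert tonight in busan",
]
scores = tf_idf(tweets)
top_term = max(scores[0], key=scores[0].get)
```

Terms that appear in several tweets, such as the place name here, receive lower scores than terms unique to one tweet, which is the behaviour a Twitter-specific variant would need to adjust.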

Social Issue Analysis Based on Sentiment of Twitter Users (트위터 사용자들의 감성을 이용한 사회적 이슈 분석)

  • Kim, Hannah;Jeong, Young-Seob
    • Journal of Convergence for Information Technology / v.9 no.11 / pp.81-91 / 2019
  • Recently, social network services (SNS) have been actively used by the public. Among them, Twitter carries many tweets that express sentiment, and its data is convenient to collect through an open Application Programming Interface (API). In this paper, we analyze social issues using the sentiment information of users and suggest the possibility of applying the results to marketing. We collect Twitter text about social issues and classify it as positive or negative with a sentiment classifier to provide a qualitative analysis. We provide a quantitative analysis by analyzing the correlation between the number of likes and retweets of each tweet. Based on the qualitative analysis, we suggest ways to attract the interest of the public or consumers. Based on the quantitative analysis, we conclude that a positive tweet should be brief to attract users' attention on Twitter. As future work, we will continue to analyze various social issues.
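
The quantitative step described above, correlating like and retweet counts, might look like the following sketch. The abstract does not say which correlation coefficient was used, so the Pearson coefficient is an assumption, and the counts below are made-up illustration values, not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-tweet counts (illustration only).
likes = [3, 10, 25, 40, 80]
retweets = [1, 4, 9, 15, 33]
r = pearson(likes, retweets)
```

A value of `r` close to 1 would indicate that tweets with more likes also tend to receive more retweets.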

A Method for Twitter Spam Detection Using N-Gram Dictionary Under Limited Labeling (트레이닝 데이터가 제한된 환경에서 N-Gram 사전을 이용한 트위터 스팸 탐지 방법)

  • Choi, Hyeok-Jun;Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering / v.6 no.9 / pp.445-456 / 2017
  • In this paper, we propose a method to detect spam tweets containing unhealthy information by using an n-gram dictionary under limited labeling. Spam tweets that contain unhealthy information tend to use similar words and sentences. Based on this characteristic, we show that spam tweets can be detected effectively by applying a naive Bayes classifier that uses n-gram dictionaries constructed from spam tweets and normal tweets. However, constructing an initial training set is very costly because a large amount of data flows through Twitter in real time. Therefore, a spam detection method is needed that can be applied when the initial training set is very small or nonexistent. To solve this problem, we propose a method that generates pseudo-labels by utilizing Twitter's retweet function and uses them to build the initial training set and update the n-gram dictionary. Results from various experiments using 1.3 million Korean tweets collected from December 1, 2016 to December 7, 2016 show that the proposed method outperforms the compared spam detection methods.
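
The classifier described above can be illustrated with a small naive Bayes model over per-class n-gram dictionaries. This is a minimal sketch under stated assumptions: the class name `NGramNaiveBayes`, the choice of bigrams, the add-one smoothing, and the training tweets are all hypothetical, and the paper's retweet-based pseudo-labeling step is not reproduced because the abstract does not describe it.

```python
import math
from collections import Counter


def ngrams(text, n=2):
    """Split text on whitespace and return its word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NGramNaiveBayes:
    def __init__(self, n=2):
        self.n = n
        # One n-gram dictionary (n-gram -> count) per class.
        self.dictionaries = {"spam": Counter(), "normal": Counter()}
        self.doc_counts = Counter()

    def update(self, text, label):
        """Add one labeled (or pseudo-labeled) tweet to the dictionaries."""
        self.dictionaries[label].update(ngrams(text, self.n))
        self.doc_counts[label] += 1

    def predict(self, text):
        vocab = set(self.dictionaries["spam"]) | set(self.dictionaries["normal"])
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.dictionaries.items():
            # Log prior with add-one smoothing over the two classes.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            # Denominator reserves one extra slot for unseen n-grams.
            denom = sum(counts.values()) + len(vocab) + 1
            for gram in ngrams(text, self.n):
                score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training tweets (illustration only).
clf = NGramNaiveBayes(n=2)
clf.update("click here for free prizes", "spam")
clf.update("free prizes click here now", "spam")
clf.update("meeting friends for lunch today", "normal")
clf.update("lunch today was great", "normal")
prediction = clf.predict("click here for free gifts")
```

Because `update` only increments counts, the same dictionaries can be refreshed incrementally as newly labeled tweets arrive.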

Geographical Name Denoising by Machine Learning of Event Detection Based on Twitter (트위터 기반 이벤트 탐지에서의 기계학습을 통한 지명 노이즈제거)

  • Woo, Seungmin;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering / v.4 no.10 / pp.447-454 / 2015
  • This paper proposes a machine-learning method for removing geographical name noise in Twitter-based event detection. Recently, the growing number of smartphone users has led to a growing number of SNS users. In particular, Twitter's short messages (up to 140 characters) and follow feature let it convey and diffuse information quickly. These characteristics, together with its mobile-optimized design, give Twitter a fast information delivery speed, so it can play a role in reporting disasters or events. Related research has used individual Twitter users as sensors to detect events that occur in the real world. That research used geographical names as keywords, exploiting the fact that an event occurs in a specific place. However, it did not remove the noise caused by homographs of geographical names, which became an important factor lowering the accuracy of event detection. In this paper, we apply denoising with two methods, removal and prediction. First, after a filtering step that uses a purpose-built noise database, we determine whether a term is a geographical name using naive Bayes classification. Finally, we obtained the machine-learning probability values from experimental data. With the prediction technique proposed in this paper, the reliability of the denoising technique was found to be 89.6%.

Analysis of the Spread of Issues Related to COVID-19 Vaccine on Twitter: Focusing on Issue Salience (코로나19 백신 관련 트위터 상의 이슈 확산 양상 분석: 이슈 현저성을 중심으로)

  • Hong, Juhyun;Lee, Mina
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.613-621 / 2021
  • This study conducted a network analysis to determine how COVID-19 vaccine-related issues spread on Twitter during the introduction stage of the COVID-19 vaccine. Issue diffusion is analyzed by time period: phase 1 (initiation of vaccine introduction: March 7 - April 3, 2021), phase 2 (stagnant period of vaccination: April 4 - April 22, 2021), and phase 3 (increase of vaccination: April 23 - May 5, 2021). NodeXL was used for data collection and analysis. Daily Twitter network data were collected by entering search terms highly related to the COVID-19 vaccine. This study found that opinions related to side effects formed repeatedly throughout the analysis period. As the vaccination rate increased and deaths were reported in the media, death-related issues also emerged on Twitter. On the other hand, vaccine safety did not receive much attention on Twitter. The results of this study highlight the role of social media as a channel of issue diffusion when a national disaster strikes. We emphasize the need for the government to monitor public opinion on social media and reflect it in crisis communication strategies.

Spatiotemporal Data Visualization using Gravity Model (중력 모델을 이용한 시공간 데이터의 시각화)

  • Kim, Seokyeon;Yeon, Hanbyul;Jang, Yun
    • Journal of KIISE / v.43 no.2 / pp.135-142 / 2016
  • Visual analysis of spatiotemporal data has focused on a variety of techniques for analyzing and exploring the data. The goal of these techniques is to explore spatiotemporal data using time information, discover patterns in the data, and analyze them. Overall trend flow patterns help users analyze geo-referenced temporal events. However, it is difficult to extract and visualize overall trend flow patterns from data that has no trajectory information for movements. In this paper, in order to visualize overall trend flow patterns, we estimate continuous distributions of discrete events over time using kernel density estimation (KDE), and we extract vector fields from the continuous distributions using the gravity model. We then apply our technique to Twitter data to validate it.
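
The two steps named above, density estimation followed by vector extraction, can be sketched roughly as follows. The abstract does not give the paper's formulation, so this sketch assumes a Gaussian kernel on a small integer grid and a Newton-style attraction in which each cell's density acts as its mass. The function names, grid size, bandwidth, and event locations are all hypothetical.

```python
import math


def kde_grid(points, size, bandwidth=1.0):
    """Gaussian KDE of 2D points evaluated on a size x size integer grid."""
    norm = 1.0 / (2 * math.pi * bandwidth ** 2 * len(points))
    grid = [[0.0] * size for _ in range(size)]
    for gy in range(size):
        for gx in range(size):
            grid[gy][gx] = norm * sum(
                math.exp(-((gx - px) ** 2 + (gy - py) ** 2) / (2 * bandwidth ** 2))
                for px, py in points
            )
    return grid


def gravity_vector(grid, x, y):
    """Net pull on cell (x, y): every other cell attracts with mass / distance^2."""
    fx = fy = 0.0
    for gy, row in enumerate(grid):
        for gx, mass in enumerate(row):
            dx, dy = gx - x, gy - y
            d2 = dx * dx + dy * dy
            if d2 == 0:
                continue  # a cell does not attract itself
            d = math.sqrt(d2)
            # Magnitude mass / d2 along the unit vector (dx / d, dy / d).
            fx += mass * dx / (d2 * d)
            fy += mass * dy / (d2 * d)
    return fx, fy


# Hypothetical event locations (illustration only), clustered near (4, 4).
events = [(4, 4), (4, 3), (3, 4)]
density = kde_grid(events, size=6)
fx, fy = gravity_vector(density, 2, 2)
```

Evaluating `gravity_vector` at every cell would give a vector field that points toward dense regions, which is one plausible reading of a flow pattern for data without trajectories.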

Tweet Entity Linking Method based on User Similarity for Entity Disambiguation (개체 중의성 해소를 위한 사용자 유사도 기반의 트윗 개체 링킹 기법)

  • Kim, SeoHyun;Seo, YoungDuk;Baik, Doo-Kwon
    • Journal of KIISE / v.43 no.9 / pp.1043-1051 / 2016
  • Web-based entity linking cannot be applied to tweet entity linking because Twitter documents are shorter than web documents. Therefore, tweet entity linking uses information about users or groups. However, a data sparseness problem occurs for users who have too little Twitter activity data, and using information from unrelated groups can reduce the accuracy of the linking result. To solve the data sparseness problem, we consider three features: the meaning of a single tweet, the user's own tweet set, and the tweet sets of other users. Furthermore, we improve the performance and accuracy of tweet entity linking by assigning a weight to information from users with high similarity. Through a comparative experiment using actual Twitter data, we verify that the proposed tweet entity linking method has higher performance and accuracy than existing methods, and that using information from highly similar users is correlated with mitigating the data sparseness problem and improving linking accuracy.

Semantic-Based K-Means Clustering for Microblogs Exploiting Folksonomy

  • Heu, Jee-Uk
    • Journal of Information Processing Systems / v.14 no.6 / pp.1438-1444 / 2018
  • Recently, with the development of Internet technologies and the spread of smart devices, the use of microblogs such as Facebook, Twitter, and Instagram has been increasing rapidly. Many users check microblogs for new information because the content on their timelines is continually updated. Clustering algorithms are therefore needed to arrange microblog content into groups for users who want the newest information. However, microblogs have word limits, so there is not enough information to analyze for content clustering. In this paper, we propose a semantic-based K-means clustering algorithm that measures not only the similarity between data represented in a vector space model but also the semantic similarity between the data, by exploiting the TagCluster for clustering. Experimental results on the RepLab2013 Twitter dataset show the effectiveness of the semantic-based K-means clustering algorithm.
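
The vector space half of this approach can be sketched as K-means-style clustering over bag-of-words vectors with cosine similarity. This is a simplified illustration, not the paper's algorithm: the TagCluster-based semantic similarity is omitted because the abstract does not define it. The function names, the choice to initialize centroids from the first k posts, and the sample posts are hypothetical.

```python
import math
from collections import Counter


def vectorize(texts):
    """Represent each text as a term-count vector over a shared vocabulary."""
    vocab = sorted({word for text in texts for word in text.lower().split()})
    vectors = []
    for text in texts:
        counts = Counter(text.lower().split())
        vectors.append([counts[word] for word in vocab])
    return vectors


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def kmeans(vectors, k, iterations=10):
    """Assign each vector to the most cosine-similar centroid, then re-average."""
    centroids = [list(v) for v in vectors[:k]]  # deterministic init: first k vectors
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        assignment = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical microblog posts (illustration only).
posts = [
    "new phone camera review",
    "election vote results tonight",
    "phone camera battery review",
    "vote results election night",
]
labels = kmeans(vectorize(posts), k=2)
```

Short posts share few words, so word overlap alone separates these four only because they were chosen to overlap; that limitation is the motivation the abstract gives for adding a semantic similarity term.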

Design and Implementation of Automated Twitter Data Collecting System : Focus on Environmental Data (자동화된 트위터 데이터 수집 시스템 설계 및 구현 : 환경 데이터를 중심으로)

  • Kim, Do-Hyung;Koo, Jahwan;Kim, Ung-Mo
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.361-364 / 2020
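  • As the number of social network service users grows, services that make use of the big data generated on social network services are increasing. Social network service data is generated in real time, so the data collection system also needs to be automated to collect data in near real time. In this paper, we propose an automated collection system for continuously collecting data from Twitter, a representative social network service. The collection system gathers content and metadata through a Python library that uses the Twitter API, re-validates the collected data, and then stores it. We also verified the implemented system by entering queries on the topic of environmental data and collecting actual Twitter data.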