• Title/Abstract/Keywords: twitter data

Search results: 301 items; processing time: 0.021 seconds

Twitter Crawling System

  • Ganiev, Saydiolim;Nasridinov, Aziz;Byun, Jeong-Yong
    • Journal of Multimedia Information System / Vol. 2, No. 3 / pp.287-294 / 2015
  • We live in an age of information in which the Internet touches every aspect of our lives. It provides a wide range of services, each of which benefits people in a different way. Electronic mail (e-mail), File Transfer Protocol (FTP), voice/video communication, and search engines are prominent examples of Internet services. Among them, social network services (SNSs) have steadily gained popularity in recent years. The most popular SNSs, such as Facebook, Weibo, and Twitter, generate millions of data items every minute. Twitter is an SNS that allows its users to post short instant messages. Its 100 million users posted 340 million tweets per day in 2012 [1]. Such large volumes of data often contain a great deal of noise, that is, uninteresting and unclassifiable data. Nevertheless, researchers can take advantage of this information to analyze it and extract meaningful and interesting features. SNS data, including tweets, are collected by crawlers, and Twitter crawlers have recently emerged as an effective tool for collecting Twitter data. In this project, we develop a Twitter crawler system that enables us to extract Twitter data. We implemented the system in Java with MySQL, using Twitter4J, a Java library for communicating with the Twitter API. The application first connects to the Twitter API, then retrieves tweets, and finally stores them in a database. We also develop crawling strategies to extract tweets efficiently in terms of time and volume.
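
The retrieve-then-store loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper uses Java, Twitter4J, and MySQL, whereas this sketch uses Python with an in-memory SQLite database as a stand-in, and `fake_fetch`, the `Tweet` tuple layout, and the table schema are all hypothetical.

```python
import sqlite3
from typing import Callable, Iterable

# Hypothetical tweet record: (tweet_id, user, text). Not the paper's schema.
Tweet = tuple[int, str, str]


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the tweets table if it is missing.

    SQLite stands in for MySQL here (an assumption made for self-containment).
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY, user TEXT NOT NULL, text TEXT NOT NULL)"
    )
    return conn


def count_rows(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]


def crawl(fetch_page: Callable[[int], Iterable[Tweet]],
          conn: sqlite3.Connection, pages: int) -> int:
    """Fetch `pages` batches and store them; return how many new rows were added."""
    before = count_rows(conn)
    for page in range(pages):
        # INSERT OR IGNORE skips tweets whose id is already stored.
        conn.executemany(
            "INSERT OR IGNORE INTO tweets (id, user, text) VALUES (?, ?, ?)",
            fetch_page(page),
        )
        conn.commit()
    return count_rows(conn) - before


def fake_fetch(page: int) -> list[Tweet]:
    """Stand-in for an API call; consecutive pages overlap by one tweet."""
    return [(page * 2 + i, f"user{i}", f"tweet {page * 2 + i}") for i in range(3)]
```

Keying the table on the tweet id lets overlapping batches be re-fetched without storing duplicates, which is one simple way to keep a crawl restartable.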

Conversations about Open Data on Twitter

  • Jalali, Seyed Mohammad Jafar;Park, Han Woo
    • International Journal of Contents / Vol. 13, No. 1 / pp.31-37 / 2017
  • Using network analysis, this study investigates the communication structure of Open Data in the Twitter sphere. It traces communication paths by mapping influential activities and comparing the contents of tweets about Open Data. In 2015 and 2016, the NodeXL software was used to collect tweets containing the term "opendata" from the Twitter network. The structural patterns of social media communication were analyzed through several network characteristics. The results indicate that the most common activities on the Twitter network relate to subjects such as new applications and new technologies in Open Data. The study is the first to focus on the structural and informational patterns of Open Data using social network analysis and content analysis. It will help researchers, activists, and policy-makers gain a clearer understanding of how Open Data is discussed on Twitter.

Analysis and Implications of Twitter Data during the 2012 Election

  • 윤홍원
    • 한국산업정보학회논문지 / Vol. 19, No. 6 / pp.7-13 / 2014
  • Twitter is a microblogging service that allows users to post short messages on a variety of topics in real time. In this work, we analyze Twitter messages posted during the 2012 elections and identify their implications. This study uses Twitter messages related to the 2012 South Korean presidential campaign. The three main candidates are represented by the abbreviations A, M, and P. According to the statistical analysis, the number of tweets and retweets for candidate P was relatively stable over the entire campaign period. Candidate P had the highest percentage of terms related to election pledges, while candidates A and M scored somewhat lower with respect to campaign promises. The ratio of positive terms for candidate P was higher than those for the other two candidates, and the ratio of negative terms in Twitter messages about P was considerably smaller than those for candidates A and M. Taking these results together, we cautiously suggest that Twitter messages posted during an election campaign could be correlated with the outcome of the election.
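
The positive and negative term ratios mentioned above could be computed along these lines. The abstract does not define the ratio, so this sketch assumes it is the share of all tokens that appear in a given term list; the function name and the word lists are hypothetical.

```python
from collections import Counter


def term_ratios(tweets: list[str], positive: set[str], negative: set[str]) -> dict[str, float]:
    """Return the share of tokens found in each lexicon (assumed definition).

    Tokens are lowercased and split on whitespace; lexicon entries are
    expected to be lowercase as well.
    """
    tokens = [word for tweet in tweets for word in tweet.lower().split()]
    if not tokens:
        return {"positive": 0.0, "negative": 0.0}
    counts = Counter(tokens)
    pos = sum(counts[word] for word in positive)
    neg = sum(counts[word] for word in negative)
    return {"positive": pos / len(tokens), "negative": neg / len(tokens)}
```

Running the function once per candidate on that candidate's tweets gives ratios that can be compared across candidates, as the abstract does for A, M, and P.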

The Usage Characteristics of Twitter, and Their Relationship with Gender, Age, and Brand Preferences

  • Ahn, Hyung Jun
    • 한국컴퓨터정보학회논문지 / Vol. 21, No. 3 / pp.73-81 / 2016
  • With the increasing popularity of social network services (SNSs), there have been many attempts to analyze SNS users. Such analysis reveals users' characteristics and preferences, which can help companies provide personalized information and services that users need or find relevant. This study analyzed the usage behavior of Korean Twitter users from various perspectives to deepen the understanding of that behavior. To this end, an online survey of Twitter users was conducted, and data on their actual usage were collected through Twitter's open API. Factor analysis of the data revealed five factors that explain about 69.3% of the variance in the usage variables. The study also investigated how the factors relate to gender, age, and brand preferences. The results showed that Twitter usage behavior is strongly affected by age (p<0.001), and also by gender through an interaction effect (p<0.05). The factors also showed statistically significant correlations with users' brand preferences.

Follower classification system based on the similarity of Twitter node information

  • 계용선;윤영미
    • 한국컴퓨터정보학회논문지 / Vol. 19, No. 1 / pp.111-118 / 2014
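  • The friend recommendation system currently provided by Twitter gives priority to highly influential users. Its drawback is that it does not recommend other users whose user information is highly similar. Because users want recommendations of users with highly similar information, this paper implements a follower recommendation system based on the similarity of user information to overcome this drawback. The data used in this paper are provided by SNAP (Stanford Network Analysis Platform) and consist of user information for Twitter users with 10,000 or more followers together with node-to-node connection data. Using these data as training data, we build a classifier that can classify the relationships between followers and assess its accuracy with 10-fold cross-validation. Given the information of two Twitter users, the system recommends a friend relationship, a follow relationship, or no connection between them.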
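
The 10-fold cross-validation step can be illustrated with a short sketch. The abstract does not name the classifier or its features, so a majority-class baseline (`fit_majority`, hypothetical) stands in for it; only the fold logic is the point here.

```python
from collections import Counter
from typing import Any, Callable, Sequence


def k_fold_indices(n: int, k: int) -> list[list[int]]:
    """Split indices 0..n-1 into k interleaved folds whose sizes differ by at most one."""
    return [list(range(i, n, k)) for i in range(k)]


def cross_validate(xs: Sequence[Any], ys: Sequence[Any],
                   fit: Callable[[list, list], Callable[[Any], Any]],
                   k: int = 10) -> float:
    """Train on k-1 folds, test on the held-out fold, and return overall accuracy."""
    correct = 0
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        predict = fit(train_x, train_y)
        correct += sum(predict(xs[i]) == ys[i] for i in held_out)
    return correct / len(xs)


def fit_majority(train_x: list, train_y: list) -> Callable[[Any], Any]:
    """Placeholder classifier: always predict the most common training label."""
    majority = Counter(train_y).most_common(1)[0][0]
    return lambda x: majority
```

Any function with the same shape as `fit_majority` (train on lists, return a predictor) can be passed as `fit`, so a real three-class model for friend, follow, and unconnected pairs would slot in the same way.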

A Study on Efficient Market Hypothesis to Predict Exchange Rate Trends Using Sentiment Analysis of Twitter Data

  • Komariah, Kokoy Siti;Machbub, Carmadi;Prihatmanto, Ary S.;Sin, Bong-Kee
    • 한국멀티미디어학회논문지 / Vol. 19, No. 7 / pp.1107-1115 / 2016
  • The Efficient Market Hypothesis (EMH) states that at any point in time in a liquid market, security prices fully reflect all available information. This paper presents a study that tests the hypothesis using daily Twitter sentiments, combining a lexicon-based approach with a naïve Bayes classifier. We analyze the exchange rate movement of the Indonesian Rupiah against the US dollar as a way of testing the EMH. To find a correlation between the sentiments predicted from Twitter data and the actual exchange rate trends, we collect Twitter data every day and compute the overall sentiment, labeling each day as positive or negative. Experimental results show 69% correct sentiment predictions and a 65.7% correlation with positive sentiments. This suggests that the market is consistent with the semi-strong form of the EMH, and that the public information provided by Twitter sentiment correlates with changes in exchange market trends.
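
The comparison between daily sentiment labels and exchange rate trends might be sketched as below. This is an illustrative assumption rather than the paper's procedure: per-tweet scores are taken as given (the lexicon and naïve Bayes steps are not reproduced), a day is labeled positive when its scores sum to more than zero, and a positive day is counted as a match when the trend label equals `positive_trend`.

```python
def daily_label(scores: list[float]) -> str:
    """Label a day by the sign of its summed tweet scores (assumed rule)."""
    return "positive" if sum(scores) > 0 else "negative"


def agreement_rate(daily_scores: list[list[float]], trends: list[str],
                   positive_trend: str = "up") -> float:
    """Fraction of days on which the sentiment label matches the trend direction."""
    if len(daily_scores) != len(trends) or not trends:
        raise ValueError("need one trend label per day, and at least one day")
    hits = 0
    for scores, trend in zip(daily_scores, trends):
        sentiment_positive = daily_label(scores) == "positive"
        if sentiment_positive == (trend == positive_trend):
            hits += 1
    return hits / len(trends)
```

Which trend direction should pair with positive sentiment depends on how the rate is quoted, so `positive_trend` is left as a parameter.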

Disaster Events Detection using Twitter Data

  • Yun, Hong-Won
    • Journal of information and communication convergence engineering / Vol. 9, No. 1 / pp.69-73 / 2011
  • Twitter is a microblogging service that allows its users to share short messages, called tweets, with each other. All tweets are visible on a public timeline. These tweets carry a valuable geospatial component and often concern time-critical events. In this paper, our interest is in the rapid detection of disaster events such as tsunamis, tornadoes, forest fires, and earthquakes. We describe a detection system for disaster events and show how to detect a target event from Twitter data. This research examines three disasters during the same time period and compares Twitter activity with Internet news on Google. A significant result of this research is that emergency detection could begin with a microblogging service.
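
One simple way to flag a target event from tweet volume is a burst test on keyword counts. The abstract does not describe the detection rule, so the sketch below is an assumption: it flags a time slot when the keyword count exceeds a multiple of the mean of the preceding slots. The `window` and `factor` values are arbitrary.

```python
def detect_bursts(counts: list[int], window: int = 3, factor: float = 3.0) -> list[int]:
    """Return indices whose count exceeds `factor` times the trailing mean.

    `counts[i]` is the number of tweets mentioning a keyword (for example
    "earthquake") in time slot i. The baseline is floored at 1.0 so that
    near-silent periods do not trigger on a single tweet.
    """
    bursts = []
    for i in range(window, len(counts)):
        baseline = sum(counts[i - window:i]) / window
        if counts[i] > factor * max(baseline, 1.0):
            bursts.append(i)
    return bursts
```

Because the baseline includes the burst itself once it has passed, the slot right after a spike is not flagged again unless volume keeps growing.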

Analysis of YouTube's role as a new platform between media and consumers

  • Hur, Tai-Sung;Im, Jung-ju;Song, Da-hye
    • 한국컴퓨터정보학회논문지 / Vol. 27, No. 2 / pp.53-60 / 2022
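  • Because of its low barrier to entry and the ambiguity of its video regulation standards, YouTube carries fake news based on unverified facts, biased content, and similar material presented as if it were factual. This study therefore analyzes the influence of the press and YouTube on individual behavior and the relationship between the two. Using selenium, beautiful soup, and the Twitter API, data are randomly sampled from YouTube and Twitter, and the 31 most frequently mentioned keywords are identified. Based on these 31 keywords, data are collected from YouTube, Twitter, and Naver News, and the Vader model of NLTK (Natural Language Toolkit) is used to classify and quantify positive, negative, and neutral sentiment for use as analysis data. Correlation analysis of the data showed that as the negative score of the news rises, positive content on YouTube increases. The results show that, because YouTube delivers content that has been processed a second time, its sentiment index does not match that of the news. In other words, processed YouTube content also has an intuitive effect on the positive and negative scores of Twitter, which serves as a channel for communication. In a situation where the rise of yellow journalism, which attracts attention by stimulating people's interest and instincts, has made it difficult to judge information accurately, the study finds that YouTube, which is perceived as harming society through sensational and negative videos, instead plays a role in supporting individuals' discernment.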
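
The correlation step between per-keyword sentiment scores can be sketched with a plain Pearson coefficient. The abstract does not state which correlation measure was used, so Pearson is an assumption; the scores themselves would come from a sentiment model such as VADER, which is not called here.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)
```

For example, passing the news negative scores and the YouTube positive scores for the same keywords, in the same order, yields a value near +1 when the two rise together.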

A Design of Analysis System on TV Advertising Effect of Social Networking Using Hadoop

  • 허서연;김윤희
    • 인터넷정보학회논문지 / Vol. 14, No. 6 / pp.49-57 / 2013
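  • As big data has become a major topic, the development of services that use SNSs, a representative example of big data, has also gained momentum. Unlike existing media, SNSs have grown into a forum where opinions are exchanged in real time, and services that aim to analyze the opinions of many different individuals are increasing. Meanwhile, as media have diversified, the TV advertising industry has come to need new approaches to gathering and analyzing opinions about advertisements. This study therefore analyzes the effect of TV advertising based on Twitter data. In particular, we design and build a system called LiveAD, which uses Hadoop to store and analyze big data such as Twitter data, and show that TV advertisement analysis targeting Twitter can be performed quickly.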
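
The kind of job a Hadoop-based system would run over tweets can be illustrated with a map and reduce pair in plain Python. This is not LiveAD's design, which the abstract does not detail; it is a single-process sketch of the MapReduce pattern, and the function names and ad names are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_mentions(tweet: str, ads: set[str]) -> Iterator[tuple[str, int]]:
    """Map step: emit (ad, 1) for each tracked ad name that appears in the tweet.

    `ads` must contain lowercase names; tweets are lowercased before matching.
    """
    words = set(tweet.lower().split())
    for ad in ads:
        if ad in words:
            yield ad, 1


def reduce_counts(pairs: Iterable[tuple[str, int]]) -> dict[str, int]:
    """Reduce step: sum the emitted values for each key."""
    totals: dict[str, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_ad_mentions(tweets: Iterable[str], ads: set[str]) -> dict[str, int]:
    """Run the map step over every tweet, then reduce the combined output."""
    pairs = (pair for tweet in tweets for pair in map_mentions(tweet, ads))
    return reduce_counts(pairs)
```

On a cluster the map step would run in parallel over partitions of the tweet store and the framework would group pairs by key before reducing; here both steps run in sequence.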

Anatomy of Sentiment Analysis of Tweets Using Machine Learning Approach

  • Misbah Iram;Saif Ur Rehman;Shafaq Shahid;Sayeda Ambreen Mehmood
    • International Journal of Computer Science & Network Security / Vol. 23, No. 10 / pp.97-106 / 2023
  • Sentiment analysis using social network platforms such as Twitter has achieved tremendous results. Twitter is an online social networking site that contains a rich amount of data. The platform is known as an information channel corresponding to different sites and categories. Tweets are most often publicly accessible, with very few limitations and security options available. Twitter also has powerful tools that enhance its utility and a powerful search system that makes recently posted tweets publicly accessible by keyword. As a popular social medium, Twitter has the potential for interconnectivity of information, reviews, and updates, all of which are important for engaging the target population. In this work, numerous methods for classifying the sentiment of tweets are discussed. There has been a lot of work in the field of sentiment analysis of Twitter data. This study provides a comprehensive analysis of the most standard and widely applicable opinion mining techniques, both machine learning-based and lexicon-based, along with their metrics. The proposed work helps analyze the information in tweets, where opinions are highly unstructured, heterogeneous, and polarized as positive, negative, or neutral. To validate the performance of the proposed framework, an extensive series of experiments was performed on a real-world Twitter dataset to demonstrate its effectiveness. This research effort also highlights recent challenges in the field of sentiment analysis along with the future scope of the proposed work.