• Title/Summary/Keyword: tweet analysis

Relationship Between Tweet Frequency and User Velocity on Twitter (트위터에서 트윗 주기와 사용자 속도 사이 관계)

  • Jeon, So-Young;Lee, Al-Chan;Seo, Go-Eun;Shin, Won-Yong
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.6 / pp.1380-1386 / 2015
  • Recently, the importance of users' geographic location information has been highlighted by the rapid growth of online social network services. In this paper, by utilizing geo-tagged tweets that provide high-precision location information of users, we first identify both Twitter users' exact location and the corresponding timestamp when each tweet was sent. Then, we analyze the relationship between the tweet frequency and the average user velocity. Specifically, we introduce a tweet-frequency computing algorithm and show analysis results by country and by city. As a main result, it is shown that the tweet frequency according to user velocity follows a power-law distribution (i.e., a Zipf distribution or a Pareto distribution). In addition, a comparison between the United States and Japan shows that the exponent of the distribution in Japan is smaller than that in the United States.
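
The abstract above does not spell out its tweet-frequency algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: a user's average velocity is estimated from consecutive geo-tagged tweets using great-circle (haversine) distance, and a power-law exponent is estimated by a least-squares line fit in log-log space. The function names, the `(timestamp, lat, lon)` tuple layout, and the choice of a log-log fit are all assumptions made for illustration, not the authors' method.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def average_velocity_kmh(tweets):
    """Average speed over a user's geo-tagged tweets.

    tweets: iterable of (timestamp_seconds, lat, lon) tuples (hypothetical layout).
    Returns total path length divided by elapsed time, or 0.0 if undefined.
    """
    ordered = sorted(tweets)
    if len(ordered) < 2:
        return 0.0
    distance = sum(
        haversine_km(a[1], a[2], b[1], b[2]) for a, b in zip(ordered, ordered[1:])
    )
    elapsed_h = (ordered[-1][0] - ordered[0][0]) / 3600.0
    return distance / elapsed_h if elapsed_h > 0 else 0.0


def fit_power_law_exponent(velocities, frequencies):
    """Estimate alpha in f ~ c * v**(-alpha) by least squares on (log v, log f).

    All inputs must be strictly positive. This is a simple estimator chosen for
    illustration; it is not claimed to be the one used in the paper.
    """
    xs = [math.log(v) for v in velocities]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope
```

Fitting the exponent separately on per-country data would then allow the kind of comparison the abstract reports, where a smaller exponent corresponds to a slower decay of tweet frequency with velocity.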

Issue summarization scheme based on real-time SNS trend analysis (실시간 SNS 트렌드 분석에 기반한 이슈 요약 기법)

  • Kim, Daeyong;Kim, Daehoon;Hwang, Eenjun
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1096-1097 / 2013
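  • With the rapid spread of social network services such as Twitter, a large number of SNS messages are now generated in real time. Reading every one of these posts is practically impossible, and the real-time search-term rankings offered by portal sites are not enough to grasp the details intuitively. If SNS posts can be analyzed in real time to find the latest trends and to classify and summarize the related content, useful up-to-date information can be generated and provided to users. In this paper, we propose a scheme that classifies related tweets by topic, based on trend keywords obtained by analyzing tweets, and then summarizes the details of each topic. The proposed scheme extracts recently popular trends and their associated keywords from tweets generated in real time. It then finds core keywords in the tweets that contain those keywords, classifies the tweets by topic based on the core keywords, and defines each topic as an 'issue'. Finally, it analyzes the tweets belonging to a given issue and generates, for each issue, a keyword list and a summary in short-sentence form. We implement a prototype system based on the proposed scheme and evaluate, through various experiments, its performance in terms of the usefulness of the issue detection scheme.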

An Efficient Method for Design and Implementation of Tweet Analysis System (효율적인 트윗 분석 시스템 설계 및 구현 방법)

  • Choi, Minseok
    • Journal of Digital Convergence / v.13 no.2 / pp.43-50 / 2015
  • As the popularity of social network services (SNS) rises, the data they produce is rapidly increasing. SNS data reflects personal propensities and interests and propagates rapidly, so there are many requests to analyze the data and apply the analytic results to various fields. New technologies and services for processing and analyzing big data in real time have been introduced, but it is hard to apply them in a short time and at low cost. In this paper, an efficient method to build a tweet analysis system without introducing new technologies or service platforms for handling big data is proposed. The proposed method was verified by building a prototype monitoring system that collects and analyzes tweets using a MySQL database and PHP scripts.

A Content Analysis on the Domestic Public Libraries' Use of Twitter (국내 공공도서관의 트위터 이용에 관한 내용분석)

  • Shim, Jiyoung
    • Journal of the Korean Society for Information Management / v.34 no.1 / pp.241-262 / 2017
  • This study aims to identify and analyze the Twitter use of domestic public libraries. In order to identify the detailed patterns of Twitter use in library and information services, a content analysis was conducted on 3,038 tweets from the accounts of the top 14 public libraries in terms of Twitter use. An inductive approach was adopted to develop a coding scheme, and open coding was conducted on all of the tweets. Additionally, correspondence analysis was conducted on the results of the content analysis to identify how library accounts correspond to specific types. As a result, 3 main categories and 9 sub-categories of public libraries' Twitter use were developed, and 37 detailed patterns of public libraries' use of Twitter were identified. The identified patterns can provide guidelines for libraries interested in using Twitter.

Predicting the Lifespan and Retweet Times of Tweets Based on Multiple Feature Analysis

  • Bae, Yongjin;Ryu, Pum-Mo;Kim, Hyunki
    • ETRI Journal / v.36 no.3 / pp.418-428 / 2014
  • In social network services, such as Facebook, Google+, and Twitter, certain postings attract more people than others. In this paper, we propose a novel method for predicting the lifespan and retweet times of tweets, the latter being a proxy for measuring the popularity of a tweet. We extract information from retweet graphs, such as posting times and social, local, and content features, so as to construct prediction knowledge bases. Tweets with a similar topic, retweet pattern, and properties are sequentially extracted from the knowledge base and then used to make a prediction. To evaluate the performance of our model, we collected tweets on Twitter from June 2012 to October 2012. We compared our model with conventional models according to the prediction goal. For the lifespan prediction of a tweet, our model can reduce the time tolerance of a tweet lifespan by about four hours, compared with conventional models. For the prediction of retweet times, our model achieved a precision of about 50%, which is much higher than that of two conventional models, which showed precisions of around 30% and 20%, respectively.

A Study on the Improvement and Analysis of SNS Operation Status on Disaster Information in Domestic and Foreign Public Institution (국내·외 기관의 재난정보관련 SNS 운용현황 및 개선방안에 관한 연구)

  • Doo, Hyo-Chul;Park, Jun-Hyeong;Kim, Hye-Young;Oh, Hyo-Jung;Kim, Yong
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.28 no.2 / pp.57-78 / 2017
  • SNS is a useful tool for quickly delivering information in an emergency, given its speed and reach. In particular, during a disaster or an accident, SNS can offer on-site, accurate, and detailed updates on essential information such as the safety of victims and the development of the situation, serving as a valuable complement to conventional media. This study performs a comparative analysis of how social media are currently used by emergency management authorities in South Korea and other countries. Based on the results, it proposes more effective ways to exploit SNS and improve the efficiency of disaster management. To accomplish these goals, this study collected tweets from various sources, including FEMA of the U.S., the FDMA and the Central Disaster Council of Japan, and the MPSS of Korea. The collected tweets were analyzed by feedback, time series, and information type. The feedback analysis quantifies the amount of monthly user feedback in order to assess user satisfaction with the tweets. The time series analysis identifies the number of tweets, the feedback index, and keywords by country over a certain period, examining why certain messages showed high feedback indices and what kind of content the authorities should offer. Finally, the information type analysis reviews the types of information contained in the tweets that drew users' attention, to identify the types of information that the authorities should deliver to users. Based on these analyses, this study proposes ways to improve the use of Twitter by the MPSS.

A Study on the Spatial Patterns of Tweet Data for Urban Areas by Time - A Case of Busan City - (도시 지역 트윗 데이터의 시간대별 공간분포 특성 - 부산광역시를 사례로 -)

  • Ku, Cha Yong
    • Journal of Cadastre & Land InformatiX / v.46 no.2 / pp.269-281 / 2016
  • The processing of spatial big data, such as social media data, has received increasing attention in the field of spatial information in recent years. This study, as an example of spatial big data analysis, analyzed the spatial and temporal distribution of tweet data based on location and time information, and identified the characteristics of its spatial pattern by time period. Tweet data in Busan city were collected, processed, and analyzed to identify the characteristics of the temporal and spatial patterns. Then, the results of the tweet data analysis were compared with the characteristics of the land type. This study found that the spatial pattern of tweeting in the city was associated with given time periods, such as daytime and nighttime, on both weekdays and weekends. For spatially concentrated areas, the spatial distribution patterns of individual time periods were compared with the characteristics of the land. The results showed that tweet data have different spatial distributions depending on the time, which potentially reflects, to some extent, daily patterns and the land type characteristics of the urban area. This study demonstrated the possible incorporation of social media data, e.g., tweet data, into the field of spatial information. A variety of social media data is expected to prove increasingly useful in areas such as land planning and urban planning.

Location Inference of Twitter Users using Timeline Data (타임라인데이터를 이용한 트위터 사용자의 거주 지역 유추방법)

  • Kang, Ae Tti;Kang, Young Ok
    • Spatial Information Research / v.23 no.2 / pp.69-81 / 2015
  • If the residential area of SNS users can be inferred by analyzing SNS big data, this can offer an alternative that addresses the problems of location sparsity and ecological error in spatial big data research. In this study, we developed a way of using the daily life activity pattern, which can be found in the timeline data of Twitter users, to infer the users' residential areas. We recognized the daily life activity pattern of Twitter users from their movement patterns and from the regional cognition words that they write in tweets. The models based on users' movement and text are named the daily movement pattern model and the daily activity field model, respectively. We then selected the variables to be used in each model. We defined the dependent variable as 0 if the area from which a user mainly tweets is their home location (HL), and as 1 otherwise. According to our results, obtained by discriminant analysis, the hit ratios of the two models were 67.5% and 57.5%, respectively. We tested both models using the timeline data of stress-related tweets. As a result, we inferred the residential areas of 5,301 users out of 48,235 users and obtained 9,606 stress-related tweets with a residential area. This is about a 44-fold increase compared with the number of geo-tagged tweets. We think that the methodology used in this study can serve not only to secure more location data in the study of SNS big data, but also to link SNS big data with regional statistics in order to analyze regional phenomena.

Dynamic Seed Selection for Twitter Data Collection (트위터 데이터 수집을 위한 동적 시드 선택)

  • Lee, Hyoenchoel;Byun, Changhyun;Kim, Yanggon;Lee, Sang Ho
    • Journal of KIISE:Databases / v.41 no.4 / pp.217-225 / 2014
  • Analysis of social media such as Twitter can yield interesting perspectives for understanding human behavior, detecting hot issues, identifying influential people, or discovering groups and communities. However, it is difficult to gather the data relevant to specific topics because social media data is large, noisy, and dynamic. This paper proposes a new algorithm that dynamically selects seed nodes to efficiently collect tweets relevant to given topics. The algorithm uses user attributes to evaluate user influence, and dynamically selects the seed nodes during the collection process. We evaluate the proposed algorithm with real tweet data and obtain satisfactory performance results.

Design of Big Data Preference Analysis System (빅데이터 선호도 분석 시스템 설계)

  • Son, Sung Il;Park, Chan Khon
    • Journal of Korea Multimedia Society / v.17 no.11 / pp.1286-1295 / 2014
  • This paper suggests a way to improve the reliability of preference estimates derived from user feedback by adding weighting factors to sentiment analysis, and to efficiently analyze users' sentiment in the big data generated at massive scale on Twitter. To address errors in earlier studies, this paper improves the recall and precision of sentiment determination by using a sentiment dictionary in which sentiment polarity is subdivided by sentiment level, and by adding slang, new words, emoticons, and idiomatic expressions that are not in the system dictionary. It considers context through conjunctive adverbs, reflecting the characteristic of Korean that word order is relatively free. It also takes TF (Term Frequency), RT (Retweet), and Follower counts into account as weighting factors of preference, and increases the reliability of preference analysis by weighting 'a very emotional tweet', 'a tweet recognized by users', and 'an influential Twitter user'.
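
The abstract above names its weighting factors but not how they are combined, so the following is only a minimal sketch under stated assumptions. Each tweet's dictionary-based polarity is multiplied by a weight that grows with its retweet and follower counts, and the result is averaged. The tiny `SENTIMENT_DICT`, the logarithmic weight formula, the coefficients `w_rt` and `w_fl`, and the tweet field names are all hypothetical choices for illustration, not the paper's design. Term frequency enters implicitly, because every occurrence of a sentiment word adds to the polarity.

```python
import math

# Hypothetical graded polarity dictionary; a real system would use a much larger
# one that also covers slang, new words, emoticons, and idioms.
SENTIMENT_DICT = {"love": 2.0, "good": 1.0, "bad": -1.0, "hate": -2.0}


def polarity(text):
    """Sum the graded polarity of every dictionary word occurrence in the text."""
    return sum(SENTIMENT_DICT.get(word, 0.0) for word in text.lower().split())


def tweet_weight(retweets, followers, w_rt=0.5, w_fl=0.25):
    """Assumed weight: 1 plus log-damped retweet and follower contributions."""
    return 1.0 + w_rt * math.log1p(retweets) + w_fl * math.log1p(followers)


def preference_score(tweets):
    """Weighted mean polarity over tweets.

    tweets: list of dicts with keys "text", "retweets", "followers" (assumed layout).
    Returns 0.0 for an empty list.
    """
    weights = [tweet_weight(t["retweets"], t["followers"]) for t in tweets]
    total = sum(weights)
    if total == 0:
        return 0.0
    weighted = sum(polarity(t["text"]) * w for t, w in zip(tweets, weights))
    return weighted / total
```

With all retweet and follower counts at zero, every weight is 1 and the score reduces to the plain mean polarity; a widely retweeted tweet pulls the score toward its own polarity.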