Twitter Crawling System

Ganiev, Saydiolim;Nasridinov, Aziz;Byun, Jeong-Yong;

doi:10.9717/JMIS.2015.2.3.287

Journal of Multimedia Information System

Volume 2 Issue 3 / Pages 287-294 / 2015 / 2383-7632 (eISSN)

Korea Multimedia Society (한국멀티미디어학회)

Ganiev, Saydiolim (Computer Science and Multimedia Engineering, Dongguk University) ;
Nasridinov, Aziz (Computer Science and Multimedia Engineering, Dongguk University) ;
Byun, Jeong-Yong (Dongguk University Gyeongju Campus)

Received : 2015.09.20
Accepted : 2015.10.12
Published : 2015.09.30


Abstract

We live in an age of information in which the Internet touches every aspect of our lives. It provides a wide range of services, each of which benefits people in a different way. Electronic mail (e-mail), File Transfer Protocol (FTP), voice and video communication, and search engines are prominent examples of Internet services. Among them, Social Network Services (SNSs) have steadily gained popularity over the past years. The most popular SNSs, such as Facebook, Weibo, and Twitter, generate millions of data items every minute. Twitter is an SNS that allows its users to post short instant messages. In 2012, its 100 million users posted 340 million tweets per day [1]. Such large volumes of data often contain a great deal of noise, which can be defined as uninteresting and unclassifiable data. Nevertheless, researchers can take advantage of this volume of information to analyze it and extract meaningful and interesting features. SNS data, including tweets, is collected by crawlers. Twitter crawlers have recently emerged as effective tools for collecting Twitter data, including tweets. In this project, we develop a Twitter Crawler system that enables us to extract Twitter data. We implemented our system in Java with MySQL. We use Twitter4J, a Java library for communicating with the Twitter API. The application first connects to the Twitter API, then retrieves tweets, and finally stores them in a database. We also develop crawling strategies to extract tweets efficiently in terms of time and amount.
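The retrieve-then-store flow described in the abstract can be illustrated with a minimal Java sketch. It is not the authors' implementation: `CrawlSketch`, `Tweet`, `TweetSource`, and `TweetStore` are hypothetical names introduced here, the source interface stands in for a Twitter4J-backed client, and an in-memory map stands in for the MySQL table so the example runs without credentials or a database. Skipping tweets whose id is already stored is an assumption about how repeated crawls might avoid duplicate rows.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CrawlSketch {

    /** Minimal tweet record; the field names are illustrative assumptions. */
    public static final class Tweet {
        final long id;
        final String user;
        final String text;

        public Tweet(long id, String user, String text) {
            this.id = id;
            this.user = user;
            this.text = text;
        }
    }

    /** Hypothetical stand-in for the API client (Twitter4J in the paper). */
    public interface TweetSource {
        List<Tweet> fetch(String query);
    }

    /** Hypothetical stand-in for the MySQL table, keyed by tweet id. */
    public static final class TweetStore {
        private final Map<Long, Tweet> rows = new LinkedHashMap<>();

        /** Returns true only if no tweet with this id was stored before. */
        public boolean save(Tweet tweet) {
            return rows.putIfAbsent(tweet.id, tweet) == null;
        }

        public int size() {
            return rows.size();
        }
    }

    /** Retrieves tweets for a query and stores them; returns the number of new rows. */
    public static int crawl(TweetSource source, TweetStore store, String query) {
        int stored = 0;
        for (Tweet tweet : source.fetch(query)) {
            if (store.save(tweet)) {
                stored++;
            }
        }
        return stored;
    }

    public static void main(String[] args) {
        // Canned data instead of a live API call; the second item is a duplicate.
        TweetSource fake = query -> List.of(
                new Tweet(1L, "alice", "hello " + query),
                new Tweet(1L, "alice", "hello " + query),
                new Tweet(2L, "bob", "more about " + query));
        TweetStore store = new TweetStore();
        System.out.println("new rows: " + crawl(fake, store, "java"));
        System.out.println("total rows: " + store.size());
    }
}
```

In a real deployment the `TweetSource` would presumably wrap the Twitter4J search call and `TweetStore.save` would issue a JDBC insert; keeping both behind small interfaces lets the crawl loop be exercised without network access.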

References

[1] https://en.wikipedia.org/, 2015.
[2] Z. Xu, R. Lu, L. Xiang and Q. Yang, "Discovering User Interest on Twitter with a Modified Author-Topic Model," IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, pp. 422-429, 2011.
[3] X. Wang, L. Tokarchuk, F. Cuadrado and S. Poslad, "Exploiting Hashtags for Adaptive Microblog Crawling," IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 311-315, 2013.
[4] Y. Kim and K. Shim, "TWITOBI: A Recommendation System for Twitter Using Probabilistic Modeling," 11th IEEE International Conference on Data Mining, pp. 340-349, 2011.
[5] M. Yang and H. Rim, "Identifying interesting Twitter contents using topical analysis," Expert Systems with Applications, Vol. 41, pp. 4330-4336, 2014. https://doi.org/10.1016/j.eswa.2013.12.051
[6] M. Yigit, B. Bilgin and A. Karahoca, "Extended topology based recommendation system for unidirectional social networks," Expert Systems with Applications, Vol. 42, pp. 3653-3661, 2015. https://doi.org/10.1016/j.eswa.2014.12.043
[7] S. Saif, Y. He, Z. Fernandez and H. Alani, "Contextual semantics for sentiment analysis of Twitter," Information Processing and Management, 2015.
[8] L. Cagliero, T. Cerquitelli, P. Garza and Grimaudo, "Twitter data analysis by means of Strong Flipping Generalized Itemsets," Journal of Systems and Software, Vol. 94, pp. 16-29, 2014. https://doi.org/10.1016/j.jss.2014.03.060
[9] https://dev.twitter.com/rest/public/rate-limits, 2015.
