Tweet Acquisition System by Considering Location Information and Tendency of Twitter User

  • Choi, Woosung (Dept. of Computer Science and Engineering, The Catholic University of Korea)
  • Yim, Junyeob (Dept. of Computer Science and Engineering, The Catholic University of Korea)
  • Hwang, Byung-Yeon (Dept. of Computer Science and Engineering, The Catholic University of Korea)
  • Received : 2014.02.12
  • Accepted : 2014.06.09
  • Published : 2014.06.30

Abstract

As social networking services (SNS) such as Twitter and Facebook grow rapidly, research on SNS analysis is attracting increasing attention. Twitter in particular reacts to social issues and events in real time, so researchers in social science and information retrieval use it to gather experimental data. However, research on the methodology for collecting such data remains insufficient. This paper therefore proposes a tweet acquisition system that considers the tendency of Twitter users, focusing on location-based events and political and social events. The system first acquires tweets that contain location information and event-related keywords, together with the IDs needed to detect political and social events. We then design an ID analyzer that classifies the tendency of users. To measure the reliability of the ID analyzer, tweets are acquired and analyzed using the highly ranked IDs. In the analysis, IDs classified in the first rank showed 88.8% reliability and IDs classified in the second rank showed 76.05%. The ID analyzer itself showed 77.5% reliability, and using a small number of IDs shortened the data collection time.
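
The abstract does not state how the ID analyzer scores users, so the following is only a minimal sketch of the general idea: keep tweets that carry location information and an event keyword, then grade each user ID by the share of its tweets that pass that filter. The `Tweet` type, the function names, the grade thresholds (0.8 and 0.6), and the three-grade scheme are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Sequence, Tuple


@dataclass
class Tweet:
    """Hypothetical minimal tweet record (not the Twitter API schema)."""
    user_id: str
    text: str
    coordinates: Optional[Tuple[float, float]] = None  # (lat, lon) if geotagged


def is_event_tweet(tweet: Tweet, keywords: Sequence[str]) -> bool:
    """True if the tweet has location information and mentions any keyword."""
    if tweet.coordinates is None:
        return False
    text = tweet.text.lower()
    return any(keyword.lower() in text for keyword in keywords)


def rank_user_ids(
    tweets: Iterable[Tweet],
    keywords: Sequence[str],
    thresholds: Tuple[float, float] = (0.8, 0.6),  # assumed cut-offs
) -> Dict[str, int]:
    """Assign grade 1, 2, or 3 to each user ID by its share of event tweets."""
    total = defaultdict(int)
    hits = defaultdict(int)
    for tweet in tweets:
        total[tweet.user_id] += 1
        if is_event_tweet(tweet, keywords):
            hits[tweet.user_id] += 1

    grades = {}
    for user_id, count in total.items():
        share = hits[user_id] / count
        if share >= thresholds[0]:
            grades[user_id] = 1
        elif share >= thresholds[1]:
            grades[user_id] = 2
        else:
            grades[user_id] = 3
    return grades
```

Under these assumptions, a collector would poll only the grade 1 and grade 2 IDs, which is one way a small set of IDs could reduce collection time.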

Cited by

  1. Recent research trends for geospatial information explored by Twitter data vol.24, pp.2, 2014, https://doi.org/10.1007/s41324-016-0007-0
  2. Evaluating residential location inference of twitter users at district level: focused on Seoul city vol.24, pp.4, 2014, https://doi.org/10.1007/s41324-016-0039-5