A Method for Analyzing Web Log of the Hadoop System for Analyzing a Effective Pattern of Web Users

효과적인 웹 사용자의 패턴 분석을 위한 하둡 시스템의 웹 로그 분석 방안

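  • 이병주 (Graduate School of Information Science, Soongsil University) ;
  • 권정숙 (Graduate School of Software, Soongsil University) ;
  • 고기철 (Graduate School, Soongsil University) ;
  • 최용락 (Graduate School of Software, Soongsil University)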
  • Received : 2014.07.25
  • Accepted : 2014.12.10
  • Published : 2014.12.31

Abstract

Among the many kinds of data that corporations can access, web log data are an important input to the data analysis behind customer relationship management (CRM) strategies. As the volume of accessible data has grown exponentially with the Internet and the spread of smartphones, the volume of web log data has grown sharply as well. As a result, it has become difficult to expand storage flexibly enough to handle large amounts of web log data, and extremely hard to implement a system that can categorize, analyze, and process web logs accumulated over a long period. This study therefore applies Hadoop, a distributed processing system that has recently drawn attention for its capacity to process large volumes of data, and proposes an efficient analysis method for large amounts of web log data. The study examines the forms of web logs produced by effective collection methods and the levels of web logs when Hadoop is used, and proposes analysis techniques and Hadoop configuration designs accordingly. In doing so, it resolves the difficulty of processing large amounts of web log data and derives user activity patterns through web log analysis, demonstrating the approach's value as a new means of marketing.
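The kind of analysis the abstract describes can be illustrated with a minimal map/reduce-style sketch that counts successful page views per URL path. This is not the authors' implementation: it runs in a single process rather than on Hadoop, and it assumes the logs follow the Apache Common Log Format, which the abstract does not specify. The names `LOG_PATTERN`, `map_line`, `reduce_counts`, `count_page_views`, and `SAMPLE_LOG` are hypothetical, and the sample log lines are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: Apache Common Log Format. The abstract does not state which
# log format the study used, so this pattern is illustrative only.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def map_line(line):
    """Map step: emit (path, 1) for each parsed request with a 2xx status."""
    match = LOG_PATTERN.match(line)
    if match and match.group("status").startswith("2"):
        yield match.group("path"), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each path."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_page_views(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


# Hypothetical sample data, not taken from the study.
SAMPLE_LOG = [
    '10.0.0.1 - - [25/Jul/2014:10:00:00 +0900] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [25/Jul/2014:10:00:05 +0900] "GET /products HTTP/1.1" 200 2048',
    '10.0.0.1 - - [25/Jul/2014:10:00:09 +0900] "GET /products HTTP/1.1" 200 2048',
    '10.0.0.3 - - [25/Jul/2014:10:00:12 +0900] "GET /missing HTTP/1.1" 404 0',
]

if __name__ == "__main__":
    print(count_page_views(SAMPLE_LOG))
```

On a real cluster the map and reduce steps would run as separate distributed tasks over log files stored in HDFS; the single-process version here only shows the shape of the computation.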


Cited by

  1. An adaptable UI/UX considering user’s cognitive and behavior information in distributed environment pp.1573-7543, 2017, https://doi.org/10.1007/s10586-017-0999-9