Pre-Processing of Query Logs in Web Usage Mining

  • Received : 2011.11.25
  • Accepted : 2012.02.15
  • Published : 2012.03.01

Abstract

For the past few years, query log data has been collected to study how users behave when using a site. Many researchers have studied the use of query logs to extract user preferences, drive personalized recommendations, improve caching and pre-fetching of Web objects, build better adaptive user interfaces, and improve Web search in search engine applications. A query log contains data such as the client's IP address, the time and date of the request, the resource or page requested, the status of the request, the HTTP method used, and the type of browser and operating system. A query log can therefore offer valuable insight into web site usage. Properly compiled and interpreted, query logs provide a baseline of statistics that indicate the usage levels of a website and can serve as a tool to assist decision making in management activities. In this paper we discuss the tasks performed on query logs during the pre-processing stage of web usage mining. We use query logs from an online newspaper company. In the pre-processing stage, the clickstream data is cleaned and partitioned into a set of user interactions that represent the activities of each user during their visits to the site. The two essential pre-processing tasks applied to the query logs are data cleaning and user identification.
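The two pre-processing tasks named in the abstract, data cleaning and user identification, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the logs follow the Apache/NCSA Combined Log Format (the abstract lists the fields but does not name a format), that cleaning means dropping non-GET requests, non-2xx responses, and embedded resources such as images and stylesheets, and that a user is approximated by the pair of IP address and user-agent string. All function and constant names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: Apache/NCSA Combined Log Format; the paper does not name a format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: these suffixes mark embedded resources rather than page views.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it is malformed."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_relevant(entry):
    """Data cleaning: keep only successful GET requests for pages."""
    if entry["method"] != "GET":
        return False
    if not entry["status"].startswith("2"):
        return False
    path = entry["url"].split("?", 1)[0].lower()
    return not path.endswith(IGNORED_SUFFIXES)


def identify_users(entries):
    """User identification: group requested URLs by (IP address, user agent)."""
    users = defaultdict(list)
    for entry in entries:
        users[(entry["ip"], entry["agent"])].append(entry["url"])
    return dict(users)


def preprocess(lines):
    """Parse raw log lines, clean them, and partition them by user."""
    parsed = (parse_line(line) for line in lines)
    cleaned = [e for e in parsed if e is not None and is_relevant(e)]
    return identify_users(cleaned)
```

Calling `preprocess` on an iterable of raw log lines returns a dictionary that maps each (IP address, user agent) pair to the pages it requested, in log order. The IP-plus-agent heuristic is only an approximation: users behind a shared proxy may be merged, and one person using two browsers is counted twice.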
