PIRS : Personalized Information Retrieval System using Adaptive User Profiling and Real-time Filtering for Search Results

  • Jeon, Ho-Cheol (Department of Computer Science and Engineering, Hanyang University)
  • Choi, Joong-Min (Department of Computer Science and Engineering, Hanyang University)
  • Received : 2010.09.03
  • Accepted : 2010.11.21
  • Published : 2010.12.31

Abstract

This paper proposes a personalized information retrieval system (PIRS) that builds adaptive user profiles from users' implicit feedback and delivers appropriate search results through real-time filtering. It addresses a problem of existing search systems such as Google or MSN, which do not satisfy the diverse personal search needs of their users. One reason existing search systems struggle to satisfy these needs is that users' search intentions are uncertain and therefore hard to recognize. Uncertainty of search intention means that users may want different search results for the same query. For example, a user who enters the query "java" may want results about Java the programming language, Java coffee, or Java the Indonesian island. In other words, the uncertainty stems from the ambiguity of search queries, and it grows as the number of words in a query shrinks. Real-time filtering of search results returns only those results that belong to the user-selected domain for a given query. Although it resembles a general directory search, it differs in that the search covers all web documents rather than sites, and each document in the search results is classified into a domain in real time. Applying this filtering, which is based on real-time directory classification of search results, to personalization effectively reduces the number of results delivered to users and improves their satisfaction with the results. In this paper, a user preference profile has a hierarchical structure consisting of domains, the queries used, and the documents selected.
Because the hierarchical structure of the user preference profile can capture the context in which a search was performed, it can deal with the uncertainty of user intentions: for the same query, the intention may differ according to context such as time or place. The structure also tracks a user's web document search behavior for each domain more effectively and recognizes changes in user intentions in a timely manner. The IP address of each device is used to identify each user, and the user preference profile is continuously updated based on the user's observed behavior toward search results. User satisfaction with search results is likewise measured by observing the user's behavior on the selected result. The proposed system automatically recognizes user preferences from implicit feedback, such as the time spent on a selected search result and the condition under which the user exits the page, and dynamically updates the preferences. Whenever a user performs a search, the system looks up the user preference profile for the given IP address; if the profile file does not exist, a new one is created on the server, and otherwise the file is updated with the transmitted information. If the file does not exist on the server, the system provides Google's results to the user, and the reflection value is increased or decreased each time the user searches. We carried out experiments to evaluate the adaptive user preference profile technique and real-time filtering. According to the results, participants were satisfied with an average of 4.7 documents in the top 10 search results when the adaptive user preference profile technique was combined with real-time filtering, which outperforms Google's results by 23.2%.
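
As an illustration only, the hierarchical profile (domains, queries, selected documents) and its update from implicit feedback might be sketched as follows. This is not the authors' implementation: the class names, the `record_feedback` and `feedback_score` functions, the exit-condition labels, and the numeric thresholds are all hypothetical, and an in-memory dictionary stands in for the server-side profile files keyed by IP address.

```python
from dataclasses import dataclass, field


@dataclass
class QueryNode:
    # selected document URL -> accumulated preference score
    documents: dict[str, float] = field(default_factory=dict)


@dataclass
class DomainNode:
    # query string -> documents the user selected for that query
    queries: dict[str, QueryNode] = field(default_factory=dict)


@dataclass
class UserProfile:
    # domain name -> queries the user issued within that domain
    domains: dict[str, DomainNode] = field(default_factory=dict)


# Hypothetical in-memory stand-in for the server-side profile files,
# keyed by the IP address used to identify each user.
PROFILES: dict[str, UserProfile] = {}


def feedback_score(dwell_seconds: float, exit_condition: str) -> float:
    """Assumed scoring rule (not from the paper): a quick return to the
    result list is negative evidence; otherwise the score grows with
    dwell time and is capped at 1.0."""
    if exit_condition == "back_to_results" and dwell_seconds < 10:
        return -1.0
    return min(dwell_seconds / 60.0, 1.0)


def record_feedback(ip, domain, query, url, dwell_seconds, exit_condition):
    """Look up the profile for this IP address, creating it if absent,
    then update the score of the selected document."""
    profile = PROFILES.setdefault(ip, UserProfile())
    domain_node = profile.domains.setdefault(domain, DomainNode())
    query_node = domain_node.queries.setdefault(query, QueryNode())
    score = feedback_score(dwell_seconds, exit_condition)
    query_node.documents[url] = query_node.documents.get(url, 0.0) + score
    return profile
```

Keeping the query level beneath the domain level is what lets the same query string ("java") carry different document preferences in different domains.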

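This paper proposes a system that addresses the failure of existing search systems to meet the diverse personal search needs of users. It realizes personalized search based on adaptive user preference information derived from users' implicit feedback, and provides suitable search results through real-time filtering of those results. Existing search systems have a high search failure rate because of the uncertainty of search intention. Uncertainty of search intention means that the same user may want different search results even when using the same query for a polysemous word such as "java", and the uncertainty grows as the number of words decreases. Real-time filtering, depending on whether the user has specified a domain, either extracts only the web documents that belong to the given domain or infers an appropriate domain and shows only the matching web documents as search results. It resembles a general directory search but differs in that it covers all web documents and classifies them in real time. Using real-time filtering for personalization reduced the number of search results and improved search satisfaction. The preference profile file created in this paper has a hierarchical structure and can reflect contextual information, so it can resolve the uncertainty of intention. It can also effectively track a user's web document search behavior per domain and appropriately detect changes in the user's preferences. IP addresses were used to identify users, and the preference profile file is continuously updated based on observation of the user's search behavior. In addition, by observing users' behavior toward search results, the system recognized user preferences, dynamically reflected the preference information, and measured satisfaction with the search results. The preference profile file and the reflection ratio are created or updated by the system when a user performs a search. Experiments showed that using the adaptive user preference profile file together with real-time filtering led to satisfaction with an average of 4.7 of the top 10 search results, about 23.2% higher satisfaction than with Google's results.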
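
The real-time filtering step described in the abstract, in which each returned document is classified into a domain and only documents in the user-selected domain are kept, could look roughly like the sketch below. The keyword-overlap classifier, the `DOMAIN_KEYWORDS` table, and the function names are assumptions made purely for illustration; the abstract does not specify the classification technique the system actually uses.

```python
from typing import Optional

# Hypothetical keyword sets per domain; a real classifier would be far richer.
DOMAIN_KEYWORDS = {
    "programming": {"language", "compiler", "jvm", "class", "code"},
    "coffee": {"coffee", "bean", "roast", "brew"},
    "geography": {"island", "indonesia", "jakarta", "volcano"},
}


def classify(text: str) -> Optional[str]:
    """Assign the domain whose keywords overlap most with the text,
    or None when no keyword matches at all."""
    words = set(text.lower().split())
    best_domain, best_overlap = None, 0
    for domain, keywords in DOMAIN_KEYWORDS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_domain, best_overlap = domain, overlap
    return best_domain


def filter_results(results, domain):
    """Keep only the results classified into the user-selected domain."""
    return [r for r in results if classify(r["snippet"]) == domain]


# Illustrative results for the ambiguous query "java" (made-up URLs and snippets).
results = [
    {"url": "https://example.org/a",
     "snippet": "Java is a programming language that runs on the JVM"},
    {"url": "https://example.org/b",
     "snippet": "Java coffee is a dark roast made from a single bean variety"},
    {"url": "https://example.org/c",
     "snippet": "Java is an island of Indonesia with more than one active volcano"},
]

coffee_only = filter_results(results, "coffee")
```

Because classification happens per document at query time, the filter applies to arbitrary web documents rather than to a precompiled directory of sites.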
