Scalable Collaborative Filtering Technique based on Adaptive Clustering

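  • 이오준 (Department of Software, Dankook University) ;
  • 홍민성 (Department of Computer Science, Dankook University) ;
  • 이원진 (Media Content Research Institute, Dankook University) ;
  • 이재동 (Department of Software, Dankook University)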
  • Received : 2014.04.05
  • Accepted : 2014.05.16
  • Published : 2014.06.30

Abstract

This paper proposes an adaptive clustering-based collaborative filtering technique to address fundamental problems of collaborative filtering: the cold-start problem, scalability, and data sparsity. Previous collaborative filtering techniques build a similar-item subset and a similar-user subset from users' preferences for items, use those subsets to predict a user's preference for a particular item, and recommend on the basis of that prediction. Consequently, when the user preference matrix is sparse, reliable similar-item and similar-user subsets become difficult to build, and the reliability of the recommendation system drops rapidly. In addition, as the scale of the service grows, the time needed to build the subsets increases geometrically, and the response time of the recommendation system increases with it. To solve these problems, this paper suggests a collaborative filtering technique that actively adapts its model to current conditions and adopts concepts from context-based filtering. The technique consists of four major methods. First, items and users are each clustered according to their feature vectors, and the inter-cluster preference between each item cluster and each user cluster is estimated. This reduces the run time needed to build a similar-item or similar-user subset, makes the recommendation system more reliable than one that builds the subsets from user preference information alone, and partially solves the cold-start problem. Second, recommendations are made using the previously built item and user clusters and the inter-cluster preference between each item cluster and each user cluster.
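The first step (grouping users and items by feature vector, then estimating a preference for every pair of user cluster and item cluster) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the centroids are fixed by hand rather than learned, the inter-cluster preference is taken to be the mean of the observed ratings in each block, and the function names are hypothetical. The abstract does not specify the clustering algorithm or the estimator.

```python
import numpy as np

def assign_clusters(features, centroids):
    """Label each row of `features` with the index of its nearest centroid."""
    # Pairwise squared Euclidean distances, shape (n_rows, n_centroids).
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def inter_cluster_preference(ratings, user_labels, item_labels, n_uc, n_ic):
    """Mean observed rating for every (user cluster, item cluster) pair.

    `ratings` is a users x items array with np.nan marking missing entries.
    Pairs without any observed rating stay np.nan.
    """
    pref = np.full((n_uc, n_ic), np.nan)
    for uc in range(n_uc):
        for ic in range(n_ic):
            block = ratings[np.ix_(user_labels == uc, item_labels == ic)]
            observed = block[~np.isnan(block)]
            if observed.size:
                pref[uc, ic] = observed.mean()
    return pref

# Toy data (assumed): 4 users and 4 items with 2-dimensional feature vectors.
user_features = np.array([[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]])
item_features = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.8, 0.9]])
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])  # fixed by hand for the sketch

user_labels = assign_clusters(user_features, centroids)
item_labels = assign_clusters(item_features, centroids)

nan = np.nan
ratings = np.array([
    [5.0, nan, 1.0, nan],
    [4.0, 5.0, nan, 2.0],
    [nan, 1.0, 5.0, 4.0],
    [2.0, nan, nan, 5.0],
])
pref = inter_cluster_preference(ratings, user_labels, item_labels, 2, 2)
print(pref)
```

Because the clusters come from feature vectors rather than from ratings, a new user or item with no rating history can still be assigned to a cluster, which is how this step can partially address the cold-start problem.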
In the recommendation phase, a list of items is built for a user by visiting the item clusters in descending order of the inter-cluster preference of the user cluster to which the user belongs, then selecting and ranking items by predicted or recorded user preference. With this method, the model-creation phase bears most of the load of the recommendation system, and the run-time load is minimized. The scalability problem is thereby addressed, and collaborative filtering, which is highly reliable, becomes practical for large-scale recommendation systems. Third, missing user preference information is predicted using the item and user clusters, which mitigates the problems caused by the low density of the user preference matrix. Existing studies used either an item-based or a user-based prediction. This paper improves on Hao Ji's idea of using both: the reliability of the recommendation service is improved by combining the predicted values of the two techniques according to the condition of the recommendation model. Because preferences are predicted from item or user clusters, prediction takes less time, and missing preferences can be predicted at run time. Fourth, the item and user feature vectors are trained on incoming user feedback: normalized user feedback is applied to both vectors. This mitigates the problems that come with adopting concepts of context-based filtering, namely item and user feature vectors derived from user profiles and item properties. Those problems stem from the difficulty of quantifying the qualitative features of items and users.
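The second and third steps described above (ranking by cluster order and predicting missing preferences from the clusters) can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulas: the blend uses a fixed `weight` where the paper adapts the combination to the state of the model, only unrated items are recommended, and all names and toy values are hypothetical.

```python
import numpy as np

def predict_preference(ratings, user_labels, item_labels, u, i, weight=0.5):
    """Blend a user-cluster estimate and an item-cluster estimate for (u, i).

    user-based: mean rating that users in u's cluster gave to item i.
    item-based: mean rating that user u gave to items in i's cluster.
    `weight` is the share of the user-based estimate (assumed constant).
    Returns np.nan when neither estimate is available.
    """
    peers = ratings[user_labels == user_labels[u], i]
    similar = ratings[u, item_labels == item_labels[i]]
    user_based = np.nan if np.isnan(peers).all() else np.nanmean(peers)
    item_based = np.nan if np.isnan(similar).all() else np.nanmean(similar)
    if np.isnan(user_based):
        return item_based
    if np.isnan(item_based):
        return user_based
    return weight * user_based + (1.0 - weight) * item_based

def recommend(ratings, pref, user_labels, item_labels, u, top_n=2):
    """List unrated items for user u, visiting the best item clusters first."""
    # Item clusters in descending inter-cluster preference of u's cluster.
    order = np.argsort(-pref[user_labels[u]])
    picks = []
    for ic in order:
        scored = []
        for i in np.flatnonzero(item_labels == ic):
            if np.isnan(ratings[u, i]):  # only items the user has not rated
                score = predict_preference(ratings, user_labels, item_labels, u, i)
                if not np.isnan(score):
                    scored.append((score, int(i)))
        scored.sort(key=lambda pair: -pair[0])  # best predicted score first
        picks.extend(i for _, i in scored)
        if len(picks) >= top_n:
            break  # lower-ranked clusters are never touched
    return picks[:top_n]

# Toy model (assumed): labels and inter-cluster preferences built offline.
nan = np.nan
ratings = np.array([
    [5.0, nan, 1.0, nan],
    [4.0, 5.0, nan, 2.0],
    [nan, 1.0, 5.0, 4.0],
    [2.0, nan, nan, 5.0],
])
user_labels = np.array([0, 0, 1, 1])
item_labels = np.array([0, 0, 1, 1])
pref = np.array([[4.67, 1.5], [1.5, 4.67]])

print(recommend(ratings, pref, user_labels, item_labels, u=0))
print(recommend(ratings, pref, user_labels, item_labels, u=3))
```

The early exit in `recommend` reflects the design point made in the text: the expensive work (clustering and the preference table) happens when the model is built, so a request only scans as many clusters as it needs.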
Therefore, the elements of the user and item feature vectors are matched one to one, and when user feedback on a particular item is obtained, it is applied to each of the two feature vectors using the other. The method was verified by comparing its performance with that of existing hybrid filtering techniques on two measures: mean absolute error (MAE) and response time. The MAE results confirmed that the technique improves the reliability of the recommendation system, and the response-time results showed that it is suitable for large-scale recommendation systems. The proposed adaptive clustering-based collaborative filtering technique thus offers high reliability and low time complexity, but it has limitations. Because the technique focused on reducing time complexity, an improvement in reliability was not expected. Future work will improve the technique with rule-based filtering.
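The fourth step, in which feedback on an item updates the user's and the item's feature vectors through each other, might look like the sketch below. The update rule is an assumption chosen for illustration (the abstract does not give one): feedback is normalized to [-1, 1], and each vector moves toward its counterpart on positive feedback and away from it on negative feedback. The names `normalize_feedback`, `apply_feedback`, and `rate` are hypothetical.

```python
import numpy as np

def normalize_feedback(rating, low=1.0, high=5.0):
    """Map a rating on [low, high] to [-1, 1]; the midpoint maps to 0."""
    return 2.0 * (rating - low) / (high - low) - 1.0

def apply_feedback(user_vec, item_vec, feedback, rate=0.1):
    """Return updated copies of both vectors after one feedback event.

    The vectors must have the same length, since their elements are
    matched one to one. Both updates use the vectors as they were
    before the event, so the result does not depend on update order.
    """
    step = rate * feedback
    new_user = user_vec + step * (item_vec - user_vec)
    new_item = item_vec + step * (user_vec - item_vec)
    return new_user, new_item

user_vec = np.array([0.0, 1.0])
item_vec = np.array([1.0, 0.0])

fb = normalize_feedback(5.0)  # top rating on an assumed 1-to-5 scale
new_user, new_item = apply_feedback(user_vec, item_vec, fb)
print(new_user, new_item)
```

With neutral feedback (the midpoint of the scale) the step is zero and neither vector changes, so only informative feedback reshapes the clusters.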

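Existing collaborative filtering techniques build a similar-item set or a similar-user set from users' preferences for items and recommend on the basis of the preference predicted from those sets for a particular item. As a result, when user preference information is scarce, the reliability of the similar-item and similar-user sets falls, and the reliability of the recommendation service falls with it. Moreover, as the service grows, the time needed to build the similar-item and similar-user sets increases exponentially, and the response time of the recommendation service increases accordingly. To solve these problems, this paper proposes an adaptive clustering technique and a collaborative filtering technique that applies it. The technique consists of four main methods. First, users and items are each clustered on the basis of their feature vectors, which saves the time that existing collaborative filtering spends building similar-item and similar-user sets, makes recommendations more reliable than building subsets from user preference information alone, and partly resolves the first-rater and new-user problems. Second, recommendations are made using the preferences between the pre-built user and item clusters. Item clusters are visited in descending order of the preference of the cluster to which the user belongs to build the list of items offered to the user, so that most of the load of the recommendation system is borne in the model-creation phase and the load at run time is minimized. Third, missing user preference information is predicted using the user and item clusters, which alleviates the problems caused by the sparsity of user preference information in collaborative filtering. Fourth, the feature vectors of users and items are trained on user feedback, which addresses the difficulty of quantifying the qualitative characteristics of items and users. The work was verified by comparing its performance with previously proposed hybrid filtering techniques, using mean absolute error and response time as evaluation measures.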
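The two evaluation measures can be computed as in the short sketch below. MAE is the average of |predicted - actual| over the test ratings; response time is measured here with a wall-clock timer around a call. The sample values and the helper name `timed` are illustrative assumptions, not data from the paper.

```python
import time

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Illustrative values only.
predicted = [4.5, 3.0, 2.0, 5.0]
actual = [5.0, 3.0, 1.0, 4.0]

mae, elapsed = timed(mean_absolute_error, predicted, actual)
print(f"MAE = {mae:.3f}, elapsed = {elapsed:.6f} s")
```

A lower MAE indicates more reliable predictions; in a real comparison the timer would wrap the recommendation call of each technique on the same requests.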

References

  1. Adomavicius, G., and A. Tuzhilin, "Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, Vol.17, No.6(2005), 734-749.
  2. Ali, K., and W. van Stam, "TiVo: Making show recommendations using a distributed collaborative filtering architecture," Conference on Knowledge Discovery and Data Mining, (2004), 394-401.
  3. Baeza-Yates, R., and B. Ribeiro-Neto, Modern information retrieval, ACM press, New York, 1999.
  4. Bennett, J., and S. Lanning, "The Netflix prize," KDD Cup and Workshop, www.netflixprize.com, (2007).
  5. Braak, P. T., N. Abdullah, and Y. Xu, "Improving the Performance of Collaborative Filtering Recommender Systems through User Profile Clustering," IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies (WI-IAT '09), Vol.3, (2009), 147-150.
  6. O'Connor, M., and J. Herlocker, "Clustering Items for Collaborative Filtering," Proceedings of the ACM SIGIR Workshop on Recommender Systems, Berkeley, CA, (1999).
  7. Das, A., M. Datar, A. Garg, and S. Rajaram, "Google news personalization: Scalable online collaborative filtering," World Wide Web Conference, (2007), 271-280.
  8. George, T., and S. Merugu, "A scalable collaborative filtering framework based on co-clustering," Fifth IEEE International Conference on Data Mining, (2005).
  9. Goldberg, K., T. Roeder, D. Gupta, and C. Perkins, "Eigentaste: a constant time collaborative filtering algorithm," Information Retrieval, Vol.4, No.2(2001), 133-151. https://doi.org/10.1023/A:1011419012209
  10. Groh, G., and C. Ehmig, "Recommendations in Taste Related Domains: Collaborative filtering vs. Social filtering," Proceedings of GROUP'07, (2007), 127-136.
  11. Heo, G. Y., and W. W. Woo, "Extensions of X-means with Efficient Learning the Number of Clusters," The journal of the Korea Institute of Maritime Information & Communication Sciences, Vol.12, No.4(2008), 772-780.
  12. Herlocker, J. L., J. A. Konstan, L. G. Terveen, and J. T. Riedl, "Evaluating collaborative filtering recommender systems," ACM Transactions on Information Systems, Vol.22, No.1(2004), 5-53. https://doi.org/10.1145/963770.963772
  13. Herlocker, J. L., J. A. Konstan, and J. T. Riedl, "An Empirical Analysis of Design Choices in Neighborhood-based Collaborative Filtering Systems," Information Retrieval, Vol.5, (2002), 287-310. https://doi.org/10.1023/A:1020443909834
  14. Huang, Z., H. Chen, and D. Zeng, "Applying Associative Retrieval Techniques to Alleviate the Sparsity Problem in Collaborative Filtering," ACM Trans. Information Systems, Vol.22, No.1(2004), 116-142. https://doi.org/10.1145/963770.963775
  15. Ji, H., J. Li, C. Ren, and M. He, "Hybrid collaborative filtering model for improved recommendation," 2013 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI), (2013), 142-145.
  16. Kass, R. E., and L. Wasserman, "A Reference Bayesian Test for Nested Hypotheses and Its Relationship to the Schwarz Criterion," Journal of the American Statistical Association, Vol.90, No.431(1995), 928-934. https://doi.org/10.1080/01621459.1995.10476592
  17. Kim, S. H., B. H. Oh, M. J. Kim, and J. H. Yang, "A Movie Recommendation Algorithm Combining Collaborative Filtering and Content Information," Journal of KIISE, Vol.39, No.4(2012), 261-268.
  18. Kim, S. Y., S. H. Lee, and H. S. Hwang, "Design and Implementation of the recommendation system that is personalized using the hybrid filter and AHP," Journal of Korean Industrial Information Systems Society, Vol.17, No.7(2012), 111-118. https://doi.org/10.9723/jksiis.2012.17.7.111
  19. Kim, Y., and S. B. Moon, "A Study on Hybrid Recommendation System Based on Usage frequency for Multimedia Contents," Journal of the Korean society for information management, Vol.23, No.3(2006), 91-125. https://doi.org/10.3743/KOSIM.2006.23.3.091
  20. Konstan, J., B. N. Miller, D. Maltz, J. Herlocker, L. Gordon, and J. Riedl, "GroupLens: Applying collaborative filtering to Usenet news," Communications of the ACM, Vol.40, No.3(1997), 77-87. https://doi.org/10.1145/245108.245126
  21. Linden, G., B. Smith, and J. York, "Amazon.com recommendations: Item-to-item collaborative filtering," IEEE Internet Computing, (2003), 76-80.
  22. Melville, P., R. J. Mooney, and R. Nagarajan, "Content-Boosted Collaborative Filtering for Improved Recommendations," Proceedings of the Eighteenth National Conference on Artificial Intelligence, Edmonton, Canada, (2002), 187-192.
  23. Park, K. S., J. M. Choi, and D. H. Lee, "A Single-Scaled Hybrid Filtering Method for IPTV Program Recommendation," International Journal of Circuits, Systems and Signal Processing, Vol.4, No.4(2010), 161-168.
  24. Park, S., and D. Pennock, "Applying collaborative filtering techniques to movie search for better ranking and browsing," Proceedings of 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (2007), 550-559.
  25. Pazzani, M., "A Framework for Collaborative, Content-Based and Demographic Filtering," Artificial Intelligence Review, Vol.13(1999), 393-408.
  26. Resnick, P., and H. R. Varian, "Recommender systems," Communications of the ACM, Vol.40, No.3(1997), 56-58.
  27. Sarwar, B., G. Karypis, J. Konstan, and J. Riedl, "Item-based collaborative filtering recommendation algorithms," Proceedings of the 10th international conference on World Wide Web, (2001), 285-295.
  28. Schwarz, G., "Estimating the Dimension of a Model," The Annals of Statistics, Vol.6, No.2(1978), 461-464.
  29. Shen, Y., H. C. Shin, D. G. Kim, Y. H. Hong, and P. K. Rhee, "Reinforcement Learning Algorithm Based Hybrid Filtering Image Recommender System," The journal of the Institute of Internet Broadcasting and Communication, Vol.12, No.3(2012), 75-81. https://doi.org/10.7236/JIWIT.2012.12.3.75
  30. Shepitsen, A., J. Gemmell, B. Mobasher, and R. Burke, "Personalized Recommendation in Social Tagging Systems Using Hierarchical Clustering," Proceedings of the 2008 ACM conference on Recommender systems, Lausanne, Switzerland, (2008), 259-266.
  31. Xue, G. R., C. Lin, Q. Yang, W. Xi, H. J. Zeng, Y. Yu, and Z. Chen, "Scalable collaborative filtering using cluster-based smoothing," Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, (2005), 114-121.

Cited by

  1. Predictive Clustering-based Collaborative Filtering Technique for Performance-Stability of Recommendation System vol.21, pp.1, 2015, https://doi.org/10.13088/jiis.2015.21.1.119
  2. Meta-data Configuration and Wellness Feature Analysis Technique for Wellness Content Recommendation vol.19, pp.8, 2014, https://doi.org/10.9708/jksci.2014.19.8.083
  3. A Study on Improving Collaborative Filtering Based on Implicit User Feedback Using RFM Multidimensional Analysis vol.25, pp.1, 2019, https://doi.org/10.13088/jiis.2019.25.1.139