Collaborative Filtering using Co-Occurrence and Similarity information

Na, Kwang Tek; Lee, Ju Hong

doi:10.7472/jksii.2017.18.3.19

Journal of Internet Computing and Services (인터넷정보학회논문지)

Volume 18 Issue 3 / Pages 19-28 / 2017 / 1598-0170 (pISSN) / 2287-1136 (eISSN)

Korean Society for Internet Information (한국인터넷정보학회)

상품 동시 발생 정보와 유사도 정보를 이용한 협업적 필터링

Na, Kwang Tek (Dept of Computer Science and Information Engineering, Inha University) ;
Lee, Ju Hong (Dept of Computer Science and Information Engineering, Inha University)

나광택 ;
이주홍

Received : 2016.11.24
Accepted : 2017.04.04
Published : 2017.06.30

https://doi.org/10.7472/jksii.2017.18.3.19

Abstract

Collaborative filtering (CF) is a recommendation approach that models the relationship between users and products in order to recommend products to a specific user. A CF model has the advantage that it can make recommendations from rating data alone, without additional information such as content. However, users consume only a very small portion of all products, and they often give no rating even after consuming a product. As a result, the number of observed ratings is very small and the user rating matrix is very sparse. This sparsity of the rating data makes it difficult to improve CF performance. In this paper, we concentrate on improving the performance of the latent factor model, one class of CF model (especially SVD). We propose a new model that incorporates product similarity information and product co-occurrence information into SVD. The similarity and co-occurrence information obtained from the rating data increases the expressiveness of the product latent factors in the latent space. Compared with the existing method, Recall increased by about 16%, and Precision and NDCG increased by 8% and 7%, respectively. The proposed method is expected to outperform the existing method when combined with other recommender systems in the future.
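As a rough illustration of the quantities the abstract refers to, the following minimal sketch derives sparsity, item co-occurrence counts, and item-item cosine similarity from rating data alone. It is not the authors' implementation: the toy matrix `R`, the function names, the choice of cosine similarity, and the convention that 0 marks an unrated cell are all assumptions made here for illustration. The abstract does not say how these signals enter the SVD model, so that step is left out.

```python
import numpy as np

# Hypothetical toy rating matrix: rows are users, columns are items.
# Assumption: a value of 0 means "no rating observed".
R = np.array([
    [5, 3, 0, 0],
    [4, 0, 0, 1],
    [0, 2, 4, 0],
    [0, 0, 5, 4],
], dtype=float)


def sparsity(ratings):
    """Fraction of user-item cells with no observed rating."""
    return 1.0 - np.count_nonzero(ratings) / ratings.size


def item_cooccurrence(ratings):
    """C[i, j] = number of users who rated both item i and item j."""
    rated = (ratings > 0).astype(float)   # binary "was rated" indicator
    counts = rated.T @ rated              # item-by-item co-rating counts
    np.fill_diagonal(counts, 0.0)         # ignore an item paired with itself
    return counts


def item_cosine_similarity(ratings):
    """Cosine similarity between item columns, treating unrated cells as 0."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0               # guard against items with no ratings
    unit = ratings / norms                # scale each item column to unit length
    return unit.T @ unit


C = item_cooccurrence(R)
S = item_cosine_similarity(R)

print("sparsity:", sparsity(R))
print("co-occurrence:\n", C)
print("cosine similarity:\n", np.round(S, 3))
```

Both matrices are symmetric and item-by-item, so under these assumptions they could serve as side information about items without requiring anything beyond the ratings themselves.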

Cited by

Consumers' usage intention toward online product recommendation services: focusing on the mediating roles of trust and commitment, vol.42, pp.5, 2018, https://doi.org/10.5850/jksct.2018.42.5.871