A Vision Transformer Based Recommender System Using Side Information

부가 정보를 활용한 비전 트랜스포머 기반의 추천시스템

  • Kwon, Yujin (Department of AI, Big Data & Management, Kookmin University) ;
  • Choi, Minseok (Department of AI, Big Data & Management, Kookmin University) ;
  • Cho, Yoonho (Department of AI, Big Data & Management, Kookmin University)
  • 권유진 (국민대학교 AI빅데이터융합경영학과) ;
  • 최민석 (국민대학교 AI빅데이터융합경영학과) ;
  • 조윤호 (국민대학교 AI빅데이터융합경영학과)
  • Received : 2022.09.04
  • Accepted : 2022.09.13
  • Published : 2022.09.30

Abstract

Recent studies on recommender systems apply various deep learning models to better represent user-item interactions. One noteworthy study is ONCF (Outer product-based Neural Collaborative Filtering), which builds a two-dimensional interaction map via the outer product and employs a CNN (Convolutional Neural Network) to learn high-order correlations from the map. However, ONCF is limited in recommendation performance by the problems of CNNs and by the absence of side information. Because it relies on a CNN, ONCF has an inductive bias problem that causes poor performance on data whose distribution does not appear in the training data. This paper proposes employing a Vision Transformer (ViT) instead of the vanilla CNN used in ONCF, because ViT has shown better results than state-of-the-art CNNs in many image classification tasks. In addition, we propose a new architecture that reflects side information, which ONCF did not consider. Unlike previous studies that reflect side information in a neural network through simple input combination methods, this study uses an independent auxiliary classifier to reflect side information more effectively in the recommender system. Finally, whereas ONCF used a single latent vector each for the user and the item, this study constructs channels from multiple vectors so that the model can learn more diverse representations and obtain an ensemble effect. Experiments showed that the proposed deep learning model improved recommendation performance compared with ONCF.
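As a rough illustration of the interaction map and the multi-vector channels described in the abstract, the sketch below builds one outer-product map per pair of latent vectors and stacks them as channels. The pairing of the c-th user vector with the c-th item vector, the function names, and the toy values are assumptions made for illustration; the abstract does not specify how the paper forms its channels.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def outer_product(u: Vector, v: Vector) -> Matrix:
    """Return the 2-D interaction map E with E[i][j] = u[i] * v[j]."""
    return [[ui * vj for vj in v] for ui in u]


def interaction_channels(user_vecs: List[Vector],
                         item_vecs: List[Vector]) -> List[Matrix]:
    """Build one interaction map per (user vector, item vector) pair.

    Assumption: the c-th user vector is paired with the c-th item vector,
    so the result has shape (channels, k, k) for embedding size k.
    """
    if len(user_vecs) != len(item_vecs):
        raise ValueError("need the same number of user and item vectors")
    return [outer_product(u, v) for u, v in zip(user_vecs, item_vecs)]


# Toy example: 2 channels, embedding size k = 3 (values are made up).
user_vecs = [[1.0, 2.0, 3.0], [0.5, 0.0, -1.0]]
item_vecs = [[1.0, 0.0, 2.0], [2.0, 4.0, 6.0]]
maps = interaction_channels(user_vecs, item_vecs)
```

With a single latent vector per user and item, `maps` would hold one channel, which corresponds to the single interaction map that ONCF feeds to its CNN.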

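Recent recommender system research applies various deep learning models to better represent the interactions between users and items. ONCF (Outer product-based Neural Collaborative Filtering) is a representative deep learning-based recommender system that takes the outer product of the user and item matrices to build a two-dimensional interaction map and passes it through a convolutional neural network, aiming to better capture user-item interactions. However, because it uses a convolutional neural network, ONCF has the limitation of an inductive bias that degrades prediction performance on data whose distribution does not appear in the training data. This study first proposes a method that introduces ViT (Vision Transformer), which is based on the Transformer, into the NCF architecture. ViT applies the Transformer, which was mainly used in NLP, to image classification with good results; its inductive bias is weaker than that of a convolutional neural network, which makes it robust even to previously unseen distributions. Next, whereas ONCF used a single latent vector each for the user and the item, this study uses multiple latent vectors to form channels so that the model learns richer representations and also gains an ensemble effect. Finally, unlike ONCF, we present an architecture that can reflect side information in recommendations. Unlike previous studies that reflect side information in a neural network through simple input combination, this study introduces an independent auxiliary classifier so that side information can be reflected in the recommender system more efficiently. In conclusion, this paper proposes a new deep learning model that applies ViT, channelizes the embedding vectors, and introduces a side-information classifier; in experiments it achieved higher performance than ONCF.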
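To make the ViT step more concrete, the following sketch shows the patch-splitting stage that a Vision Transformer applies to its input: the 2-D map is cut into non-overlapping square patches and each patch is flattened into one token. The grid size, the patch size, and the function name `split_into_patches` are hypothetical; the abstract does not state the configuration used in the paper.

```python
from typing import List

Matrix = List[List[float]]


def split_into_patches(grid: Matrix, patch: int) -> List[List[float]]:
    """Split a square grid into non-overlapping patch x patch tiles.

    Each tile is flattened row by row into one token vector, and tiles are
    emitted left to right, top to bottom.
    """
    size = len(grid)
    if any(len(row) != size for row in grid):
        raise ValueError("grid must be square")
    if patch <= 0 or size % patch != 0:
        raise ValueError("patch size must evenly divide the grid size")
    tokens = []
    for top in range(0, size, patch):
        for left in range(0, size, patch):
            tokens.append([grid[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens


# Toy 4x4 interaction map (made-up values) cut into 2x2 patches.
grid = [[float(4 * r + c) for c in range(4)] for r in range(4)]
tokens = split_into_patches(grid, 2)
```

In a full ViT, each token would then be linearly projected, combined with a position embedding, and passed through Transformer encoder layers; those stages are omitted here.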

Acknowledgement

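This paper is the result of research supported by the Knowledge Service Industry Core Technology Development Program of the Ministry of Trade, Industry and Energy. (20015152, Development of technology converging integrated small-data analysis, based on automated big-data processing and supply, with a non-face-to-face market research system)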

References

  1. 김민정, 조윤호. (2015). 빅데이터 기반 추천시스템 구현을 위한 다중 프로파일 앙상블 기법, 지능정보연구, 21(4), 93-110. https://doi.org/10.13088/JIIS.2015.21.4.093
  2. 박종진. (2022). 넷플릭스 "韓 콘텐츠 이미 세계적" 올해 투자액 8000 억원 전망. 전자신문. https://www.etnews.com/20220119000185
  3. 박호연, & 김경재. (2021). BERT 기반 감성분석을 이용한 추천시스템. 지능정보연구, 27(2), 1-15. https://doi.org/10.13088/JIIS.2021.27.2.001
  4. 최성이, 현윤진, & 김남규. (2015). 사용자관심 이슈 분석을 통한 추천시스템 성능 향상 방안. 지능정보연구, 21(3), 101-116. https://doi.org/10.13088/JIIS.2015.21.3.101
  5. Burke, R. (2002), Hybrid recommender systems: Survey and experiments, User modeling and user-adapted interaction, 12(4), 331-370. https://doi.org/10.1023/A:1021240730564
  6. Burke, R. (2007), Hybrid web recommender systems, In The adaptive web, 377-408.
  7. Cheng, H. T., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., ... & Shah, H. (2016, September). Wide & deep learning for recommender systems. In Proceedings of the 1st workshop on deep learning for recommender systems (pp. 7-10).
  8. Das, M., De Francisci Morales, G., Gionis, A., & Weber, I. (2013), Learning to question: Leveraging user preferences for shopping advice, In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, 203-211.
  9. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  10. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
  11. Goldberg, D., Nichols, D., Oki, B. M., & Terry, D. (1992), Using collaborative filtering to weave an information tapestry, Communications of the ACM, 35(12), 61-70. https://doi.org/10.1145/138859.138867
  12. Hariri, N., Mobasher, B., Burke, R., & Zheng, Y. (2011, January). Context-aware recommendation based on review mining. In ITWP@ IJCAI.
  13. Hatamizadeh, A., Yin, H., Kautz, J. & Molchanov, P. (2022). Global Context Vision Transformers, arXiv preprint arXiv:2206.09959.
  14. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
  15. He, R., & McAuley, J. (2016, February). VBPR: visual bayesian personalized ranking from implicit feedback. In Proceedings of the AAAI conference on artificial intelligence (Vol. 30, No. 1).
  16. He, X., Du, X., Wang, X., Tian, F., Tang, J., & Chua, T. S. (2018). Outer product-based neural collaborative filtering. arXiv preprint arXiv:1808.03912.
  17. He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web (pp. 173-182).
  18. Kang, W. C., & McAuley, J. (2018, November). Self-attentive sequential recommendation. In 2018 IEEE international conference on data mining (ICDM) (pp. 197-206). IEEE.
  19. Kolesnikov, A., Beyer, L., Zhai, X., Puigcerver, J., Yung, J., Gelly, S., & Houlsby, N. (2020, August). Big transfer (bit): General visual representation learning. In European conference on computer vision (pp. 491-507). Springer, Cham.
  20. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25.
  21. Kulkarni, S., & Rodd, S. F. (2020). Context Aware Recommendation Systems: A review of the state of the art techniques. Computer Science Review, 37, 100255. https://doi.org/10.1016/j.cosrev.2020.100255
  22. LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D. (1989). Backpropagation applied to handwritten zip code recognition. Neural computation, 1(4), 541-551. https://doi.org/10.1162/neco.1989.1.4.541
  23. Li, Y., Mao, H., Girshick, R., & He, K. (2022). Exploring Plain Vision Transformer Backbones for Object Detection, arXiv preprint arXiv:2203.16527.
  24. Rendle, S. (2010, December). Factorization machines. In 2010 IEEE International conference on data mining (pp. 995-1000). IEEE.
  25. Sun, F., Liu, J., Wu, J., Pei, C., Lin, X., Ou, W., & Jiang, P. (2019, November). BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer. In Proceedings of the 28th ACM international conference on information and knowledge management (pp. 1441-1450).
  26. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1-9).
  27. Touvron, H., Vedaldi, A., Douze, M., & Jegou, H. (2019). Fixing the train-test resolution discrepancy. Advances in neural information processing systems, 32.
  28. Wu, Y. H. and Chen, A. L. (2000), Index structures of user profiles for efficient web page filtering services, In 2012 IEEE 32nd International Conference on Distributed Computing Systems, 644-644.
  29. Xie, Y., Zhou, P., & Kim, S. (2022). Decoupled Side Information Fusion for Sequential Recommendation. arXiv preprint arXiv:2204.11046.
  30. Yuan, X., Duan, D., Tong, L., Shi, L., & Zhang, C. (2021, July). ICAI-SR: Item Categorical Attribute Integrated Sequential Recommendation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1687-1691).
  31. Zhang, T., Zhao, P., Liu, Y., Sheng, V. S., Xu, J., Wang, D., ... & Zhou, X. (2019, August). Feature-level Deeper Self-Attention Network for Sequential Recommendation. In IJCAI (pp. 4320-4326).
  32. Zhou, K., Wang, H., Zhao, W. X., Zhu, Y., Wang, S., Zhang, F., ... & Wen, J. R. (2020, October). S3-rec: Self-supervised learning for sequential recommendation with mutual information maximization. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (pp. 1893-1902).