A Regularity-Based Preprocessing Method for Collaborative Recommender Systems

Toledo, Raciel Yera; Mota, Yaile Caballero; Borroto, Milton Garcia

  • Received : 2013.04.02
  • Accepted : 2013.07.21
  • Published : 2013.09.30


Recommender systems are popular applications that help users identify items they may be interested in. A recent line of research on recommender systems focuses on detecting inconsistencies in user preferences. However, most previous work in this direction only addresses anomalies that users introduce intentionally. In contrast, this paper focuses on removing non-malicious anomalies, specifically in collaborative filtering systems. A review of the state of the art shows that no previous work, either in recommender systems or in general data mining scenarios, performs exactly this preprocessing task. We propose a method that extracts knowledge from the dataset in the form of rating regularities (similar to frequent patterns) and uses these regularities to remove anomalous preferences provided by users. Experiments show that applying the procedure as a preprocessing step improves the performance of a data mining task associated with recommendation and also effectively detects the anomalous preferences.
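As a rough illustration of the general idea only, and not the method proposed in the paper, the sketch below mines simple pairwise regularities of the form "users who gave rating r_a to item a usually gave rating r_b to item b" and flags ratings that contradict them. The function names, the pairwise rule form, the support and confidence thresholds, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from itertools import permutations


def mine_regularities(ratings, min_support=2, min_confidence=0.75):
    """Mine pairwise rating regularities (hypothetical, simplified rule form).

    ratings: dict mapping user -> dict mapping item -> rating.
    Returns a dict {((item_a, r_a), item_b): r_b}.
    Assumes min_confidence > 0.5, so at most one r_b qualifies per key.
    """
    antecedent_count = Counter()  # users who rated item_b and gave r_a to item_a
    pair_count = Counter()        # same users, split by the rating r_b they gave
    for prefs in ratings.values():
        for (a, ra), (b, rb) in permutations(prefs.items(), 2):
            antecedent_count[((a, ra), b)] += 1
            pair_count[((a, ra), b, rb)] += 1

    rules = {}
    for (ante, b, rb), n in pair_count.items():
        confidence = n / antecedent_count[(ante, b)]
        if n >= min_support and confidence >= min_confidence:
            rules[(ante, b)] = rb
    return rules


def flag_anomalies(ratings, rules, min_votes=1):
    """Return (user, item, rating) triples contradicted by at least min_votes rules."""
    flagged = []
    for user, prefs in ratings.items():
        for b, rb in prefs.items():
            votes = sum(
                1
                for a, ra in prefs.items()
                if a != b and rules.get(((a, ra), b), rb) != rb
            )
            if votes >= min_votes:
                flagged.append((user, b, rb))
    return flagged


# Toy data (invented): four users agree, one gives a contradicting rating.
toy = {
    "u1": {"A": 5, "B": 5},
    "u2": {"A": 5, "B": 5},
    "u3": {"A": 5, "B": 5},
    "u4": {"A": 5, "B": 5},
    "u5": {"A": 5, "B": 1},
}
toy_rules = mine_regularities(toy)
toy_flagged = flag_anomalies(toy, toy_rules)
```

In a preprocessing setting, the flagged triples would be removed from the rating matrix before a collaborative filtering model is trained; the thresholds control how aggressive that filtering is.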


Collaborative Recommender Systems; Inconsistencies; Rating Regularities


