Naive Bayes classifiers boosted by sufficient dimension reduction: applications to top-k classification

  • Received: 2022.04.28
  • Reviewed: 2022.05.31
  • Published: 2022.09.30

Abstract

The naive Bayes classifier is one of the most straightforward classification tools and directly estimates the class probabilities. However, it relies on the assumption that the predictors are independent, which is rarely satisfied in real-world problems, so its use in practice is limited. In this article, we propose employing sufficient dimension reduction (SDR) to substantially improve the performance of the naive Bayes classifier, which often deteriorates when the number of predictors is not small. The improvement is not surprising: SDR reduces the predictor dimension without sacrificing classification information, and the predictors in the reduced space are constructed to be uncorrelated. SDR therefore makes the naive Bayes classifier no longer naive. We applied the proposed naive Bayes classifier after SDR to build a recommendation system for eyewear frames based on customers' face shapes, demonstrating its utility in the top-k classification problem.
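
The pipeline described in the abstract (reduce the predictors with SDR, fit naive Bayes in the reduced space, then report the k most probable classes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which SDR method the authors use, so sliced inverse regression (SIR) with class labels as slices stands in for it, the naive Bayes model is taken to be Gaussian, the data are synthetic, and the helper names `sir_directions`, `fit_gaussian_nb`, `log_posterior`, and `top_k` are hypothetical.

```python
import numpy as np

def sir_directions(X, y, d):
    """SIR with class labels as slices (illustrative stand-in for SDR)."""
    mean = X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T  # Sigma^{-1/2}
    Z = (X - mean) @ inv_sqrt                            # standardized predictors
    # Weighted outer products of the class means of the standardized predictors
    M = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(y):
        m = Z[y == c].mean(axis=0)
        M += np.mean(y == c) * np.outer(m, m)
    w, v = np.linalg.eigh(M)
    top = v[:, np.argsort(w)[::-1][:d]]                  # leading d eigenvectors
    return mean, inv_sqrt @ top                          # directions on the X scale

def fit_gaussian_nb(Z, y):
    """Per-class priors, feature means, and feature variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([Z[y == c].mean(axis=0) for c in classes])
    vars_ = np.array([Z[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, vars_

def log_posterior(Z, model):
    """Unnormalized log posterior, shape (n_samples, n_classes)."""
    _, priors, means, vars_ = model
    diff = Z[:, None, :] - means[None, :, :]
    loglik = -0.5 * (np.log(2 * np.pi * vars_)[None] + diff ** 2 / vars_[None])
    return np.log(priors)[None] + loglik.sum(axis=2)

def top_k(Z, model, k):
    """Labels of the k highest-scoring classes per sample, best first."""
    order = np.argsort(-log_posterior(Z, model), axis=1)[:, :k]
    return model[0][order]

# Synthetic data (an assumption): 3 classes, 10 correlated predictors.
rng = np.random.default_rng(0)
n, p, K = 600, 10, 3
y = rng.integers(0, K, size=n)
shift = np.zeros((K, p))
shift[1, 0] = 3.0
shift[2, 1] = 3.0
X = rng.normal(size=(n, p)) + shift[y]
X += 0.8 * rng.normal(size=(n, 1))   # shared factor makes predictors correlated

mean, B = sir_directions(X, y, d=K - 1)  # SIR yields at most K - 1 directions
Z = (X - mean) @ B                       # reduced predictors, uncorrelated in-sample
model = fit_gaussian_nb(Z, y)
top2 = top_k(Z, model, k=2)
top2_acc = np.mean((top2 == y[:, None]).any(axis=1))  # in-sample top-2 accuracy
print(f"in-sample top-2 accuracy: {top2_acc:.3f}")
```

In this sketch the reduced predictors have identity sample covariance by construction, which mirrors the abstract's remark that predictors in the reduced space are uncorrelated. Uncorrelatedness is weaker than the independence naive Bayes assumes, and the accuracy here is computed on the training data, so a held-out split would be needed for an honest evaluation.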
