A modification of McFadden's R2 for binary and ordinal response models

  • Ejike R. Ugba (Department of Mathematics and Statistics, School of Economics and Social Sciences, Helmut Schmidt University)
  • Jan Gertheiss (Department of Mathematics and Statistics, School of Economics and Social Sciences, Helmut Schmidt University)
  • Received : 2022.04.01
  • Reviewed : 2022.09.29
  • Published : 2023.01.31

Abstract

Many studies of summary measures of predictive strength for categorical response models regard the likelihood ratio index (LRI), also known as McFadden's R2, as a better option than many other measures. We propose a simple modification of the LRI that adjusts for the effect of the number of response categories on the measure and rescales its values to mimic an underlying latent measure. The modified measure is applicable to both binary and ordinal response models fitted by maximum likelihood. Results from simulation studies and a real data example on the olfactory perception of boar taint show that the proposed measure outperforms most of the widely used goodness-of-fit measures for binary and ordinal models. The proposed R2 also proves largely invariant to an increasing number of response categories in an ordinal model.
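For orientation, the unmodified LRI that the abstract starts from can be sketched as one minus the ratio of the fitted model's log-likelihood to that of the intercept-only model. The snippet below is a minimal illustration for the binary case only; it does not reproduce the proposed modification, whose formula is not given in the abstract. The function names and the fitted probabilities are hypothetical and are supplied by hand rather than obtained from a maximum likelihood fit.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y under probabilities p."""
    return sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )


def mcfadden_r2(y, p_model):
    """Likelihood ratio index: 1 - logL(model) / logL(null)."""
    # The intercept-only (null) model predicts the sample proportion of ones.
    p_null = sum(y) / len(y)
    ll_null = log_likelihood(y, [p_null] * len(y))
    ll_model = log_likelihood(y, p_model)
    return 1.0 - ll_model / ll_null


# Hypothetical data: observed binary responses and assumed fitted probabilities.
y = [0, 0, 1, 1, 0, 1]
p_fitted = [0.2, 0.3, 0.7, 0.8, 0.4, 0.6]
print(round(mcfadden_r2(y, p_fitted), 3))
```

A model that predicts no better than the sample proportion gives an index of 0, and the index approaches 1 as the fitted probabilities approach the observed outcomes.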

Funding

This research was supported in part by the Deutsche Forschungsgemeinschaft (DFG) through grant number GE2353/2-1.
