A study on log-density ratio in logistic regression model for binary data

  • Received : 2010.10.29
  • Accepted : 2011.01.03
  • Published : 2011.01.31

Abstract

We present methods for studying the log-density ratio, which allow us to select which predictors are needed and how they should enter the logistic regression model. Under multivariate normal distributional assumptions, we investigate the form of the log-density ratio as a function of several predictors. In general, linear, quadratic, and cross-product terms are all required. If the two covariance matrices are equal, the quadratic and cross-product terms are not needed. If the variables are uncorrelated, the cross-product terms are not needed, but the linear and quadratic terms still are.
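
The three cases in the abstract can be read off a short derivation. The sketch below is not taken from the paper. It assumes the predictor vector $x$ is multivariate normal within each response group, $x \mid y = j \sim N(\mu_j, \Sigma_j)$ for $j = 0, 1$, with prior probabilities $\pi_j$. The symbols $\mu_j$, $\Sigma_j$, $\pi_j$, $f_j$, and $c$ are notation introduced here for illustration.

```latex
% Assumed notation (not from the source): f_j is the N(mu_j, Sigma_j) density
% of x in group y = j, and pi_j = P(y = j).
\begin{align*}
\log\frac{P(y=1 \mid x)}{P(y=0 \mid x)}
  &= \log\frac{\pi_1}{\pi_0} + \log\frac{f_1(x)}{f_0(x)}, \\
\log\frac{f_1(x)}{f_0(x)}
  &= c
   + \left(\Sigma_1^{-1}\mu_1 - \Sigma_0^{-1}\mu_0\right)^{\top} x
   - \tfrac{1}{2}\, x^{\top}\!\left(\Sigma_1^{-1} - \Sigma_0^{-1}\right) x, \\
c &= -\tfrac{1}{2}\log\frac{|\Sigma_1|}{|\Sigma_0|}
     - \tfrac{1}{2}\left(\mu_1^{\top}\Sigma_1^{-1}\mu_1
                        - \mu_0^{\top}\Sigma_0^{-1}\mu_0\right).
\end{align*}
% Case 1: Sigma_1 = Sigma_0  =>  the x^T(.)x term vanishes; only linear terms remain.
% Case 2: Sigma_1 and Sigma_0 both diagonal (uncorrelated predictors within each group)
%         =>  Sigma_1^{-1} - Sigma_0^{-1} is diagonal, so the quadratic form
%             contains only squared terms x_k^2 and no cross-products x_k x_l.
```

Under these assumptions, the off-diagonal entries of $\Sigma_1^{-1} - \Sigma_0^{-1}$ are the coefficients of the cross-product terms and the diagonal entries are the coefficients of the squared terms, which is why each simplification of the covariance structure removes a specific set of terms from the logistic model.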
