Independence test of a continuous random variable and a discrete random variable

  • Yang, Jinyoung (Department of Statistics, Ewha Womans University)
  • Kim, Mijeong (Department of Statistics, Ewha Womans University)
  • Received : 2019.10.26
  • Accepted : 2020.02.20
  • Published : 2020.05.31

Abstract

In many cases, we are interested in identifying independence between variables. For continuous random variables, correlation coefficients are often used to describe the relationship between variables; however, correlation does not imply independence. For finite discrete random variables, the Pearson chi-square test can be used to test independence. For a mixed pair of a continuous and a discrete random variable, however, there is no general-purpose independence test. In this study, we develop an independence test for a continuous random variable and a discrete random variable using kernel density estimation, without assuming a specific distribution. We provide statistical criteria for testing independence under some special settings and apply the proposed test to the Pima Indian diabetes data. Through simulations, we calculate false positive rates and true positive rates to compare the proposed test with the Kolmogorov-Smirnov test.
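
As a concrete illustration of the general idea, the sketch below (in R, since the cited data set and packages are R resources) tests independence of a continuous X and a discrete Y by comparing kernel density estimates of X within each level of Y to the pooled estimate, and calibrates the discrepancy by permuting the labels of Y. This is a minimal sketch under assumed choices (an integrated-squared-difference statistic, default bandwidths, the function name kde_indep_test), not the statistic developed in the paper; the two-sample Kolmogorov-Smirnov call shown afterwards is the kind of baseline the abstract mentions for comparison.

## Sketch: under independence, f(x | Y = y) = f(x) for every level y of Y,
## so large deviations of the per-level kernel density estimates from the
## pooled one suggest dependence. The statistic and names here are
## illustrative assumptions, not the paper's method.
kde_indep_test <- function(x, y, B = 500, grid_n = 256) {
  y  <- droplevels(as.factor(y))
  lo <- min(x); hi <- max(x)
  # pooled kernel density estimate of X on a common grid
  f_all <- density(x, from = lo, to = hi, n = grid_n)$y
  stat <- function(yy) {
    # weighted sum over levels of the squared deviation of f(x | Y = y) from f(x)
    sum(sapply(levels(yy), function(lev) {
      xk <- x[yy == lev]
      fk <- density(xk, from = lo, to = hi, n = grid_n)$y
      (length(xk) / length(x)) * mean((fk - f_all)^2)
    }))
  }
  obs  <- stat(y)
  perm <- replicate(B, stat(sample(y)))        # permutation null distribution
  p_value <- (1 + sum(perm >= obs)) / (B + 1)
  list(statistic = obs, p.value = p_value)
}

## Example on the Pima Indian diabetes data (mlbench package):
## is plasma glucose independent of diabetes status?
# library(mlbench)
# data(PimaIndiansDiabetes)
# with(PimaIndiansDiabetes, kde_indep_test(glucose, diabetes))
#
## Kolmogorov-Smirnov baseline for a binary Y: compare the two groups directly.
# with(PimaIndiansDiabetes,
#      ks.test(glucose[diabetes == "pos"], glucose[diabetes == "neg"]))

Because the null distribution of the discrepancy is obtained by permuting the labels, the sketch makes no parametric assumption about the distribution of X, in the same distribution-free spirit as the test described in the abstract.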

References

  1. Baba K, Shibata R, and Sibuya M (2004). Partial correlation and conditional correlation as measures of conditional independence, Australian & New Zealand Journal of Statistics, 46, 657-664. https://doi.org/10.1111/j.1467-842X.2004.00360.x
  2. Chakravarti IM, Laha RG, and Roy J (1967). Handbook of Methods of Applied Statistics (Vol. I), John Wiley & Sons, New York.
  3. Colombo D and Maathuis MH (2014). Order-independent constraint-based causal structure learning, The Journal of Machine Learning Research, 15, 3741-3782.
  4. Kalisch M, Hauser A, Maechler M, et al. (2019). Package 'pcalg'.
  5. Leisch F, Dimitriadou E, et al. (2009). Package 'mlbench'.
  6. Neapolitan RE (2004). Learning Bayesian Networks, Pearson Prentice Hall, Upper Saddle River, NJ.
  7. Neyman J and Pearson ES (1933). On the problem of the most efficient tests of statistical hypotheses, Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 231, 289-337.
  8. Pearl J, Glymour M, and Jewell NP (2016). Causal Inference in Statistics: A Primer, John Wiley & Sons, Chichester.
  9. Pearson K (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 50, 157-175. https://doi.org/10.1080/14786440009463897
  10. Russell SJ and Norvig P (2003). Artificial Intelligence: A Modern Approach (2nd ed), Prentice Hall, Upper Saddle River, N.J., 111-114.
  11. Scutari M, et al. (2019). Package 'bnlearn'.
  12. Scutari M and Denis JB (2014). Bayesian Networks: with Examples in R, Chapman and Hall/CRC, Boca Raton.
  13. Silverman BW (1986). Density Estimation, Chapman and Hall, London.
  14. Spirtes P, Glymour CN, Scheines R, and Heckerman D (2000). Causation, Prediction, and Search (2nd ed), MIT Press, Cambridge, MA.