A unified measure of association for complex data obtained from independence tests

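  • Seung-Chun Lee (이승천; Department of Information Statistics, Hanshin University) ;
  • Moon Yul Huh (허문열; Department of Statistics, Sungkyunkwan University)
  • Received : 2021.05.11
  • Reviewed : 2021.05.11
  • Published : 2021.08.31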

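Abstract

Many measures exist for quantifying the association between two random variables, but they are designed for variables of the same type and are difficult to use on mixed data, in which variables of several types coexist. In this paper, we derive a new measure of association for mixed data from the p-value obtained by testing the independence of two random variables, and we show that the resulting measure is useful for comparing the associations among variables in mixed data.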

Although numerous measures of association exist, most lack generality in that they are not intended to measure the association between random variables of heterogeneous types. On the other hand, many statistical analyses dealing with complex data sets require a very sophisticated measure of association. In this note, the p-value of independence tests is used to obtain a measure of association. The proposed measure has some consistency in measuring association between various types of random variables.
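
The idea of putting different variable types on a common scale through p-values can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, since the abstract does not say which tests or which transformation the authors use. The sketch applies Pearson's chi-square test to a 2x2 table of categorical variables, an approximate correlation test (Fisher's z transform) to a continuous pair, and the hypothetical mapping `association(p) = 1 - p`. All function names are invented for this example.

```python
import math


def chi2_pvalue_2x2(table):
    """p-value of Pearson's chi-square independence test for a 2x2 table (df = 1)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def pearson_pvalue(x, y):
    """Approximate two-sided p-value for zero correlation (Fisher's z transform).

    Assumes len(x) > 3 and that x and y are not perfectly correlated.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    sxx = sum((u - mean_x) ** 2 for u in x)
    syy = sum((v - mean_y) ** 2 for v in y)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def association(p_value):
    """Hypothetical mapping of a p-value to [0, 1]; NOT the paper's formula."""
    return 1.0 - p_value


# Categorical vs. categorical: a strongly dependent 2x2 table.
p_cat = chi2_pvalue_2x2([[20, 0], [0, 20]])

# Continuous vs. continuous: y follows x with small perturbations.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
p_num = pearson_pvalue(x, y)

print(association(p_cat), association(p_num))
```

Because both tests return a p-value, the two pairs can be compared on one scale even though the variables are of different types. A mapping as crude as `1 - p` depends heavily on sample size, so it should be read only as a placeholder for whatever transformation the paper actually proposes.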
