Statistical micro matching using a multinomial logistic regression model for categorical data

  • Received : 2019.07.08
  • Reviewed : 2019.08.24
  • Published : 2019.09.30

Abstract

Statistical matching is a method of combining multiple data sources that are extracted or surveyed from the same population. It can be used when the variables of interest are not jointly observed. Because it creates synthetic data from existing sources, it is a low-cost approach that can yield substantial benefits. In this paper, we propose several statistical micro matching methods using a multinomial logistic regression model for the case where all variables of interest are categorical or categorized, which is common in sample surveys. Under the conditional independence assumption (CIA), we propose a mixed statistical matching method that is useful when auxiliary information is not available. We also propose a statistical matching method with auxiliary information that reduces the bias of conventional matching methods proposed under the CIA. A simulation study compares the proposed micro matching methods with conventional ones. The results show that the proposed methods outperform the existing ones, especially when the CIA does not hold.
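
For orientation, the two ingredients named in the abstract can be written out in commonly used notation. The symbols are assumptions introduced here, not taken from the paper: X denotes the variables common to both data sources, Y a variable observed only in the first source, Z a categorical variable with K categories observed only in the second, and \beta_k the coefficient vector for category k, with the last category as the reference. This is a sketch of the standard forms of the CIA and of a multinomial logit model; the paper's own specification may differ.

```latex
% Conditional independence assumption (CIA): Y and Z are independent given X
f(y, z \mid x) = f(y \mid x)\, f(z \mid x)

% Multinomial logistic regression for Z given X, with category K as reference
P(Z = k \mid x) = \frac{\exp(x^{\top}\beta_k)}{\sum_{j=1}^{K} \exp(x^{\top}\beta_j)},
\qquad k = 1, \dots, K, \quad \beta_K = 0
```

Under the CIA, the joint distribution of (Y, Z) given X can be recovered from the two sources separately, which is why matching methods based on it can be biased when Y and Z remain associated after conditioning on X.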

Keywords
