Abstract
We consider the problem of testing cell probabilities in sparse multinomial data. Aerts et al. (2000) proposed the test statistic $T=\sum_{i=1}^{k}[p_i^{*}-E(p_i^{*})]^2$, based on the local least squares polynomial estimator $p_i^{*}$, and derived its asymptotic distribution. The local least squares estimator, however, may produce negative estimates of cell probabilities. The local maximum likelihood polynomial estimator $\hat{p}_i$ guarantees positive estimates of cell probabilities and has the same asymptotic performance as the local least squares estimator (Baek and Park, 2003). When the cell probabilities differ greatly in size, a test statistic in which the difference between the estimator and the hypothesized probability contributes equally at every cell is not an appropriate measure of overall goodness of fit. We therefore consider a Pearson-type goodness-of-fit test statistic, $T_1=\sum_{i=1}^{k}[p_i^{*}-E(p_i^{*})]^2/p_i$, and show that it is asymptotically normally distributed. We also investigate the asymptotic normality of $T_2=\sum_{i=1}^{k}[p_i^{*}-E(p_i^{*})]^2/p_i$ when the minimum expected cell frequency is very small.
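The effect of dividing by $p_i$ can be illustrated with a minimal numerical sketch. It assumes the cell estimates and their expectations are already available as arrays; the function names `unweighted_stat` and `pearson_type_stat` and the toy numbers are hypothetical and do not come from the paper, and the local polynomial smoothing step itself is not implemented here.

```python
import numpy as np


def unweighted_stat(p_est, p_exp):
    """Sum of squared deviations: every cell contributes with equal weight."""
    d = np.asarray(p_est, dtype=float) - np.asarray(p_exp, dtype=float)
    return float(np.sum(d ** 2))


def pearson_type_stat(p_est, p_exp, p_null):
    """Sum of squared deviations, each divided by the hypothesized cell probability."""
    d = np.asarray(p_est, dtype=float) - np.asarray(p_exp, dtype=float)
    p_null = np.asarray(p_null, dtype=float)
    if np.any(p_null <= 0):
        raise ValueError("hypothesized cell probabilities must be positive")
    return float(np.sum(d ** 2 / p_null))


# Toy two-cell example (assumed values): one large and one small cell.
p_null = [0.9, 0.1]    # hypothesized cell probabilities
p_exp = [0.9, 0.1]     # stand-in for the expectation of the estimator
p_est = [0.88, 0.12]   # stand-in for the smoothed estimates

t_unweighted = unweighted_stat(p_est, p_exp)
t_pearson = pearson_type_stat(p_est, p_exp, p_null)
```

In this toy case both cells deviate by the same absolute amount, so they contribute equally to the unweighted sum, whereas the Pearson-type sum gives the small cell nine times the weight of the large one.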