Statistical Inference in Non-Identifiable and Singular Statistical Models

  • Published: 2001.06.01

Abstract

When a statistical model has a hierarchical structure, such as multilayer perceptrons in neural networks or Gaussian mixture density representations, the model includes distributions with unidentifiable parameters when the structure becomes redundant. Since the exact structure is unknown, we need to carry out statistical estimation or learning of the parameters in such a model. From the geometrical point of view, distributions specified by unidentifiable parameters become singular points in the parameter space. This problem has been noted in many statistical models, and the strange behavior of the likelihood ratio statistic, when the null hypothesis lies at a singular point, has been analyzed. The present paper studies the asymptotic behavior of the maximum likelihood estimator and the Bayesian predictive estimator by using a simple cone model, and shows that it is completely different from that in regular statistical models, where the Cramér-Rao paradigm holds. At singularities, the Fisher information metric degenerates, implying that the Cramér-Rao paradigm no longer holds and that classical model selection theories such as AIC and MDL cannot be applied. This paper is a first step toward establishing a new theory for analyzing the accuracy of estimation or learning around singularities.
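The degeneracy of the Fisher information at a singular point can be seen numerically. The sketch below is not the cone model analyzed in the paper; it is a minimal illustrative example, assuming a hypothetical two-component Gaussian mixture f(x; a, b) = (1-a)φ(x) + a·φ(x-b). At b = 0 the two components coincide, the mixing weight a becomes unidentifiable, and the Monte Carlo estimate of the Fisher information matrix has a zero eigenvalue, while at a regular point (b = 1) it is positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    """Standard normal density."""
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

def fisher(a, b, n=200_000):
    """Monte Carlo estimate of the Fisher information of the
    mixture f(x; a, b) = (1-a) phi(x) + a phi(x - b),
    evaluated at the true parameter (a, b)."""
    # Sample x from the mixture itself
    z = rng.random(n) < a
    x = rng.standard_normal(n) + b * z
    f = (1 - a) * phi(x) + a * phi(x - b)
    # Analytic scores: d log f / da and d log f / db
    s_a = (phi(x - b) - phi(x)) / f
    s_b = a * (x - b) * phi(x - b) / f
    S = np.stack([s_a, s_b])
    return S @ S.T / n          # 2x2 empirical Fisher matrix

I_sing = fisher(0.5, 0.0)   # singular point: components coincide
I_reg  = fisher(0.5, 1.0)   # regular point

print(np.linalg.eigvalsh(I_sing))  # smallest eigenvalue is 0: metric degenerates
print(np.linalg.eigvalsh(I_reg))   # both eigenvalues bounded away from 0
```

At b = 0 the score with respect to a vanishes identically, so the information matrix is exactly rank-deficient; no reparametrization fixes this, which is why the usual Cramér-Rao and AIC/MDL arguments break down there.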
