• Title/Summary/Keyword: sufficient statistics

On hierarchical clustering in sufficient dimension reduction

  • Yoo, Chaeyeon;Yoo, Younju;Um, Hye Yeon;Yoo, Jae Keun
    • Communications for Statistical Applications and Methods / v.27 no.4 / pp.431-443 / 2020
  • The K-means clustering algorithm has been applied successfully in sufficient dimension reduction. Unfortunately, the algorithm does not have reproducibility and nestness, which will be discussed in this paper. These are clear deficits for the K-means clustering algorithm. The hierarchical clustering algorithm, by contrast, has both reproducibility and nestness, but an intensive comparison between the K-means and hierarchical clustering algorithms has not yet been done in a sufficient dimension reduction context. In this paper, we rigorously study the two clustering algorithms for two popular sufficient dimension reduction methodologies, the inverse mean and clustering mean methods, through intensive numerical studies. Simulation studies and two real data examples confirm that the use of the hierarchical clustering algorithm has a potential advantage over the K-means algorithm.
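
The two properties named in this abstract can be illustrated with a small sketch. It assumes that "nestness" means each k-cluster partition is obtained by merging clusters of the (k+1)-cluster partition, and that "reproducibility" means the result does not depend on a random start. The function names, the complete-linkage rule, and the toy points are illustrative choices, not taken from the paper.

```python
from itertools import combinations
from math import dist


def agglomerate(points, k):
    """Merge clusters bottom-up (complete linkage) until k clusters remain.

    Returns a list of frozensets of point indices. There is no random
    initialization, so repeated calls give the same partition.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    while len(clusters) > k:
        # Pick the pair of clusters whose farthest members are closest.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: max(
                dist(points[i], points[j]) for i in pair[0] for j in pair[1]
            ),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


def is_nested(fine, coarse):
    """True if every cluster of `fine` lies inside some cluster of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)


# Hypothetical toy data: three tight pairs of points in the plane.
points = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.1), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8)]
three = agglomerate(points, 3)
two = agglomerate(points, 2)
print(is_nested(three, two))
```

K-means with random starting centers offers neither guarantee in general: two runs may give different partitions, and the 2-cluster solution need not be a merge of the 3-cluster solution.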

Method-Free Permutation Predictor Hypothesis Tests in Sufficient Dimension Reduction

  • Lee, Kyungjin;Oh, Suji;Yoo, Jae Keun
    • Communications for Statistical Applications and Methods / v.20 no.4 / pp.291-300 / 2013
  • In this paper, we propose method-free permutation predictor hypothesis tests in the context of sufficient dimension reduction. Unlike an existing method-free bootstrap approach, predictor hypotheses are evaluated based on p-values, so statistical practitioners may find the tests preferable. Numerical studies validate the developed theories, and a real data application is provided.
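
For readers unfamiliar with permutation p-values, the general idea can be sketched as follows. This is not the test proposed in the paper: the statistic (absolute sample correlation between one predictor and the response), the function names, and the simulated data are all assumptions made for illustration.

```python
import random
from math import sqrt


def abs_corr(x, y):
    """Absolute sample correlation, used here as a stand-in test statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=199, seed=0):
    """Permutation p-value for 'x is unrelated to y'.

    Shuffling x breaks any link with y, so the shuffled statistics
    approximate the null distribution of the observed statistic.
    """
    rng = random.Random(seed)
    observed = abs_corr(x, y)
    shuffled = list(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs_corr(shuffled, y) >= observed:
            exceed += 1
    # The +1 terms count the observed arrangement as one permutation.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical data: y depends strongly on x.
noise = random.Random(1)
x = [i / 10 for i in range(30)]
y = [2 * xi + noise.gauss(0, 0.1) for xi in x]
p_signal = permutation_p_value(x, y)
print(p_signal)
```

The decision rule is the familiar one: reject the predictor hypothesis when the p-value falls below the chosen level.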

Intensive numerical studies of optimal sufficient dimension reduction with singularity

  • Yoo, Jae Keun;Gwak, Da-Hae;Kim, Min-Sun
    • Communications for Statistical Applications and Methods / v.24 no.3 / pp.303-315 / 2017
  • Yoo (2015, Statistics and Probability Letters, 99, 109-113) derives theoretical results for optimal sufficient dimension reduction with a singular inner-product matrix. The results are promising, but Yoo (2015) presents only one simulation study, so an evaluation of its practical usefulness based on numerical studies is necessary. This paper studies the asymptotic behaviors of the method of Yoo (2015) through various simulation models and presents a real data example that focuses on ordinary least squares. Intensive numerical studies show that the $\chi^2$ test by Yoo (2015) outperforms the existing optimal sufficient dimension reduction method. The basis estimation by the former can be theoretically sub-optimal; however, there are no notable differences from that by the latter. This investigation confirms the practical usefulness of Yoo (2015).

Integrated Partial Sufficient Dimension Reduction with Heavily Unbalanced Categorical Predictors

  • Yoo, Jae-Keun
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.977-985 / 2010
  • In this paper, we propose an approach to conduct partial sufficient dimension reduction with heavily unbalanced categorical predictors. For this, we consider integrated categorical predictors and investigate conditions under which the integrated categorical predictor is fully informative for partial sufficient dimension reduction. For illustration, the proposed approach is implemented on optimal partial sliced inverse regression in simulation and data analysis.

Identification of Multiple Outlying Cells in Multi-way Tables

  • Lee, Jong Cheol;Hong, Chong Sun
    • Communications for Statistical Applications and Methods / v.7 no.3 / pp.687-698 / 2000
  • An identification method is proposed to detect more than one outlying cell in multi-way contingency tables. The iterative proportional fitting method is applied to get expected values of several suspected outlying cells. Since the proposed method uses minimal sufficient statistics under quasi log-linear models, expected counts of outlying cells can be estimated under any hierarchical log-linear model. This method is an extension of the backwards-stepping method of Simonoff (1988) and requires fewer iterations to identify outlying cells.
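
The role of iterative proportional fitting here can be sketched for the simplest case. The example below assumes a two-way table, a quasi-independence model, and a single suspected cell treated as a structural zero; the paper itself addresses multi-way tables and several cells. The function name and the toy counts are hypothetical.

```python
def ipf_quasi_independence(table, masked, n_iter=200):
    """Fit quasi-independence by iterative proportional fitting.

    Cells listed in `masked` are excluded from the fit (held at zero);
    the remaining fitted values are rescaled in turn to match the observed
    row and column totals, which are the sufficient statistics of the model.
    Assumes every row and column keeps at least one unmasked cell.
    """
    n_rows, n_cols = len(table), len(table[0])
    keep = [[(i, j) not in masked for j in range(n_cols)] for i in range(n_rows)]
    fit = [[1.0 if keep[i][j] else 0.0 for j in range(n_cols)] for i in range(n_rows)]
    row_target = [sum(table[i][j] for j in range(n_cols) if keep[i][j]) for i in range(n_rows)]
    col_target = [sum(table[i][j] for i in range(n_rows) if keep[i][j]) for j in range(n_cols)]
    for _ in range(n_iter):
        for i in range(n_rows):  # match row totals
            scale = row_target[i] / sum(fit[i])
            fit[i] = [v * scale for v in fit[i]]
        for j in range(n_cols):  # match column totals
            scale = col_target[j] / sum(fit[i][j] for i in range(n_rows))
            for i in range(n_rows):
                fit[i][j] *= scale
    return fit


# Hypothetical 3x3 table: independent everywhere except an inflated cell (0, 0).
observed = [[100, 20, 30], [20, 40, 60], [30, 60, 90]]
fit = ipf_quasi_independence(observed, masked={(0, 0)})
# Under quasi-independence the fit is a_i * b_j on the unmasked cells, so the
# expected count of the masked cell is recovered from three of its neighbours.
expected_00 = fit[0][1] * fit[1][0] / fit[1][1]
print(round(expected_00, 3))
```

Comparing the observed count of the suspected cell with this expected count is what flags it as outlying.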

Unsupervised Speaker Adaptation Based on Sufficient HMM Statistics (SUFFICIENT HMM 통계치에 기반한 UNSUPERVISED 화자 적응)

  • Ko Bong-Ok;Kim Chong-Kyo
    • Proceedings of the KSPS conference / 2003.05a / pp.127-130 / 2003
  • This paper describes an efficient method for unsupervised speaker adaptation. This method is based on selecting a subset of speakers who are acoustically close to a test speaker, and calculating adapted model parameters according to the previously stored sufficient HMM statistics of the selected speakers' data. In this method, only a small amount of unsupervised data from the test speaker is required for the adaptation. Also, by using the sufficient HMM statistics of the selected speakers' data, a quick adaptation can be done. Compared with a pre-clustering method, the proposed method can obtain a better speaker cluster because the clustering result is determined on-line according to the test speaker's data. Experimental results show that the proposed method attains a greater improvement over the speaker-independent model than MLLR does. Moreover, the proposed method utilizes only one unsupervised sentence utterance, while MLLR usually utilizes more than ten supervised sentence utterances.

A NOTE ON SOME HIGHER ORDER CUMULANTS IN k PARAMETER NATURAL EXPONENTIAL FAMILY

  • KIM, HYUN CHUL
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.3 no.2 / pp.157-160 / 1999
  • We show the cumulants of a minimal sufficient statistic in the k-parameter natural exponential family by parameter function and partial parameter function. We find that the cumulants have some merits of both central moments and general cumulants. The first three cumulants are the central moments themselves and the fourth cumulant has a form related to kurtosis.
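
The standard identities behind this statement can be written out as a reminder. The notation is assumed here and not taken from the paper: the family is $f(x;\theta) = h(x)\exp\{\theta^{\top} T(x) - \psi(\theta)\}$ with log-partition function $\psi$, $\kappa_r$ denotes the $r$-th cumulant of a single component of $T$, and $\mu_r$ its $r$-th central moment. In the usual convention the first cumulant is the mean itself, while the second and third coincide with the central moments.

```latex
\begin{align*}
\kappa_{i_1 \cdots i_r}
  &= \frac{\partial^{r} \psi(\theta)}{\partial\theta_{i_1} \cdots \partial\theta_{i_r}}, \\
\kappa_1 &= \mathrm{E}[T], \qquad \kappa_2 = \mu_2, \qquad \kappa_3 = \mu_3, \\
\kappa_4 &= \mu_4 - 3\mu_2^{2}, \qquad
\frac{\kappa_4}{\kappa_2^{2}} = \frac{\mu_4}{\mu_2^{2}} - 3
  \quad \text{(excess kurtosis)}.
\end{align*}
```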

Sufficient conditions for the oracle property in penalized linear regression (선형 회귀모형에서 벌점 추정량의 신의 성질에 대한 충분조건)

  • Kwon, Sunghoon;Moon, Hyeseong;Chang, Jaeho;Lee, Sangin
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.279-293 / 2021
  • In this paper, we show how to construct sufficient conditions for the oracle property in the penalized linear regression model. We give formal definitions of the oracle estimator, penalized estimator, oracle penalized estimator, and the oracle property of the oracle estimator. Based on the definitions, we present a unified way of constructing optimality conditions for the oracle property and sufficient conditions for the optimality conditions that cover most of the existing penalties. In addition, we present an illustrative example and results from a numerical study.

Fused inverse regression with multi-dimensional responses

  • Cho, Youyoung;Han, Hyoseon;Yoo, Jae Keun
    • Communications for Statistical Applications and Methods / v.28 no.3 / pp.267-279 / 2021
  • Regression with multi-dimensional responses is quite common nowadays in the so-called big data era. In such regression, to relieve the curse of dimensionality due to the high dimension of the responses, dimension reduction of the predictors is essential in the analysis. Sufficient dimension reduction provides effective tools for the reduction, but there are few sufficient dimension reduction methodologies for multivariate regression. To fill this gap, we propose two fused slice-based inverse regression methods. The proposed approaches are robust to the numbers of clusters or slices and improve the estimation results over existing methods by fusing many kernel matrices. Numerical studies are presented and compared with existing methods. Real data analysis confirms the practical usefulness of the proposed methods.