• Title/Summary/Keyword: multivariate normality

Combining cluster analysis and neural networks for the classification problem

  • Kim, Kyungsup;Han, Ingoo
    • Proceedings of the Korean Operations and Management Science Society Conference / 1996.10a / pp.31-34 / 1996
  • Extensive research has compared the performance of neural networks (NN) with that of various statistical techniques for the classification problem. The empirical results of these comparative studies indicate that neural networks often outperform traditional statistical techniques. Moreover, there have been some efforts to combine various classification methods, especially multivariate discriminant analysis with neural networks. While these efforts improve performance, they risk violating the restrictive assumptions of multivariate discriminant analysis, namely multivariate normality of the independent variables and equality of the variance-covariance matrices across groups. In contrast, cluster analysis, like neural networks, relaxes these assumptions. We propose a new approach to classification problems that combines cluster analysis with neural networks. The resulting predictions of the composite model are more accurate than those of each individual technique.

Partially linear multivariate regression in the presence of measurement error

  • Yalaz, Secil;Tez, Mujgan
    • Communications for Statistical Applications and Methods / v.27 no.5 / pp.511-521 / 2020
  • In this paper, we consider a partially linear multivariate model with measurement error in the explanatory variable of the nonparametric part and an m-dimensional response variable. Using the uniform consistency results found for the estimator of the nonparametric part, we derive an estimator of the parametric part. We examine the dependence of the convergence rates on the error distributions and show that the proposed estimator is asymptotically normal. In the main results, both ordinary smooth and super smooth error distributions are considered. Moreover, the derived estimators are applied to the economic behavior of consumers. Our method, which accounts for contaminated data, is found to be more effective than the semiparametric method that ignores measurement errors.

Quantile confidence region using highest density

  • Hong, Chong Sun;Yoo, Myung Soo
    • Communications for Statistical Applications and Methods / v.26 no.1 / pp.35-46 / 2019
  • The Multivariate Confidence Region (MCR) cannot be used to obtain the confidence region of the mean vector of multivariate data when the normality assumption is not satisfied; however, the Quantile Confidence Region (QCR) can be used with a Multivariate Quantile Vector in these cases. The coverage rate of the QCR is better than that of the MCR; however, the QCR has the disadvantage of a wide shape when the probability density function is bimodal. In this study, we propose a Quantile Confidence Region using the Highest Density (QCRHD) method, based on the Highest Density Region (HDR). The coverage rate of the QCRHD is superior to that of the MCR and similar to that of the QCR. When the mean vectors are close to each other, the QCRHD is constructed as a single region similar to the QCR. When the mean vectors are far apart, the QCR has one wide region, whereas the QCRHD has two smaller regions. Based on these features, the QCRHD is found to overcome the disadvantage of the QCR, which may have a wide shape.

Note on Estimating the Eigen System of $\Sigma_1^{-1}\Sigma_2$

  • Kim, Myung-Geun
    • Communications for Statistical Applications and Methods / v.10 no.2 / pp.603-606 / 2003
  • The maximum likelihood estimators of the eigenvalues and eigenvectors of $\Sigma_1^{-1}\Sigma_2$ are shown to be the eigenvalues and eigenvectors of $S_1^{-1}S_2$ under multivariate normality and are explicitly derived. The nature of the eigenvalues and eigenvectors of $\Sigma_1^{-1}\Sigma_2$ or their estimators will be uncovered.
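
As a rough numerical illustration of the statement above (not the paper's derivation), the eigen system of $S_1^{-1}S_2$ can be computed from two samples as follows. This is a minimal sketch: the function name `eigen_system_mle`, the simulated data, and the Cholesky-based symmetrization are assumptions made here for illustration.

```python
import numpy as np

def eigen_system_mle(x1, x2):
    """Eigenvalues and eigenvectors of S1^{-1} S2 (rows of x1, x2 are observations)."""
    # Under normality the ML covariance estimates use divisor n (bias=True).
    s1 = np.cov(x1, rowvar=False, bias=True)
    s2 = np.cov(x2, rowvar=False, bias=True)
    # With S1 = L L^T, the matrix S1^{-1} S2 has the same eigenvalues as the
    # symmetric matrix M = L^{-1} S2 L^{-T}, which eigh can handle stably.
    chol = np.linalg.cholesky(s1)
    half = np.linalg.solve(chol, s2)       # L^{-1} S2
    m = np.linalg.solve(chol, half.T)      # L^{-1} S2 L^{-T}
    m = (m + m.T) / 2.0                    # remove floating-point asymmetry
    vals, u = np.linalg.eigh(m)
    vecs = np.linalg.solve(chol.T, u)      # map back: v = L^{-T} u
    return vals, vecs, s1, s2

# Hypothetical data: two independent normal samples with different scales.
rng = np.random.default_rng(0)
x1 = rng.normal(size=(200, 3))
x2 = rng.normal(size=(150, 3)) * np.array([1.0, 2.0, 3.0])
vals, vecs, s1, s2 = eigen_system_mle(x1, x2)
```

Each column `vecs[:, j]` satisfies `S1^{-1} S2 v = vals[j] v`, and the eigenvalues are real and positive because both sample covariance matrices are positive definite here.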

Bayes Prediction Density in Linear Models

  • Kim, S.H.
    • Communications for Statistical Applications and Methods / v.8 no.3 / pp.797-803 / 2001
  • This paper obtains the Bayes prediction density for the spatial linear model with a non-informative prior. It shows that predictive inference is completely unaffected by departures from the normality assumption in the direction of the elliptical family, and that the structure of the prediction density is unchanged when more than one additional future observation is predicted.

New Dispersion Function in the Rank Regression

  • Choi, Young-Hun
    • Communications for Statistical Applications and Methods / v.9 no.1 / pp.101-113 / 2002
  • In this paper we introduce a new score generating function for the rank regression in the linear regression model. The score function compares the $\gamma$'th and $s$'th powers of the tail probabilities of the underlying probability distribution. We show that the rank estimate asymptotically converges to a multivariate normal distribution. Further, we derive the asymptotic Pitman relative efficiencies and the most efficient values of $\gamma$ and $s$ under symmetric distributions such as the uniform, normal, Cauchy and double exponential distributions, and under asymmetric distributions such as the exponential and lognormal distributions, respectively.

Statistical Methods for Multivariate Missing Data in Health Survey Research (보건조사연구에서 다변량결측치가 내포된 자료를 효율적으로 분석하기 위한 통계학적 방법)

  • Kim, Dong-Kee;Park, Eun-Cheol;Sohn, Myong-Sei;Kim, Han-Joong;Park, Hyung-Uk;Ahn, Chae-Hyung;Lim, Jong-Gun;Song, Ki-Jun
    • Journal of Preventive Medicine and Public Health / v.31 no.4 s.63 / pp.875-884 / 1998
  • Missing observations are common in medical research and health survey research. Several statistical methods to handle the missing data problem have been proposed. The EM algorithm (Expectation-Maximization algorithm) is one of the ways of efficiently handling the missing data problem based on sufficient statistics. In this paper, we developed statistical models and methods for survey data with multivariate missing observations. In particular, we adopted the EM algorithm to handle the multivariate missing observations. We assume that the multivariate observations follow a multivariate normal distribution, where the mean vector and the covariance matrix are primarily of interest. We applied the proposed statistical method to analyze data from a health survey. The data set we used came from a physician survey on the Resource-Based Relative Value Scale (RBRVS). In addition to the EM algorithm, we applied the complete case analysis, which uses only completely observed cases, and the available case analysis, which utilizes all available information. The residual and normal probability plots were evaluated to assess the assumption of normality. We found that the residual sum of squares from the EM algorithm was smaller than those of the complete-case and the available-case analyses.
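
The EM approach described above can be sketched for the textbook case of estimating the mean vector and covariance matrix of a multivariate normal sample with missing entries. This is a generic illustration under assumptions made here (missing values coded as NaN, missing at random, the hypothetical function name `em_mvn`); it is not the authors' implementation and does not use the RBRVS survey data.

```python
import numpy as np

def em_mvn(x, n_iter=200, tol=1e-8):
    """EM estimates of the mean and covariance of a multivariate normal; NaN = missing."""
    n, p = x.shape
    miss = np.isnan(x)
    # Start from available-case means and variances.
    mu = np.nanmean(x, axis=0)
    sigma = np.diag(np.nanvar(x, axis=0))
    for _ in range(n_iter):
        sum_x = np.zeros(p)
        sum_xx = np.zeros((p, p))
        for i in range(n):
            m, o = miss[i], ~miss[i]
            xi = x[i].copy()
            c = np.zeros((p, p))  # conditional covariance of the missing part
            if m.any():
                if o.any():
                    # E-step: regress the missing block on the observed block.
                    b = np.linalg.solve(sigma[np.ix_(o, o)], sigma[np.ix_(o, m)])
                    xi[m] = mu[m] + b.T @ (x[i, o] - mu[o])
                    c[np.ix_(m, m)] = sigma[np.ix_(m, m)] - sigma[np.ix_(m, o)] @ b
                else:
                    xi[m] = mu[m]
                    c = sigma.copy()
            sum_x += xi
            sum_xx += np.outer(xi, xi) + c
        # M-step: update the parameters from the expected sufficient statistics.
        new_mu = sum_x / n
        new_sigma = sum_xx / n - np.outer(new_mu, new_mu)
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if change < tol:
            break
    return mu, sigma
```

With no missing values the procedure reduces to the ordinary sample mean and the maximum likelihood (divisor n) covariance matrix, which gives a simple sanity check.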

Drought Analysis of Nakdong River Basin Based on Multivariate Stochastic Models (다변량 추계학적 모형을 이용한 낙동강 유역의 가뭄해석에 관한 연구)

  • Heo, Jun-Haeng;Kim, Gyeong-Deok;Jo, Won-Cheol
    • Journal of Korea Water Resources Association / v.30 no.2 / pp.155-163 / 1997
  • In this study, drought analysis of the annual flows at the Jindong, Hyunpoong, and Waekwan stations located in the Nakdong River Basin was performed based on multivariate stochastic models. The stochastic models used were the multivariate autoregressive (MAR) model and the multivariate contemporaneous autoregressive (MCAR) model. The MCAR(1) and MAR(1) models were selected as appropriate models for these stations based on the skewness test of normality, the test of uncorrelated residuals, and correlograms of the residual series of each model. The statistics generated by the MCAR(1) and MAR(1) models closely resembled those computed from the historical series. Drought characteristics such as run length, run sum, and run intensity were fairly well reproduced for the various lengths of generated annual flows based on the MCAR(1) and MAR(1) models. Thus, these drought characteristics may provide important information for planning mid- or long-term water supply systems.

Canonical Correlation: Permutation Tests and Regression

  • Yoo, Jae-Keun;Kim, Hee-Youn;Um, Hye-Yeon
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.471-478 / 2012
  • In this paper, we present a permutation test to select the number of pairs of canonical variates in canonical correlation analysis. The existing chi-squared test is known to be valid only under normality. We compare the existing test with the proposed permutation test and study their asymptotic behaviors through numerical studies. In addition, we connect canonical correlation analysis to regression and show that certain inferences in regression can be done through canonical correlation analysis. A regression analysis of real data through canonical correlation analysis is illustrated.
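
To make the idea of a permutation test in canonical correlation analysis concrete, here is a simplified sketch. It tests only whether the largest canonical correlation is larger than expected when the two variable sets are unrelated; the sequential procedure for selecting the number of pairs described in the abstract is not reproduced. The function names, the choice of test statistic, and the QR-based computation (which assumes centered data matrices of full column rank) are assumptions made for illustration.

```python
import numpy as np

def canonical_correlations(x, y):
    """Sample canonical correlations between the columns of x and y."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Orthonormal bases of the two centered column spaces.
    qx, _ = np.linalg.qr(xc)
    qy, _ = np.linalg.qr(yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for the largest canonical correlation."""
    rng = np.random.default_rng(seed)
    observed = canonical_correlations(x, y)[0]
    exceed = 0
    for _ in range(n_perm):
        # Shuffling the rows of y breaks any association with x.
        perm = rng.permutation(len(y))
        if canonical_correlations(x, y[perm])[0] >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Because the reference distribution is built by shuffling rows rather than from a chi-squared approximation, the p-value does not rely on multivariate normality of the data.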

An Alternative Parametric Estimation of Sample Selection Model: An Application to Car Ownership and Car Expense (비정규분포를 이용한 표본선택 모형 추정: 자동차 보유와 유지비용에 관한 실증분석)

  • Choi, Phil-Sun;Min, In-Sik
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.345-358 / 2012
  • In a parametric sample selection model, the distributional assumption is critical for obtaining consistent estimates. Conventionally, the normality assumption has been adopted for the error terms in both the selection and main equations of the model. The normality assumption, however, may excessively restrict the true underlying distribution of the model. This study introduces the $S_U$-normal distribution as the error distribution of a sample selection model. The $S_U$-normal distribution can accommodate a wide range of skewness and kurtosis compared to the normal distribution. It also includes the normal distribution as a limiting distribution. Moreover, the $S_U$-normal distribution can be easily extended to multivariate dimensions. We provide the log-likelihood function and expected value formula based on a bivariate $S_U$-normal distribution in a sample selection model. The simulation results indicate that the $S_U$-normal model outperforms the normal model in terms of the consistency of the estimators. As an empirical application, we estimate a sample selection model for the relationship between car ownership and car expense.