• Title/Summary/Keyword: Multivariate Data


A Test for Multivariate Normality Focused on Elliptical Symmetry Using Mahalanobis Distances

  • Park, Cheol-Yong
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1191-1200 / 2006
  • A chi-squared test of multivariate normality is proposed that focuses mainly on detecting deviations from elliptical symmetry. The test uses the Mahalanobis distances of the observations so that it has some power against deviations from multivariate normality. We derive the limiting distribution of the test statistic using a conditional limit theorem. A simulation study assesses the accuracy of the limiting distribution in finite samples. Finally, we compare the power of our method with that of other popular tests of multivariate normality under two non-normal distributions.
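As a rough illustration of the ingredient this abstract relies on, the sketch below computes squared Mahalanobis distances from the sample mean using NumPy. It is not the test proposed in the paper: the function name `squared_mahalanobis`, the simulated data, and the simple tail-fraction check are all assumptions made for illustration.

```python
import numpy as np

def squared_mahalanobis(x):
    """Squared Mahalanobis distance of each row of x from the sample mean."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    cov = np.cov(x, rowvar=False)  # sample covariance (divisor n - 1)
    solved = np.linalg.solve(cov, centered.T).T  # rows are cov^{-1} (x_i - mean)
    return np.einsum("ij,ij->i", centered, solved)

# Simulated trivariate normal data (illustrative only).
rng = np.random.default_rng(0)
sample = rng.multivariate_normal([0.0, 0.0, 0.0], np.eye(3), size=500)
d2 = squared_mahalanobis(sample)

# Under multivariate normality, d2 is approximately chi-squared with p = 3
# degrees of freedom, so roughly 5% of values should exceed the 95% quantile
# of that distribution (about 7.815).
tail_fraction = float((d2 > 7.815).mean())
```

A chi-squared goodness-of-fit test could then be built by binning `d2` into cells, although the cell construction and limiting distribution used in the paper are not reproduced here.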


Multivariate Test based on the Multiple Testing Approach

  • Hong, Seung-Man;Park, Hyo-Il
    • The Korean Journal of Applied Statistics / v.25 no.5 / pp.821-827 / 2012
  • In this study, we propose a new nonparametric test procedure for multivariate data. To accommodate generalized alternatives in the multivariate case, we construct test statistics from p-values using several useful combining functions. We then illustrate the procedure with an example and compare the efficiency of the combining functions through a simulation study. Finally, we discuss some interesting features of the new nonparametric test as concluding remarks.
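The abstract does not name the combining functions it uses. As a hedged sketch, the code below implements two classical choices, Fisher's and Tippett's, for combining component-wise p-values; the function names are illustrative. The closed-form reference distributions used here assume independent p-values, which component-wise tests on multivariate data generally do not satisfy, so in practice a permutation reference distribution would likely be needed.

```python
import math

def fisher_combine(p_values):
    """Fisher's combining function: -2 * sum(log p) is chi-squared with 2k d.f.
    when the k p-values are independent and uniform under the null."""
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    # A chi-squared variable with even d.f. 2k has a closed-form survival function.
    half = statistic / 2.0
    tail = math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))
    return statistic, tail

def tippett_combine(p_values):
    """Tippett's combining function: combined p-value from the smallest p-value."""
    k = len(p_values)
    return 1.0 - (1.0 - min(p_values)) ** k

# Hypothetical p-values from three component-wise tests.
component_p = [0.04, 0.20, 0.11]
fisher_stat, fisher_p = fisher_combine(component_p)
tippett_p = tippett_combine(component_p)
```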

Control Charts for Means and Variances under Multivariate Normal Process

  • Chang, Duk-Joon;Kwon, Yong-Man
    • Journal of the Korean Data and Information Science Society / v.10 no.1 / pp.223-232 / 1999
  • Multivariate quality control charts based on the combine-accumulate approach and the accumulate-combine approach are investigated for monitoring both the means and variances of a multivariate normal process. The numerical performance of the charts shows that the multivariate EWMA chart with the accumulate-combine approach can be recommended for all kinds of shifts in means and variances.
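For orientation, here is a minimal sketch of the standard multivariate EWMA statistic for the mean vector, assuming known in-control mean and covariance. It does not reproduce the paper's combine-accumulate or accumulate-combine charts for means and variances; the function name `mewma_statistics` and the smoothing constant are assumptions.

```python
import numpy as np

def mewma_statistics(x, mean, cov, lam=0.2):
    """Chart statistic z_t' Sigma_z^{-1} z_t for each observation in x, where
    z_t = lam * (x_t - mean) + (1 - lam) * z_{t-1} and z_0 = 0."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    z = np.zeros_like(mean)
    stats = []
    for t, row in enumerate(x, start=1):
        z = lam * (row - mean) + (1.0 - lam) * z
        # Exact covariance of z_t is scale * cov for independent observations.
        scale = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        stats.append(float(z @ np.linalg.solve(scale * cov, z)))
    return np.array(stats)

# Simulated in-control data followed by a shift in the first mean component.
rng = np.random.default_rng(1)
in_control = rng.multivariate_normal([0.0, 0.0], np.eye(2), size=30)
shifted = rng.multivariate_normal([1.5, 0.0], np.eye(2), size=30)
chart = mewma_statistics(np.vstack([in_control, shifted]), [0.0, 0.0], np.eye(2))
```

A signal is raised when the statistic exceeds a control limit, which is usually chosen by simulation to give a target in-control average run length.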


A Simple Chi-squared Test of Multivariate Normality Based on the Spherical Data

  • Park, Cheolyong
    • Communications for Statistical Applications and Methods / v.8 no.1 / pp.117-126 / 2001
  • We provide a simple chi-squared test of multivariate normality based on rectangular cells on the spherical data. The test is simple because it is a direct extension of the univariate chi-squared test to the multivariate case and the expected cell counts are easily computed. We derive the limiting distribution of the chi-squared statistic via conditional limit theorems. We study the accuracy of the limiting distribution in finite samples and then compare the power of our test with that of other popular tests in an application to a real data set.


Prediction of arrhythmia using multivariate time series data (다변량 시계열 자료를 이용한 부정맥 예측)

  • Lee, Minhai;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.671-681 / 2019
  • Studies on predicting arrhythmia using machine learning have been actively conducted as the number of arrhythmia patients increases. Existing studies have predicted arrhythmia based on multivariate data of feature variables extracted from RR interval data at a specific time point. In this study, we consider that the pattern of changes in the heart state over time can be important information for arrhythmia prediction. We therefore investigate the usefulness of predicting arrhythmia with multivariate time series data obtained by extracting and accumulating the multivariate vectors of the feature variables at various time points. Using the 1-nearest neighbor classification method and its ensemble for comparison, we confirm that the method based on multivariate time series data can achieve better classification performance than the method based on multivariate data if an appropriate time series distance function is selected.
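The abstract does not say which time series distance functions were compared. As one plausible candidate, the sketch below pairs 1-nearest neighbor classification with dynamic time warping (DTW) over multivariate series. The function names, toy series, and labels are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two multivariate series, each a
    sequence of equal-length feature vectors, with Euclidean local cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def predict_1nn(train_series, train_labels, query):
    """Label of the training series closest to the query under DTW."""
    distances = [dtw_distance(series, query) for series in train_series]
    return train_labels[distances.index(min(distances))]

# Toy two-feature series (hypothetical values and labels).
train_series = [
    [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
]
train_labels = ["normal", "arrhythmia"]
query = [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
predicted = predict_1nn(train_series, train_labels, query)
```

DTW allows series of different lengths to be compared, which is why the four-point query can match the three-point training series exactly.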

Multivariate CUSUM control charts for monitoring the covariance matrix

  • Choi, Hwa Young;Cho, Gyo-Young
    • Journal of the Korean Data and Information Science Society / v.27 no.2 / pp.539-548 / 2016
  • This paper studies multivariate CUSUM control charts that use three different control statistics for monitoring the covariance matrix. We obtain the control limits and average run lengths (ARLs) of the proposed charts by computer simulation, and we investigate their performance by comparing ARLs. The purpose of control charts is to detect assignable causes of variation so that these causes can be found and eliminated from the process, thereby reducing variability and improving the process. We show that the charts based on the three control statistics are very effective in detecting shifts, especially shifts in covariances when the variables are highly correlated. When the variables are highly correlated, our overall recommendation is to use the multivariate CUSUM control chart based on the trace for detecting changes in the covariance matrix.

On inference of multivariate means under ranked set sampling

  • Rochani, Haresh;Linder, Daniel F.;Samawi, Hani;Panchal, Viral
    • Communications for Statistical Applications and Methods / v.25 no.1 / pp.1-13 / 2018
  • In many studies, a researcher attempts to describe a population where units are measured for multiple outcomes, or responses. In this paper, we present an efficient procedure based on ranked set sampling to estimate and perform hypothesis testing on a multivariate mean. The method is based on ranking on an auxiliary covariate, which is assumed to be correlated with the multivariate response, in order to improve the efficiency of the estimation. We show that the proposed estimators developed under this sampling scheme are unbiased, have smaller variance in the multivariate sense, and are asymptotically Gaussian. We also demonstrate that the efficiency of the multivariate regression estimator can be improved by using ranked set sampling. A bootstrap routine is developed in the statistical software R to perform inference when the sample size is small. We use a simulation study to investigate the performance of the method under known conditions and apply the method to the biomarker data collected in the China Health and Nutrition Survey (CHNS 2009).
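The sampling scheme described above can be sketched as follows, assuming the usual balanced ranked set sampling design: in each cycle and for each rank, a set of units is drawn, ranked on the auxiliary covariate, and only the unit with that rank is measured. The paper's routine is in R; this Python version, its function names, and the simulated population are illustrative assumptions.

```python
import random

def ranked_set_sample(population, set_size, cycles, rng):
    """Balanced ranked set sample of response vectors.
    population is a list of (covariate, response_vector) pairs."""
    selected = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidates = rng.sample(population, set_size)
            candidates.sort(key=lambda unit: unit[0])  # rank on the covariate only
            selected.append(candidates[rank][1])       # measure one unit per set
    return selected

def mean_vector(responses):
    """Component-wise mean of a list of response vectors."""
    n = len(responses)
    return [sum(column) / n for column in zip(*responses)]

# Simulated population: a covariate z and a bivariate response correlated with z.
rng = random.Random(1)
population = []
for _ in range(1000):
    z = rng.gauss(0.0, 1.0)
    population.append((z, [z + rng.gauss(0.0, 0.5), 2.0 * z + rng.gauss(0.0, 0.5)]))

sample = ranked_set_sample(population, set_size=3, cycles=4, rng=rng)
estimate = mean_vector(sample)
```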

Statistical Methods for Multivariate Missing Data in Health Survey Research (보건조사연구에서 다변량결측치가 내포된 자료를 효율적으로 분석하기 위한 통계학적 방법)

  • Kim, Dong-Kee;Park, Eun-Cheol;Sohn, Myong-Sei;Kim, Han-Joong;Park, Hyung-Uk;Ahn, Chae-Hyung;Lim, Jong-Gun;Song, Ki-Jun
    • Journal of Preventive Medicine and Public Health / v.31 no.4 s.63 / pp.875-884 / 1998
  • Missing observations are common in medical research and health survey research. Several statistical methods to handle the missing data problem have been proposed. The EM (Expectation-Maximization) algorithm is one way of efficiently handling the missing data problem based on sufficient statistics. In this paper, we developed statistical models and methods for survey data with multivariate missing observations. In particular, we adopted the EM algorithm to handle the multivariate missing observations. We assume that the multivariate observations follow a multivariate normal distribution, where the mean vector and the covariance matrix are primarily of interest. We applied the proposed statistical method to analyze data from a health survey. The data set came from a physician survey on the Resource-Based Relative Value Scale (RBRVS). In addition to the EM algorithm, we applied complete-case analysis, which uses only completely observed cases, and available-case analysis, which utilizes all available information. The residual and normal probability plots were evaluated to assess the assumption of normality. We found that the residual sum of squares from the EM algorithm was smaller than those from the complete-case and available-case analyses.
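As a hedged illustration of the approach named in this abstract, the sketch below gives the textbook EM iteration for the mean vector and covariance matrix of a multivariate normal distribution when some entries are missing. The paper's own models are not reproduced; the function name `em_mvn`, the NaN coding of missing values, the fixed iteration count, and the simulated data are assumptions.

```python
import numpy as np

def em_mvn(x, n_iter=50):
    """EM estimates of the mean and covariance of a multivariate normal
    distribution from data whose missing entries are coded as NaN."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    missing = np.isnan(x)
    mu = np.nanmean(x, axis=0)
    sigma = np.diag(np.nanvar(x, axis=0))
    for _ in range(n_iter):
        filled = np.where(missing, 0.0, x)
        correction = np.zeros((p, p))
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():  # nothing observed: use the marginal distribution
                filled[i] = mu
                correction += sigma
                continue
            # E-step: conditional mean and covariance of missing given observed.
            coef = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
            filled[i, m] = mu[m] + coef @ (x[i, o] - mu[o])
            correction[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
        # M-step: update the parameters from the expected sufficient statistics.
        mu = filled.mean(axis=0)
        centered = filled - mu
        sigma = (centered.T @ centered + correction) / n
    return mu, sigma

# Simulated bivariate data with about 30% of the second variable removed.
rng = np.random.default_rng(0)
complete = rng.multivariate_normal([0.0, 1.0], [[1.0, 0.6], [0.6, 1.0]], size=200)
incomplete = complete.copy()
incomplete[rng.random(200) < 0.3, 1] = np.nan
mu_hat, sigma_hat = em_mvn(incomplete)
```

With no missing entries the iteration reduces to the sample mean and the maximum likelihood covariance (divisor n), which gives a simple sanity check.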


MULTIVARIATE JOINT NORMAL LIKELIHOOD DISTANCE

  • Kim, Myung-Geun
    • Journal of applied mathematics & informatics / v.27 no.5_6 / pp.1429-1433 / 2009
  • The likelihood distance for the joint distribution of two multivariate normal distributions with a common covariance matrix is explicitly derived. It is useful for identifying outliers that do not follow the joint multivariate normal distribution with a common covariance matrix. The likelihood distance derived here provides a sound basis for using a generalized Wilks statistic in the influence analysis of two multivariate normal data sets.


Detection of Hotspots on Multivariate Spatial Data

  • Moon, Sung-Ho
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1181-1190 / 2006
  • Statistical analyses of spatial data are important in many fields. Spatial data are taken at specific locations or within specific regions, and their relative positions are recorded. Lattice data are synoptic observations covering an entire spatial region, such as cancer rates for each county in a state. Until now, echelon analysis has been applied only to univariate spatial data, so it has not been possible to detect hotspots in multivariate spatial data. In this paper, we extend the spatial data to a time series structure, analyze them in time-space, and detect the hotspots. An echelon dendrogram is built by piling up the multivariate spatial data to form spatio-temporal data. We then perform a structural analysis of the spatio-temporal data.
