• Title/Abstract/Keywords: multivariate data

A Test for Multivariate Normality Focused on Elliptical Symmetry Using Mahalanobis Distances

  • Park, Cheol-Yong
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 4 / pp. 1191-1200 / 2006
  • We propose a chi-squared test of multivariate normality that focuses mainly on detecting deviations from elliptical symmetry. The test uses the Mahalanobis distances of the observations so that it has some power against deviations from multivariate normality. We derive the limiting distribution of the test statistic by a conditional limit theorem. A simulation study examines the accuracy of the limiting distribution in finite samples. Finally, we compare the power of our method with that of other popular tests of multivariate normality under two non-normal distributions.

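
The abstract above builds its test on Mahalanobis distances. As a minimal sketch of that ingredient only (not the author's test statistic), the following computes squared Mahalanobis distances from the sample mean for bivariate data. The function name `squared_mahalanobis` and the sample points are hypothetical, and the restriction to two dimensions is an assumption made to keep the matrix inverse explicit.

```python
def squared_mahalanobis(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Unbiased sample covariance entries (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form (dx, dy) S^{-1} (dx, dy)^T with the 2x2 inverse written out.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


# Hypothetical illustration data, not taken from the paper.
sample = [(2.0, 1.0), (3.5, 2.5), (1.0, 0.5), (4.0, 4.5), (2.5, 2.0)]
distances = squared_mahalanobis(sample)
```

With the divisor n - 1, the squared distances always sum to (n - 1) times the dimension, which is a handy sanity check. For large samples from a multivariate normal distribution they behave approximately like chi-squared variables with as many degrees of freedom as there are dimensions.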

Multivariate Test based on the Multiple Testing Approach

  • Hong, Seung-Man;Park, Hyo-Il
    • 응용통계연구 (The Korean Journal of Applied Statistics) / Vol. 25, No. 5 / pp. 821-827 / 2012
  • In this study, we propose a new nonparametric test procedure for multivariate data. To accommodate generalized alternatives in the multivariate case, we construct test statistics via p-values with some useful combining functions. We then illustrate the procedure with an example and compare the efficiency of the combining functions through a simulation study. Finally, we discuss some interesting features of the new nonparametric test as concluding remarks.
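
The abstract does not say which combining functions the authors use. As a hedged sketch, the three below (Fisher, Tippett, Liptak) are common choices in the p-value combination literature and are an assumption here; the function names and the example p-values are hypothetical. In each case a larger value is stronger evidence against the null hypothesis.

```python
import math
from statistics import NormalDist


def fisher(p_values):
    """Fisher combining function: -2 * sum(log p_i)."""
    return -2.0 * sum(math.log(p) for p in p_values)


def tippett(p_values):
    """Tippett combining function: max(1 - p_i), driven by the smallest p-value."""
    return max(1.0 - p for p in p_values)


def liptak(p_values):
    """Liptak combining function: sum of standard normal quantiles of 1 - p_i."""
    std_normal = NormalDist()
    return sum(std_normal.inv_cdf(1.0 - p) for p in p_values)


# Hypothetical partial p-values, one per component of the multivariate response.
partial_p = [0.04, 0.20, 0.50]
combined = {f.__name__: f(partial_p) for f in (fisher, tippett, liptak)}
```

The null distribution of a combined statistic is usually obtained by permutation when the partial tests are dependent; that step is not shown here.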

Control Charts for Means and Variances under Multivariate Normal Process

  • Chang, Duk-Joon;Kwon, Yong-Man
    • Journal of the Korean Data and Information Science Society / Vol. 10, No. 1 / pp. 223-232 / 1999
  • Multivariate quality control charts with the combine-accumulate approach and the accumulate-combine approach for monitoring both means and variances under a multivariate normal process are investigated. The numerical performance of the charts shows that the multivariate EWMA chart with the accumulate-combine approach can be recommended for all kinds of shifts in means and variances.

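
The two approaches named in the abstract differ in the order of operations. A rough sketch for an EWMA statistic on the mean only is given below, assuming standardized observations with in-control mean zero and identity covariance. The function names, the smoothing constant `lam`, and the example data are hypothetical, and the paper's charts for variances are not reproduced.

```python
def accumulate_combine(observations, lam=0.2):
    """Smooth each component with an EWMA first, then combine into one statistic."""
    p = len(observations[0])
    z = [0.0] * p
    # Inverse of the asymptotic EWMA variance lam / (2 - lam) under identity covariance.
    scale = (2.0 - lam) / lam
    stats = []
    for x in observations:
        z = [lam * xi + (1.0 - lam) * zi for xi, zi in zip(x, z)]
        stats.append(scale * sum(zi * zi for zi in z))
    return stats


def combine_accumulate(observations, lam=0.2):
    """Combine each observation into one statistic first, then smooth it with an EWMA."""
    p = len(observations[0])
    y = float(p)  # in-control expectation of x'x under identity covariance
    stats = []
    for x in observations:
        q = sum(xi * xi for xi in x)
        y = lam * q + (1.0 - lam) * y
        stats.append(y)
    return stats


# Hypothetical standardized observations with a shift in the mean.
shifted = [(0.0, 0.0), (1.5, 1.0), (1.2, 1.4)]
ac_path = accumulate_combine(shifted)
ca_path = combine_accumulate(shifted)
```

A chart signals when its statistic exceeds a control limit, which in practice is tuned by simulation to reach a target in-control average run length.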

A Simple Chi-squared Test of Multivariate Normality Based on the Spherical Data

  • Park, Cheolyong
    • Communications for Statistical Applications and Methods / Vol. 8, No. 1 / pp. 117-126 / 2001
  • We provide a simple chi-squared test of multivariate normality based on rectangular cells on the spherical data. The test is simple because it is a direct extension of the univariate chi-squared test to the multivariate case and the expected cell counts are easily computed. We derive the limiting distribution of the chi-squared statistic via conditional limit theorems. We study the accuracy of the limiting distribution in finite samples and then compare the power of our test with that of other popular tests in an application to a real data set.

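
To illustrate why the expected cell counts are easy to compute, the sketch below counts observations in rectangular cells and forms a Pearson chi-squared statistic. It assumes the data have already been sphered (centered and scaled to identity covariance) and that each axis is cut at standard normal quantiles into `k` equiprobable intervals, so every cell has expected count n / k^p. The equiprobable cells and the function name are assumptions, and the paper's limiting distribution is not reproduced.

```python
from collections import Counter
from statistics import NormalDist


def cell_chi_squared(sphered, k=2):
    """Pearson chi-squared statistic over k**p equiprobable rectangular cells."""
    n = len(sphered)
    p = len(sphered[0])
    std_normal = NormalDist()
    # Cut points that split the standard normal into k equiprobable intervals.
    cuts = [std_normal.inv_cdf(j / k) for j in range(1, k)]

    def interval(value):
        # Index of the interval containing value (number of cut points below it).
        return sum(value > c for c in cuts)

    observed = Counter(tuple(interval(v) for v in row) for row in sphered)
    expected = n / k ** p
    stat = sum((o - expected) ** 2 / expected for o in observed.values())
    # Each empty cell contributes (0 - expected)^2 / expected = expected.
    empty_cells = k ** p - len(observed)
    return stat + empty_cells * expected


# Hypothetical sphered observations, one per quadrant.
balanced = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
statistic = cell_chi_squared(balanced, k=2)
```

Because the mean and covariance used for sphering are estimated from the data, the statistic should not be referred to the usual chi-squared table without the correction the paper derives.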

Prediction of arrhythmia using multivariate time series data

  • 이민혜;노호석
    • 응용통계연구 (The Korean Journal of Applied Statistics) / Vol. 32, No. 5 / pp. 671-681 / 2019
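  • As the number of arrhythmia patients has grown, research on predicting arrhythmia with machine learning has become active. Many existing studies predict arrhythmia from multivariate data of feature variables extracted from RR interval data at a single time point. In this study, we consider that the pattern of how the heart's condition changes over time can also be important information for predicting arrhythmia. We therefore examine the usefulness of predicting arrhythmia with multivariate time series data obtained by extracting multivariate feature vectors at regular time intervals and stacking them. Comparing mainly the 1-Nearest Neighbor method and learners built as ensembles of it, we confirm that the multivariate time series approach, which exploits temporal information through a time series distance function suited to the characteristics of the series, achieves better classification performance.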
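
The abstract does not name the time series distance function that was selected. As a minimal sketch, the code below pairs a 1-nearest-neighbor classifier with dynamic time warping (DTW), a common choice that is an assumption here. The function names, labels, and toy series are hypothetical.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two multivariate series (sequences of equal-length vectors)."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two feature vectors being aligned.
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def predict_1nn(train, query):
    """Return the label of the training series closest to query under DTW."""
    return min(train, key=lambda pair: dtw_distance(pair[0], query))[1]


# Hypothetical training set: (series of 2-D feature vectors, label).
train = [
    ([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)], "normal"),
    ([(2.0, 2.0), (1.0, 1.0), (0.0, 0.0)], "arrhythmia"),
]
query = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
label = predict_1nn(train, query)
```

DTW tolerates local stretching along the time axis, so the query matches the first training series even though their lengths differ.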

Multivariate CUSUM control charts for monitoring the covariance matrix

  • Choi, Hwa Young;Cho, Gyo-Young
    • Journal of the Korean Data and Information Science Society / Vol. 27, No. 2 / pp. 539-548 / 2016
  • This paper studies multivariate CUSUM control charts that use three different control statistics for monitoring the covariance matrix. We obtain the control limits and ARLs of the proposed charts by computer simulation, and we investigate their performance by comparing ARLs. The purpose of a control chart is to detect assignable causes of variation so that these causes can be found and eliminated from the process, which reduces variability and improves the process. We show that the charts based on the three control statistics are very effective in detecting shifts, especially shifts in covariances when the variables are highly correlated. When the variables are highly correlated, our overall recommendation is to use the multivariate CUSUM control chart based on the trace for detecting changes in the covariance matrix.
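
The abstract does not define its three control statistics. As a rough sketch of the general idea only, the code below runs a one-sided CUSUM on the trace of each subgroup's sample covariance matrix. The in-control target, the reference value `k`, the decision limit `h`, and all function names are hypothetical, and this is not claimed to be the authors' trace-based chart.

```python
def trace_of_sample_covariance(subgroup):
    """Sum of the sample variances (divisor n - 1) of each variable in a subgroup."""
    n = len(subgroup)
    p = len(subgroup[0])
    means = [sum(row[j] for row in subgroup) / n for j in range(p)]
    return sum(
        sum((row[j] - means[j]) ** 2 for row in subgroup) / (n - 1)
        for j in range(p)
    )


def cusum_signal(subgroups, target, k, h):
    """Return the 1-based index of the first subgroup at which the CUSUM exceeds h."""
    c = 0.0
    for t, subgroup in enumerate(subgroups, start=1):
        # Accumulate only upward deviations of the trace beyond target + k.
        c = max(0.0, c + trace_of_sample_covariance(subgroup) - target - k)
        if c > h:
            return t
    return None


# Hypothetical subgroups of size 2 on two variables.
in_control = [[(0.0, 0.0), (1.0, 1.0)]] * 3
inflated = [[(0.0, 0.0), (2.0, 2.0)]] * 3
first_signal = cusum_signal(inflated, target=2.0, k=0.5, h=2.0)
```

In a study such as the one described, `h` would be chosen by simulation so that the in-control ARL matches a specified value.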

On inference of multivariate means under ranked set sampling

  • Rochani, Haresh;Linder, Daniel F.;Samawi, Hani;Panchal, Viral
    • Communications for Statistical Applications and Methods / Vol. 25, No. 1 / pp. 1-13 / 2018
  • In many studies, a researcher attempts to describe a population where units are measured for multiple outcomes, or responses. In this paper, we present an efficient procedure based on ranked set sampling to estimate and perform hypothesis testing on a multivariate mean. The method is based on ranking on an auxiliary covariate, which is assumed to be correlated with the multivariate response, in order to improve the efficiency of the estimation. We show that the proposed estimators developed under this sampling scheme are unbiased, have smaller variance in the multivariate sense, and are asymptotically Gaussian. We also demonstrate that the efficiency of the multivariate regression estimator can be improved by using ranked set sampling. A bootstrap routine is developed in the statistical software R to perform inference when the sample size is small. We use a simulation study to investigate the performance of the method under known conditions and apply the method to the biomarker data collected in the China Health and Nutrition Survey (CHNS 2009).
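
The sampling scheme in the abstract can be sketched as follows: draw a set of units, rank them on the auxiliary covariate, and keep the multivariate response of one ranked unit per set. The code is in Python rather than the R used by the authors, and the population model in `draw_unit`, the set size, and the number of cycles are hypothetical choices for illustration.

```python
import random


def ranked_set_sample(draw_unit, set_size, cycles, rng):
    """Collect set_size * cycles responses by ranking each set on the covariate."""
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidates = [draw_unit(rng) for _ in range(set_size)]
            candidates.sort(key=lambda unit: unit[0])  # rank on the covariate only
            # Only the unit with the given rank has its response kept (measured).
            sample.append(candidates[rank][1])
    return sample


def mean_vector(responses):
    """Component-wise mean of a list of response vectors."""
    n = len(responses)
    return [sum(column) / n for column in zip(*responses)]


def draw_unit(rng):
    """Hypothetical population: covariate x and a correlated bivariate response."""
    x = rng.gauss(0.0, 1.0)
    y1 = 1.0 + 0.8 * x + rng.gauss(0.0, 0.5)
    y2 = -2.0 + 0.5 * x + rng.gauss(0.0, 0.5)
    return x, (y1, y2)


rng = random.Random(0)
rss = ranked_set_sample(draw_unit, set_size=3, cycles=50, rng=rng)
estimate = mean_vector(rss)
```

Each rank is represented equally often, which is what makes the sample mean unbiased for the population mean vector under this design.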

Statistical Methods for Multivariate Missing Data in Health Survey Research

  • 김동기;박은철;손명세;김한중;박형욱;안재형;임종건;송기준
    • Journal of Preventive Medicine and Public Health / Vol. 31, No. 4 / pp. 875-884 / 1998
  • Missing observations are common in medical research and health survey research. Several statistical methods for handling the missing data problem have been proposed. The EM (Expectation-Maximization) algorithm is one way of handling the missing data problem efficiently based on sufficient statistics. In this paper, we develop statistical models and methods for survey data with multivariate missing observations. In particular, we adopt the EM algorithm to handle the multivariate missing observations. We assume that the multivariate observations follow a multivariate normal distribution, where the mean vector and the covariance matrix are primarily of interest. We apply the proposed method to data from a physician survey on the Resource-Based Relative Value Scale (RBRVS). In addition to the EM algorithm, we apply the complete-case analysis, which uses only completely observed cases, and the available-case analysis, which uses all available information. The residual and normal probability plots were evaluated to assess the assumption of normality. We found that the residual sum of squares from the EM algorithm was smaller than those from the complete-case and available-case analyses.

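
As a minimal sketch of EM under a normal model with missing values, the code below treats the simplest case: a bivariate normal where the first variable is always observed and the second may be missing (`None`). This restricted pattern, the function name, and the example data are assumptions; the paper's general multivariate setting is not reproduced.

```python
def em_bivariate_normal(data, iterations=200):
    """ML estimates (mu_x, mu_y, s_xx, s_xy, s_yy) when y may be None.

    Assumes at least one complete case and non-constant x.
    """
    n = len(data)
    xs = [x for x, _ in data]
    # x is fully observed, so its ML estimates are available in closed form.
    mu_x = sum(xs) / n
    s_xx = sum((x - mu_x) ** 2 for x in xs) / n
    # Start the y-parameters from the complete cases.
    complete = [(x, y) for x, y in data if y is not None]
    mu_y = sum(y for _, y in complete) / len(complete)
    s_yy = sum((y - mu_y) ** 2 for _, y in complete) / len(complete)
    s_xy = 0.0
    for _ in range(iterations):
        beta = s_xy / s_xx
        resid_var = s_yy - beta * s_xy
        sum_y = sum_yy = sum_xy = 0.0
        # E-step: expected sufficient statistics given x and current parameters.
        for x, y in data:
            if y is None:
                ey = mu_y + beta * (x - mu_x)
                eyy = ey * ey + resid_var
            else:
                ey, eyy = y, y * y
            sum_y += ey
            sum_yy += eyy
            sum_xy += x * ey
        # M-step: update the parameters from the expected sufficient statistics.
        mu_y = sum_y / n
        s_yy = sum_yy / n - mu_y ** 2
        s_xy = sum_xy / n - mu_x * mu_y
    return mu_x, mu_y, s_xx, s_xy, s_yy


# Hypothetical survey records: the second item is missing for two respondents.
records = [(1.0, 2.0), (2.0, 4.0), (3.0, 7.0), (4.0, None), (5.0, None)]
mu_x, mu_y, s_xx, s_xy, s_yy = em_bivariate_normal(records)
```

In this example the incomplete cases have larger x values than the complete ones, so the EM estimate of the mean of y moves above the complete-case mean, which is the kind of difference a comparison with complete-case analysis would reveal.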

MULTIVARIATE JOINT NORMAL LIKELIHOOD DISTANCE

  • Kim, Myung-Geun
    • Journal of applied mathematics & informatics / Vol. 27, No. 5-6 / pp. 1429-1433 / 2009
  • The likelihood distance for the joint distribution of two multivariate normal distributions with a common covariance matrix is explicitly derived. It is useful for identifying outliers that do not follow the joint multivariate normal distribution with a common covariance matrix. The likelihood distance derived here provides a sound basis for using a generalized Wilks statistic in the influence analysis of two multivariate normal data sets.


Detection of Hotspots on Multivariate Spatial Data

  • Moon, Sung-Ho
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 4 / pp. 1181-1190 / 2006
  • Statistical analyses of spatial data are important in various fields. Spatial data are taken at specific locations or within specific regions, and their relative positions are recorded. Lattice data are synoptic observations covering an entire spatial region, such as cancer rates for each county in a state. Until now, echelon analysis has been applied only to univariate spatial data, so it has been impossible to detect hotspots in multivariate spatial data. In this paper, we extend the spatial data to a time series structure, analyze them in time-space, and detect the hotspots. An echelon dendrogram is built by piling up the multivariate spatial data to form temporal spatial data. We perform a structural analysis of the temporal spatial data.
