• Title/Summary/Keyword: Data normality


A Simple Chi-squared Test of Multivariate Normality Based on the Spherical Data

  • Park, Cheolyong
    • Communications for Statistical Applications and Methods
    • /
    • v.8 no.1
    • /
    • pp.117-126
    • /
    • 2001
  • We provide a simple chi-squared test of multivariate normality based on rectangular cells on the spherical data. The test is simple because it is a direct extension of the univariate chi-squared test to the multivariate case and the expected cell counts are easily computed. We derive the limiting distribution of the chi-squared statistic via conditional limit theorems. We study the accuracy of the limiting distribution in finite samples and then compare the power of our test with those of other popular tests in an application to a real data set.


A modified test for multivariate normality using second-power skewness and kurtosis

  • Namhyun Kim
    • Communications for Statistical Applications and Methods
    • /
    • v.30 no.4
    • /
    • pp.423-435
    • /
    • 2023
  • The Jarque and Bera (1980) statistic is one of the well-known statistics for testing univariate normality. It is based on the sample skewness and kurtosis, which are the sample standardized third and fourth moments. Desgagné and de Micheaux (2018) proposed an alternative form of the Jarque-Bera statistic based on the sample second-power skewness and kurtosis. In this paper, we generalize the statistic to a multivariate version by considering some data-driven directions, namely the directions given by the normalized standardized scaled residuals. The statistic is a modified multivariate version of Kim (2021), where the statistic is generalized using an empirical standardization of the scaled residuals of the data. A simulation study reveals that the proposed statistic shows better power when the dimension of the data is large.
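For orientation, the classical univariate Jarque-Bera statistic mentioned in this abstract can be sketched as follows. This is only the standard textbook form, JB = n/6 · (S² + (K − 3)²/4), not the second-power or multivariate version the paper proposes; the function name `jarque_bera` and the sample values are illustrative assumptions.

```python
import math

def jarque_bera(sample):
    """Classical Jarque-Bera statistic (textbook form, not the paper's variant).

    Returns (skewness, kurtosis, jb, p_value), where p_value uses the
    asymptotic chi-squared distribution with 2 degrees of freedom.
    """
    n = len(sample)
    mean = sum(sample) / n
    # Central moments with divisor n (biased moments, as in the usual JB form).
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5        # standardized third moment
    kurtosis = m4 / m2 ** 2          # standardized fourth moment (3 for a normal)
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of chi-squared with 2 df is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return skewness, kurtosis, jb, p_value

if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    print(jarque_bera([1.0, 2.0, 3.0, 4.0, 5.0]))
```

The chi-squared approximation is asymptotic, so the p-value is rough for samples as small as the one in the usage line.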

A Non-parametric Analysis of the Tam-Jin River : Data Homogeneity between Monitoring Stations (탐진강 수질측정 지점 간 동질성 검정을 위한 비모수적 자료 분석)

  • Kim, Mi-Ah;Lee, Su-Woong;Lee, Jae-Kwan;Lee, Jung-Sub
    • Journal of Korean Society on Water Environment
    • /
    • v.21 no.6
    • /
    • pp.651-658
    • /
    • 2005
  • Non-parametric analysis is a powerful tool for testing data, especially water quality data that are not normally distributed. Data from three monitoring stations on the Tam-Jin River were evaluated for normality using skewness, Q-Q plots, and the Shapiro-Wilk test. The dataset comprised various water quality constituents, including temperature, pH, DO, SS, BOD, COD, TN, and TP, for the period from January 1994 to December 2004. The Shapiro-Wilk normality test was carried out at the 5% significance level. Most water quality data, except DO at monitoring stations 1 and 2, were not normally distributed, which indicates that non-parametric methods must be used for water quality data. Homogeneity was therefore tested with the Mann-Whitney U test (p<0.05), with the stations compared in three pairs. Differences between stations 1 and 2 and between stations 1 and 3 were significant for pH, BOD, COD, TN, and TP, but those between Tam-Jin stations 2 and 3 were not. In addition, a narrow gap between water quality ranges does not constitute a difference. Categories in which all three pairs of stations (1 and 2, 2 and 3, 1 and 3) in the Tam-Jin River showed differences in water quality were analyzed for TN and TP. The results of this research suggest an appropriate analysis for the homogeneity test of water quality data and reasonable management of pollutant sources.
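The pairwise station comparison described in this abstract relies on the Mann-Whitney U test. A minimal standard-library sketch is given below; it uses midranks for ties and a normal approximation without tie or continuity correction, which is an assumption made here for brevity and not necessarily what the authors used. The station values are hypothetical.

```python
import math
from statistics import NormalDist

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p) using a plain normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u_y = n1 * n2 - u_x
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    z = (u_x - mean_u) / sd_u
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u_x, u_y, p_value

if __name__ == "__main__":
    # Hypothetical TN readings at two stations, for illustration only.
    station_1 = [2.1, 2.4, 2.2, 2.8, 2.5, 2.9]
    station_2 = [3.0, 3.4, 2.7, 3.6, 3.1, 3.3]
    print(mann_whitney_u(station_1, station_2))
```

For small samples or heavily tied data, an exact or tie-corrected p-value would be preferable to this approximation.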

Testing Whether Failure Rate Changes its Trend Using Censored Data

  • Jeong, Hai-Sung;Na, Myung-Hwan;Kim, Jae-Joo
    • International Journal of Reliability and Applications
    • /
    • v.1 no.2
    • /
    • pp.115-121
    • /
    • 2000
  • The trend change in aging properties of a life distribution, such as the failure rate and mean residual life, is important to engineers and reliability analysts. In this paper, we develop a test statistic for testing whether or not the failure rate changes its trend using censored data. The asymptotic normality of the test statistic is established. We also discuss the values of the efficiency loss due to censoring.


The System for Checking Multivariate Normality and Outliers

  • 강명래;최용석
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2000.11a
    • /
    • pp.253-255
    • /
    • 2000
  • To use multivariate analysis techniques, the data must satisfy the normality assumption. In this study, we introduce a system built in a GUI (graphical user interface) environment that supports normality testing of univariate and multivariate data, outlier removal, and variable transformation, so that users can carry out these tasks more conveniently.


A Study on the Inference Model of In-use Vehicles Emission Distribution according to the Vehicle Mileage (주행거리별 운행차 배출가스 분포 추정 모델에 관한 연구)

  • 김현우
    • Transactions of the Korean Society of Automotive Engineers
    • /
    • v.10 no.4
    • /
    • pp.85-92
    • /
    • 2002
  • To investigate how safely in-use vehicle emissions meet the tail-pipe emission regulation, the trend of in-use vehicle emissions with vehicle mileage must be known. However, it is impossible to collect emission data from all vehicles for this purpose. It is therefore necessary to establish a statistically meaningful, generally applicable inference method that estimates the distribution of in-use vehicle emissions by mileage from a relatively small amount of emission data. To this end, a linear regression model that addresses the problems of data normality and common error variance was studied. To secure data normality, ln(emission) rather than the emission itself was used as the sampled data, and the reciprocal of mileage was suggested as a factor to secure common error variance. As an example, 36 data points from the FTP-75 test were handled in this study. Using the mean and standard deviation at each mileage inferred from the linear regression model, the probability density and cumulative distributions of emissions by vehicle mileage were obtained. This made it possible to predict the deterioration factor through the full useful life mileage and to decide whether the in-use vehicles will meet the tail-pipe emission regulations.
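The log-transform idea in this abstract can be illustrated with a small sketch: regress ln(emission) on mileage by ordinary least squares, then read off a log-normal probability of staying under a limit at a given mileage. This is a simplified illustration under stated assumptions, not the paper's model; in particular, the abstract does not say exactly how the reciprocal of mileage enters, so that factor is not modeled here. All names and data values are hypothetical.

```python
import math
from statistics import NormalDist

def fit_log_linear(mileage, emission):
    """OLS fit of ln(emission) = intercept + slope * mileage.

    Returns (intercept, slope, sigma), where sigma is the residual standard
    deviation on the log scale (divisor n - 2).
    """
    y = [math.log(e) for e in emission]
    n = len(y)
    x_bar = sum(mileage) / n
    y_bar = sum(y) / n
    sxx = sum((x - x_bar) ** 2 for x in mileage)
    sxy = sum((x - x_bar) * (v - y_bar) for x, v in zip(mileage, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [v - (intercept + slope * x) for x, v in zip(mileage, y)]
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return intercept, slope, sigma

def prob_within_limit(limit, at_mileage, intercept, slope, sigma):
    """P(emission <= limit) at a mileage, assuming ln(emission) is normal."""
    mu = intercept + slope * at_mileage
    return NormalDist(mu, sigma).cdf(math.log(limit))

if __name__ == "__main__":
    # Hypothetical mileage (in 10,000 km) and emission (g/km) values.
    mileage = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
    emission = [0.10, 0.12, 0.13, 0.17, 0.18, 0.23]
    a, b, s = fit_log_linear(mileage, emission)
    print(a, b, s, prob_within_limit(0.25, 8.0, a, b, s))
```

Working on the log scale keeps predicted emissions positive and yields a log-normal emission distribution at each mileage, which matches the abstract's use of a mean and standard deviation per mileage.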

A Study on Split Variable Selection Using Transformation of Variables in Decision Trees

  • Chung, Sung-S.;Lee, Ki-H.;Lee, Seung-S.
    • Journal of the Korean Data and Information Science Society
    • /
    • v.16 no.2
    • /
    • pp.195-205
    • /
    • 2005
  • In decision tree analysis, the C4.5 and CART algorithms suffer from computational complexity and bias in variable selection. The QUEST algorithm solves these problems by separating the variable selection step from the split point selection step. When input variables are continuous, the QUEST algorithm uses the ANOVA F-test under the assumptions of normality and homogeneity of variances. In this paper, we investigate the influence of violating the normality assumption and the effect of transforming variables in the QUEST algorithm. In the simulation study, we obtained the empirical powers and the empirical bias of variable selection after transforming variables with various types of underlying distributions.


Power Analysis for Normality Plots (정규성 그래프의 검정력 비교)

  • Lee, Jae-Young;Rhee, Seong-Won
    • Journal of the Korean Data and Information Science Society
    • /
    • v.10 no.2
    • /
    • pp.429-436
    • /
    • 1999
  • We suggest test statistics for normality using the Q-Q plot and the P-P plot and obtain empirical quantities of these statistics. A power comparison with the Shapiro-Wilk W is also conducted by a Monte Carlo study.
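One common way to turn a normal Q-Q plot into a test statistic is the correlation between the ordered sample and theoretical normal quantiles. The sketch below shows that generic idea; the abstract does not state the exact form of the authors' statistics, so this should not be read as their method. The Blom plotting positions (i − 0.375)/(n + 0.25) and the function names are assumptions.

```python
import math
from statistics import NormalDist

def normal_scores(n):
    """Approximate expected normal order statistics via Blom plotting positions."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((i - 0.375) / (n + 0.25))
            for i in range(1, n + 1)]

def qq_correlation(sample):
    """Pearson correlation between the sorted sample and the normal scores.

    Values close to 1 indicate a nearly straight Q-Q plot; small values
    are evidence against normality.
    """
    ordered = sorted(sample)
    scores = normal_scores(len(ordered))
    mean_x = sum(scores) / len(scores)
    mean_y = sum(ordered) / len(ordered)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, ordered))
    sxx = sum((x - mean_x) ** 2 for x in scores)
    syy = sum((y - mean_y) ** 2 for y in ordered)
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__":
    # Hypothetical samples, for illustration only.
    print(qq_correlation([4.9, 5.1, 5.0, 4.7, 5.3, 5.2, 4.8]))
    print(qq_correlation([1.0, 1.0, 1.0, 1.0, 100.0]))
```

Critical values for such a statistic are usually obtained by simulation under normality, which is in the spirit of the Monte Carlo study the abstract mentions.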


A note on the test for the covariance matrix under normality

  • Park, Hyo-Il
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.1
    • /
    • pp.71-78
    • /
    • 2018
  • In this study, we consider the likelihood ratio test for the covariance matrix of the multivariate normal data. For this, we propose a method for obtaining null distributions of the likelihood ratio statistics by the Monte-Carlo approach when it is difficult to derive the exact null distributions theoretically. Then we compare the performance and precision of distributions obtained by the asymptotic normality and the Monte-Carlo method for the likelihood ratio test through a simulation study. Finally we discuss some interesting features related to the likelihood ratio test for the covariance matrix and the Monte-Carlo method for obtaining null distributions for the likelihood ratio statistics.
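The Monte Carlo approach to a null distribution can be sketched for one concrete case. The abstract does not say which covariance hypotheses are treated, so the example below assumes the bivariate test of H0: Σ = I with unknown mean, whose likelihood ratio statistic is −2 log λ = n (tr S − ln |S| − p) with S the maximum likelihood covariance estimate. The function names, sample size, and replication count are illustrative assumptions.

```python
import math
import random

def lrt_identity_cov_2d(points):
    """-2 log(likelihood ratio) for H0: covariance = identity (p = 2)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Maximum likelihood covariance estimate (divisor n).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    det = sxx * syy - sxy ** 2
    return n * (sxx + syy - math.log(det) - 2.0)

def monte_carlo_null(n, reps, rng):
    """Simulate the statistic under H0 and return the sorted values."""
    stats = []
    for _ in range(reps):
        sample = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
        stats.append(lrt_identity_cov_2d(sample))
    stats.sort()
    return stats

def empirical_quantile(sorted_stats, prob):
    """Empirical quantile: smallest value with at least prob of the mass below."""
    index = int(math.ceil(prob * len(sorted_stats))) - 1
    index = max(0, min(len(sorted_stats) - 1, index))
    return sorted_stats[index]

if __name__ == "__main__":
    rng = random.Random(12345)
    null_stats = monte_carlo_null(n=20, reps=2000, rng=rng)
    # Asymptotic reference: chi-squared with p(p+1)/2 = 3 df, 95% point 7.815.
    print(empirical_quantile(null_stats, 0.95))
```

Comparing the simulated 95% point with the asymptotic chi-squared value gives a rough sense of how well the large-sample approximation holds at a given sample size.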

Asymptotic Distribution of the LM Test Statistic for the Nested Error Component Regression Model

  • Jung, Byoung-Cheol;Myoungshic Jhun;Song, Seuck-Heun
    • Journal of the Korean Statistical Society
    • /
    • v.28 no.4
    • /
    • pp.489-501
    • /
    • 1999
  • In this paper, we consider the panel data regression model in which the disturbances have a nested error component. We derive a Lagrange multiplier (LM) test that jointly tests for the presence of random individual effects and nested effects under the normality assumption on the disturbances. This test extends the earlier work of Breusch and Pagan (1980) and Baltagi and Li (1991). Further, it is shown that this LM test has the same asymptotic distribution without the normality assumption on the disturbances.
