• Title/Summary/Keyword: violation of normality

A comparison of tests for homoscedasticity using simulation and empirical data

  • Anastasios Katsileros;Nikolaos Antonetsis;Paschalis Mouzaidis;Eleni Tani;Penelope J. Bebeli;Alex Karagrigoriou
    • Communications for Statistical Applications and Methods
    • /
    • v.31 no.1
    • /
    • pp.1-35
    • /
    • 2024
  • The assumption of homoscedasticity is one of the most crucial assumptions for many parametric tests used in the biological sciences. The aim of this paper is to compare the empirical probability of type I error and the power of ten parametric and two non-parametric tests for homoscedasticity through simulations under different distributions, numbers of groups, numbers of samples per group, variance ratios and significance levels, as well as through empirical data from an agricultural experiment. According to the findings of the simulation study, when the assumption of normality is not violated and the groups have equal variances and equal numbers of samples, the Bhandary-Dai, Cochran's C, Hartley's Fmax, Levene (trimmed mean) and Bartlett tests are considered robust. The Levene (absolute and square deviations) tests show a high probability of type I error with small sample sizes, and this probability increases as the number of groups rises. When the data groups display a non-normal distribution, researchers should use the Levene (trimmed mean), O'Brien and Brown-Forsythe tests. On the other hand, if the assumption of normality is not violated but diagnostic plots indicate unequal variances between groups, researchers are advised to use the Bartlett, Z-variance, Bhandary-Dai and Levene (trimmed mean) tests. Among the tests considered, the most well-rounded choice is the Levene (trimmed mean) test, which provides satisfactory type I error control and relatively high power. For the scenarios considered in the study, the two non-parametric tests are not recommended. In conclusion, it is suggested to check for normality first and to consider the number of samples per group before choosing the most appropriate test for homoscedasticity.
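
The paper's closing recommendation (check normality first, then choose a homoscedasticity test suited to the data) can be illustrated with standard routines. Below is a minimal Python sketch using scipy and three hypothetical treatment groups; scipy implements only a few of the twelve tests compared in the paper (Bartlett, and Levene with mean/median/trimmed-mean centering), so this is not the authors' simulation code.

```python
# Minimal sketch of the recommended workflow: check each group for normality,
# then choose a homoscedasticity test accordingly (illustrative data only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
groups = [rng.normal(10, 2, 30), rng.normal(10, 2, 30), rng.normal(10, 4, 30)]

# Step 1: Shapiro-Wilk normality check per group.
normal = all(stats.shapiro(g)[1] > 0.05 for g in groups)

# Step 2: Bartlett works well under normality; Levene with a trimmed mean is
# the paper's most well-rounded choice and tolerates non-normal groups.
if normal:
    stat, p = stats.bartlett(*groups)
    name = "Bartlett"
else:
    stat, p = stats.levene(*groups, center="trimmed", proportiontocut=0.1)
    name = "Levene (trimmed mean)"

print(f"{name}: statistic = {stat:.3f}, p-value = {p:.4f}")
```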

A Study on Split Variable Selection Using Transformation of Variables in Decision Trees

  • Chung, Sung-S.;Lee, Ki-H.;Lee, Seung-S.
    • Journal of the Korean Data and Information Science Society
    • /
    • v.16 no.2
    • /
    • pp.195-205
    • /
    • 2005
  • In decision tree analysis, the C4.5 and CART algorithms suffer from computational complexity and bias in variable selection. The QUEST algorithm addresses these problems by separating variable selection from split point selection. When input variables are continuous, QUEST uses the ANOVA F-test under the assumptions of normality and homogeneity of variances. In this paper, we investigate the influence of violating the normality assumption and the effect of transforming variables in the QUEST algorithm. In the simulation study, we obtain the empirical power and the empirical bias of variable selection after transforming variables with various types of underlying distributions.
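
For continuous predictors, QUEST's variable-selection step amounts to scoring each candidate with an ANOVA F-test of the predictor across the classes of the target, which is where a non-normal predictor, and a variance-stabilizing transformation of it, matters. The Python sketch below illustrates that idea with scipy; the synthetic data, the f_test_pvalue helper and the log transformation are assumptions for illustration, not part of the QUEST implementation.

```python
# Sketch of QUEST-style variable selection for continuous predictors:
# each predictor is scored by an ANOVA F-test across target classes,
# optionally after a variance-stabilizing transformation (illustrative only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.integers(0, 3, size=300)                 # three target classes
x1 = rng.normal(loc=y, scale=1.0)                # informative, roughly normal
x2 = rng.lognormal(mean=0.2 * y, sigma=1.0)      # informative, heavily skewed

def f_test_pvalue(x, y):
    """ANOVA F-test p-value of predictor x split by class label y."""
    groups = [x[y == k] for k in np.unique(y)]
    return stats.f_oneway(*groups).pvalue

# A skewed predictor violates the normality assumption behind the F-test;
# a log transformation can make its score comparable again.
for name, x in [("x1", x1), ("x2", x2), ("log(x2)", np.log(x2))]:
    print(name, f_test_pvalue(x, y))
```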

Robust Bayesian analysis for autoregressive models

  • Ryu, Hyunnam;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.2
    • /
    • pp.487-493
    • /
    • 2015
  • Time series data sometimes violate the normality assumption. When the assumption of normality is untenable, more flexible models can be adopted to accommodate heavy tails. The exponential power distribution (EPD) is considered a possible candidate for the errors of a time series model that violates the normality assumption; moreover, using a flexible error distribution such as the EPD allows a robust analysis. In this paper, we consider the EPD as the flexible distribution for the errors of autoregressive models. We also represent this distribution as a scale mixture of uniforms, a form that enables efficient Bayesian estimation via Markov chain Monte Carlo (MCMC) methods.
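
As a rough illustration of the model being fitted, the Python sketch below simulates an AR(1) series with exponential power errors (scipy's gennorm is this family) and draws from the posterior of the AR coefficient and error scale with a plain random-walk Metropolis sampler. The paper's own sampler exploits the scale-mixture-of-uniform representation for a more efficient Gibbs scheme, which is not reproduced here; the priors, tuning constants and fixed shape parameter are assumptions of this sketch.

```python
# Hedged sketch: Bayesian AR(1) with exponential power (EPD) errors, sampled
# with a plain random-walk Metropolis step. The paper's scale-mixture-of-uniform
# Gibbs sampler is not reproduced; the EPD shape (beta) is held fixed here.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulate an AR(1) series with heavy-tailed EPD errors (shape beta < 2).
n, phi_true, sigma_true, beta = 400, 0.6, 1.0, 1.2
eps = stats.gennorm.rvs(beta, scale=sigma_true, size=n, random_state=rng)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + eps[t]

def log_post(phi, log_sigma):
    """Log posterior: EPD likelihood, flat priors on phi in (-1, 1) and log sigma."""
    if not -1.0 < phi < 1.0:
        return -np.inf
    resid = y[1:] - phi * y[:-1]
    return stats.gennorm.logpdf(resid, beta, scale=np.exp(log_sigma)).sum()

# Random-walk Metropolis over (phi, log sigma).
theta = np.array([0.0, 0.0])
lp = log_post(*theta)
draws = []
for _ in range(5000):
    prop = theta + rng.normal(scale=0.05, size=2)
    lp_prop = log_post(*prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    draws.append(theta.copy())

draws = np.array(draws[1000:])  # discard burn-in
print("posterior mean of phi:  ", draws[:, 0].mean())
print("posterior mean of sigma:", np.exp(draws[:, 1]).mean())
```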

Relationship between rural watershed characteristics and stream water quality (농촌유역특성과 하천수질과의 관계)

  • 홍성구;권순국
    • Magazine of the Korean Society of Agricultural Engineers
    • /
    • v.43 no.3
    • /
    • pp.56-65
    • /
    • 2001
  • Interpreting stream water quality data requires scientific or statistical methods. Classical parametric statistical methods may not be appropriate for analyzing water quality data because of violations of normality. In this study, nonparametric statistical methods, such as the Kruskal-Wallis test and the Mann-Whitney test, were used to compare water quality data from several monitoring stations. The water quality data were collected from the Bokha watershed, located in Ichon-city, Kyonggi province. Based on the test results, domestic sewage is the major pollution source. A couple of sub-watersheds with large livestock populations do not show significant differences in water quality parameters. It should be noted that comparing mean values of water quality parameters alone makes it difficult to relate water quality to watershed characteristics. The results also indicate that livestock farming does not significantly affect water quality.
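
A compact Python sketch of the nonparametric comparison described above, using scipy's Kruskal-Wallis and Mann-Whitney implementations; the station names and concentration values are synthetic stand-ins, since the Bokha watershed measurements are not reproduced here.

```python
# Sketch of the nonparametric comparison used in the study: Kruskal-Wallis
# across all monitoring stations, then pairwise Mann-Whitney follow-ups.
# Concentrations below are synthetic placeholders, not the Bokha data.
from itertools import combinations
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
stations = {                          # e.g. total nitrogen (mg/L), synthetic
    "ST1": rng.lognormal(1.0, 0.4, 24),
    "ST2": rng.lognormal(1.3, 0.4, 24),
    "ST3": rng.lognormal(1.0, 0.6, 24),
}

h, p = stats.kruskal(*stations.values())
print(f"Kruskal-Wallis: H = {h:.2f}, p = {p:.4f}")

# Pairwise comparisons (no multiplicity correction in this sketch).
for a, b in combinations(stations, 2):
    u, p = stats.mannwhitneyu(stations[a], stations[b], alternative="two-sided")
    print(f"Mann-Whitney {a} vs {b}: U = {u:.1f}, p = {p:.4f}")
```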

Micro-Study on Stock Splits and Measuring Information Content Using Intervention Method (주식분할 미시분석과 정보효과 측정)

  • Kim, Yang-Yul
    • The Korean Journal of Financial Management
    • /
    • v.7 no.1
    • /
    • pp.1-20
    • /
    • 1990
  • Most studies on market efficiency presume the stability of risk measures and the normality of the residuals left unexplained by the pricing model. This paper re-examines stock splits while taking possible violations of these two assumptions into account. The results do not overturn previous studies, but the size of the excess returns during the two-week period before announcements decreases by 43%. The results also support the view that betas change around announcements and that the serial autocorrelation of residuals is caused by the events. Based on the results, the observed excess returns are most likely explained as compensation to old shareholders for unwanted risk increases in their portfolios, or by the use of incorrect betas in the testing models. In addition, the model suggested in the paper provides a measure of how quickly the market adjusts to newly arriving information and of the intensity of the information content.
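
The abstract assumes familiarity with how excess (abnormal) returns are measured around an announcement. As a hedged point of reference only, the Python sketch below shows the standard market-model computation that split event studies start from, on synthetic data; the paper's actual contribution, an intervention-method model that relaxes the stable-beta and normal-residual assumptions, is not reproduced here.

```python
# Hedged sketch of market-model excess (abnormal) returns around an event.
# All data are synthetic; this is not the paper's intervention model.
import numpy as np

rng = np.random.default_rng(3)
T, event = 250, 200                         # daily returns, announcement at t = 200
r_m = rng.normal(0.0005, 0.01, T)           # market returns
beta_true = 1.2
r_i = 0.0002 + beta_true * r_m + rng.normal(0, 0.015, T)
r_i[event - 10:event] += 0.004              # synthetic run-up over ~2 trading weeks

# Estimate alpha/beta by OLS on a clean window ending before the run-up.
est = slice(0, event - 20)
X = np.column_stack([np.ones(event - 20), r_m[est]])
alpha, beta = np.linalg.lstsq(X, r_i[est], rcond=None)[0]

# Abnormal returns and their cumulative sum over the pre-announcement window.
ar = r_i - (alpha + beta * r_m)
car_pre = ar[event - 10:event].sum()
print(f"beta_hat = {beta:.2f}, CAR over 10 days before announcement = {car_pre:.4f}")
```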
