• Title/Summary/Keyword: two-sample t-test

Effective Sample Sizes for the Test of Mean Differences Based on Homogeneity Test

  • Heo, Sunyeong
    • Journal of Integrative Natural Science / v.12 no.3 / pp.91-99 / 2019
  • Researchers in many fields use the two-sample t-test to confirm treatment effects. The test is generally used for small samples and assumes that two independent random samples are drawn from normal populations with unknown variances. Researchers often run the F-test for equality of variances before testing the treatment effect, because the test statistic and confidence interval for the two-sample t-test take one of two forms depending on whether the variances are equal. Researchers using the two-sample t-test often want to know how large a sample they need for reliable results. This study offers sample-size guidelines based on simulation. The simulations were run for normal populations with different ratios of the two variances and different sample sizes ($\leq 30$). The results are as follows. First, if nothing is known about the equality of variances but the difference can be assumed to be moderate, a sample size of at least 20 is safe in terms of the nominal significance level. Second, the power of the F-test for equality of variances is very low when the sample sizes are small (<30), even when the ratio of the two variances is 2. Third, sample sizes of at least 10 are recommended for the two-sample t-test in terms of the nominal significance level and the error limit.
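
For reference, the two forms of the statistic mentioned in this abstract are conventionally written as below. This is standard textbook notation, not notation taken from the paper: $\bar{X}_i$, $S_i^2$ and $n_i$ denote the sample mean, sample variance and size of sample $i$, and $\nu$ is Satterthwaite's approximate degrees of freedom.

```latex
% Equal variances assumed: pooled-variance statistic, df = n_1 + n_2 - 2
t = \frac{\bar{X}_1 - \bar{X}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}

% Equal variances not assumed: Welch statistic with approximate df \nu
t' = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}},
\qquad
\nu = \frac{\left( \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} \right)^2}
           {\frac{(S_1^2 / n_1)^2}{n_1 - 1} + \frac{(S_2^2 / n_2)^2}{n_2 - 1}}

% Preliminary test of H_0: \sigma_1^2 = \sigma_2^2
F = \frac{S_1^2}{S_2^2} \sim F_{n_1 - 1,\, n_2 - 1} \quad \text{under } H_0
```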

Comparison of the Power of Bootstrap Two-Sample Test and Wilcoxon Rank Sum Test for Positively Skewed Population

  • Heo, Sunyeong
    • Journal of Integrative Natural Science / v.15 no.1 / pp.9-18 / 2022
  • This study examines the power of the bootstrap two-sample test and compares it with the powers of the two-sample t-test and the Wilcoxon rank sum test through simulation. A positively skewed, heavy-tailed distribution was chosen as the population distribution: the chi-square distribution with three degrees of freedom, $\chi^2_3$. The first sample was drawn from $\chi^2_3$. The second sample was drawn independently from the same $\chi^2_3$, and each sampled value $x$ was transformed to $d+ax$. The shift $d$ ranges from 0 to 5 in steps of 0.5, and the scale $a$ ranges from 1.0 to 1.5 in steps of 0.1. The powers of the three methods were evaluated for sample sizes 10, 20, 30, 40, and 50. The null hypothesis was equality of the two population medians for the bootstrap two-sample test and the Wilcoxon rank sum test, and equality of the two population means for the two-sample t-test. The powers were obtained in the R programming language, using wilcox.test() in base R for the Wilcoxon rank sum test, t.test() in base R for the two-sample t-test, and boot.two.bca() in the R wBoot package for the bootstrap two-sample test. The simulation results show that the Wilcoxon rank sum test has the highest power for all 330 (n,a,d) combinations, followed by the two-sample t-test, with the bootstrap two-sample test last. Consequently, when widely accepted classical inference methods are available, they can be recommended in terms of time, cost, and sometimes power.
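
The sampling design described in this abstract can be sketched roughly as follows. This is an illustration in Python rather than the authors' R code; the names `GRID`, `chi2_3`, `draw_samples` and `rank_sum` are hypothetical, and the bootstrap test and the power calculation itself are not reproduced.

```python
import random

# Design grid from the abstract: 5 sample sizes x 6 scales x 11 shifts = 330.
SIZES = [10, 20, 30, 40, 50]
A_VALUES = [round(1.0 + 0.1 * i, 1) for i in range(6)]  # 1.0, 1.1, ..., 1.5
D_VALUES = [0.5 * i for i in range(11)]                 # 0.0, 0.5, ..., 5.0
GRID = [(n, a, d) for n in SIZES for a in A_VALUES for d in D_VALUES]


def chi2_3(rng):
    """Draw one chi-square(3) variate as Gamma(shape=3/2, scale=2)."""
    return rng.gammavariate(1.5, 2.0)


def draw_samples(n, a, d, rng):
    """First sample from chi-square(3); second sample is d + a*x, x ~ chi-square(3)."""
    first = [chi2_3(rng) for _ in range(n)]
    second = [d + a * chi2_3(rng) for _ in range(n)]
    return first, second


def rank_sum(first, second):
    """Sum of the ranks of `first` in the pooled sample (midranks for ties)."""
    pooled = first + second

    def midrank(v):
        less = sum(1 for u in pooled if u < v)
        equal = sum(1 for u in pooled if u == v)
        return less + (equal + 1) / 2

    return sum(midrank(v) for v in first)


if __name__ == "__main__":
    rng = random.Random(2022)  # arbitrary seed, chosen only for repeatability
    x, y = draw_samples(10, 1.2, 0.5, rng)
    print(len(GRID), rank_sum(x, y))
```

Note that `wilcox.test()` in R reports the rank sum of the first sample minus n1(n1+1)/2, so the value returned by `rank_sum` differs from R's `W` by that constant.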

A Study on Teaching Method of Two-Sample Test for Population Mean Difference (두 모집단 모평균 비교의 지도에 관한 연구)

  • Kim Yong-Tae;Lee Jang-Taek
    • The Mathematical Education / v.45 no.2 s.113 / pp.145-154 / 2006
  • The main purpose of this study is to investigate the effect of departures from normality and from equal variances on the two-sample test when the variances are unknown. We found that the type I error changes only negligibly with kurtosis; the change in type I error depends mainly on the skewness of the parent population. In introductory statistics classes where data analysis includes techniques for detecting the skewness of two populations, we recommend the two-sample t-test when the maximal skewness of the two populations is smaller than 4 and the variances appear equal. Furthermore, our simulations reveal that the two-sample t-test appears somewhat more robust than the z-test if the assumption of equal variances is satisfied. In the case of unequal variances, the two-sample t-test appears somewhat more robust provided the t-statistic uses Satterthwaite's approximate degrees of freedom.

Effect of Positively Skewed Distribution on the Two sample t-test: Based on Chi-square Distribution

  • Heo, Sunyeong
    • Journal of Integrative Natural Science / v.14 no.3 / pp.123-129 / 2021
  • This study examines, through simulation, the effect of a positively skewed population distribution on the two-sample t-test. Two independent samples were drawn from the same chi-square distribution, with 3, 5, 10, 15, 20, and 30 degrees of freedom and sample sizes 3, 5, 10, 15, 20, and 30. The chi-square distribution is strongly skewed to the right at small degrees of freedom and becomes more symmetric as the degrees of freedom increase. The simulation results show that when the sampled populations are positively skewed, like a chi-square distribution with small degrees of freedom, the F-test for equality of variances performs poorly even at relatively large degrees of freedom and sample sizes, such as 30 for both, so using the F-test is not recommended. When the two population variances are equal, the skewness of the population distribution does not affect the t-test in terms of the confidence level. However, although the t-test achieved the nominal confidence level even for highly positively skewed distributions and sample sizes as small as three or five, the error limits are very large at small sample sizes. Therefore, if the sampled population is expected to be highly skewed to the right, a relatively large sample size, at least 20, is recommended.
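
To make the remark about skewness concrete: the skewness of a chi-square distribution with k degrees of freedom is sqrt(8/k), a standard result that is not stated in the abstract. The short sketch below evaluates it for the degrees of freedom used in the simulation; the function name is hypothetical.

```python
from math import sqrt


def chi_square_skewness(df):
    """Theoretical skewness of a chi-square distribution with `df` degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return sqrt(8.0 / df)


if __name__ == "__main__":
    # Degrees of freedom listed in the abstract.
    for k in (3, 5, 10, 15, 20, 30):
        print(k, round(chi_square_skewness(k), 3))
```

The skewness decreases monotonically toward 0 as the degrees of freedom grow, which is the sense in which the distribution becomes more symmetric.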

Asymptotic Relative Efficiency of t-test Following Transformations

  • Yeo, In-Kwon
    • Journal of the Korean Statistical Society / v.26 no.4 / pp.467-476 / 1997
  • The two-sample t-test is not expected to be optimal when the two samples are not drawn from normal populations. Following Box and Cox (1964), a transformation is estimated to improve the normality of the transformed data. We investigate the asymptotic relative efficiency of the ordinary t-test versus the t-test applied after the transformation introduced by Yeo and Johnson (1997), under Pitman local alternatives. Theoretical and simulation studies show that the two-sample t-test using transformed data gives higher power than the ordinary t-test for location-shift models.
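
As an illustration of the transformation family referred to here, the following is a minimal sketch of the Yeo-Johnson power transformation in its commonly cited form. It assumes the parameter `lam` is given; the abstract describes estimating it from the data, which is not shown, and the function name is hypothetical.

```python
from math import log1p


def yeo_johnson(x, lam):
    """Yeo-Johnson transform of a single value x with power parameter lam."""
    if x >= 0:
        if lam == 0:
            return log1p(x)                      # log(x + 1)
        return ((x + 1.0) ** lam - 1.0) / lam
    if lam == 2:
        return -log1p(-x)                        # -log(1 - x)
    return -(((1.0 - x) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


if __name__ == "__main__":
    data = [-2.0, -0.5, 0.0, 0.5, 4.0]
    # lam = 1 leaves the data unchanged; lam < 1 pulls in the right tail.
    print([yeo_johnson(v, 1.0) for v in data])
    print([round(yeo_johnson(v, 0.5), 4) for v in data])
```

Unlike the Box-Cox transformation, this family is defined for negative as well as positive values, which is why it has two branches.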

Two-sample Tests for Edge Detection in Noisy Images (잡음영상에서 에지검출을 위한 이표본 검정법)

  • Lim, Dong-Hoon;Park, Eun-Hee
    • The Korean Journal of Applied Statistics / v.14 no.1 / pp.149-160 / 2001
  • In this paper we employ two-sample location tests, such as the Wilcoxon test and the t-test, to detect edges in noisy images. We compute a test statistic on pixel gray levels obtained using an edge-height parameter and compare it with a threshold determined by the significance level. Experimental results on sample images are given, and the performance of these tests is compared in terms of an objective measure.
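
The general idea of comparing a two-sample statistic with a threshold can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's detector: the window is split into left and right halves, a Welch-type t statistic is used, the threshold is supplied by the caller, and the edge-height parameter mentioned in the abstract is omitted because the abstract does not define it. All names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def window_t_statistic(window):
    """Welch-type t statistic comparing the left and right halves of a pixel window.

    `window` is a list of equal-length rows of gray levels. For an odd width
    the middle column is ignored.
    """
    half = len(window[0]) // 2
    left = [p for row in window for p in row[:half]]
    right = [p for row in window for p in row[len(row) - half:]]
    if len(left) < 2:
        raise ValueError("need at least two pixels on each side of the window")
    diff = mean(left) - mean(right)
    se = sqrt(variance(left) / len(left) + variance(right) / len(right))
    if se == 0:
        # Both sides are perfectly flat: no evidence unless the levels differ.
        return 0.0 if diff == 0 else float("inf") * (1 if diff > 0 else -1)
    return diff / se


def is_edge(window, threshold):
    """Flag a vertical edge when |t| exceeds the caller-supplied threshold."""
    return abs(window_t_statistic(window)) > threshold


if __name__ == "__main__":
    step = [[10, 12, 50, 52], [11, 13, 51, 49], [12, 10, 48, 50], [9, 11, 52, 51]]
    flat = [[10, 11, 10, 11], [11, 10, 11, 10]]
    print(is_edge(step, 2.0), is_edge(flat, 2.0))
```

In practice the threshold would be a critical value chosen from the significance level, and other window splits (horizontal, diagonal) would be tested in the same way.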

Note on the Equality of Variances in Two Sample t-Test (두 집단 평균 차이 검정에서 분산의 동질성에 관한 소고)

  • Kim, Sang-Cheol;Lim, Jo-Han
    • Communications for Statistical Applications and Methods / v.17 no.1 / pp.79-88 / 2010
  • Introductory statistics classes present two tests for the equality of two population means, depending on whether the population variances are homogeneous. In practice, however, the variances are also unknown, and practitioners often test their homogeneity before carrying out the two-sample t-test. Many popular statistical packages, such as SAS and SPSS, do the same. In this paper, we study the type I error of this two-stage procedure and propose a procedure that controls it at a given significance level.
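
The two-stage procedure under study can be sketched as follows. This shows only the conventional practice described in the abstract (a preliminary variance test, then the pooled or Welch statistic), not the corrected procedure the authors propose. The function name is hypothetical, and the F critical value `f_crit` is assumed to be looked up by the caller for the chosen level and degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def two_stage_t(x, y, f_crit):
    """Return (method, t statistic, degrees of freedom) for the two-stage procedure.

    Stage 1 compares the ratio of the larger to the smaller sample variance
    with `f_crit`. Stage 2 uses the pooled t-test if the ratio does not exceed
    the critical value and the Welch test otherwise. Both sample variances are
    assumed to be positive.
    """
    n1, n2 = len(x), len(y)
    s1, s2 = variance(x), variance(y)
    diff = mean(x) - mean(y)
    f_ratio = max(s1, s2) / min(s1, s2)

    if f_ratio <= f_crit:
        # Homogeneity not rejected: pooled-variance t with n1 + n2 - 2 df.
        sp2 = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
        return "pooled", diff / sqrt(sp2 * (1 / n1 + 1 / n2)), n1 + n2 - 2

    # Homogeneity rejected: Welch t with Satterthwaite's approximate df.
    v1, v2 = s1 / n1, s2 / n2
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return "welch", diff / sqrt(v1 + v2), df


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 5.0]
    b = [0.0, 10.0, 20.0, 30.0, 40.0]
    print(two_stage_t(a, b, f_crit=4.0))  # f_crit here is an arbitrary illustrative value
```

Because the choice of the second test depends on the outcome of the first, the overall type I error of the combined procedure need not equal the nominal level of either test, which is the issue the paper addresses.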

Improved Statistical Testing of Two-class Microarrays with a Robust Statistical Approach

  • Oh, Hee-Seok;Jang, Dong-Ik;Oh, Seung-Yoon;Kim, Hee-Bal
    • Interdisciplinary Bio Central / v.2 no.2 / pp.4.1-4.6 / 2010
  • The most common type of microarray experiment has a simple design using microarray data obtained from two different groups or conditions. A typical method for identifying differentially expressed genes (DEGs) between two conditions is the conventional Student's t-test. The t-test is based on a simple estimate of the population variance for a gene, the sample variance of its expression levels. Although the empirical Bayes approach improves on the t-statistic by not giving a high rank to genes only because they have a small sample variance, it rests on the same basic assumption as the ordinary t-test, namely equality of variances across experimental groups. The t-test and the empirical Bayes approach suffer from low statistical power because they assume normal, unimodal distributions for the microarray data. We propose a method that addresses these problems and is robust to outliers and skewed data while retaining the advantages of the classical t-test and modified t-statistics. The resulting data transformation to fit the normality assumption increases the statistical power for identifying DEGs using these statistics.

Window Configurations Comparison Based on Statistical Edge Detection in Images (영상에서 윈도우 배치에 따른 통계적 에지검출 비교)

  • Lim, Dong-Hoon
    • Communications for Statistical Applications and Methods / v.16 no.4 / pp.615-625 / 2009
  • In this paper we describe the Wilcoxon test and the t-test, both well known in the two-sample location problem, for detecting edges under different window configurations. The choice of window configuration is an important factor in the performance and cost of edge detectors. Our edge detectors test the mean values of local neighborhoods obtained under the edge model using an edge-height parameter. We compare three window configurations based on statistical tests, using qualitative assessment of the edge maps, objective quantitative measures, and the CPU time required for edge detection.

Sample size comparison for two independent populations (독립인 두 모집단 설계에서의 표본수 비교)

  • Ko, Hae-Won;Kim, Dong-Jae
    • Journal of the Korean Data and Information Science Society / v.21 no.6 / pp.1243-1251 / 2010
  • In clinical trials, it is common to compare a placebo with a new drug. Sample sizes for two independent populations are usually calculated with the t-test among parametric methods and with the Wilcoxon rank-sum test among nonparametric methods. In this paper, we propose a method that uses the power of Kim's (1994) statistic, which is based on the linear placement statistic proposed by Orban and Wolfe (1982). We compare the sample size from the proposed method with that from the sample size formula of Wang et al. (2003), which is based on the Wilcoxon rank-sum test, and with that from the parametric t-test.
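
For orientation, the parametric baseline in such comparisons is often the textbook normal-approximation formula for a two-sided two-sample test with equal group sizes and a common standard deviation. The sketch below implements that formula only; it is an assumption about the baseline, not the paper's proposed method or the formula of Wang et al. (2003), and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect a mean difference `delta`.

    Uses n = 2 * sigma^2 * (z_{1-alpha/2} + z_{power})^2 / delta^2, rounded up.
    """
    if delta == 0:
        raise ValueError("delta must be nonzero")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


if __name__ == "__main__":
    # Half a standard deviation difference, 5% two-sided level, 80% power.
    print(n_per_group(delta=0.5, sigma=1.0))  # prints 63
```

Exact calculations based on the t distribution give slightly larger values than this normal approximation, especially for small samples.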