The significance of t-tests

Ph.D. Topics : Statistics

No, I’m not talking about statistical significance here; I’m talking about practical significance.

The first statistical significance test an intro stats student learns is usually a t-test to test differences between group means. If she goes on to use statistics, she may never use a t-test for such a purpose again. Why not? Because few real-world data analysis projects involve just one dichotomous independent variable and one normally distributed dependent variable. It almost seems like t-tests aren’t that important.

But they are, because they:

  • provide small-sample estimators; they don’t rely on asymptotic properties like many other statistical methods
  • illustrate null hypothesis testing in a simple manner
  • present the basics of frequentist statistics in the barest form possible
  • allow you to test the significance of regression coefficients, something much more common than two-group comparisons in day-to-day data analysis

The t-test is based on the Student’s t distribution, named after the pseudonym used by William Sealy Gosset when he published its description in 1908. Gosset was a chemist at Guinness Brewery in Dublin, and was not allowed to publish work under his own name. Gosset used statistical methods to identify the best strains of barley for use in brewing ale, and revolutionized the small-sample analysis of means.

The t distribution is used in estimating means when the population standard deviation is unknown, which is almost always the case in actual data analysis practice. It’s bell-shaped, like the normal distribution, but is flatter, representing greater uncertainty around the mean compared to the normal. As sample sizes (and thus degrees of freedom) increase, the t-distribution looks more and more like the normal distribution.
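You can see this convergence directly. A minimal sketch using SciPy (the data-free critical values here are illustrative, not from the original post): as the degrees of freedom grow, the two-tailed 95% critical value of the t distribution shrinks toward the normal's familiar 1.96.

```python
# Sketch: t critical values approach the normal critical value as df grows.
from scipy.stats import t, norm

for df in (5, 30, 1000):
    # two-tailed 95% critical value for this many degrees of freedom
    print(df, round(t.ppf(0.975, df), 3))

# the normal distribution's two-tailed 95% critical value, about 1.96
print("normal", round(norm.ppf(0.975), 3))
```

With 5 degrees of freedom the critical value is roughly 2.57, reflecting the fatter tails; by 1000 it is essentially indistinguishable from 1.96.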

In an intro stats class, you’d learn about two main uses of the t-distribution: the independent samples t-test and the correlated samples t-test. The independent samples t-test is used to analyze a single-factor, between-groups design. The correlated samples t-test is used for analyzing a single-factor, repeated measures (a.k.a. within subjects) design.

Independent samples t-test

The independent samples t-test tests the difference between group means when the two groups are unrelated to each other. This would be classified as a single-factor, between-groups design. The test uses a pooled variance that weights each group's variance by its degrees of freedom, and that pooling assumes the two population variances are equal.
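Here is a minimal sketch of the pooled-variance calculation, with made-up data (the group values are hypothetical), cross-checked against SciPy's implementation:

```python
import numpy as np
from scipy import stats

group_a = np.array([4.1, 5.2, 6.0, 5.5, 4.8])
group_b = np.array([5.9, 6.3, 7.1, 6.8, 6.0, 7.4])
n1, n2 = len(group_a), len(group_b)

# pooled variance: each group's variance weighted by its degrees of freedom
sp2 = ((n1 - 1) * group_a.var(ddof=1) + (n2 - 1) * group_b.var(ddof=1)) / (n1 + n2 - 2)
# standard error of the difference between the two means
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
t_stat = (group_a.mean() - group_b.mean()) / se
df = n1 + n2 - 2

# cross-check against SciPy's equal-variance t-test
t_scipy, p = stats.ttest_ind(group_a, group_b, equal_var=True)
```

The hand-rolled statistic and `stats.ttest_ind(..., equal_var=True)` agree exactly, which is a useful sanity check when you're learning the formula.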

If you cannot assume the variances are the same, you can use Welch's t-test. It defines the standard error of the mean difference as an unweighted sum of the two group variances (each divided by its own sample size) rather than a pooled variance, and it uses Satterthwaite's approximation to compute adjusted degrees of freedom for the test.
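A sketch of Welch's version, again with hypothetical data; note the second group has a visibly larger spread. SciPy exposes the same test through `equal_var=False`:

```python
import numpy as np
from scipy import stats

a = np.array([4.1, 5.2, 6.0, 5.5, 4.8])
b = np.array([5.9, 6.3, 7.1, 6.8, 6.0, 7.4, 8.9, 3.0])

# Welch's standard error: unweighted sum of per-group variance over n
v1, v2 = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
se = np.sqrt(v1 + v2)
t_stat = (a.mean() - b.mean()) / se

# Satterthwaite's approximation for the adjusted degrees of freedom
df = (v1 + v2) ** 2 / (v1**2 / (len(a) - 1) + v2**2 / (len(b) - 1))

# SciPy's Welch test: pass equal_var=False to ttest_ind
t_scipy, p = stats.ttest_ind(a, b, equal_var=False)
```

Satterthwaite's df is generally not an integer and is smaller than n1 + n2 − 2, which makes the test appropriately more conservative when the variances differ.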

How do you know if the variances are so different that you should use Satterthwaite's approximate degrees of freedom and the Welch t-test? One rule of thumb is that if the sample variance of one group is more than twice the other, you should not assume equal variances (Frankfort-Nachmias & Leon-Guerrero, 2009). There are also a variety of formal tests for checking whether the variances of two or more groups can reasonably be assumed equal (an F test, Levene's test, Cochran's C, Bartlett's test).
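For instance, Levene's test is available in SciPy; a sketch with hypothetical data where the second group is much more spread out:

```python
from scipy import stats

a = [4.1, 5.2, 6.0, 5.5, 4.8]
b = [2.0, 9.5, 7.1, 1.8, 6.0, 11.4]

# Levene's test: the null hypothesis is that the group variances are equal.
# A small p-value is evidence against equal variances, suggesting Welch's test.
stat, p = stats.levene(a, b)
```

One caveat: pre-testing variances and then choosing the t-test variant affects the overall error rate, so some analysts simply default to Welch's test.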

Correlated samples t-test

The correlated samples t-test tests the difference between means when two groups are related to each other, for example if you are analyzing pre/post-test data or data from matched pairs such as siblings or married couples. It’s actually easier to compute than the independent samples t-test, because all you do is calculate difference scores for each pair, compute an estimated standard error from the difference scores, and calculate the test statistic as the mean difference score divided by the standard error. The degrees of freedom for this test is the number of pairs minus one.


These t-tests assume that scores are normally distributed, have homogeneous variances, and are independent of each other (except, of course, for the correlation within pairs in the repeated measures design). The most important assumption is independence. If scores are correlated because of spatial, temporal, organizational, or other relationships between subjects, the t-test will not give you an accurate read on mean differences.


Frankfort-Nachmias, C., & Leon-Guerrero, A. (2009). Social Statistics for a Diverse Society. 5th Ed. Thousand Oaks, Calif: Pine Forge Press.

