The independent-samples t-test is used to determine if a difference exists between the means of two independent groups on a continuous dependent variable. More specifically, it will let you determine whether the difference between these two groups is statistically significant. This test is also known by a number of different names, including the independent t-test, independent-measures t-test, between-subjects t-test, unpaired t-test, and Student’s t-test.
For example, you could use the independent-samples t-test to determine whether (mean) salaries, measured in US dollars, differed between males and females (i.e., your dependent variable would be “salary” and your independent variable would be “gender”, which has two groups: “males” and “females”). You could also use an independent-samples t-test to determine whether (mean) reaction time, measured in milliseconds, differed in under 21-year-olds versus those 21 years old and over (i.e., your dependent variable would be “reaction time” and your independent variable would be “age group”, split into two groups: “under 21-year-olds” and “21 years old and over”).
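To make the comparison of two group means more concrete, the following is a minimal sketch of how the t-value and degrees of freedom of a pooled-variance (Student's) independent-samples t-test could be computed by hand. It uses only the Python standard library. The function name `pooled_t` and the reaction-time values are hypothetical and chosen purely for illustration; they are not taken from this guide or from SPSS Statistics.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # statistics.variance() is the sample variance (divides by n - 1)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two means
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff, df


# Hypothetical reaction times in milliseconds (illustrative values only)
under_21 = [310, 295, 322, 301, 287]
over_21 = [335, 342, 318, 329, 351]

t_value, df = pooled_t(under_21, over_21)
print(f"t({df}) = {t_value:.3f}")
```

The sketch stops at the t-value and degrees of freedom. Turning these into a p-value requires the t-distribution, which the standard library does not provide.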
In order to run an independent-samples t-test, there are six assumptions that need to be considered. The first three assumptions relate to your choice of study design and the measurements you chose to make, whilst the last three assumptions relate to the characteristics of the data that you actually collected. These assumptions are:
- Assumption #1: You have one dependent variable that is measured at the continuous level. Examples of continuous variables include revision time (measured in hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100), weight (measured in kg), and so forth.
- Assumption #2: You have one independent variable that consists of two categorical, independent groups (i.e., a dichotomous variable). Example independent variables that meet this criterion include gender (two groups: “males” or “females”), employment status (two groups: “employed” or “unemployed”), transport type (two groups: “bus” or “car”), smoker (two groups: “yes” or “no”), trial (two groups: “intervention” or “control”), and so forth. If you are unfamiliar with any of the above terms, you might wish to read our Types of Variables guide.
Note: The two groups of the independent variable are also referred to as “categories” or “levels”, but the term “levels” is usually reserved for groups that have an order (e.g., fitness level, with two levels: “low” and “high”).
- Assumption #3: You should have independence of observations, which means that there is no relationship between the observations in each group of the independent variable or between the groups themselves. Indeed, an important distinction is made in statistics when comparing values from either different individuals or from the same individuals. Independent groups (in an independent-samples t-test) are groups where there is no relationship between the participants in either of the groups. Most often, this occurs simply by having different participants in each group. For example, if you split a group of individuals into two groups based on their gender (i.e., a male group and a female group), no one in the female group can be in the male group and vice versa. As another example, you might randomly assign participants to either a control trial or an intervention trial. Again, no participant can be in both the control group and the intervention group. This will be true of any two independent groups you form (i.e., a participant cannot be a member of both groups). In actual fact, the ‘no relationship’ part extends a little further and requires that participants in both groups are considered unrelated, not just different people; for example, participants might be considered related if they are husband and wife, or twins. Furthermore, participants in Group A cannot influence any of the participants in Group B, and vice versa. If you are using the same participants in each group, or they are otherwise related, a paired-samples t-test is a more appropriate test. It is also fairly common to hear this type of study design, with two independent groups, referred to as “between-subjects” because you are concerned with the differences in the dependent variable between different subjects. 
An example of where related observations might be a problem is if all the participants in your study (or the participants within each group) were assessed together, such that a participant’s performance affects another participant’s performance (e.g., participants encourage each other to lose more weight in a ‘weight loss intervention’ when assessed as a group compared to being assessed individually; or athletic participants being asked to complete ‘100m sprint tests’ together rather than individually, with the added competition amongst participants resulting in faster times, etc.). Independence of observations is largely a study design issue rather than something you can test for, but it is an important assumption of the independent-samples t-test. If your study fails this assumption, you will need to use another statistical test instead of the independent-samples t-test (you can use our Statistical Test Selector to find the appropriate statistical test).
- Assumption #4: There should be no significant outliers in the two groups of your independent variable in terms of the dependent variable. For both groups of the independent variable, if there are any scores that are unusual for that group, in that their value is extremely small or large compared to the other scores, these scores are called outliers (e.g., 8 participants in a group scored between 60-75 out of 100 in a difficult maths test, but one participant scored 98 out of 100). Outliers can have a large negative effect on your results because they can exert a large influence (i.e., change) on the mean and standard deviation for that group, which can affect the statistical test results. Outliers are more important to consider when you have smaller sample sizes, as the effect of the outlier will be greater. Therefore, in this example, you need to investigate whether engagement has no outliers for each group of gender (i.e., you are testing whether the engagement score is outlier free for both the “Male” and “Female” groups).
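As an illustration of how unusual scores might be flagged within a group, here is a minimal sketch that applies the common boxplot rule: a score is flagged if it lies more than 1.5 interquartile ranges below the first quartile or above the third quartile. The choice of this rule, the function name `find_outliers` and the individual scores are assumptions made for illustration. The scores simply mirror the maths test example above (eight scores between 60 and 75, plus one score of 98). Quartile conventions differ between software packages, so the fences computed here may not match other tools exactly.

```python
import statistics


def find_outliers(scores, multiplier=1.5):
    """Return scores lying beyond multiplier * IQR from the quartiles."""
    # quantiles(n=4) returns the three cut points Q1, Q2 (median) and Q3
    q1, _, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1
    lower_fence = q1 - multiplier * iqr
    upper_fence = q3 + multiplier * iqr
    return [s for s in scores if s < lower_fence or s > upper_fence]


# Hypothetical maths test scores for one group (illustrative values only)
group_scores = [60, 62, 64, 66, 68, 70, 72, 75, 98]

print(find_outliers(group_scores))
```

You would run a check like this separately for each group of the independent variable, because a score is only unusual relative to the other scores in its own group.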
- Assumption #5: Your dependent variable should be approximately normally distributed for each group of the independent variable. The assumption of normality is necessary for statistical significance testing using an independent-samples t-test. However, the independent-samples t-test is considered “robust” to violations of normality. This means that some violation of this assumption can be tolerated and the test will still provide valid results. Therefore, you will often hear of this test only requiring approximately normal data. Furthermore, with larger sample sizes, the independent-samples t-test can still provide valid results even when the distribution is very non-normal, thanks to the Central Limit Theorem. Also, it should be noted that if the distributions of both groups are skewed in a similar manner (e.g., both moderately negatively skewed), this is less troublesome than the situation where the groups have differently-shaped distributions (e.g., the distribution of Group A is moderately ‘positively’ skewed, whilst the distribution of Group B is moderately ‘negatively’ skewed). Therefore, in this example, you need to investigate whether engagement is normally distributed.
Note: Technically, it is the residuals that need to be normally distributed. However, for an independent-samples t-test the distribution of the scores (observations) in each group will be the same as the distribution of the residuals in each group.
There are many different methods available to test this assumption. We show you one of the most common methods: the Shapiro-Wilk test for normality. Whilst it is most common to run only one type of normality test for a given analysis and to rely solely on that result, as you become more familiar with statistics you might start to evaluate normality based on the result of more than one method.
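As a rough complement to a formal normality test, the direction of skew in each group can be inspected numerically. The sketch below computes the moment-based sample skewness for each group using only the Python standard library. Please note that this is not the Shapiro-Wilk test; it is a simpler descriptive check swapped in for illustration. The function name `skewness` and the two data sets are hypothetical. A positive value indicates a positively skewed distribution, a negative value a negatively skewed one, and a value near zero a roughly symmetric one.

```python
import statistics


def skewness(values):
    """Moment-based sample skewness: third central moment / variance ** 1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments computed with divisor n (no small-sample correction)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical scores (illustrative values only)
group_a = [1, 1, 1, 2, 10]    # long tail to the right
group_b = [10, 10, 10, 9, 1]  # long tail to the left

skew_a = skewness(group_a)
skew_b = skewness(group_b)
print(f"Group A skewness: {skew_a:.2f}")
print(f"Group B skewness: {skew_b:.2f}")
```

Opposite signs, as in these made-up groups, correspond to the more troublesome situation of differently-shaped distributions described under Assumption #5.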
- Assumption #6: You have homogeneity of variances (i.e., the variance is equal in each group of your independent variable). The assumption of homogeneity of variances states that the population variance for each group of your independent variable is the same. If the sample size in each group is similar, violation of this assumption is not often too serious. However, if sample sizes are quite different, the independent-samples t-test is sensitive to the violation of this assumption. Either way, SPSS Statistics runs Levene’s test of equality of variances and reports two differently-calculated independent-samples t-tests, so you will get a valid result irrespective of whether you met or violated this assumption. The first is the standard test, which is calculated with pooled variances. The second, for when the assumption is violated, uses separate (i.e., non-pooled) variances and the Welch-Satterthwaite correction to the degrees of freedom.
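To give a sense of what a test of equality of variances measures, here is a minimal sketch of the mean-centred form of Levene's statistic for two groups, written with the Python standard library only. It is an assumption that this form matches the calculation carried out by SPSS Statistics, and the function name `levene_w` and the data are hypothetical. The statistic compares how far scores lie from their own group mean: it is zero when the two groups have the same average spread and grows as the spreads diverge.

```python
import statistics


def levene_w(group_a, group_b):
    """Return (W, df1, df2) for mean-centred Levene's test on two groups."""
    groups = [group_a, group_b]
    # Absolute deviation of every score from its own group mean
    z = [[abs(x - statistics.fmean(g)) for x in g] for g in groups]
    n_total = sum(len(g) for g in groups)
    z_means = [statistics.fmean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # One-way ANOVA sums of squares computed on the absolute deviations
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    df1, df2 = len(groups) - 1, n_total - len(groups)
    return (between / df1) / (within / df2), df1, df2


# Hypothetical scores (illustrative values only)
narrow = [1, 2, 3, 4, 5]
wide = [1, 5, 9, 13, 17]

w_stat, df1, df2 = levene_w(narrow, wide)
print(f"W({df1}, {df2}) = {w_stat:.3f}")
```

The sketch returns the statistic and its degrees of freedom only. Obtaining a p-value requires the F-distribution, which is outside the standard library.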
After running the independent-samples t-test procedure in the previous section, SPSS Statistics will have generated a number of tables that contain all the information you need to report the results of the independent-samples t-test.
- Homogeneity of variances was met: If your data has met the assumption of homogeneity of variances, you simply need to interpret the ‘standard’ independent-samples t-test output in SPSS Statistics. We can show you: (a) how to accurately interpret the SPSS Statistics output for the independent-samples t-test, including the mean difference, standard error of the mean difference, 95% confidence intervals, t-value, degrees of freedom and p-value; (b) how to determine whether there was a statistically significant mean difference in the two groups of your independent variable in terms of your dependent variable; (c) whether you can reject, or fail to reject, the null hypothesis; and (d) how you can bring all of this together into a single paragraph that explains your results.
- Homogeneity of variances was violated: If your data has violated the assumption of homogeneity of variances, you can still continue with your analysis. However, you will have to interpret the results from a modified t-test that SPSS Statistics produces in its output. This modified t-test, often referred to as the unequal variance t-test, separate variances t-test or Welch t-test after its creator (Welch, 1947), can accommodate unequal variances and still deliver a valid test result. We can show you: (a) how to accurately interpret the SPSS Statistics output for the separate variances (non-pooled variances) t-test with the Welch-Satterthwaite correction to the degrees of freedom, including the mean difference, standard error of the mean difference, 95% confidence intervals, t-value, degrees of freedom and p-value; (b) how to determine whether there was a statistically significant mean difference in the two groups of your independent variable in terms of your dependent variable; (c) whether you can reject, or fail to reject, the null hypothesis; and (d) how you can bring all of this together into a single paragraph that explains your results.
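The separate variances t-test described above can be sketched as follows, using only the Python standard library. The code computes the t-value without pooling the variances and applies the Welch-Satterthwaite correction to the degrees of freedom. The function name `welch_t` and the data are hypothetical and for illustration only; this is a sketch of the general formulas rather than a reproduction of the SPSS Statistics output.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return (t, df) for the separate-variances (Welch) t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Each group's squared standard error of the mean (sample variance / n)
    se2_a = statistics.variance(group_a) / n_a
    se2_b = statistics.variance(group_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical scores with clearly different spreads (illustrative values only)
narrow = [10, 11, 12, 13, 14]
wide = [2, 8, 12, 16, 22]

t_value, df = welch_t(narrow, wide)
print(f"t({df:.2f}) = {t_value:.3f}")
```

The corrected degrees of freedom are generally not a whole number. They equal n1 + n2 - 2 when the two groups have equal sizes and equal sample variances, and they fall below that value as the variances or group sizes become more unequal.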