5 LAB VI: Inference for numerical data (2 samples)
When we have finished this Lab, we should be able to:
5.1 Two-sample t-test (Student’s t-test)
The two-sample t-test (Student’s t-test) can be used when we have two independent (unrelated) groups (e.g., males vs. females, treatment vs. no treatment) and one quantitative variable of interest.
5.1.1 Opening the file
Open the dataset named depression from the File tab in the menu:
The dataset depression includes 76 patients and has two variables: the treatment variable and the HDRS variable (Figure 5.1). Double-click on the variable name HDRS and change the measure type from nominal to continuous.
5.1.2 Research question
In an experiment designed to test the effectiveness of paroxetine for treating bipolar depression, the participants were randomly assigned to two groups (intervention vs. placebo).
The researchers used the Hamilton Depression Rating Scale (HDRS) to measure the depression state of the participants and wanted to find out whether the HDRS score differs between the paroxetine group and the placebo group at the end of the experiment. The significance level α was set to 0.05.
Note: A score of 0–7 in HDRS is generally accepted to be within the normal range, while a score of 20 or higher indicates at least moderate severity.
5.1.3 Hypothesis Testing for the Student’s t-test
5.1.4 Assumptions
A. Explore the descriptive characteristics of the distribution for each group and check for normality
The distributions can be explored visually with appropriate plots. Additionally, summary statistics and significance tests to check for normality (e.g., Shapiro-Wilk test) and for equality of variances (e.g., Levene’s test) can be used.
On the Jamovi top menu, navigate to the Descriptives analysis, as shown below in Figure 5.2.
The Descriptives dialogue box opens. Drag the variable HDRS into the Variables box and split it by the treatment variable, as shown below (Figure 5.3):
We can now select the relevant descriptive statistics, such as Percentiles, Skewness, Kurtosis and the Shapiro-Wilk test, from the Statistics section:
Once we have selected our descriptive statistics, a table will appear in the output window on our right-hand side, as shown below:
The means are close to the medians (20.3 vs 21 and 21.5 vs 21). The skewness is approximately zero (symmetric distribution) and the (excess) kurtosis is close to zero (mesokurtic distribution), which is consistent with approximately normal distributions in both groups.
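The comparison of mean, median, skewness and excess kurtosis can also be sketched outside Jamovi. The minimal Python example below uses plain moment-based formulas; statistical software may apply small-sample corrections, so its values can differ slightly. The function name `shape_summary` and the sample scores are hypothetical and are not taken from the depression dataset.

```python
from statistics import mean, median


def shape_summary(values):
    """Return mean, median, moment-based skewness and excess kurtosis."""
    n = len(values)
    m = mean(values)
    # Central moments, dividing by n (no small-sample correction applied)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    return {
        "mean": m,
        "median": median(values),
        "skewness": m3 / m2 ** 1.5,          # 0 for a symmetric sample
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # 0 for a normal distribution
    }


# Hypothetical scores for illustration only (not the depression dataset)
scores = [14, 17, 19, 20, 21, 21, 22, 23, 25, 28]
print(shape_summary(scores))
```

A mean close to the median together with skewness and excess kurtosis near zero points the same way as the table in the Jamovi output.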
Additionally, the Shapiro-Wilk tests of normality suggest that the HDRS data in both groups, paroxetine and placebo, are normally distributed (p = 0.67 > 0.05 and p = 0.61 > 0.05, respectively).
Note:
- If p-value < 0.05, reject the null hypothesis, H0 (that the data come from a normal distribution).
- If p-value ≥ 0.05, do not reject the null hypothesis, H0.
Then we can check the Density box under Histograms in the Plots section, as shown below (Figure 5.7):
A graph is generated in the output window on our right-hand side, as shown below:
The above figure shows that the data are close to symmetry and the assumption of a normal distribution is reasonable.
B. Homogeneity of variance
The second assumption that should be satisfied is the homogeneity of variance. We observe in the summary table of Figure 5.5 that the two standard deviations (3.65 vs 3.41) are similar (see also Levene’s test for equality of variances in Figure 5.11 below).
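To make the idea behind Levene’s test concrete, the sketch below computes the test statistic W in its classic mean-centred form: a one-way ANOVA F statistic applied to the absolute deviations of each value from its group mean. This is an illustration under assumptions: statistical software may centre on the median instead, the function name `levene_statistic` is hypothetical, the example values are made up, and no p-value is computed (that requires the F distribution with k − 1 and N − k degrees of freedom).

```python
from statistics import mean


def levene_statistic(*groups):
    """Mean-centred Levene W statistic for k groups (statistic only, no p-value)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every value from its own group mean
    z = []
    for g in groups:
        g_mean = mean(g)
        z.append([abs(x - g_mean) for x in g])
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # Between-group and within-group sums of squares of the deviations
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical values for illustration only (not the depression dataset)
group_a = [18, 20, 21, 23, 25]
group_b = [17, 21, 22, 22, 26]
print(levene_statistic(group_a, group_b))
```

A small W (and a correspondingly large p-value in the software output) gives no evidence against equal variances.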
5.1.5 Run the Student’s t-test
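As a rough cross-check of what the Student’s t-test computes, the sketch below evaluates the pooled-variance t statistic and its degrees of freedom for two independent groups. It assumes equal variances, uses only the Python standard library, and returns no p-value (that requires the t distribution with n1 + n2 − 2 degrees of freedom). The function name `student_t` and the example values are hypothetical, not taken from the depression dataset.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic and degrees of freedom."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    # Standard error of the difference between the two means
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group_a) - mean(group_b)) / se
    return t, df


# Hypothetical values for illustration only (not the depression dataset)
group_a = [18, 20, 21, 23, 25]
group_b = [17, 21, 22, 22, 26]
t, df = student_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df}")
```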
5.2 Paired samples t-test
The paired samples design can effectively reduce the effect of non-treatment factors and improve the efficiency of the experiment. A paired samples t-test is used to estimate whether the means of two related measurements are significantly different from one another.
Open the dataset named weight from the File tab in the menu:
The dataset weight contains the birth and discharge weight of 25 newborns (Figure 5.15). Double-click on the names of the variables birth_weight and discharge_weight to change the measure type from nominal to continuous.
5.2.1 Research question
We might ask whether the mean difference between the weight at birth and the weight at discharge is equal to zero. If the differences between the pairs of measurements are normally distributed, a paired t-test is the most appropriate statistical test.
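The paired design reduces to a one-sample problem on the differences, which the minimal sketch below illustrates: it computes the per-pair differences, then the paired t statistic with n − 1 degrees of freedom. The direction of the difference (second measurement minus first), the function name `paired_t` and the example weights are assumptions made for illustration; the values are not taken from the weight dataset, and no p-value is computed.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Differences (second - first), paired t statistic and degrees of freedom."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # One difference per pair of related measurements
    d = [b - a for a, b in zip(first, second)]
    n = len(d)
    # Mean difference divided by its standard error
    t = mean(d) / (stdev(d) / sqrt(n))
    return d, t, n - 1


# Hypothetical weights for illustration only (not the weight dataset)
birth = [3100, 2950, 3400, 3250, 3010]
discharge = [3150, 2980, 3445, 3290, 3040]
d, t, df = paired_t(birth, discharge)
print(d, f"t = {t:.3f}, df = {df}")
```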
5.2.2 Hypothesis Testing for the paired samples t-test
5.2.3 Assumptions
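Explore the characteristics of the distribution of the differences, d
First, we have to calculate the differences, d, between the two measurements for each newborn.
The distribution of the differences, d, can then be explored with descriptive statistics and plots.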
On the Jamovi top menu, navigate to the Descriptives analysis, as shown below in Figure 5.17.
The Descriptives dialogue box opens. Drag the variable d into the Variables box, as shown below (Figure 5.18):
We can now select the relevant descriptive statistics, such as Percentiles, Skewness, Kurtosis and the Shapiro-Wilk test, from the Statistics section:
Once we have selected our descriptive statistics, a table will appear in the output window on our right-hand side, as shown below:
The mean is close to the median (39.6 vs 40). Moreover, both skewness and (excess) kurtosis are approximately zero, indicating a symmetric and mesokurtic distribution of the weight differences.
Then we can check the Density box under Histograms in the Plots section, as shown below (Figure 6.6):
A graph is generated in the output window on our right-hand side, as shown below:
The above figure shows that the data are close to symmetry and the assumption of a normal distribution is reasonable.
Additionally, the Shapiro-Wilk test of normality suggests that the data for the differences,