Calculate the arithmetic means (Mc, Mv) of the two groups by dividing the sum of values by the number of observations in each.
For example, if the metric of interest is average revenue per user, the total number of observations in the control group is the number of users enrolled in the control, and the sum of values is the summation of the revenues from each user in the group.
Calculate the difference (Delta) of the two means by subtracting the mean of the control group from the mean of the variant group.
Following the same notation, the difference in means is Delta = Mv - Mc.
For example, if Mv = 10.50 and Mc = 8.00, then Delta is 10.50 - 8.00 = 2.50.
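The two steps above can be sketched in a few lines of Python. This is only an illustration: the revenue lists are hypothetical values chosen to reproduce the example means, and the names `mean`, `control_revenue` and `variant_revenue` are not part of the original procedure.

```python
def mean(values):
    """Arithmetic mean: the sum of values divided by the number of observations."""
    return sum(values) / len(values)

# Hypothetical per-user revenues, made up for illustration only.
control_revenue = [8.0, 7.5, 8.5, 8.0]
variant_revenue = [10.0, 11.0, 10.5, 10.5]

mc = mean(control_revenue)   # mean of the control group (Mc)
mv = mean(variant_revenue)   # mean of the variant group (Mv)
delta = mv - mc              # difference in means (Delta)

print(mc, mv, delta)
```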
Calculate the pooled standard deviation (SDp) using the equation
SDp = SQRT( ((Nv - 1) * SDv^2 + (Nc - 1) * SDc^2) / (Nc + Nv - 2) )
For example, if the sample sizes Nc and Nv are 1000 and 1010, and the standard deviations were estimated to be 5.5 and 6 respectively, then
SDp = SQRT( ((1000 - 1) * 5.5^2 + (1010 - 1) * 6^2) / (1000 + 1010 - 2) )
SDp = SQRT( (999 * 30.25 + 1009 * 36) / 2008 )
SDp = SQRT( (30219.75 + 36324) / 2008 )
SDp = SQRT(33.14)
SDp = 5.756
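As a rough check of the arithmetic, the pooled standard deviation can be computed with a small Python function. The function name `pooled_sd` and its parameter names are assumptions made for this sketch. The unrounded result is slightly above the value shown in the worked example because the intermediate steps there are rounded.

```python
import math

def pooled_sd(n_v, sd_v, n_c, sd_c):
    """Pooled standard deviation of two groups, weighted by degrees of freedom."""
    numerator = (n_v - 1) * sd_v**2 + (n_c - 1) * sd_c**2
    return math.sqrt(numerator / (n_c + n_v - 2))

# Values from the example: Nc = 1000 with SD 5.5, Nv = 1010 with SD 6.
sd_p = pooled_sd(n_v=1010, sd_v=6.0, n_c=1000, sd_c=5.5)
print(round(sd_p, 3))
```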
Calculate the Z score (Z) corresponding to the confidence level as the inverse cumulative distribution function (quantile function) of the standard normal distribution, evaluated at the confidence level.
If the confidence level is expressed as a percentage, convert it to a proportion first by dividing it by 100. For example, a 95% confidence level would become 0.95. Then use a tool like Microsoft Excel, R, or the GIGA online z-score calculator to calculate the quantile function:
- Use the NORM.S.INV() function in Excel.
- Use the qnorm() function in R.
- Use the GIGA online z-score calculator to calculate Z from Probability.
To make sure your calculation is correct, you can check it using this reference: for a confidence level of 0.95, the Z score would be 1.644854.
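Python is not one of the tools listed above, but its standard library offers an equivalent quantile function, `statistics.NormalDist().inv_cdf()` (available since Python 3.8). A minimal sketch, with `confidence_level` as an illustrative variable name:

```python
from statistics import NormalDist

confidence_level = 95          # confidence level expressed as a percentage
p = confidence_level / 100     # convert to a proportion: 0.95

# Quantile function of the standard normal distribution (mean 0, SD 1).
z = NormalDist().inv_cdf(p)
print(round(z, 6))
```

The printed value can be compared against the reference Z score given above for a confidence level of 0.95.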
Calculate the standard error of the difference in means (SE) using the formula
SE = Z * SDp * SQRT( 1 / Nv + 1 / Nc )
As defined here, SE is the pooled standard deviation multiplied by the square root of the sum of the reciprocals of the number of observations in each group, which is then multiplied by the Z score obtained earlier. Because it already includes the Z score, this quantity is what is often called the margin of error. Continuing with the previous example, where SDp = 5.756, Z = 1.644854, Nv = 1010 and Nc = 1000:
SE = 1.644854 * 5.756 * SQRT( 1 / 1010 + 1 / 1000 )
SE = 1.644854 * 5.756 * 0.0446
SE = 1.644854 * 0.2567
SE = 0.42
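The same calculation can be reproduced in Python as a sanity check. The function name `z_scaled_standard_error` is hypothetical; the inputs are the example values used in this step.

```python
import math

def z_scaled_standard_error(z, sd_p, n_v, n_c):
    """Z * SDp * sqrt(1/Nv + 1/Nc), the quantity this guide calls SE."""
    return z * sd_p * math.sqrt(1 / n_v + 1 / n_c)

# Example values: Z = 1.644854, SDp = 5.756, group sizes 1010 and 1000.
se = z_scaled_standard_error(z=1.644854, sd_p=5.756, n_v=1010, n_c=1000)
print(round(se, 2))
```

The formula is symmetric in the two sample sizes, so swapping them does not change the result.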
Subtract the standard error (SE) from the absolute difference (Delta) to get the lower bound of the confidence interval. The upper bound is plus infinity. With the example values used so far, the lower bound is 2.50 - 0.42 = 2.08.
The result is a one-sided interval with a lower bound as calculated above and an upper bound of plus infinity. Values outside the interval can be rejected with confidence equal to or greater than the chosen confidence level. For example, if a 95% confidence interval spans from 0.01 to plus infinity, we can say that any difference less than 0.01 can be rejected with a confidence of 95% or greater.
To calculate the opposite one-sided interval, add the standard error (SE) to the absolute difference (Delta) instead. The result is the upper bound, and the lower bound is minus infinity.
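Both one-sided intervals can be expressed with a short helper. This is a sketch only: the function `one_sided_interval` and its `side` parameter are illustrative names, and the inputs are the example values Delta = 2.50 and SE = 0.42.

```python
import math

def one_sided_interval(delta, se, side="lower"):
    """Return (low, high) for a one-sided interval around delta.

    side="lower": bounded below at delta - se, unbounded above.
    side="upper": bounded above at delta + se, unbounded below.
    """
    if side == "lower":
        return (delta - se, math.inf)
    if side == "upper":
        return (-math.inf, delta + se)
    raise ValueError("side must be 'lower' or 'upper'")

lower_interval = one_sided_interval(2.50, 0.42, side="lower")
upper_interval = one_sided_interval(2.50, 0.42, side="upper")
print(lower_interval, upper_interval)
```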