## Calculating sample size for a 2 independent sample t-test

Scientists often plan for studies by calculating how many subjects or units need to be tested in order to find an effect. That is, they plan for a study using statistical power according to principles of hypothesis testing. Sample size calculations are usually required in ethics applications and grant proposals to justify the study.

We previously learned how to calculate sample size for a 2 independent sample t-test in R. If you do most of your work in Python, you could instead use the `statsmodels` package to perform the same calculation. `statsmodels` is a Python module that provides functionality for conducting many statistical tests and analyses. It has been tested against R and other statistical packages, and supports fitting models with R-style formulas and `pandas` dataframes, or with `numpy` arrays.

Calculating sample size for a 2 independent sample t-test in Python requires specifying similar parameters to performing the calculation in R, but there are some differences. Here’s how to do it in `statsmodels` (output is shown after the `>>>` prompt; see the `statsmodels` documentation for further details):

```
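from statsmodels.stats.power import tt_ind_solve_power

# Difference between the two means, and the standard deviation
mean_diff, sd_diff = 0.5, 0.5
# Standardised effect size: mean difference divided by standard deviation
std_effect_size = mean_diff / sd_diff
# Solve for the sample size of group 1 (equal to group 2, since ratio=1)
n = tt_ind_solve_power(effect_size=std_effect_size, alpha=0.05, power=0.8, ratio=1, alternative='two-sided')
print('Number in *each* group: {:.5f}'.format(n))
# >>> Number in *each* group: 16.71472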
```
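The value returned is fractional, but subjects come in whole numbers. A minimal sketch of the usual next step, rounding up and then checking the power actually achieved, could look like the following. It assumes the same inputs as above; the variable names `n_per_group` and `achieved_power` are only illustrative. Supplying `nobs1` and leaving `power` as `None` asks the same function to solve for power instead of sample size.

```python
import math
from statsmodels.stats.power import tt_ind_solve_power

# Same inputs as the example above: standardised effect size of 1
n = float(tt_ind_solve_power(effect_size=1.0, alpha=0.05, power=0.8,
                             ratio=1, alternative='two-sided'))

# Round up: rounding down would leave the study slightly underpowered
n_per_group = math.ceil(n)

# Supply nobs1 and leave power as None to solve for power instead
achieved_power = float(tt_ind_solve_power(effect_size=1.0, nobs1=n_per_group,
                                          alpha=0.05, power=None, ratio=1,
                                          alternative='two-sided'))

print('Subjects needed in *each* group: {}'.format(n_per_group))
print('Power achieved with that many subjects: {:.3f}'.format(achieved_power))
```

Because the sample size is rounded up, the achieved power ends up a little above the 0.8 that was requested.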

The `tt_ind_solve_power()` function requires the following parameters to calculate sample size:

- `effect_size`: The standardised effect size, i.e. the difference between the two means divided by the standard deviation; this value has to be positive. (This is different to R’s `delta` parameter, which requires the mean difference only.)
- `alpha`: Significance level or probability of Type I error (false positives), usually set at 0.05.
- `power`: Power of the test, or 1 minus the probability of Type II error (false negatives), usually set at 0.8.
- `ratio`: Ratio of sample size in sample 2 relative to sample 1, default set at 1. (This allows the function to be used for unevenly-sized samples.)
- `alternative`: Direction of the effect the test is powered to detect. With `'two-sided'`, the effect could be an increase or a reduction in outcome; `'larger'` and `'smaller'` power the test for an effect in one direction only.
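To illustrate the `ratio` and `alternative` parameters, here is a small sketch that reuses the effect size, `alpha` and `power` from the example above and changes one setting at a time. The helper dictionary `common` and the variable names are only illustrative. Note that the value returned is the size of sample 1; the size of sample 2 is that value multiplied by `ratio`.

```python
from statsmodels.stats.power import tt_ind_solve_power

# Settings shared by all three calculations
common = dict(effect_size=1.0, alpha=0.05, power=0.8)

# Baseline: equal group sizes, two-sided test
n_equal = float(tt_ind_solve_power(ratio=1, alternative='two-sided', **common))

# Unequal groups: sample 2 is twice as large as sample 1
n1_unequal = float(tt_ind_solve_power(ratio=2, alternative='two-sided', **common))
n2_unequal = 2 * n1_unequal

# One-sided test: only an increase in outcome counts as an effect
n_one_sided = float(tt_ind_solve_power(ratio=1, alternative='larger', **common))

print('Equal groups, two-sided: {:.2f} in each group'.format(n_equal))
print('Unequal groups (ratio=2): {:.2f} and {:.2f}'.format(n1_unequal, n2_unequal))
print('Equal groups, one-sided: {:.2f} in each group'.format(n_one_sided))
```

With unequal groups, sample 1 can be smaller than in the balanced design, but the total number of subjects across both groups goes up. A one-sided test needs fewer subjects than a two-sided one, at the cost of not being able to detect an effect in the opposite direction.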

In the code above, we specified the difference between two means and the standard deviation of the difference as 0.5 each, producing a standardised effect size of 1. This means we are calculating sample size (or powering the study) to detect quite a big effect! Performing the sample size calculation in Python obtains the same answer, to 4 decimal places, as the output from R.
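As a rough cross-check on this figure, the common textbook normal approximation for the per-group sample size is n ≈ 2(z₁₋α/₂ + z₁₋β)² / d², where d is the standardised effect size and 1 − β is the power. The sketch below uses only the Python standard library. It is not how `statsmodels` performs the calculation, and the function name `approx_n_per_group` is hypothetical.

```python
from statistics import NormalDist


def approx_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group for a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return 2 * (z_alpha + z_power) ** 2 / effect_size ** 2


n_approx = approx_n_per_group(1.0)
print('Approximate number in *each* group: {:.2f}'.format(n_approx))
```

The approximation gives roughly 15.7 per group, a little below the exact answer of 16.71. This is expected: the exact calculation uses the t distribution, which has heavier tails than the normal distribution when samples are small.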

It is easy to see that changes in the standardised mean difference we want to detect will change the sample size. For example, for a given mean difference of 0.5, sample size increases as standard deviation of the difference increases:

```
# Reuses mean_diff and tt_ind_solve_power from the first example
for sd in [0.4, 0.5, 0.6]:
    n = tt_ind_solve_power(effect_size=mean_diff/sd, alpha=0.05, power=0.8, ratio=1, alternative='two-sided')
    print('Number in *each* group when SD is {:<4.1f}: {:.2f}'.format(sd, n))
# >>> Number in *each* group when SD is 0.4 : 11.09
# >>> Number in *each* group when SD is 0.5 : 16.71
# >>> Number in *each* group when SD is 0.6 : 23.60
```
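Sample size responds in a similar way to the error rates we are willing to tolerate. The sketch below holds the standardised effect size at 1 and varies `alpha` and `power`; the particular values tried and the `results` dictionary are only illustrative choices.

```python
from statsmodels.stats.power import tt_ind_solve_power

results = {}
for alpha in [0.05, 0.01]:
    for power in [0.8, 0.9]:
        # Solve for the number in each group at this alpha and power
        n = float(tt_ind_solve_power(effect_size=1.0, alpha=alpha, power=power,
                                     ratio=1, alternative='two-sided'))
        results[(alpha, power)] = n
        print('alpha={:.2f}, power={:.1f}: {:.2f} in *each* group'.format(alpha, power, n))
```

A stricter significance level (smaller `alpha`) or a higher target `power` both increase the number of subjects needed.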

### Summary

We used Python’s `statsmodels` module to calculate sample size for a 2 independent sample t-test. Sample size is sensitive to the size and variability of the difference between groups, and to our tolerance for Type I and II errors.