R: Calculating sample size for a 2 independent sample t-test

In hypothesis testing, the number of samples needed to find an effect depends on the:

  • size of the effect
  • variability of the effect
  • tolerance to reporting a false positive (Type I error) and
  • probability of not reporting a false negative (1 – probability of Type II error)

The following code uses the stats package in R to calculate sample size for a t-test of the difference of a continuous variable between two independent groups (e.g. no. of cells that respond to a drug):

# The stats package ships with base R and is loaded automatically,
# so no install.packages() or library() call is needed

power.t.test(n=NULL, delta=0.5, sd=0.5, sig.level=0.05, power=0.8, 
             type="two.sample", alternative="two.sided")

The power.t.test() function requires exactly one of the parameters n, delta, sd, sig.level or power to be passed as NULL; that is the parameter it solves for. Here, we calculate the sample size required to detect a between-group difference in means (delta) of 0.5 when the within-group standard deviation (sd) is also 0.5, tolerating false positives 5% of the time (sig.level=0.05) and accepting a 20% chance of a false negative, i.e. 80% power (power=0.8). The calculation is set up for a difference between two independent groups (type="two.sample") and a two-sided test (alternative="two.sided"). That is, the test considers the hypothesis that group 1 values could be either greater or smaller than group 2 values, not only greater or only smaller. The following output is produced:

     Two-sample t test power calculation 

              n = 16.71477
          delta = 0.5
             sd = 0.5
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number in *each* group

In experimental research, scientists often don't know how big an effect might be or how variable it is, so sample size calculations are often based on the ratio of the effect size to its variability. It works out that when delta:sd = 1, the minimum number of samples needed in each of two independent groups is 17 (after rounding up). Scientists usually test a few more samples, up to 20, in case some produce poor-quality data. So if you have been in research long enough to wonder where the magic group size of 20 comes from, it comes from the delta:sd ratio.
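The rounding step can be done in code: power.t.test() returns the fractional n in its result object, and ceiling() rounds it up to the next whole sample. A minimal sketch using the same settings as above:

```r
# Solve for n with delta:sd = 1, then round up to a usable group size
res <- power.t.test(n = NULL, delta = 0.5, sd = 0.5, sig.level = 0.05,
                    power = 0.8, type = "two.sample",
                    alternative = "two.sided")
ceiling(res$n)  # 17 samples per group
```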

Try testing the R code with different specifications: set different parameters to NULL and see what values are calculated for different settings. If the sample size were known, we could use the code above to calculate power simply by specifying n and passing power as NULL. To calculate sample size for a paired t-test, specify type="paired" instead: this calculates the number of pairs needed to find an effect, where sd is the standard deviation of the within-pair differences.
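As a sketch of both variations, keeping the same effect and variability as before (these particular numbers are illustrative, not prescriptive):

```r
# Power achieved with 20 samples per group (n known, power = NULL)
power.t.test(n = 20, delta = 0.5, sd = 0.5, sig.level = 0.05,
             type = "two.sample", alternative = "two.sided")$power

# Number of pairs needed for a paired design,
# where sd is the SD of the within-pair differences
power.t.test(n = NULL, delta = 0.5, sd = 0.5, sig.level = 0.05,
             power = 0.8, type = "paired")$n
```

With 20 samples per group the achieved power comes out somewhat above the 80% we asked for, which is the cushion behind the "test a few extra" habit.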


We learned how to calculate sample size for a 2-sample t-test using the power.t.test() function in R. The same principles apply to sample size calculations for other types of outcomes (e.g. proportions, count data, etc.).
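For proportions, the stats package provides the analogous power.prop.test() function. A minimal sketch, assuming a hypothetical experiment where treatment is expected to raise the response rate from 50% to 75%:

```r
# Sample size per group to detect a change in response rate
# from 50% (p1) to 75% (p2) at 5% significance and 80% power
power.prop.test(n = NULL, p1 = 0.50, p2 = 0.75,
                sig.level = 0.05, power = 0.8,
                alternative = "two.sided")
```

As with power.t.test(), exactly one parameter is passed as NULL and solved for, and the reported n is the number per group.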
