Monthly Archives: August 2017

R: How to reshape data from wide to long format, and back again

Many studies take repeated observations on subjects. For example, clinical trials record outcomes from subjects before and after treatments, and laboratory studies might record physiological outcomes from the same subjects over time. In a dataframe, when observations from each subject are written on one row and repeated observations are stored as different column variables, we say the data are in

Read more
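
As a minimal sketch of the two layouts this excerpt describes, the Python snippet below converts wide-format records to long format and back using only the standard library. The column names (`subject`, `pre`, `post`), the values, and both helper functions are hypothetical and chosen for illustration; they are not taken from the post. In practice, a library such as pandas (`melt`, `pivot`) or R's reshaping functions would likely do this work.

```python
# Hypothetical wide-format data: one row per subject,
# one column per repeated observation.
wide = [
    {"subject": 1, "pre": 10.0, "post": 12.5},
    {"subject": 2, "pre": 9.0, "post": 11.0},
]


def wide_to_long(rows, id_col, value_cols):
    """Return one row per (subject, time point) pair."""
    return [
        {id_col: row[id_col], "time": col, "value": row[col]}
        for row in rows
        for col in value_cols
    ]


def long_to_wide(rows, id_col):
    """Collapse long-format rows back to one row per subject."""
    out = {}
    for row in rows:
        # Create the subject's row on first sight, then add the time column.
        subject_row = out.setdefault(row[id_col], {id_col: row[id_col]})
        subject_row[row["time"]] = row["value"]
    return list(out.values())


long_rows = wide_to_long(wide, "subject", ["pre", "post"])
round_trip = long_to_wide(long_rows, "subject")
```

Here `long_rows` holds one row per subject per time point, and `round_trip` should reproduce the original wide layout.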

The likelihood ratio test: relevance and application

Suppose you conducted a study to compare an outcome between two independent groups of people, but realised later that the groups were unexpectedly different at baseline. This difference might affect how you interpret the findings. For example, you measured muscle stiffness in people with stroke and in healthy people. At the end of the study, you realised that on

Read more

Add jitter to your figures using Python and R

Scientific figures are at their most informative when they include the individual data used to calculate summary statistics such as means and standard deviations. Why is showing data important? As previously pointed out here and here, figures with means, standard deviations, standard errors, etc. can be misleading and conceal the nature of the underlying data. As highlighted in our previous

Read more

Calculating sample size for a paired t-test

Suppose you are planning to conduct a repeated-measures study, where outcomes are measured from the same subject at more than one point in time and the average within-subject effect is calculated using a paired t-test or linear regression. How might you calculate how many subjects need to be tested in order to find an effect? Similar to calculating sample size

Read more
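
One common way to approach the question raised in this excerpt is a normal approximation to the paired t-test, sketched below in Python. This is not necessarily the method used in the post. The function name `paired_sample_size`, its parameters, and the example numbers are all assumptions made for illustration. The approximation takes n = ((z_(1-alpha/2) + z_power) * SD_diff / mean_diff)^2 for a two-sided test, where SD_diff is the standard deviation of the within-subject differences. It tends to give a slightly smaller n than an exact calculation based on the t distribution.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(mean_diff, sd_diff, alpha=0.05, power=0.80):
    """Approximate number of subjects for a two-sided paired t-test.

    Uses the normal approximation; mean_diff is the expected mean
    within-subject difference and sd_diff is the standard deviation
    of those differences (both are assumed inputs).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return ceil(((z_alpha + z_power) * sd_diff / mean_diff) ** 2)


# Hypothetical example: detect a mean difference of 5 units when the
# within-subject differences have a standard deviation of 10 units.
n_subjects = paired_sample_size(mean_diff=5.0, sd_diff=10.0)
```

Because the approximation ignores the extra uncertainty from estimating the standard deviation, it may be sensible to add a few subjects to the result or to confirm it with dedicated power-analysis software.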