Does it matter that data are Normally distributed?

Hypothesis testing vs. Estimation

Hypothesis tests require that populations are Normally distributed in order for the tests to be reliable. When samples are drawn from Normally distributed populations, the distributions of F or t statistics can be calculated for any given sample size, and the F or t statistic from a specific experiment can be compared against that distribution. This

Read more
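
As a rough illustration of the point in the excerpt above, the sketch below draws repeated samples from a Normal population in which the null hypothesis is true, computes a one-sample t statistic for each, and checks how often the statistic exceeds the two-sided 5% critical value for 9 degrees of freedom (about 2.262). The function names, sample size, number of simulations and seed are assumptions chosen for the example, not taken from the post.

```python
import math
import random
import statistics


def one_sample_t(sample, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample SD, n - 1 denominator
    return (mean - mu0) / (sd / math.sqrt(n))


def simulate_t(n, n_sims, seed=0):
    """t statistics from n_sims samples of size n drawn from Normal(0, 1)."""
    rng = random.Random(seed)
    return [
        one_sample_t([rng.gauss(0.0, 1.0) for _ in range(n)], 0.0)
        for _ in range(n_sims)
    ]


# Two-sided 5% critical value of the t distribution with 9 degrees of freedom.
T_CRIT_DF9 = 2.262

t_values = simulate_t(n=10, n_sims=20000)
false_positive_rate = sum(abs(t) > T_CRIT_DF9 for t in t_values) / len(t_values)
print(f"Proportion of |t| above the critical value: {false_positive_rate:.3f}")
```

When the population really is Normal, this proportion should sit close to 0.05. Swapping `rng.gauss` for a skewed generator such as `rng.expovariate` (with the matching true mean as `mu0`) is one way to see how far the rate can drift when the assumption fails.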

Reproducibility: The Transparency and Openness Promotion Guidelines

In a previous post, we profiled the EQUATOR network and reporting guidelines. These guidelines stress transparency in reporting study methods, and most are relevant to study designs in clinical research, such as randomised controlled trials, epidemiological studies and systematic reviews. In a different vein, the Transparency and Openness Promotion (TOP) Guidelines were developed to enhance transparency in reporting of study

Read more

Multiple linear regression in R

In a previous post, we applied simple linear regression to an interesting problem: how well does a measure of wine density account for alcohol content? This was considered simple linear regression because we had one outcome variable (alcohol content) and one predictor variable (wine density). We can extend this approach to have more than one predictor. Specifically, we can use

Read more
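
To make the idea of more than one predictor concrete, here is a minimal sketch of ordinary least squares with two predictors, solved through the normal equations. The post itself works in R (where a model of this kind is typically fitted with `lm(y ~ x1 + x2)`); Python is used here only for illustration. The data are synthetic and noise-free, and every name (`predictors`, `outcome`, `fit_multiple_regression`, `solve_linear_system`) is hypothetical rather than from the post or the wine dataset.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(predictors, outcome):
    """Return [intercept, b1, b2, ...] that minimise the squared error."""
    design = [[1.0] + list(row) for row in predictors]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data generated from y = 1 + 2*x1 - 3*x2 (no noise).
predictors = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
outcome = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in predictors]

coefficients = fit_multiple_regression(predictors, outcome)
print("intercept, b1, b2 =", [round(c, 6) for c in coefficients])
```

Because the data contain no noise, the fitted coefficients should match the generating values up to floating-point error. With real data, a numerical library would normally be preferred over hand-rolled elimination.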

Statistics you are interested in: simple linear regression – part 1

We introduced simple linear regression in a previous series and learned how to perform it in R (1, 2). What is the theory behind simple linear regression? How is it used to understand relationships between variables? What is another way to perform it in Python? The hsb2.csv dataset (available here) contains demographic and academic test score data from 200 students.

Read more
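
For a feel of the theory the excerpt alludes to, the sketch below fits a simple linear regression from the closed-form least-squares estimates: the slope is the sum of cross-deviations divided by the sum of squared deviations of the predictor, and the intercept makes the line pass through the two means. The numbers are made up for illustration and are not taken from hsb2.csv; the function name `fit_simple_regression` is likewise an assumption.

```python
import statistics


def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar  # line passes through (x_bar, y_bar)
    return intercept, slope


# Illustrative values only (hypothetical predictor and outcome scores).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

intercept, slope = fit_simple_regression(x, y)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

The slope is read as the expected change in the outcome for a one-unit increase in the predictor, which is how the fitted line is used to describe the relationship between two variables.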