Reflections on p-values and confidence intervals

When we run a statistical test, we almost always obtain a p-value. Many statistical tests will also generate a confidence interval. Unfortunately, many scientists report the p-value and ignore the confidence interval. As pointed out by Rothman (2016) and the American Statistical Association, relying on p-values forces a false dichotomy between results that are significant and those that are non-significant. This
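
To make the contrast concrete, here is a minimal sketch that reports a p-value and a confidence interval side by side for a difference between two group means. The simulated data, the function name `mean_difference_summary`, and the large-sample normal approximation are all assumptions made for illustration; they are not taken from the post, and a t-based interval would be more appropriate for small samples.

```python
import math
import random
from statistics import NormalDist, mean, variance


def mean_difference_summary(a, b, confidence=0.95):
    """Return (difference, p_value, (ci_low, ci_high)) for mean(a) - mean(b).

    Uses a large-sample normal approximation (an assumption of this sketch),
    so it is only reasonable when both groups have many observations.
    """
    diff = mean(a) - mean(b)
    # Standard error of the difference between two independent means
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = diff / se
    std_normal = NormalDist()
    # Two-sided p-value under the null hypothesis of no difference
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # Critical value for the requested confidence level
    z_crit = std_normal.inv_cdf(0.5 + confidence / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return diff, p_value, ci


# Simulated data (hypothetical values chosen only for this example)
random.seed(1)
control = [random.gauss(10.0, 2.0) for _ in range(200)]
treated = [random.gauss(10.5, 2.0) for _ in range(200)]

diff, p_value, (ci_low, ci_high) = mean_difference_summary(treated, control)
print(f"difference = {diff:.2f}, p = {p_value:.3f}, "
      f"95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

The p-value collapses the result into a single number that invites a significant/non-significant verdict, whereas the interval shows both the size of the estimated effect and how precisely it was estimated.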


Manipulating data with Pandas – Part 1

Pandas (the name derives from "panel data") is a Python library designed to manipulate data in tables and time series. Pandas uses many NumPy functions to manipulate data stored in dataframes, which are analogous to spreadsheets or tables. Let’s look at some basic Pandas functions to manipulate data, and plot the data using the Seaborn plotting package. To begin, import libraries and simulate
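
As a rough illustration of the kind of workflow described here, the sketch below imports the libraries, simulates a small dataframe, and applies a few basic Pandas operations. The column names (`subject`, `group`, `score`) and the simulated values are hypothetical and are not the data used in the post; the plotting lines are left commented out and assume Seaborn and Matplotlib are installed.

```python
import numpy as np
import pandas as pd

# Simulate a small dataset (hypothetical columns, not from the original post)
rng = np.random.default_rng(seed=0)
n = 20
df = pd.DataFrame({
    "subject": np.arange(1, n + 1),
    "group": np.repeat(["control", "treatment"], n // 2),
    "score": rng.normal(loc=50, scale=10, size=n),
})

# Inspect the dataframe
print(df.head())      # first five rows
print(df.describe())  # summary statistics for the numeric columns

# Keep only the rows where score exceeds 50 (boolean filtering)
high = df[df["score"] > 50]

# Mean score within each group
group_means = df.groupby("group")["score"].mean()
print(group_means)

# Optional plot; assumes Seaborn and Matplotlib are installed
# import seaborn as sns
# import matplotlib.pyplot as plt
# sns.boxplot(data=df, x="group", y="score")
# plt.show()
```

Filtering with a boolean mask and summarising with `groupby` cover a large share of everyday table manipulation, which is why they make a reasonable starting point.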


Fixing statistics

Statistics are not broken. The problem lies with how scientists use, interpret and report them. At the end of 2017, Nature asked five influential statisticians to recommend a key change that would improve science. Adjust for human cognition: The first change was proposed by Jeff Leek of the Johns Hopkins School of Public Health, and it is based on the
