Manipulating data with Pandas – Part 1

Pandas (the name derives from "panel data") is a Python library designed to manipulate data in tables and time series. Pandas uses many NumPy functions to manipulate data stored in dataframes, which are analogous to spreadsheets or tables. Let’s look at some basic Pandas functions to manipulate data, and plot the data using the Seaborn plotting package. To begin, import libraries and simulate
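As a rough sketch of the kind of starting point the excerpt describes, the snippet below imports the libraries, simulates a small dataset and summarises it by group. The column names (`subject`, `group`, `score`), the group labels and the simulated values are assumptions made for illustration; they are not taken from the original post.

```python
import numpy as np
import pandas as pd

# Simulate a small dataset. All names and values here are hypothetical.
rng = np.random.default_rng(seed=0)
n_per_group = 5

df = pd.DataFrame({
    "subject": np.arange(1, 2 * n_per_group + 1),
    "group": ["control"] * n_per_group + ["treatment"] * n_per_group,
    "score": rng.normal(loc=50, scale=10, size=2 * n_per_group),
})

# Inspect the first rows, much like glancing at the top of a spreadsheet
print(df.head())

# Summarise the simulated scores by group
group_means = df.groupby("group")["score"].mean()
print(group_means)
```

With Seaborn installed, a dataframe in this long format could then be plotted with, for example, `sns.boxplot(data=df, x="group", y="score")` after `import seaborn as sns`.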

Read more

Fixing statistics

Statistics are not broken. The problem lies with how scientists use, interpret and report them. At the end of 2017, Nature asked five influential statisticians to recommend a key change that would improve science. Adjust for human cognition: the first change was proposed by Jeff Leek of the Johns Hopkins School of Public Health, and it is based on the

Read more

Break, Continue and Pass statements using for loops in Python

In programming, for loops are used to automate repetitive tasks, such as analysing similar datasets from different subjects. Sometimes one dataset differs slightly from the others: the data can still be used, but they need to be analysed differently in the loop. For example, data might have been accidentally sampled at a higher rate, or a nested trial needs
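To illustrate how `break`, `continue` and `pass` behave inside a for loop, here is a minimal sketch that analyses several datasets, one of which was sampled at a higher rate. The subject labels, sampling rates and sample values are hypothetical, and downsampling by slicing is only one simple way to handle the mismatch.

```python
# Hypothetical datasets: subject label -> sampling rate and samples
datasets = {
    "sub01": {"rate_hz": 1000, "samples": [1.0, 2.0, 3.0, 4.0]},
    "sub02": {"rate_hz": 2000, "samples": [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]},
    "sub03": {"rate_hz": 1000, "samples": []},
    "sub04": {"rate_hz": None, "samples": [9.0]},
    "sub05": {"rate_hz": 1000, "samples": [5.0, 6.0]},
}

expected_rate_hz = 1000
results = {}

for subject, data in datasets.items():
    if data["rate_hz"] is None:
        # break: leave the loop entirely; no later subject is processed
        break

    if not data["samples"]:
        # continue: skip this subject and move on to the next one
        continue

    samples = data["samples"]
    if data["rate_hz"] == expected_rate_hz:
        # pass: do nothing; a placeholder where a statement is required
        pass
    else:
        # Sampled at a higher rate: keep every n-th sample to match
        step = data["rate_hz"] // expected_rate_hz
        samples = samples[::step]

    results[subject] = sum(samples) / len(samples)

print(results)  # {'sub01': 2.5, 'sub02': 2.5}
```

Here `sub03` is skipped by `continue` because it has no samples, and the loop stops at `sub04` because of `break`, so `sub05` is never analysed.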

Read more