## Manipulating data with Pandas – Part 3

Previously, we read in CSV data and summarised it. Here, we will learn how to locate data using labels and tabulate categorical data. The CSV file data_pandas.csv (available here) contains simulated data of age, height and score from 20 subjects. Read in and store the CSV data in the dataframe df, using the header names. Also create and assign subjects

## Reflections on p-values and confidence intervals

When we run a statistical test, we almost always obtain a p-value. Many statistical tests will also generate a confidence interval. Unfortunately, many scientists report the p-value and ignore the confidence interval. As pointed out by Rothman (2016) and the American Statistical Association, relying on p-values forces a false dichotomy between results that are significant and those that are non-significant. This

## Manipulating data with Pandas – Part 2

Previously, we used Pandas to read and write data to CSV files and to reshape data from wide to long format, and we used the Seaborn package to plot paired data. Here, we will summarise some data and write the results to a CSV file. The CSV file data_pandas.csv (available here) contains simulated data of age, height and score from 20 subjects. To

## Manipulating data with Pandas – Part 1

Pandas (the name comes from "panel data") is a Python library designed to manipulate data in tables and time series. Pandas uses many NumPy library functions to manipulate data stored in dataframes, which are analogous to spreadsheets or tables. Let’s look at some basic Pandas functions to manipulate data, and plot the data using the Seaborn plotting package. To begin, import libraries and simulate