Handling outlying or skewed data with robust regression
In a previous post, we applied simple linear regression to an interesting problem: how well does a measure of wine density account for alcohol content? This was simple linear regression because we had one outcome variable (alcohol content) and one predictor variable (wine density). We can extend this approach to have more than one predictor. Specifically, we can use
In statistics, we often want to fit a statistical model so that we can make broader generalizations. An important type of statistical model is linear regression, where we model the linear relationship between an outcome variable and a predictor variable. In this post we will learn how to perform a simple linear regression in R. See our previous post for
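As a rough illustration of what a simple linear regression computes, the sketch below fits a least-squares line with one predictor and one outcome. It is written in plain Python rather than the R used in the post, the function name `fit_simple_linear_regression` is hypothetical, and the density and alcohol values are invented for illustration; they are not taken from the wine data.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical values, invented for illustration only.
density = [0.990, 0.992, 0.994, 0.996, 0.998]
alcohol = [12.8, 12.1, 11.0, 10.2, 9.4]

intercept, slope = fit_simple_linear_regression(density, alcohol)
print(f"alcohol = {intercept:.1f} + ({slope:.1f}) * density")
```

With these made-up numbers the slope comes out negative, meaning that the fitted line predicts lower alcohol content for denser wine.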
In the first and second posts of this series, we learned how to graph our data using histograms and Q-Q plots to see whether they are normally distributed, and how to quantify the shape of the distribution by considering skew and kurtosis. In this, the final post in this series, we will learn to use the Shapiro-Wilk test to determine whether data
In our previous post, we learned how to use plots to inspect whether our data were normally distributed. It is always important to visualise our data. However, inspecting such plots is open to interpretation and, possibly, abuse. We will now learn how to analyse our data and generate numerical values that describe how our data are distributed. Quantifying the
Many statistical tests assume that the sampling distribution is normally distributed. This does not mean that the data we collected for our experiment are normally distributed, but rather that the distribution of mean values from many samples of the same size will be normally distributed. Unfortunately, we do not have access to the sampling distribution. However, based on the central
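The distinction between the collected data and the sampling distribution of the mean can be seen in a small simulation. The sketch below is in plain Python rather than R. It assumes an exponential distribution as a stand-in for skewed data, a sample size of 30, and 5000 repetitions; all of these choices, and the helper `skewness`, are illustrative and not taken from the post.

```python
import random
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)


random.seed(1)  # fixed seed so the sketch is reproducible

# Raw observations from a strongly right-skewed distribution (assumed for illustration).
raw = [random.expovariate(1.0) for _ in range(5000)]

# Means of many samples of the same size (n = 30) from that same distribution.
sample_means = [
    mean(random.expovariate(1.0) for _ in range(30))
    for _ in range(5000)
]

print(f"skew of raw data:     {skewness(raw):.2f}")
print(f"skew of sample means: {skewness(sample_means):.2f}")
```

The raw values are clearly skewed, while the sample means are much closer to symmetric, which is the sense in which the assumption concerns the distribution of means and not the data themselves.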