Why Type I errors are worse than Type II errors

Most introductory statistics courses include a section explaining Type I (false positive) and Type II (false negative) errors in hypothesis testing. If you have taken such a course, you will have learned that the tolerance for Type I error is set by the significance level (alpha = 0.05 is the usual default), while the tolerance for Type II error is set through statistical power, which depends on sample size among other things (beta = 0.20; power = 1 - beta). A study designed with these values of alpha and beta has 80% power to detect an effect of a given size if that effect exists, while tolerating a 5% probability of a false positive if no effect exists.
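To make those two numbers concrete, here is a minimal simulation sketch in Python. Everything specific in it is an assumption chosen for illustration rather than something from a real study: normally distributed data with a known standard deviation of 1, a two-sided z-test of H0: mean = 0, a sample size of 32, a true effect of 0.5 under the alternative, and the hypothetical helper name rejection_rate.

```python
import random
from statistics import NormalDist


def rejection_rate(true_mean, n=32, alpha=0.05, trials=5000, seed=0):
    """Fraction of simulated studies in which a two-sided z-test rejects H0: mean = 0.

    Illustrative assumptions: normal data, known standard deviation of 1.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # roughly 1.96 when alpha = 0.05
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean * n ** 0.5  # sample mean divided by its standard error, 1/sqrt(n)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# H0 is true (no effect): every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false (assumed true effect of 0.5): rejections are correct detections.
power = rejection_rate(true_mean=0.5)
type_ii_rate = 1 - power

print(f"Type I error rate: {type_i_rate:.3f}")
print(f"Power:             {power:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

Under these assumptions the Type I error rate should land near alpha (about 0.05) and the power near 0.80, so the Type II error rate is near 0.20. Changing n or the assumed effect size changes the power but leaves the Type I error rate near alpha.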
Sure, there are plenty of difficulties in using statistical significance tests to interpret findings. Later on, I hope to do a short series on common misconceptions about significance tests. But, setting all that aside, why are false positive errors called “Type I” errors, and false negative errors called “Type II” errors?
The statisticians who developed this framework for hypothesis testing, Jerzy Neyman and Egon Pearson, wrote that it is important to reduce the chance of rejecting a true hypothesis to as low a level as desired, and to devise a test that rejects the hypothesis when it is likely to be false. They described these errors (as quoted on the Wikipedia page) as:
(I) we reject H0 [i.e., the hypothesis to be tested] when it is true,
(II) we fail to reject H0 when some alternative hypothesis HA or H1 is true.
Neyman and Pearson named these Type I and Type II errors, emphasizing that, of the two, Type I errors are worse because they cause us to conclude that a finding exists when in fact it does not. That is, it is worse to conclude that we found an effect that does not exist, than miss an effect that does exist.
This distinction between the worse and the better error can be explained using an analogy from criminal justice. I recently read an article about the tragic death of a man who had been wrongfully jailed for a murder he did not commit. The article describes how he was sentenced to 20 years in jail. The combined efforts of a journalist, a politician and a team of high-profile, pro bono lawyers finally got him exonerated after he had served 12 years. The High Court quashed his conviction and declared it a miscarriage of justice. In the Australian criminal justice system, a person is presumed innocent unless proven guilty beyond reasonable doubt. This stance seems to take the view that the suffering of the innocent is worse than overlooking the guilty, a little like Agatha Christie's crime novel “Ordeal by Innocence”. By analogy, Neyman and Pearson would say that rejecting H0 when it is true is worse than failing to reject it when H1 is true.
See if this helps you remember the difference between Type I and Type II errors.
Reference
Neyman, J.; Pearson, E.S. (1967) [1933]. “The testing of statistical hypotheses in relation to probabilities a priori”. Joint Statistical Papers. Cambridge University Press. pp. 186–202.
Thank you for the concise and helpful article! I specifically found, “That is, it is worse to conclude that we found an effect that does not exist, than miss an effect that does exist.” to be very helpful.
Though, I think you may have mistakenly emphasized the wrong alternative in the following: “This stance seems to take the view that the suffering of the innocent is worse than punishment of the guilty.” Perhaps what you meant to say was: “This stance seems to take the view that the suffering of the innocent is worse than release of the guilty.”
Good pick up, thanks! I’ve changed it to something similar. Cheers
FINALLY I GOT TO UNDERSTAND THIS ONCE AND FOR ALL. FOREVER GRATEFUL 🙂 🙂