Tag Archives: statistics

Research concepts: From sample to population

In doing research, we apply the scientific method to answer questions. For example, does cigarette smoking cause lung cancer? What are the mechanisms of weakness after stroke? Why do cells become cancerous? What properties are specific to the poison of South American tree frogs? We want to understand all the individuals being studied (e.g. people, cells, frogs) but it


Research concepts: Overview

An important part of conducting sound science involves interpreting data correctly. Unfortunately, we don’t do that very well. For example, we are fooled by regression to the mean, we report findings when there are none, and we are overconfident about statistical power and significance. As scientists and lay persons, we want to be certain about research findings. But statistics only


Statistics you are interested in: simple linear regression – part 3

In the first and second posts of this series, we performed simple linear regression of a continuous outcome on a single continuous predictor, but we also learned it is possible to include binary or categorical predictors in such regression models. How is this done? The hsb2.csv dataset we have been using also contains the variable female where male participants


Statistics you are interested in: simple linear regression – part 2

In the previous post, we performed simple linear regression of science scores on reading scores from 200 students using ordinary least squares (OLS) estimation. This was done using Python’s Statsmodels package. What does the OLS output show and how should it be interpreted? Here is the figure of the individual subject data and the line of best fit, as well

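As a rough illustration of what OLS estimation computes for a single predictor, here is a minimal sketch using the closed-form least-squares formulas in plain Python. It is not the Statsmodels code from the post: the function name `ols_fit` is hypothetical, and the scores are made-up numbers, not values from hsb2.csv.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up illustrative scores (assumption), NOT data from hsb2.csv
reading = [40.0, 50.0, 60.0, 70.0]
science = [45.0, 50.0, 58.0, 63.0]

intercept, slope = ols_fit(reading, science)
print(f"science = {intercept:.2f} + {slope:.2f} * reading")
```

The slope is the expected change in the outcome for a one-unit increase in the predictor, and the intercept is the predicted outcome when the predictor is zero.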

Does it matter that data are Normally distributed?

Hypothesis testing vs. Estimation

Hypothesis tests require that populations are Normally distributed in order for the tests to be reliable. When samples are drawn from Normally distributed populations, the distributions of F or t statistics can be calculated for any given sample size, and the F or t statistic for a specific experiment can be obtained from the distribution. This


Statistics you are interested in: simple linear regression – part 1

We introduced simple linear regression in a previous series and learned how to perform it in R (1, 2). What is the theory behind simple linear regression? How is it used to understand relationships between variables? What is another way to perform it in Python? The hsb2.csv dataset (available here) contains demographic and academic test score data from 200 students.
