Add jitter to your figures using Python and R

Scientific figures are at their most informative when they include the individual data used to calculate summary statistics such as means and standard deviations. Why is showing the data important? As previously pointed out here and here, figures that show only means, standard deviations, standard errors, etc. can be misleading and conceal the nature of the underlying data. Scientists are therefore encouraged to plot the data used to compute the summary statistics in their figures (e.g., Drummond & Vowler, 2011).

Using jitter to help readers see your data

One problem with plotting individual data points is that they can overlap and make it difficult to see all of the data. This can easily be solved by adding some jitter to the individual points that have the same or similar values. Jitter is simply the addition of a small amount of horizontal (or vertical) variability to the data in order to ensure all data points are visible.
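As a minimal sketch of the idea, jitter can be added by hand with NumPy by shifting each point's plotting position by a small random amount. The function name add_jitter, the width parameter and the example values are assumptions made for illustration; they do not come from any library.

```python
import numpy as np

def add_jitter(positions, width=0.1, seed=None):
    """Return positions shifted by uniform random noise in [-width, +width)."""
    rng = np.random.default_rng(seed)
    positions = np.asarray(positions, dtype=float)
    return positions + rng.uniform(-width, width, size=positions.shape)

# Hypothetical example: five measurements from one group, all plotted at x = 1.
values = np.array([2.0, 2.0, 2.1, 3.5, 2.0])
x = add_jitter(np.ones_like(values), width=0.1, seed=0)
# Only the x-positions are perturbed; the measured values are left untouched.
```

Jittering only the axis that carries the group label (here x) keeps the plotted values accurate while still separating points that would otherwise sit on top of each other.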

The following figure has three subplots that all include individual data points. Because the first subplot does not include jitter, it is difficult to tell whether some data points overlap. The next two subplots show two ways to add jitter in Python with the Seaborn statistical plotting package. The code used to generate this figure is available here.
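A three-panel comparison of this kind could be sketched as follows. This is not the code behind the original figure: the data are made up, and the choice of stripplot with its jitter argument and swarmplot as the two jitter methods is an assumption about which Seaborn functions might be used.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(1)
# Made-up data: two groups of 30 values, rounded so that many points coincide.
df = pd.DataFrame({
    "group": np.repeat(["A", "B"], 30),
    "value": np.round(rng.normal(loc=[5.0] * 30 + [6.0] * 30, scale=1.0), 0),
})

fig, axes = plt.subplots(1, 3, figsize=(9, 3), sharey=True)

# Panel 1: no jitter, so identical values are drawn on top of each other.
sns.stripplot(data=df, x="group", y="value", jitter=False, ax=axes[0])
axes[0].set_title("No jitter")

# Panel 2: random horizontal jitter of up to 0.2 category units.
sns.stripplot(data=df, x="group", y="value", jitter=0.2, ax=axes[1])
axes[1].set_title("stripplot, jitter=0.2")

# Panel 3: swarmplot places points side by side so they do not overlap.
sns.swarmplot(data=df, x="group", y="value", size=4, ax=axes[2])
axes[2].set_title("swarmplot")

fig.tight_layout()
fig.savefig("jitter_example.png")
```

The stripplot jitter is random, so the exact layout changes between runs, whereas swarmplot positions points deterministically.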

 

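[Figure: three scatter subplots of individual data points; the first without jitter, the next two with jitter added using Seaborn]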

 

Other examples of jitter

I have written a small Python module to generate plots for paired data and their difference. The module contains a function called _jitter() that adds jitter to the data to be plotted (I wrote this before I knew about Seaborn!).

Creating pretty, informative plots is one of the hallmarks of ggplot2, a plotting system for the R statistical programming language. Jitter can easily be added to plotted data to make nice plots like this one and this one.

Summary

Try adding individual data points and jitter to your next figures; your readers will be grateful. And the good news is that the people behind Python’s Seaborn and R’s ggplot2 have done the hard work for us.

References

Drummond GB, Vowler SL (2011). Show the data, don’t conceal them. J Physiol 589:1861-3.

 
