Python virtual environments for scientists with conda part 4

In our previous post we learned how to verify what Python virtual environments were installed on our machine and what Python packages they contained. We also learned how to delete unwanted environments.

In this post we are going to learn how to share our virtual environment with others. This is incredibly useful in this day and age of research reproducibility. By creating a file that details our virtual environment – the version of Python, the packages, the versions of these packages – others will be able to recreate the virtual environment on their machine. With an environment file, they will be able to run our code on our data, using the same Python set-up that we used to generate the results from our publication.

Environment files

As mentioned in our first post, venv is the module in Python's standard library for creating and managing virtual environments. Environments built with venv are usually shared through a simple text file that lists the packages installed with pip.

However, because of its ease of use and popularity amongst scientists, this series of posts has focused on conda to manage virtual environments. conda uses the YAML file format for its environment files. YAML, which stands for “YAML Ain’t Markup Language”, is a human-readable data-serialization language that is commonly used for configuration files.

Generating an environment file

Once we have our Python virtual environment set up with the correct version of Python and the required packages, we can create our environment file. After activating the virtual environment, we run the following commands in a terminal window (Mac, Linux) or the Anaconda prompt (Windows).

(base) /home/martin$ conda activate sci_sound
(sci_sound) /home/martin$ conda env export > environment.yml

Simple as that! We now have an environment.yml file that we can include in the top level of our project repository.

For those of you that are curious, the environment.yml file that we just created looks like this:

name: sci_sound
channels:
  - conda-forge
  - defaults
dependencies:
  - bzip2=1.0.6=h14c3975_1002
  - ca-certificates=2019.3.9=hecc5488_0
  - certifi=2019.3.9=py37_0
  - libblas=3.8.0=8_openblas
  - libcblas=3.8.0=8_openblas
  - libffi=3.2.1=he1b5a44_1006
  - libgcc-ng=8.2.0=hdf63c60_1
  - libgfortran-ng=7.3.0=hdf63c60_0
  - liblapack=3.8.0=8_openblas
  - libstdcxx-ng=8.2.0=hdf63c60_1
  - ncurses=6.1=hf484d3e_1002
  - numpy=1.16.3=py37he5ce36f_0
  - openblas=0.3.6=h6e990d7_1
  - openssl=1.1.1b=h14c3975_1
  - pip=19.1=py37_0
  - python=3.7.3=h5b0a415_0
  - readline=7.0=hf8c457e_1001
  - setuptools=41.0.1=py37_0
  - sqlite=3.26.0=h67949de_1001
  - tk=8.6.9=h84994c4_1001
  - wheel=0.33.1=py37_0
  - xz=5.2.4=h14c3975_1001
  - zlib=1.2.11=h14c3975_1004
  - pip:
    - pygame==1.9.6
prefix: /home/martin/anaconda3/envs/sci_sound
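Each conda entry under dependencies has the form name=version=build, while packages installed with pip are listed as name==version. Build strings are often tied to one operating system, and the prefix line records a path on the original machine, so a file exported on Linux may not resolve cleanly elsewhere. The sketch below shows one way to drop the build strings and the prefix line using only the standard library. The function name strip_builds is hypothetical, and the code assumes the simple line-by-line layout that conda env export writes; it is not a general YAML parser.

```python
def strip_builds(env_text: str) -> str:
    """Return env_text without build strings and without the prefix line."""
    kept = []
    for line in env_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("prefix:"):
            continue  # path on the exporting machine, not useful to others
        # conda entries look like "- name=version=build"; pip entries use "=="
        is_conda_entry = (
            stripped.startswith("- ")
            and "==" not in stripped
            and stripped.count("=") == 2
        )
        if is_conda_entry:
            line = line.rsplit("=", 1)[0]  # keep "name=version" only
        kept.append(line)
    return "\n".join(kept) + "\n"


# A shortened version of the file shown above, used only for illustration.
example = """name: sci_sound
dependencies:
  - numpy=1.16.3=py37he5ce36f_0
  - python=3.7.3=h5b0a415_0
  - pip:
    - pygame==1.9.6
prefix: /home/martin/anaconda3/envs/sci_sound
"""

print(strip_builds(example))
```

conda can also leave out the build strings at export time with conda env export --no-builds, which is usually simpler than editing the file afterwards.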

Creating a virtual environment from a YAML environment file

Anyone who has the environment.yml file can recreate the virtual environment with a single command. The new environment takes the name recorded in the file (here, sci_sound).

(base) /home/martin$ conda env create -f environment.yml
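When the environment has to be recreated from a script, for example at the start of an analysis pipeline, the same command can be assembled in Python. This is a minimal sketch rather than a recommended tool: the helper names build_create_command and create_env are hypothetical, and it assumes conda is available on the PATH. The optional -n flag gives the new environment a different name from the one recorded in the file.

```python
import shutil
import subprocess
from typing import List, Optional


def build_create_command(env_file: str, name: Optional[str] = None) -> List[str]:
    """Build the 'conda env create' command as a list of arguments."""
    cmd = ["conda", "env", "create", "-f", env_file]
    if name is not None:
        cmd += ["-n", name]  # override the name stored in the file
    return cmd


def create_env(env_file: str, name: Optional[str] = None) -> None:
    """Run the command, failing early if conda cannot be found."""
    conda_path = shutil.which("conda")
    if conda_path is None:
        raise RuntimeError("conda was not found on the PATH")
    cmd = build_create_command(env_file, name)
    # Use the resolved path so the same call also works where conda is a script.
    subprocess.run([conda_path] + cmd[1:], check=True)


print(" ".join(build_create_command("environment.yml")))
# → conda env create -f environment.yml
```

Calling create_env("environment.yml") then has the same effect as typing the command above in a terminal.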

Review of conda commands

Functionality               Command
new virtual environment     conda create --name <env_name> python=<version>
                            conda create --name sci_sound python=3.7 numpy pandas
add package                 conda install <package>
                            conda install numpy pandas
add package from channel    conda install -c <channel> <package>
                            conda install -c cogsci pygame
add package from PyPI       pip install <package>
                            pip install pygame
activate env                conda activate <env_name>
deactivate env              conda deactivate
list environments           conda info --envs
list env packages           conda list --name <env_name>
remove environment          conda remove --name <env_name> --all
create environment file     conda env export > environment.yml
create env from file        conda env create -f environment.yml

Summary

In this series we learned how to create new virtual environments and add packages, add packages available from the wider Anaconda and Python community via channels and pip, list our available environments and their packages and remove environments. In this latest post we learned how to create environment files, which allow others to create the same virtual environment on their own computer.

It is now increasingly encouraged (or required) to include the data and code when publishing a scientific paper. Including a conda environment file makes the analysis pipeline easier to reproduce, because it allows others to create the same Python set-up on their own machines.
