Python virtual environments for scientists with conda


I have been coding with Python for several years and I only recently took the time to learn about Python environments. Now that I understand them, I can see why they are so useful, especially in this day and age of large team collaborations, research reproducibility, and sharing data and code when research results are published.

The purpose of this post is to give you some background on Python virtual environments and how to incorporate them into your projects.

What is a virtual environment?

When coding with a large piece of commercial software such as Matlab, you usually deal with a single copy of the purchased software plus any additional toolboxes you or your institute have purchased. Because they are bundled together, Matlab and its various toolboxes work seamlessly with one another.

Python is an open-source programming language, so things are a little different.

The core Python program continues to evolve (version 3.7 at the time of writing), but so do the various packages written and contributed by the Python community. Some packages have been developed and tested with certain versions of Python. Similarly, some packages depend on specific versions of other packages.

These inter-dependencies can become complicated. Even worse, two of our projects may require different versions of the same package.

Python virtual environments to the rescue!

Python virtual environments are isolated installations of Python that you can activate and deactivate. You can have dozens of environments on your computer, often one for every major project you are working on, and if needed, each can use different versions of Python. Moreover, these virtual environments can use different packages and different versions of these packages.

In this way, each virtual environment is a self-contained programming environment that contains the required packages, and the correct versions of these packages.

venv vs conda

Python ships with venv, a tool to create virtual environments. While venv is simple to use, the majority of (data) scientists who use Python tend to use conda to create their virtual environments.
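
For comparison with what follows, here is a minimal sketch of creating an environment with the standard-library venv module from Python itself. The helper name make_venv and the folder name sci_sound_venv are made up for this illustration, and the environment is created in a temporary folder so nothing is left behind.

```python
import tempfile
import venv
from pathlib import Path


def make_venv(env_dir):
    """Create a bare virtual environment in env_dir and return its config file path."""
    # with_pip=False keeps the example fast; pass True to also bootstrap pip.
    venv.create(env_dir, with_pip=False)
    # venv writes a pyvenv.cfg file at the top of every environment it creates.
    return Path(env_dir) / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = make_venv(Path(tmp) / "sci_sound_venv")  # hypothetical folder name
        print(cfg.read_text())
```

In day-to-day use the same thing is usually done from a terminal with python -m venv followed by a folder name.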

conda is part of the Anaconda Python distribution. Anaconda provides a simple way for its users to quickly download 1,500+ Python/R data science packages, as well as manage Python libraries, dependencies, and environments (via conda).

Because of its ease of use and popularity amongst scientists, the rest of this post will focus on conda.

Anaconda. If you are new to Python, I highly recommend using Anaconda. The base download includes Python and over 200 key data science Python packages. It also includes a large library of Python packages that can be installed with conda, a simple coding environment (IDLE), and a more advanced coding environment that somewhat resembles the Matlab interface (Spyder). Also, the folks at Anaconda recently announced a collaboration with the JetBrains team and their Python integrated development environment (IDE), PyCharm. I highly recommend PyCharm to new users; it makes navigating your code and learning about Python and proper Python style very accessible.

To start off, you likely want the full version of Anaconda. However, after you gain more experience, you may prefer to use Miniconda. It is a slimmed-down version of Anaconda that includes the core Python program and the conda package manager. Once Miniconda is installed, you can use the conda command to install any other packages, create environments, etc. This allows you to install only the packages you need.

Creating our first virtual environment with conda

While it is possible in Windows to use the Anaconda Navigator to create new virtual environments, it is more common to use the Anaconda Prompt. On Mac and Linux, conda can be used from a terminal.

(base) /home/martin $ conda create --name sci_sound
Collecting package metadata: done
Solving environment: done

## Package Plan ##

  environment location: /home/martin/anaconda3/envs/sci_sound

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate sci_sound
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Notice how the terminal prompt includes (base). This is something conda does to inform us we are currently in our base Python environment.

The above command created a new virtual environment called sci_sound. Because we did not specify anything other than the name of the environment, conda created the environment without actually including Python or any packages.
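
One way to confirm that the environment now exists is to ask conda for its list of environments. The sketch below does this from Python and is only an illustration: the function names env_names and list_conda_envs are invented here, and it assumes that conda env list --json prints a JSON object with an "envs" list of folder paths.

```python
import json
import shutil
import subprocess
from pathlib import Path


def env_names(conda_json):
    """Return the folder name of each environment in `conda env list --json` output."""
    # Assumption: the JSON object has an "envs" key holding a list of paths.
    paths = json.loads(conda_json).get("envs", [])
    # The base environment shows up under its install folder name (e.g. anaconda3).
    return [Path(p).name for p in paths]


def list_conda_envs():
    """Ask conda for its environments; return None if conda is not on the PATH."""
    conda = shutil.which("conda")
    if conda is None:
        return None
    result = subprocess.run(
        [conda, "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return env_names(result.stdout)
```

Calling list_conda_envs() after the command above should include sci_sound in the returned list. From a terminal, conda env list gives the same information in a human-readable table.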

Creating our first useful virtual environment with conda

A slightly more useful version of the above command is to specify the version of Python we want to use in our virtual environment:

(base) /home/martin $ conda create --name sci_sound python=3.7
Collecting package metadata: done
Solving environment: done

## Package Plan ##

  environment location: /home/martin/anaconda3/envs/sci_sound

  added / updated specs:
    - python=3.7


The following NEW packages will be INSTALLED:

  bzip2              conda-forge/linux-64::bzip2-1.0.6-h14c3975_1002
  ca-certificates    conda-forge/linux-64::ca-certificates-2019.3.9-hecc5488_0
  certifi            conda-forge/linux-64::certifi-2019.3.9-py37_0
  libffi             conda-forge/linux-64::libffi-3.2.1-he1b5a44_1006
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-8.2.0-hdf63c60_1
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-8.2.0-hdf63c60_1
  ncurses            conda-forge/linux-64::ncurses-6.1-hf484d3e_1002
  openssl            conda-forge/linux-64::openssl-1.1.1b-h14c3975_1
  pip                conda-forge/linux-64::pip-19.1-py37_0
  python             conda-forge/linux-64::python-3.7.3-h5b0a415_0
  readline           conda-forge/linux-64::readline-7.0-hf8c457e_1001
  setuptools         conda-forge/linux-64::setuptools-41.0.1-py37_0
  sqlite             conda-forge/linux-64::sqlite-3.26.0-h67949de_1001
  tk                 conda-forge/linux-64::tk-8.6.9-h84994c4_1001
  wheel              conda-forge/linux-64::wheel-0.33.1-py37_0
  xz                 conda-forge/linux-64::xz-5.2.4-h14c3975_1001
  zlib               conda-forge/linux-64::zlib-1.2.11-h14c3975_1004


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate sci_sound
#
# To deactivate an active environment, use
#
#     $ conda deactivate

From the output above, we can see that even the most basic Python installation requires several packages to be installed.

Now, let’s activate our new Python environment by following the instructions given to us by conda after it created our new environment. Specifically, we will use conda activate sci_sound to activate our environment and conda deactivate to deactivate it (and return to our base environment).

(base) /home/martin $ conda activate sci_sound
(sci_sound) /home/martin $

Notice how the terminal prompt indicates our sci_sound environment is now active.
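
The prompt is not the only way to check. As a small sketch, the Python snippet below reports which interpreter is running and which conda environment is active. The function name describe_environment is made up for this example, and it relies on the CONDA_DEFAULT_ENV variable that conda sets when an environment is activated (it will be None outside conda).

```python
import os
import sys


def describe_environment():
    """Summarise which Python installation and conda environment are in use."""
    return {
        # Full path of the running Python interpreter.
        "executable": sys.executable,
        # Root folder of the active Python installation or environment.
        "prefix": sys.prefix,
        # Name of the active conda environment, if conda has set one.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run inside the activated sci_sound environment, the executable and prefix should point into the envs/sci_sound folder rather than the base installation.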

Including Python packages in our conda environments

When we create a new virtual environment, we may already have a good idea of what packages we will need. We can pass the required package names to conda, and it will take care of the rest.

Let’s run an example where we want to create a Python 3.7 environment that includes numpy:

(base) /home/martin $ conda create --name sci_sound python=3.7 numpy
Collecting package metadata: done
Solving environment: done

## Package Plan ##

  environment location: /home/martin/anaconda3/envs/sci_sound

  added / updated specs:
    - numpy
    - python=3.7

[...]

Alternatively, we can create our virtual environment and then add packages:

(base) /home/martin $ conda create --name sci_sound python=3.7
[...]
(base) /home/martin $ conda install --name sci_sound numpy
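
After installing, it can be reassuring to check from Python that the package is actually available in the active environment. The following is a rough sketch, not an official conda tool: package_version is a hypothetical helper, and it relies on the convention (followed by numpy, but not by every package) of exposing a __version__ attribute.

```python
import importlib


def package_version(name):
    """Return the package's __version__, or None if it is missing or has none."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        # The package is not installed in the environment running this code.
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    print("numpy:", package_version("numpy"))
```

If this prints None after activating sci_sound, the package was most likely installed into a different environment than the one currently active.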

Creating custom virtual environments with conda

There are times where we need to work with a specific version of Python as well as specific versions of packages. This can easily be achieved by specifying the version number:

(base) /home/martin $ conda create --name sci_sound python=3.4 scipy=0.15.0 astroid babel

The above command creates a new environment for Python 3.4 with a specific version of scipy as well as the most recent compatible versions of the astroid and babel packages. Because different packages can depend on specific versions of other packages, it is recommended that we install all our packages in one call rather than one at a time.
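
To make the name=version pattern explicit, here is a small Python sketch that assembles a single conda create call from a dictionary of requirements. The helper conda_create_command is invented for this post; it only builds the argument list and does not run conda.

```python
def conda_create_command(env_name, specs):
    """Build the argument list for one `conda create` call.

    specs maps a package name to a version string, or to None when any
    version will do and conda should pick one.
    """
    args = ["conda", "create", "--name", env_name]
    for package, version in specs.items():
        # Pinned packages become "name=version"; unpinned ones are just "name".
        args.append(package if version is None else f"{package}={version}")
    return args


if __name__ == "__main__":
    requirements = {"python": "3.4", "scipy": "0.15.0", "astroid": None, "babel": None}
    print(" ".join(conda_create_command("sci_sound", requirements)))
```

Keeping every requirement in one dictionary mirrors the advice above: conda sees all the constraints at once and can pick versions that work together.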

Summary

In this post we learned about Python virtual environments and how to use conda to create them. This is a large topic that is worth delving deeper into, so in our next post we will learn more about installing Python packages that are not directly available through conda.

 

 
