Python virtual environments for scientists with conda

I have been coding with Python for several years and I only recently took the time to learn about Python environments. Now that I understand them, I can see why they are so useful, especially in this day and age of large team collaborations, research reproducibility, and sharing data and code when research results are published.
The purpose of this post is to give you some background on Python virtual environments and how to incorporate them into your projects.
What is a virtual environment?
When coding with a large piece of commercial software such as Matlab, you usually deal with a single copy of the purchased software plus any additional toolboxes you or your institute have purchased. Because they are bundled together, Matlab and its various toolboxes work seamlessly with one another.
Python is an open-source programming language, so things are a little different.
The core Python program continues to evolve (version 3.7 at the time of writing), but so do the various packages written and contributed by the Python community. Some packages have been developed and tested with certain versions of Python. Similarly, some packages depend on specific versions of other packages.
These inter-dependencies can become complicated. Even worse, two of our projects may require different versions of the same package.
Python virtual environments to the rescue!
Python virtual environments are isolated installations of Python that you can activate and deactivate. You can have dozens of environments on your computer, often one for every major project you are working on, and if needed, each can use different versions of Python. Moreover, these virtual environments can use different packages and different versions of these packages.
In this way, each virtual environment is a self-contained programming environment that contains the required packages at the correct versions.
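To see which Python installation a script is actually running in, we can ask the interpreter itself. The short sketch below uses only the standard library; the function name describe_interpreter is my own choice for illustration.

```python
import sys


def describe_interpreter():
    """Return basic facts about the Python installation running this code."""
    return {
        # Full path to the Python executable currently in use
        "executable": sys.executable,
        # Root folder of the active installation or environment
        "prefix": sys.prefix,
        # Major and minor version as a string, for example "3.7"
        "version": "{}.{}".format(sys.version_info.major, sys.version_info.minor),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(key, "->", value)
```

Running this script from two different environments should print two different prefix folders, which is a quick way to confirm that the environments really are separate installations.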
venv vs conda
Python ships with venv, a tool to create virtual environments. While venv is simple to use, the majority of (data) scientists who use Python tend to use conda to create their virtual environments.
conda is part of the Anaconda Python distribution. Anaconda provides a simple way for its users to quickly download 1,500+ Python/R data science packages, as well as manage Python libraries, dependencies, and environments (via conda).
Because of its ease of use and popularity amongst scientists, the rest of this post will focus on conda.
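For comparison, here is a minimal sketch of what creating an environment with venv looks like when driven from Python's standard library rather than from conda. The helper name make_venv and the folder name demo_env are placeholders I picked for this example, and the environment is created in a temporary folder so nothing is left behind.

```python
import tempfile
import venv
from pathlib import Path


def make_venv(parent_dir, name="demo_env"):
    """Create a bare virtual environment called `name` inside `parent_dir`."""
    env_dir = Path(parent_dir) / name
    # with_pip=False keeps the example quick; use with_pip=True to also install pip
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = make_venv(tmp)
        # Every venv environment contains a small pyvenv.cfg configuration file
        print("Created:", env_dir)
        print("Has pyvenv.cfg:", (env_dir / "pyvenv.cfg").is_file())
```

Note that venv always reuses the Python version that created it, whereas conda can install a different Python version for each environment.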
Anaconda. If you are new to Python, I highly recommend using Anaconda. The base download includes Python and over 200 key data science Python packages. It also includes a large library of Python packages that can be installed with conda, a simple coding environment (IDLE), and a more advanced coding environment that somewhat resembles the Matlab interface (Spyder). Also, the folks at Anaconda recently announced a collaboration with the JetBrains team and their Python integrated development environment (IDE) called PyCharm. I highly recommend PyCharm to new users; it makes navigating your code and learning about Python and proper Python style very accessible.
To start off, you likely want the full version of Anaconda. However, after you gain more experience, you may prefer to use Miniconda. It is a slimmed-down version of Anaconda that includes only the core Python program and the conda package manager. Once Miniconda is installed, you can use the conda command to install any other packages, create environments, and so on. This allows you to install only the packages you need.
Creating our first virtual environment with conda
While it is possible in Windows to use the Anaconda Navigator to create new virtual environments, it is more common to use the Anaconda prompt. On Mac and Linux, conda can be used from a terminal.
(base) /home/martin $ conda create --name sci_sound
Collecting package metadata: done
Solving environment: done
## Package Plan ##
environment location: /home/martin/anaconda3/envs/sci_sound
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate sci_sound
#
# To deactivate an active environment, use
#
# $ conda deactivate
Notice how the terminal prompt includes (base). This is something conda does to inform us we are currently in our base Python environment.
The above command created a new virtual environment called sci_sound. Because we did not specify anything other than the name of the environment, conda created the environment without actually including Python or any packages.
Creating our first useful virtual environment with conda
A slightly more useful version of the above command also specifies the version of Python we want to use in our virtual environment:
~$ conda create --name sci_sound python=3.7
Collecting package metadata: done
Solving environment: done
## Package Plan ##
environment location: /home/martin/anaconda3/envs/sci_sound
added / updated specs:
- python=3.7
The following NEW packages will be INSTALLED:
bzip2 conda-forge/linux-64::bzip2-1.0.6-h14c3975_1002
ca-certificates conda-forge/linux-64::ca-certificates-2019.3.9-hecc5488_0
certifi conda-forge/linux-64::certifi-2019.3.9-py37_0
libffi conda-forge/linux-64::libffi-3.2.1-he1b5a44_1006
libgcc-ng pkgs/main/linux-64::libgcc-ng-8.2.0-hdf63c60_1
libstdcxx-ng pkgs/main/linux-64::libstdcxx-ng-8.2.0-hdf63c60_1
ncurses conda-forge/linux-64::ncurses-6.1-hf484d3e_1002
openssl conda-forge/linux-64::openssl-1.1.1b-h14c3975_1
pip conda-forge/linux-64::pip-19.1-py37_0
python conda-forge/linux-64::python-3.7.3-h5b0a415_0
readline conda-forge/linux-64::readline-7.0-hf8c457e_1001
setuptools conda-forge/linux-64::setuptools-41.0.1-py37_0
sqlite conda-forge/linux-64::sqlite-3.26.0-h67949de_1001
tk conda-forge/linux-64::tk-8.6.9-h84994c4_1001
wheel conda-forge/linux-64::wheel-0.33.1-py37_0
xz conda-forge/linux-64::xz-5.2.4-h14c3975_1001
zlib conda-forge/linux-64::zlib-1.2.11-h14c3975_1004
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate sci_sound
#
# To deactivate an active environment, use
#
# $ conda deactivate
From the output above, we can see that even the most basic Python installation requires several packages to be installed.
Now, let’s activate our new Python environment by following the instructions given to us by conda after it created our new environment. Specifically, we will use conda activate sci_sound to activate our environment and conda deactivate to deactivate it (and return to our base environment).
(base) /home/martin $ conda activate sci_sound
(sci_sound) /home/martin $
Notice how the terminal prompt indicates our sci_sound environment is now active.
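The prompt is not the only place to check. When an environment is activated, conda also sets the CONDA_DEFAULT_ENV environment variable, which normally holds the environment's name (or its path, for environments stored outside the default envs folder). The sketch below reads that variable from Python; the function name active_conda_env is mine, and the optional environ argument exists only so the function can be tried without a real conda session.

```python
import os


def active_conda_env(environ=None):
    """Return the active conda environment, or None if none is active."""
    if environ is None:
        environ = os.environ
    # `conda activate` sets CONDA_DEFAULT_ENV for the activated environment
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    print("Active conda environment:", active_conda_env())
```

A check like this at the top of an analysis script can catch the common mistake of running a project in the base environment by accident.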
Including Python packages in our conda environments
When we create a new virtual environment, we may already have a good idea of what packages we will need. We can pass the required package names to conda, and it will take care of the rest.
Let’s run an example where we want to create a Python 3.7 environment that includes numpy:
(base) /home/martin $ conda create --name sci_sound python=3.7 numpy
Collecting package metadata: done
Solving environment: done
## Package Plan ##
environment location: /home/martin/anaconda3/envs/sci_sound
added / updated specs:
- numpy
- python=3.7
[...]
Alternatively, we can create our virtual environment and then add packages:
(base) /home/martin $ conda create --name sci_sound python=3.7
[...]
(base) /home/martin $ conda install --name sci_sound numpy
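If we create the same kind of environment for many projects, it can help to assemble the command in a small script. The following is only a sketch: conda_create_command is a hypothetical helper of my own, not part of conda, and it only builds the argument list for the conda create command shown above without running anything.

```python
import shlex


def conda_create_command(name, python_version=None, packages=()):
    """Build the argument list for a `conda create` call (nothing is executed)."""
    cmd = ["conda", "create", "--name", name]
    if python_version is not None:
        # Same `python=3.7` style of version specification used on the command line
        cmd.append("python={}".format(python_version))
    # Any further package names (or name=version specs) are appended as-is
    cmd.extend(packages)
    return cmd


if __name__ == "__main__":
    cmd = conda_create_command("sci_sound", "3.7", ["numpy"])
    # Show the command as it would be typed at the prompt
    print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from a terminal where conda is available; conda will then ask for confirmation as usual.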
Creating custom virtual environments with conda
There are times when we need to work with a specific version of Python as well as specific versions of packages. This can easily be achieved by specifying the version numbers:
(base) /home/martin $ conda create --name sci_sound python=3.4 scipy=0.15.0 astroid babel
As we can see, the above command created a new environment for Python 3.4 with a specific version of scipy as well as the most recent compatible versions of the astroid and babel packages. Because different packages can depend on specific versions of other packages, it is recommended that we install all our packages in one call rather than one at a time.
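Once an environment like this is active, we can confirm from Python that it contains the versions we asked for. The sketch below relies on the common convention that a package exposes a __version__ attribute; not every package does, so the function falls back to "unknown" in that case. The name installed_version is my own.

```python
import importlib
import sys


def installed_version(module_name):
    """Return a module's version string, "unknown", or None if not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not available in the active environment
        return None
    # Many, but not all, packages expose their version as __version__
    return str(getattr(module, "__version__", "unknown"))


if __name__ == "__main__":
    print("python", "{}.{}".format(sys.version_info.major, sys.version_info.minor))
    for name in ["scipy", "astroid", "babel"]:
        print(name, installed_version(name))
```

Run inside the environment created above, this should report Python 3.4 and scipy 0.15.0; run from a different environment, the numbers will reflect whatever is installed there.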
Summary
In this post we learned about Python virtual environments and how to use conda to create them. This is a large topic that is worth delving deeper into, so in our next post we will learn more about installing Python packages that are not directly available through conda.