Python virtual environments for scientists with conda
I have been coding with Python for several years and I only recently took the time to learn about Python environments. Now that I understand them, I can see why they are so useful, especially in this day and age of large team collaborations, research reproducibility, and sharing data and code when research results are published.
The purpose of this post is to give you some background on Python virtual environments and how to incorporate them into your projects.
What is a virtual environment?
When coding with a large piece of commercial software such as Matlab, you usually deal with a single copy of the purchased software plus any additional toolboxes you or your institute have purchased. Because they are bundled together, Matlab and its various toolboxes work seamlessly with one another.
Python is an open-source programming language, so things are a little different.
The core Python program continues to evolve (version 3.7 at the time of writing), but so do the various packages written and contributed by the Python community. Some packages have been developed and tested with certain versions of Python. Similarly, some packages depend on specific versions of other packages.
These inter-dependencies can become complicated. Even worse, two of our projects may require different versions of the same package.
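To see which versions are actually in play on a given machine, a short script can help. The following is a minimal sketch that assumes Python 3.8 or later (for importlib.metadata); the helper names installed_version and describe_environment are made up for this illustration, and numpy and scipy are only example package names.

```python
import sys
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def describe_environment(package_names):
    """Map "python" and each requested package to a version string (or None)."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in package_names:
        report[name] = installed_version(name)
    return report


# The package names are only examples; swap in whatever your project uses.
for name, version in describe_environment(["numpy", "scipy"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Running the same script in two different projects makes it easy to spot when they rely on different versions of the same package.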
Python virtual environments to the rescue!
Python virtual environments are isolated installations of Python that you can activate and deactivate. You can have dozens of environments on your computer, often one for every major project you are working on, and if needed, each can use different versions of Python. Moreover, these virtual environments can use different packages and different versions of these packages.
In this way, each virtual environment is a self-contained programming environment that contains the required packages, and the correct versions of those packages.
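One way to make the idea of "isolated installations" concrete is to ask the running interpreter where it lives. The sketch below uses only the standard library; the function name in_virtual_environment is an illustrative choice. Note that the prefix comparison detects environments created with venv; inside a conda environment the two prefixes are normally identical, so there sys.prefix itself is the useful piece of information.

```python
import sys


def in_virtual_environment():
    """True when the interpreter runs inside a venv-style virtual environment."""
    # In a venv, sys.prefix points at the environment and sys.base_prefix
    # points at the Python installation the environment was created from.
    return sys.prefix != sys.base_prefix


print("Interpreter location:", sys.prefix)
print("Inside a venv-style environment:", in_virtual_environment())
```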
venv vs conda
Python ships with venv, a tool to create virtual environments. While venv is simple to use, the majority of (data) scientists who use Python tend to use conda to create their virtual environments.
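For comparison with the conda workflow covered in the rest of the post, here is a rough sketch of creating an environment with the standard-library venv module, which is what running python -m venv does on the command line. The helper create_venv and the directory name sci_sound_venv are assumptions made for this example, and the environment is created in a temporary directory so that nothing is left behind.

```python
import tempfile
import venv
from pathlib import Path


def create_venv(env_dir):
    """Create a bare virtual environment at env_dir and return its path."""
    # with_pip=False keeps the example fast; real projects usually want pip.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    return Path(env_dir)


with tempfile.TemporaryDirectory() as tmp:
    env_path = create_venv(Path(tmp) / "sci_sound_venv")
    # A venv stores its settings in a pyvenv.cfg file at its root.
    print("pyvenv.cfg present:", (env_path / "pyvenv.cfg").exists())
```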
conda is part of the Anaconda Python distribution. Anaconda provides a simple way for its users to quickly download 1,500+ Python/R data science packages, as well as manage Python libraries, dependencies, and environments (via conda).
Because of its ease of use and popularity amongst scientists, the rest of this post will focus on Anaconda. If you are new to Python, I highly recommend using Anaconda. The base download includes Python and over 200 key data science Python packages. It also includes a large library of Python packages that can be installed with conda, a simple coding environment (IDLE), and a more advanced coding environment that somewhat resembles the Matlab interface (Spyder). Also, the folks at Anaconda recently announced a collaboration with the JetBrains team and their Python integrated development environment (IDE) called PyCharm. I highly recommend PyCharm to new users; it makes navigating your code and learning about Python and proper Python style very accessible.
To start off, you likely want the full version of Anaconda. However, after you gain more experience, you may prefer to use Miniconda. It is a slimmed-down version of Anaconda that includes the core Python program and the conda package manager. Once Miniconda is installed, you can use the conda command to install any other packages, create environments, and so on. This allows you to install only the packages you need.
Creating our first virtual environment with conda
While it is possible in Windows to use the Anaconda Navigator to create new virtual environments, it is more common to use the Anaconda prompt. On Mac and Linux, conda can be used from a terminal.
    (base) /home/martin $ conda create --name sci_sound
    Collecting package metadata: done
    Solving environment: done
    ## Package Plan ##
      environment location: /home/martin/anaconda3/envs/sci_sound
    Proceed ([y]/n)? y
    Preparing transaction: done
    Verifying transaction: done
    Executing transaction: done
    #
    # To activate this environment, use
    #
    #     $ conda activate sci_sound
    #
    # To deactivate an active environment, use
    #
    #     $ conda deactivate
Notice how the terminal prompt includes (base). This is something conda does to inform us we are currently in our base Python environment.
The above command created a new virtual environment called sci_sound. Because we did not specify anything other than the name of the environment, conda created the environment without actually including Python or any packages.
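To confirm that the new environment exists, conda can list the environments it knows about with conda env list, and the --json flag makes that output easy to read from Python. The sketch below is illustrative only: parse_env_list and list_conda_envs are hypothetical helper names, and the sample JSON string is made up from the paths shown in this post rather than captured from a real run.

```python
import json
import shutil
import subprocess
from pathlib import Path


def parse_env_list(json_text):
    """Turn the JSON printed by `conda env list --json` into {name: path}."""
    paths = json.loads(json_text).get("envs", [])
    # The last path component serves as the name, so the base environment
    # shows up under its folder name (for example "anaconda3"), not "base".
    return {Path(path).name: path for path in paths}


def list_conda_envs():
    """Return {name: path} for all conda environments, or {} without conda."""
    conda = shutil.which("conda")
    if conda is None:
        return {}
    result = subprocess.run(
        [conda, "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return parse_env_list(result.stdout)


# Made-up sample output, modelled on the paths used in this post.
sample = '{"envs": ["/home/martin/anaconda3", "/home/martin/anaconda3/envs/sci_sound"]}'
print(parse_env_list(sample))
```

On a machine with conda installed, calling list_conda_envs() should return the real environments instead of the sample.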
Creating our first useful virtual environment with conda
A slightly more useful version of the above command is to specify the version of Python we want to use in our virtual environment:
    ~$ conda create --name sci_sound python=3.7
    Collecting package metadata: done
    Solving environment: done
    ## Package Plan ##
      environment location: /home/martin/anaconda3/envs/sci_sound
      added / updated specs:
        - python=3.7
    The following NEW packages will be INSTALLED:
      bzip2            conda-forge/linux-64::bzip2-1.0.6-h14c3975_1002
      ca-certificates  conda-forge/linux-64::ca-certificates-2019.3.9-hecc5488_0
      certifi          conda-forge/linux-64::certifi-2019.3.9-py37_0
      libffi           conda-forge/linux-64::libffi-3.2.1-he1b5a44_1006
      libgcc-ng        pkgs/main/linux-64::libgcc-ng-8.2.0-hdf63c60_1
      libstdcxx-ng     pkgs/main/linux-64::libstdcxx-ng-8.2.0-hdf63c60_1
      ncurses          conda-forge/linux-64::ncurses-6.1-hf484d3e_1002
      openssl          conda-forge/linux-64::openssl-1.1.1b-h14c3975_1
      pip              conda-forge/linux-64::pip-19.1-py37_0
      python           conda-forge/linux-64::python-3.7.3-h5b0a415_0
      readline         conda-forge/linux-64::readline-7.0-hf8c457e_1001
      setuptools       conda-forge/linux-64::setuptools-41.0.1-py37_0
      sqlite           conda-forge/linux-64::sqlite-3.26.0-h67949de_1001
      tk               conda-forge/linux-64::tk-8.6.9-h84994c4_1001
      wheel            conda-forge/linux-64::wheel-0.33.1-py37_0
      xz               conda-forge/linux-64::xz-5.2.4-h14c3975_1001
      zlib             conda-forge/linux-64::zlib-1.2.11-h14c3975_1004
    Proceed ([y]/n)? y
    Preparing transaction: done
    Verifying transaction: done
    Executing transaction: done
    #
    # To activate this environment, use
    #
    #     $ conda activate sci_sound
    #
    # To deactivate an active environment, use
    #
    #     $ conda deactivate
From the output above, we can see that even the most basic Python installation requires several packages to be installed.
Now, let’s activate our new Python environment by following the instructions given to us by conda after it created our new environment. Specifically, we will use conda activate sci_sound to activate our environment and conda deactivate to deactivate it (and return to our base environment).
    (base) /home/martin $ conda activate sci_sound
    (sci_sound) /home/martin $
Notice how the terminal prompt indicates our sci_sound environment is now active.
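The prompt is not the only place where the active environment shows up. When an environment is activated, conda sets the CONDA_DEFAULT_ENV and CONDA_PREFIX environment variables, so a script can check where it is running. The helper name active_conda_env below is an assumption for illustration; this is a small sketch, not an official conda API.

```python
import os


def active_conda_env(environ=None):
    """Return (name, prefix) of the active conda environment, or (None, None)."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV"), environ.get("CONDA_PREFIX")


name, prefix = active_conda_env()
if name is None:
    print("No conda environment appears to be active.")
else:
    print(f"Active conda environment: {name} ({prefix})")
```

A check like this at the top of an analysis script can catch the common mistake of running it from the base environment.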
Including Python packages in our conda environments
When we create a new virtual environment, we may already have a good idea of what packages we will need. We can pass the required package names to conda, and it will take care of the rest.
Let’s run an example where we want to create a Python 3.7 environment that includes numpy:
    (base) /home/martin $ conda create --name sci_sound python=3.7 numpy
    Collecting package metadata: done
    Solving environment: done
    ## Package Plan ##
      environment location: /home/martin/anaconda3/envs/sci_sound
      added / updated specs:
        - numpy
        - python=3.7
    [...]
Alternatively, we can create our virtual environment and then add packages:
    (base) /home/martin $ conda create --name sci_sound python=3.7
    [...]
    (base) /home/martin $ conda install --name sci_sound numpy
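After installing packages, it can be reassuring to check from inside the activated environment that they are importable. The following is a minimal sketch using the standard library; missing_packages is a made-up helper name, and the list of modules is only an example.

```python
from importlib.util import find_spec


def missing_packages(module_names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in module_names if find_spec(name) is None]


# Example module names; adjust them to match what the project needs.
missing = missing_packages(["numpy"])
if missing:
    print("Missing from this environment:", ", ".join(missing))
else:
    print("All required packages are available.")
```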
Creating custom virtual environments with conda
There are times when we need to work with a specific version of Python as well as specific versions of packages. This can easily be achieved by specifying the version number:
(base) /home/martin $ conda create --name sci_sound python=3.4 scipy=0.15.0 astroid babel
As we can see, the above command created a new environment for Python 3.4 with a specific version of scipy as well as the most recent compatible versions of the astroid and babel packages. Because different packages can depend on specific versions of other packages, it is recommended that we install all our packages in one call rather than one at a time.
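When the same environment has to be recreated on several machines, it can help to assemble the conda create call from a single description of the requirements. The sketch below only builds and prints the command; conda_create_command is a hypothetical helper, not part of conda, and shlex.join requires Python 3.8 or later.

```python
import shlex


def conda_create_command(name, python=None, packages=None):
    """Build the argument list for a single `conda create` call."""
    args = ["conda", "create", "--name", name]
    if python is not None:
        args.append(f"python={python}")
    # A version of None means "let conda pick a compatible version".
    for package, version in (packages or {}).items():
        args.append(f"{package}={version}" if version else package)
    return args


command = conda_create_command(
    "sci_sound",
    python="3.4",
    packages={"scipy": "0.15.0", "astroid": None, "babel": None},
)
print(shlex.join(command))
```

Keeping every package in one call matches the advice above, since conda can then resolve all the version constraints together.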
In this post we learned about Python virtual environments and how to use conda to create them. This is a large topic that is worth delving deeper into. So in our next post we will learn more about installing Python packages that are not directly available through conda.