Export anaconda environment with conda export

TL;DR Confused by conda env export not working as expected? Jump to the solution!

Managing projects with virtual environments is one of the core practices of modern Python development. There are several tools for managing virtual environments, such as venv, pipenv, poetry and conda. Some of these tools also provide dependency and project management capabilities, and there is no universally acknowledged favorite. The important thing to understand is that the exact tool you use is much less important than a clear and consistent development process that you and your team use.

In this blog post we will discuss one specific problem with managing conda environment specifications. I would like to keep this post focused, so we will not dive into the reasons for choosing conda over other tools, nor will we discuss the full Python development pipeline.

Why version environment specifications

Saving and versioning your environment spec is always a good idea, even when you work alone on a pet project. A specification stored along with your code will help you recreate your environment on a different machine, or when you take a break from your project and get back to it half a year later.

Having an environment spec is even more important when you work with a team: all of your teammates should be able to spin up a new environment quickly and reliably.

As mentioned in this great blog post, we need to accomplish two distinct and sometimes conflicting objectives:

  1. Reproducibility. You should be able to reproduce your environment with as much precision as possible. It's really important that you deploy your code to the same environment that you develop and test it in. You can achieve great reproducibility with the conda-lock tool, but the specs it produces are really hard to read, reason about and maintain.
  2. Upgradability. You should be able to manage your environment specification by hand in a text editor. Ideally, you would start your project with an empty environment and gradually install packages with conda install or even pip install. After some initial fiddling you settle on an environment that works, and you would like to save its configuration for your team to use. You want everyone to have the exact same environment, so you use conda-lock to generate locked specifications for each platform. But to generate a lockfile, you first need a less strict specification: one that contains only the packages and versions that you really care about, and none of the transitive dependencies. You can edit this file by hand and trust the packaging system to do its best to resolve all the dependencies. Right now you would either author this file completely by hand, or run conda env export and then edit the output by hand, since the exported specifications are unfortunately not portable.

Automating environment specifications with conda export

Since it's quite burdensome to manually track all the packages you install, I decided to find a way to do this automatically. I quickly found that conda env export doesn't help much, since:

  1. It exports all the transitive dependencies too. This is problematic, since transitive dependencies might differ between platforms.
  2. It adds a fixed version component, which prevents cross-platform portability and generally defeats the idea of upgradability that we discussed previously.
  3. It includes the environment prefix, which also has to be deleted for the specification to be portable.

As you can see, conda env export doesn't help much with our task, and one might argue that it's more effective to write the specification by hand.
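To make the manual cleanup concrete, here is a minimal sketch of the mechanical part of it in Python. It assumes that conda entries in the exported file have the form name=version=build and that the prefix sits on its own top-level line; the SAMPLE text (including its build string) is hand-written for illustration and is not real output. The function name make_portable is hypothetical.

```python
import re

# Matches a conda dependency line such as "  - numpy=1.26.4=somebuild_0"
# and captures everything before the build string.
_CONDA_ENTRY = re.compile(r"^(\s*-\s+[^=\s]+=[^=\s]+)=[^=\s]+\s*$")


def make_portable(exported: str) -> str:
    """Drop the prefix line and the build strings from an exported spec."""
    kept = []
    for line in exported.splitlines():
        # The prefix is an absolute path on the exporting machine.
        if line.startswith("prefix:"):
            continue
        match = _CONDA_ENTRY.match(line)
        # pip entries use "==" and do not match, so they pass through unchanged.
        kept.append(match.group(1) if match else line)
    return "\n".join(kept) + "\n"


# Hand-written sample in the assumed export layout (build string is invented).
SAMPLE = """\
name: demo
channels:
  - defaults
dependencies:
  - numpy=1.26.4=py312_0
  - pip:
      - requests==2.31.0
prefix: /home/user/miniconda3/envs/demo
"""

if __name__ == "__main__":
    print(make_portable(SAMPLE))
```

Note what this sketch cannot do: the exported file does not say which packages were requested explicitly, so transitive dependencies and exact version pins stay in the result and still have to be pruned by hand.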

To tackle this problem I've created conda-export — a tool that generates portable environment specifications, which contain only top-level packages that you installed over the lifetime of your environment.

The solution

You can install conda-export into your root environment (so that your other environments are not cluttered by tools not relevant to the projects):

conda install conda-export -n base

And then you can export your portable environment specification with a command like this:

conda export -n [environment name] -f environment.yml

And that's it! You will find that conda export tries to minimize the total number of packages in a specification by taking into account only those packages that you explicitly installed. It includes specific versions only if you specified them, and it handles pip packages separately and correctly.
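The idea of keeping only explicit requests can be illustrated with a small Python sketch. This is not how conda-export is implemented; it only assumes that the specs you typed over time are available as a list of strings, and that a later request for a package replaces an earlier one. The function name minimal_specs and the sample list are hypothetical.

```python
import re


def minimal_specs(requests: list[str]) -> list[str]:
    """Reduce a sequence of requested specs to one entry per package."""
    latest: dict[str, str] = {}
    for spec in requests:
        spec = spec.strip()
        # The package name is the leading run of name characters,
        # e.g. "numpy" in "numpy=1.26" or "numpy>=1.26".
        match = re.match(r"[A-Za-z0-9_.\-]+", spec)
        if match is None:
            continue  # skip anything that does not start with a package name
        latest[match.group(0).lower()] = spec
    # A version appears in the result only if the request contained one.
    return [latest[name] for name in sorted(latest)]


if __name__ == "__main__":
    requested = ["numpy=1.26", "pandas", "scipy", "pandas=2.2"]
    print(minimal_specs(requested))
```

The result lists each requested package once and contains no transitive dependencies, which is the property that keeps such a specification easy to edit by hand.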

Suggested workflow

So what would be the complete workflow for having both reproducible and upgradable environment specifications? I would suggest the following:

  1. Start with an empty environment by using conda create.
  2. Install packages as you normally would with conda install or pip install. You can even use specific versions, like conda install numpy=1.26.
  3. When you are satisfied with your environment, use conda export to create a portable and upgradable environment.yml file.
  4. Use conda lock to generate conda-lock.yml with all dependencies fully locked.
  5. Use conda-lock.yml to deploy your code to production, build containers and spin up new environments on development machines.
  6. When it's time to upgrade packages, just regenerate conda-lock.yml from environment.yml, possibly updating versions for only those packages that need to have fixed versions.
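Steps 1 to 4 can be sketched as a short Python script that assembles the commands without running them. Several details here are assumptions rather than part of the workflow above: the environment name demo, the -y flag to skip confirmation prompts, and the conda-lock lock -f spelling of the locking command (the list above writes it as conda lock).

```python
import shlex


def workflow_commands(env: str, packages: list[str]) -> list[list[str]]:
    """Build the command lines for steps 1-4 as argument lists."""
    return [
        # 1. Start with an empty environment.
        ["conda", "create", "-y", "-n", env],
        # 2. Install the packages you care about, pinned only where needed.
        ["conda", "install", "-y", "-n", env, *packages],
        # 3. Export a portable, hand-editable specification.
        ["conda", "export", "-n", env, "-f", "environment.yml"],
        # 4. Lock all dependencies (assumed to write conda-lock.yml).
        ["conda-lock", "lock", "-f", "environment.yml"],
    ]


if __name__ == "__main__":
    for cmd in workflow_commands("demo", ["numpy=1.26"]):
        # Print instead of executing; pass cmd to subprocess.run to run it.
        print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it easy to review them first and hand them to subprocess.run later, once conda, conda-export and conda-lock are installed.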