Conda and Mamba

Tip

We recommend using Mamba (https://mamba.readthedocs.io/en/latest/). It is a drop-in replacement for Conda and is typically faster.

Conda (https://docs.conda.io/projects/conda/en/stable/) and Mamba (https://mamba.readthedocs.io/en/latest/) are package management systems.

Loading Conda and Mamba#

System Installation#

We have a base install of Mamba and Conda from Miniforge available as a module that can be accessed using, for example,

module load miniforge/24.11.2-1

The module will automatically load the shell and environment settings needed to use conda and mamba commands.

Note

We recommend always including the version number when loading modules. If no version is given, most modules default to the most recent one, which can cause unexpected behavior when new versions are installed.
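As a rough illustration of pinning a version, the sketch below lists the installed Miniforge modules, loads the pinned one, and confirms that mamba is available. It assumes the `module` command is provided by the cluster's login shell; the guard only prints a message when it is run elsewhere.

```shell
# Pin the exact module version rather than relying on the default.
miniforge_module="miniforge/24.11.2-1"

if type module >/dev/null 2>&1; then
    module avail miniforge            # list the installed Miniforge versions
    module load "$miniforge_module"   # load the pinned version
    mamba --version                   # confirm mamba is now on PATH
else
    echo "module command not found; run this on the cluster" >&2
fi
```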

User Installation#

You can also install your own versions of Conda in your $HOME directory or a project directory. We recommend:

Micromamba: https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html
Miniforge: https://github.com/conda-forge/miniforge

The first time you run Conda/Mamba, you will need to initialize it. For example, the following command creates some shell functions in your profile to make it easier to call Mamba.

mamba init $(echo $SHELL | cut -c 6-)

After doing this, you may need to log out and log back in to see the effects. In most cases, you can source your shell profile to avoid having to log out. For most users, this is source ~/.bashrc.
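The initialization command above assumes that $SHELL has the form /bin/<shell>. A slightly more defensive variant, sketched here, takes the last path component instead, so a value such as /usr/bin/zsh also works. The /bin/bash fallback and the guard around mamba are assumptions added for illustration.

```shell
# Take the last path component of $SHELL (e.g. /usr/bin/zsh -> zsh).
# Falling back to /bin/bash when SHELL is unset is an assumption.
shell_name="$(basename "${SHELL:-/bin/bash}")"

if command -v mamba >/dev/null 2>&1; then
    mamba init "$shell_name"
else
    echo "mamba not found; load the module or install Miniforge first" >&2
fi
```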

We also recommend that you disable Conda/Mamba’s auto-activate base functionality. By default, Conda/Mamba activates a base environment, which can cause issues with system dependencies. In particular, applications on https://hpc.m3.smu.edu often behave in unexpected ways because they try to use a Conda package instead of the correct system package. The first command below disables this behavior. The next two commands tell Conda to prefer saving packages and environments in your $HOME directory (you can specify other locations you have access to, but performance is generally better in $HOME).

conda config --set auto_activate_base false
conda config --prepend envs_dirs $HOME/.conda/envs
conda config --prepend pkgs_dirs $HOME/.conda/pkgs
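To check that the settings took effect, something like the following can be run afterwards. It is a minimal sketch: it prints the configuration files Conda reads and the resolved directory lists, and the guard is only there so the snippet does nothing harmful when conda is not on the PATH.

```shell
# Locations that the commands above place at the front of each list.
envs_dir="$HOME/.conda/envs"
pkgs_dir="$HOME/.conda/pkgs"

if command -v conda >/dev/null 2>&1; then
    # List each configuration file conda reads and the values it sets.
    conda config --show-sources
    # Print the resolved directory lists for environments and packages.
    conda config --show envs_dirs pkgs_dirs
    echo "expecting $envs_dir and $pkgs_dir at the top of each list"
else
    echo "conda not found; load the miniforge module first" >&2
fi
```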

Creating Virtual Environments from the Command Line#

For simple environments with a small number of packages, you can create an environment named mamba_env (or any name of your choosing) in your $HOME directory with

mamba create -n mamba_env python=3.9 package1 package2 package3

The -n tells Mamba what to name the environment. Here, we request Python version 3.9 and the packages package1, package2, and package3, which stand in for the packages you’d like to install (e.g. numpy, tensorflow, pandas, etc.). In general, it is a good idea to install all the packages at the same time because Conda will do a better job of resolving dependencies.
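As a concrete illustration, the sketch below requests Python together with numpy and pandas in a single call and uses --dry-run to preview the solved package list before anything is installed. The package choice is only an example; drop --dry-run to actually create the environment.

```shell
env_name="mamba_env"
# Request Python and every package in one call so they are solved together.
packages="python=3.9 numpy pandas"

if command -v mamba >/dev/null 2>&1; then
    # --dry-run prints what would be installed without creating anything.
    # $packages is intentionally unquoted so each name is a separate argument.
    mamba create --dry-run -n "$env_name" $packages
else
    echo "would run: mamba create -n $env_name $packages"
fi
```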

Creating Virtual Environments from an environment.yml File#

For environments that contain more than a few packages, we suggest creating an environment.yml file. (You can name the file anything you’d like, but it is common practice to call it environment.yml.)

The basic structure of the environment.yml is:

name: mamba_env
channels:
  - conda-forge
dependencies:
  - python>=3.9
  - package1
  - package2
  - package3
  - pip
  - pip:
    - pip_package1
    - -r requirements.txt

The name field is what the created environment will be called (it can be anything you like, but we again use the name mamba_env for the example).

The next section is channels, which are the repositories where Mamba will look for the requested packages. Mamba prioritizes the channels from the top down, so in this case Mamba will prefer the packages in conda-forge (the packages in conda-forge are typically more up to date).

The next section is dependencies, where you should list all of the packages you would like to install. If some packages need to be installed with pip, include pip in the dependencies as above. You can then list the specific packages under pip:, as with pip_package1 above, and/or keep all the pip packages in a requirements.txt file.

Once you have made the environment.yml file, you can create the environment with:

mamba env create -f environment.yml -n mamba_env
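Put together, the steps above might look like the following sketch, which writes a small environment.yml and then creates the environment from it. The packages listed (numpy, pandas, requests) are placeholders chosen for illustration, not a recommended set.

```shell
# Write a small environment file; the package names are placeholders.
cat > environment.yml <<'EOF'
name: mamba_env
channels:
  - conda-forge
dependencies:
  - python>=3.9
  - numpy
  - pandas
  - pip
  - pip:
    - requests
EOF

if command -v mamba >/dev/null 2>&1; then
    # Create the environment described by the file.
    mamba env create -f environment.yml -n mamba_env
else
    echo "mamba not found; environment.yml written but nothing created" >&2
fi
```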

Examples#

The following are examples of how you might start to build your environment for a few different use cases. These are the source files we use on the web portal when you choose to build a custom environment from the form.

These examples are unlikely to match any specific use case exactly, but you can add or remove packages as needed.

Example of an environment for data science, datascience.yml.
Example of an environment for bioinformatics, bioinformatics.yml.
Example of an environment for machine learning with SKLearn, sklearn.yml.
Example of an environment for R, r.yml.

Using the HPC Portal#

If you are running interactive sessions through the portal using JupyterLab, you need to have JupyterLab installed in your environment. If it is not, the portal will not allow that environment to be used.

Your Mamba environment should appear in the drop-down list of Python Environments. If it is greyed out, you need to install JupyterLab in the environment.
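One way to check from a terminal whether an environment already contains JupyterLab is sketched below. The helper name has_jupyterlab is hypothetical, and the pattern tolerates leading spaces because the layout of the package list differs between Conda and Mamba versions.

```shell
# Hypothetical helper: succeed if the named environment lists jupyterlab.
has_jupyterlab() {
    mamba list -n "$1" 2>/dev/null |
        grep -Eq '^[[:space:]]*jupyterlab[[:space:]]'
}

if has_jupyterlab mamba_env; then
    echo "JupyterLab is installed in mamba_env"
else
    echo "JupyterLab not found in mamba_env; add jupyterlab to its package list"
fi
```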

Using a base#

Add the following lines to the Custom environment settings field on the portal:

module load miniforge/24.11.2-1
mamba activate mamba_env

Interactively from the terminal#

If you are running programs interactively from the terminal (e.g. using srun), activate the virtual environment with

module load miniforge/24.11.2-1
mamba activate mamba_env

in the terminal before running any commands.

Using SBatch scripts#

If you are running programs using SBatch scripts, you should include the activation command in your script:

module load miniforge/24.11.2-1
mamba activate mamba_env
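For context, a complete batch script might look like the sketch below, which writes the script to a file that can then be submitted with sbatch. The job name, time limit, and output file name are illustrative assumptions; partition and account options are site-specific and left out.

```shell
# Write a minimal batch script; the resource requests are placeholders.
cat > mamba_job.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=mamba_example
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --output=mamba_example_%j.out

# Load the pinned module, then activate the environment.
module load miniforge/24.11.2-1
mamba activate mamba_env

# Replace this line with the program you want to run.
python --version
EOF

echo "submit with: sbatch mamba_job.sbatch"
```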

Tips and reproducibility#

In general, you should not update packages inside a Conda/Mamba environment. Instead, make a new environment with the package versions you need and verify that it works before removing any old environments that are no longer needed. This is especially true if you used pip to install anything.
It is a good idea to include version numbers for the packages you want (if you know them). For example, in the above, we requested Python version 3.9 or newer. Being more specific can reduce the time it takes to set up the environment because it reduces the number of package versions Mamba will consider.
It is best to install all of the packages when you create the environment, if possible. Mamba will do a better job of resolving dependencies.
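The first tip can be sketched as follows: build a replacement environment under a new name, check it, and only then remove the old one. The name mamba_env_v2 is hypothetical, and the sketch assumes an updated environment.yml is in the current directory.

```shell
old_env="mamba_env"
new_env="mamba_env_v2"   # hypothetical name for the rebuilt environment

if command -v mamba >/dev/null 2>&1; then
    # Build the replacement from the updated file under the new name.
    mamba env create -f environment.yml -n "$new_env"
    # Run a quick check inside the new environment before touching the old one.
    mamba run -n "$new_env" python --version
    # Once the new environment is verified, the old one can be removed with:
    #   mamba env remove -n "$old_env"
    echo "verify $new_env, then remove $old_env"
else
    echo "mamba not found; load the miniforge module first" >&2
fi
```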