Use Mamba as a faster drop-in replacement for conda

What is mamba

Mamba is a project originally developed by the QuantStack team. They set out to fix some of the annoyances of the conda package manager, specifically how long it takes to solve an environment specification.

How do you get mamba

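Mamba is available on conda-forge. (The package named mamba on PyPI is an unrelated project, so pip install mamba will not give you this tool.) Follow the instructions on the mamba repo to install it.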

Alias mamba to conda

If you have muscle memory and want to make the switch from conda to mamba as easy as possible, you can use a shell alias inside your sourced .aliases file:

alias conda="mamba"

See the page Create shell command aliases for your commonly used commands for more information on shell aliases.
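If you share the same .aliases file across machines, some of which may not have mamba installed, a guarded version of the alias can be safer. This is a minimal sketch; it assumes nothing beyond mamba being discoverable on your PATH where it is installed:

```shell
# Alias conda to mamba only when mamba is actually available;
# otherwise leave the conda command untouched.
if command -v mamba >/dev/null 2>&1; then
    alias conda="mamba"
fi
```

On a machine without mamba, typing conda keeps working exactly as before.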

Install Anaconda on your machine

What is anaconda

Anaconda is a way to get Python installed on your system.

One of the neat but oftentimes confusing things about Python is that you can have multiple Python executables living around on your system. Anaconda makes it easy for you to:

  1. Obtain Python
  2. Manage different Python versions in isolated environments using a consistent interface
  3. Install packages into these environments

Why use anaconda (or one of its variants)?

Why is this a good thing? Primarily because individual projects may need different versions of Python and different versions of the packages built for it. Also, default Python installations, such as the ones shipped with older versions of macOS, tend to lag several versions behind the latest release, to the detriment of your projects. Some built-in apps in an operating system may depend on that old version of Python (such as iPhoto), which means that if you mess up the installation, you might break those apps. Hence, you will want a tool that lets you easily create isolated Python environments.

The Anaconda Python distribution fulfills the following key needs:

  1. You'll be able to create isolated environments on a per-project basis. (see: Follow the rule of one-to-one in managing your projects)
  2. You'll be able to install packages into those isolated environments, and evolve them over time. (see: Create one conda environment per project)

Installing Anaconda on your local machine thus helps you get easy access to Python, Jupyter (see: Use Jupyter as an experimentation playground), and other tools for modelling and analysis.

How to get anaconda?

To install the Miniforge variant of Anaconda, which is lighter-weight than the full Anaconda distribution, use the following commands:

cd ~
wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" -O anaconda.sh

This will send you to your home directory, and then download the Miniforge bash script installer from the conda-forge Miniforge releases page on GitHub, saving it as anaconda.sh.
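The installer filename in that URL is assembled from two command substitutions, so it can help to see what they resolve to on your machine before downloading. The sketch below only prints the URL; the variable names installer and url are my own, for illustration:

```shell
# uname prints the kernel name (for example Linux or Darwin);
# uname -m prints the machine architecture (for example x86_64 or arm64).
installer="Miniforge3-$(uname)-$(uname -m).sh"
url="https://github.com/conda-forge/miniforge/releases/latest/download/${installer}"

# Show the URL that would be fetched, without downloading anything.
echo "$url"

# If wget is not installed, curl can do the same job:
# curl -L "$url" -o anaconda.sh
```

The -L flag tells curl to follow redirects, which the "latest" release link relies on.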

Now, run the installer:

bash anaconda.sh -b -p "$HOME/anaconda"

This will install the Miniforge distribution of Python into the anaconda directory inside your home directory. The -b flag runs the installer in batch mode without prompts, and -p sets the install location. You can now install packages at will, without needing sudo privileges!
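Because batch mode does not touch your shell startup files, a fresh terminal may not find conda right away. Here is a rough sketch of how you might check the install for the current session. It assumes the $HOME/anaconda prefix passed to -p above, and INSTALL_DIR is just an illustrative variable name:

```shell
# Assumed install prefix, matching the -p flag used in the install step.
INSTALL_DIR="$HOME/anaconda"

# Put the distribution's bin directory at the front of PATH for this session.
export PATH="$INSTALL_DIR/bin:$PATH"

# Report which conda and python would now be picked up, if any.
if command -v conda >/dev/null 2>&1; then
    conda --version
    command -v python
else
    echo "conda not found under $INSTALL_DIR/bin; re-check the install step"
fi

# One way to make this permanent is to let conda edit your shell startup file:
# "$INSTALL_DIR/bin/conda" init bash    # or: zsh, fish
```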

Next steps

Level-up your conda skills

Variants of Anaconda

If you're a conda user, you may have heard of the Anaconda distribution of Python. In this set of notes, however, I've also referenced the Miniforge distribution of Python. What's the difference here? How do you pick which one to use? To answer those questions, we must first understand what a distribution of Python is.

Python distributions

Python can get distributed to users in many ways. You can download it directly from the Python Software Foundation's (PSF) official website. Or you can install it onto your system using the official Anaconda installer, through Homebrew, or through your Linux distribution's package manager. Each way of installing Python can be thought of as a distribution of Python, and each distribution differs ever so slightly. Official Python from the PSF comes with just the standard library. Anaconda, however, ships with the standard library and many other packages that are relevant for data science.

What is common across all Python distributions, however, is that they ship with a Python executable that, at the end of installation, should be discoverable on your PATH environment variable.

Most commonly, there will be a Python package installer that ships with the distribution as well. This can be pip, the official tool for installing Python packages, or it could be conda, which was developed by the company Anaconda.

As such, the anatomy of a distribution is essentially nothing more than:

  • A Python interpreter that can be discovered on your PATH,
  • A Python package manager, and
  • Any other default Python packages that the distributor thinks you might want
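To see which interpreter and package managers your shell currently resolves, you could run something like the following. It is only a sketch: locate_tools is a hypothetical helper name, and the list of tools passed to it is my guess at what is worth checking.

```shell
# Hypothetical helper: report where each named tool resolves on PATH.
locate_tools() {
    for tool in "$@"; do
        if tool_path=$(command -v "$tool" 2>/dev/null); then
            echo "$tool -> $tool_path"
        else
            echo "$tool -> not found on PATH"
        fi
    done
}

# Check the interpreter and the package managers discussed in these notes.
locate_tools python python3 pip conda mamba
```

If python and pip resolve into different directories, they probably belong to different distributions, which is a common source of confusion.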

With that aside, let's look at three distributions of Python that are relevant to this set of notes.

Anaconda Python

The Anaconda distribution of Python is the official distribution from Anaconda. It ships with a modern version of Python, both pip and conda package managers, and a whole slew of default data science packages (pandas, numpy, scikit-learn, scipy, matplotlib, for example). With the Anaconda distribution, conda is configured such that packages are installed from the anaconda repository of packages, hosted by Anaconda itself. Its default installation location is ~/anaconda or ~/anaconda3.

Miniconda Python

The Miniconda Python distribution also comes from Anaconda. It looks like Anaconda except it ships with fewer packages in the base environment. You wouldn't, for example, find pandas installed for you. This was mostly intended to keep the base environment small for use within Docker containers.

Its default installation location is ~/miniconda or ~/miniconda3.

Miniforge Python

This distribution of Python comes from the open-source developer team behind conda-forge. Miniforge looks like Miniconda, but instead of configuring conda to pull packages from the anaconda repository, conda packages are instead pulled from the conda-forge repository of packages by default. This has the advantage of being able to pull more bleeding-edge versions of packages that you may use. Additionally, Miniforge Python ships with mamba as well. (See: Use Mamba as a faster drop-in replacement for conda)

Which to use?

Depends on your persona! If you're an indie hacker type, I would strongly recommend the Miniforge Python as it is lightweight and fast to get set up with and fully open source. On the other hand, if you're more inclined to want enterprise support, vetting of packages, and wish to support a company that backs so much of the Python open source world, then I would recommend reaching out to Anaconda and talking with their sales reps.

Bootstrap a scratch conda environment

A scratch environment is your playground

In a pinch, you might want to muck around on your system with some quick-and-dirty experiment. Having a suite of packages inside a scratch environment can be handy. Your scratch environment can be your base environment if you'd like, but I would strongly recommend creating a separate scratch environment instead.

How to bootstrap a scratch environment

I would recommend that you bootstrap a scratch conda environment with some basic data science packages.

mamba activate base
mamba install -c conda-forge \
    scipy numpy pandas matplotlib \
    jupyter jupyterlab \
    scikit-learn ipython ipykernel \
    ipywidgets mamba

(Replace mamba with conda if you don't have mamba installed on your system.)

Doing so gives you an environment where you can quickly prototype new things without necessarily going through the overhead of creating an entirely new project (and with it, a full conda environment).

Installing mamba can be helpful if you want a faster drop-in replacement for conda. (see: Use Mamba as a faster drop-in replacement for conda for more information.)
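If you would rather keep your base environment clean, the same packages can go into a dedicated environment instead. The sketch below assumes an environment called scratch (the name is arbitrary) and only prints what it would do; the actual commands are left commented out so that nothing gets created by accident:

```shell
# Assumed environment name; pick whatever you like.
ENV_NAME="scratch"

# Prefer mamba when present, otherwise fall back to conda.
if command -v mamba >/dev/null 2>&1; then
    TOOL=mamba
else
    TOOL=conda
fi
echo "Would create '$ENV_NAME' using $TOOL"

# Uncomment to create and enter the environment:
# "$TOOL" create --name "$ENV_NAME" --channel conda-forge --yes \
#     python scipy numpy pandas matplotlib \
#     jupyter jupyterlab scikit-learn \
#     ipython ipykernel ipywidgets
# conda activate "$ENV_NAME"
```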

Create shell command aliases for your commonly used commands

Why create shell aliases

Shell aliases save you keystrokes, which saves time. That saved time compounds like interest over long time horizons!

How do I create aliases?

Shell aliases are easy to create. In your shell initializer script, use the following syntax. As an example, here ls is aliased to exa, with a configuration flag at the end:

alias ls="exa --long"

Now, typing ls at the shell will instead execute exa --long! (To learn what exa is, see Install a suite of really cool utilities on your machine using homebrew.)
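Once a common command like ls is aliased, it is handy to know how to inspect, bypass, and remove the alias. A short sketch using only shell built-ins:

```shell
alias ls="exa --long"

# Print the current definition of the alias.
alias ls

# Run the real ls once, ignoring the alias.
command ls -d .

# Remove the alias from the current session.
unalias ls
```

The unalias only lasts for the current session; the alias comes back the next time your initializer script is sourced.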

Where do I store these aliases?

For these shell aliases to take effect each time you open up your shell, you should ensure that they get sourced in your shell initialization script (see: Take full control of your shell environment variables for more information). You have two options:

  1. These aliases can be declared in your .zshrc or .bashrc (or analogous) file, or
  2. They can be declared in ~/.aliases, which you source inside your shell initialization script file (i.e. .zshrc/.bashrc/etc.)

I recommend the second option as doing so means you'll be putting into practice the philosophy of having clear categories of things in one place.
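For the second option, the line you add to .zshrc or .bashrc could look roughly like this. The helper name source_if_exists is made up for illustration; the guard simply avoids an error on machines where ~/.aliases has not been created yet:

```shell
# Hypothetical helper: source a file only when it exists.
source_if_exists() {
    if [ -f "$1" ]; then
        . "$1"
    fi
}

# Load personal aliases, if the file is present.
source_if_exists "$HOME/.aliases"
```

The same helper can be reused for other optional files you source from your initialization script.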

What are some aliases that could be useful?

In my dotfiles repository, I have a .shell_aliases directory which contains a full suite of aliases that I have installed.

Other external links that showcase shell aliases that could serve as inspiration for your personal collection include:

And finally, to top it off, Twitter user @ctrlshifti suggests aliasing please to sudo for a pleasant experience at the terminal:

alias please="sudo"

# Now you type:
# please apt-get update
# please apt-get upgrade
# etc...