Use pixi for maximally ergonomic and reproducible environments

The first edition walked through conda-centric examples because that was the default mental model for many teams at the time. That is history, not something I want you to cargo-cult today. I now recommend one Pixi configuration per project, which can define multiple environments inside that single project. The rest of this chapter is Pixi-first: commands, lockfiles, and how to compose environments.

Pixi is the multitool I use for that job: one place to define software environments reproducibly and ergonomically. I told the longer personal story, written for the hybrid data scientist and tool-builder persona, in a blog post about moving my workflows toward Pixi; here I keep the operational spine in cheat-sheet form.

A practical onboarding path (clone to first command)

If you are joining an existing Pixi-managed project, this is the shortest reliable path from clone to a working command:

git clone https://github.com/ericmjl/data-science-bootstrap-notes.git
cd data-science-bootstrap-notes
pixi install
pixi run python -V
pixi run build

That sequence does three things in order: install the solved environment, verify you are using project-managed Python, then run a real task from the project.

If pixi.lock changed after you pulled the latest commits, re-run pixi install so your local environment matches the pinned graph again.

Choose one config layout on purpose

For Python and data-science projects, I recommend keeping Pixi configuration in pyproject.toml under [tool.pixi]. That keeps package metadata and environment management in one file.

Pixi also supports a standalone pixi.toml. That format is valid, but in this book the default recommendation is:

Use pyproject.toml + [tool.pixi.*] for Python projects.
Commit pixi.lock for reproducibility.

If you want a broader "which config file does what" reference, see the configuration files guide.

Pixi command cheat sheet

Install or update Pixi

curl -fsSL https://pixi.sh/install.sh | bash # install pixi
echo 'eval "$(pixi completion --shell zsh)"' >> ~/.zshrc # enable auto-completion (zsh)
echo 'eval "$(pixi completion --shell bash)"' >> ~/.bashrc # enable auto-completion (bash)
pixi self-update # update pixi itself

Run only the completion line that matches your shell, then restart the shell so completion takes effect.

Initialize a project

Using pyds-cli:

pyds project init # scaffolds project with cookiecutter and pixi

When starting a new project, I use pyds-cli to scaffold the layout. Under the hood, pyds project init uses cookiecutter, then brings Pixi in to install the environment.

Using Pixi directly:

pixi init --format pyproject # quick initialization for prototyping

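If you only need a scratch environment for prototyping, pixi init is enough to get moving. The --format pyproject flag writes the configuration into pyproject.toml rather than a standalone pixi.toml, which matches the layout recommended above.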

Add a dependency

pixi add "package name"
pixi add --pypi "pypi package name"
pixi add -f "feature name" "package name"

Use pixi add to record dependencies in the environment specification. Pull from PyPI with --pypi. To add a package to a single feature (for example, a CUDA-specific stack), pass -f with the feature name, followed by the package.

In pyproject.toml, the mental model is:

[tool.pixi.dependencies] for conda packages (from conda-forge in this book).
[tool.pixi.pypi-dependencies] for packages you want sourced from PyPI.

In this book's own pyproject.toml, you can see both in action: docs tooling from conda-forge and the local editable package under PyPI dependencies.

Shell activation (conda-like workflow)

pixi shell
pixi shell -e "environment-name"

When you need a shell whose python / IPython matches the project, run pixi shell. Pass -e env-name to pick a non-default environment.

If you also use automatic environment loading by directory, see Install and configure direnv for environment management.

Run tasks and programs

pixi run task-name # as specified in your pyproject.toml
pixi run -e docs quarto preview # run quarto preview in the docs environment
pixi run python # run Python in the project's default environment

Pixi can replace scattered Makefiles with tasks declared in pyproject.toml, so you get named aliases and one-off commands (for example quarto preview) inside the environment they belong to.

Standardize with `AGENTS.md` and `pixi run`

The same pixi run habit shows up when you add coding agents: they often invoke python or other tools directly and skip the project environment. Encode a local norm in AGENTS.md: run project commands through pixi run unless you have a deliberate exception, so agents reuse the same solves and task names you rely on locally.

For repo-specific command conventions, see AGENTS.md.

Long-term reproducibility through lock files

Automatic lockfiles are one of Pixi's strongest features. They address a failure mode that kept surfacing in historical conda-style setups: the environment you solved today was not guaranteed to be the environment you would solve a year later unless you layered on extra discipline.

Historical friction: conda-era lockfiles

In those workflows, lockfiles were not first-class by default. Teams either remembered to run conda env export, adopted add-ons such as conda-lock, or drifted quietly as solver metadata changed. Without an always-on lockfile story, environments crept, which fed the classic “works on my machine” failure mode.

The Pixi approach

Pixi generates and updates pixi.lock when you change the environment specification. That gives you:

Automatic reproducibility: dependency changes show up as lockfile updates
Fewer manual rituals: no separate “remember to lock” step for the common path
Durable environments: the same solve can be replayed later
Team alignment: everyone installs from the same pinned graph

For data science work that sits idle for months or spans collaborators on different machines, that automation is part of what makes results replayable.

In practice, the lockfile discipline is simple: commit pixi.lock, and re-run pixi install when lockfile changes land from upstream.

Composable multi-environment projects

Beyond locking, Pixi's biggest ergonomic win is composition: instead of juggling unrelated envs per purpose, you define reusable features and combine them into named environments.

Features as building blocks

In pyproject.toml, define features as reusable bundles:

[tool.pixi.feature.tests.dependencies]
pytest = "*"
pytest-cov = "*"
hypothesis = "*"

[tool.pixi.feature.docs.dependencies]
mkdocs = "*"
mkdocs-material = "*"
mknotebooks = "*"

[tool.pixi.feature.notebook.dependencies]
ipykernel = "*"
ipython = "*"
jupyter = "*"
pixi-kernel = "*"

[tool.pixi.feature.devtools.dependencies]
pre-commit = "*"

Composing environments

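Combine features into environments that match how you work. The setup and cuda features referenced below are not shown in the snippet above; they are assumed to be defined as features elsewhere in the same file: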

[tool.pixi.environments]
default = { features = ["tests", "devtools", "notebook", "setup"] }
docs = { features = ["docs"] }
tests = { features = ["tests", "setup"] }
cuda = { features = ["tests", "devtools", "notebook", "setup", "cuda"] }

Why this shape helps

Purpose stays visible: each feature names the dependencies for one job (tests, docs, notebooks).
Lean CI paths: CI can target -e tests without dragging doc or notebook stacks along.
Hardware-shaped slices: CUDA or other accelerators become explicit feature toggles instead of one-off hacks.
Reproducibility stays structural: environments are spelled out as explicit feature lists.

A benchmark-driven use case: divergent model stacks

Multi-environment composition is especially useful when you benchmark multiple ML models that do not share the same software stack. In practice, this often means different Python versions, different deep learning libraries, or mutually incompatible dependency pins.

Instead of trying to force everything into one giant environment, define one environment per benchmark stack and run the same benchmark task per environment. You keep stacks isolated while still running everything from one project directory.

[tool.pixi.feature.model_a.dependencies]
python = "3.10.*"
pytorch = ">=2.4,<2.5" # conda-forge publishes torch under the name pytorch

[tool.pixi.feature.model_b.dependencies]
python = "3.12.*"
jax = ">=0.5,<0.6"
equinox = ">=0.11,<0.12"

[tool.pixi.environments]
model-a = { features = ["model_a"], solve-group = "model-a" }
model-b = { features = ["model_b"], solve-group = "model-b" }

[tool.pixi.tasks]
benchmark = "python scripts/benchmark.py"

Then run:

pixi run -e model-a benchmark
pixi run -e model-b benchmark

Keep genuinely divergent stacks out of a shared solve group: environments that share a solve group are solved together, so giving each stack its own group keeps one environment's constraints from accidentally shaping the other environment's solve.
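The sketch below mirrors the benchmark layout in plain Python: it groups environments by solve group and builds one pixi run command per environment for the shared benchmark task. It assumes that environments with the same solve-group value are solved together and that an environment without one is solved on its own; the helpers solve_groups and benchmark_commands are hypothetical, and nothing is executed.

```python
from collections import defaultdict

# Mirrors the [tool.pixi.environments] table from the benchmark example.
ENVIRONMENTS = {
    "model-a": {"features": ["model_a"], "solve-group": "model-a"},
    "model-b": {"features": ["model_b"], "solve-group": "model-b"},
}


def solve_groups(environments: dict) -> dict[str, list[str]]:
    """Map each solve group to the environments that would be solved together."""
    groups = defaultdict(list)
    for name, spec in environments.items():
        group = spec.get("solve-group") if isinstance(spec, dict) else None
        # Assumption: no solve-group means the environment is solved alone,
        # represented here as a singleton group keyed by "<name>".
        groups[group if group is not None else f"<{name}>"].append(name)
    return dict(groups)


def benchmark_commands(environments: dict, task: str = "benchmark") -> list[list[str]]:
    """One `pixi run -e <env> <task>` argv list per environment."""
    return [["pixi", "run", "-e", name, task] for name in environments]


groups = solve_groups(ENVIRONMENTS)
commands = benchmark_commands(ENVIRONMENTS)
for argv in commands:
    print(" ".join(argv))
```

If two environment names ever land in the same group here, their constraints would be solved jointly, which is the situation to avoid for stacks that pin different Python versions.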

Example workflows

The commands below use the environment names from the Composing environments snippet: lean paths where you need them, full stack where you do not.

Run tests with only the test stack:

pixi run -e tests pytest

Build docs with only documentation tooling:

pixi run -e docs mkdocs build

Develop with the default “everything I need” composition:

pixi shell # uses default environment

Open a CUDA-capable shell when that feature exists:

pixi shell -e cuda

For many projects, including this one, the default environment plus a few tasks gets you very far. Multi-environment composition is a powerful extension pattern, not something you have to adopt on day one.