Use pixi for maximally ergonomic and reproducible environments
The first edition walked through conda-centric examples because that was the default mental model for many teams at the time. That is history, not something I want you to cargo-cult today. I now recommend one Pixi configuration per project, which can define multiple environments inside that single project. The rest of this chapter is Pixi-first: commands, lockfiles, and how to compose environments.
Pixi is the multitool I use for that job: one place to define software environments reproducibly and ergonomically. For the hybrid data scientist and tool-builder persona, I told the longer personal story in a blog post about moving my workflows toward Pixi; here I keep the operational spine in cheat-sheet form. Read top to bottom for everyday commands, then lockfiles, then how to compose multiple environments from shared pieces.
A practical onboarding path (clone to first command)
If you are joining an existing Pixi-managed project, this is the shortest reliable path from clone to a working command:
git clone https://github.com/ericmjl/data-science-bootstrap-notes.git
cd data-science-bootstrap-notes
pixi install
pixi run python -V
pixi run build
After the clone, the three pixi commands do three things in order: install the solved environment, verify you are using project-managed Python, then run a real task from the project (build is a task this project defines).
If pixi.lock changed after you pulled latest commits, re-run pixi install so
your local solve matches the pinned graph again.
Choose one config layout on purpose
For Python and data-science projects, I recommend keeping Pixi configuration in
pyproject.toml under [tool.pixi]. That keeps package metadata and environment
management in one file.
Pixi also supports a standalone pixi.toml. That format is valid, but in this
book the default recommendation is:
- Use pyproject.toml + [tool.pixi.*] for Python projects.
- Commit pixi.lock for reproducibility.
If you want a broader "which config file does what" reference, see the configuration files guide.
Pixi command cheat sheet
Install or update Pixi
curl -fsSL https://pixi.sh/install.sh | bash # install pixi
echo 'eval "$(pixi completion --shell zsh)"' >> ~/.zshrc # enable auto-completion (zsh users)
echo 'eval "$(pixi completion --shell bash)"' >> ~/.bashrc # enable auto-completion (bash users)
pixi self-update # update an existing installation
Run only the completion line that matches your shell, then open a new shell so the completion loads.
Initialize a project
Using pyds-cli:
pyds project init # scaffolds project with cookiecutter and pixi
When starting a new project, I use pyds-cli to scaffold the layout. Under the
hood, pyds project init uses cookiecutter, then brings Pixi in to install
the environment.
Using Pixi directly:
pixi init --format pyproject # quick initialization for prototyping
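If you only need a scratch environment for prototyping, pixi init alone is
enough to get moving; the --format pyproject flag writes the configuration into
pyproject.toml rather than a standalone pixi.toml.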
Add a dependency
pixi add "package name"
pixi add --pypi "pypi package name"
pixi add -f "feature name" "package name"
Use pixi add to record dependencies in the environment specification. Pull from
PyPI with --pypi. Target a single feature (for example, a CUDA-specific stack)
with -f and the feature name, followed by the package to add.
In pyproject.toml, the mental model is:
- [tool.pixi.dependencies] for conda-forge packages.
- [tool.pixi.pypi-dependencies] for packages you want sourced from PyPI.
In this book's own pyproject.toml, you can see both in action: docs tooling from
conda-forge and the local editable package under PyPI dependencies.
Shell activation (conda-like workflow)
pixi shell
pixi shell -e "environment-name"
When you need a shell whose python / IPython matches the project, run
pixi shell. Pass -e env-name to pick a non-default environment.
If you also use automatic environment loading by directory, see
Install and configure direnv for environment management.
Run tasks and programs
pixi run task-name # as specified in your pyproject.toml
pixi run -e docs quarto preview # run quarto preview in the docs environment
pixi run python # run Python in the project's default environment
Pixi can replace scattered Makefiles with tasks declared in pyproject.toml, so
you get named aliases and one-off commands (for example quarto preview) inside
the environment they belong to.
Standardize with AGENTS.md and pixi run
The same pixi run habit shows up when you add coding agents: they often invoke
python or other tools directly and skip the project environment. Encode a local
norm in AGENTS.md: run project commands through pixi run
unless you have a deliberate exception, so agents reuse the same solves and task
names you rely on locally.
For repo-specific command conventions, see AGENTS.md.
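If you script around agents or CI, the same norm can be enforced in code. The following is a small sketch, not part of Pixi or of any agent framework: via_pixi is a hypothetical helper that prefixes a command with pixi run and, optionally, the -e flag shown earlier.

```python
import shlex
from typing import Optional, Sequence


def via_pixi(command: Sequence[str], environment: Optional[str] = None) -> list[str]:
    """Prefix a command with `pixi run`, optionally targeting a named environment."""
    if list(command[:2]) == ["pixi", "run"]:
        return list(command)  # already routed through Pixi, leave untouched
    prefix = ["pixi", "run"]
    if environment is not None:
        prefix += ["-e", environment]
    return prefix + list(command)


# Print the shell form of two wrapped commands.
print(shlex.join(via_pixi(["pytest", "-q"], environment="tests")))
print(shlex.join(via_pixi(["python", "scripts/benchmark.py"])))
```

The returned list can be handed to subprocess.run as-is, which avoids shell quoting problems.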
Long-term reproducibility through lock files
Automatic lockfiles are one of Pixi's strongest features. They address a failure mode that kept surfacing in historical conda-style setups: the environment you solved today was not guaranteed to be the environment you would solve a year later unless you layered on extra discipline.
Historical friction: conda-era lockfiles
In those workflows, lockfiles were not first-class by default. Teams either
remembered to run conda env export, adopted add-ons such as conda-lock, or
drifted quietly as solver metadata changed. Without an always-on lockfile story,
environments crept, which fed the classic “works on my machine” failure mode.
The Pixi approach
Pixi generates and updates pixi.lock when you change the environment
specification. That gives you:
- Automatic reproducibility: dependency changes show up as lockfile updates
- Fewer manual rituals: no separate “remember to lock” step for the common path
- Durable environments: the same solve can be replayed later
- Team alignment: everyone resolves to the same pinned graph
For data science work that sits idle for months or spans collaborators on different machines, that automation is part of what makes results replayable.
In practice, the lockfile discipline is simple: commit pixi.lock, and
re-run pixi install when lockfile changes land from upstream.
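One way to notice that a pull changed the lockfile is to compare a digest of pixi.lock before and after. This is a sketch under my own assumptions: lock_digest and needs_reinstall are hypothetical helper names, and the file contents in the demo are placeholders, not real lockfile syntax.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional


def lock_digest(lockfile: Path) -> Optional[str]:
    """Return the SHA-256 hex digest of the lockfile, or None if it is missing."""
    if not lockfile.exists():
        return None
    return hashlib.sha256(lockfile.read_bytes()).hexdigest()


def needs_reinstall(before: Optional[str], after: Optional[str]) -> bool:
    """A changed, added, or removed lockfile means the local install may be stale."""
    return before != after


# Demo on a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    lock = Path(tmp) / "pixi.lock"
    lock.write_text("placeholder contents before pull\n")
    before = lock_digest(lock)
    lock.write_text("placeholder contents after pull\n")  # stand-in for `git pull`
    after = lock_digest(lock)
    print("re-run pixi install" if needs_reinstall(before, after) else "environment is current")
```

In a real project you would take the first digest before git pull and the second after it, then run pixi install only when they differ.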
Composable multi-environment projects
Beyond locking, Pixi's biggest ergonomic win is composition: instead of juggling unrelated envs per purpose, you define reusable features and combine them into named environments.
Features as building blocks
In pyproject.toml, define features as reusable bundles:
[tool.pixi.feature.tests.dependencies]
pytest = "*"
pytest-cov = "*"
hypothesis = "*"
[tool.pixi.feature.docs.dependencies]
mkdocs = "*"
mkdocs-material = "*"
mknotebooks = "*"
[tool.pixi.feature.notebook.dependencies]
ipykernel = "*"
ipython = "*"
jupyter = "*"
pixi-kernel = "*"
[tool.pixi.feature.devtools.dependencies]
pre-commit = "*"
Composing environments
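Combine features into environments that match how you work. The setup and cuda features referenced here are assumed to be defined elsewhere in the same file, in the same way as the features above: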
[tool.pixi.environments]
default = { features = ["tests", "devtools", "notebook", "setup"] }
docs = { features = ["docs"] }
tests = { features = ["tests", "setup"] }
cuda = { features = ["tests", "devtools", "notebook", "setup", "cuda"] }
Why this shape helps
- Purpose stays visible: each feature names the dependencies for one job (tests, docs, notebooks).
- Lean CI paths: CI can target -e tests without dragging doc or notebook stacks along.
- Hardware-shaped slices: CUDA or other accelerators become explicit feature toggles instead of one-off hacks.
- Reproducibility stays structural: environments are spelled out as explicit feature lists.
A benchmark-driven use case: divergent model stacks
Multi-environment composition is especially useful when you benchmark multiple ML models that do not share the same software stack. In practice, this often means different Python versions, different deep learning libraries, or mutually incompatible dependency pins.
Instead of trying to force everything into one giant environment, define one environment per benchmark stack and run the same benchmark task per environment. You keep stacks isolated while still running everything from one project directory.
[tool.pixi.feature.model_a.dependencies]
python = "3.10.*"
pytorch = ">=2.4,<2.5" # conda-forge names the package pytorch; torch is the PyPI name
[tool.pixi.feature.model_b.dependencies]
python = "3.12.*"
jax = ">=0.5,<0.6"
equinox = ">=0.11,<0.12"
[tool.pixi.environments]
model-a = { features = ["model_a"], solve-group = "model-a" }
model-b = { features = ["model_b"], solve-group = "model-b" }
[tool.pixi.tasks]
benchmark = "python scripts/benchmark.py"
Then run:
pixi run -e model-a benchmark
pixi run -e model-b benchmark
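Use separate solve groups when stacks genuinely diverge. Environments that share a solve group are solved together, so keeping model-a and model-b in different groups prevents one environment's constraints from accidentally shaping the other environment's solve.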
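The benchmark task points at scripts/benchmark.py, which is not shown in this chapter. Below is one hypothetical shape such a script could take: it detects which stack is importable in the active environment and times a placeholder workload. The function names and the workload are illustrative assumptions; swap in the real model code per stack.

```python
"""Hypothetical scripts/benchmark.py: one entry point for every stack."""
import importlib.util
import statistics
import time
from typing import Callable, Iterable, Optional


def detect_stack(candidates: Iterable[str] = ("torch", "jax")) -> Optional[str]:
    """Return the first candidate module importable in this environment."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


def time_call(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock seconds over `repeats` calls of `fn`."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def placeholder_workload() -> int:
    """Stand-in for real model code; replace with the model under test."""
    return sum(i * i for i in range(100_000))


if __name__ == "__main__":
    stack = detect_stack()
    seconds = time_call(placeholder_workload)
    print(f"stack={stack or 'none detected'} median_seconds={seconds:.6f}")
```

Because the script asks the environment what is installed rather than taking a flag, the same benchmark task definition works unchanged under pixi run -e model-a and pixi run -e model-b.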
Example workflows
The commands below match the environment names in the preceding pyproject.toml
snippets: lean paths where you need them, full stack where you do not.
Run tests with only the test stack:
pixi run -e tests pytest
Build docs with only documentation tooling:
pixi run -e docs mkdocs build
Develop with the default “everything I need” composition:
pixi shell # uses default environment
Open a CUDA-capable shell when that feature exists:
pixi shell -e cuda
For many projects, including this one, the default environment plus a few tasks gets you very far. Multi-environment composition is a powerful extension pattern, not something you have to adopt on day one.