Configure your shell

As a data scientist, odds are that you are going to be working in a terminal shell. It will absolutely pay off to invest time in making sure your shell is configured for your maximal productivity.

Install Starship

The default or vanilla prompt that ships with most machines is highly uninformative. Yours may look like any one of the following:

%
#
$

This really doesn't tell you much! And yet, the potential of all of that blank space on your terminal screen is immense! At a glance, it's super helpful to know things like:

  • Where am I in the file tree?
  • What is my current git working branch?
  • Which environment is currently activated?

In principle, all of these can be displayed at your shell prompt, so that the information is at your fingertips. Starship gives you an easy way to make this available, and it is installable with a one-line shell command:

curl -sS https://starship.rs/install.sh | sh
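The installer only places the starship binary on your machine; your shell still has to be told to use it. A minimal sketch of the init line, assuming you use zsh or bash and that this goes at the end of your ~/.zshrc or ~/.bashrc (the guard and the shell detection are my additions, so that a missing binary doesn't break your shell startup):

```shell
# Only initialize Starship if the binary is actually on the PATH.
if command -v starship >/dev/null 2>&1; then
  if [ -n "${ZSH_VERSION:-}" ]; then
    eval "$(starship init zsh)"    # running under zsh
  elif [ -n "${BASH_VERSION:-}" ]; then
    eval "$(starship init bash)"   # running under bash
  fi
fi
```

Open a new terminal afterwards for the prompt to change.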

Once Starship is installed, you'll get a prompt that looks something like mine:

data-science-bootstrap-notes on second-edition-rewrite [$!?] is 📦 v1.0.0 via 🐍 v3.12.4
❯

At a glance, it tells me that:

  • the Python on my PATH is version 3.12.4 (🐍 v3.12.4),
  • that this book is currently at version 1.0 (📦 v1.0.0),
  • and that I'm on the branch second-edition-rewrite
  • with uncommitted changes ([$!?])
  • on this book's cloned repository (data-science-bootstrap-notes).

Configure environment variables

If you're not sure what environment variables are, I have an essay on them that you can reference. Mastering environment variables is crucial for data scientists!

Your shell environment, whether it is zsh or bash or fish or something else, is supremely important. It determines the runtime environment, which in turn determines which Python you're using, whether you have proxies set correctly, and more. Rather than leave this to chance, I would recommend instead gaining full control over your environment variables.

The simplest way is to set them explicitly in your shell initialization script. For bash shells, it's either .bashrc or .bash_profile. For the Z shell, it'll be the .zshrc file. In there, step by step, set the environment variables that you need system-wide.

For example, explicitly set your PATH environment variable with explainers that tell future you why you ordered the PATH in a certain way.

# Start with an explicit minimal PATH
export PATH=/bin:/usr/bin:/usr/local/bin

# Add in my custom binaries that I want available across projects
export PATH=$HOME/bin:$PATH

# Add in pixi installation path
export PATH=$HOME/.pixi/bin:$PATH

# Add more stuff below to your heart's content...

If you want your shell initialization script to be cleaner, you can refactor it out into a second bash script called env_vars.sh, which lives either inside your home directory or your dotfiles repository[1]. Then, source the env_vars.sh script from the shell initialization script:

source ~/env_vars.sh
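If the file might be missing on some machines (a fresh laptop, say, before your dotfiles are cloned), a guarded version avoids an error at every shell startup. This is a small sketch, and the env_vars_file variable name is just for illustration:

```shell
# Path to the refactored environment variable script.
env_vars_file="$HOME/env_vars.sh"

# Source it only if it exists and is readable.
# `.` is the portable spelling of `source`.
if [ -r "$env_vars_file" ]; then
  . "$env_vars_file"
fi
```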

Other programs, such as the pixi installer, may offer to modify your shell initialization script for you. If you accept, keep in mind that your script now contains lines that you did not write yourself. You can always print the final state of your environment variables to help you debug:

env

And the most important one to look out for is the PATH variable:

data-science-bootstrap-notes on second-edition-rewrite [$!] is 📦 v1.0.0 via 🐍 v3.12.4 on ☁️  ericmajinglong@gmail.com
❯ echo $PATH | tr ':' '\n'
/opt/homebrew/bin
/Users/ericmjl/bin

/Applications/quarto/bin
/opt/homebrew/bin
/opt/homebrew/sbin
/usr/local/bin
/System/Cryptexes/App/usr/bin
/usr/bin
/bin
/usr/sbin
/sbin

This will give you a list of all the directories in your PATH variable, in order of priority.
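To see why the order matters, here is a small self-contained sketch. The directory names and the greet command are made up for illustration; the point is that when two directories both contain a command with the same name, the directory listed earlier in PATH wins.

```shell
# Create two throwaway directories, each with its own `greet` command.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/first" "$demo_dir/second"
printf '#!/bin/sh\necho "from first"\n' > "$demo_dir/first/greet"
printf '#!/bin/sh\necho "from second"\n' > "$demo_dir/second/greet"
chmod +x "$demo_dir/first/greet" "$demo_dir/second/greet"

# Same command name, different PATH order, different result.
first_wins="$(PATH="$demo_dir/first:$demo_dir/second:$PATH" greet)"
second_wins="$(PATH="$demo_dir/second:$demo_dir/first:$PATH" greet)"
echo "$first_wins"    # prints: from first
echo "$second_wins"   # prints: from second

# Clean up the throwaway directories.
rm -rf "$demo_dir"
```

In day-to-day use, `command -v python` tells you which python your current PATH resolves to.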

Global vs. Project-Specific Environment Variables

Your shell environment variables are considered "global" environment variables that are set across every terminal session that you're logged into. On the other hand, there are "project-specific" environment variables which are set on a per-project basis. These should be set within a project-specific .env file that lives within a code repository.
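As a rough sketch of what this can look like, the snippet below writes a tiny .env-style file and loads it into the current shell session. The variable names and values are hypothetical, the temporary file stands in for the .env at the root of your repository, and the approach assumes the file contains only simple KEY=value lines and comments.

```shell
# Stand-in for a project's .env file (hypothetical contents).
env_file="$(mktemp)"
cat > "$env_file" <<'EOF'
# Project-specific settings
PROJECT_NAME=demo
DATA_DIR=/tmp/demo-data
EOF

set -a            # auto-export every variable assigned from here on
. "$env_file"     # read the KEY=value lines into this shell
set +a            # stop auto-exporting

echo "$PROJECT_NAME"   # prints: demo
rm -f "$env_file"
```

Because of `set -a`, the variables are exported, so programs launched from this shell (such as a Python script) can see them too.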

Create shell aliases

Shell aliases can save you keystrokes, which in turn saves you time. The time saved compounds like interest over long time horizons!

Shell aliases are easy to create. In your shell initializer script, use the following syntax. As an example, here ls is aliased to exa, with a configuration flag at the end:

alias ls="exa --long"

Now, typing ls at the shell will instead execute exa --long! (exa is one of the pieces of system-level software that I recommend installing.)
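When an alias shadows a real command like this, it helps to know how to inspect it and how to get the original back. A short sketch using only shell built-ins:

```shell
alias ls="exa --long"   # define the alias (same as above)

alias ls                # show what ls currently expands to
command ls              # run the real ls once, ignoring the alias
unalias ls              # remove the alias for the rest of this session
```

The `command` built-in is handy in scripts or one-off situations where you want the unmodified behaviour without giving up the alias permanently.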

In order for these shell aliases to take effect each time you open up your shell, ensure that they are declared in, or sourced from, your shell initialization script, such as ~/.bashrc or ~/.zshrc. You have two options:

  1. These aliases can be declared in your .zshrc or .bashrc (or analogous) file, or
  2. They can be declared in ~/.aliases, which you source inside your shell initialization script file (i.e. .zshrc/.bashrc/etc.)

The latter is done using:

# put this line in your ~/.bashrc
source /path/to/.aliases

And the contents of your .aliases file are exactly what I showed above.

Of the two options above, I recommend the second, because it puts into practice the philosophy of keeping each category of things in one clear place.

In my dotfiles repository, I have a .shell_aliases directory which contains a full suite of aliases that I have installed. Some of my most commonly used ones are for git commands, such as:

# either your ~/.bashrc or ~/.aliases file that gets sourced in ~/.bashrc or ~/.zshrc
alias gc="git commit"
alias ga="git add"
alias gs="git status"
alias gk="git checkout"
alias gm="git merge"
alias gpl="git pull"
alias gps="git push"
alias gacp="git add . && git commit && git push"

Other external links that showcase shell aliases that could serve as inspiration for your personal collection include:

And finally, to top it off, Twitter user @ctrlshifti suggests aliasing please to sudo for a pleasant experience at the terminal:

alias please="sudo"

# Now you type:
# please apt-get update
# please apt-get upgrade
# etc...

Troubleshooting

Starship prompt not showing

If you still see % or $ after installation:

  1. Check your shell configuration:

    echo $SHELL
    
    If it shows /bin/zsh, make sure you added the Starship init to ~/.zshrc

  2. Verify the init line is correct:

    grep -n "starship" ~/.zshrc
    
    Should show: eval "$(starship init zsh)"

  3. Restart your terminal completely (don't just source the file)

  4. Check if Starship is in your PATH:

    which starship
    
    Should show something like /usr/local/bin/starship or /home/user/.local/bin/starship

Environment variables not working

If your environment variables aren't being set:

  1. Check which shell you are actually running, since that determines which config file gets read:

    echo $0
    
    Should show your shell (e.g., -zsh or -bash)

  2. Verify your config file exists:

    ls -la ~/.zshrc  # or ~/.bashrc
    

  3. Test sourcing manually:

    source ~/.zshrc  # or ~/.bashrc
    

PATH issues

If commands aren't found:

  1. Check your current PATH:

    echo $PATH | tr ':' '\n'
    

  2. Verify the directory exists:

    ls -la $HOME/.pixi/bin
    

  3. Add to PATH manually to test:

    export PATH=$HOME/.pixi/bin:$PATH
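One side effect of testing this way, or of re-sourcing your config file, is that the same directory can end up in PATH more than once. Below is a minimal sketch of a helper that avoids this. The path_prepend function is a hypothetical name of my own choosing, not something that ships with your shell:

```shell
# Hypothetical helper: prepend a directory to PATH only if it exists
# and is not already listed.
path_prepend() {
  [ -d "$1" ] || return 0          # skip directories that don't exist
  case ":$PATH:" in
    *":$1:"*) ;;                   # already present, do nothing
    *) PATH="$1:$PATH" ;;          # otherwise put it at the front
  esac
}

path_prepend "$HOME/.pixi/bin"
path_prepend "$HOME/.pixi/bin"     # second call is a no-op
export PATH
```

Wrapping PATH in colons before matching means a directory is only treated as present on an exact match, not when it is merely a prefix of another entry.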
    

Still not working?

Try the manual installation method from the Starship documentation.

Quick Reference

Essential Commands

# Check your shell
echo $SHELL

# Check your PATH
echo $PATH | tr ':' '\n'

# Reload shell configuration
source ~/.zshrc  # or ~/.bashrc

# Check if a command exists
which command_name

# Look up a specific environment variable
env | grep VARIABLE_NAME

Common Aliases to Add

# Git shortcuts
alias gc="git commit"
alias ga="git add"
alias gs="git status"
alias gk="git checkout"

# Navigation shortcuts
alias ..="cd .."
alias ...="cd ../.."
alias ll="ls -la"

# System shortcuts
alias please="sudo"
alias ports="lsof -i -P -n | grep LISTEN"

  1. dotfiles are named as such because they begin with the . character. Examples of these are .bashrc, .zshrc, and other shell configuration files. By convention, they serve the role of a configuration file that configures the behaviour of a program.