Create shell command aliases for your commonly used commands
Shell aliases save you keystrokes, which saves time. That saved time compounds like interest over long time horizons!
Shell aliases are easy to create. In your shell initializer script, use the following syntax, shown here with ls being aliased to exa with configuration flags at the end as an example:
alias ls="exa --long"
Now, typing ls at the shell will instead execute exa --long! (To find out what exa is, see Install a suite of really cool utilities on your machine using homebrew.)
In order for these shell aliases to take effect each time you open up your shell, you should ensure that they get sourced in your shell initialization script (see: Take full control of your shell environment variables for more information). You have two options:
1. Declare them directly in your .zshrc or .bashrc (or analogous) file, or
2. Declare them in a separate file, such as ~/.aliases, which you source inside your shell initialization script file (i.e. .zshrc/.bashrc/etc.)
I recommend the second option, as doing so means you'll be putting into practice the philosophy of having clear categories of things in one place.
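As a minimal sketch of the second option, the shell initialization script can source the aliases file only when it exists. The aliases_file variable name is an assumption made for illustration; the ~/.aliases path is the one mentioned above.

```shell
# Goes in ~/.zshrc or ~/.bashrc.
# aliases_file is an illustrative variable name, not a shell built-in.
aliases_file="$HOME/.aliases"

# Source the file only if it exists, so a machine without it still
# starts a working shell.
if [ -f "$aliases_file" ]; then
    . "$aliases_file"
fi
```

The dot command (.) is the portable spelling of source, so the same lines should work in both bash and zsh.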
In my dotfiles repository, I have a .shell_aliases directory which contains a full suite of aliases that I have installed.
Other external links that showcase shell aliases that could serve as inspiration for your personal collection include:
And finally, to top it off, Twitter user @ctrlshifti suggests aliasing please to sudo for a pleasant experience at the terminal:
alias please="sudo"
# Now you type:
# please apt-get update
# please apt-get upgrade
# etc...
Leverage dotfiles to get your machine configured quickly
Your dotfiles control the baseline of your computing environment. Creating a dotfiles repository lets you version control them, back them up on a hosted version control site (like GitHub or Bitbucket), and quickly deploy them to a new system.
What goes into the repository is really up to you, but you want to make sure that you capture all of the .some_file_extension files stored in your home directory that are also important for your shell runtime environment.
For example, you might want to include your .zshrc or your .bashrc files, i.e. the shell initialization scripts.
You might also want to refactor out some pieces from the .zshrc and put them into separate files that get sourced inside it. For example, I have two: one for the PATH environment variable named .path (see: Take full control of your shell environment variables) and one for aliases named .aliases (see: Create shell command aliases for your commonly used commands). I source these files in the .zshrc file, so everything defined in .path and .aliases is available to me.
You can also create an install.sh script that, when executed at the shell, either symlinks all the files from the dotfiles directory into the home directory or copies them. (I usually opt to symlink because I can apply updates more easily.) The install.sh script can be as simple as:
cp .zshrc $HOME/.zshrc
cp .path $HOME/.path
cp .aliases $HOME/.aliases
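For the symlink route mentioned above, a minimal sketch could look like the following. The link_dotfiles function name and its two arguments are assumptions made for illustration; the three file names are the ones from the cp example.

```shell
# link_dotfiles SRC_DIR DEST_DIR
# Symlinks the dotfiles found in SRC_DIR into DEST_DIR.
# SRC_DIR should be an absolute path so the links resolve from anywhere.
link_dotfiles() {
    src_dir="$1"
    dest_dir="$2"
    for name in .zshrc .path .aliases; do
        # Skip files that are not present in the repository.
        if [ -e "$src_dir/$name" ]; then
            # -s: create a symbolic link
            # -f: replace an existing file at the destination (back it up first!)
            ln -sf "$src_dir/$name" "$dest_dir/$name"
        fi
    done
}

# Typical call from inside the dotfiles repository:
# link_dotfiles "$PWD" "$HOME"
```

Because the home directory then points at the files in the repository, pulling new changes into the repository updates the live configuration without re-running the script.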
Everything outlined above forms the basis of your bootstrap for a new computer, which I alluded to in Automate the bootstrapping of your new computer.
If you want to see a few examples of dotfiles in action, check out the following repositories and pages:
From the official "dotfiles" GitHub pages:
My own dotfiles: ericmjl/dotfiles which are inspired by mathiasbynens/dotfiles
Take full control of your shell environment variables
If you're not sure what environment variables are, I have an essay on them that you can reference. Mastering environment variables is crucial for data scientists!
Your shell environment, whether it is zsh or bash or fish or something else, is supremely important. It determines the runtime environment, which in turn determines which Python you're using, whether you have proxies set correctly, and more. Rather than leave this to chance, I would recommend instead gaining full control over your environment variables.
The simplest way is to set them explicitly in your shell initialization script. For bash shells, it's either .bashrc or .bash_profile. For the Z shell, it'll be the .zshrc file. In there, step by step, set the environment variables that you need system-wide.
For example, explicitly set your PATH environment variable with explainers that tell future you why you ordered the PATH in a certain way.
# Start with an explicit minimal PATH
export PATH=/bin:/usr/bin:/usr/local/bin
# Add in my custom binaries that I want available across projects
export PATH=$HOME/bin:$PATH
# Add in anaconda installation path
export PATH=$HOME/anaconda/bin:$PATH
# Add more stuff below...
If you want your shell initialization script to be cleaner, you can refactor it out into a second bash script called env_vars.sh, which lives either inside your home directory or your dotfiles repository (see: Leverage dotfiles to get your machine configured quickly). Then, source the env_vars.sh script from the shell initialization script:
source ~/env_vars.sh
Other tools, like the Anaconda installer, may offer to modify your shell initializer script for you. If you accept, keep in mind that their changes now sit alongside the settings you wrote yourself. At the end of your shell initializer script, you can echo the final state of environment variables to help you debug.
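One possible way to echo the final state, sketched here for PATH only, is to print each entry on its own line so that the lookup order is easy to read. The print_path_entries function name is hypothetical.

```shell
# print_path_entries [VALUE]
# Prints each colon-separated entry of VALUE (default: the current PATH)
# on its own line, in lookup order.
print_path_entries() {
    printf '%s\n' "${1:-$PATH}" | tr ':' '\n'
}

# Place this call at the very end of the shell initialization script.
print_path_entries
```

If another tool has prepended or appended its own directories, they will show up in this listing at the position where the shell will actually search them.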
Environment variables that need to be set on a per-project basis are handled slightly differently. See Create runtime environment variable configuration files for each of your projects.
Use Mamba as a faster drop-in replacement for conda
Mamba is a project originally developed by the QuantStack team. They solved some of the annoyances with the conda package manager, specifically the problem of how long it takes to solve an environment specification.
Mamba is available on conda-forge and PyPI. Follow the instructions on the mamba repo to install it.
If you have muscle memory and want to make the switch from conda to mamba as easy as possible, you can use a shell alias inside your sourced .aliases file:
alias conda="mamba"
See the page Create shell command aliases for your commonly used commands for more information on shell aliases.
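If the same .aliases file is shared across machines where mamba may not be installed, one option is to define the alias conditionally. This is a sketch that relies only on the standard command -v check; it is not taken from the mamba documentation.

```shell
# Alias conda to mamba only when mamba can be found on the PATH,
# so that conda keeps working on machines without mamba.
if command -v mamba >/dev/null 2>&1; then
    alias conda="mamba"
fi
```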
Use bash tricks to help save keystrokes and time
There are some bash tricks that can be incredibly helpful. Here's a collection of those that I have encountered.
|| for fallback commands
An example:
source my_env/bin/activate || conda activate my_env || source activate my_env
Where did this come up? In my continuous integration pipelines, I try to maintain the same syntax between pipelines (e.g. GitHub Actions and Azure Pipelines). However, as of 2020, Azure Pipelines doesn't play well with conda activate, and requires that I use source activate. As such, in order to use the same bash scripts that need to activate an environment, I used the bash || syntax to create a fallback command for the conda activate command. If the conda activate command fails, the source activate command will be executed.
The commands are executed in order from left to right. One neat thing is that the chain stops and returns an exit code of 0, which by long-standing shell convention signifies "success", as soon as one of the commands succeeds. If all three fail, there will be a non-zero exit code, which, depending on how your script or CI system is configured, should terminate further execution.
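To see the left-to-right behavior in isolation, here is a small sketch that uses hypothetical stand-in functions (always_fails, second_choice, third_choice) in place of the real activation commands.

```shell
# Hypothetical stand-ins for the activation commands above.
always_fails()  { return 1; }
second_choice() { echo "second_choice ran"; }
third_choice()  { echo "third_choice ran"; }

# always_fails returns non-zero, so second_choice runs; because it
# succeeds, third_choice is skipped.
always_fails || second_choice || third_choice
echo "exit code: $?"

# When every command fails, the chain's exit code is non-zero.
# Capturing it with a final || keeps this demo from aborting under set -e.
status=0
always_fails || always_fails || status=$?
echo "exit code when all fail: $status"
```

Only second_choice prints its message, and the first chain reports an exit code of 0, while the second chain reports 1.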
Configure your machine
After getting access to your development machine, you'll want to configure it and take full control over how it works. Backing the following steps are a core set of ideas:
Head over to the following pages to see how you can get things going.
Install a suite of really cool utilities on your machine using homebrew
Install gcc if you want to have the GNU C compiler available on your Mac at the same time as the clang C compiler.
C compilers come in handy for some numerical computing packages, which multiple data science languages (Julia, Python, R) depend on.
If you've ever been disconnected from SSH because of a flaky internet connection, mosh can be your saviour. Check out the tool's homepage.
tmux is a tool for multiplexing your shell sessions -- uber handy if you want to persist a shell session on a remote server even after you disconnect. If you habitually create a new shell session for every project, then tmux might be able to help you get things under control. Check out the tool's homepage.
The tree command line tool allows you to see the file tree at the terminal. If you pair it with exa, you will have an upgraded file tree experience. See its homepage.
exa is next-level ls (which is used to list files in a directory). The website describes it as "A modern replacement for ls". See the homepage. If you alias ls to exa, it's next-level convenience! (see Create shell command aliases for your commonly used commands)
ripgrep provides a command line tool rg, which recursively scans down the file tree from the current directory for files that contain text that you want to search. Its GitHub repo should reveal all of its secrets.
diff-so-fancy gives you a tool for viewing differences between files, aka "diffs". Check out its GitHub repo for more information. You can also configure git to use diff-so-fancy to render diffs at the terminal. (see: Install and configure git on your machine)
bat is next-level cat, which is a utility for viewing text files in the terminal. Check out the GitHub repository for what you get. You can alias cat to bat, and in that way, not need to deviate from muscle memory to use bat.
fd gives you a faster replacement for the shell tool find, which you can use to find files by name. Check out the GitHub repository to learn more about it.
On recommendation from my colleague Arkadij Kummer, grab fzf to have extremely fast fuzzy text search on the filesystem. Check out the project's GitHub repository!
Use croc as a tool to move data from one machine to another easily in a secure fashion. (I have used this in lieu of commercial utilities that cost tens of thousands of dollars in license fees.) Check out the project's GitHub repository!
Now that you've read about these utilities' reason for existence, go ahead and install them!
brew install \
git gcc tmux wget mobile-shell \
diff-so-fancy ripgrep bat fd fzf croc
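After the install finishes, a quick check along the following lines can confirm that the executables are on the PATH. The report_missing_tools function name is hypothetical, and the command names in the example call are assumptions about what each package installs (for example, ripgrep provides rg, as noted above).

```shell
# report_missing_tools NAME...
# Prints the name of every command that cannot be found on the PATH.
report_missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Example call; adjust the names to match what you installed.
report_missing_tools git gcc tmux wget mosh diff-so-fancy rg bat fd fzf croc
```

No output means every listed command was found; any name that is printed is worth reinstalling or checking against your PATH ordering.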