What is differential computing
At its core, I think of differential computing as "computing where derivatives (from calculus) are first-class citizens". This is analogous to probabilistic computing, in which probabilistic constructs (such as probability distributions and MCMC samplers) are given first-class status in a language. By "first-class" status, I mean that the relevant computing constructs are a well-developed category of constructs in the language, with clearly defined interfaces.
In differential computing, being able to evaluate gradients of a math function is at the heart of the language. (How the gradients are used is a matter of application.) The objective of a differential computing system is to get a computer to automatically evaluate gradients of a math function that we have written in that language. This is where automatic differentiation (AD) systems come into play. They take a program, perhaps written as a Python function, and automatically transform it into another program, perhaps another Python function, which can be used to evaluate the derivative of the original function.
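To make that concrete, here is a minimal sketch using JAX's `grad` transform (the toy function `f` is mine, purely for illustration): `grad` takes a Python function and hands back another plain Python function that evaluates the derivative.

```python
from jax import grad

def f(x):
    # A toy function: f(x) = x^2 + 3x, whose derivative is 2x + 3.
    return x ** 2 + 3 * x

# `grad` transforms the function f into a new function that
# evaluates df/dx at a given point.
df = grad(f)

print(df(2.0))  # 7.0, i.e. 2*2 + 3
```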
All of this falls under the paradigm of AD systems, as I mentioned earlier. Symbolic differentiation can be considered one subclass of AD systems; this is a point PyMC3 developer Brandon Willard would make. Packages such as autograd, and now JAX (also by the autograd authors), are another subclass of AD systems, which leverage the chain rule and a tape recording the math operations called in the program to automatically construct the gradient function.
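As a rough illustration of how the chain rule enters the picture, the sketch below (the function names `inner` and `outer` are hypothetical, chosen just for this example) composes two functions and compares the automatically constructed gradient against the hand-derived chain-rule result.

```python
import jax.numpy as jnp
from jax import grad

def inner(x):
    return jnp.sin(x)

def outer(x):
    # Composition: outer(x) = sin(x)^2
    return inner(x) ** 2

# JAX traces the operations called inside `outer` and applies the
# chain rule to build the gradient function.
douter = grad(outer)

x = 1.5
print(douter(x))                     # automatically derived gradient
print(2 * jnp.sin(x) * jnp.cos(x))   # hand-derived: d/dx sin(x)^2
```

The two printed values should match (up to floating-point precision), which is the whole point: the gradient function is constructed for us from the operations the program actually performed.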
Notes on differential computing
This is an overview page of my notes on differential computing and JAX.
Contents:
Stuff I've built/done with JAX: