written by Eric J. Ma on 2020-12-18 | tags: math graph theory category theory programming

Just as I began vacation, my brain started going into overdrive. (This is a quirk of my head.) One of the epiphanies I arrived at involved graphs, programming languages, and categories.

Multiple dispatch is a programming construct. It essentially allows us to write functions that:

- Share the same name, but
- Exhibit different behaviour based on the
*types*of the arguments passed in.

This idea is baked into the Julia programming language
and is part of what makes Julia so composable.
Python programmers don't have this luxury,
partially because multiple dispatch isn't part of the language design
and partially because of namespaces.
That said, Matt Rocklin and a few others
maintain the `multipledispatch`

Python package,
which implements multiple dispatch in a pretty sane fashion.
To borrow the example from the package,
we might design a function called `add`

,
but need it to operate differently
depending on whether we put in integers or Python objects:

@dispatch(int, int) def add(a, b): """Adds two numbers together.""" return a + b @dispatch(object, object) def add(a, b): """Performs a concatenation between two strings.""" return f"{a}{b}"

Is this a good idea? I used to wonder when I'd know enough to answer the question. I think I do now. But first, we have to take a detour to graphs.

Graphs are a data structure that allows us to model entities and their relationships. Entities are nodes, while relationships are edges. I usually teach a tutorial at PyCon, SciPy and ODSC titled "Network Analysis Made Simple", in which we use NetworkX as a tool to learn applied graph theory. It's been running since 2014, and I've learned a lot, made a lot of friends, and had new opportunities spring up through teaching it.

As a data model, graphs are handy for reasoning
because for our logic to make sense,
we must be extremely clear on our definitions.
What exactly constitutes a node?
What exactly constitutes a relationship?
What connections are allowed to exist between nodes
(or, surprise! *categories* of nodes)?
These answers must be precise.

This then brings us to categories.

I think it was from Eric Schles that I learned that there was this entire field of mathematics called "category theory". (This was likely ODSC 2017, while both of us were chatting in the speaker's lounge.) Also, I started reading the blog of Tai Danae-Bradley, who trained as a category theorist at NYU and even has a book on it. From one of her excellent blog posts, I learned the core of category theory is about: "...collections of objects that can relate to each other via morphisms..."

Sounds familiar... it sounds like a graph!

If you stepped back one moment and looked carefully at the `add`

example,
you'll notice that we have a microcosm of category theory
being brought to life in a programming language.

There's a category of objects called *objects*,
and there's a (sub-)category of objects called *integers*.
There are relations between integers called *add*,
and a similar relation between objects, also called *add*.
Both have a relationship named *add*,
but each *add* differs because the "integers" case
is a particular case that needs to be handled differently from the "objects" case.

You could construct a graph to represent it all, one that looks like this:

(int, int) --add--> int (obj, obj) --add--> str

If we think about it carefully,
because integers are a sub-category of objects,
the second case `(obj, obj)`

covers the child cross-type cases.

So what gives, what's the epiphany?

*Multiple dispatch works because linguistically,
we might need to reuse the same category of relations,
as indicated by their shared name,
for different object categories,
for no other reason than doing so
is the semantically correct thing to do for our problem.*
And doing so allows us to leverage composition
to write cleaner and interoperable programs.

```
@article{
ericmjl-2020-graphs-categories,
author = {Eric J. Ma},
title = {Graphs, language, and categories},
year = {2020},
month = {12},
day = {18},
howpublished = {\url{https://ericmjl.github.io}},
journal = {Eric J. Ma's Blog},
url = {https://ericmjl.github.io/blog/2020/12/18/graphs-language-and-categories},
}
```

*I send out a newsletter with tips and tools
for data scientists. Come check it out at
Substack.*

*If you would like to sponsor the coffee that goes into making my posts,
please consider
GitHub Sponsors!*

*Finally, I do free 30-minute GenAI strategy calls for teams
that are looking to leverage GenAI for maximum impact. Consider
booking a call on Calendly
if you're interested!*