written by Eric J. Ma on 2021-01-24 | tags: programming coding
I finally grokked an old thing!
The thing I've been trying to understand properly is "how and when to do single/multiple dispatching". It's an idea that is heavily baked into the Julia programming language, and a pattern that I rarely see used in Python. But after reading Michael Chow's blog post on single dispatching, I finally realized how dispatching gets used. Let me explain with an example.
Let's say you have a function that can process one column or multiple columns of a dataframe. As a first pass, you might write it as follows:
import pandas as pd
from typing import Union, List, Tuple


def my_func(df: pd.DataFrame, column_names: Union[List, Tuple, str]):
    if isinstance(column_names, (list, tuple)):
        for column in column_names:
            # something happens.
            pass
    if isinstance(column_names, str):
        # something happens.
        pass
    return df
Here are the problems I see:
[...] my_func is. It definitely goes against the Zen of Python's recommendation that "flat is better than nested".
Here's the same function written instead as two functions that share the same name, leveraging the multipledispatch package by Matt Rocklin:
from multipledispatch import dispatch
import pandas as pd


@dispatch(pd.DataFrame, str)
def my_func(df, column_names):
    # do str-specific behaviour
    return df


# A tuple of types in the signature means "any of these types".
@dispatch(pd.DataFrame, (list, tuple))
def my_func(df, column_names):
    for column in column_names:
        # do stuff
        pass
    return df
Notice how now each function is:
[...] dispatch call.
The only downside I can think of to the "dispatch" programming pattern is that one must know the concept of "dispatching" before one can understand the @dispatch syntax. Someone who is unfamiliar with this concept might also be thrown off by seeing the same function declared twice. That said, I think the benefits here are immense!
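For anyone who wants to try the pattern without installing anything, here is a minimal sketch of the same idea using functools.singledispatch from the standard library. Because singledispatch only looks at the type of the first argument, this sketch puts column_names first. The function name upper_case_columns and the dict-of-lists stand-in for a dataframe are assumptions made purely for illustration; they are not part of pandas or multipledispatch.

```python
from functools import singledispatch


@singledispatch
def upper_case_columns(column_names, table):
    """Fallback: runs when no implementation is registered for the type."""
    raise TypeError(f"Unsupported type: {type(column_names).__name__}")


@upper_case_columns.register(str)
def _(column_names, table):
    # Single column name: return a copy of the table with that key upper-cased.
    return {
        (key.upper() if key == column_names else key): values
        for key, values in table.items()
    }


@upper_case_columns.register(list)
@upper_case_columns.register(tuple)
def _(column_names, table):
    # Several column names: delegate each one to the str implementation.
    for name in column_names:
        table = upper_case_columns(name, table)
    return table


# Hypothetical stand-in for a dataframe: a dict mapping column name to values.
table = {"a": [1], "b": [2], "c": [3]}
single = upper_case_columns("a", table)
several = upper_case_columns(["a", "b"], table)
```

Each implementation stays flat and handles exactly one kind of input, and an unsupported type (say, an int) falls through to the base function and raises a TypeError instead of silently doing nothing.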
@article{
ericmjl-2021-dispatch-types,
author = {Eric J. Ma},
title = {Dispatch rather than check types},
year = {2021},
month = {01},
day = {24},
howpublished = {\url{https://ericmjl.github.io}},
journal = {Eric J. Ma's Blog},
url = {https://ericmjl.github.io/blog/2021/1/24/dispatch-rather-than-check-types},
}