2020 10-October
Content to feature:
Recently at work, I've been building some bespoke machine learning models (autoregressive hidden Markov models and graph neural networks) for scientific problems that we encounter. Because we aren't using standard reference libraries, we have to write the model code from scratch. Since it's software, it needs tests, and Jeremy Jordan has a great blog post on how to effectively test ML systems. Definitely worth a read, in my opinion.
In his Medium article, Gonzalo Ferreiro Volpi shares some fundamental software skills for data scientists. For those of you who want to invest in levelling up your code-writing skills to reap multiplicative dividends in time saved, frustrations avoided, and happiness, come check it out.
In her blog post, Shreya Shankar has some extremely valuable insights into the practice of making ML useful in the real world, which I absolutely agree with. One in particular is captured in this quote:
Outside of ML classes and research, I learned that often the most reliable way to get performance improvements is to find another piece of data which gives insight into a completely new aspect of the problem, rather than to add a tweak to the loss. Whenever model performance is bad, we (scientists and practitioners) shouldn’t only resort to investigating model architecture and parameters. We should also be thinking about “culprits” of bad performance in the data.
I hope that little teaser gives you enough impetus to read it. :)
This article is topical and relevant, and I appreciated the illustrations in it. It also highlights a really powerful model, where powerful doesn't mean millions of parameters, but rather conceptually simple, easy to communicate, broadly applicable, and intensely relevant for the times. Aatish Bhatia has done a wonderful job with this explanation. It's a technical masterpiece.
Finally, some more humour from the ever-on-fire Kareem Carr :).
I have a deep learning joke but it has a lot of layers to it. https://t.co/puRs6lqCUY
— 🔥Kareem Carr🔥 (@kareem_carr) July 24, 2020
Data Science Programming Newsletter MOC
With the Data Science Programming newsletter, I'm trying to share ideas on how to make