Software Skills

Because our day-to-day work involves writing code, I am convinced that we data scientists need to be equipped with basic software engineering skills. These skills will help us write code that is, in the long run, easy to recap, remember, reference, review, and rewrite.

In this collection of short essays, I will highlight the basic software skills that, once mastered, will increase our efficiency and effectiveness in the long run.

Common Objections

If you have heard these suggestions before, then you might have also heard some of the common objections to learning these software practices. I wish to address them here in bulk, so I do not have to address them in depth in the individual essays.

I don't have enough time

This objection is one I am sympathetic to, as I operate under time constraints myself.

This is the nature of code: it is written once and used many times. Hence, the best response I can give is that the time saved by cutting corners now is repaid in multiples of the time that others (including your future self) waste navigating an undocumented, poorly structured, spaghetti-code codebase. Skipping these software practices now makes the code much more difficult to maintain and improve once it goes into production.

My code is only going to be written and read by myself

At some point, though, there is a high probability that you will end up writing code that someone else has to read and use. That someone else is usually your future self, but it may also be teammates who need to cover for you when you're out. (This applies especially if you're on a data science team.) The time invested in making code read well now, even code that does not have to be read by others, will reduce the learning-curve pain when you eventually do have to write code for others. You might as well practice your software skills now, while there's less formal scrutiny. When the stakes are higher, being ready can only be helpful.

I don't know how to get started; there are so many places to begin

Pick any one skill, say, refactoring, and work on it first. You can always add more skills to your toolkit as you go along.
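To make "refactoring" a little more concrete, here is a minimal sketch of what a first attempt might look like. The data and the function names (`parse_measurements`, `mean`) are hypothetical and chosen only for illustration; the point is that behavior stays the same while the code gains names and documentation.

```python
# Before: one block that mixes cleaning and summarizing.
raw = [" 3.5", "4.0 ", "n/a", "5.5"]  # made-up example data
values = []
for item in raw:
    item = item.strip()
    if item != "n/a":
        values.append(float(item))
mean_before = sum(values) / len(values)


# After: the same logic split into small, named, documented functions.
def parse_measurements(raw_items, missing_marker="n/a"):
    """Strip whitespace and convert entries to floats, skipping missing values."""
    cleaned = (item.strip() for item in raw_items)
    return [float(item) for item in cleaned if item != missing_marker]


def mean(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not numbers:
        raise ValueError("mean() requires at least one number")
    return sum(numbers) / len(numbers)


mean_after = mean(parse_measurements(raw))

# A refactor should not change behavior, so both versions must agree.
assert mean_after == mean_before
```

The refactored version is longer, but each piece can now be read, tested, and reused on its own, which is where the long-run savings come from.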