🌟 How to foster an open source culture in your data science team

ericmjl.github.io/foster-open-source-culture

💁🏻‍♂️ About Me

Sr. Principal Data Scientist, DSAI (Research), Moderna

  • Moderna: Here to make medicines.
  • Research: Biology and chemistry nerd
  • DSAI: Computer nerd
  • Data Scientist: Data nerd
  • Sr. Principal: Person who has team lead responsibilities.

💁🏻‍♂️ About Me (Take 2)

Open Source Contributor, Developer, and Educator

I move through projects as my needs and interests evolve.

❓ Today’s question

Q: How do we foster a culture of open source within our data science team?

  1. Import open source ways of working into the team.
  2. Articulate value to multiple stakeholders in a way that is aligned with their interests.

💡 Idea 1: Import OSS culture

🧐 Open source was formative for my thinking

  • SciPy 2015, Matplotlib Sprints.
  • Task: MEP12 examples gallery

1️⃣ Me, the newbie, was given simple tasks…

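# Before: star import pulls everything into the global namespace
from pylab import *

x = linspace(0, 3, 1000)
y = 0.32 * x + 35

plot(x, y)

# After: explicit imports and namespaced calls
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 3, 1000)  # 1000 evenly spaced points on [0, 3]
y = 0.32 * x + 35

plt.plot(x, y)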

The training steps, and repetition, helped me grow my skillset.

2️⃣ …88 PRs later…

3️⃣ …I knew that software skills mattered in data science…

4️⃣ …and imported ways of working to my work and home team…

Everything is viewable, clone-able, hackable, and contributable, just like the OSS world.

We resist doing a treadmill of one-off projects.

We build high-powered tools for ourselves, laboratory scientists, and computational scientists, just like the OSS world.

We invest here to clarify our thinking about our domain problem.

We cut down on the future frustration of untangling messy code and fragile configuration.

We develop software like the best of the OSS world.

We invest time here to ensure the correctness of our work.

We spend much less time in the future being burned by changes that introduced subtle bugs.
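
A minimal sketch of what this investment can look like, assuming a hypothetical helper `predict_response` built on the same line as the plotting example (0.32 * x + 35). The function name, its parameters, and the pytest-style test names are illustrative assumptions, not code from our codebase:

```python
import math


def predict_response(x: float, slope: float = 0.32, intercept: float = 35.0) -> float:
    """Hypothetical helper: evaluate the straight line slope * x + intercept."""
    return slope * x + intercept


def test_intercept_at_zero():
    # At x = 0 the line must return the intercept exactly.
    assert predict_response(0.0) == 35.0


def test_slope_between_points():
    # Moving one unit along x must change the output by the slope.
    delta = predict_response(2.0) - predict_response(1.0)
    assert math.isclose(delta, 0.32)


if __name__ == "__main__":
    # Run the checks directly; a test runner such as pytest would also collect them.
    test_intercept_at_zero()
    test_slope_between_points()
    print("all checks passed")
```

If a later change silently alters the default slope or intercept, these checks fail right away instead of the bug surfacing downstream.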

We develop software like the best of the OSS world.

OSS is under-funded and under-staffed, so it invests heavily in automation and documentation.

🤖 Automation scales labour. 📖 Documentation scales our brain.

Mirroring sprints, we run quarterly docathons that give us focused time to write high-impact, high-ROI docs.

5️⃣ …creating cultural compatibility with OSS!

I wanted to instill a compatible culture of empowerment and self-directed agency within the Moderna DSAI teams:

  • Collaboration: stewarded and shared ownership of code.
  • Contribution: co-creation of new features.
  • Community: increased investment in ecosystem.

💡 Idea 2: Articulate the value of OSS to your organization

🔄 CBOSS vs. CDOSS

Articulated by Travis Oliphant, these are two different flavours of open source software.

  • CBOSS: Company-Backed Open Source Software
  • CDOSS: Community-Driven Open Source Software

🌏 Diverse 🌍 CBOSS 🌎 motivations

Consulting around open source sustains and grows demand for open source software, which in turn sustains the consulting business.

Community engagement creates a path that leads customers to profitable services.

OSS releases are primarily a marketing tool to attract new talent. An OSS release has no direct impact on core products & services.

Company-Backed Open Source Software must fit strategically within the organization's motivations.

What about goodwill?

  • Giving back is my personal motivation!
  • Can be a deep source of motivation for the individual.
  • However, goodwill is rarely the slam-dunk winning argument.
  • We need to study stakeholder motivations.

🎯 Interested parties have motivations

Party         Interest
Data Science  Building a professional portfolio.
Legal         Protecting company IP.
Management    Minimizing operational disruption.
Executive     Enhancing reputation & value creation.

You may need buy-in from more than one of these parties before you can release open source software.

🗣️ Ways to articulate

Goal: Build professional portfolio.

  • Long-term investment in the company’s reputation for technological competence.
  • Professional development opportunity for individual contributors.

Goal: Protect company IP.

  • Use existing literature to show that the open source release contains no novel IP.
  • Articulate criteria used to evaluate whether contributions from within the company are open source-able or not.

Goal: Minimize operational disruption.

  • Articulate a set schedule for OSS stewardship activities, such as reviewing PRs.
  • Articulate prospective leverage in internal work from external contributions.

Goal: Enhance reputation and value creation.

  • Identify peer group companies that are engaged in similar activities.
  • Articulate prospective leverage in internal work from external contributions.

👉 You can foster an open source culture for your org!

  1. Internal: Import open source ways of working into the team.
  2. External: Articulate value to multiple stakeholders in a way that is aligned with their interests.

⭐️ Thank you!