
SciPy 2015: Computational Statistics II Tutorial

written by Eric J. Ma on 2015-07-07 | tags: scipy conferences python


The final tutorial that I sat in on today was the intermediate computational statistics tutorial. It was led by Chris Fonnesbeck, a professor at Vanderbilt University, a fellow Vancouverite, and one of the maintainers of the PyMC3 package.

Tutorial Content

In this tutorial, Chris covered:

  1. Data cleaning/preparation - using pandas.
  2. Density estimation - using the numpy and scipy packages; mechanics: method of moments and maximum likelihood estimators.
  3. Fitting regression models.
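
To give a flavour of the second topic, here is a minimal sketch of the two estimation approaches applied to a gamma distribution. It is not taken from Chris's notebooks: the simulated data, the choice of a gamma distribution, and all variable names are my own assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Simulated data (an assumption for illustration, not tutorial data).
rng = np.random.default_rng(42)
true_shape, true_scale = 2.0, 3.0
data = rng.gamma(shape=true_shape, scale=true_scale, size=5000)

# Method of moments: for a gamma distribution,
#   mean = shape * scale   and   variance = shape * scale**2,
# so we solve those two equations using the sample mean and variance.
sample_mean = data.mean()
sample_var = data.var()
mom_shape = sample_mean**2 / sample_var
mom_scale = sample_var / sample_mean

# Maximum likelihood: scipy maximizes the likelihood numerically.
# floc=0 fixes the location parameter at zero so that only the
# shape and scale are estimated.
mle_shape, mle_loc, mle_scale = stats.gamma.fit(data, floc=0)

print(f"Method of moments: shape={mom_shape:.2f}, scale={mom_scale:.2f}")
print(f"Maximum likelihood: shape={mle_shape:.2f}, scale={mle_scale:.2f}")
```

Both sets of estimates should land close to the values used to simulate the data. The method of moments only needs summary statistics, while maximum likelihood needs a numerical optimizer but generally makes more efficient use of the data.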

Tutorial Pace

The initial part of the tutorial was heavily pandas-oriented. I think it was useful for the fairly large fraction of the class that was not well-versed in pandas. In my own case, however, I skipped ahead to the second notebook to explore a bit. The time spent on pandas was about 1 hour 45 minutes; we did not get to the second topic until 2:45 pm.

The latter parts were quite useful. I think the mechanics of thinking through statistical modelling problems aren't commonly emphasized in stats classes. As such, as I mentioned in my review of the first tutorial, the mechanics of "how to do stuff" proved to be really helpful.

Overall Thoughts

This was the tutorial that I was particularly anticipating, as I was hoping to learn the mechanics of doing Bayesian statistical analysis in PyMC3. However, the tutorial did not cover that, possibly because the material was already covered last year and recorded (for YouTube posterity). Instead, I was pleasantly surprised by the content that was covered. It definitely expanded my thinking.

Two full days of learning have been quite an intellectual adventure! Many thanks to all of the tutorial leaders for their preparation and hard work; count me as one more person who’s learned lots!

