written by Eric J. Ma on 2018-08-06 | tags: causal inference bayesian data science
Yesterday evening, I had an empty block of time during which I finally did a worked example of finding whether two nodes are "d-separated" in a causal graph. It was pretty instructive to implement the algorithm. It also reminded me yet... (read more)
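For readers who want a feel for what such a check involves, here is a minimal sketch. It is not the implementation from the post: it assumes the standard "moralized ancestral graph" criterion as a stand-in, and the names `ancestors_of` and `d_separated`, along with the edge-list input format, are hypothetical choices made for illustration.

```python
from collections import deque


def ancestors_of(parents, nodes):
    """Return `nodes` plus all of their ancestors (child -> parents mapping)."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def d_separated(edges, xs, ys, zs):
    """Check whether xs and ys are d-separated given zs in a DAG.

    `edges` is an iterable of (parent, child) pairs; xs, ys, zs are assumed
    to be disjoint sets of node names.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)

    # Step 1: keep only the nodes of interest and their ancestors.
    keep = ancestors_of(parents, xs | ys | zs)

    # Step 2: moralize. Link each child to its parents, "marry" every pair
    # of parents sharing a child, and treat all links as undirected.
    neighbours = {node: set() for node in keep}
    for child in keep:
        child_parents = list(parents.get(child, ()))
        for parent in child_parents:
            neighbours[parent].add(child)
            neighbours[child].add(parent)
        for i, first in enumerate(child_parents):
            for second in child_parents[i + 1:]:
                neighbours[first].add(second)
                neighbours[second].add(first)

    # Step 3: search from xs without passing through zs. If any node in ys
    # is reachable, the two sets are not d-separated.
    seen = set(xs)
    queue = deque(xs)
    while queue:
        node = queue.popleft()
        if node in ys:
            return False
        for nxt in neighbours[node]:
            if nxt not in zs and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


chain = [("A", "B"), ("B", "C")]      # A -> B -> C
collider = [("A", "C"), ("B", "C")]   # A -> C <- B
chain_blocked = d_separated(chain, {"A"}, {"C"}, {"B"})
collider_opened = not d_separated(collider, {"A"}, {"B"}, {"C"})
```

In the chain, conditioning on the middle node blocks the path; in the collider, conditioning on the common child opens one, which is the asymmetry that makes d-separation worth working through by hand.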
(71 words, approximately 1 minute reading time)
written by Eric J. Ma on 2018-08-01 | tags: nxviz visualization data science software open source
A new version of nxviz is released!
In this update, I have added a declarative interface for visualizing geographically-constrained graphs. Here, nodes in a graph have their placement constrained by longitude and latitude.
An example... (read more)
(117 words, approximately 1 minute reading time)
written by Eric J. Ma on 2018-07-27 | tags: open source pyjanitor data science
A new release of pyjanitor is out!
Two new features that I have added include:
written by Eric J. Ma on 2018-07-26 | tags: scipy conferences python
It's been about two weeks since SciPy 2018 ended, and I've finally found some breathing room to write about it.
SciPy 2018 is the 4th year I've made it to the conference, my first one being SciPy 2015 (not 2014, as I had originally... (read more)
(576 words, approximately 3 minutes reading time)
written by Eric J. Ma on 2018-07-16 | tags: bayesian statistics data science
Over the past year, having learned about Bayesian inference methods, I finally see how estimation, group comparison, and model checking build upon each other into this really elegant framework for data analysis.
written by Eric J. Ma on 2018-07-14 | tags: statistics visualization data science
I detail why ECDFs are superior to histograms as a way of visualizing distributions. In short, they provide richer information than histograms do. Come learn about them!
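As a quick illustration of the idea, here is a minimal sketch of computing an ECDF in plain Python. The helper names `ecdf` and `fraction_at_or_below` and the sample values are made up for this example and are not taken from the post.

```python
from bisect import bisect_right


def ecdf(data):
    """Return (x, y): the sorted values and the fraction of points <= each."""
    x = sorted(data)
    n = len(x)
    y = [(i + 1) / n for i in range(n)]
    return x, y


def fraction_at_or_below(data, value):
    """Read the ECDF at `value`: the share of observations <= value."""
    x = sorted(data)
    return bisect_right(x, value) / len(x)


sample = [3, 1, 2, 2, 5]  # illustrative data only
x, y = ecdf(sample)
share = fraction_at_or_below(sample, 2)
```

Every observation appears as its own point on the curve, and quantities such as the share of data at or below a threshold can be read off directly, with no bin width to choose.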
Read on... (611 words, approximately 4 minutes reading time)
written by Eric J. Ma on 2018-06-17 | tags: git version control code snippets
I learned a new thing this weekend: we apparently can apply a patch onto a branch/fork using git apply [patchfile].
There are a few things to unpack here. First off, what's a patchfile?
The long story cut short... (read more)
(729 words, approximately 4 minutes reading time)
written by Eric J. Ma on 2018-06-05 | tags: data science machine learning deep learning causal inference graph theory probability
It took reading Judea Pearl's "The Book of Why", and Jonas Peters' mini-course on causality, for me to finally figure out why I had this lingering dissatisfaction with modern machine learning. It's because modern machine learning (deep... (read more)
(662 words, approximately 4 minutes reading time)
written by Eric J. Ma on 2018-05-26 | tags: causal inference
Finally, I have finished Judea Pearl's latest work "The Book of Why"! Having read it, I have come to appreciate... (read more)
(208 words, approximately 2 minutes reading time)
written by Eric J. Ma on 2018-05-06 | tags: machine learning data science deep learning automl
For any problem that we think is machine learnable, having a sane baseline is really important. It is even more important to establish one early.
Today at ODSC, I had a chance to meet both Andreas Mueller and Randy Olson. Andreas leads