Eric J Ma's Website

pyjanitor 0.3 released!

written by Eric J. Ma on 2018-07-27 | tags: open source pyjanitor data science


A new release of pyjanitor is out!

The two new features I have added are:

  1. Concatenating multiple columns into a single column, such that the items are separated by a delimiter.
  2. Deconcatenating a column into multiple columns, separating on the basis of a delimiter.

Both of these tasks come up frequently in data preparation.

For example, concatenating a few columns together oftentimes lets us create a unique index based on sample properties.

On the other hand, deconcatenating a column into multiple columns can be useful when our index is used to store metadata. (This really shouldn't be happening, but... sometimes that's just how the world works right now...)

Here's an example of how it works:
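
As a rough illustration of the two operations, here is a minimal sketch in plain pandas rather than pyjanitor's own API. The helper names `join_columns` and `split_column`, their parameters, and the sample data are all hypothetical and chosen only for this example; the sketch also assumes a single-character delimiter.

```python
import pandas as pd


def join_columns(df, columns, new_column_name, sep="-"):
    """Hypothetical helper: join several columns into one delimited column."""
    out = df.copy()
    # Cast to str first so numeric columns can be joined with the delimiter.
    out[new_column_name] = out[columns].astype(str).apply(sep.join, axis=1)
    return out


def split_column(df, column, new_column_names, sep="-"):
    """Hypothetical helper: split one delimited column into several columns."""
    out = df.copy()
    # expand=True returns one column per piece, labelled 0, 1, 2, ...
    parts = out[column].str.split(sep, expand=True)
    for position, name in enumerate(new_column_names):
        out[name] = parts[position]
    return out


# Made-up sample properties, used only for illustration.
samples = pd.DataFrame(
    {
        "host": ["human", "swine"],
        "subtype": ["H1N1", "H3N2"],
        "year": [2009, 2011],
    }
)

# Concatenate: build a unique sample identifier from the properties.
joined = join_columns(samples, ["host", "subtype", "year"], "sample_id", sep="_")

# Deconcatenate: recover the metadata from the identifier alone.
# Note that the recovered values are strings, including the year.
recovered = split_column(
    joined[["sample_id"]], "sample_id", ["host", "subtype", "year"], sep="_"
)
```

Both helpers return a new DataFrame rather than modifying their input, which keeps them easy to chain; pyjanitor's own functions for these tasks may differ in name and signature.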

To install pyjanitor, grab it from PyPI:

$ pip install pyjanitor

The conda-forge build will be coming soon!

