written by Eric J. Ma on 2023-06-20 | tags: git commit messages gpt-4 llamabot cli tool openai api conventional commits collaboration project management debugging code review versioning machine learning outlines library coding bot
As a data scientist, I work with Git.
If you're anything like the lazy version of me, your commit messages look something like this:
commit 4ea50cae5529a1199aff1a5970c556ae11838681
Author: Eric Ma <eric***@gmail.com>
Date:   Sat May 20 22:16:17 2023 -0400

    Add headshot to bio.

commit 0f7358b0bc813648a5cabbcf33c33bb04a01f6ed
Author: Eric Ma <eric***@gmail.com>
Date:   Sun May 14 08:29:57 2023 -0400

    Update blog post with sam oranyeli's comment.

commit afec8c50da401859b9759c5490ab95cd74de7c39
Author: Eric Ma <eric***@gmail.com>
Date:   Sun May 14 08:26:21 2023 -0400

    Add blog post on crafting PR summaries using GPT4.

commit fb4e5517fc51044f1fa4adb9716e27b32e1c734f
Author: Eric Ma <eric***@gmail.com>
Date:   Tue May 2 08:51:34 2023 -0400

    Update with SEMBR.
(This is my website's commit log.)
These aren't particularly informative. They don't give me enough detail about what changed from commit to commit. Now imagine a commit message that looked like this:

commit ...
Author: Eric Ma <eric***@gmail.com>
Date:   Tue May 2 08:51:34 2023 -0400

    stuff done
That's even worse... and it's a routine thing amongst data scientists!
To get around this problem, I wrote a CLI tool within llamabot that crafts commit messages according to the Conventional Commits specification. To use it, I simply stage the changes I'd like to commit together and then run:
llamabot git commit
Under the hood, using an OpenAI API key, GPT-4-32k writes a commit message for me that looks like this:

commit 62ee2b25e8f39f211a074a45c6caf8aabae13742 (HEAD -> main, origin/main)
Author: Eric Ma <ericmajinglong@gmail.com>
Date:   Tue Jun 27 08:46:57 2023 -0400

    feat(blog): add outlines library exploration and coding bot example

    This commit adds a new blog post that explores the `outlines` library and demonstrates building a coding bot that ghostwrites code, docstrings, and tests using composable prompts. The post highlights the benefits of using `outlines` for prompt management and showcases how clean prompt design can result in compact and efficient Python programs. Additionally, the post discusses the ease of mapping the demonstrated code to a user interface and compares the use of `outlines` with alternative approaches such as f-strings for prompt templating.
Isn't that amazing?!
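For the curious, here is a rough sketch of the general idea: grab the staged diff, wrap it in a prompt that asks for a Conventional Commits message, and hand it to a language model. This is not llamabot's actual code. The prompt wording, the function names, and the `complete` callable (a stand-in for whatever wraps your LLM API call) are all assumptions made for illustration.

```python
import subprocess
from typing import Callable

# Hypothetical prompt wording; the real tool's prompt may differ.
PROMPT_TEMPLATE = """Write a commit message for the diff below.
Follow the Conventional Commits format: <type>(<optional scope>): <description>,
followed by a blank line and a short body explaining what changed and why.

Diff:
{diff}
"""


def staged_diff() -> str:
    """Return the diff of the staged changes (what `git commit` would record)."""
    result = subprocess.run(
        ["git", "diff", "--cached"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def build_prompt(diff: str) -> str:
    """Embed the diff in the instruction prompt."""
    return PROMPT_TEMPLATE.format(diff=diff)


def draft_commit_message(diff: str, complete: Callable[[str], str]) -> str:
    """Ask the model for a message.

    `complete` is a hypothetical function that sends a prompt to an LLM
    and returns its text response.
    """
    if not diff.strip():
        raise ValueError("No staged changes to describe.")
    return complete(build_prompt(diff)).strip()
```

In use, you would call `draft_commit_message(staged_diff(), complete=my_llm_call)` from inside a repository, where `my_llm_call` is your own wrapper. Keeping the model call behind a plain callable also makes the logic easy to test with a fake.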
The benefits of using meaningful commit messages are manifold. Firstly, they improve collaboration within development teams. When team members review the commit log, they can quickly understand the nature and purpose of each change. This clarity fosters efficient code reviews and reduces the time spent deciphering the intentions behind commits.
Secondly, informative commit messages facilitate better project management. By examining the commit log, project managers and stakeholders gain insights into the progress of the project, the implemented features, and the bugs fixed. This knowledge allows for effective tracking of changes and helps in making informed decisions about future development directions.
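As a small illustration of that kind of tracking, the sketch below parses Conventional Commits subject lines and tallies them by type, so you can see at a glance how many features and fixes landed. The regular expression is my own simplified reading of the header format (`type(scope)!: description`), not an official validator, and the function names are made up for this example.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Simplified pattern for a Conventional Commits header:
#   type, optional (scope), optional "!" for breaking changes, ": ", description
HEADER = re.compile(
    r"^(?P<type>[a-zA-Z]+)"
    r"(?:\((?P<scope>[^()\r\n]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>.+)$"
)


def parse_header(subject: str) -> Optional[dict]:
    """Return the header parts as a dict, or None if the subject doesn't match."""
    match = HEADER.match(subject.strip())
    return match.groupdict() if match else None


def tally_types(subjects: Iterable[str]) -> Counter:
    """Count commits per type; non-matching subjects go under 'unconventional'."""
    counts: Counter = Counter()
    for subject in subjects:
        parsed = parse_header(subject)
        key = parsed["type"].lower() if parsed else "unconventional"
        counts[key] += 1
    return counts


subjects = [
    "feat(blog): add outlines library exploration and coding bot example",
    "Add headshot to bio.",
    "Update with SEMBR.",
]
print(tally_types(subjects))
```

You could feed it real subjects from `git log --format=%s`. With the three subjects above, one counts as `feat` and the two older-style messages fall into `unconventional`.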
Moreover, detailed commit messages greatly aid in debugging and issue resolution. When encountering a bug or regression, developers can trace back the changes through the commit history and identify the specific commit that introduced the problem. This capability saves time and effort by pinpointing the exact source of an issue, enabling faster resolution and reducing downtime. Even better if we're versioning machine learning models!
If you're interested in trying out llamabot, you can install it by running pip install llamabot at the terminal!
@article{
ericmjl-2023-automatically-messages,
author = {Eric J. Ma},
title = {Automatically write awesome commit messages},
year = {2023},
month = {06},
day = {20},
howpublished = {\url{https://ericmjl.github.io}},
journal = {Eric J. Ma's Blog},
url = {https://ericmjl.github.io/blog/2023/6/20/automatically-write-awesome-commit-messages},
}