On the Meedan engineering team we are trying to clean up the way we write Git commits in order to help our engineers communicate more clearly with one another and to make our jobs that much easier in the future when we’re looking back at old code. One strategy we are attempting is to write longform Git commits.
Meedan is hiring for our engineering team! Join us and enjoy the benefits of verbose git commits. Positions are open for a Backend Engineer (Remote) and a Machine Learning Engineer.
What is a longform Git commit?
When most engineers are taught how to commit a change to Git, we learn to do something like:
This certainly works, but writing a commit message this way is very limiting. The command line is not a great place to write long human-readable text, so this encourages short commit messages that are not descriptive. Plus, it’s difficult to include special punctuation this way. For example, ! and newlines need to be escaped inside the message string.
A good rule of thumb: git commit -m should not be your default! You should be doing git commit or git commit -a. Even better is git commit -v -a, which will pull up your editor of choice and let you easily write a commit. The -v means “verbose” and will add a diff of all your changes. Let’s change some code and try:
You probably already know that -a means “add”, which basically runs git add . before running the commit. But not a lot of people know about -v. This switch provides a diff inside the text file you are going to edit in whatever git’s default text editor is set to. It will look something like this:
Note the diff on the end, telling us that we corrected the math in our multiply function.
I find including the diff helpful for writing the commit because I can simply refer to the changes I’ve made right there. It also means that I can use autocomplete to complete variable names because any code I wrote is down there. The diff won’t be included in the commit, just all the stuff above the comment lines.
So, what do we type here now that we are in this document? I like to think of commits as emails, with a subject, a body, and a footer.
The subject portion is the first line of the text, before a line break. This briefly describes the changes we made in this commit. In our case we might want it to be something like “Fix our multiply function”. This should be short, and written in the imperative voice (“command” words like “fix” instead of “fixed” or “fixes”). Most guides out there say to keep it to 50 characters or less. Popular text editors will even include helpful syntax highlighting for your commit message. For example, if you have syntax on set in Vim, it will highlight the first line in red if it goes over 50 characters in length.
The body should explain what changed and why. The “what” describes the techniques and solutions you applied. You might even link to blog posts that helped you get a handle on the problem. I like to write the body of the message in markdown, since it’s pretty readable on its own and can easily be cut and pasted into a pull request form on Github or Gitlab. The “why” doesn’t mean from a project management standpoint – so no need to say “I’m doing this because our client asked for it”. What I mean by “why” here is relevant contextual information. For example, why did you implement your change the way you did versus the other possible ways you could have done it?
And the footer is any extra information, metadata-like stuff. This is usually where I put the issue number that the commit is related to.
Ultimately we might write something like the following:
For a real example of this in action, you can look at this commit on Check.
How does this actually help developers?
Okay, so we have made a bunch of commits in the format I’ve described above. How does this make day-to-day development easier? It all comes down to organizing information about the software you’re developing, and centralizing developer-facing data in Git instead of somewhere else.
Short subject lines are beneficial in at least two places: git log output and the “commits” page on your Git host of choice.
When it comes to logs, I like to use the --oneline command line switch to see an abbreviated version of recent changes. This gives us just the short subject lines. Check out these actual git log outputs for commits that do not use the above formatting recommendations:
Now compare it to a version of the same log that I’ve edited to reflect the above formatting suggestions:
You’ll notice that I removed otherwise-critical context information from several commits. For example, “propose folders as destination for rejecting suggested and make default folder on the top of list” has become “Change logic for folder organization”. That’s based on the assumption that the details of what changed would move from the subject of the commit down to the body. The subject really just needs to let someone scanning commits understand that the code in this commit touches the folder organization systems, and then the reader can git show [hash] if they need to know more. I also removed Jira ticket numbers and Github pull request numbers. Those can also go in the body text.
Ultimately, good subject lines make the timeline of development legible.
In contrast to subject lines, body text is not about brevity. It’s about including as much context as makes sense for a commit, ideally in plaintext language. I think of the body text of Git commits like a textual database of knowledge about the code.
For example, I can use the --grep switch for git log to search this database (along with -i to make the search case-insensitive). Let’s say I am working on a bug related to a modal dialog, and I have the sneaking suspicion that I might have introduced it as a regression in the last six months. Here I am searching the code for all commits I have made that are related to the concept of a modal:
Not only does this pull up my previous modal work, but looking at these results I can quickly understand the context of what I was doing at the time and why I made some of the decisions that I did. When I do git show [hash] to view the code changes, I’ll have a much better understanding of why the diff is the way it is!
Another advantage of verbose body text is that writing pull requests becomes extremely easy. Have you ever made a bunch of commits and then written a pull request the next day? Sometimes it’s hard to remember the context of the work you were doing, and that makes it hard to write a good pull request. Using longform commits, I will often just copy and paste the body text of my commits into a pull request composition form and it’s already ~80% written for me. Since I write my body text in Markdown it’s even nicely pre-formatted!
If you use Github, a neat little trick is that if your branch contains a single commit, Github will automatically populate the subject line as the title of the pull request and the body text as the body of the pull request. Remember the commit that I linked earlier? The associated pull request was automatically created using this method. The only thing I added was the screenshot for extra context.
Good commit messages turn our Git repositories into a database of knowledge about our code and why it’s implemented the way it is. This is distinct from something like Jira, which is a database for product management decisions. We could put technical implementation details in Jira, but in my opinion Git is the place where it makes sense for all this stuff to live. Our product managers don’t need to know why we implemented something the way we did. They need to know that we implemented it, and when it happened, and when it’s going live. Jira and similar project management tools are great for tracking that information. But information like you see in the body text of commits that I recommend above is basically noise as far as anyone outside of engineering is concerned, so let’s keep that stuff to a place where only engineers are going to see it!
- Online conversations are heavily influenced by news coverage, like the 2022 Supreme Court decision on abortion. The relationship is less clear between big breaking news and specific increases in online misinformation.
- The tweets analyzed were a random sample qualitatively coded as “misinformation” or “not misinformation” by two qualitative coders trained in public health and internet studies.
- This method used Twitter’s historical search API
- The peak was a significant outlier compared to days before it using Grubbs' test for outliers for Chemical Abortion (p<0.2 for the decision; p<0.003 for the leak) and Herbal Abortion (p<0.001 for the decision and leak).
- All our searches were case insensitive and could match substrings; so, “revers” matches “reverse”, “reversal”, etc.
Darius Kazemi is a senior software engineer at Meedan. He is a researcher, former Mozilla Open Web Fellow, and internet artist under the moniker Tiny Subversions. His work focuses on re-decentralizing the internet and empowering communities to set their own norms online.