On the Meedan engineering team we are trying to clean up the way we write Git commits in order to help our engineers communicate more clearly with one another and to make our jobs that much easier in the future when we’re looking back at old code. One strategy we are attempting is to write longform Git commits.

Meedan is hiring for our engineering team! Join us and enjoy the benefits of verbose git commits. Positions are open for a Backend Engineer (Remote) and a Machine Learning Engineer.

What is a longform Git commit?

When most engineers are taught how to commit a change to Git, we learn to do something like:

git commit -m "This is my commit message"

This certainly works, but writing a commit message this way is very limiting. The command line is not a great place to write long human-readable text, so this encourages short commit messages that are not descriptive. Plus, it’s difficult to include special punctuation this way. For example, in an interactive Bash session a ! inside double quotes can trigger history expansion unless it is escaped, and multi-line messages are awkward to type inside a quoted string.

A good rule of thumb: git commit -m should not be your default! You should be doing git commit or git commit -a, either of which will pull up your editor of choice and let you easily write a commit. Even better is git commit -v -a. The -v means “verbose” and will add a diff of all your changes. Let’s change some code and try:

$ git commit -av

You probably already know that -a means “all”: it automatically stages every tracked file that has been modified or deleted before running the commit (much like running git add -u first; brand-new, untracked files still need an explicit git add). But not a lot of people know about -v. This switch provides a diff inside the text file you are going to edit, in whatever editor Git is configured to use. It will look something like this:

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch working
# Changes to be committed:
#       modified:   foo.js
#
# ------------------------ >8 ------------------------
# Do not modify or remove the line above.
# Everything below it will be ignored.
diff --git foo.js foo.js
index d575e27..6afb236 100644
--- foo.js
+++ foo.js
@@ -6,6 +6,6 @@ 
 function multiply(a, b) {
-  return a / b;
+  return a * b;
 }
 

Note the diff at the end, telling us that we corrected the math in our multiply function.

I find including the diff helpful for writing the commit because I can simply refer to the changes I’ve made right there. It also means that I can use autocomplete to complete variable names because any code I wrote is down there. The diff won’t be included in the commit, just all the stuff above the comment lines.
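If the diff turns out to be as useful for you as it is for me, Git can be told to include it on every commit. The sketch below assumes a reasonably recent Git (as far as I know, the commit.verbose setting arrived in Git 2.9) and tries the setting in a throwaway repository first. The repository path is just a temporary directory made up for the demonstration.

```shell
# Create a scratch repository so nothing in your real config is touched.
repo=$(mktemp -d)
git init -q "$repo"

# commit.verbose=true makes every `git commit` in this repository
# behave as if -v had been passed.
git -C "$repo" config commit.verbose true

# Read the value back to confirm it was stored.
git -C "$repo" config --get commit.verbose

# To apply the same setting everywhere, and to pick the editor Git opens:
#   git config --global commit.verbose true
#   git config --global core.editor "vim"
```

Once the setting is on, a plain git commit -a gives the same editor view as git commit -av.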

So, what do we type here now that we are in this document? I like to think of commits as emails, with a subject, a body, and a footer.

The subject portion is the first line of the text, before a line break. This briefly describes the changes we made in this commit. In our case we might want it to be something like “Fix our multiply function”. This should be short, and written in the imperative mood (“command” words like “fix” instead of “fixed” or “fixes”). Most guides out there say to keep it to 50 characters or less. Popular text editors will even include helpful syntax highlighting for your commit message. For example, if you have syntax on set in Vim, it will highlight the first line in red if it goes over 50 characters in length.
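If your editor does not highlight long subject lines, a small check can do a similar job. This is only a sketch: check_subject is a hypothetical helper name, the 50-character limit is the guideline mentioned above rather than anything Git enforces, and the sample message file is made up for the demonstration.

```shell
# Hypothetical helper: succeed if the first line of a commit message
# file is 50 characters or fewer, otherwise complain and fail.
check_subject() {
  subject=$(head -n 1 "$1")
  if [ "${#subject}" -gt 50 ]; then
    echo "Subject is ${#subject} characters; aim for 50 or fewer." >&2
    return 1
  fi
  return 0
}

# Try it on a sample message file.
msg=$(mktemp)
printf 'Fix our multiply function\n\nSwapped `/` for `*`.\n' > "$msg"
check_subject "$msg" && echo "subject ok"

# To run this on every commit, one option is to save the function in
# .git/hooks/commit-msg, end that script with:  check_subject "$1"
# and make the file executable with chmod +x.
```

Git passes the path of the message file to the commit-msg hook as its first argument, and a non-zero exit status aborts the commit, which is why the helper takes a file path and returns 1 on failure.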

The body should explain what changed and why. The “what” describes the techniques and solutions you applied. You might even link to blog posts that helped you get a handle on the problem. I like to write the body of the message in Markdown, since it’s pretty readable on its own and can easily be cut and pasted into a pull request form on GitHub or GitLab. The “why” doesn’t mean from a project management standpoint – so no need to say “I’m doing this because our client asked for it”. What I mean by “why” here is relevant contextual information. For example, why did you implement your change the way you did versus the other possible ways you could have done it?

And the footer is any extra information, metadata-like stuff. This is usually where I put the issue number that the commit is related to.

Ultimately we might write something like the following:

Fix our multiply function

Swapped `/` for `*`. Apparently multiplication and division are two different
things! We want to do
[multiplication](https://en.wikipedia.org/wiki/Multiplication) because
otherwise everything breaks. I did some research and found out that `*`
means "multiply" in most popular programming languages.

In the future we could also consider dividing by the inverse of an operand,
in case we move to a programming language that doesn't support multiplication.

Fixes #1234.

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch working
# Changes to be committed:
#       modified:   foo.js
#
# ------------------------ >8 ------------------------
# Do not modify or remove the line above.
# Everything below it will be ignored.
diff --git foo.js foo.js
index d575e27..6afb236 100644
--- foo.js
+++ foo.js
@@ -6,6 +6,6 @@ 
 function multiply(a, b) {
-  return a / b;
+  return a * b;
 }
 

For a real example of this in action, you can look at this commit on Check.
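To see how Git treats the pieces of a message like this, it can help to commit one in a scratch repository and read the parts back. The following is a sketch under a few assumptions: the repository, the author identity, and the shortened message are all invented for the demonstration, and git commit -F is used only so the example runs without opening an editor.

```shell
# Scratch repository with one file to commit.
repo=$(mktemp -d)
git init -q "$repo"
printf 'function multiply(a, b) {\n  return a * b;\n}\n' > "$repo/foo.js"
git -C "$repo" add foo.js

# Write the message to a file: subject, blank line, body, blank line, footer.
msg=$(mktemp)
cat > "$msg" <<'EOF'
Fix our multiply function

Swapped `/` for `*`. Apparently multiplication and division are two
different things!

Fixes #1234.
EOF

# Commit using the file as the message (identity is a placeholder).
git -C "$repo" -c user.name="Example Dev" -c user.email="dev@example.com" \
  commit -q -F "$msg"

# %s prints only the subject; %b prints everything after the first blank line.
git -C "$repo" log -1 --format=%s
git -C "$repo" log -1 --format=%b
```

The blank line after the first line is what lets Git tell the subject apart from the rest of the message.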

How does this actually help developers?

Okay, so we have made a bunch of commits in the format I’ve described above. How does this make day-to-day development easier? It all comes down to organizing information about the software you’re developing, and centralizing developer-facing data in Git instead of somewhere else.

Short subject lines are beneficial in at least two places: git log output and the “commits” page on your Git host of choice.

When it comes to logs, I like to use the --oneline command line switch to see an abbreviated version of recent changes. This gives us just the short subject lines. Check out these actual git log outputs for commits that do not use the above formatting recommendations:

$ git log --oneline --no-decorate

3340a1653 CHECK-1254 Updating check-ui version
a6e92ddc7 CHECK-1252: remove none options from folder filter (#992)
b596e5e03 CHECK-1249: allow user to add maxNumber then validate the inputs (#991)
56308243d Feature/1071 time annotation (#990)
227de4ecf Upgrade @material-ui/icons to v5 (#989)
6437fb41f Ticket CHECK-1239: Filter item log only by event types
0f72a9ee1 Ticket CHECK-1099: Auto-publish (or not) Fetch imports
a2ee9b639 CHECK:824: add clear for date range filter and disable search in annotations field
e40871c51 Feature/check 824 filter date range number range for annotation (#986)
0bb05a56b CHECK-109: Localization fix: Trim spaces. ¯\_(ツ)_/¯
4e60cc109 CHECK-109: Updating l10n
c67e74427 CHECK-1142: propose folders as destination for rejecting suggested and make default folder on the top of list (#987)
d84b757e2 Feature/check 1150 add default folder per workspace (#983)
6e91692aa Ticket CHECK-1195: Gray-out the previous and next links if there are not search results to be displayed

Now compare it to a version of the same log that I’ve edited to reflect the above formatting suggestions:

$ git log --oneline --no-decorate

3340a1653 Update check-ui version
a6e92ddc7 Remove "none" options from folder filter
b596e5e03 Allow user to add maxNumber then validate the inputs
56308243d Add advanced time annotation feature
227de4ecf Upgrade @material-ui/icons to v5
6437fb41f Filter item log only by event types
0f72a9ee1 Auto-publish (or not) Fetch imports
a2ee9b639 Add clear for date range filter, disable search in annotations field
e40871c51 Filter date range number range for annotation
0bb05a56b Fix localization: trim spaces ¯\_(ツ)_/¯
4e60cc109 Update l10n
c67e74427 Change logic for folder organization
d84b757e2 Add default folder per workspace
6e91692aa Update previous/next link behavior in search

You’ll notice that I removed otherwise-critical context information from several commits. For example, “propose folders as destination for rejecting suggested and make default folder on the top of list” has become “Change logic for folder organization”. That’s based on the assumption that the details of what changed would move from the subject of the commit down to the body. The subject really just needs to let someone scanning commits understand that the code in this commit touches the folder organization systems, and then the reader can git show [hash] if they need to know more. I also removed Jira ticket numbers and GitHub pull request numbers. Those can also go in the body text.

Ultimately, good subject lines make the timeline of development legible.

In contrast to subject lines, body text is not about brevity. It’s about including as much context as makes sense for a commit, ideally in plain language. I think of the body text of Git commits like a textual database of knowledge about the code.

For example, I can use the --grep switch for git log to search this database (along with -i to make the search case-insensitive). Let’s say I am working on a bug related to a modal dialog, and I have the sneaking suspicion that I might have introduced it as a regression in the last six months. Here I am searching the commit history for all commits I have made that are related to the concept of a modal:

$ git log --author="Darius" --grep="modal" -i

commit 5166bb2388c6058351aaa4b05bc9729a015e9914
Author: Darius Kazemi 
Date:   Thu Mar 17 13:10:02 2022 -0700

    Add option to disable title hyperlinks to MediaItem

    Calling this `modalOnly` because moreso than disabling the title it means
    that clicking anywhere on the card pops up the modal and only the modal.

    Fixes CHECK-1573

commit 8d0d9e50f4563b48447fc9ef23590eb150b1e538
Author: Darius Kazemi 
Date:   Mon Nov 15 15:26:53 2021 -0800

    Add modal prompt when navigating away during edit

    This commit attaches a `useEffect` to the task editor that loads/unloads
    a browser confirmation EventListener depending on if the annotation widget
    is in the editing state.

    Fixes CHECK-1093

Not only does this pull up my previous modal work, but looking at these results I can quickly understand the context of what I was doing at the time and why I made some of the decisions that I did. When I do git show [hash] to view the code changes, I’ll have a much better understanding of why the diff is the way it is!
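The search above covers the whole history. If you really only care about the last six months, git log also accepts --since to narrow the window. Here is a sketch in a scratch repository; the two commits, their dates, and the identity are invented for the demonstration, and -m is used only to keep the setup short.

```shell
# Placeholder identity for the scratch commits.
export GIT_AUTHOR_NAME="Darius" GIT_AUTHOR_EMAIL="darius@example.com"
export GIT_COMMITTER_NAME="Darius" GIT_COMMITTER_EMAIL="darius@example.com"

repo=$(mktemp -d)
git init -q "$repo"

# An old commit (dated 2001) that mentions a modal in its subject.
GIT_AUTHOR_DATE="2001-01-01T12:00:00" GIT_COMMITTER_DATE="2001-01-01T12:00:00" \
  git -C "$repo" commit -q --allow-empty -m "Add Modal prompt when navigating away"

# A commit made just now that mentions a modal only in its body.
git -C "$repo" commit -q --allow-empty \
  -m "Fix focus trap" -m "The modal lost focus when it was reopened."

# --grep searches the whole message, -i ignores case, and --since
# drops anything committed before the cutoff.
git -C "$repo" log --oneline --author="Darius" --grep="modal" -i --since="6 months ago"
```

Only the recent commit is listed, even though "modal" appears in its body rather than its subject, which is one more reason to put that context in the body in the first place.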

Another advantage of verbose body text is that writing pull requests becomes extremely easy. Have you ever made a bunch of commits and then written a pull request the next day? Sometimes it’s hard to remember the context of the work you were doing, and that makes it hard to write a good pull request. Using longform commits, I will often just copy and paste the body text of my commits into a pull request composition form and it’s already ~80% written for me. Since I write my body text in Markdown it’s even nicely pre-formatted!
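The copy and paste step can be partly scripted. The sketch below prints one Markdown section per commit, oldest first, using git log format placeholders. It runs in a scratch repository with an invented identity and an invented commit; in a real project you would more likely use your base branch in the range, for example main..HEAD, assuming your base branch is called main.

```shell
# Placeholder identity for the scratch commits.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" commit -q --allow-empty -m "Initial commit"

# Remember where the "feature work" starts.
base=$(git -C "$repo" rev-parse HEAD)

# A longform commit, with the message read from standard input.
git -C "$repo" commit -q --allow-empty -F - <<'EOF'
Add modal prompt when navigating away during edit

Attaches a `useEffect` to the task editor.

Fixes CHECK-1093
EOF

# One Markdown heading per subject (%s), followed by the body (%b).
git -C "$repo" log --reverse --format='## %s%n%n%b' "$base"..HEAD
```

The output is a draft to edit, not a finished pull request description, but it means the context you wrote at commit time is already on the page.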

If you use GitHub, a neat little trick is that if your branch contains a single commit, GitHub will automatically populate the subject line as the title of the pull request and the body text as the body of the pull request. Remember the commit that I linked earlier? The associated pull request was automatically created using this method. The only thing I added was the screenshot for extra context.

Conclusion

Good commit messages turn our Git repositories into a database of knowledge about our code and why it’s implemented the way it is. This is distinct from something like Jira, which is a database for product management decisions. We could put technical implementation details in Jira, but in my opinion Git is the place where it makes sense for all this stuff to live. Our product managers don’t need to know why we implemented something the way we did. They need to know that we implemented it, and when it happened, and when it’s going live. Jira and similar project management tools are great for tracking that information. But information like what you see in the commit bodies I recommend above is basically noise as far as anyone outside of engineering is concerned, so let’s keep that stuff in a place where only engineers are going to see it!

Words by

Darius Kazemi is a senior software engineer at Meedan. He is a researcher, former Mozilla Open Web Fellow, and internet artist under the moniker Tiny Subversions. His work focuses on re-decentralizing the internet and empowering communities to set their own norms online.

Darius Kazemi
Published on
April 29, 2022
May 25, 2022