I use Git[1] on the daily and in most contexts[2] I care a lot about the commit messages being meaningful.
I also have a small handful of repositories where commits are automated (automated data feeds; daily timesheets; my lab journal, etc).
In the case of these repos, I want to track the incoming changes, but do not want the effort of composing a message. There's no benefit to me in manually composing a history entry, either because I'm not present, or because a descriptive message wouldn't add to the record.
(Sometimes that's not true; when I backfilled hundreds of dates into old journal entries, I did write a commit message for the change. But not when it's happening from cron.)
For a long time I used git commit -m "journal update on ${HOSTNAME} @ ${DATE}" to automate this; it gave me visibility on the origin and recency of the changes, and I could check the history for more information if necessary.
But I want more. I want to be able to look at a commit message and see something more like:
Changed: 2024/03/09.md, Presentations/DS2024/Recipes.md
- 2024/03/09.md:
  - Modified section "Tasks"
  - Added section "Git commit message"
- Presentations/DS2024/Recipes.md
  - Added section "Config validation"
That seems to match what Martin Fowler has referred to as Semantic Diff (like Martin, I ask that if you know a better name, please tell me).
I don't have that part yet. I expect it wouldn't be terrible to implement; in Markdown you'd be looking for the parent "element" (ie, heading), and maybe reporting back on the parent of that if you wanted it.
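A rough sketch of that heading lookup, as a hypothetical bash helper (parent_heading is an invented name, not part of Git or any existing tool). It assumes ATX-style headings (lines starting with one or more # followed by a space) and does not skip # lines inside fenced code blocks, so treat it as a starting point rather than a finished semantic diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the nearest Markdown heading at or above a line.
# Usage: parent_heading FILE LINE_NUMBER
parent_heading() {
    awk -v target="$2" '
        NR > target { exit }            # stop once we are past the changed line
        /^#+ /      { heading = $0 }    # remember the most recent heading seen
        END         { if (heading != "") print heading }
    ' "$1"
}
```

The line numbers to feed it could come from the hunk headers of git diff --cached -U0; reporting the parent of the parent would mean tracking heading depth as well.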
I see also that there are some LLM-based tools which will generate a commit message for you based on staged changes, which is perhaps nice for some, but I'll give a hard no on feeding my daily work notes to an LLM for the provider to train against.
But to improve on the lack of information in "journal update on ${HOSTNAME} @ ${DATE}" I decided to at least capture a high-level view. Here's what I came up with.
A Git prepare-commit-msg hook with the following content:
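#!/usr/bin/env bash
# prepare-commit-msg hook: write a summary of the changes as the commit message.

# Arguments supplied by Git.
COMMIT_MSG_FILE=$1   # file that holds the commit message
COMMIT_SOURCE=$2     # where the message came from, if anywhere (unused)
SHA1=$3              # commit object name, if any (unused)

# Count entries by the status letter in the first (index) column.
# grep -c prints a bare number; wc -l pads its output on some platforms,
# which would defeat the comparisons with "0" below.
ADDED=$(git status --porcelain | grep -c '^A')
DELETED=$(git status --porcelain | grep -c '^D')
MODIFIED=$(git status --porcelain | grep -c '^M')
RENAMED=$(git status --porcelain | grep -c '^R')
# Everything else, which includes unstaged and untracked entries.
OTHER=$(git status --porcelain | grep -Evc '^[ADMR]')

HOSTNAME=$(hostname)
SUBJECT="Files changed on $HOSTNAME:"
ORIG_SUBJECT=$SUBJECT

# Append a fragment to the subject, comma-separated after the first one.
push_subject() {
    if [ "$SUBJECT" != "$ORIG_SUBJECT" ]; then
        echo "$SUBJECT, $1"
    else
        echo "$SUBJECT $1"
    fi
}

if [ "$ADDED" != "0" ]; then
    SUBJECT=$( push_subject "$ADDED added" )
fi
if [ "$DELETED" != "0" ]; then
    SUBJECT=$( push_subject "$DELETED removed" )
fi
if [ "$MODIFIED" != "0" ]; then
    SUBJECT=$( push_subject "$MODIFIED modified" )
fi
if [ "$RENAMED" != "0" ]; then
    SUBJECT=$( push_subject "$RENAMED renamed" )
fi
if [ "$OTHER" != "0" ]; then
    SUBJECT=$( push_subject "$OTHER other" )
fi

# Subject line, a blank line, then the diffstat of the staged changes.
echo "$SUBJECT" > "$COMMIT_MSG_FILE"
echo >> "$COMMIT_MSG_FILE"
git diff --cached --stat >> "$COMMIT_MSG_FILE"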
It will generate commit messages like this:
commit 7f9cb0ff06afaa5db8733869ca5552c86aa7fc99 (HEAD -> main, origin/main, origin/HEAD)
Author: Chris Burgess <chris@giantrobot.co.nz>
Date:   Sat Mar 9 07:16:04 2024 +1300

    Files changed on thip: 44 added, 1 removed, 23 modified, 1 renamed

     2021/03/23/Some Notes.md | 374 +++++++++++++++++++++
     2023/09/04.md | 2 +-
     2024/02/04.md | 260 +++++++++++++-
     2024/02/05.md | 121 +++++++
     ...
Or this:
commit 36a61c68c8b0031ad556c6be15ffe8399c2903a9 (HEAD -> main, origin/main, origin/HEAD)
Author: Chris Burgess <chris@giantrobot.co.nz>
Date:   Sat Mar 9 07:26:37 2024 +1300

    Files changed on thip: 1 added

     2024/03/09.md | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
     1 file changed, 82 insertions(+)
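One thing to watch in the hook: git status --porcelain also lists unstaged and untracked files, and those end up in the "other" count even though they are not part of the commit. A possible variant, assuming you only want to count what is actually staged, is to tally the status letters printed by git diff --cached --name-status instead. The helper name count_status below is made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: count lines on stdin whose status letter matches the argument.
# Input is expected in the shape printed by `git diff --cached --name-status`,
# for example "A<TAB>path" or "R100<TAB>old<TAB>new".
count_status() {
    awk -v want="$1" '
        substr($1, 1, 1) == want { n++ }   # compare the first letter of the status field
        END { print n + 0 }                # print 0 when nothing matched
    '
}
```

Inside the hook this could be used as ADDED=$(git diff --cached --name-status | count_status A), and likewise for D, M and R.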
It's not perfect. It doesn't handle git commit -m <msg>, for one thing, and I hope to explore further a diff that is capable of precis-ing a set of changes to Markdown files. But it'll improve the history I see when working with those repos where I'm not writing commit messages for each change.
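One possible way to handle the -m case, sketched under the assumption that a message given on the command line should win over the generated one: Git passes the string "message" as the hook's second argument when -m or -F was used, so the hook can check that and leave the file alone. The function name keep_supplied_message is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch of a guard for the top of the hook. COMMIT_SOURCE is the hook's
# second argument; Git sets it to "message" when -m or -F was given.
COMMIT_SOURCE=${2:-}

keep_supplied_message() {
    [ "$1" = "message" ]
}

if keep_supplied_message "$COMMIT_SOURCE"; then
    exit 0   # leave the message from the command line untouched
fi
```

If the automated commits themselves rely on a placeholder -m message being overwritten, this guard would need a different condition.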
I'll report back on whatever annoys me about this first quick implementation.