From: Max Gautier <mg@max.gautier.name>
To: git@vger.kernel.org
Cc: Denton Liu <liu.denton@gmail.com>
Subject: Detecting squash-merged branches (and question about git-diff-tree)
Date: Tue, 3 Dec 2024 14:55:44 +0100
Message-ID: <Z08N4AlQKiNi-IOI@framework>
Hi,
I tend to work on projects that do a lot of "squash-merging", i.e. they merge
branches by having a robot squash the branch into a new commit on top of
the main branch.
This makes it a bit hard to clean up my branches once they are
"squash-merged", since `git branch --merged` does not list them.
I started a little script to detect such branches; initially I used git
cherry, but this only detects the case where the branch has 1 commit,
which is not enough.
Sharing it below in case anyone is interested and/or wants to give some
feedback (warning, this is probably full of bashisms/GNUisms).
Which leads me to my actual question:
I wanted to use diff-tree in --stdin mode (instead of calling it
repeatedly in a loop), feeding it my target branch and
the list of relevant commits, but apparently --merge-base and --stdin
are mutually exclusive. What's the reason for that?
I suppose it's related to the 3 possible line forms diff-tree accepts in
--stdin mode, but I didn't find a spelled out explanation in the
original thread implementing --merge-base [1].
Is there another way to compute the patch-ids of branches
like this? A '%(mergebase)' token for git for-each-ref would also
work, but there is no such thing either that I know of.
(Of course, the script as such runs ~reasonably~ well, but it does spend
95% of its time waiting for subprocesses, which bugs me a little^^)
Thanks for reading!
[1]: https://lore.kernel.org/git/cover.1599332861.git.liu.denton@gmail.com/
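One possible way around the --merge-base/--stdin exclusivity, offered as a sketch under assumptions rather than as something taken from the thread in [1]: in --stdin mode, git diff-tree reads a line holding a commit followed by further commits as "this commit, with these commits as its parents". Feeding it "<tip> <merge-base>" should therefore produce the diff from the merge base to the tip, prefixed by the tip's object name, which git patch-id then echoes back as the second column. The function name patch_ids_for_tips is hypothetical, the tips must be full object names (as printed by '%(objectname)'), and the merge base still costs one subprocess per ref.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script).
# Prints "<patch-id> <tip>" for each tip, using a single diff-tree and a
# single patch-id process for all of them.
# $1: target ref; remaining args: full object names of candidate branch tips.
patch_ids_for_tips() {
    local target=$1 tip base
    shift
    for tip in "$@"; do
        # Still one subprocess per tip; skip tips sharing no history.
        base=$(git merge-base "$target" "$tip") || continue
        # "<commit> <parent>": diff-tree compares the commit against the
        # parent named on the line instead of its real parents.
        printf '%s %s\n' "$tip" "$base"
    done | git diff-tree -p --stdin | git patch-id --stable
}
```

It could be called as `patch_ids_for_tips upstream/HEAD $(git for-each-ref --format='%(objectname)' --no-merged=upstream/HEAD refs/heads/)`, and the two output columns could fill the associative array directly.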
---
#!/bin/bash
# $1 : target ref (in which we search for squashed branches)
# (default: upstream/HEAD)
# ${@:2} (all script args after the first one): git for-each-ref
# patterns for refs that are candidates for squash-merge detection
# (default: refs/{remotes/origin,heads}/)
declare -A commit_by_patch_ids
oldest_merge_base=${1-upstream/HEAD}
ref_patterns=${*:2}
ref_patterns=${ref_patterns:-refs/remotes/origin/ refs/heads/}
for ref in $(git for-each-ref ${ref_patterns} \
--format='%(objectname)' \
--no-merged=${1-upstream/HEAD} )
do
patch_id=( $(git diff-tree -p --merge-base ${1-upstream/HEAD} $ref \
| git patch-id --stable) )
# Use ${patch_id[0]}: a bare $patch_id[0] expands to "<patch-id>[0]".
commit_by_patch_ids[${patch_id[0]}]=$ref
# Caveat:
# It's possible for different commits to have the same patch-id
# (for instance on a feature branch do: git checkout feature;git branch
# old;git rebase main -> old and feature would probably have the same
# patch-id, if I understand this correctly)
# proper treatment of this would need to use array of commits by
# patch-id, but bash does not support multidimensional arrays.
# Track the oldest commit we will need to go back to when checking if a
# patch-id exists in the target branch.
# This assumes that branches are not squash-merged before their fork
# point. This avoids going back all the way to the first commit,
# which can be prohibitively expensive on repositories with a long
# history (e.g., the linux kernel tree takes 13 minutes on a recent machine
# for git log -p | git patch-id)
oldest_merge_base=$(git merge-base $oldest_merge_base $ref)
done
declare -a squashed
# Extract commits whose patch-id exists in the target branch.
#
for patch_id in $(git log -p ${oldest_merge_base}..${1-upstream/HEAD} \
| git patch-id --stable | cut -d ' ' -f 1)
do
if [[ -n "${commit_by_patch_ids[$patch_id]+exists}" ]];then
squashed+=("--points-at=${commit_by_patch_ids[$patch_id]}")
fi
done
# With no --points-at filter, for-each-ref would list every ref, so guard.
(( ${#squashed[@]} == 0 )) || git for-each-ref $ref_patterns \
--format='%(refname:short)' \
"${squashed[@]}"
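The caveat in the script comments (several commits sharing one patch-id) might be handled without multidimensional arrays by keeping a space-separated list per key, which is safe here because object names contain no whitespace. A minimal sketch, with hypothetical names (commits_by_patch_id, add_commit, commits_for) and made-up ids in place of real hashes:

```shell
#!/bin/bash
declare -A commits_by_patch_id

# Append a commit to the list stored under a patch-id.
add_commit() {
    local patch_id=$1 commit=$2
    # ${...:+ } inserts the separator only when the entry is already non-empty.
    commits_by_patch_id[$patch_id]+="${commits_by_patch_id[$patch_id]:+ }$commit"
}

# Print the commits recorded for a patch-id, one per line.
commits_for() {
    local commit
    # Unquoted on purpose: word splitting turns the list back into items.
    for commit in ${commits_by_patch_id[$1]-}; do
        printf '%s\n' "$commit"
    done
}

add_commit p1 aaa
add_commit p1 bbb
add_commit p2 ccc
commits_for p1
```

In the detection loop, each commit returned by commits_for would then contribute its own --points-at option.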
--
Max Gautier