public inbox for git@vger.kernel.org
From: Max Gautier <mg@max.gautier.name>
To: git@vger.kernel.org
Cc: Denton Liu <liu.denton@gmail.com>
Subject: Detecting squash-merged branches (and question about git-diff-tree)
Date: Tue, 3 Dec 2024 14:55:44 +0100
Message-ID: <Z08N4AlQKiNi-IOI@framework>

Hi,

I tend to work on projects which do a lot of "squash-merging", i.e. they
merge branches by having a robot squash the branch into a new commit on
top of the main branch.

This makes it a bit hard to remove my branches once they are
"squash-merged" (in contrast to using `git branch --merged`).


I started a little script to detect such branches; initially I used git
cherry, but this only detects the case where the branch has 1 commit,
which is not enough.
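
For reference, the git cherry check mentioned above can be sketched as follows. The function name and its two arguments are hypothetical and not part of the script below. It relies on git cherry marking commits that have a patch-id equivalent upstream with "-" and the others with "+".

```shell
#!/bin/bash
# Hypothetical helper: succeed if every commit of branch $2 has a
# patch-id equivalent in upstream $1.
all_commits_upstream() {
  local upstream=$1 branch=$2
  # "+" marks a commit with no equivalent upstream; any such line
  # means the branch is not fully contained in upstream.
  ! git cherry "$upstream" "$branch" | grep -q '^+'
}
```

A branch with a single commit that was squash-merged passes this check, because the squash commit carries the same patch. A branch with several commits does not, since none of its individual commits matches the combined patch.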

Sharing it below in case anyone is interested and/or wants to give some
feedback (warning, this is probably full of bash-isms/GNU-isms).

Which leads me to my actual question:
I wanted to use diff-tree in --stdin mode (instead of calling it
repeatedly in a loop), feeding it my target branch and
the list of relevant commits, but apparently --merge-base and --stdin
are mutually exclusive. What's the reason for that?
I suppose it's related to the 3 possible line forms diff-tree accepts in
--stdin mode, but I didn't find a spelled-out explanation in the
original thread implementing --merge-base [1].

Is there another way to compute the patch-ids of branches
like this? A '%(mergebase)' token for git for-each-ref would also
work, but there is no such thing that I know of.
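
One possible workaround, sketched under assumptions and not benchmarked: compute the merge base of each ref separately, then feed "<commit> <merge-base>" lines to a single diff-tree --stdin process. This assumes that, for a line listing several commits, diff-tree treats the remaining commits as if they were the parents of the first one, and that the commits are given as full hex object names (as printed by '%(objectname)'). The function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print "<patch-id> <commit>" for each commit given
# after the target, where the patch is the diff from
# merge-base(target, commit) to that commit.
patch_ids_vs_merge_base() {
  local target=$1 commit base
  shift
  for commit in "$@"; do
    # Still one merge-base call per commit; only diff-tree and
    # patch-id are batched into a single process each.
    base=$(git merge-base "$target" "$commit") || continue
    # "<commit> <parent>": assumed to make diff-tree compare the commit
    # with the listed parent instead of its real parents.
    printf '%s %s\n' "$commit" "$base"
  done | git diff-tree -p --stdin | git patch-id --stable
}
```

This still spawns one git merge-base per ref, so it only removes the diff-tree and patch-id subprocesses from the loop.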

(Of course, the script as such runs ~reasonably~ well, but it does spend
95% of its time waiting for subprocesses, which bugs me a little^^)

Thanks for reading!

[1]: https://lore.kernel.org/git/cover.1599332861.git.liu.denton@gmail.com/

---

#!/bin/bash
# $1 : target ref (in which we search for squashed branches)
# (default: upstream/HEAD)
# ${@:2} (all script args after the first one): git for-each-ref
# patterns selecting the candidate refs for squash-merge detection
# (default: refs/{remotes/origin,heads}/)

declare -A commit_by_patch_ids
oldest_merge_base=${1-upstream/HEAD}
ref_patterns=${@:2}
ref_patterns=${ref_patterns:-refs/remotes/origin/ refs/heads/}

for ref in $(git for-each-ref ${ref_patterns} \
                --format='%(objectname)' \
                --no-merged=${1-upstream/HEAD} )
do
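  patch_id=( $(git diff-tree -p --merge-base ${1-upstream/HEAD} $ref \
              | git patch-id --stable) )
  # patch_id[0] is the patch-id itself. Skip refs whose diff is empty,
  # since an empty key is not a valid associative array subscript.
  [[ -n "${patch_id[0]}" ]] || continue
  commit_by_patch_ids[${patch_id[0]}]=$ref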
  # Caveat:
  # It's possible for different commits to have the same patch-id
  # (for instance on a feature branch do: git checkout feature; git branch
  # old; git rebase main -> old and feature would probably have the same
  # patch-id, if I understand this correctly).
  # Proper treatment of this would need an array of commits per
  # patch-id, but bash does not support multidimensional arrays.

  # Track the oldest commit we will need to go back to when checking if a
  # patch-id exists in the source branch.
  # This assumes that branches are not squash-merged before their fork
  # point.  This avoids going back all the way to the first commit,
  # which can be prohibitively expensive on a repository with a long
  # history (e.g. the linux kernel tree takes 13 minutes on a recent
  # machine for git log -p | git patch-id)
  oldest_merge_base=$(git merge-base $oldest_merge_base $ref)
done

declare -a squashed
# Extract commits whose patch-id exists in the target branch.
for patch_id in $(git log -p ${oldest_merge_base}..${1-upstream/HEAD} \
                 | git patch-id --stable | cut -d ' ' -f 1)
do
    if [[ -n "${commit_by_patch_ids[$patch_id]+exists}" ]];then
        squashed+=("--points-at=${commit_by_patch_ids[$patch_id]}")
    fi
done

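# Without any --points-at filter, for-each-ref would list every ref,
# so only run it when at least one squashed branch was found.
if (( ${#squashed[@]} > 0 )); then
    git for-each-ref $ref_patterns \
        --format='%(refname:short)' \
        "${squashed[@]}"
fi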
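
Regarding the caveat in the loop about several commits sharing a patch-id: one way around the lack of multidimensional arrays is to store a space-separated list of commits in each value of the associative array. A minimal sketch, with hypothetical names that are not used in the script above:

```shell
#!/bin/bash
declare -A commits_by_patch_id

# Hypothetical helper: append commit $2 to the list stored for patch-id $1.
add_commit() {
  local patch_id=$1 commit=$2
  # Prefix a separating space only when the entry already holds a commit.
  commits_by_patch_id[$patch_id]+="${commits_by_patch_id[$patch_id]:+ }$commit"
}
```

Since object names never contain spaces, the stored list can later be split with plain word splitting, e.g. `for c in ${commits_by_patch_id[$patch_id]}`, adding one --points-at option per commit.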

-- 
Max Gautier
