From: Max Gautier <mg@max.gautier.name>
To: git@vger.kernel.org
Cc: Denton Liu <liu.denton@gmail.com>
Subject: Detecting squash-merged branches (and question about git-diff-tree)
Date: Tue, 3 Dec 2024 14:55:44 +0100
Message-ID: <Z08N4AlQKiNi-IOI@framework>

Hi,

I tend to work on projects that do a lot of "squash-merging", i.e., they
merge branches by having a robot squash the branch into a new commit on
top of the main branch.

This makes it a bit hard to remove my branches once they have been
"squash-merged" (in contrast to regular merges, where I can use
`git branch --merged`).


I started a little script to detect such branches; initially I used git
cherry, but this only detects the case where the branch has one commit,
which is not enough.
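
To make the limitation concrete, here is a minimal sketch in a throwaway
repository (the file names, branch names and commit messages are made up
for the illustration, and `diff-tree --merge-base` is assumed to be
available, i.e. a reasonably recent Git). A two-commit branch is
squash-merged; `git branch --merged` and `git cherry` both miss it, while
the patch-id of the whole branch diff matches the squash commit:

```shell
#!/bin/bash
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo base > base.txt && git add base.txt && git commit -q -m base

# A feature branch with two commits.
git checkout -q -b feature
echo one > one.txt && git add one.txt && git commit -q -m "feature: one"
echo two > two.txt && git add two.txt && git commit -q -m "feature: two"

# Squash-merge it into main as a single new commit.
git checkout -q main
git merge -q --squash feature >/dev/null
git commit -q -m "feature (squashed)"

# Branches that main contains as regular history: feature is not listed.
merged=$(git branch --merged main --format='%(refname:short)')

# git cherry compares commits one by one: first column of each line.
cherry=$(git cherry main feature | cut -c1 | tr -d '\n')

# Patch-id of the whole branch diff versus patch-id of the squash commit.
branch_id=$(git diff-tree -p --merge-base main feature \
            | git patch-id --stable | cut -d ' ' -f 1)
squash_id=$(git show main | git patch-id --stable | cut -d ' ' -f 1)

echo "merged branches: $merged"
echo "git cherry marks: $cherry"
[ "$branch_id" = "$squash_id" ] && echo "patch-ids match"
```

Each per-commit patch differs from the squashed patch, so git cherry
reports every commit of the branch as missing from main.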

Sharing it below in case anyone is interested and/or wants to give some
feedback (warning: this is probably full of bashisms/GNU-isms).

Which leads me to my actual question:
I wanted to use diff-tree in --stdin mode (instead of calling it
repeatedly in a loop), feeding it my target branch and
the list of relevant commits, but apparently --merge-base and --stdin
are mutually exclusive. What's the reason for that?
I suppose it's related to the 3 possible line forms diff-tree accepts in
--stdin mode, but I didn't find a spelled out explanation in the
original thread implementing --merge-base [1].

Is there another way to compute the patch-ids of branches like that?
A '%(mergebase)' token for git for-each-ref would also work, but there
is no such thing that I know of.

(Of course, the script as such runs ~reasonably~ well, but it does spend
95% of its time waiting for subprocesses, which bugs me a little ^^)
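
One part of the subprocess cost that can probably be trimmed without any
new Git feature is the per-ref `git merge-base` call used to find the
oldest merge base. A rough sketch, assuming `git merge-base --octopus`
gives the same answer as the pairwise loop for the histories involved
(the `oldest_merge_base` function name and the demo repository are
hypothetical):

```shell
#!/bin/bash
# Hypothetical helper: one `git merge-base --octopus` call instead of one
# `git merge-base` call per candidate ref.
oldest_merge_base() {  # usage: oldest_merge_base <target> [<ref-pattern>...]
  local target=$1; shift
  local -a refs
  mapfile -t refs < <(git for-each-ref --format='%(objectname)' \
                        --no-merged="$target" "$@")
  if (( ${#refs[@]} == 0 )); then
    # No candidate refs: nothing older than the target itself is needed.
    git rev-parse "$target"
  else
    git merge-base --octopus "$target" "${refs[@]}"
  fi
}

# Throwaway repository to try it out; every name below is made up.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m c1
fork_point=$(git rev-parse HEAD)
git branch f1                      # forks from c1
git commit -q --allow-empty -m c2
git branch f2                      # forks from c2
git commit -q --allow-empty -m c3
git checkout -q f1 && git commit -q --allow-empty -m f1-work
git checkout -q f2 && git commit -q --allow-empty -m f2-work
git checkout -q main

result=$(oldest_merge_base main refs/heads/)
[ "$result" = "$fork_point" ] && echo "oldest merge base is c1"
```

This does not address the one `diff-tree | patch-id` pipeline per ref,
which is where the question above comes from.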

Thanks for reading!

[1]: https://lore.kernel.org/git/cover.1599332861.git.liu.denton@gmail.com/

---

#!/bin/bash
# $1 : target ref (in which we search for squashed branches)
# (default: upstream/HEAD)
# ${@:2} (all script args after the first one): git for-each-ref
# patterns for refs that are candidates for squash-merge detection
# (default: refs/{remotes/origin,heads}/)

declare -A commit_by_patch_ids
oldest_merge_base=${1-upstream/HEAD}
ref_patterns=${@:2}
ref_patterns=${ref_patterns:-refs/remotes/origin/ refs/heads/}

for ref in $(git for-each-ref ${ref_patterns} \
                --format='%(objectname)' \
                --no-merged=${1-upstream/HEAD} )
do
  patch_id=( $(git diff-tree -p --merge-base ${1-upstream/HEAD} $ref \
              | git patch-id --stable) )
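  # patch-id prints "<patch-id> <commit-id>"; key on the first field.
  # (Bash needs ${patch_id[0]}; $patch_id[0] would append a literal "[0]".)
  commit_by_patch_ids[${patch_id[0]}]=$ref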
  # Caveat:
  # It's possible for different commits to have the same patch-id
  # (for instance on a feature branch do: git checkout feature;git branch
  # old;git rebase main -> old and feature would probably have the same
  # patch-id, if I understand this correctly)
  # Proper treatment of this would need an array of commits per
  # patch-id, but bash does not support multidimensional arrays.

  # Track the oldest commit we will need to go back to when checking
  # whether a patch-id exists in the target branch.
  # This assumes that branches are not squash-merged before their fork
  # point.  This avoids going back all the way to the first commit,
  # which can be prohibitively expensive on a repository with a long
  # history (e.g., the linux kernel tree takes 13 minutes on a recent
  # machine for git log -p | git patch-id)
  oldest_merge_base=$(git merge-base $oldest_merge_base $ref)
done

declare -a squashed
# Extract commits whose patch-id exists in the target branch.
for patch_id in $(git log -p ${oldest_merge_base}..${1-upstream/HEAD} \
                 | git patch-id --stable | cut -d ' ' -f 1)
do
    if [[ -n "${commit_by_patch_ids[$patch_id]+exists}" ]];then
        squashed+=("--points-at=${commit_by_patch_ids[$patch_id]}")
    fi
done

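# Without any --points-at filter, for-each-ref would list every ref,
# so only print when at least one squashed commit was found.
if (( ${#squashed[@]} > 0 )); then
    git for-each-ref $ref_patterns \
        --format='%(refname:short)' \
        "${squashed[@]}"
fi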
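
On the caveat in the script about several commits sharing one patch-id: a
possible workaround, sketched here with made-up values, is to keep a
space-separated list of commits as the value of each associative array
entry. Object names never contain spaces, so word splitting gives the
list back. The `add_commit` helper and the array name are hypothetical,
not part of the script above.

```shell
#!/bin/bash
# Sketch: emulate "patch-id -> list of commits" with a space-separated
# string per key, since bash arrays cannot be nested.
declare -A commits_by_patch_id

add_commit() {  # usage: add_commit <patch-id> <commit>
  local id=$1 commit=$2
  # Append, inserting a separating space only if the entry already exists.
  commits_by_patch_id[$id]+="${commits_by_patch_id[$id]:+ }$commit"
}

# Made-up patch-ids and commits, for illustration only.
add_commit aaaa 1111
add_commit aaaa 2222
add_commit bbbb 3333

# Unquoted on purpose: word splitting turns the string back into a list.
for commit in ${commits_by_patch_id[aaaa]}; do
  echo "--points-at=$commit"
done
```

In the script, every commit recovered this way would become its own
--points-at argument.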

-- 
Max Gautier
