From: Charles Bailey <charles@hashpling.org>
To: git@vger.kernel.org
Subject: [PATCH] Fix filter-branch to eliminate duplicate mapped parents
Date: Mon, 30 Jun 2014 22:20:27 +0100
Message-ID: <1404163227-30962-1-git-send-email-charles@hashpling.org>

From: Charles Bailey <cbailey32@bloomberg.net>

When multiple parents of a merge commit get mapped to the same commit,
filter-branch used to pass all instances of the parent commit to the
parent and commit filters and to "git commit-tree" or
"git_commit_non_empty_tree".

This can often happen when extracting a small project from a large
repository; merges can join history with no commits on any branch which
affect the paths being retained. Once the intermediate commits have been
filtered out, all the immediate parents of the merge commit can end up
being mapped to the same commit - either the original merge-base or an
ancestor of it.

"git commit-tree" would display an error but write the commit with the
normalized parents in any case. "git_commit_non_empty_tree" would fail
to notice that the commit being made was in fact a non-merge commit and
would retain it even if a further pass with --prune-empty would discard
the commit as empty.

This change ensures that duplicate parents are pruned before the parent
filter runs and that --prune-empty is idempotent, removing all empty
non-merge commits in a single pass.

Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
---

I worked on this after discovering that --prune-empty often left some
apparently empty commits that I wasn't expecting it to leave, and that
running filter-branch --prune-empty in a loop would often take many
passes, each one still pruning empty former merge commits.

The test is a simple example of such a case. A non-ff merge of a commit
that only changes a file that is to be pruned gets squashed into an
empty non-merge commit that should be pruned.
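
For readers who want to try the de-duplication idea outside of
filter-branch, here is a minimal standalone sketch of the same "case"
technique. The function name dedup_parents and the short ids are
hypothetical and only for illustration. Unlike the patch, the sketch
appends a trailing space before matching, because the sample ids are not
fixed-length like real object names and one could otherwise match as a
prefix of another.

```shell
# Hypothetical helper, not part of git: build a " -p <id>" argument
# string from the given ids, skipping any id that is already listed.
dedup_parents () {
	parentstr=
	for reparent in "$@"; do
		# The trailing space stops a short id from matching as a
		# prefix of a longer one that is already in the string.
		case "$parentstr " in
		*" -p $reparent "*)
			;;	# already present, skip it
		*)
			parentstr="$parentstr -p $reparent"
			;;
		esac
	done
	printf '%s\n' "$parentstr"
}

# Two parents that mapped to the same commit collapse to one entry.
dedup_parents aaaa bbbb aaaa	# prints " -p aaaa -p bbbb"
```

With a single surviving "-p" entry, the rewritten commit is an ordinary
non-merge commit, which is what lets the empty-commit check apply to it.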

 git-filter-branch.sh     |  8 +++++++-
 t/t7003-filter-branch.sh | 11 +++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 86d6994..c5b82a8 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -332,7 +332,13 @@ while read commit parents; do
 	parentstr=
 	for parent in $parents; do
 		for reparent in $(map "$parent"); do
-			parentstr="$parentstr -p $reparent"
+			case "$parentstr" in
+				*" -p $reparent"*)
+					;;
+				*)
+					parentstr="$parentstr -p $reparent"
+					;;
+			esac
 		done
 	done
 	if [ "$filter_parent" ]; then
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
index 9496736..3741f51 100755
--- a/t/t7003-filter-branch.sh
+++ b/t/t7003-filter-branch.sh
@@ -308,6 +308,17 @@ test_expect_success 'Prune empty commits' '
 	test_cmp expect actual
 '
 
+test_expect_success 'Prune empty collapsed merges' '
+	test_config merge.ff false &&
+	git rev-list HEAD > expect &&
+	test_commit to_remove_2 &&
+	git reset --hard HEAD^ &&
+	test_merge non-ff to_remove_2 &&
+	git filter-branch -f --index-filter "git update-index --remove to_remove_2.t" --prune-empty HEAD &&
+	git rev-list HEAD > actual &&
+	test_cmp expect actual
+'
+
 test_expect_success '--remap-to-ancestor with filename filters' '
 	git checkout master &&
 	git reset --hard A &&
-- 
1.9.0
