From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, John Fultz <jfultz@wolfram.com>
Subject: [PATCH] filter-branch: resolve $commit^{tree} in no-index case
Date: Tue, 19 Jan 2016 16:51:00 -0500
Message-ID: <20160119215100.GB28656@sigill.intra.peff.net>
In-Reply-To: <xmqq37tt9r9g.fsf@gitster.mtv.corp.google.com>
On Tue, Jan 19, 2016 at 01:46:35PM -0800, Junio C Hamano wrote:
> > It _is_ slower, though, because it introduces an extra rev-parse. When
> > we could in fact be getting rid of one. Give me a moment to complete a
> > few timing tests and post the results.
>
> Good point.
>
> We should do that rev-parse in the helper function. That rev-parse
> is there only because the skip-empty code wants to know the exact
> object name when comparing. There is no reason for this code to do
> it for the helper--the helper, if (and only if) it is called, can
> do the rev-parse itself, and we can still omit the overhead when
> we are not skipping empty ones.
Here's the patch I came up with. It takes the conservative choice (see
the argument below), and shows the performance impact. I'll work up the
non-conservative one on top, which I think can do even better than the
original.
-- >8 --
Subject: filter-branch: resolve $commit^{tree} in no-index case
Commit 348d4f2 (filter-branch: skip index read/write when
possible, 2015-11-06) taught filter-branch to optimize out
the final "git write-tree" when we know we haven't touched
the tree with any of our filters. It does so by simply putting
the literal text "$commit^{tree}" into the "$tree" variable,
avoiding a useless rev-parse call.
However, when we pass this to git_commit_non_empty_tree(),
it gets confused; it resolves "$commit^{tree}" itself, and
compares our string to the 40-hex sha1, which obviously
doesn't match. As a result, "--prune-empty" (or any custom
filter using git_commit_non_empty_tree) will fail to drop
an empty commit (when filter-branch is used without a tree
or index filter).
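The mismatch can be reproduced outside filter-branch. The following is
a minimal sketch, not code from this patch: same_tree is a hypothetical
stand-in for the comparison that git_commit_non_empty_tree performs, and
the throwaway repository and demo identity are assumptions made only for
illustration.

```shell
#!/bin/sh
# Illustrative sketch: reproduces the string comparison only,
# it does not run filter-branch.
set -e

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

# Identity is passed inline so the sketch does not depend on global config.
git -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m base
git -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m empty

commit=$(git rev-parse HEAD)
parent=$(git rev-parse HEAD^)

# Hypothetical helper: is the candidate tree name byte-for-byte equal
# to the parent's resolved tree id?
same_tree () {
	test "$1" = "$(git rev-parse "$2^{tree}")"
}

literal="$commit^{tree}"                      # what 348d4f2 put in $tree
resolved=$(git rev-parse "$commit^{tree}")    # what this patch puts in $tree

if same_tree "$literal" "$parent"
then
	literal_result=match
else
	literal_result=mismatch
fi

if same_tree "$resolved" "$parent"
then
	resolved_result=match
else
	resolved_result=mismatch
fi

echo "literal:  $literal_result"
echo "resolved: $resolved_result"
```

Both commits are empty, so they share one tree; only the resolved form
compares equal, which is why the empty commit is not recognized as
empty when the literal text is passed along.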
Let's resolve $tree to the 40-hex ourselves, so that
git_commit_non_empty_tree can work. Unfortunately, this is a
bit slower due to the extra process overhead:
  $ cd t/perf && ./run 348d4f2 HEAD p7000-filter-branch.sh
  [...]
  Test                  348d4f2           HEAD
  --------------------------------------------------------------
  7000.2: noop filter   3.76(0.24+0.26)   4.54(0.28+0.24) +20.7%
However, the value of $tree here is technically
user-visible. The user can provide arbitrary shell code at
this stage, which could itself have a similar assumption to
what is in git_commit_non_empty_tree. So the conservative
choice to fix this regression is to take the 20% hit and
give the pre-348d4f2 behavior. We still end up much faster
than before the optimization:
  $ cd t/perf && ./run 348d4f2^ HEAD p7000-filter-branch.sh
  [...]
  Test                  348d4f2^          HEAD
  --------------------------------------------------------------
  7000.2: noop filter   9.51(4.32+0.40)   4.51(0.28+0.23) -52.6%
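The percentages in the two tables are consistent with a relative change
in elapsed (wall-clock) time, the first number in each cell. A quick
check, assuming that is how the perf script derives them (rel_change is
a hypothetical helper, not part of t/perf):

```shell
#!/bin/sh
# Assumption: the reported delta is (new - old) / old, computed on
# elapsed seconds and rounded to one decimal place.
rel_change () {
	awk -v old="$1" -v new="$2" \
		'BEGIN { printf "%+.1f%%\n", (new - old) / old * 100 }'
}

fix_vs_optimized=$(rel_change 3.76 4.54)      # 348d4f2  vs HEAD
fix_vs_unoptimized=$(rel_change 9.51 4.51)    # 348d4f2^ vs HEAD

echo "$fix_vs_optimized"
echo "$fix_vs_unoptimized"
```

This prints +20.7% and -52.6%, matching the figures reported above.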
Signed-off-by: Jeff King <peff@peff.net>
---
git-filter-branch.sh | 2 +-
t/t7003-filter-branch.sh | 8 ++++++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index d61f9ba..5e094ce 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -404,7 +404,7 @@ while read commit parents; do
 	then
 		tree=$(git write-tree)
 	else
-		tree="$commit^{tree}"
+		tree=$(git rev-parse "$commit^{tree}")
 	fi
 	workdir=$workdir @SHELL_PATH@ -c "$filter_commit" "git commit-tree" \
 		"$tree" $parentstr < ../message > ../map/$commit ||
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
index 377c648..97c23c2 100755
--- a/t/t7003-filter-branch.sh
+++ b/t/t7003-filter-branch.sh
@@ -333,6 +333,14 @@ test_expect_success 'prune empty collapsed merges' '
 	test_cmp expect actual
 '
 
+test_expect_success 'prune empty works even without index/tree filters' '
+	git rev-list HEAD >expect &&
+	git commit --allow-empty -m empty &&
+	git filter-branch -f --prune-empty HEAD &&
+	git rev-list HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success '--remap-to-ancestor with filename filters' '
 	git checkout master &&
 	git reset --hard A &&
--
2.7.0.248.g5eafd77