* Serious performance regression in diff between 1.6.6 and 1.7.0
@ 2010-06-10 0:10 Brian Downing
2010-06-10 17:08 ` Brian Downing
0 siblings, 1 reply; 6+ messages in thread
From: Brian Downing @ 2010-06-10 0:10 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
I've stumbled across a pretty serious performance regression in "git
diff" for a large (187456 files checked out on HEAD) repository of mine.
I bisected it and the first slow commit is:
commit 730f72840cc50c523fe4cdd796ea2d2fc4571a28
Author: Junio C Hamano <gitster@pobox.com>
Date: Sun Sep 20 00:03:39 2009 -0700
unpack-trees.c: look ahead in the index
This makes the traversal of index be in sync with the tree traversal.
When unpack_callback() is fed a set of tree entries from trees, it
inspects the name of the entry and checks if the an index entry with
the same name could be hiding behind the current index entry, and
...
It is more than 10x worse for my case here:
before 730f72840cc50c523fe4cdd796ea2d2fc4571a28:
:; /usr/bin/time ~/src/git/git-diff HEAD >/dev/null
0.30user 0.50system 0:00.81elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+30119minor)pagefaults 0swaps
730f72840cc50c523fe4cdd796ea2d2fc4571a28:
:; /usr/bin/time ~/src/git/git-diff HEAD >/dev/null
5.58user 0.40system 0:05.98elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+30117minor)pagefaults 0swaps
Unfortunately this pushes everyday use past my pain threshold for this
repository. (As a sidenote, it's amazing how much Git has adjusted my
pain threshold since the days of CVS!)
Is there any chance of algorithmic improvements to this fix to gain some
of this speed back, or am I stuck with this when using newer Git?
-bcd
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Serious performance regression in diff between 1.6.6 and 1.7.0
2010-06-10 0:10 Serious performance regression in diff between 1.6.6 and 1.7.0 Brian Downing
@ 2010-06-10 17:08 ` Brian Downing
2010-06-10 18:14 ` Brian Downing
0 siblings, 1 reply; 6+ messages in thread
From: Brian Downing @ 2010-06-10 17:08 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Wed, Jun 09, 2010 at 07:10:05PM -0500, Brian Downing wrote:
> I've stumbled across a pretty serious performance regression in "git
> diff" for a large (187456 files checked out on HEAD) repository of mine.
[...]
> It is more than 10x worse for my case here:
>
> before 730f72840cc50c523fe4cdd796ea2d2fc4571a28:
> 0.30user 0.50system 0:00.81elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
>
> 730f72840cc50c523fe4cdd796ea2d2fc4571a28:
> 5.58user 0.40system 0:05.98elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
Here's a little more profiling information (from the bad case). I ran this a
couple of times to see what oprofile said. Sorry about the lack of non-Git
symbols:
CPU: Core 2, speed 2668 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples % image name app name symbol name
499255 62.8497 git-diff git-diff df_name_compare
110561 13.9182 git-diff git-diff do_compare_entry
54508 6.8619 no-vmlinux git-diff (no symbols)
33741 4.2476 no-vmlinux no-vmlinux (no symbols)
23724 2.9865 git-diff git-diff ce_in_traverse_path
11767 1.4813 libcrypto.so.0.9.8 git-diff (no symbols)
10511 1.3232 git-diff git-diff find_cache_pos
I also ran this through callgrind to see how often the above were called:
Calls Symbol
----------- -------------------
197,958 unpack_callback
208,460 find_cache_pos
37,308,336 ce_in_traverse_path
156,950,469 do_compare_entry
156,950,469 df_name_compare
Hope this helps.
-bcd
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Serious performance regression in diff between 1.6.6 and 1.7.0
2010-06-10 17:08 ` Brian Downing
@ 2010-06-10 18:14 ` Brian Downing
2010-06-10 18:42 ` Brian Downing
2010-06-11 2:59 ` [PATCH] unpack-trees: Make index lookahead less pessimal Brian Downing
0 siblings, 2 replies; 6+ messages in thread
From: Brian Downing @ 2010-06-10 18:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Thu, Jun 10, 2010 at 12:08:04PM -0500, Brian Downing wrote:
> I also ran this through callgrind to see how often the above were called:
(187,456 files)
> Calls Symbol
> ----------- -------------------
> 197,958 unpack_callback
> 208,460 find_cache_pos
> 37,308,336 ce_in_traverse_path
> 156,950,469 do_compare_entry
> 156,950,469 df_name_compare
Here is an identical run (git-diff HEAD) from the Linux kernel tree
(33,307 files):
Calls Symbol
----------- -------------------
35,332 unpack_callback
37,357 find_cache_pos
4,979,473 ce_in_traverse_path
6,828,181 do_compare_entry
6,828,181 df_name_compare
That makes it look sort of exponential (perhaps around files^1.5),
though from what I can understand of the find_cache_pos code in
unpack-trees it would depend on the exact shape of the repository. It
does seem to linear-search over whole directory trees of the index
repeatedly, though, which would support the exponential theory.
Unfortunately I don't really understand what the code is trying to do.
Is it not the case that trees and the index are always stored sorted in
the same order? The examples given in the commit messages that
introduced this fix would imply not, but I'm not sure how that could
come about.
-bcd
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Serious performance regression in diff between 1.6.6 and 1.7.0
2010-06-10 18:14 ` Brian Downing
@ 2010-06-10 18:42 ` Brian Downing
2010-06-11 2:59 ` [PATCH] unpack-trees: Make index lookahead less pessimal Brian Downing
1 sibling, 0 replies; 6+ messages in thread
From: Brian Downing @ 2010-06-10 18:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Thu, Jun 10, 2010 at 01:14:21PM -0500, Brian Downing wrote:
> That makes it look sort of exponential (perhaps around files^1.5),
> though from what I can understand of the find_cache_pos code in
> unpack-trees it would depend on the exact shape of the repository. It
> does seem to linear-search over whole directory trees of the index
> repeatedly, though, which would support the exponential theory.
(I meant polynomial with an exponent > 1 here, not exponential [in the
2^n sense]; sorry!)
-bcd
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] unpack-trees: Make index lookahead less pessimal
2010-06-10 18:14 ` Brian Downing
2010-06-10 18:42 ` Brian Downing
@ 2010-06-11 2:59 ` Brian Downing
2010-06-11 4:26 ` Junio C Hamano
1 sibling, 1 reply; 6+ messages in thread
From: Brian Downing @ 2010-06-11 2:59 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
When traversing trees with an index, the current index pointer
(o->cache_base) occasionally has to be temporarily advanced forwards to
match the traversal order of the tree, which is not the same as the sort
order of the index. The existing algorithm that did this (introduced in
730f72840cc50c523fe4cdd796ea2d2fc4571a28) would get "stuck" when the
cache_base was popped and then repeatedly check the same index entries
over and over. This represents a serious performance regression for
large repositories compared to the old "broken" traversal order.
This commit makes a simple change to mitigate this. Whenever
find_cache_pos sees that the current pos is also the cache_base, and it
has already been unpacked, it advances the cache_base as well as the
current pos. This prevents the above "sticking" behavior without
dramatically changing the algorithm.
In addition, this commit moves the unpacked check above the
ce_in_traverse_path() check. The simple bitmask check is cheaper, and
in the case described above will be firing quite a bit to advance the
cache_base after a tree pop.
This yields considerable performance improvements for large trees.
The following are the number of function calls for "git diff HEAD" on
the Linux kernel tree, with 33,307 files:
Symbol Calls Before Calls After
------------------- ------------ -----------
unpack_callback 35,332 35,332
find_cache_pos 37,357 37,357
ce_in_traverse_path 4,979,473 37,357
do_compare_entry 6,828,181 251,925
df_name_compare 6,828,181 251,925
And on a repository of 187,456 files:
Symbol Calls Before Calls After
------------------- ------------ -----------
unpack_callback 197,958 197,958
find_cache_pos 208,460 208,460
ce_in_traverse_path 37,308,336 208,460
do_compare_entry 156,950,469 2,690,626
df_name_compare 156,950,469 2,690,626
On the latter repository, user time for "git diff HEAD" was reduced from
5.58 to 0.42 seconds. This is compared to 0.30 seconds before the
traversal order fix was implemented.
Signed-off-by: Brian Downing <bdowning@lavos.net>
---
So, yeah. I admit frobbing cache_bottom from inside find_cache_pos
feels a little ugly, but it does minimize changes to the algorithm,
and the performance results speak for themselves.
As far as I can tell, there's no way this modified algorithm can
return different results from the original. (Obviously all the
working tests still pass.) Both reordered cases simply called
continue, so the order wouldn't matter unless ce_in_traverse_path
has side effects, and the "continue" for the unpacked test is just
a bit more permanent now. Still, I'd appreciate it if somebody who
knows what they are doing with this code could confirm this.
I'd like to see this on maint if possible, since I have repo that's
pretty unpleasant to use with the current release.
-bcd
unpack-trees.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/unpack-trees.c b/unpack-trees.c
index c29a9e0..94b8ecd 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -520,9 +520,17 @@ static int find_cache_pos(struct traverse_info *info,
const char *ce_name, *ce_slash;
int cmp, ce_len;
- if (!ce_in_traverse_path(ce, info))
+ if (ce->ce_flags & CE_UNPACKED) {
+ /*
+ * cache_bottom entry is already unpacked, so
+ * we can never match it; don't check it
+ * again.
+ */
+ if (pos == o->cache_bottom)
+ ++o->cache_bottom;
continue;
- if (ce->ce_flags & CE_UNPACKED)
+ }
+ if (!ce_in_traverse_path(ce, info))
continue;
ce_name = ce->name + pfxlen;
ce_slash = strchr(ce_name, '/');
--
1.7.1
On Thu, Jun 10, 2010 at 01:14:21PM -0500, Brian Downing wrote:
> On Thu, Jun 10, 2010 at 12:08:04PM -0500, Brian Downing wrote:
> > I also ran this through callgrind to see how often the above were called:
>
> (187,456 files)
>
> > Calls Symbol
> > ----------- -------------------
> > 197,958 unpack_callback
> > 208,460 find_cache_pos
> > 37,308,336 ce_in_traverse_path
> > 156,950,469 do_compare_entry
> > 156,950,469 df_name_compare
>
> Here is an identical run (git-diff HEAD) from the Linux kernel tree
> (33,307 files):
>
> Calls Symbol
> ----------- -------------------
> 35,332 unpack_callback
> 37,357 find_cache_pos
> 4,979,473 ce_in_traverse_path
> 6,828,181 do_compare_entry
> 6,828,181 df_name_compare
>
> That makes it look sort of exponential (perhaps around files^1.5),
> though from what I can understand of the find_cache_pos code in
> unpack-trees it would depend on the exact shape of the repository. It
> does seem to linear-search over whole directory trees of the index
> repeatedly, though, which would support the exponential theory.
>
> Unfortunately I don't really understand what the code is trying to do.
> Is it not the case that trees and the index are always stored sorted in
> the same order? The examples given in the commit messages that
> introduced this fix would imply not, but I'm not sure how that could
> come about.
>
> -bcd
>
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] unpack-trees: Make index lookahead less pessimal
2010-06-11 2:59 ` [PATCH] unpack-trees: Make index lookahead less pessimal Brian Downing
@ 2010-06-11 4:26 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2010-06-11 4:26 UTC (permalink / raw)
To: Brian Downing; +Cc: git
I _knew_ I coded that one pessimally and the code could be optimized not
to scan the same thing over and over again.
I haven't looked at the affected codepath for quite some time, but I'll
try to find time to review the patch when able.
Thanks.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2010-06-11 4:26 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-10 0:10 Serious performance regression in diff between 1.6.6 and 1.7.0 Brian Downing
2010-06-10 17:08 ` Brian Downing
2010-06-10 18:14 ` Brian Downing
2010-06-10 18:42 ` Brian Downing
2010-06-11 2:59 ` [PATCH] unpack-trees: Make index lookahead less pessimal Brian Downing
2010-06-11 4:26 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).