From: Davide Libenzi <davidel@xmailserver.org>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Jim Meyering <jim@meyering.net>, Git Mailing List <git@vger.kernel.org>
Subject: Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
Date: Mon, 16 Oct 2006 11:18:22 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0610161109430.7697@alien.or.mcafeemobile.com>
In-Reply-To: <Pine.LNX.4.64.0610161038200.3962@g5.osdl.org>
On Mon, 16 Oct 2006, Linus Torvalds wrote:
> On Mon, 16 Oct 2006, Jim Meyering wrote:
> >
> > That helps a little.
> > Now, instead of taking 63s, my test takes ~30s.
> > (32 for XDL_MAX_EQLIMIT = 16, 30 for XDL_MAX_EQLIMIT = 8)
>
> Btw, what architecture is this on?
>
> I'm testing those two files, and I get much more reasonable numbers with
> both ppc32 and x86. Both 32-bit:
>
> [torvalds@macmini test-perf]$ time git show | wc -l
> 25221
>
> real 0m1.437s
> user 0m1.436s
> sys 0m0.012s
>
> ie it generated the diff in less than a second and a half. Not wonderful,
> but certainly not your 63s either.
>
> HOWEVER. On x86-64, it takes forever (still not 63 seconds, but it takes
> 17 seconds on my 2GHz merom machine).
>
> So I think there's something seriously broken with hashing on 64-bit.
>
> And I think I know what it is.
>
> Try this patch. And make sure to do a "make clean" first, since I think
> the dependencies on xdiff may be broken.
>
> Davide: there's two things wrong with your old XDL_HASHLONG():
>
> - the GR_PRIME was just 32-bit, so it wouldn't shift low bits up far
> enough on a 64-bit architecture, so then shifting things down caused
> pretty much everything to be very small.
>
> - The whole idea of shifting up by multiplying and then shifting down to
> get the high bits is _broken_. Even on 32-bit architectures. Think
> about what happens when "hashbits" is 16 on a 32-bit architecture: the
> multiply moves the low bits _up_, but it doesn't move the high bits
> _down_. And with hashbits being a large fraction of the whole word, you
> need to shift things down, not up.
>
> So just making GR_PRIME be a bigger value on a 64-bit architecture would
> not have fixed it. The whole hash was simply broken. Do it the sane and
> obvious way instead: always pick the low bits, but mix in upper bits there
> too..
Yeah, using an appropriate golden ratio prime for 64 bits fixes it. I
think it's the best/minimal fix (use 0x9e37fffffffc0001UL, like the
kernel does).
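
To make the prime-width point concrete, here is a minimal sketch, not the actual xdiff code. The function names and the fixed 64-bit word are assumptions made for illustration; 0x9e370001 is the kernel's 32-bit golden ratio prime, and the low-bits variant is only one reading of the "pick the low bits, but mix in upper bits" description quoted above.

```c
#include <stdint.h>

/* 32-bit golden ratio prime (kernel value) and the 64-bit one named above. */
#define GR_PRIME_32 UINT64_C(0x9e370001)
#define GR_PRIME_64 UINT64_C(0x9e37fffffffc0001)

/*
 * Multiplicative hash on a 64-bit word: multiply, then keep the top
 * `bits` bits. Hypothetical helper, assumes 0 < bits < 64.
 */
static uint64_t hash_top_bits(uint64_t v, uint64_t prime, unsigned bits)
{
	return (v * prime) >> (64 - bits);
}

/*
 * Alternative sketch: fold the bits above `bits` into the low bits once,
 * then mask to `bits` bits. Hypothetical helper, assumes 0 < bits < 64.
 */
static uint64_t hash_low_bits(uint64_t v, unsigned bits)
{
	return (v + (v >> bits)) & ((UINT64_C(1) << bits) - 1);
}
```

With a 16-bit table, hash_top_bits(v, GR_PRIME_32, 16) is 0 for every v below 65536, because the product never reaches bit 48. Small inputs therefore all land in one bucket, while the same call with GR_PRIME_64 spreads them across the table.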
I'm also looking into optimizing the multi-match discard loop, which
actually loses the classifier information collected in the context
prepare phase.
- Davide