git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* git-diff-tree inordinately (O(M*N)) slow on files with many changes
@ 2006-10-16 14:12 Jim Meyering
  2006-10-16 15:47 ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Jim Meyering @ 2006-10-16 14:12 UTC (permalink / raw)
  To: git

[using very latest code, built an hour ago:
 git version 1.4.3.rc3.gb32db-dirty ]

I found that git-diff-tree is *very* slow when processing files with
many changes.  The offending example involves comparing the configure
file from coreutils-6.3 with that from the latest coreutils development
sources.  Both are over 50k lines long, and the diff -u output is almost
50k-lines long, divided into ~700 hunks.

  http://meyering.net/code/git-perf/configure.gz
  http://meyering.net/code/git-perf/configure-curr.gz

Comparing them with "diff -u" takes about 0.3s.
Putting them in a git repo (uncompressed, and with the same name,
of course) and comparing with git-diff-tree takes over a minute.
That's 200 times slower.

I traced the problem to xdiff/xprepare.c's xdl_cleanup_records function.
Each of its two doubly-nested loops ends up running the inner-loop
code more than 2 *billion* times.

That seems to be due to the two typos fixed by this patch:
With this patch, my "git-diff-tree --no-commit-id -r -p 2c2172"
command completes in just 2 seconds.

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 1be7b31..e5438a9 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -381,7 +381,7 @@ static int xdl_cleanup_records(xdfile_t
 		hav = (*recs)->ha;
 		rhi = (long) XDL_HASHLONG(hav, xdf2->hbits);
 		for (nm = 0, rec = xdf2->rhash[rhi]; rec; rec = rec->next)
-			if (rec->ha == hav && ++nm == mlim)
+			if (rec->ha == hav || ++nm == mlim)
 				break;
 		dis1[i] = (nm == 0) ? 0: (nm >= mlim) ? 2: 1;
 	}
@@ -392,7 +392,7 @@ static int xdl_cleanup_records(xdfile_t
 		hav = (*recs)->ha;
 		rhi = (long) XDL_HASHLONG(hav, xdf1->hbits);
 		for (nm = 0, rec = xdf1->rhash[rhi]; rec; rec = rec->next)
-			if (rec->ha == hav && ++nm == mlim)
+			if (rec->ha == hav || ++nm == mlim)
 				break;
 		dis2[i] = (nm == 0) ? 0: (nm >= mlim) ? 2: 1;
 	}

However, that change causes the t800*.sh (14,16,18) annotate/blame
tests to fail, so take it with a grain of salt.

I.e., running one of them manually gave this:

    $ sh t8001-annotate.sh --immediate --verbose
    * expecting success: check_count A 2 B 1 B1 2 B2 1 "A U Thor" 1
    Author A (expected 2, attributed 2) good
    Author B1 (expected 2, attributed 1) bad
    Author A U Thor (expected 1, attributed 2) bad
    Author B2 (expected 1, attributed 1) good
    Author B (expected 1, attributed 1) good
    * FAIL 14: Two lines blamed on A, one on B, two on B1, one on B2, one on A U Thor
            check_count A 2 B 1 B1 2 B2 1 "A U Thor" 1
    [Exit 1]

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 14:12 git-diff-tree inordinately (O(M*N)) slow on files with many changes Jim Meyering
@ 2006-10-16 15:47 ` Linus Torvalds
  2006-10-16 16:12   ` Linus Torvalds
  2006-10-16 16:24   ` Davide Libenzi
  0 siblings, 2 replies; 27+ messages in thread
From: Linus Torvalds @ 2006-10-16 15:47 UTC (permalink / raw)
  To: Jim Meyering, Davide Libenzi; +Cc: Git Mailing List


Davide? I'm quoting the whole report, because I suspect you don't follow 
the git lists, and this is all original libxdiff code.

Jim: the annotation failure _may_ just be due to a "valid" diff change 
(there is not always a unique correct answer for a diff, and so two 
different diff algorithms can validly give two different answers, which 
will also mean that git-annotate/blame would give different explanations). 

But it could certainly also be that you just broke the diffs entirely, so 
I would like to wait for Davide to comment on your diff before Junio 
should apply it. 

Others may be intimately familiar with the diff algorithms too, of course. 
It just scares me personally ;)

		Linus

On Mon, 16 Oct 2006, Jim Meyering wrote:
>
> [using very latest code, built an hour ago:
>  git version 1.4.3.rc3.gb32db-dirty ]
> 
> I found that git-diff-tree is *very* slow when processing files with
> many changes.  The offending example involves comparing the configure
> file from coreutils-6.3 with that from the latest coreutils development
> sources.  Both are over 50k lines long, and the diff -u output is almost
> 50k-lines long, divided into ~700 hunks.
> 
>   http://meyering.net/code/git-perf/configure.gz
>   http://meyering.net/code/git-perf/configure-curr.gz
> 
> Comparing them with "diff -u" takes about 0.3s.
> Putting them in a git repo (uncompressed, and with the same name,
> of course) and comparing with git-diff-tree takes over a minute.
> That's 200 times slower.
> 
> I traced the problem to xdiff/xprepare.c's xdl_cleanup_records function.
> Each of its two doubly-nested loops ends up running the inner-loop
> code more than 2 *billion* times.
> 
> That seems to be due to the two typos fixed by this patch:
> With this patch, my "git-diff-tree --no-commit-id -r -p 2c2172"
> command completes in just 2 seconds.
> 
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 1be7b31..e5438a9 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -381,7 +381,7 @@ static int xdl_cleanup_records(xdfile_t
>  		hav = (*recs)->ha;
>  		rhi = (long) XDL_HASHLONG(hav, xdf2->hbits);
>  		for (nm = 0, rec = xdf2->rhash[rhi]; rec; rec = rec->next)
> -			if (rec->ha == hav && ++nm == mlim)
> +			if (rec->ha == hav || ++nm == mlim)
>  				break;
>  		dis1[i] = (nm == 0) ? 0: (nm >= mlim) ? 2: 1;
>  	}
> @@ -392,7 +392,7 @@ static int xdl_cleanup_records(xdfile_t
>  		hav = (*recs)->ha;
>  		rhi = (long) XDL_HASHLONG(hav, xdf1->hbits);
>  		for (nm = 0, rec = xdf1->rhash[rhi]; rec; rec = rec->next)
> -			if (rec->ha == hav && ++nm == mlim)
> +			if (rec->ha == hav || ++nm == mlim)
>  				break;
>  		dis2[i] = (nm == 0) ? 0: (nm >= mlim) ? 2: 1;
>  	}
> 
> However, that change causes the t800*.sh (14,16,18) annotate/blame
> tests to fail, so take it with a grain of salt.
> 
> I.e., running one of them manually gave this:
> 
>     $ sh t8001-annotate.sh --immediate --verbose
>     * expecting success: check_count A 2 B 1 B1 2 B2 1 "A U Thor" 1
>     Author A (expected 2, attributed 2) good
>     Author B1 (expected 2, attributed 1) bad
>     Author A U Thor (expected 1, attributed 2) bad
>     Author B2 (expected 1, attributed 1) good
>     Author B (expected 1, attributed 1) good
>     * FAIL 14: Two lines blamed on A, one on B, two on B1, one on B2, one on A U Thor
>             check_count A 2 B 1 B1 2 B2 1 "A U Thor" 1
>     [Exit 1]
> -
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 15:47 ` Linus Torvalds
@ 2006-10-16 16:12   ` Linus Torvalds
  2006-10-16 16:33     ` Jim Meyering
  2006-10-16 16:36     ` Davide Libenzi
  2006-10-16 16:24   ` Davide Libenzi
  1 sibling, 2 replies; 27+ messages in thread
From: Linus Torvalds @ 2006-10-16 16:12 UTC (permalink / raw)
  To: Jim Meyering, Davide Libenzi; +Cc: Git Mailing List



On Mon, 16 Oct 2006, Linus Torvalds wrote:
> 
> But it could certainly also be that you just broke the diffs entirely, so 
> I would like to wait for Davide to comment on your diff before Junio 
> should apply it. 

I think you broke it. 

If the "&& vs ||" makes a difference (and it clearly does), that implies 
that you have lots of different hash values on the same hash chain, and 
you end up considering those _different_ hash values to be all equivalent 
for the counting, even though they obviously aren't.

I think the real problem is that with big input, the hash tables are too 
small, making the hash chains too long - even though the values on the 
chains are different (ie we're not hashing different records with the same 
hash value over and over again - if that was true, the "&& vs ||" change 
wouldn't make any difference).

So I think xdiff has chosen too small a hash. Can you try what happens if 
you change xdl_hashbits() (in xdiff/xutil.c) instead? Try making it return 
a bigger value (for example, by initializing "bits" to 2 instead of 0), 
and see if that makes a difference.

But again, I'm not actually all _that_ familiar with the libxdiff 
algorithms, _especially_ the line-based ones (I can follow the regular 
binary delta code, but the line-based one just makes my head hurt). So 
take anything I say with a pinch of salt.

		Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 15:47 ` Linus Torvalds
  2006-10-16 16:12   ` Linus Torvalds
@ 2006-10-16 16:24   ` Davide Libenzi
  2006-10-16 16:54     ` Jakub Narebski
  1 sibling, 1 reply; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 16:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jim Meyering, Git Mailing List

On Mon, 16 Oct 2006, Linus Torvalds wrote:

> 
> Davide? I'm quoting the whole report, because I suspect you don't follow 
> the git lists, and this is all original libxdiff code.
> 
> Jim: the annotation failure _may_ just be due to a "valid" diff change 
> (there is not always a unique correct answer for a diff, and so two 
> different diff algorithms can validly give two different answers, which 
> will also mean that git-annotate/blame would give different explanations). 
> 
> But it could certainly also be that you just broke the diffs entirely, so 
> I would like to wait for Davide to comment on your diff before Junio 
> should apply it. 
> 
> Others may be intimately familiar with the diff algorithms too, of course. 
> It just scares me personally ;)

The test is fine as is. Only really bad hash collisions can show O(M*N). 
Can I have the two sample files to test?



- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:12   ` Linus Torvalds
@ 2006-10-16 16:33     ` Jim Meyering
  2006-10-16 16:42       ` Davide Libenzi
  2006-10-16 16:54       ` Linus Torvalds
  2006-10-16 16:36     ` Davide Libenzi
  1 sibling, 2 replies; 27+ messages in thread
From: Jim Meyering @ 2006-10-16 16:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Davide Libenzi, Git Mailing List

Linus Torvalds <torvalds@osdl.org> wrote:
> On Mon, 16 Oct 2006, Linus Torvalds wrote:
...
> So I think xdiff has chosen too small a hash. Can you try what happens if
> you change xdl_hashbits() (in xdiff/xutil.c) instead? Try making it return
> a bigger value (for example, by initializing "bits" to 2 instead of 0),
> and see if that makes a difference.

It makes no difference.

Bear in mind that there are a *lot* of duplicate lines in the files
being compared: filtering each through "sort -u" removes 40-50k lines.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:12   ` Linus Torvalds
  2006-10-16 16:33     ` Jim Meyering
@ 2006-10-16 16:36     ` Davide Libenzi
  2006-10-16 16:57       ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 16:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jim Meyering, Git Mailing List

On Mon, 16 Oct 2006, Linus Torvalds wrote:

> On Mon, 16 Oct 2006, Linus Torvalds wrote:
> > 
> > But it could certainly also be that you just broke the diffs entirely, so 
> > I would like to wait for Davide to comment on your diff before Junio 
> > should apply it. 
> 
> I think you broke it. 
> 
> If the "&& vs ||" makes a difference (and it clearly does), that implies 
> that you have lots of different hash values on the same hash chain, and 
> you end up considering those _different_ hash values to be all equivalent 
> for the counting, even though they obviously aren't.
> 
> I think the real problem is that with big input, the hash tables are too 
> small, making the hash chains too long - even though the values on the 
> chains are different (ie we're not hashing different records with the same 
> hash value over and over again - if that was true, the "&& vs ||" change 
> wouldn't make any difference).
> 
> So I think xdiff has chosen too small a hash. Can you try what happens if 
> you change xdl_hashbits() (in xdiff/xutil.c) instead? Try making it return 
> a bigger value (for example, by initializing "bits" to 2 instead of 0), 
> and see if that makes a difference.

I think the xdl_hashbits() picks up the hash table size "almost" 
correctly. I think we're looking at some bad hash *collisions* (not 
records with same hash value, that'd be stopped by the mlim check). 
Send me the files and I'll take a look ...




> But again, I'm not actually all _that_ familiar with the libxdiff 
> algorithms, _especially_ the line-based ones (I can follow the regular 
> binary delta code, but the line-based one just makes my head hurt). So 
> take anything I say with a pinch of salt.

That's my revenge on myself having to follow your code in the kernel  :D




- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:33     ` Jim Meyering
@ 2006-10-16 16:42       ` Davide Libenzi
  2006-10-16 16:50         ` Jim Meyering
  2006-10-16 16:54       ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 16:42 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Linus Torvalds, Git Mailing List

On Mon, 16 Oct 2006, Jim Meyering wrote:

> Linus Torvalds <torvalds@osdl.org> wrote:
> > On Mon, 16 Oct 2006, Linus Torvalds wrote:
> ...
> > So I think xdiff has chosen too small a hash. Can you try what happens if
> > you change xdl_hashbits() (in xdiff/xutil.c) instead? Try making it return
> > a bigger value (for example, by initializing "bits" to 2 instead of 0),
> > and see if that makes a difference.
> 
> It makes no difference.
> 
> Bear in mind that there are a *lot* of duplicate lines in the files
> being compared: filtering each through "sort -u" removes 40-50k lines.

Ok, try to bring down XDL_MAX_EQLIMIT to something like 8 or 16 ...



- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:42       ` Davide Libenzi
@ 2006-10-16 16:50         ` Jim Meyering
  2006-10-16 16:54           ` Davide Libenzi
  2006-10-16 17:56           ` Linus Torvalds
  0 siblings, 2 replies; 27+ messages in thread
From: Jim Meyering @ 2006-10-16 16:50 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Linus Torvalds, Git Mailing List

Davide Libenzi <davidel@xmailserver.org> wrote:
> On Mon, 16 Oct 2006, Jim Meyering wrote:
>
>> Linus Torvalds <torvalds@osdl.org> wrote:
>> > On Mon, 16 Oct 2006, Linus Torvalds wrote:
>> ...
>> > So I think xdiff has chosen too small a hash. Can you try what happens if
>> > you change xdl_hashbits() (in xdiff/xutil.c) instead? Try making it return
>> > a bigger value (for example, by initializing "bits" to 2 instead of 0),
>> > and see if that makes a difference.
>>
>> It makes no difference.
>>
>> Bear in mind that there are a *lot* of duplicate lines in the files
>> being compared: filtering each through "sort -u" removes 40-50k lines.
>
> Ok, try to bring down XDL_MAX_EQLIMIT to something like 8 or 16 ...

That helps a little.
Now, instead of taking 63s, my test takes ~30s.
(32 for XDL_MAX_EQLIMIT = 16, 30 for XDL_MAX_EQLIMIT = 8)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:24   ` Davide Libenzi
@ 2006-10-16 16:54     ` Jakub Narebski
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Narebski @ 2006-10-16 16:54 UTC (permalink / raw)
  To: git

<opublikowany i wysłany>

Davide Libenzi wrote:

> The test is fine as is. Only really bad hash collisions can show O(M*N). 
> Can I have the two sample files to test?

Jim Meyering wrote:

>   http://meyering.net/code/git-perf/configure.gz
>   http://meyering.net/code/git-perf/configure-curr.gz

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:33     ` Jim Meyering
  2006-10-16 16:42       ` Davide Libenzi
@ 2006-10-16 16:54       ` Linus Torvalds
  1 sibling, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2006-10-16 16:54 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Davide Libenzi, Git Mailing List



On Mon, 16 Oct 2006, Jim Meyering wrote:

> Linus Torvalds <torvalds@osdl.org> wrote:
> > On Mon, 16 Oct 2006, Linus Torvalds wrote:
> ...
> > So I think xdiff has chosen too small a hash. Can you try what happens if
> > you change xdl_hashbits() (in xdiff/xutil.c) instead? Try making it return
> > a bigger value (for example, by initializing "bits" to 2 instead of 0),
> > and see if that makes a difference.
> 
> It makes no difference.
> 
> Bear in mind that there are a *lot* of duplicate lines in the files
> being compared: filtering each through "sort -u" removes 40-50k lines.

It can't be due to duplicate lines. If the lines are truly duplicate, then 
they'd get the same 32-bit hash value, and then the first conditional in 
the expression would always be true, and then it wouldn't _matter_ if it's 
a "&&" or a "||".

See?

So as far as I can tell it has to be some kind of collission on the hash 
queue with _different_ hash values being queued on the same hash queue.

Now, it could be that there's a bad hash algorithm somewhere (eg if 
XDL_HASHLONG() just does horribly badly in distributing the hash values 
onto the hash queues, you'd see this _regardless_ of how many bits you 
have, just because it clumps).

Or there could be something else that I'm just missing..

It would probably be nice to just get a sampling of what the hash-queue 
looks like for the bad case? Maybe it would be obvious that certain 
different hash values then get the same XDL_HASHLONG() thing..

		Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:50         ` Jim Meyering
@ 2006-10-16 16:54           ` Davide Libenzi
  2006-10-16 16:57             ` Jim Meyering
  2006-10-16 17:56           ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 16:54 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Linus Torvalds, Git Mailing List

On Mon, 16 Oct 2006, Jim Meyering wrote:

> Davide Libenzi <davidel@xmailserver.org> wrote:
> > On Mon, 16 Oct 2006, Jim Meyering wrote:
> >
> >> Linus Torvalds <torvalds@osdl.org> wrote:
> >> > On Mon, 16 Oct 2006, Linus Torvalds wrote:
> >> ...
> >> > So I think xdiff has chosen too small a hash. Can you try what happens if
> >> > you change xdl_hashbits() (in xdiff/xutil.c) instead? Try making it return
> >> > a bigger value (for example, by initializing "bits" to 2 instead of 0),
> >> > and see if that makes a difference.
> >>
> >> It makes no difference.
> >>
> >> Bear in mind that there are a *lot* of duplicate lines in the files
> >> being compared: filtering each through "sort -u" removes 40-50k lines.
> >
> > Ok, try to bring down XDL_MAX_EQLIMIT to something like 8 or 16 ...
> 
> That helps a little.
> Now, instead of taking 63s, my test takes ~30s.
> (32 for XDL_MAX_EQLIMIT = 16, 30 for XDL_MAX_EQLIMIT = 8)

That's too much still. May I have the offending files?



- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:54           ` Davide Libenzi
@ 2006-10-16 16:57             ` Jim Meyering
  2006-10-16 17:02               ` Davide Libenzi
  0 siblings, 1 reply; 27+ messages in thread
From: Jim Meyering @ 2006-10-16 16:57 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Linus Torvalds, Git Mailing List

Davide Libenzi <davidel@xmailserver.org> wrote:
>> That helps a little.
>> Now, instead of taking 63s, my test takes ~30s.
>> (32 for XDL_MAX_EQLIMIT = 16, 30 for XDL_MAX_EQLIMIT = 8)
>
> That's too much still. May I have the offending files?

These URLs were in at least one of the messages to which
you've replied.  Would you like me to send them instead?

>   http://meyering.net/code/git-perf/configure.gz
>   http://meyering.net/code/git-perf/configure-curr.gz

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:36     ` Davide Libenzi
@ 2006-10-16 16:57       ` Linus Torvalds
  0 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2006-10-16 16:57 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Jim Meyering, Git Mailing List



On Mon, 16 Oct 2006, Davide Libenzi wrote:
>
> I think the xdl_hashbits() picks up the hash table size "almost" 
> correctly. I think we're looking at some bad hash *collisions* (not 
> records with same hash value, that'd be stopped by the mlim check). 
> Send me the files and I'll take a look ...

Davide, they were mentioned in the original report:

	http://meyering.net/code/git-perf/configure.gz
	http://meyering.net/code/git-perf/configure-curr.gz

I'd take a look myself, but I need to run..

		Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:57             ` Jim Meyering
@ 2006-10-16 17:02               ` Davide Libenzi
  0 siblings, 0 replies; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 17:02 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Linus Torvalds, Git Mailing List

On Mon, 16 Oct 2006, Jim Meyering wrote:

> Davide Libenzi <davidel@xmailserver.org> wrote:
> >> That helps a little.
> >> Now, instead of taking 63s, my test takes ~30s.
> >> (32 for XDL_MAX_EQLIMIT = 16, 30 for XDL_MAX_EQLIMIT = 8)
> >
> > That's too much still. May I have the offending files?
> 
> These URLs were in at least one of the messages to which
> you've replied.  Would you like me to send them instead?
> 
> >   http://meyering.net/code/git-perf/configure.gz
> >   http://meyering.net/code/git-perf/configure-curr.gz

Thanks, those are fine.



- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 16:50         ` Jim Meyering
  2006-10-16 16:54           ` Davide Libenzi
@ 2006-10-16 17:56           ` Linus Torvalds
  2006-10-16 18:03             ` Linus Torvalds
                               ` (2 more replies)
  1 sibling, 3 replies; 27+ messages in thread
From: Linus Torvalds @ 2006-10-16 17:56 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Davide Libenzi, Git Mailing List



On Mon, 16 Oct 2006, Jim Meyering wrote:
> 
> That helps a little.
> Now, instead of taking 63s, my test takes ~30s.
> (32 for XDL_MAX_EQLIMIT = 16, 30 for XDL_MAX_EQLIMIT = 8)

Btw, what architecture is this on?

I'm testing those two files, and I get much more reasonable numbers with 
both ppc32 and x86. Both 32-bit:

	[torvalds@macmini test-perf]$ time git show | wc -l
	25221

	real    0m1.437s
	user    0m1.436s
	sys     0m0.012s

ie it generated the diff in less than a second and a half. Not wonderful, 
but certainly not your 63s either.

HOWEVER. On x86-64, it takes forever (still not 63 seconds, but it takes 
17 seconds on my 2GHz merom machine).

So I think there's something seriously broken with hashing on 64-bit. 

And I think I know what it is.

Try this patch. And make sure to do a "make clean" first, since I think 
the dependencies on xdiff may be broken.

Davide: there's two things wrong with your old XDL_HASHLONG():

 - the GR_PRIME was just 32-bit, so it wouldn't shift low bits up far 
   enough on a 64-bit architecture, so then shifting things down caused 
   pretty much everything to be very small.

 - The whole idea of shifting up by multiplying and then shifting down to 
   get the high bits is _broken_. Even on 32-bit architectures. Think 
   about what happens when "hashbits" is 16 on a 32-bit architecture: the 
   multiply moves the low bits _up_, but it doesn't move the high bits 
   _down_. And with hashbits being a large fraction of the whole word, you 
   need to shift things down, not up.

So just making GR_PRIME be a bigger value on a 64-bit architecture would 
not have fixed it. The whole hash was simply broken. Do it the sane and 
obvious way instead: always pick the low bits, but mix in upper bits there 
too..

This patch brings the time down from 17 seconds to 0.8 seconds for me.

		Linus

---
diff --git a/xdiff/xmacros.h b/xdiff/xmacros.h
index 4c2fde8..bb4830b 100644
--- a/xdiff/xmacros.h
+++ b/xdiff/xmacros.h
@@ -24,14 +24,27 @@ #if !defined(XMACROS_H)
 #define XMACROS_H
 
 
-#define GR_PRIME 0x9e370001UL
+static inline unsigned long xdl_hashlong(unsigned long val, unsigned int bits)
+{
+	unsigned long shift = val >> bits;
+
+	/* Shift in the upper bits too */
+	val += shift;
+
+	/* Do it twice for small values of bits */
+	if (bits < 4*sizeof(unsigned long))
+		val += shift >> bits;
+
+	/* Return the resulting low bits */
+	return val & ((1ul << bits)-1);
+}
 
 
 #define XDL_MIN(a, b) ((a) < (b) ? (a): (b))
 #define XDL_MAX(a, b) ((a) > (b) ? (a): (b))
 #define XDL_ABS(v) ((v) >= 0 ? (v): -(v))
 #define XDL_ISDIGIT(c) ((c) >= '0' && (c) <= '9')
-#define XDL_HASHLONG(v, b) (((unsigned long)(v) * GR_PRIME) >> ((CHAR_BIT * sizeof(unsigned long)) - (b)))
+#define XDL_HASHLONG(v, b) xdl_hashlong(v,b)
 #define XDL_PTRFREE(p) do { if (p) { xdl_free(p); (p) = NULL; } } while (0)
 #define XDL_LE32_PUT(p, v) \
 do { \

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 17:56           ` Linus Torvalds
@ 2006-10-16 18:03             ` Linus Torvalds
  2006-10-16 18:41               ` Davide Libenzi
  2006-10-16 18:18             ` Davide Libenzi
  2006-10-16 18:24             ` Jim Meyering
  2 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2006-10-16 18:03 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Davide Libenzi, Git Mailing List



On Mon, 16 Oct 2006, Linus Torvalds wrote:
> 
> So just making GR_PRIME be a bigger value on a 64-bit architecture would 
> not have fixed it.

Side note: in _practice_ I think it would have fixed it. The "not mixing 
in high bits" is not a real problem if the original hash-value has a good 
distribution of bits, which I think we do have. So it's unclear whether we 
even need any mixing in of bits at all, and it's possible that it would be 
fine to just have

	#define XDL_HASHLONG(v,b) ((unsigned long)(v) & ((1ul << (b))-1))

which is simpler than my patch.

I prefer the mixing in of high bits just because it can help if the 
original hash was bad (or had a tendency to have patterns in the low bits, 
which could be the case). But I'm not sure xdiff actually needs it in this 
case.

			Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 17:56           ` Linus Torvalds
  2006-10-16 18:03             ` Linus Torvalds
@ 2006-10-16 18:18             ` Davide Libenzi
  2006-10-16 18:51               ` Linus Torvalds
  2006-10-16 18:24             ` Jim Meyering
  2 siblings, 1 reply; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 18:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jim Meyering, Git Mailing List

On Mon, 16 Oct 2006, Linus Torvalds wrote:

> On Mon, 16 Oct 2006, Jim Meyering wrote:
> > 
> > That helps a little.
> > Now, instead of taking 63s, my test takes ~30s.
> > (32 for XDL_MAX_EQLIMIT = 16, 30 for XDL_MAX_EQLIMIT = 8)
> 
> Btw, what architecture is this on?
> 
> I'm testing those two files, and I get much more reasonable numbers with 
> both ppc32 and x86. Both 32-bit:
> 
> 	[torvalds@macmini test-perf]$ time git show | wc -l
> 	25221
> 
> 	real    0m1.437s
> 	user    0m1.436s
> 	sys     0m0.012s
> 
> ie it generated the diff in less than a second and a half. Not wonderful, 
> but certainly not your 63s either.
> 
> HOWEVER. On x86-64, it takes forever (still not 63 seconds, but it takes 
> 17 seconds on my 2GHz merom machine).
> 
> So I think there's something seriously broken with hashing on 64-bit. 
> 
> And I think I know what it is.
> 
> Try this patch. And make sure to do a "make clean" first, since I think 
> the dependencies on xdiff may be broken.
> 
> Davide: there's two things wrong with your old XDL_HASHLONG():
> 
>  - the GR_PRIME was just 32-bit, so it wouldn't shift low bits up far 
>    enough on a 64-bit architecture, so then shifting things down caused 
>    pretty much everything to be very small.
> 
>  - The whole idea of shifting up by multiplying and then shifting down to 
>    get the high bits is _broken_. Even on 32-bit architectures. Think 
>    about what happens when "hashbits" is 16 on a 32-bit architecture: the 
>    multiply moves the low bits _up_, but it doesn't move the high bits 
>    _down_. And with hashbits being a large fraction of the whole word, you 
>    need to shift things down, not up.
> 
> So just making GR_PRIME be a bigger value on a 64-bit architecture would 
> not have fixed it. The whole hash was simply broken. Do it the sane and 
> obvious way instead: always pick the low bits, but mix in upper bits there 
> too..

Yeah, using an appropriate golden ratio prime for 64 bits fixes it. I 
think it's the best/minimal fix (use 0x9e37fffffffc0001UL, like the 
kernel does).
I'm also looking into optimizing the multi-match discard loop, that 
actually loses the classifier informations collected in the context 
prepare phase.




- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 17:56           ` Linus Torvalds
  2006-10-16 18:03             ` Linus Torvalds
  2006-10-16 18:18             ` Davide Libenzi
@ 2006-10-16 18:24             ` Jim Meyering
  2006-10-16 18:30               ` Davide Libenzi
  2 siblings, 1 reply; 27+ messages in thread
From: Jim Meyering @ 2006-10-16 18:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Davide Libenzi, Git Mailing List

Linus Torvalds <torvalds@osdl.org> wrote:
> On Mon, 16 Oct 2006, Jim Meyering wrote:
>>
>> That helps a little.
>> Now, instead of taking 63s, my test takes ~30s.
>> (32 for XDL_MAX_EQLIMIT = 16, 30 for XDL_MAX_EQLIMIT = 8)
>
> Btw, what architecture is this on?
>
> I'm testing those two files, and I get much more reasonable numbers with
> both ppc32 and x86. Both 32-bit:
>
> 	[torvalds@macmini test-perf]$ time git show | wc -l
> 	25221
>
> 	real    0m1.437s
> 	user    0m1.436s
> 	sys     0m0.012s
>
> ie it generated the diff in less than a second and a half. Not wonderful,
> but certainly not your 63s either.
>
> HOWEVER. On x86-64, it takes forever (still not 63 seconds, but it takes
> 17 seconds on my 2GHz merom machine).
>
> So I think there's something seriously broken with hashing on 64-bit.

amd_64 @ 2.0GHz

> Try this patch. And make sure to do a "make clean" first, since I think
> the dependencies on xdiff may be broken.

Yep.  Dependencies are definitely broken.
Applied your patch.  No improvement after a plain "make",
but doing "make clean && make" solved the problem.

Now, my diff-tree takes 2s (it's comparing other files, too).
Thank you!

IMHO, my "&& vs. ||" patch is still worth applying.
If not, then the existing code doesn't make sense, and
there can be significant simplification in the affected loops.
With my patch, I get an additional 3x speed-up: diff-tree takes 0.7s

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 18:24             ` Jim Meyering
@ 2006-10-16 18:30               ` Davide Libenzi
  2006-10-16 18:43                 ` Jim Meyering
  0 siblings, 1 reply; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 18:30 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Linus Torvalds, Git Mailing List

On Mon, 16 Oct 2006, Jim Meyering wrote:

> IMHO, my "&& vs. ||" patch is still worth applying.
> If not, then the existing code doesn't make sense, and
> there can be significant simplification in the affected loops.
> With my patch, I get an additional 3x speed-up: diff-tree takes 0.7s

No, the patch is broken. It will discard *any* line seen at least two 
times, that is an extremely low threashold.



- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 18:03             ` Linus Torvalds
@ 2006-10-16 18:41               ` Davide Libenzi
  0 siblings, 0 replies; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 18:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jim Meyering, Git Mailing List

On Mon, 16 Oct 2006, Linus Torvalds wrote:

> On Mon, 16 Oct 2006, Linus Torvalds wrote:
> > 
> > So just making GR_PRIME be a bigger value on a 64-bit architecture would 
> > not have fixed it.
> 
> Side note: in _practice_ I think it would have fixed it. The "not mixing 
> in high bits" is not a real problem if the original hash-value has a good 
> distribution of bits, which I think we do have. So it's unclear whether we 
> even need any mixing in of bits at all, and it's possible that it would be 
> fine to just have
> 
> 	#define XDL_HASHLONG(v,b) ((unsigned long)(v) & ((1ul << (b))-1))
> 
> which is simpler than my patch.
> 
> I prefer the mixing in of high bits just because it can help if the 
> original hash was bad (or had a tendency to have patterns in the low bits, 
> which could be the case). But I'm not sure xdiff actually needs it in this 
> case.

ATM, I added AC_CHECK_SIZEOF(long) inside libxdiff's configure.in, and I 
have (inside xmacros.h):

#if SIZEOF_LONG == 4
#define GR_PRIME 0x9e370001UL
#else
#define GR_PRIME 0x9e37fffffffc0001UL
#endif

I'm also looking into streamlining the discard loop to reuse information 
collected during the context-setup phase ...




- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 18:30               ` Davide Libenzi
@ 2006-10-16 18:43                 ` Jim Meyering
  0 siblings, 0 replies; 27+ messages in thread
From: Jim Meyering @ 2006-10-16 18:43 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Linus Torvalds, Git Mailing List

Davide Libenzi <davidel@xmailserver.org> wrote:
...
> No, the patch is broken. It will discard *any* line seen at least two
> times, that is an extremely low threashold.

Oh!  I see it now.  Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 18:18             ` Davide Libenzi
@ 2006-10-16 18:51               ` Linus Torvalds
  2006-10-16 19:44                 ` Davide Libenzi
  2006-10-16 22:53                 ` Junio C Hamano
  0 siblings, 2 replies; 27+ messages in thread
From: Linus Torvalds @ 2006-10-16 18:51 UTC (permalink / raw)
  To: Junio C Hamano, Davide Libenzi; +Cc: Jim Meyering, Git Mailing List


Junio, I think this is worthy to go in before a 1.4.3 release. Possibly 
even back-ported to earlier trees. Anything that causes an almost two 
orders of magnitude slowdown (even if it's just on 64-bit architectures 
and most people won't necessarily compile git that way) is worth fixing 
pronto.

On Mon, 16 Oct 2006, Davide Libenzi wrote:
> 
> Yeah, using an appropriate golden ratio prime for 64 bits fixes it. I 
> think it's the best/minimal fix (use 0x9e37fffffffc0001UL, like the 
> kernel does).

Ok. But then you need something like the appended to avoid warnings..

(This is the only nice portable way to figure out at compile-time whether 
"unsigned long" is more than 32 bits that I can come up with: everything 
that uses actual C expressions ends up warning about integers not fitting 
etc)

Quite frankly, I prefer my previous patch more, it just avoids that whole 
problem, and two shifts and adds (even with a conditional) are often 
faster than a full 64-bit multiply.

		Linus
---
diff --git a/xdiff/xmacros.h b/xdiff/xmacros.h
index 4c2fde8..38f8f93 100644
--- a/xdiff/xmacros.h
+++ b/xdiff/xmacros.h
@@ -23,8 +23,13 @@
 #if !defined(XMACROS_H)
 #define XMACROS_H
 
+#include <limits.h>
 
+#if LONG_MAX > 2147483647ul
+#define GR_PRIME 0x9e37fffffffc0001UL
+#else
 #define GR_PRIME 0x9e370001UL
+#endif
 
 
 #define XDL_MIN(a, b) ((a) < (b) ? (a): (b))

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 18:51               ` Linus Torvalds
@ 2006-10-16 19:44                 ` Davide Libenzi
  2006-10-16 20:29                   ` Jakub Narebski
  2006-10-16 22:53                 ` Junio C Hamano
  1 sibling, 1 reply; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 19:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Jim Meyering, Git Mailing List

On Mon, 16 Oct 2006, Linus Torvalds wrote:

> 
> Junio, I think this is worthy to go in before a 1.4.3 release. Possibly 
> even back-ported to earlier trees. Anything that causes an almost two 
> orders of magnitude slowdown (even if it's just on 64-bit architectures 
> and most people won't necessarily compile git that way) is worth fixing 
> pronto.

I ended up using this one:

#define XDL_HASHLONG(v, b) ((((unsigned long) (v) >> ((CHAR_BIT * sizeof(unsigned long)) - (b))) + \
                            (unsigned long) (v)) & ((1UL << (b)) - 1))

The GR_PRIME selection does not make me feel good, and the 'static inline' 
is puked-over by certain C compilers. It'd be probably fine to just use a 
simple function, though the above should work just fine.

real    0m0.665s
user    0m0.655s
sys     0m0.010s

(Opteron 252)



- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 19:44                 ` Davide Libenzi
@ 2006-10-16 20:29                   ` Jakub Narebski
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Narebski @ 2006-10-16 20:29 UTC (permalink / raw)
  To: git

Davide Libenzi wrote:

> I ended up using this one:
> 
> #define XDL_HASHLONG(v, b) ((((unsigned long) (v) >> ((CHAR_BIT *
sizeof(unsigned long)) - (b))) + \
>                             (unsigned long) (v)) & ((1UL << (b)) - 1))
> 
> The GR_PRIME selection does not make me feel good, and the 'static inline' 
> is puked-over by certain C compilers. It'd be probably fine to just use a 
> simple function, though the above should work just fine.
> 
> real    0m0.665s
> user    0m0.655s
> sys     0m0.010s
> 
> (Opteron 252)

Could you please do and post benchmarks for other solutions?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 18:51               ` Linus Torvalds
  2006-10-16 19:44                 ` Davide Libenzi
@ 2006-10-16 22:53                 ` Junio C Hamano
  2006-10-16 23:24                   ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2006-10-16 22:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Davide Libenzi, Jim Meyering

Linus Torvalds <torvalds@osdl.org> writes:

> Junio, I think this is worthy to go in before a 1.4.3 release. Possibly 
> even back-ported to earlier trees.

Thanks for resolving this quickly; I was (am still somewhat)
down sick today and missed all the excitement.

> Quite frankly, I prefer my previous patch more, it just avoids that whole 
> problem, and two shifts and adds (even with a conditional) are often 
> faster than a full 64-bit multiply.

I agree (although I am not sure about the "do it twice for
small" bit), and I think Davide agrees with you in his reply:

Davide Libenzi <davidel@xmailserver.org> writes:
> On Mon, 16 Oct 2006, Linus Torvalds wrote:
> ...
> I ended up using this one:
>
> #define XDL_HASHLONG(v, b) ((((unsigned long) (v) >> ((CHAR_BIT * sizeof(unsigned long)) - (b))) + \
>                              (unsigned long) (v)) & ((1UL << (b)) - 1))

so I am inclined to apply Davide's version, but I am going to
bed again now...

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 22:53                 ` Junio C Hamano
@ 2006-10-16 23:24                   ` Linus Torvalds
  2006-10-16 23:52                     ` Davide Libenzi
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2006-10-16 23:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Davide Libenzi, Jim Meyering



On Mon, 16 Oct 2006, Junio C Hamano wrote:
> 
> I agree (although I am not sure about the "do it twice for
> small" bit), and I think Davide agrees with you in his reply:

Sure. Davide's all-macro version is fine. I don't like re-using the same 
value twice even in a ALL-CAPS macro, so I'm used to inline functions, but 
all the uses of XDL_HASHLONG() are fine with multiple uses of the 
arguments.

Somebody should just double-check that all the parentheses ended up being 
right ;)

It might be easier to read if you write it as

	#define BITS_IN_LONG	(CHAR_BIT * sizeof(unsigned long))
	#define XDL_HIGHBITS(v,b) ((v) >> (BITS_IN_LONG - (b)))
	#define XDL_MASKBITS(b) ((1UL << (b)) - 1)
	#define XDL_HASHBITS(v,b) (((v) + XDL_HIGHBITS(v,b)) & XDL_MASKBITS(b))
	#define XDL_HASHLONG(v,b) XDL_HASHBITS( (unsigned long)(v) , b )

just to avoid one huge #define.

That said, it unnecessarily calculates "BITS_IN_LONG - (b)" to shift with, 
because it really shouldn't matter _which_ high bits you use for hashing, 
so you might as well just use the "next" b bits, and have

	#define XDL_ADDBITS(v,b)	((v) + ((v) >> (b)))
	#define XDL_MASKBITS(b)		((1UL << (b)) - 1)
	#define XDL_HASHLONG(v,b)	(XDL_ADDBITS((unsigned long)(v), b) & XDL_MASKBITS(b))

which generates better code at least on x86 (and x86-64), because the 
shift count stays the same for all shifts and can thus be kept in %ecx. 
For example, on x86-64, you get

	movq    %rdi, %rax		# copy 'val'
	movl    $1, %edx		# const 1: start generating (1 << b) - 1
	shrq    %cl, %rax		# val >> b
	salq    %cl, %rdx		# 1 << b
	leaq    (%rdi,%rax), %rax	# val + (val >> b)
	subq    $1, %rdx		# (1 << b) -1
	andq    %rdx, %rax		# final hash

which is short and sweet. And on ppc32 (or ppc64) you get

	li 9,1			# const 1: start generating (1 << b) - 1
	srw 0,3,4		# val >> b
	slw 9,9,4		# 1 << b
	add 0,0,3		# val + (val >> b)
	addi 9,9,-1		# (1 << b) - 1
	and 3,0,9		# final hash

in other words, apart from having two shifts (which you can't really 
avoid, although a multiply can do one of them) it's just a very efficient 
way to mix together (2*b) bits into a (b)-bit hash.

But taking the high bits from the "unsigned long" doesn't add _that_ much 
cost. I just suspect that it's a good way to continue to get different 
answers on 32-bit and 64-bit architectures. 

		Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: git-diff-tree inordinately (O(M*N)) slow on files with many changes
  2006-10-16 23:24                   ` Linus Torvalds
@ 2006-10-16 23:52                     ` Davide Libenzi
  0 siblings, 0 replies; 27+ messages in thread
From: Davide Libenzi @ 2006-10-16 23:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git, Jim Meyering

On Mon, 16 Oct 2006, Linus Torvalds wrote:

> That said, it unnecessarily calculates "BITS_IN_LONG - (b)" to shift with, 
> because it really shouldn't matter _which_ high bits you use for hashing, 
> so you might as well just use the "next" b bits, and have
> 
> 	#define XDL_ADDBITS(v,b)	((v) + ((v) >> (b)))
> 	#define XDL_MASKBITS(b)		((1UL << (b)) - 1)
> 	#define XDL_HASHLONG(v,b)	(XDL_ADDBITS((unsigned long)(v), b) & XDL_MASKBITS(b))

Ok, I'm fine with this. And my Opteron agrees too:

real    0m0.283s
user    0m0.267s
sys     0m0.016s



- Davide

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2006-10-16 23:52 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-16 14:12 git-diff-tree inordinately (O(M*N)) slow on files with many changes Jim Meyering
2006-10-16 15:47 ` Linus Torvalds
2006-10-16 16:12   ` Linus Torvalds
2006-10-16 16:33     ` Jim Meyering
2006-10-16 16:42       ` Davide Libenzi
2006-10-16 16:50         ` Jim Meyering
2006-10-16 16:54           ` Davide Libenzi
2006-10-16 16:57             ` Jim Meyering
2006-10-16 17:02               ` Davide Libenzi
2006-10-16 17:56           ` Linus Torvalds
2006-10-16 18:03             ` Linus Torvalds
2006-10-16 18:41               ` Davide Libenzi
2006-10-16 18:18             ` Davide Libenzi
2006-10-16 18:51               ` Linus Torvalds
2006-10-16 19:44                 ` Davide Libenzi
2006-10-16 20:29                   ` Jakub Narebski
2006-10-16 22:53                 ` Junio C Hamano
2006-10-16 23:24                   ` Linus Torvalds
2006-10-16 23:52                     ` Davide Libenzi
2006-10-16 18:24             ` Jim Meyering
2006-10-16 18:30               ` Davide Libenzi
2006-10-16 18:43                 ` Jim Meyering
2006-10-16 16:54       ` Linus Torvalds
2006-10-16 16:36     ` Davide Libenzi
2006-10-16 16:57       ` Linus Torvalds
2006-10-16 16:24   ` Davide Libenzi
2006-10-16 16:54     ` Jakub Narebski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).