* three-way diff performance problem
@ 2009-07-21 18:10 Linus Torvalds
2009-07-21 18:16 ` Linus Torvalds
2009-07-21 19:21 ` Junio C Hamano
0 siblings, 2 replies; 8+ messages in thread
From: Linus Torvalds @ 2009-07-21 18:10 UTC (permalink / raw)
To: Git Mailing List, Junio C Hamano
Ok, so I decided to abuse git today, because I actually had a problem
where git could potentially solve it. Sadly, it's now 55 minutes later,
and my single git command hasn't completed yet :(
What did I do?
We have a really odd bug-report for the kernel where switching the
compiler from using '-fwrapv' to using '-fno-strict-overflow' causes a
non-working kernel.
So the exact same kernel sources work fine with either '-fwrapv' or,
indeed, with neither option, but with '-fno-strict-overflow' it just hangs
at boot.
Now, the problem is that I can't actually see the issue, so I just have
three kernel binaries, two of which work, and one which does not. I can
disassemble them, and I did, and the end result is three files with
roughly 1.14 million lines.
The diffs between them are in the 40- to 90-thousand line area, and it's
really hard to read diffs of assembly code and try to figure out what the
problem with the generated code is. Yeah, I'm pretty good at reading
assembler, but I'm not _that_ good. Just scanning 40 thousand lines takes
a while, looking at it _closely_ is just incredibly hard and isn't going
to ever really work out.
But there's a saving grace: I have two working kernels. And the
differences are often silly things like different register allocation, and
they are _different_. Many functions end up being the same between one
working kernel and the failing one, while being different in another
working kernel.
So what I want is the differences that are different from _both_ working
kernels, since being the same as _one_ of them means that it's not a bug.
Now, being a git person, what does that say? Right: just check in the
working kernels as two branches under the same filename, then merge the
branches, and force the merge result to be the non-working kernel, and do
a three-way combined context diff! So I did exactly that.
An hour later, it's still thinking.
So I used our nice new 'perf' tool from the kernel on a run that I killed
after five minutes, and lookie here:
#
# (17566764 samples)
#
# Overhead Command Shared Object Symbol
# ........ ................ ......................... ......
#
96.26% git /home/torvalds/bin/git [.] consume_line
...
it says we're spending pretty much all our time in a single function.
In fact, on the instruction-level profile, it looks like it's all spent on
four instructions:
3.51 : 45aec0: 4c 85 72 10 test %r14,0x10(%rdx)
86.27 : 45aec4: 48 0f 45 ca cmovne %rdx,%rcx
: if (sline->lost_head) {
: struct lline *last_one = NULL;
: /* We cannot squash it with earlier one */
: for (lline = sline->lost_head;
: lline;
: lline = lline->next)
6.69 : 45aec8: 48 8b 12 mov (%rdx),%rdx
:
: /* Check to see if we can squash things */
: if (sline->lost_head) {
: struct lline *last_one = NULL;
: /* We cannot squash it with earlier one */
: for (lline = sline->lost_head;
3.51 : 45aecb: 48 85 d2 test %rdx,%rdx
0.01 : 45aece: 75 f0 jne 45aec0 <consume_line+0xc0>
(ok, five, I included the 0.01% of the branch instruction that finishes
the loop - the way Nehalem works, it will merge those test+jne
instructions into one macro-instruction internally, which is why the
branch that is obviously part of the loop doesn't get a lot of hits)
Just FYI, those four instructions come from inlining 'append_lost()'. It has
compiled that first for-loop into some very good assembly language, but
the problem is that this all looks very much quadratic in the number of
lost lines. The quality of the compiler thus sadly in no way helps make up
for a really, really expensive algorithm.
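To see why that pattern is quadratic, here is a minimal self-contained sketch (hypothetical code, not git's actual structures): each append first walks the whole list, the same shape as the profiled loop, so n appends visit n*(n-1)/2 nodes in total.

```c
#include <stdlib.h>

/* Hypothetical miniature of the lost-line list -- not the real
 * combine-diff structs. Each append scans the entire list first,
 * mirroring the loop the profile blames. */
struct node {
	struct node *next;
};

/* Append 'item' after scanning the whole list from the head; return how
 * many existing nodes the scan visited. */
static long append_with_scan(struct node **head, struct node *item)
{
	long visited = 0;
	struct node **tail = head;
	while (*tail) {
		visited++;
		tail = &(*tail)->next;
	}
	item->next = NULL;
	*tail = item;
	return visited;
}

/* Total nodes visited while appending n items one by one:
 * 0 + 1 + ... + (n-1) = n*(n-1)/2, i.e. quadratic in n. */
long total_scan_work(int n)
{
	struct node *nodes = calloc(n, sizeof(*nodes));
	struct node *head = NULL;
	long total = 0;
	for (int i = 0; i < n; i++)
		total += append_with_scan(&head, &nodes[i]);
	free(nodes);
	return total;
}
```

With a million lost lines, as in this dump, that sum is on the order of 5*10^11 visits, which matches the hour-plus runtime.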
Any ideas?
For any git people who want to look at the horror that caused this, feel
free to download the (fairly small) git repository from
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/git-merge-test.git
and try to do a
git show --color fno-strict-overflow
just for fun. Do "gitk --all" to see the simple nature of the repository,
but if you do, you'll need to kill the "git diff-tree -r -p --textconv .."
process that gitk will start (and which will take forever).
The code in question is all Junio's, and is over three years old. So it's
obviously been working fine. And the load is obviously not really
appropriate, but at the same time I think it's an interesting thing to
try. And it doesn't seem to be about memory use - it's now been running for
85 minutes (ok, this email took a long time to write, and I uploaded the thing
etc. in the meantime), and it's holding steady at roughly 175MB RSS.
Linus
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: three-way diff performance problem
2009-07-21 18:10 three-way diff performance problem Linus Torvalds
@ 2009-07-21 18:16 ` Linus Torvalds
2009-07-21 19:21 ` Junio C Hamano
1 sibling, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2009-07-21 18:16 UTC (permalink / raw)
To: Git Mailing List, Junio C Hamano
On Tue, 21 Jul 2009, Linus Torvalds wrote:
>
> Now, being a git person, what does that say? Right: just check in the
> working kernels as two branches under the same filename, then merge the
> branches, and force the merge result to be the non-working kernel, and do
> a three-way combined context diff! So I did exactly that.
Btw, I did it wrong, which is why it takes too long. Instead of making the
result be the right dump-file, I made it be a bad one, so the diffs are
much bigger than they should be. Which then makes the combined-merge diff
take longer than necessary (with the O(n^2) thing).
Doing it _correctly_ actually made the three-way diff only take 45
seconds, but sadly still left me with 22 thousand lines.
Linus
* Re: three-way diff performance problem
2009-07-21 18:10 three-way diff performance problem Linus Torvalds
2009-07-21 18:16 ` Linus Torvalds
@ 2009-07-21 19:21 ` Junio C Hamano
2009-07-21 19:31 ` Linus Torvalds
2009-07-21 19:47 ` Junio C Hamano
1 sibling, 2 replies; 8+ messages in thread
From: Junio C Hamano @ 2009-07-21 19:21 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano
Linus Torvalds <torvalds@linux-foundation.org> writes:
> In fact, on the instruction-level profile, it looks like it's all spent on
> four instructions:
>
> 3.51 : 45aec0: 4c 85 72 10 test %r14,0x10(%rdx)
> 86.27 : 45aec4: 48 0f 45 ca cmovne %rdx,%rcx
> : if (sline->lost_head) {
> : struct lline *last_one = NULL;
> : /* We cannot squash it with earlier one */
> : for (lline = sline->lost_head;
> : lline;
> : lline = lline->next)
> 6.69 : 45aec8: 48 8b 12 mov (%rdx),%rdx
> :
> : /* Check to see if we can squash things */
> : if (sline->lost_head) {
> : struct lline *last_one = NULL;
> : /* We cannot squash it with earlier one */
> : for (lline = sline->lost_head;
> 3.51 : 45aecb: 48 85 d2 test %rdx,%rdx
> 0.01 : 45aece: 75 f0 jne 45aec0 <consume_line+0xc0>
>
> (ok, five, I included the 0.01% of the branch instruction that finishes
> the loop - the way Nehalem works, it will merge those test+jne
> instructions into one macro-instruction internally, which is why the
> branch that is obviously part of the loop doesn't get a lot of hits)
>
> Just FYI, those four instructions come from inlining 'append_lost()'. It's
> compiled that first for-loop into some very good assembly language, but
> the problem is that this all looks very much quadratic in number of
> lost_lines. The quality of the compiler thus sadly in no way helps make up
> for a really really expensive algorithm.
>
> Any ideas?
What's that cmovne?
The function append_lost() is the meat of combining. When you have seen
a hunk like this:
@@ -l,k +m,n @@
-lost line
-another lost line
common line
+added line
We queue the lost lines in front of a surviving line (that is, the sline that
points at "common line"). "lost line" and "another lost line" are stored
in lline (lost line) and they are queued to sline->lost_head.
When another diff that corresponds to this section looks like this:
@@ -l,k +m1,n1 @@
-lost line
common line
+added line
the function looks at "lost line" and tries to find an identical one in the
list of llines. We find one, and mark the fact that the lline element is
shared between the two parents.
That allows the output routine to say
@@@ -l,k +m,n +m1,n1 @@@
--lost line
- another lost line
common line
++added line
If that cmovne that eats 86.27% of the run time has something to do with
the memcmp() that is repeatedly done, and if the length of lost_head list
is very large, I think a possible optimization is to add a line hash value
in struct lline.
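That hash idea can be sketched like this (a hypothetical illustration, not the real struct lline; FNV-1a is just one plausible choice of cheap hash): a hash precomputed when the line is queued rejects almost every non-matching candidate with a single integer compare, so memcmp() only runs on likely matches.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hashed lost line -- the real struct lline has no hash
 * field; this sketches the proposed addition. */
struct lline_hashed {
	const char *line;
	size_t len;
	unsigned long hash;	/* computed once, when the line is queued */
};

/* FNV-1a over the line bytes; any cheap, stable hash would do. */
unsigned long line_hash(const char *line, size_t len)
{
	unsigned long h = 2166136261ul;
	for (size_t i = 0; i < len; i++) {
		h ^= (unsigned char)line[i];
		h *= 16777619ul;
	}
	return h;
}

/* Fast reject on hash and length before the byte-wise comparison. */
int lline_equal(const struct lline_hashed *l,
		const char *line, size_t len, unsigned long hash)
{
	if (l->hash != hash || l->len != len)
		return 0;
	return !memcmp(l->line, line, len);
}
```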
After feeding the first diff (forget the one with +m1,n1 example above),
if the second diff looks like this:
@@ -l,k +m2,n2 @@
-another lost line
-lost line
common line
+added line
we match "-another lost line" with the second element in lost_head list,
and when processing the next line "-lost line", we try to avoid matching
it with the first one in the list, by stupidly scanning from the front of
the list. I think we should add a member to struct sline to remember the
last element in lost_head list for the current parent to skip that scan.
Perhaps that scan is what is costing us, not the memcmp()?
* Re: three-way diff performance problem
2009-07-21 19:21 ` Junio C Hamano
@ 2009-07-21 19:31 ` Linus Torvalds
2009-07-21 19:47 ` Junio C Hamano
1 sibling, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2009-07-21 19:31 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
On Tue, 21 Jul 2009, Junio C Hamano wrote:
>
> What's that cmovne?
It's "conditional move if not equal".
So the source code is
/* We cannot squash it with earlier one */
for (lline = sline->lost_head;
lline;
lline = lline->next)
if (lline->parent_map & this_mask)
last_one = lline;
and the compiler has generated this:
loop:
test %r14,0x10(%rdx)
cmovne %rdx,%rcx
mov (%rdx),%rdx
test %rdx,%rdx
jne loop
from it. In the above, '%r14' contains this_mask, and '%rcx' contains
'last_one' and '%rdx' contains 'lline'.
So:
- test %r14,0x10(%rdx)
test "this_mask & lline->parent_map"
- cmovne %rdx,%rcx
"if the test above was non-zero, then last_one = lline"
- mov (%rdx),%rdx
"lline = lline->next"
- test %rdx,%rdx
"is lline NULL?"
- jne loop
"if not, loop again."
> The function append_lost() is the meat of combining. When you have seen
> a hunk like this:
>
> @@ -l,k +m,n @@
> -lost line
> -another lost line
> common line
> +added line
>
> We queue the lost lines in front of a surviving line (that is sline that
> points at "common line"). "lost line" and "another lost line" are stored
> in lline (lost line) and they are queued to sline->lost_head.
Right. And "sline->lost_head" is going to have over a _million_ entries.
So each time you add one, we'll traverse that loop a million times. For
each new entry. End result: that loop gets executed on the order of a
million million times. It doesn't help that the compiler made it be very
efficient code ;(
Linus
* Re: three-way diff performance problem
2009-07-21 19:21 ` Junio C Hamano
2009-07-21 19:31 ` Linus Torvalds
@ 2009-07-21 19:47 ` Junio C Hamano
2009-07-21 20:34 ` Linus Torvalds
1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2009-07-21 19:47 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
Junio C Hamano <gitster@pobox.com> writes:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> In fact, on the instruction-level profile, it looks like it's all spent on
>> four instructions:
>>
>> 3.51 : 45aec0: 4c 85 72 10 test %r14,0x10(%rdx)
>> 86.27 : 45aec4: 48 0f 45 ca cmovne %rdx,%rcx
>> : if (sline->lost_head) {
>> : struct lline *last_one = NULL;
>> : /* We cannot squash it with earlier one */
>> : for (lline = sline->lost_head;
>> : lline;
>> : lline = lline->next)
>> 6.69 : 45aec8: 48 8b 12 mov (%rdx),%rdx
>> :
>> : /* Check to see if we can squash things */
>> : if (sline->lost_head) {
>> : struct lline *last_one = NULL;
>> : /* We cannot squash it with earlier one */
>> : for (lline = sline->lost_head;
>> 3.51 : 45aecb: 48 85 d2 test %rdx,%rdx
>> 0.01 : 45aece: 75 f0 jne 45aec0 <consume_line+0xc0>
>> ...
> After feeding the first diff (forget the one with +m1,n1 example above),
> if the second diff looks like this:
>
> @@ -l,k +m2,n2 @@
> -another lost line
> -lost line
> common line
> +added line
>
> we match "-another lost line" with the second element in lost_head list,
> and when processing the next line "-lost line", we try to avoid matching
> it with the first one in the list, by stupidly scanning from the front of
> the list. I think we should add a member to struct sline to remember the
> last element in lost_head list for the current parent to skip that scan.
> Perhaps that scan is the one that is costing, not memcmp()?
Here is a patch to do that. I haven't tested it yet though.
combine-diff.c | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/combine-diff.c b/combine-diff.c
index bbf74fc..8194104 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -80,6 +80,7 @@ struct lline {
/* Lines surviving in the merge result */
struct sline {
struct lline *lost_head, **lost_tail;
+ struct lline *lost_last_for_current_parent;
char *bol;
int len;
/* bit 0 up to (N-1) are on if the parent has this line (i.e.
@@ -121,18 +122,15 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
/* Check to see if we can squash things */
if (sline->lost_head) {
- struct lline *last_one = NULL;
- /* We cannot squash it with earlier one */
- for (lline = sline->lost_head;
- lline;
- lline = lline->next)
- if (lline->parent_map & this_mask)
- last_one = lline;
- lline = last_one ? last_one->next : sline->lost_head;
+ if (sline->lost_last_for_current_parent)
+ lline = sline->lost_last_for_current_parent;
+ else
+ lline = sline->lost_head;
while (lline) {
if (lline->len == len &&
!memcmp(lline->line, line, len)) {
lline->parent_map |= this_mask;
+ sline->lost_last_for_current_parent = lline;
return;
}
lline = lline->next;
@@ -147,6 +145,7 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
lline->line[len] = 0;
*sline->lost_tail = lline;
sline->lost_tail = &lline->next;
+ sline->lost_last_for_current_parent = lline;
}
struct combine_diff_state {
@@ -187,6 +186,7 @@ static void consume_line(void *state_, char *line, unsigned long len)
xcalloc(state->num_parent,
sizeof(unsigned long));
state->sline[state->nb-1].p_lno[state->n] = state->ob;
+ state->lost_bucket->lost_last_for_current_parent = NULL;
return;
}
if (!state->lost_bucket)
* Re: three-way diff performance problem
2009-07-21 19:47 ` Junio C Hamano
@ 2009-07-21 20:34 ` Linus Torvalds
2009-07-21 20:46 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2009-07-21 20:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
On Tue, 21 Jul 2009, Junio C Hamano wrote:
>
> Here is a patch to do that. I haven't tested it yet though.
Well, it seems to work.
It turned the 85+ minute thing (which I eventually just killed) into
something that took 1.4 _seconds_. Of course, I didn't check that the end
result was identical, since I never had the patience to wait for the
original case.
Now sadly, my fixed test (which took 45 seconds before) didn't much
improve (it doesn't have nearly as many removed lines, since I checked in
the _right_ file that is a much closer version to the parent files).
But what is intriguing is that it gets different results. So I suspect
this one actually changed behavior in some way:
[torvalds@nehalem git-merge]$ time git show --color HEAD | wc
22671 63302 672234
real 0m44.024s
user 0m43.879s
sys 0m0.148s
[torvalds@nehalem git-merge]$ time ~/git/git show --color HEAD | wc
22596 63122 671076
real 0m43.553s
user 0m43.435s
sys 0m0.128s
and notice how the git version with your change gives different line
numbers.
Now, this is a diff, and different answers are possibly _both_ valid, so
who knows. But I suspect that you _intended_ for the patch to be a
semantic no-op, and it doesn't seem to be.
Linus
* Re: three-way diff performance problem
2009-07-21 20:34 ` Linus Torvalds
@ 2009-07-21 20:46 ` Junio C Hamano
2009-07-21 21:01 ` Linus Torvalds
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2009-07-21 20:46 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Tue, 21 Jul 2009, Junio C Hamano wrote:
>>
>> Here is a patch to do that. I haven't tested it yet though.
>
> Well, it seems to work.
>
> It turned the 85+ minute thing (which I eventually just killed) into
> something that took 1.4 _seconds_. Of course, I didn't check that the end
> result was identical, since I never had the patience to wait for the
> original case.
>
> Now sadly, my fixed test (which took 45 seconds before) didn't much
> improve (it doesn't have nearly as many removed lines, since I checked in
> the _right_ file that is a much closer version to the parent files).
>
> But what is intriguing is that it it gets different results. So I suspect
> this one actually changed behavior some way:
>
> [torvalds@nehalem git-merge]$ time git show --color HEAD | wc
> 22671 63302 672234
>
> real 0m44.024s
> user 0m43.879s
> sys 0m0.148s
>
> [torvalds@nehalem git-merge]$ time ~/git/git show --color HEAD | wc
> 22596 63122 671076
>
> real 0m43.553s
> user 0m43.435s
> sys 0m0.128s
>
> and notice how the git version with your change gives different line
> numbers.
>
> Now, this is a diff, and different answers are possibly _both_ valid, so
> who knows. But I suspect that you _intended_ for the patch to be a
> semantic no-op, and it doesn't seem to be.
I know why. The conversion was wrong. The original found the last_one
that was from the parent we are looking at, and when there was one, it
started the scan from the one _after that_. Without that, if the lost_head
list had an entry
@@ -l,k +m,n @@
-one
two
three
and the diff with the parent we are currently looking at duplicates the
removal:
@@ -l,k +m1,n1 @@
-one
-one
two
three
we will end up losing the second removal, which would be what is happening
with the patch you tried.
I actually was scratching my head wondering why it wasn't happening in the
original code after I sent that faulty patch.
Here is another attempt.
combine-diff.c | 13 +++++--------
1 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/combine-diff.c b/combine-diff.c
index bbf74fc..2438490 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -80,6 +80,7 @@ struct lline {
/* Lines surviving in the merge result */
struct sline {
struct lline *lost_head, **lost_tail;
+ struct lline *next_lost;
char *bol;
int len;
/* bit 0 up to (N-1) are on if the parent has this line (i.e.
@@ -121,18 +122,12 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
/* Check to see if we can squash things */
if (sline->lost_head) {
- struct lline *last_one = NULL;
- /* We cannot squash it with earlier one */
- for (lline = sline->lost_head;
- lline;
- lline = lline->next)
- if (lline->parent_map & this_mask)
- last_one = lline;
- lline = last_one ? last_one->next : sline->lost_head;
+ lline = sline->next_lost;
while (lline) {
if (lline->len == len &&
!memcmp(lline->line, line, len)) {
lline->parent_map |= this_mask;
+ sline->next_lost = lline->next;
return;
}
lline = lline->next;
@@ -147,6 +142,7 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
lline->line[len] = 0;
*sline->lost_tail = lline;
sline->lost_tail = &lline->next;
+ sline->next_lost = NULL;
}
struct combine_diff_state {
@@ -187,6 +183,7 @@ static void consume_line(void *state_, char *line, unsigned long len)
xcalloc(state->num_parent,
sizeof(unsigned long));
state->sline[state->nb-1].p_lno[state->n] = state->ob;
+ state->lost_bucket->next_lost = state->lost_bucket->lost_head;
return;
}
if (!state->lost_bucket)
* Re: three-way diff performance problem
2009-07-21 20:46 ` Junio C Hamano
@ 2009-07-21 21:01 ` Linus Torvalds
0 siblings, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2009-07-21 21:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
On Tue, 21 Jul 2009, Junio C Hamano wrote:
>
> I know why. The conversion was wrong. The original found the last_one
> that was from the parent we are looking at, and when there is such, it
> started the scan from the one _after that_. Otherwise, if lost_head list
> had an entry
>
> @@ -l,k +m,n @@
> -one
> two
> three
>
> and the diff with the parent we are currently looking at duplicates
> removal:
>
> @@ -l,k +m1,n1 @@
> -one
> -one
> two
> three
>
> we will end up losing the second removal, which would be what is happening
> with the patch you tried.
Ok, that matches what I saw. Doing a 'diff' between the outputs showed
missing lines from the result with your first patch.
> I actually was scratching my head wondering why it wasn't happening in the
> original code after I sent that faulty patch.
>
> Here is another attempt.
Yup. This attempt now generates output that matches the original one (in
44s), while at the same time still handling the previously very expensive
case quickly (1.4s).
So it looks good now from my testing - I'm not going to say anything else
about the patch, since I'm not all that familiar with the code itself.
Btw, the 'perf report' on the fixed build now says
89.67% git /home/torvalds/git/git [.] xdl_prepare_env
6.42% git /home/torvalds/git/git [.] xdl_recs_cmp
0.92% git [kernel] [k] hpet_next_event
0.52% git /home/torvalds/git/git [.] xdl_prepare_ctx
0.44% git /lib64/libz.so.1.2.3 [.] inflate_fast
0.29% git /home/torvalds/git/git [.] xdl_hash_record
0.27% git /lib64/libc-2.10.1.so [.] __GI_memcmp
0.26% git /home/torvalds/git/git [.] show_patch_diff
0.14% git [kernel] [k] _spin_lock
0.09% git [kernel] [k] clear_page_c
0.09% git /lib64/libz.so.1.2.3 [.] adler32
so now the expensive part is xdl_prepare_env. And that does actually make
sense: it's the code that generates all the hash records for the lines to
be compared, so there doesn't seem to be anything hugely wrong with it. If
there are a lot of identical lines, I'd expect that to be fairly
expensive.
Linus