git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2 for maint] "show --cc" fixes
@ 2009-07-22 21:48 Junio C Hamano
  2009-07-22 21:48 ` [PATCH 1/2] combine-diff.c: fix performance problem when folding common deleted lines Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: Junio C Hamano @ 2009-07-22 21:48 UTC (permalink / raw)
  To: git

The first one is the same "performance" patch from yesterday, only with
better commit log message.

The second is a bugfix.  "git show --cc" displays merge result incorrectly
when they touch the first line of the affected file (the bug is not about
the commit being wrong; it is about showing the commit incorrectly).  This
was found while writing a new test

Among the 3040 merges up to v1.6.4-rc1 in git.git history, only 643 of
them are interesting merges in "show --cc" sense, and there is only one
commit for which "git show" gives a wrong output due to this bug.

  3ed02de Merge branch 'maint' of git://repo.or.cz/git-gui into maint

The reason git.git history has more "cc-interesting" merges (compared to a
lot larger and more complex kernel history---see below) is an atrifact of
the use of subtree merges.  Most of these are actually not interesting at
all, and the above commit that is shown incorrectly is not interesting
either.  We probably should discourage the use of subtree merges and
switch to use submodules for gitk and git-gui ourselves, but we need to
wait until we can be sure that everybody's installed git is submodule
capable.

Similarly, among the 9487 merges in the Linux kernel history (today's tip
from Linus), only 920 of them are interesting.  There are a handful of
commits for which "git show" gives a wrong output due to this bug.

  f541ae3 Merge branch 'linus' into perfcounters/core-v2
  0ca0f16 Merge branches 'x86/apic', 'x86/asm', 'x86/cleanups', 'x86/debug',...
  9b6a517 Merge branch 'juju' of git://git.kernel.org/pub/scm/linux/kernel/...
  22a3e23 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial

Take Ingo's 0ca0f16 as an example.  The diff between its first parent and
the commit begins arch/x86/kernel/cpu/common.c like this:

    --- a/arch/x86/kernel/cpu/common.c
    +++ b/arch/x86/kernel/cpu/common.c
    @@ -1,52 +1,52 @@
    -#include <linux/init.h>
    -#include <linux/kernel.h>
    -#include <linux/sched.h>
    -#include <linux/string.h>
     #include <linux/bootmem.h>
    +#include <linux/linkage.h>

When shown in "git show", however, it begins like this (this is an 11-way
octopus):

    --- a/arch/x86/kernel/cpu/common.c
    +++ b/arch/x86/kernel/cpu/common.c
    @@@@@@@@@@@ -1,44 -1,44 -1,44 -1,44 -1,44 -1,44 -1,44 -1,4...
	      #include <linux/bootmem.h>
    --- ------#include <linux/init.h>
    --- ------#include <linux/kernel.h>
    --- ------#include <linux/sched.h>

which is clearly wrong.

In other words, the bug may be rare, but it is real.

An interesting tangent is this graph:

    838 x x
     26 x x x
     18 x x x x
      3 x x x x x
      7 x x x x x x
      5 x x x x x x x
      6 x x x x x x x x
      2 x x x x x x x x x
      1 x x x x x x x x x x
      3 x x x x x x x x x x x
      3 x x x x x x x x x x x x
      1 x x x x x x x x x x x x x
      2 x x x x x x x x x x x x x x
      1 x x x x x x x x x x x x x x x x x x
      1 x x x x x x x x x x x x x x x x x x x x
      1 x x x x x x x x x x x x x x x x x x x x x
      1 x x x x x x x x x x x x x x x x x x x x x x x x
      1 x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x

which is an output from this command in the kernel repository:

    git log --pretty=short |
    sed -ne '/^Merge: /{
	    s/^Merge: //
	    s/[0-9a-f][0-9a-f]*/x/g
	    p
    }' |
    sort |
    uniq -c

Ingo stole the octopus gold star from Len Brown with that 30-way octopus.

Junio C Hamano (2):
  combine-diff.c: fix performance problem when folding common deleted
    lines
  diff --cc: a lost line at the beginning of the file is shown
    incorrectly

 combine-diff.c           |   29 ++++++++--------
 t/t4038-diff-combined.sh |   84 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+), 15 deletions(-)
 create mode 100755 t/t4038-diff-combined.sh

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/2] combine-diff.c: fix performance problem when folding common deleted lines
  2009-07-22 21:48 [PATCH 0/2 for maint] "show --cc" fixes Junio C Hamano
@ 2009-07-22 21:48 ` Junio C Hamano
  2009-07-22 21:48   ` [PATCH 2/2] diff --cc: a lost line at the beginning of the file is shown incorrectly Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: Junio C Hamano @ 2009-07-22 21:48 UTC (permalink / raw)
  To: git

For a deleted line in a patch with the parent we are looking at, the
append_lost() function finds the same line among a run of lines that were
deleted from the same location by patches from parents we previously
checked.  This is so that patches with two parents

    @@ -1,4 +1,3 @@    @@ -1,4 +1,3 @@
     one                   one
    -two                  -two
     three                 three
    -quatro               -fyra
    +four                 +four

can be coalesced into this sequence, reusing one line that describes the
removal of "two" for both parents.

   @@@ -1,4 -1,4 +1,3 @@@
     one
   --two
     three
   - quatro
    -frya
   ++four

While reading the second patch (that removes "two" and then "fyra"), after
finding where removal of the "two" matches, we need to find existing
removal of "fyra" (if exists) in the removal list, but the match has to
happen after all the existing matches (in this case "two").  The code used
a naïve O(n^2) algorithm to compute this by scanning the whole removal
list over and over again.

This patch remembers where the next scan should be started in the existing
removal list to avoid this.

Noticed by Linus Torvalds.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 combine-diff.c |   13 +++++--------
 1 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 60d0367..b82f46c 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -80,6 +80,7 @@ struct lline {
 /* Lines surviving in the merge result */
 struct sline {
 	struct lline *lost_head, **lost_tail;
+	struct lline *next_lost;
 	char *bol;
 	int len;
 	/* bit 0 up to (N-1) are on if the parent has this line (i.e.
@@ -121,18 +122,12 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
 
 	/* Check to see if we can squash things */
 	if (sline->lost_head) {
-		struct lline *last_one = NULL;
-		/* We cannot squash it with earlier one */
-		for (lline = sline->lost_head;
-		     lline;
-		     lline = lline->next)
-			if (lline->parent_map & this_mask)
-				last_one = lline;
-		lline = last_one ? last_one->next : sline->lost_head;
+		lline = sline->next_lost;
 		while (lline) {
 			if (lline->len == len &&
 			    !memcmp(lline->line, line, len)) {
 				lline->parent_map |= this_mask;
+				sline->next_lost = lline->next;
 				return;
 			}
 			lline = lline->next;
@@ -147,6 +142,7 @@ static void append_lost(struct sline *sline, int n, const char *line, int len)
 	lline->line[len] = 0;
 	*sline->lost_tail = lline;
 	sline->lost_tail = &lline->next;
+	sline->next_lost = NULL;
 }
 
 struct combine_diff_state {
@@ -187,6 +183,7 @@ static void consume_line(void *state_, char *line, unsigned long len)
 				xcalloc(state->num_parent,
 					sizeof(unsigned long));
 		state->sline[state->nb-1].p_lno[state->n] = state->ob;
+		state->lost_bucket->next_lost = state->lost_bucket->lost_head;
 		return;
 	}
 	if (!state->lost_bucket)
-- 
1.6.4.rc1.10.g2a679

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/2] diff --cc: a lost line at the beginning of the file is shown incorrectly
  2009-07-22 21:48 ` [PATCH 1/2] combine-diff.c: fix performance problem when folding common deleted lines Junio C Hamano
@ 2009-07-22 21:48   ` Junio C Hamano
  0 siblings, 0 replies; 3+ messages in thread
From: Junio C Hamano @ 2009-07-22 21:48 UTC (permalink / raw)
  To: git

When combine-diff inspected the diff from one parent to the merge result,
it misinterpreted a header in the form @@ -l,k +0,0 @@.

This hunk header means that K lines were removed from the beginning of the
file, so the lost lines must be queued to the sline that represents the
first line of the merge result, but we incremented our pointer incorrectly
and ended up queuing it to the second line, which in turn made the lossage
appear _after_ the first line.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 combine-diff.c           |   16 +++++----
 t/t4038-diff-combined.sh |   84 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+), 7 deletions(-)
 create mode 100755 t/t4038-diff-combined.sh

diff --git a/combine-diff.c b/combine-diff.c
index b82f46c..38f4e2c 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -164,20 +164,22 @@ static void consume_line(void *state_, char *line, unsigned long len)
 				      &state->nb, &state->nn))
 			return;
 		state->lno = state->nb;
-		if (!state->nb)
-			/* @@ -1,2 +0,0 @@ to remove the
-			 * first two lines...
-			 */
-			state->nb = 1;
-		if (state->nn == 0)
+		if (state->nn == 0) {
 			/* @@ -X,Y +N,0 @@ removed Y lines
 			 * that would have come *after* line N
 			 * in the result.  Our lost buckets hang
 			 * to the line after the removed lines,
+			 *
+			 * Note that this is correct even when N == 0,
+			 * in which case the hunk removes the first
+			 * line in the file.
 			 */
 			state->lost_bucket = &state->sline[state->nb];
-		else
+			if (!state->nb)
+				state->nb = 1;
+		} else {
 			state->lost_bucket = &state->sline[state->nb-1];
+		}
 		if (!state->sline[state->nb-1].p_lno)
 			state->sline[state->nb-1].p_lno =
 				xcalloc(state->num_parent,
diff --git a/t/t4038-diff-combined.sh b/t/t4038-diff-combined.sh
new file mode 100755
index 0000000..2cf7e01
--- /dev/null
+++ b/t/t4038-diff-combined.sh
@@ -0,0 +1,84 @@
+#!/bin/sh
+
+test_description='combined diff'
+
+. ./test-lib.sh
+
+setup_helper () {
+	one=$1 branch=$2 side=$3 &&
+
+	git branch $side $branch &&
+	for l in $one two three fyra
+	do
+		echo $l
+	done >file &&
+	git add file &&
+	test_tick &&
+	git commit -m $branch &&
+	git checkout $side &&
+	for l in $one two three quatro
+	do
+		echo $l
+	done >file &&
+	git add file &&
+	test_tick &&
+	git commit -m $side &&
+	test_must_fail git merge $branch &&
+	for l in $one three four
+	do
+		echo $l
+	done >file &&
+	git add file &&
+	test_tick &&
+	git commit -m "merge $branch into $side"
+}
+
+verify_helper () {
+	it=$1 &&
+
+	# Ignore lines that were removed only from the other parent
+	sed -e '
+		1,/^@@@/d
+		/^ -/d
+		s/^\(.\)./\1/
+	' "$it" >"$it.actual.1" &&
+	sed -e '
+		1,/^@@@/d
+		/^- /d
+		s/^.\(.\)/\1/
+	' "$it" >"$it.actual.2" &&
+
+	git diff "$it^" "$it" -- | sed -e '1,/^@@/d' >"$it.expect.1" &&
+	test_cmp "$it.expect.1" "$it.actual.1" &&
+
+	git diff "$it^2" "$it" -- | sed -e '1,/^@@/d' >"$it.expect.2" &&
+	test_cmp "$it.expect.2" "$it.actual.2"
+}
+
+test_expect_success setup '
+	>file &&
+	git add file &&
+	test_tick &&
+	git commit -m initial &&
+
+	git branch withone &&
+	git branch sansone &&
+
+	git checkout withone &&
+	setup_helper one withone sidewithone &&
+
+	git checkout sansone &&
+	setup_helper "" sansone sidesansone
+'
+
+test_expect_success 'check combined output (1)' '
+	git show sidewithone -- >sidewithone &&
+	verify_helper sidewithone
+'
+
+test_expect_failure 'check combined output (2)' '
+	git show sidesansone -- >sidesansone &&
+	verify_helper sidesansone
+'
+
+test_done
-- 
1.6.4.rc1.10.g2a679

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2009-07-22 21:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-22 21:48 [PATCH 0/2 for maint] "show --cc" fixes Junio C Hamano
2009-07-22 21:48 ` [PATCH 1/2] combine-diff.c: fix performance problem when folding common deleted lines Junio C Hamano
2009-07-22 21:48   ` [PATCH 2/2] diff --cc: a lost line at the beginning of the file is shown incorrectly Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).