Git development
* Re: [PATCH] diff-raw format update take #2.
From: Thomas Glanzmann @ 2005-05-24  1:44 UTC (permalink / raw)
  To: GIT
In-Reply-To: <20050524013947.ADFEE528F53@taniwha.stupidest.org>

Hello,

* Chris Wedgwood <cw@f00f.org> [050524 03:40]:
> This is an automatically generated response.  You should only receive
> one such response (even if you send multiple messages).

> I'm currently fairly slow with email at times so please be patient.  If
> it's urgent, it's probably best you call my cellphone and leave
> voicemail (if you don't have the number, then chances are I won't
> consider your email urgent anyhow).

> I do check my email, and I do expect to reply to your message given a
> little bit of time.

> Thanks for your patience.

that is a bad joke, isn't it? I need a killfile for email or I start
hurting people. First talking bullshit and then autoresponding just
shit to say *something*.

	Thomas


* Re: [PATCH] diff-raw format update take #2.
From: Thomas Glanzmann @ 2005-05-24  1:39 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: David Lang, Linus Torvalds, Junio C Hamano, git
In-Reply-To: <211e617258d9d993810f3c88bace255e.IBX@taniwha.stupidest.org>

Hello,
for me the diff/patch works for *spaces*, but screws up on tabs in
filenames. Why? Because the field separator for filenames is *tab*. So
patch believes that everything before the first tab is the filename.

	Thomas
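
A minimal sketch in C of the behaviour described above, assuming (as
reported) that a reader of a "--- " or "+++ " header takes everything up
to the first tab as the file name. The name header_filename is made up
for illustration; this is not code from patch or from git.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the file name from a "--- " or "+++ " header line into out.
 * The name is assumed to end at the first tab (or at the end of the
 * line if there is no tab).  Returns the name length, or -1 if the
 * line is not such a header or out is too small.
 */
int header_filename(const char *line, char *out, size_t outsz)
{
	const char *name, *end;
	size_t len;

	if (strncmp(line, "--- ", 4) && strncmp(line, "+++ ", 4))
		return -1;
	name = line + 4;
	end = strchr(name, '\t');
	if (!end)
		end = name + strcspn(name, "\n");
	len = (size_t)(end - name);
	if (outsz <= len)
		return -1;
	memcpy(out, name, len);
	out[len] = 0;
	return (int)len;
}
```

Under this rule "file one" comes through intact, while a name that
contains a tab is cut off at that tab.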


* Re: [PATCH] diff-raw format update take #2.
From: Chris Wedgwood @ 2005-05-24  1:33 UTC (permalink / raw)
  To: David Lang; +Cc: Linus Torvalds, Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.62.0505231827430.4200@qynat.qvtvafvgr.pbz>

On Mon, May 23, 2005 at 06:29:07PM -0700, David Lang wrote:

> hmm, personally I would have expected it to do shell escaping of
> the name

diff doesn't do this, so i'm not sure it's useful:

cw@taniwha:/tmp$ echo "we have ones here" > "file one"
cw@taniwha:/tmp$ echo "we have two here" > "file two"
cw@taniwha:/tmp$ diff -u file*
--- file one    2005-05-23 18:32:44.334535901 -0700
+++ file two    2005-05-23 18:32:49.047851694 -0700
@@ -1 +1 @@
-we have ones here
+we have two here


* Re: [PATCH] diff-raw format update take #2.
From: David Lang @ 2005-05-24  1:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Chris Wedgwood, git
In-Reply-To: <Pine.LNX.4.58.0505231758350.2307@ppc970.osdl.org>

On Mon, 23 May 2005, Linus Torvalds wrote:

> On Mon, 23 May 2005, Junio C Hamano wrote:
>>
>> Embedded spaces in path is _always_ safe.
>
> For raw-diff yes, but since you'd normally end up using that name in the
> diff, it won't be safe any more.
>
> Imagine a name like "this is a file", and think about how the diff ends up
> looking:
>
> 	diff --git a/this is a file b/this is a file
>
> and realize that that can't be parsed sanely by anybody who uses the diff.

hmm, personally I would have expected it to do shell escaping of the name

diff --git a/this\ is\ a\ file b/this\ is\ a\ file

given that diff is trying to record how it was called.

David Lang

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare
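
What such escaping could look like, as a rough sketch in C: spaces, tabs
and backslashes get a backslash in front. escape_path is a hypothetical
helper, not something diff or git provides, and a real shell-quoting
routine would have to cover more characters than these three.

```c
#include <stddef.h>

/*
 * Write path into out with a backslash in front of every space, tab
 * and backslash.  Returns the escaped length, or -1 if out is too
 * small to hold the result and its terminating NUL.
 */
int escape_path(const char *path, char *out, size_t outsz)
{
	size_t n = 0;

	for (; *path; path++) {
		int special = (*path == ' ' || *path == '\t' || *path == '\\');

		/* room for the optional backslash, the char, and the NUL */
		if (outsz < n + special + 2)
			return -1;
		if (special)
			out[n++] = '\\';
		out[n++] = *path;
	}
	if (outsz < n + 1)
		return -1;
	out[n] = 0;
	return (int)n;
}
```

For the name "this is a file" this yields the escaped form used in the
header line above.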


* [PATCH] Update git-diff-cache documentation.
From: Junio C Hamano @ 2005-05-24  1:20 UTC (permalink / raw)
  To: torvalds; +Cc: git

The recent diff updates gave diff-cache the same ability to
filter paths, which was not properly documented.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Documentation/git-diff-cache.txt |   10 ++++++----
diff-cache.c                     |    2 +-
2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-diff-cache.txt b/Documentation/git-diff-cache.txt
--- a/Documentation/git-diff-cache.txt
+++ b/Documentation/git-diff-cache.txt
@@ -9,13 +9,15 @@ git-diff-cache - Compares content and mo
 
 SYNOPSIS
 --------
-'git-diff-cache' [-p] [-r] [-z] [-m] [-M] [-R] [-C] [-S<string>] [--cached] <tree-ish>
+'git-diff-cache' [-p] [-r] [-z] [-m] [-M] [-R] [-C] [-S<string>] [--cached] <tree-ish> [<path>...]
 
 DESCRIPTION
 -----------
-Compares the content and mode of the blobs found via a tree object
-with the content of the current cache and, optionally ignoring the
-stat state of the file on disk.
+Compares the content and mode of the blobs found via a tree
+object with the content of the current cache and, optionally
+ignoring the stat state of the file on disk.  When paths are
+specified, compares only those named paths.  Otherwise all
+entries in the cache are compared.
 
 OPTIONS
 -------
diff --git a/diff-cache.c b/diff-cache.c
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -154,7 +154,7 @@ static void mark_merge_entries(void)
 }
 
 static char *diff_cache_usage =
-"git-diff-cache [-p] [-r] [-z] [-m] [-M] [-C] [-R] [-S<string>] [--cached] <tree-ish>";
+"git-diff-cache [-p] [-r] [-z] [-m] [-M] [-C] [-R] [-S<string>] [--cached] <tree-ish> [<path>...]";
 
 int main(int argc, const char **argv)
 {
------------------------------------------------



* [PATCH] Fix diff-pruning logic which was running prune too early.
From: Junio C Hamano @ 2005-05-24  1:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: GIT
In-Reply-To: <7vll651nth.fsf@assigned-by-dhcp.cox.net>

For later stages to reorder patches, pruning logic and rename
detection logic should not decide which delete to discard
(because another entry said it will take over the file as a
rename) until the very end.  Also fix some tests that were
assuming the earlier "last one is rename or keep everything else
is copy" semantics of diff-raw format, which no longer is true.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

diff-cache.c             |    1 
diff-files.c             |    1 
diff-tree.c              |    2 
diff.c                   |  156 +++++++++++++++++++----------------------------
diff.h                   |    2 
diffcore-rename.c        |   25 -------
diffcore.h               |    2 
t/t4005-diff-rename-2.sh |    7 --
8 files changed, 69 insertions(+), 127 deletions(-)

diff --git a/diff-cache.c b/diff-cache.c
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -229,7 +229,6 @@ int main(int argc, const char **argv)
 	ret = diff_cache(active_cache, active_nr);
 	if (detect_rename)
 		diffcore_rename(detect_rename, diff_score_opt);
-	diffcore_prune();
 	if (pickaxe)
 		diffcore_pickaxe(pickaxe);
 	if (2 <= argc)
diff --git a/diff-files.c b/diff-files.c
--- a/diff-files.c
+++ b/diff-files.c
@@ -115,7 +115,6 @@ int main(int argc, const char **argv)
 	}
 	if (detect_rename)
 		diffcore_rename(detect_rename, diff_score_opt);
-	diffcore_prune();
 	if (pickaxe)
 		diffcore_pickaxe(pickaxe);
 	if (1 < argc)
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -266,10 +266,8 @@ static int call_diff_flush(void)
 {
 	if (detect_rename)
 		diffcore_rename(detect_rename, diff_score_opt);
-	diffcore_prune();
 	if (pickaxe)
 		diffcore_pickaxe(pickaxe);
-
 	if (diff_queue_is_empty()) {
 		diff_flush(DIFF_FORMAT_NO_OUTPUT, 0);
 		return 0;
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -12,9 +12,6 @@ static const char *diff_opts = "-pu";
 static unsigned char null_sha1[20] = { 0, };
 
 static int reverse_diff;
-static int generate_patch;
-static int line_termination = '\n';
-static int inter_name_termination = '\t';
 
 static const char *external_diff(void)
 {
@@ -502,7 +499,9 @@ struct diff_filepair *diff_queue(struct 
 	return dp;
 }
 
-static void diff_flush_raw(struct diff_filepair *p)
+static void diff_flush_raw(struct diff_filepair *p,
+			   int line_termination,
+			   int inter_name_termination)
 {
 	int two_paths;
 	char status[10];
@@ -566,10 +565,6 @@ static void diff_flush_patch(struct diff
 	const char *name, *other;
 	char msg_[PATH_MAX*2+200], *msg;
 
-	/* diffcore_prune() keeps "stay" entries for diff-raw
-	 * copy/rename detection, but when we are generating
-	 * patches we do not need them.
-	 */
 	if (diff_unmodified_pair(p))
 		return;
 
@@ -585,7 +580,7 @@ static void diff_flush_patch(struct diff
 			"similarity index %d%%\n"
 			"copy from %s\n"
 			"copy to %s\n",
-			(int)(0.5 + p->score * 100/MAX_SCORE),
+			(int)(0.5 + p->score * 100.0/MAX_SCORE),
 			p->one->path, p->two->path);
 		msg = msg_;
 		break;
@@ -594,7 +589,7 @@ static void diff_flush_patch(struct diff
 			"similarity index %d%%\n"
 			"rename old %s\n"
 			"rename new %s\n",
-			(int)(0.5 + p->score * 100/MAX_SCORE),
+			(int)(0.5 + p->score * 100.0/MAX_SCORE),
 			p->one->path, p->two->path);
 		msg = msg_;
 		break;
@@ -630,105 +625,82 @@ int diff_needs_to_stay(struct diff_queue
 	return 0;
 }
 
-static int diff_used_as_source(struct diff_queue_struct *q, int lim,
-			       struct diff_filespec *it)
+int diff_queue_is_empty(void)
 {
-	int i;
-	for (i = 0; i < lim; i++) {
-		struct diff_filepair *p = q->queue[i++];
-		if (!strcmp(p->one->path, it->path))
-			return 1;
-	}
-	return 0;
+	struct diff_queue_struct *q = &diff_queued_diff;
+	return q->nr == 0;
 }
 
-void diffcore_prune(void)
+static void diff_resolve_rename_copy(void)
 {
-	/*
-	 * Although rename/copy detection wants to have "no-change"
-	 * entries fed into them, the downstream do not need to see
-	 * them, unless we had rename/copy for the same path earlier.
-	 * This function removes such entries.
-	 *
-	 * The applications that use rename/copy should:
-	 *
-	 * (1) feed change and "no-change" entries via diff_queue().
-	 * (2) call diffcore_rename, and any other future diffcore_xxx
-	 *     that would benefit by still having "no-change" entries.
-	 * (3) call diffcore_prune
-	 * (4) call other diffcore_xxx that do not need to see
-	 *     "no-change" entries.
-	 * (5) call diff_flush().
-	 */
-	struct diff_queue_struct *q = &diff_queued_diff;
-	struct diff_queue_struct outq;
 	int i;
-
-	outq.queue = NULL;
-	outq.nr = outq.alloc = 0;
-
+	struct diff_queue_struct *q = &diff_queued_diff;
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
-		if (!diff_unmodified_pair(p) ||
-		    diff_used_as_source(q, i, p->one))
-			diff_q(&outq, p);
-		else
-			free(p);
+		p->status = 0;
+		if (DIFF_PAIR_UNMERGED(p))
+			p->status = 'U';
+		else if (!DIFF_FILE_VALID((p)->one))
+			p->status = 'N';
+		else if (!DIFF_FILE_VALID((p)->two)) {
+			/* maybe earlier one said 'R', meaning
+			 * it will take it, in which case we do
+			 * not need to keep 'D'.
+			 */
+			int j;
+			for (j = 0; j < i; j++) {
+				struct diff_filepair *pp = q->queue[j];
+				if (pp->status == 'R' &&
+				    !strcmp(pp->one->path, p->one->path))
+					break;
+			}
+			if (j < i)
+				continue;
+			p->status = 'D';
+		}
+		else if (strcmp(p->one->path, p->two->path)) {
+			/* This is rename or copy.  Which one is it? */
+			if (diff_needs_to_stay(q, i+1, p->one))
+				p->status = 'C';
+			else
+				p->status = 'R';
+		}
+		else if (memcmp(p->one->sha1, p->two->sha1, 20))
+			p->status = 'M';
+		else {
+			/* we do not need this one */
+			p->status = 0;
+		}
 	}
-	free(q->queue);
-	*q = outq;
-	return;
-}
-
-int diff_queue_is_empty(void)
-{
-	struct diff_queue_struct *q = &diff_queued_diff;
-	return q->nr == 0;
 }
 
 void diff_flush(int diff_output_style, int resolve_rename_copy)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	int i;
+	int line_termination = '\n';
+	int inter_name_termination = '\t';
 
-	generate_patch = 0;
-	switch (diff_output_style) {
-	case DIFF_FORMAT_HUMAN:
-		line_termination = '\n';
-		inter_name_termination = '\t';
-		break;
-	case DIFF_FORMAT_MACHINE:
+	if (diff_output_style == DIFF_FORMAT_MACHINE)
 		line_termination = inter_name_termination = 0;
-		break;
-	case DIFF_FORMAT_PATCH:
-		generate_patch = 1;
-		break;
-	}
+	if (resolve_rename_copy)
+		diff_resolve_rename_copy();
+
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
-		if (resolve_rename_copy) {
-			if (DIFF_PAIR_UNMERGED(p))
-				p->status = 'U';
-			else if (!DIFF_FILE_VALID((p)->one))
-				p->status = 'N';
-			else if (!DIFF_FILE_VALID((p)->two))
-				p->status = 'D';
-			else if (strcmp(p->one->path, p->two->path)) {
-				/* This is rename or copy.  Which one is it? */
-				if (diff_needs_to_stay(q, i+1, p->one))
-					p->status = 'C';
-				else
-					p->status = 'R';
-			}
-			else
-				p->status = 'M';
-		}
-		if (generate_patch)
+		if (p->status == 0)
+			continue;
+		switch (diff_output_style) {
+		case DIFF_FORMAT_PATCH:
 			diff_flush_patch(p);
-		else
-			diff_flush_raw(p);
+			break;
+		case DIFF_FORMAT_HUMAN:
+		case DIFF_FORMAT_MACHINE:
+			diff_flush_raw(p, line_termination,
+				       inter_name_termination);
+			break;
+		}
 	}
-
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		diff_free_filespec_data(p->one);
@@ -755,9 +727,9 @@ void diff_addremove(int addremove, unsig
 	 * with something like '=' or '*' (I haven't decided
 	 * which but should not make any difference).
 	 * Feeding the same new and old to diff_change() 
-	 * also has the same effect.  diffcore_prune() should
-	 * be used to filter uninteresting ones out before the
-	 * final output happens.
+	 * also has the same effect.
+	 * Before the final output happens, they are pruned after
+	 * merged into rename/copy pairs as appropriate.
 	 */
 	if (reverse_diff)
 		addremove = (addremove == '+' ? '-' :
diff --git a/diff.h b/diff.h
--- a/diff.h
+++ b/diff.h
@@ -39,8 +39,6 @@ extern void diff_setup(int reverse);
 
 extern void diffcore_rename(int rename_copy, int minimum_score);
 
-extern void diffcore_prune(void);
-
 extern void diffcore_pickaxe(const char *needle);
 extern void diffcore_pathspec(const char **pathspec);
 
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -133,10 +133,7 @@ static void record_rename_pair(struct di
 	 * The downstream diffcore transformers are free to reorder
 	 * the entries as long as they keep file pairs that has the
 	 * same p->one->path in earlier rename_rank to appear before
-	 * later ones.  This ordering is used by the diff_flush()
-	 * logic to tell renames from copies, and also used by the
-	 * diffcore_prune() logic to omit unnecessary
-	 * "no-modification" entries.
+	 * later ones.
 	 *
 	 * To the final output routine, and in the diff-raw format
 	 * output, a rename/copy that is based on a path that has a
@@ -271,14 +268,8 @@ void diffcore_rename(int detect_rename, 
 
 	/* We really want to cull the candidates list early
 	 * with cheap tests in order to avoid doing deltas.
-	 *
-	 * With the current callers, we should not have already
-	 * matched entries at this point, but it is nonetheless
-	 * checked for sanity.
 	 */
 	for (i = 0; i < created.nr; i++) {
-		if (created.s[i]->xfrm_flags & RENAME_DST_MATCHED)
-			continue; /* we have matched exactly already */
 		for (h = 0; h < sizeof(srcs)/sizeof(srcs[0]); h++) {
 			struct diff_rename_pool *p = srcs[h];
 			for (j = 0; j < p->nr; j++) {
@@ -386,25 +377,13 @@ void diffcore_rename(int detect_rename, 
 		}
 		else if (!DIFF_FILE_VALID(p->two)) {
 			/* deleted */
-			if (p->one->xfrm_flags & RENAME_SRC_GONE)
-				; /* rename/copy deleted it already */
-			else
-				diff_queue(q, p->one, p->two);
+			diff_queue(q, p->one, p->two);
 		}
 		else if (strcmp(p->one->path, p->two->path)) {
 			/* rename or copy */
 			struct diff_filepair *dp =
 				diff_queue(q, p->one, p->two);
 			dp->score = p->score;
-
-			/* if we have a later entry that is a rename/copy
-			 * that depends on p->one, then we copy here.
-			 * otherwise we rename it.
-			 */
-			if (!diff_needs_to_stay(&outq, i+1, p->one))
-				/* this is the last one, so mark it as gone.
-				 */
-				p->one->xfrm_flags |= RENAME_SRC_GONE;
 		}
 		else
 			/* otherwise it is a modified (or "stay") entry */
diff --git a/diffcore.h b/diffcore.h
--- a/diffcore.h
+++ b/diffcore.h
@@ -12,8 +12,6 @@
 #define DEFAULT_MINIMUM_SCORE 5000
 
 #define RENAME_DST_MATCHED 01
-#define RENAME_SRC_GONE    02
-#define RENAME_SCORE_SHIFT 8
 
 struct diff_filespec {
 	unsigned char sha1[20];
diff --git a/t/t4005-diff-rename-2.sh b/t/t4005-diff-rename-2.sh
--- a/t/t4005-diff-rename-2.sh
+++ b/t/t4005-diff-rename-2.sh
@@ -147,14 +147,13 @@ test_expect_success \
 ################################################################
 
 # tree has COPYING and rezrov.  work tree has the same COPYING and
-# copy-edited COPYING.1, and unchanged rezrov.  We should see
-# unmodified COPYING in the output, so that downstream diff-helper can
-# notice.  We should not say anything about rezrov.
+# copy-edited COPYING.1, and unchanged rezrov.  We should not say
+# anything about rezrov nor COPYING, since the revised again diff-raw
+# nows how to say Copy.
 
 git-diff-cache -C $tree >current
 cat >expected <<\EOF
 :100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178 C1234	COPYING	COPYING.1
-:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 6ff87c4664981e4397625791c8ea3bbb5f2279a3 M	COPYING
 EOF
 
 test_expect_success \
------------------------------------------------



* Re: [PATCH] diff-raw format update take #2.
From: Linus Torvalds @ 2005-05-24  1:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Chris Wedgwood, git
In-Reply-To: <7v64x91mfb.fsf@assigned-by-dhcp.cox.net>



On Mon, 23 May 2005, Junio C Hamano wrote:
> 
> Embedded spaces in path is _always_ safe.

For raw-diff yes, but since you'd normally end up using that name in the 
diff, it won't be safe any more. 

Imagine a name like "this is a file", and think about how the diff ends up 
looking:

	diff --git a/this is a file b/this is a file

and realize that that can't be parsed sanely by anybody who uses the diff.

And here '-z' doesn't help us, because we're basically screwed by the diff 
format (not our own decision).

So CVS uses "Index: " to help this somewhat, and we can get it right for
renames and copies (because we then output the name in a way that is at
least space and tab-safe, if not newline-safe). But basically, anything
that uses patches as a medium for passing information around should
_really_ avoid using spaces or tabs in filenames, and that's quite
independent of git ;/

		Linus
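
To make the parsing problem concrete, here is a small sketch in C of a
naive reader that splits the header on blanks. count_fields is a made-up
name and this is not how any particular tool parses the line; it only
shows that the expected four fields are no longer four once the name
contains spaces.

```c
/*
 * Count the blank-separated fields on one line.  A reader that expects
 * "diff", "--git", "a/<name>" and "b/<name>" relies on this being 4.
 */
int count_fields(const char *line)
{
	int fields = 0, in_field = 0;

	for (; *line && *line != '\n'; line++) {
		if (*line == ' ' || *line == '\t')
			in_field = 0;
		else if (!in_field) {
			in_field = 1;
			fields++;
		}
	}
	return fields;
}
```

A header for "file" has four fields; the header for "this is a file"
has ten, and nothing in the line says which blanks belong to the names.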


* Re: [PATCH] diff-raw format update take #2.
From: Chris Wedgwood @ 2005-05-24  1:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.58.0505231758350.2307@ppc970.osdl.org>

On Mon, May 23, 2005 at 06:03:09PM -0700, Linus Torvalds wrote:

> For raw-diff yes, but since you'd normally end up using that name in
> the diff, it won't be safe any more.

I'm not really worried about a diff being spat out that can't be
cleanly parsed --- I was more worried about the application doing
BadThings(tm) and crashing or eating some of my repo or worse.


* Re: [PATCH] diff-raw format update take #2.
From: Linus Torvalds @ 2005-05-24  0:51 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Junio C Hamano, git
In-Reply-To: <046ec1d00820537103092ed264f81f65.IBX@taniwha.stupidest.org>



On Mon, 23 May 2005, Chris Wedgwood wrote:
> 
> Sure, I guess I meant to ask what would happen when not using '-z'?  Will
> something notice this early on barf and tell me to use '-z' or will
> BadThings(tm) just come bite me at some (possibly) later stage?

Well, normally you'd not use the raw format, and the worst that can happen 
is likely that the pathnames in the diff are screwed up.

Side note: files with spaces/tabs in the names have serious problems in
diffs anyway, because parsing the name ends up becoming largely random at
that point. Sad. The problem there is that the format for filenames in
diffs is not very well-specified.

THAT is a much bigger problem than the raw diff format, since that 
actually ends up interfering with interoperability.

The same goes for broken DOS CR-NL text-files, btw. If anybody ever ports
git to the crap that is DOS/Windows (and I assume NT does it too), they'll
have endless problems with interoperating with sane systems.

		Linus


* Re: [PATCH] diff-raw format update take #2.
From: Junio C Hamano @ 2005-05-24  0:45 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Linus Torvalds, git
In-Reply-To: <046ec1d00820537103092ed264f81f65.IBX@taniwha.stupidest.org>

>>>>> "CW" == Chris Wedgwood <cw@f00f.org> writes:

CW> On Mon, May 23, 2005 at 05:25:32PM -0700, Junio C Hamano wrote:
>> Then you would use '-z'.  (10) becomes NUL which your path
>> cannot have inside.  So do (12) and (14).

CW> Sure, I guess I meant to ask what would happen when not using '-z'?  Will
CW> something notice this early on barf and tell me to use '-z' or will
CW> BadThings(tm) just come bite me at some (possibly) later stage?

Embedded spaces in path is _always_ safe.  And I think with the
current code, unless you are using rename detection, paths
with embedded TABs are also OK (but do not depend on it).

If you are using rename detection, your rename source path is
truncated at the first TAB and your rename destination path has
the remainder of the source path, with an extra TAB, prepended
to it.  Nothing as far as I know would detect and warn that
situation.  If you have an embedded LF, then you are SOL,
period.  Just do not do it.

I _could_ add code to diff-helper to barf if your path has an
embedded TAB in it, but I am not sure if that is worth it.  Also
I _could_ add code to the diff-raw output routine to barf if your
path has these problematic characters in it and you are not
using '-z'.  I think the latter makes quite a lot of sense.

The design comes from this reasoning (third point of "a few
results"); please look in your archive if you care about the
details.

    To:	git@vger.kernel.org
    Subject: Re: updated design for the diff-raw format.
    Date:	Sat, 21 May 2005 16:17:33 -0700
    Message-ID: <7vll68dv8y.fsf@assigned-by-dhcp.cox.net>

    (second of the replayed message, with blessing from Linus)
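
The truncation described above for rename records can be reproduced with
a very small sketch in C, assuming the reader cuts the "src<TAB>dst" part
of a record at the first TAB. split_at_first_tab is a hypothetical name,
not a function in diff-helper.

```c
#include <string.h>

/*
 * Cut "src<TAB>dst" in place at the first TAB.  Returns a pointer to
 * what the reader will take as dst, or NULL if there is no TAB at all.
 */
char *split_at_first_tab(char *paths)
{
	char *tab = strchr(paths, '\t');

	if (!tab)
		return NULL;
	*tab = 0;
	return tab + 1;
}
```

With a source path "a<TAB>b" and a destination "c", the record carries
"a<TAB>b<TAB>c"; the reader ends up with source "a" and destination
"b<TAB>c", and nothing flags the mistake.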



* Re: [PATCH] diff-raw format update take #2.
From: Chris Wedgwood @ 2005-05-24  0:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git
In-Reply-To: <7vhdgt1ncz.fsf@assigned-by-dhcp.cox.net>

On Mon, May 23, 2005 at 05:25:32PM -0700, Junio C Hamano wrote:

> Then you would use '-z'.  (10) becomes NUL which your path
> cannot have inside.  So do (12) and (14).

Sure, I guess I meant to ask what would happen when not using '-z'?  Will
something notice this early on barf and tell me to use '-z' or will
BadThings(tm) just come bite me at some (possibly) later stage?



* Re: [PATCH] diff-raw format update take #2.
From: Junio C Hamano @ 2005-05-24  0:25 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Linus Torvalds, git
In-Reply-To: <87bcada447378d0173a3c5f165c70b38.ANY@taniwha.stupidest.org>

>>>>> "CW" == Chris Wedgwood <cw@f00f.org> writes:

>> +  (1) a colon.
>> +  (2) mode for "src"; 000000 if creation or unmerged.
>> +  (3) a space.
>> +  (4) mode for "dst"; 000000 if deletion or unmerged.
>> +  (5) a space.
>> +  (6) sha1 for "src"; 0{40} if creation or unmerged.
>> +  (7) a space.
>> +  (8) sha1 for "dst"; 0{40} if creation, unmerged or "look at work tree".
>> +  (9) status, followed by similarlity index number only for C and R.
>> + (10) a tab or a NUL when '-z' option is used.
>> + (11) path for "src"

CW> What if the path has embedded tabs or spaces?

Then you would use '-z'.  (10) becomes NUL which your path
cannot have inside.  So do (12) and (14).
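
A sketch in C of a reader for one record laid out as in the quoted list,
to show why NUL separators make embedded tabs and spaces harmless. The
names raw_record and parse_raw are hypothetical, and the single space
between the second sha1 and the status letter is an assumption taken
from the example lines rather than from the numbered list. This is not
the diff-helper code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct raw_record {
	unsigned src_mode, dst_mode;
	char src_sha1[41], dst_sha1[41];
	char status;
	int score;		/* stays 0 unless digits follow the status */
	const char *src_path;
	const char *dst_path;	/* NULL unless status is C or R */
};

/*
 * Parse one record in place.  sep is the byte between fields (10) and
 * (12): '\t' normally, 0 with -z.  term ends the record: '\n' or 0.
 * Returns the number of bytes consumed, or 0 on a malformed record.
 */
size_t parse_raw(char *buf, size_t len, int sep, int term,
		 struct raw_record *r)
{
	char *p = buf, *end = buf + len, *q;

	/* fixed-width part: ':' mode ' ' mode ' ' sha1 ' ' sha1 ' ' status */
	if (len < 98 || p[0] != ':' || p[7] != ' ' || p[14] != ' ' ||
	    p[55] != ' ' || p[96] != ' ')
		return 0;
	r->src_mode = (unsigned)strtoul(p + 1, NULL, 8);
	r->dst_mode = (unsigned)strtoul(p + 8, NULL, 8);
	memcpy(r->src_sha1, p + 15, 40);
	r->src_sha1[40] = 0;
	memcpy(r->dst_sha1, p + 56, 40);
	r->dst_sha1[40] = 0;
	r->status = p[97];
	r->score = 0;
	for (p += 98; p < end && '0' <= *p && *p <= '9'; p++)
		r->score = r->score * 10 + (*p - '0');
	if (p == end || *p != sep)
		return 0;
	r->src_path = ++p;
	r->dst_path = NULL;
	if (r->status == 'C' || r->status == 'R') {
		/* two paths: src ends at the next separator */
		q = memchr(p, sep, (size_t)(end - p));
		if (!q)
			return 0;
		*q = 0;
		r->dst_path = p = q + 1;
	}
	q = memchr(p, term, (size_t)(end - p));
	if (!q)
		return 0;
	*q = 0;
	return (size_t)(q + 1 - buf);
}
```

Called with sep and term both 0, a source path containing a tab and a
destination containing a space are both returned whole, because only
NUL ends a field.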





* Re: git-diff-tree -z HEAD | git-diff-helper -z fails for me
From: Junio C Hamano @ 2005-05-24  0:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Glanzmann, GIT
In-Reply-To: <Pine.LNX.4.58.0505231644560.2307@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Actually, your solution is the yucky one.

Nod.

By the way, there seems to be a big screwup in the pruning code
and currently -C does not work at all.  Just to let you know that
I am already looking into it.



* Re: [PATCH] diff-raw format update take #2.
From: Chris Wedgwood @ 2005-05-24  0:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git
In-Reply-To: <7vy8a51uay.fsf_-_@assigned-by-dhcp.cox.net>

> +  (1) a colon.
> +  (2) mode for "src"; 000000 if creation or unmerged.
> +  (3) a space.
> +  (4) mode for "dst"; 000000 if deletion or unmerged.
> +  (5) a space.
> +  (6) sha1 for "src"; 0{40} if creation or unmerged.
> +  (7) a space.
> +  (8) sha1 for "dst"; 0{40} if creation, unmerged or "look at work tree".
> +  (9) status, followed by similarlity index number only for C and R.
> + (10) a tab or a NUL when '-z' option is used.
> + (11) path for "src"

What if the path has embedded tabs or spaces?  I skimmed over the patch
but didn't see anything to deal with this (I didn't apply it and check
the result though it might have been outside the patch).


* Re: git-diff-tree -z HEAD | git-diff-helper -z fails for me
From: Linus Torvalds @ 2005-05-23 23:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Glanzmann, GIT
In-Reply-To: <7vpsvh3hp0.fsf@assigned-by-dhcp.cox.net>



On Mon, 23 May 2005, Junio C Hamano wrote:
> 
> LT> How about instead making sure that any "extra" text be NUL-terminated and
> LT> never start with ':' after a NUL (which will automatically be true, since
> LT> it's either "diff-tree " + ascii for the verbose case, or just the tree
> LT> name).
> 
> Makes much more sense although it has certain amount of Yuck
> factor ;-).

Actually, your solution is the yucky one.

You didn't realize that your whole DIFF_FORMAT_MACHINE case really can be 
written as just

	printf("%s%c", header, 0);

ie you print the header as _one_ long line, instead of splitting it up 
into many. It's still a perfectly valid line, and perfectly unrecognizable 
as such.

Now, maybe diff-helper is unhappy about such long lines, but that should 
be solvable..

		Linus
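
A rough sketch in C of what a -z reader could do with such output,
assuming the whole header arrives as one NUL-terminated chunk that does
not begin with ':'. count_colon_chunks is a made-up name. The sketch is
deliberately naive: a real reader would use the status letter to know
how many path chunks follow a record, since a path may itself begin
with ':'.

```c
#include <stddef.h>
#include <string.h>

/*
 * Walk a buffer of NUL-terminated chunks and count those that begin
 * with ':'.  Header text, however many newlines it contains, is one
 * chunk and is skipped because it starts with something else.
 */
int count_colon_chunks(const char *buf, size_t len)
{
	const char *p = buf, *end = buf + len;
	int n = 0;

	while (p < end) {
		const char *nul = memchr(p, 0, (size_t)(end - p));

		if (!nul)
			break;	/* trailing bytes without a NUL: ignore */
		if (*p == ':')
			n++;
		p = nul + 1;
	}
	return n;
}
```

A multi-line header followed by one raw record and its path counts as
a single record start.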


* [PATCH] diff-raw format update take #2.
From: Junio C Hamano @ 2005-05-23 22:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <7vy8a51uay.fsf_-_@assigned-by-dhcp.cox.net>

Addendum.

------------
Document the diff-helper changes.

Now that diff-helper is a simple diff-raw to patch format converter
that does not take -R/-M/-C, drop the description of these
options from the documentation.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
cd /opt/packrat/playpen/public/in-place/git/git.junio/
jit-diff
# - HEAD: diff-raw format update take #2.
# + (working tree)
diff --git a/Documentation/git-diff-helper.txt b/Documentation/git-diff-helper.txt
--- a/Documentation/git-diff-helper.txt
+++ b/Documentation/git-diff-helper.txt
@@ -9,7 +9,7 @@ git-diff-helper - Generates patch format
 
 SYNOPSIS
 --------
-'git-diff-helper' [-z] [-R] [-M] [-C] [-S<string>]
+'git-diff-helper' [-z] [-S<string>]
 
 DESCRIPTION
 -----------
@@ -21,22 +21,6 @@ OPTIONS
 -z::
 	\0 line termination on input
 
--R::
-	Output diff in reverse.  This is useful for displaying output from
-	"git-diff-cache" which always compares tree with cache or working
-	file.  E.g.
-
-		git-diff-cache <tree> | git-diff-helper -R file.c
-
-	would show a diff to bring the working file back to what
-	is in the <tree>.
-
--M::
-	Detect renames.
-
--C::
-	Detect copies as well as renames.
-
 -S<string>::
 	Look for differences that contains the change in <string>.
 

Compilation finished at Mon May 23 15:06:31



* [PATCH] diff-raw format update take #2.
From: Junio C Hamano @ 2005-05-23 21:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0505231156210.2307@ppc970.osdl.org>

This changes the diff-raw format again, following the mailing
list discussion.  The new format explicitly expresses which one
is a rename and which one is a copy.

The documentation and tests are updated to match this change.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Documentation/diff-format.txt |   42 ++++----
diff-cache.c                  |    2 
diff-files.c                  |    2 
diff-helper.c                 |  200 ++++++++++++++++++++--------------------
diff-tree.c                   |    4 
diff.c                        |  124 +++++++++++++++----------
diff.h                        |   16 +--
diffcore.h                    |    1 
t/t0000-basic.sh              |   16 +--
t/t4002-diff-basic.sh         |  208 +++++++++++++++++++++---------------------
t/t4003-diff-rename-1.sh      |  103 ++++++++++----------
t/t4005-diff-rename-2.sh      |  124 ++++++++++++++++++++++---
12 files changed, 493 insertions(+), 349 deletions(-)

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -19,28 +19,34 @@ git-diff-files [<pattern>...]::
 
 An output line is formatted this way:
 
-  ':' <mode> ' ' <mode> ' ' <sha1> ' ' <sha1> I <path> I <path> L
-
-By default, I and L are '\t' and '\n' respectively.  When '-z'
-flag is in effect, both I and L are '\0'.
-
-In each <mode>, <sha1> and <path> pair, left hand side describes
-the left hand side of what is being compared (<tree-ish> in
-git-diff-cache, <tree-ish-1> in git-diff-tree, cache contents in
-git-diff-files).  Non-existence is shown by having 000000 in the
-<mode> column.  That is, 000000 appears as the first <mode> for
-newly created files, and as the second <mode> for deleted files.
-
-Usually two <path> are the same.  When rename/copy detection is
-used, however, an "create" and another "delete" records can be
-merged into a single record that has two <path>, old name and
-new name.
+in-place edit  :100644 100644 bcd1234... 0123456... M file0
+copy-edit      :100644 100644 abcd123... 1234567... C68 file1 file2
+rename-edit    :100644 100644 abcd123... 1234567... R86 file1 file3
+create         :000000 100644 0000000... 1234567... N file4
+delete         :100644 000000 1234567... 0000000... D file5
+unmerged       :000000 000000 0000000... 0000000... U file6
+
+That is, from the left to the right:
+
+  (1) a colon.
+  (2) mode for "src"; 000000 if creation or unmerged.
+  (3) a space.
+  (4) mode for "dst"; 000000 if deletion or unmerged.
+  (5) a space.
+  (6) sha1 for "src"; 0{40} if creation or unmerged.
+  (7) a space.
+  (8) sha1 for "dst"; 0{40} if creation, unmerged or "look at work tree".
+  (9) status, followed by similarlity index number only for C and R.
+ (10) a tab or a NUL when '-z' option is used.
+ (11) path for "src"
+ (12) a tab or a NUL when '-z' option is used; only exists for C or R.
+ (13) path for "dst"; only exists for C or R.
+ (14) an LF or a NUL when '-z' option is used, to terminate the record.
 
 <sha1> is shown as all 0's if new is a file on the filesystem
 and it is out of sync with the cache.  Example:
 
-  :100644 100644 5be4a4...... 000000......    file.c    file.c
-
+  :100644 100644 5be4a4...... 000000...... M file.c
 
 Generating patches with -p
 --------------------------
diff --git a/diff-cache.c b/diff-cache.c
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -234,6 +234,6 @@ int main(int argc, const char **argv)
 		diffcore_pickaxe(pickaxe);
 	if (2 <= argc)
 		diffcore_pathspec(argv + 1);
-	diff_flush(diff_output_format);
+	diff_flush(diff_output_format, 1);
 	return ret;
 }
diff --git a/diff-files.c b/diff-files.c
--- a/diff-files.c
+++ b/diff-files.c
@@ -120,6 +120,6 @@ int main(int argc, const char **argv)
 		diffcore_pickaxe(pickaxe);
 	if (1 < argc)
 		diffcore_pathspec(argv + 1);
-	diff_flush(diff_output_format);
+	diff_flush(diff_output_format, 1);
 	return 0;
 }
diff --git a/diff-helper.c b/diff-helper.c
--- a/diff-helper.c
+++ b/diff-helper.c
@@ -5,84 +5,21 @@
 #include "strbuf.h"
 #include "diff.h"
 
-static int detect_rename = 0;
-static int diff_score_opt = 0;
 static const char *pickaxe = NULL;
-static int diff_output_style = DIFF_FORMAT_PATCH;
 static int line_termination = '\n';
 static int inter_name_termination = '\t';
 
-static int parse_diff_raw(char *buf1, char *buf2, char *buf3)
-{
-	char old_path[PATH_MAX];
-	unsigned char old_sha1[20], new_sha1[20];
-	char *ep;
-	char *cp = buf1;
-	int ch, old_mode, new_mode;
-
-	old_mode = new_mode = 0;
-	while ((ch = *cp) && ('0' <= ch && ch <= '7')) {
-		old_mode = (old_mode << 3) | (ch - '0');
-		cp++;
-	}
-	if (*cp++ != ' ')
-		return -1;
-	while ((ch = *cp) && ('0' <= ch && ch <= '7')) {
-		new_mode = (new_mode << 3) | (ch - '0');
-		cp++;
-	}
-	if (*cp++ != ' ')
-		return -1;
-	if (get_sha1_hex(cp, old_sha1))
-		return -1;
-	cp += 40;
-	if (*cp++ != ' ')
-		return -1;
-	if (get_sha1_hex(cp, new_sha1))
-		return -1;
-	cp += 40;
-	if (*cp++ != inter_name_termination)
-		return -1;
-	if (buf2)
-		cp = buf2;
-	ep = strchr(cp, inter_name_termination);
-	if (!ep)
-		return -1;
-	*ep++ = 0;
-	strcpy(old_path, cp);
-	diff_guif(old_mode, new_mode, old_sha1, new_sha1,
-		  old_path, buf3 ? buf3 : ep);
-	return 0;
-}
-
 static const char *diff_helper_usage =
-	"git-diff-helper [-z] [-R] [-M] [-C] [-S<string>] paths...";
+	"git-diff-helper [-z] [-S<string>] paths...";
 
 int main(int ac, const char **av) {
-	struct strbuf sb1, sb2, sb3;
-	int reverse_diff = 0;
+	struct strbuf sb;
 
-	strbuf_init(&sb1);
-	strbuf_init(&sb2);
-	strbuf_init(&sb3);
+	strbuf_init(&sb);
 
 	while (1 < ac && av[1][0] == '-') {
-		if (av[1][1] == 'R')
-			reverse_diff = 1;
-		else if (av[1][1] == 'z')
+		if (av[1][1] == 'z')
 			line_termination = inter_name_termination = 0;
-		else if (av[1][1] == 'p') /* hidden from the help */
-			diff_output_style = DIFF_FORMAT_HUMAN;
-		else if (av[1][1] == 'P') /* hidden from the help */
-			diff_output_style = DIFF_FORMAT_MACHINE;
-		else if (av[1][1] == 'M') {
-			detect_rename = DIFF_DETECT_RENAME;
-			diff_score_opt = diff_scoreopt_parse(av[1]);
-		}
-		else if (av[1][1] == 'C') {
-			detect_rename = DIFF_DETECT_COPY;
-			diff_score_opt = diff_scoreopt_parse(av[1]);
-		}
 		else if (av[1][1] == 'S') {
 			pickaxe = av[1] + 2;
 		}
@@ -92,45 +29,114 @@ int main(int ac, const char **av) {
 	}
 	/* the remaining parameters are paths patterns */
 
-	diff_setup(reverse_diff);
+	diff_setup(0);
 	while (1) {
-		int status;
-		read_line(&sb1, stdin, line_termination);
-		if (sb1.eof)
+		unsigned old_mode, new_mode;
+		unsigned char old_sha1[20], new_sha1[20];
+		char old_path[PATH_MAX];
+		int status, score, two_paths;
+		char new_path[PATH_MAX];
+
+		int ch;
+		char *cp, *ep;
+
+		read_line(&sb, stdin, line_termination);
+		if (sb.eof)
 			break;
-		switch (sb1.buf[0]) {
-		case 'U':
-			diff_unmerge(sb1.buf + 2);
-			continue;
+		switch (sb.buf[0]) {
 		case ':':
-			break;
-		default:
-			goto unrecognized;
-		}
-		if (!line_termination) {
-			read_line(&sb2, stdin, line_termination);
-			if (sb2.eof)
+			/* parse the first part up to the status */
+			cp = sb.buf + 1;
+			old_mode = new_mode = 0;
+			while ((ch = *cp) && ('0' <= ch && ch <= '7')) {
+				old_mode = (old_mode << 3) | (ch - '0');
+				cp++;
+			}
+			if (*cp++ != ' ')
 				break;
-			read_line(&sb3, stdin, line_termination);
-			if (sb3.eof)
+			while ((ch = *cp) && ('0' <= ch && ch <= '7')) {
+				new_mode = (new_mode << 3) | (ch - '0');
+				cp++;
+			}
+			if (*cp++ != ' ')
 				break;
-			status = parse_diff_raw(sb1.buf+1, sb2.buf, sb3.buf);
-		}
-		else
-			status = parse_diff_raw(sb1.buf+1, NULL, NULL);
-		if (status) {
-		unrecognized:
-			diff_flush(diff_output_style);
-			printf("%s\n", sb1.buf);
+			if (get_sha1_hex(cp, old_sha1))
+				break;
+			cp += 40;
+			if (*cp++ != ' ')
+				break;
+			if (get_sha1_hex(cp, new_sha1))
+				break;
+			cp += 40;
+			if (*cp++ != ' ')
+				break;
+			status = *cp++;
+			if (!strchr("MCRNDU", status))
+				break;
+			two_paths = score = 0;
+			if (status == 'R' || status == 'C') {
+				two_paths = 1;
+				score = strtol(cp, &cp, 10);
+				if (line_termination) {
+					cp = strchr(cp,
+						    inter_name_termination);
+					if (!cp)
+						break;
+				}
+			}
+
+			if (*cp++ != inter_name_termination)
+				break;
+
+			/* first pathname */
+			if (!line_termination) {
+				read_line(&sb, stdin, line_termination);
+				if (sb.eof)
+					break;
+				strcpy(old_path, sb.buf);
+			}
+			else if (!two_paths)
+				strcpy(old_path, cp);
+			else {
+				ep = strchr(cp, inter_name_termination);
+				if (!ep)
+					break;
+				strncpy(old_path, cp, ep-cp);
+				old_path[ep-cp] = 0;
+				cp = ep + 1;
+			}
+
+			/* second pathname */
+			if (!two_paths)
+				strcpy(new_path, old_path);
+			else {
+				if (!line_termination) {
+					read_line(&sb, stdin,
+						  line_termination);
+					if (sb.eof)
+						break;
+					strcpy(new_path, sb.buf);
+				}
+				else
+					strcpy(new_path, cp);
+			}
+			diff_helper_input(old_mode, new_mode,
+					  old_sha1, new_sha1,
+					  old_path, status, score,
+					  new_path);
+			continue;
 		}
+		if (pickaxe)
+			diffcore_pickaxe(pickaxe);
+		if (1 < ac)
+			diffcore_pathspec(av + 1);
+		diff_flush(DIFF_FORMAT_PATCH, 0);
+		printf("%s\n", sb.buf);
 	}
-	if (detect_rename)
-		diffcore_rename(detect_rename, diff_score_opt);
-	diffcore_prune();
 	if (pickaxe)
 		diffcore_pickaxe(pickaxe);
-	if (ac)
+	if (1 < ac)
 		diffcore_pathspec(av + 1);
-	diff_flush(diff_output_style);
+	diff_flush(DIFF_FORMAT_PATCH, 0);
 	return 0;
 }
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -270,7 +270,7 @@ static int call_diff_flush(void)
 	if (pickaxe) {
 		diffcore_pickaxe(pickaxe);
 		if (diff_queue_is_empty()) {
-			diff_flush(DIFF_FORMAT_NO_OUTPUT);
+			diff_flush(DIFF_FORMAT_NO_OUTPUT, 0);
 			return 0;
 		}
 	}
@@ -291,7 +291,7 @@ static int call_diff_flush(void)
 		}
 		header = NULL;
 	}
-	diff_flush(diff_output_format);
+	diff_flush(diff_output_format, 1);
 	return 1;
 }
 
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -504,16 +504,29 @@ struct diff_filepair *diff_queue(struct 
 
 static void diff_flush_raw(struct diff_filepair *p)
 {
-	if (DIFF_PAIR_UNMERGED(p)) {
-		printf("U %s%c", p->one->path, line_termination);
-		return;
+	int two_paths;
+	char status[10];
+	switch (p->status) {
+	case 'C': case 'R':
+		two_paths = 1;
+		sprintf(status, "%c%1d", p->status, p->score);
+		break;
+	default:
+		two_paths = 0;
+		status[0] = p->status;
+		status[1] = 0;
+		break;
 	}
 	printf(":%06o %06o %s ",
 	       p->one->mode, p->two->mode, sha1_to_hex(p->one->sha1));
-	printf("%s%c%s%c%s%c",
-	       sha1_to_hex(p->two->sha1), inter_name_termination,
-	       p->one->path, inter_name_termination,
-	       p->two->path, line_termination);
+	printf("%s %s%c%s",
+	       sha1_to_hex(p->two->sha1),
+	       status,
+	       inter_name_termination,
+	       p->one->path);
+	if (two_paths)
+		printf("%c%s", inter_name_termination, p->two->path);
+	putchar(line_termination);
 }
 
 int diff_unmodified_pair(struct diff_filepair *p)
@@ -548,9 +561,10 @@ int diff_unmodified_pair(struct diff_fil
 	return 0;
 }
 
-static void diff_flush_patch(struct diff_filepair *p, const char *msg)
+static void diff_flush_patch(struct diff_filepair *p)
 {
 	const char *name, *other;
+	char msg_[PATH_MAX*2+200], *msg;
 
 	/* diffcore_prune() keeps "stay" entries for diff-raw
 	 * copy/rename detection, but when we are generating
@@ -565,6 +579,29 @@ static void diff_flush_patch(struct diff
 	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
 		return; /* no tree diffs in patch format */ 
 
+	switch (p->status) {
+	case 'C':
+		sprintf(msg_,
+			"similarity index %d%%\n"
+			"copy from %s\n"
+			"copy to %s\n",
+			(int)(0.5 + p->score * 100/MAX_SCORE),
+			p->one->path, p->two->path);
+		msg = msg_;
+		break;
+	case 'R':
+		sprintf(msg_,
+			"similarity index %d%%\n"
+			"rename old %s\n"
+			"rename new %s\n",
+			(int)(0.5 + p->score * 100/MAX_SCORE),
+			p->one->path, p->two->path);
+		msg = msg_;
+		break;
+	default:
+		msg = NULL;
+	}
+
 	if (DIFF_PAIR_UNMERGED(p))
 		run_external_diff(name, NULL, NULL, NULL, NULL);
 	else
@@ -643,21 +680,13 @@ void diffcore_prune(void)
 	return;
 }
 
-static void diff_flush_one(struct diff_filepair *p, const char *msg)
-{
-	if (generate_patch)
-		diff_flush_patch(p, msg);
-	else
-		diff_flush_raw(p);
-}
-
 int diff_queue_is_empty(void)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	return q->nr == 0;
 }
 
-void diff_flush(int diff_output_style)
+void diff_flush(int diff_output_style, int resolve_rename_copy)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	int i;
@@ -676,28 +705,28 @@ void diff_flush(int diff_output_style)
 		break;
 	}
 	for (i = 0; i < q->nr; i++) {
-		char msg_[PATH_MAX*2+200], *msg = NULL;
 		struct diff_filepair *p = q->queue[i];
-		if (strcmp(p->one->path, p->two->path)) {
-			/* This is rename or copy.  Which one is it? */
-			if (diff_needs_to_stay(q, i+1, p->one)) {
-				sprintf(msg_,
-					"similarity index %d%%\n"
-					"copy from %s\n"
-					"copy to %s\n",
-					(int)(0.5 + p->score * 100/MAX_SCORE),
-					p->one->path, p->two->path);
+		if (resolve_rename_copy) {
+			if (DIFF_PAIR_UNMERGED(p))
+				p->status = 'U';
+			else if (!DIFF_FILE_VALID((p)->one))
+				p->status = 'N';
+			else if (!DIFF_FILE_VALID((p)->two))
+				p->status = 'D';
+			else if (strcmp(p->one->path, p->two->path)) {
+				/* This is rename or copy.  Which one is it? */
+				if (diff_needs_to_stay(q, i+1, p->one))
+					p->status = 'C';
+				else
+					p->status = 'R';
 			}
 			else
-				sprintf(msg_,
-					"similarity index %d%%\n"
-					"rename old %s\n"
-					"rename new %s\n",
-					(int)(0.5 + p->score * 100/MAX_SCORE),
-					p->one->path, p->two->path);
-			msg = msg_;
+				p->status = 'M';
 		}
-		diff_flush_one(p, msg);
+		if (generate_patch)
+			diff_flush_patch(p);
+		else
+			diff_flush_raw(p);
 	}
 
 	for (i = 0; i < q->nr; i++) {
@@ -747,28 +776,27 @@ void diff_addremove(int addremove, unsig
 	diff_queue(&diff_queued_diff, one, two);
 }
 
-void diff_guif(unsigned old_mode,
-	       unsigned new_mode,
-	       const unsigned char *old_sha1,
-	       const unsigned char *new_sha1,
-	       const char *old_path,
-	       const char *new_path)
+void diff_helper_input(unsigned old_mode,
+		       unsigned new_mode,
+		       const unsigned char *old_sha1,
+		       const unsigned char *new_sha1,
+		       const char *old_path,
+		       int status,
+		       int score,
+		       const char *new_path)
 {
 	struct diff_filespec *one, *two;
+	struct diff_filepair *dp;
 
-	if (reverse_diff) {
-		unsigned tmp;
-		const unsigned char *tmp_c;
-		tmp = old_mode; old_mode = new_mode; new_mode = tmp;
-		tmp_c = old_sha1; old_sha1 = new_sha1; new_sha1 = tmp_c;
-	}
 	one = alloc_filespec(old_path);
 	two = alloc_filespec(new_path);
 	if (old_mode)
 		fill_filespec(one, old_sha1, old_mode);
 	if (new_mode)
 		fill_filespec(two, new_sha1, new_mode);
-	diff_queue(&diff_queued_diff, one, two);
+	dp = diff_queue(&diff_queued_diff, one, two);
+	dp->score = score;
+	dp->status = status;
 }
 
 void diff_change(unsigned old_mode, unsigned new_mode,
diff --git a/diff.h b/diff.h
--- a/diff.h
+++ b/diff.h
@@ -15,12 +15,14 @@ extern void diff_change(unsigned mode1, 
 			     const unsigned char *sha2,
 			     const char *base, const char *path);
 
-extern void diff_guif(unsigned mode1,
-		      unsigned mode2,
-		      const unsigned char *sha1,
-		      const unsigned char *sha2,
-		      const char *path1,
-		      const char *path2);
+extern void diff_helper_input(unsigned mode1,
+			      unsigned mode2,
+			      const unsigned char *sha1,
+			      const unsigned char *sha2,
+			      const char *path1,
+			      int status,
+			      int score,
+			      const char *path2);
 
 extern void diff_unmerge(const char *path);
 
@@ -44,6 +46,6 @@ extern void diffcore_pathspec(const char
 
 extern int diff_queue_is_empty(void);
 
-extern void diff_flush(int output_style);
+extern void diff_flush(int output_style, int resolve_rename_copy);
 
 #endif /* DIFF_H */
diff --git a/diffcore.h b/diffcore.h
--- a/diffcore.h
+++ b/diffcore.h
@@ -47,6 +47,7 @@ struct diff_filepair {
 			  * certain ordering of patches that later
 			  * diffcore transformations should not break.
 			  */
+	int status; /* M C R N D U (see Documentation/diff-format.txt) */
 };
 #define DIFF_PAIR_UNMERGED(p) \
 	(!DIFF_FILE_VALID((p)->one) && !DIFF_FILE_VALID((p)->two))
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -155,14 +155,14 @@ test_expect_success \
      test "$newtree" = "$tree"'
 
 cat >expected <<\EOF
-:100644 100644 f87290f8eb2cbbea7857214459a0739927eab154 0000000000000000000000000000000000000000	path0	path0
-:120000 120000 15a98433ae33114b085f3eb3bb03b832b3180a01 0000000000000000000000000000000000000000	path0sym	path0sym
-:100644 100644 3feff949ed00a62d9f7af97c15cd8a30595e7ac7 0000000000000000000000000000000000000000	path2/file2	path2/file2
-:120000 120000 d8ce161addc5173867a3c3c730924388daedbc38 0000000000000000000000000000000000000000	path2/file2sym	path2/file2sym
-:100644 100644 0aa34cae68d0878578ad119c86ca2b5ed5b28376 0000000000000000000000000000000000000000	path3/file3	path3/file3
-:120000 120000 8599103969b43aff7e430efea79ca4636466794f 0000000000000000000000000000000000000000	path3/file3sym	path3/file3sym
-:100644 100644 00fb5908cb97c2564a9783c0c64087333b3b464f 0000000000000000000000000000000000000000	path3/subp3/file3	path3/subp3/file3
-:120000 120000 6649a1ebe9e9f1c553b66f5a6e74136a07ccc57c 0000000000000000000000000000000000000000	path3/subp3/file3sym	path3/subp3/file3sym
+:100644 100644 f87290f8eb2cbbea7857214459a0739927eab154 0000000000000000000000000000000000000000 M	path0
+:120000 120000 15a98433ae33114b085f3eb3bb03b832b3180a01 0000000000000000000000000000000000000000 M	path0sym
+:100644 100644 3feff949ed00a62d9f7af97c15cd8a30595e7ac7 0000000000000000000000000000000000000000 M	path2/file2
+:120000 120000 d8ce161addc5173867a3c3c730924388daedbc38 0000000000000000000000000000000000000000 M	path2/file2sym
+:100644 100644 0aa34cae68d0878578ad119c86ca2b5ed5b28376 0000000000000000000000000000000000000000 M	path3/file3
+:120000 120000 8599103969b43aff7e430efea79ca4636466794f 0000000000000000000000000000000000000000 M	path3/file3sym
+:100644 100644 00fb5908cb97c2564a9783c0c64087333b3b464f 0000000000000000000000000000000000000000 M	path3/subp3/file3
+:120000 120000 6649a1ebe9e9f1c553b66f5a6e74136a07ccc57c 0000000000000000000000000000000000000000 M	path3/subp3/file3sym
 EOF
 test_expect_success \
     'validate git-diff-files output for a know cache/work tree state.' \
diff --git a/t/t4002-diff-basic.sh b/t/t4002-diff-basic.sh
--- a/t/t4002-diff-basic.sh
+++ b/t/t4002-diff-basic.sh
@@ -10,120 +10,120 @@ test_description='Test diff raw-output.
 . ../lib-read-tree-m-3way.sh
 
 cat >.test-plain-OA <<\EOF
-:000000 100644 0000000000000000000000000000000000000000 ccba72ad3888a3520b39efcf780b9ee64167535d	AA	AA
-:000000 100644 0000000000000000000000000000000000000000 7e426fb079479fd67f6d81f984e4ec649a44bc25	AN	AN
-:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000	DD	DD
-:000000 040000 0000000000000000000000000000000000000000 6d50f65d3bdab91c63444294d38f08aeff328e42	DF	DF
-:100644 000000 141c1f1642328e4bc46a7d801a71da392e66791e 0000000000000000000000000000000000000000	DM	DM
-:100644 000000 35abde1506ddf806572ff4d407bd06885d0f8ee9 0000000000000000000000000000000000000000	DN	DN
-:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762	LL	LL
-:100644 100644 03f24c8c4700babccfd28b654e7e8eac402ad6cd 103d9f89b50b9aad03054b579be5e7aa665f2d57	MD	MD
-:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 b431b272d829ff3aa4d1a5085f4394ab4d3305b6	MM	MM
-:100644 100644 bd084b0c27c7b6cc34f11d6d0509a29be3caf970 a716d58de4a570e0038f5c307bd8db34daea021f	MN	MN
-:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f	SS	SS
-:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 4c86f9a85fbc5e6804ee2e17a797538fbe785bca	TT	TT
-:040000 040000 7d670fdcdb9929f6c7dac196ff78689cd1c566a1 5e5f22072bb39f6e12cf663a57cb634c76eefb49	Z	Z
+:000000 100644 0000000000000000000000000000000000000000 ccba72ad3888a3520b39efcf780b9ee64167535d N	AA
+:000000 100644 0000000000000000000000000000000000000000 7e426fb079479fd67f6d81f984e4ec649a44bc25 N	AN
+:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000 D	DD
+:000000 040000 0000000000000000000000000000000000000000 6d50f65d3bdab91c63444294d38f08aeff328e42 N	DF
+:100644 000000 141c1f1642328e4bc46a7d801a71da392e66791e 0000000000000000000000000000000000000000 D	DM
+:100644 000000 35abde1506ddf806572ff4d407bd06885d0f8ee9 0000000000000000000000000000000000000000 D	DN
+:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762 N	LL
+:100644 100644 03f24c8c4700babccfd28b654e7e8eac402ad6cd 103d9f89b50b9aad03054b579be5e7aa665f2d57 M	MD
+:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 b431b272d829ff3aa4d1a5085f4394ab4d3305b6 M	MM
+:100644 100644 bd084b0c27c7b6cc34f11d6d0509a29be3caf970 a716d58de4a570e0038f5c307bd8db34daea021f M	MN
+:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f M	SS
+:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 4c86f9a85fbc5e6804ee2e17a797538fbe785bca M	TT
+:040000 040000 7d670fdcdb9929f6c7dac196ff78689cd1c566a1 5e5f22072bb39f6e12cf663a57cb634c76eefb49 M	Z
 EOF
 
 cat >.test-recursive-OA <<\EOF
-:000000 100644 0000000000000000000000000000000000000000 ccba72ad3888a3520b39efcf780b9ee64167535d	AA	AA
-:000000 100644 0000000000000000000000000000000000000000 7e426fb079479fd67f6d81f984e4ec649a44bc25	AN	AN
-:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000	DD	DD
-:000000 100644 0000000000000000000000000000000000000000 68a6d8b91da11045cf4aa3a5ab9f2a781c701249	DF/DF	DF/DF
-:100644 000000 141c1f1642328e4bc46a7d801a71da392e66791e 0000000000000000000000000000000000000000	DM	DM
-:100644 000000 35abde1506ddf806572ff4d407bd06885d0f8ee9 0000000000000000000000000000000000000000	DN	DN
-:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762	LL	LL
-:100644 100644 03f24c8c4700babccfd28b654e7e8eac402ad6cd 103d9f89b50b9aad03054b579be5e7aa665f2d57	MD	MD
-:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 b431b272d829ff3aa4d1a5085f4394ab4d3305b6	MM	MM
-:100644 100644 bd084b0c27c7b6cc34f11d6d0509a29be3caf970 a716d58de4a570e0038f5c307bd8db34daea021f	MN	MN
-:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f	SS	SS
-:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 4c86f9a85fbc5e6804ee2e17a797538fbe785bca	TT	TT
-:000000 100644 0000000000000000000000000000000000000000 8acb8e9750e3f644bf323fcf3d338849db106c77	Z/AA	Z/AA
-:000000 100644 0000000000000000000000000000000000000000 087494262084cefee7ed484d20c8dc0580791272	Z/AN	Z/AN
-:100644 000000 879007efae624d2b1307214b24a956f0a8d686a8 0000000000000000000000000000000000000000	Z/DD	Z/DD
-:100644 000000 9b541b2275c06e3a7b13f28badf5294e2ae63df4 0000000000000000000000000000000000000000	Z/DM	Z/DM
-:100644 000000 beb5d38c55283d280685ea21a0e50cfcc0ca064a 0000000000000000000000000000000000000000	Z/DN	Z/DN
-:100644 100644 d41fda41b7ec4de46b43cb7ea42a45001ae393d5 a79ac3be9377639e1c7d1edf1ae1b3a5f0ccd8a9	Z/MD	Z/MD
-:100644 100644 4ca22bae2527d3d9e1676498a0fba3b355bd1278 61422ba9c2c873416061a88cd40a59a35b576474	Z/MM	Z/MM
-:100644 100644 b16d7b25b869f2beb124efa53467d8a1550ad694 a5c544c21cfcb07eb80a4d89a5b7d1570002edfd	Z/MN	Z/MN
+:000000 100644 0000000000000000000000000000000000000000 ccba72ad3888a3520b39efcf780b9ee64167535d N	AA
+:000000 100644 0000000000000000000000000000000000000000 7e426fb079479fd67f6d81f984e4ec649a44bc25 N	AN
+:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000 D	DD
+:000000 100644 0000000000000000000000000000000000000000 68a6d8b91da11045cf4aa3a5ab9f2a781c701249 N	DF/DF
+:100644 000000 141c1f1642328e4bc46a7d801a71da392e66791e 0000000000000000000000000000000000000000 D	DM
+:100644 000000 35abde1506ddf806572ff4d407bd06885d0f8ee9 0000000000000000000000000000000000000000 D	DN
+:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762 N	LL
+:100644 100644 03f24c8c4700babccfd28b654e7e8eac402ad6cd 103d9f89b50b9aad03054b579be5e7aa665f2d57 M	MD
+:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 b431b272d829ff3aa4d1a5085f4394ab4d3305b6 M	MM
+:100644 100644 bd084b0c27c7b6cc34f11d6d0509a29be3caf970 a716d58de4a570e0038f5c307bd8db34daea021f M	MN
+:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f M	SS
+:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 4c86f9a85fbc5e6804ee2e17a797538fbe785bca M	TT
+:000000 100644 0000000000000000000000000000000000000000 8acb8e9750e3f644bf323fcf3d338849db106c77 N	Z/AA
+:000000 100644 0000000000000000000000000000000000000000 087494262084cefee7ed484d20c8dc0580791272 N	Z/AN
+:100644 000000 879007efae624d2b1307214b24a956f0a8d686a8 0000000000000000000000000000000000000000 D	Z/DD
+:100644 000000 9b541b2275c06e3a7b13f28badf5294e2ae63df4 0000000000000000000000000000000000000000 D	Z/DM
+:100644 000000 beb5d38c55283d280685ea21a0e50cfcc0ca064a 0000000000000000000000000000000000000000 D	Z/DN
+:100644 100644 d41fda41b7ec4de46b43cb7ea42a45001ae393d5 a79ac3be9377639e1c7d1edf1ae1b3a5f0ccd8a9 M	Z/MD
+:100644 100644 4ca22bae2527d3d9e1676498a0fba3b355bd1278 61422ba9c2c873416061a88cd40a59a35b576474 M	Z/MM
+:100644 100644 b16d7b25b869f2beb124efa53467d8a1550ad694 a5c544c21cfcb07eb80a4d89a5b7d1570002edfd M	Z/MN
 EOF
 cat >.test-plain-OB <<\EOF
-:000000 100644 0000000000000000000000000000000000000000 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2	AA	AA
-:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000	DD	DD
-:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787	DF	DF
-:100644 100644 141c1f1642328e4bc46a7d801a71da392e66791e 3c4d8de5fbad08572bab8e10eef8dbb264cf0231	DM	DM
-:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762	LL	LL
-:100644 000000 03f24c8c4700babccfd28b654e7e8eac402ad6cd 0000000000000000000000000000000000000000	MD	MD
-:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 19989d4559aae417fedee240ccf2ba315ea4dc2b	MM	MM
-:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67	NA	NA
-:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000	ND	ND
-:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983	NM	NM
-:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f	SS	SS
-:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 c4e4a12231b9fa79a0053cb6077fcb21bb5b135a	TT	TT
-:040000 040000 7d670fdcdb9929f6c7dac196ff78689cd1c566a1 1ba523955d5160681af65cb776411f574c1e8155	Z	Z
+:000000 100644 0000000000000000000000000000000000000000 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2 N	AA
+:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000 D	DD
+:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787 N	DF
+:100644 100644 141c1f1642328e4bc46a7d801a71da392e66791e 3c4d8de5fbad08572bab8e10eef8dbb264cf0231 M	DM
+:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762 N	LL
+:100644 000000 03f24c8c4700babccfd28b654e7e8eac402ad6cd 0000000000000000000000000000000000000000 D	MD
+:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 19989d4559aae417fedee240ccf2ba315ea4dc2b M	MM
+:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67 N	NA
+:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000 D	ND
+:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983 M	NM
+:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f M	SS
+:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 c4e4a12231b9fa79a0053cb6077fcb21bb5b135a M	TT
+:040000 040000 7d670fdcdb9929f6c7dac196ff78689cd1c566a1 1ba523955d5160681af65cb776411f574c1e8155 M	Z
 EOF
 cat >.test-recursive-OB <<\EOF
-:000000 100644 0000000000000000000000000000000000000000 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2	AA	AA
-:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000	DD	DD
-:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787	DF	DF
-:100644 100644 141c1f1642328e4bc46a7d801a71da392e66791e 3c4d8de5fbad08572bab8e10eef8dbb264cf0231	DM	DM
-:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762	LL	LL
-:100644 000000 03f24c8c4700babccfd28b654e7e8eac402ad6cd 0000000000000000000000000000000000000000	MD	MD
-:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 19989d4559aae417fedee240ccf2ba315ea4dc2b	MM	MM
-:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67	NA	NA
-:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000	ND	ND
-:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983	NM	NM
-:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f	SS	SS
-:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 c4e4a12231b9fa79a0053cb6077fcb21bb5b135a	TT	TT
-:000000 100644 0000000000000000000000000000000000000000 6c0b99286d0bce551ac4a7b3dff8b706edff3715	Z/AA	Z/AA
-:100644 000000 879007efae624d2b1307214b24a956f0a8d686a8 0000000000000000000000000000000000000000	Z/DD	Z/DD
-:100644 100644 9b541b2275c06e3a7b13f28badf5294e2ae63df4 d77371d15817fcaa57eeec27f770c505ba974ec1	Z/DM	Z/DM
-:100644 000000 d41fda41b7ec4de46b43cb7ea42a45001ae393d5 0000000000000000000000000000000000000000	Z/MD	Z/MD
-:100644 100644 4ca22bae2527d3d9e1676498a0fba3b355bd1278 697aad7715a1e7306ca76290a3dd4208fbaeddfa	Z/MM	Z/MM
-:000000 100644 0000000000000000000000000000000000000000 d12979c22fff69c59ca9409e7a8fe3ee25eaee80	Z/NA	Z/NA
-:100644 000000 a18393c636b98e9bd7296b8b437ea4992b72440c 0000000000000000000000000000000000000000	Z/ND	Z/ND
-:100644 100644 3fdbe17fd013303a2e981e1ca1c6cd6e72789087 7e09d6a3a14bd630913e8c75693cea32157b606d	Z/NM	Z/NM
+:000000 100644 0000000000000000000000000000000000000000 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2 N	AA
+:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000 D	DD
+:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787 N	DF
+:100644 100644 141c1f1642328e4bc46a7d801a71da392e66791e 3c4d8de5fbad08572bab8e10eef8dbb264cf0231 M	DM
+:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762 N	LL
+:100644 000000 03f24c8c4700babccfd28b654e7e8eac402ad6cd 0000000000000000000000000000000000000000 D	MD
+:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 19989d4559aae417fedee240ccf2ba315ea4dc2b M	MM
+:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67 N	NA
+:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000 D	ND
+:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983 M	NM
+:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f M	SS
+:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 c4e4a12231b9fa79a0053cb6077fcb21bb5b135a M	TT
+:000000 100644 0000000000000000000000000000000000000000 6c0b99286d0bce551ac4a7b3dff8b706edff3715 N	Z/AA
+:100644 000000 879007efae624d2b1307214b24a956f0a8d686a8 0000000000000000000000000000000000000000 D	Z/DD
+:100644 100644 9b541b2275c06e3a7b13f28badf5294e2ae63df4 d77371d15817fcaa57eeec27f770c505ba974ec1 M	Z/DM
+:100644 000000 d41fda41b7ec4de46b43cb7ea42a45001ae393d5 0000000000000000000000000000000000000000 D	Z/MD
+:100644 100644 4ca22bae2527d3d9e1676498a0fba3b355bd1278 697aad7715a1e7306ca76290a3dd4208fbaeddfa M	Z/MM
+:000000 100644 0000000000000000000000000000000000000000 d12979c22fff69c59ca9409e7a8fe3ee25eaee80 N	Z/NA
+:100644 000000 a18393c636b98e9bd7296b8b437ea4992b72440c 0000000000000000000000000000000000000000 D	Z/ND
+:100644 100644 3fdbe17fd013303a2e981e1ca1c6cd6e72789087 7e09d6a3a14bd630913e8c75693cea32157b606d M	Z/NM
 EOF
 cat >.test-plain-AB <<\EOF
-:100644 100644 ccba72ad3888a3520b39efcf780b9ee64167535d 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2	AA	AA
-:100644 000000 7e426fb079479fd67f6d81f984e4ec649a44bc25 0000000000000000000000000000000000000000	AN	AN
-:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787	DF	DF
-:040000 000000 6d50f65d3bdab91c63444294d38f08aeff328e42 0000000000000000000000000000000000000000	DF	DF
-:000000 100644 0000000000000000000000000000000000000000 3c4d8de5fbad08572bab8e10eef8dbb264cf0231	DM	DM
-:000000 100644 0000000000000000000000000000000000000000 35abde1506ddf806572ff4d407bd06885d0f8ee9	DN	DN
-:100644 000000 103d9f89b50b9aad03054b579be5e7aa665f2d57 0000000000000000000000000000000000000000	MD	MD
-:100644 100644 b431b272d829ff3aa4d1a5085f4394ab4d3305b6 19989d4559aae417fedee240ccf2ba315ea4dc2b	MM	MM
-:100644 100644 a716d58de4a570e0038f5c307bd8db34daea021f bd084b0c27c7b6cc34f11d6d0509a29be3caf970	MN	MN
-:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67	NA	NA
-:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000	ND	ND
-:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983	NM	NM
-:100644 100644 4c86f9a85fbc5e6804ee2e17a797538fbe785bca c4e4a12231b9fa79a0053cb6077fcb21bb5b135a	TT	TT
-:040000 040000 5e5f22072bb39f6e12cf663a57cb634c76eefb49 1ba523955d5160681af65cb776411f574c1e8155	Z	Z
+:100644 100644 ccba72ad3888a3520b39efcf780b9ee64167535d 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2 M	AA
+:100644 000000 7e426fb079479fd67f6d81f984e4ec649a44bc25 0000000000000000000000000000000000000000 D	AN
+:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787 N	DF
+:040000 000000 6d50f65d3bdab91c63444294d38f08aeff328e42 0000000000000000000000000000000000000000 D	DF
+:000000 100644 0000000000000000000000000000000000000000 3c4d8de5fbad08572bab8e10eef8dbb264cf0231 N	DM
+:000000 100644 0000000000000000000000000000000000000000 35abde1506ddf806572ff4d407bd06885d0f8ee9 N	DN
+:100644 000000 103d9f89b50b9aad03054b579be5e7aa665f2d57 0000000000000000000000000000000000000000 D	MD
+:100644 100644 b431b272d829ff3aa4d1a5085f4394ab4d3305b6 19989d4559aae417fedee240ccf2ba315ea4dc2b M	MM
+:100644 100644 a716d58de4a570e0038f5c307bd8db34daea021f bd084b0c27c7b6cc34f11d6d0509a29be3caf970 M	MN
+:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67 N	NA
+:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000 D	ND
+:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983 M	NM
+:100644 100644 4c86f9a85fbc5e6804ee2e17a797538fbe785bca c4e4a12231b9fa79a0053cb6077fcb21bb5b135a M	TT
+:040000 040000 5e5f22072bb39f6e12cf663a57cb634c76eefb49 1ba523955d5160681af65cb776411f574c1e8155 M	Z
 EOF
 cat >.test-recursive-AB <<\EOF
-:100644 100644 ccba72ad3888a3520b39efcf780b9ee64167535d 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2	AA	AA
-:100644 000000 7e426fb079479fd67f6d81f984e4ec649a44bc25 0000000000000000000000000000000000000000	AN	AN
-:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787	DF	DF
-:100644 000000 68a6d8b91da11045cf4aa3a5ab9f2a781c701249 0000000000000000000000000000000000000000	DF/DF	DF/DF
-:000000 100644 0000000000000000000000000000000000000000 3c4d8de5fbad08572bab8e10eef8dbb264cf0231	DM	DM
-:000000 100644 0000000000000000000000000000000000000000 35abde1506ddf806572ff4d407bd06885d0f8ee9	DN	DN
-:100644 000000 103d9f89b50b9aad03054b579be5e7aa665f2d57 0000000000000000000000000000000000000000	MD	MD
-:100644 100644 b431b272d829ff3aa4d1a5085f4394ab4d3305b6 19989d4559aae417fedee240ccf2ba315ea4dc2b	MM	MM
-:100644 100644 a716d58de4a570e0038f5c307bd8db34daea021f bd084b0c27c7b6cc34f11d6d0509a29be3caf970	MN	MN
-:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67	NA	NA
-:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000	ND	ND
-:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983	NM	NM
-:100644 100644 4c86f9a85fbc5e6804ee2e17a797538fbe785bca c4e4a12231b9fa79a0053cb6077fcb21bb5b135a	TT	TT
-:100644 100644 8acb8e9750e3f644bf323fcf3d338849db106c77 6c0b99286d0bce551ac4a7b3dff8b706edff3715	Z/AA	Z/AA
-:100644 000000 087494262084cefee7ed484d20c8dc0580791272 0000000000000000000000000000000000000000	Z/AN	Z/AN
-:000000 100644 0000000000000000000000000000000000000000 d77371d15817fcaa57eeec27f770c505ba974ec1	Z/DM	Z/DM
-:000000 100644 0000000000000000000000000000000000000000 beb5d38c55283d280685ea21a0e50cfcc0ca064a	Z/DN	Z/DN
-:100644 000000 a79ac3be9377639e1c7d1edf1ae1b3a5f0ccd8a9 0000000000000000000000000000000000000000	Z/MD	Z/MD
-:100644 100644 61422ba9c2c873416061a88cd40a59a35b576474 697aad7715a1e7306ca76290a3dd4208fbaeddfa	Z/MM	Z/MM
-:100644 100644 a5c544c21cfcb07eb80a4d89a5b7d1570002edfd b16d7b25b869f2beb124efa53467d8a1550ad694	Z/MN	Z/MN
-:000000 100644 0000000000000000000000000000000000000000 d12979c22fff69c59ca9409e7a8fe3ee25eaee80	Z/NA	Z/NA
-:100644 000000 a18393c636b98e9bd7296b8b437ea4992b72440c 0000000000000000000000000000000000000000	Z/ND	Z/ND
-:100644 100644 3fdbe17fd013303a2e981e1ca1c6cd6e72789087 7e09d6a3a14bd630913e8c75693cea32157b606d	Z/NM	Z/NM
+:100644 100644 ccba72ad3888a3520b39efcf780b9ee64167535d 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2 M	AA
+:100644 000000 7e426fb079479fd67f6d81f984e4ec649a44bc25 0000000000000000000000000000000000000000 D	AN
+:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787 N	DF
+:100644 000000 68a6d8b91da11045cf4aa3a5ab9f2a781c701249 0000000000000000000000000000000000000000 D	DF/DF
+:000000 100644 0000000000000000000000000000000000000000 3c4d8de5fbad08572bab8e10eef8dbb264cf0231 N	DM
+:000000 100644 0000000000000000000000000000000000000000 35abde1506ddf806572ff4d407bd06885d0f8ee9 N	DN
+:100644 000000 103d9f89b50b9aad03054b579be5e7aa665f2d57 0000000000000000000000000000000000000000 D	MD
+:100644 100644 b431b272d829ff3aa4d1a5085f4394ab4d3305b6 19989d4559aae417fedee240ccf2ba315ea4dc2b M	MM
+:100644 100644 a716d58de4a570e0038f5c307bd8db34daea021f bd084b0c27c7b6cc34f11d6d0509a29be3caf970 M	MN
+:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67 N	NA
+:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000 D	ND
+:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983 M	NM
+:100644 100644 4c86f9a85fbc5e6804ee2e17a797538fbe785bca c4e4a12231b9fa79a0053cb6077fcb21bb5b135a M	TT
+:100644 100644 8acb8e9750e3f644bf323fcf3d338849db106c77 6c0b99286d0bce551ac4a7b3dff8b706edff3715 M	Z/AA
+:100644 000000 087494262084cefee7ed484d20c8dc0580791272 0000000000000000000000000000000000000000 D	Z/AN
+:000000 100644 0000000000000000000000000000000000000000 d77371d15817fcaa57eeec27f770c505ba974ec1 N	Z/DM
+:000000 100644 0000000000000000000000000000000000000000 beb5d38c55283d280685ea21a0e50cfcc0ca064a N	Z/DN
+:100644 000000 a79ac3be9377639e1c7d1edf1ae1b3a5f0ccd8a9 0000000000000000000000000000000000000000 D	Z/MD
+:100644 100644 61422ba9c2c873416061a88cd40a59a35b576474 697aad7715a1e7306ca76290a3dd4208fbaeddfa M	Z/MM
+:100644 100644 a5c544c21cfcb07eb80a4d89a5b7d1570002edfd b16d7b25b869f2beb124efa53467d8a1550ad694 M	Z/MN
+:000000 100644 0000000000000000000000000000000000000000 d12979c22fff69c59ca9409e7a8fe3ee25eaee80 N	Z/NA
+:100644 000000 a18393c636b98e9bd7296b8b437ea4992b72440c 0000000000000000000000000000000000000000 D	Z/ND
+:100644 100644 3fdbe17fd013303a2e981e1ca1c6cd6e72789087 7e09d6a3a14bd630913e8c75693cea32157b606d M	Z/NM
 EOF
 
 x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
@@ -134,7 +134,7 @@ cmp_diff_files_output () {
     # object ID for the changed files because it wants you to look at the
     # filesystem.
     sed <"$2" >.test-tmp \
-	-e '/^:000000 /d;s/'$x40'	/'$z40'	/' &&
+	-e '/^:000000 /d;s/'$x40'\( [MCRNDU][0-9]*\)	/'$z40'\1	/' &&
     diff "$1" .test-tmp
 }
 
diff --git a/t/t4003-diff-rename-1.sh b/t/t4003-diff-rename-1.sh
--- a/t/t4003-diff-rename-1.sh
+++ b/t/t4003-diff-rename-1.sh
@@ -8,6 +8,14 @@ test_description='More rename detection
 '
 . ./test-lib.sh
 
+compare_diff_patch () {
+    # When heuristics are improved, the score numbers would change.
+    # Ignore them while comparing.
+    sed -e '/^similarity index [0-9]*%$/d' <"$1" >.tmp-1
+    sed -e '/^similarity index [0-9]*%$/d' <"$2" >.tmp-2
+    diff -u .tmp-1 .tmp-2 && rm -f .tmp-1 .tmp-2
+}
+
 test_expect_success \
     'prepare reference tree' \
     'cat ../../COPYING >COPYING &&
@@ -28,38 +36,35 @@ test_expect_success \
 # copy-and-edit one, and rename-and-edit the other.  We do not say
 # anything about rezrov.
 
-GIT_DIFF_OPTS=--unified=0 git-diff-cache -M -p $tree |
-sed -e 's/\([0-9][0-9]*\)/#/g' >current &&
+GIT_DIFF_OPTS=--unified=0 git-diff-cache -M -p $tree >current
 cat >expected <<\EOF
-diff --git a/COPYING b/COPYING.#
-similarity index #%
+diff --git a/COPYING b/COPYING.1
 copy from COPYING
-copy to COPYING.#
+copy to COPYING.1
 --- a/COPYING
-+++ b/COPYING.#
-@@ -# +# @@
-- HOWEVER, in order to allow a migration to GPLv# if that seems like
-+ However, in order to allow a migration to GPLv# if that seems like
-diff --git a/COPYING b/COPYING.#
-similarity index #%
++++ b/COPYING.1
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ However, in order to allow a migration to GPLv3 if that seems like
+diff --git a/COPYING b/COPYING.2
 rename old COPYING
-rename new COPYING.#
+rename new COPYING.2
 --- a/COPYING
-+++ b/COPYING.#
-@@ -# +# @@
++++ b/COPYING.2
+@@ -2 +2 @@
 - Note that the only valid version of the GPL as far as this project
 + Note that the only valid version of the G.P.L as far as this project
-@@ -# +# @@
-- HOWEVER, in order to allow a migration to GPLv# if that seems like
-+ HOWEVER, in order to allow a migration to G.P.Lv# if that seems like
-@@ -# +# @@
--	This file is licensed under the GPL v#, or a later version
-+	This file is licensed under the G.P.L v#, or a later version
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ HOWEVER, in order to allow a migration to G.P.Lv3 if that seems like
+@@ -12 +12 @@
+-	This file is licensed under the GPL v2, or a later version
++	This file is licensed under the G.P.L v2, or a later version
 EOF
 
 test_expect_success \
-    'validate output from rename/copy detection' \
-    'diff -u current expected'
+    'validate output from rename/copy detection (#1)' \
+    'compare_diff_patch current expected'
 
 test_expect_success \
     'prepare work tree again' \
@@ -71,35 +76,33 @@ test_expect_success \
 # edited one, and copy-and-edit the other.  We do not say
 # anything about rezrov.
 
-GIT_DIFF_OPTS=--unified=0 git-diff-cache -C -p $tree |
-sed -e 's/\([0-9][0-9]*\)/#/g' >current
+GIT_DIFF_OPTS=--unified=0 git-diff-cache -C -p $tree >current
 cat >expected <<\EOF
-diff --git a/COPYING b/COPYING.#
-similarity index #%
+diff --git a/COPYING b/COPYING.1
 copy from COPYING
-copy to COPYING.#
+copy to COPYING.1
 --- a/COPYING
-+++ b/COPYING.#
-@@ -# +# @@
-- HOWEVER, in order to allow a migration to GPLv# if that seems like
-+ However, in order to allow a migration to GPLv# if that seems like
++++ b/COPYING.1
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ However, in order to allow a migration to GPLv3 if that seems like
 diff --git a/COPYING b/COPYING
 --- a/COPYING
 +++ b/COPYING
-@@ -# +# @@
+@@ -2 +2 @@
 - Note that the only valid version of the GPL as far as this project
 + Note that the only valid version of the G.P.L as far as this project
-@@ -# +# @@
-- HOWEVER, in order to allow a migration to GPLv# if that seems like
-+ HOWEVER, in order to allow a migration to G.P.Lv# if that seems like
-@@ -# +# @@
--	This file is licensed under the GPL v#, or a later version
-+	This file is licensed under the G.P.L v#, or a later version
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ HOWEVER, in order to allow a migration to G.P.Lv3 if that seems like
+@@ -12 +12 @@
+-	This file is licensed under the GPL v2, or a later version
++	This file is licensed under the G.P.L v2, or a later version
 EOF
 
 test_expect_success \
-    'validate output from rename/copy detection' \
-    'diff -u current expected'
+    'validate output from rename/copy detection (#2)' \
+    'compare_diff_patch current expected'
 
 test_expect_success \
     'prepare work tree once again' \
@@ -112,22 +115,20 @@ test_expect_success \
 # the diff-core.  Unchanged rezrov, although being fed to
 # git-diff-cache as well, should not be mentioned.
 
-GIT_DIFF_OPTS=--unified=0 git-diff-cache -C -p $tree |
-sed -e 's/\([0-9][0-9]*\)/#/g' >current
+GIT_DIFF_OPTS=--unified=0 git-diff-cache -C -p $tree >current
 cat >expected <<\EOF
-diff --git a/COPYING b/COPYING.#
-similarity index #%
+diff --git a/COPYING b/COPYING.1
 copy from COPYING
-copy to COPYING.#
+copy to COPYING.1
 --- a/COPYING
-+++ b/COPYING.#
-@@ -# +# @@
-- HOWEVER, in order to allow a migration to GPLv# if that seems like
-+ However, in order to allow a migration to GPLv# if that seems like
++++ b/COPYING.1
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ However, in order to allow a migration to GPLv3 if that seems like
 EOF
 
 test_expect_success \
-    'validate output from rename/copy detection' \
-    'diff -u current expected'
+    'validate output from rename/copy detection (#3)' \
+    'compare_diff_patch current expected'
 
 test_done
diff --git a/t/t4005-diff-rename-2.sh b/t/t4005-diff-rename-2.sh
--- a/t/t4005-diff-rename-2.sh
+++ b/t/t4005-diff-rename-2.sh
@@ -8,6 +8,22 @@ test_description='Same rename detection 
 '
 . ./test-lib.sh
 
+compare_diff_raw () {
+    # When heuristics are improved, the score numbers would change.
+    # Ignore them while comparing.
+    sed -e 's/ \([CR]\)[0-9]*	/\1#/' <"$1" >.tmp-1
+    sed -e 's/ \([CR]\)[0-9]*	/\1#/' <"$2" >.tmp-2
+    diff -u .tmp-1 .tmp-2 && rm -f .tmp-1 .tmp-2
+}
+
+compare_diff_patch () {
+    # When heuristics are improved, the score numbers would change.
+    # Ignore them while comparing.
+    sed -e '/^similarity index [0-9]*%$/d' <"$1" >.tmp-1
+    sed -e '/^similarity index [0-9]*%$/d' <"$2" >.tmp-2
+    diff -u .tmp-1 .tmp-2 && rm -f .tmp-1 .tmp-2
+}
+
 test_expect_success \
     'prepare reference tree' \
     'cat ../../COPYING >COPYING &&
@@ -31,13 +47,47 @@ test_expect_success \
 git-diff-cache -M $tree >current
 
 cat >expected <<\EOF
-:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178	COPYING	COPYING.1
-:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 06c67961bbaed34a127f76d261f4c0bf73eda471	COPYING	COPYING.2
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178 C1234	COPYING	COPYING.1
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 06c67961bbaed34a127f76d261f4c0bf73eda471 R1234	COPYING	COPYING.2
 EOF
 
 test_expect_success \
-    'validate output from rename/copy detection' \
-    'diff -u current expected'
+    'validate output from rename/copy detection (#1)' \
+    'compare_diff_raw current expected'
+
+# make sure diff-helper can grok it.
+mv expected diff-raw
+GIT_DIFF_OPTS=--unified=0 git-diff-helper <diff-raw >current
+cat >expected <<\EOF
+diff --git a/COPYING b/COPYING.1
+copy from COPYING
+copy to COPYING.1
+--- a/COPYING
++++ b/COPYING.1
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ However, in order to allow a migration to GPLv3 if that seems like
+diff --git a/COPYING b/COPYING.2
+rename old COPYING
+rename new COPYING.2
+--- a/COPYING
++++ b/COPYING.2
+@@ -2 +2 @@
+- Note that the only valid version of the GPL as far as this project
++ Note that the only valid version of the G.P.L as far as this project
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ HOWEVER, in order to allow a migration to G.P.Lv3 if that seems like
+@@ -12 +12 @@
+-	This file is licensed under the GPL v2, or a later version
++	This file is licensed under the G.P.L v2, or a later version
+EOF
+
+test_expect_success \
+    'validate output from diff-helper (#1)' \
+    'compare_diff_patch current expected'
+
+################################################################
 
 test_expect_success \
     'prepare work tree again' \
@@ -51,19 +101,51 @@ test_expect_success \
 
 git-diff-cache -C $tree >current
 cat >expected <<\EOF
-:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178	COPYING	COPYING.1
-:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 06c67961bbaed34a127f76d261f4c0bf73eda471	COPYING	COPYING
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178 C1234	COPYING	COPYING.1
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 06c67961bbaed34a127f76d261f4c0bf73eda471 M	COPYING
 EOF
 
 test_expect_success \
-    'validate output from rename/copy detection' \
-    'diff -u current expected'
+    'validate output from rename/copy detection (#2)' \
+    'compare_diff_raw current expected'
 
 test_expect_success \
     'prepare work tree once again' \
     'cat ../../COPYING >COPYING &&
      git-update-cache --add --remove COPYING COPYING.1'
 
+# make sure diff-helper can grok it.
+mv expected diff-raw
+GIT_DIFF_OPTS=--unified=0 git-diff-helper <diff-raw >current
+cat >expected <<\EOF
+diff --git a/COPYING b/COPYING.1
+copy from COPYING
+copy to COPYING.1
+--- a/COPYING
++++ b/COPYING.1
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ However, in order to allow a migration to GPLv3 if that seems like
+diff --git a/COPYING b/COPYING
+--- a/COPYING
++++ b/COPYING
+@@ -2 +2 @@
+- Note that the only valid version of the GPL as far as this project
++ Note that the only valid version of the G.P.L as far as this project
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ HOWEVER, in order to allow a migration to G.P.Lv3 if that seems like
+@@ -12 +12 @@
+-	This file is licensed under the GPL v2, or a later version
++	This file is licensed under the G.P.L v2, or a later version
+EOF
+
+test_expect_success \
+    'validate output from diff-helper (#2)' \
+    'compare_diff_patch current expected'
+
+################################################################
+
 # tree has COPYING and rezrov.  work tree has the same COPYING and
 # copy-edited COPYING.1, and unchanged rezrov.  We should see
 # unmodified COPYING in the output, so that downstream diff-helper can
@@ -71,12 +153,30 @@ test_expect_success \
 
 git-diff-cache -C $tree >current
 cat >expected <<\EOF
-:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178	COPYING	COPYING.1
-:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 6ff87c4664981e4397625791c8ea3bbb5f2279a3	COPYING	COPYING
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178 C1234	COPYING	COPYING.1
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 6ff87c4664981e4397625791c8ea3bbb5f2279a3 M	COPYING
+EOF
+
+test_expect_success \
+    'validate output from rename/copy detection (#3)' \
+    'compare_diff_raw current expected'
+
+# make sure diff-helper can grok it.
+mv expected diff-raw
+GIT_DIFF_OPTS=--unified=0 git-diff-helper <diff-raw >current
+cat >expected <<\EOF
+diff --git a/COPYING b/COPYING.1
+copy from COPYING
+copy to COPYING.1
+--- a/COPYING
++++ b/COPYING.1
+@@ -6 +6 @@
+- HOWEVER, in order to allow a migration to GPLv3 if that seems like
++ However, in order to allow a migration to GPLv3 if that seems like
 EOF
 
 test_expect_success \
-    'validate output from rename/copy detection' \
-    'diff -u current expected'
+    'validate output from diff-helper (#3)' \
+    'compare_diff_patch current expected'
 
 test_done
------------------------------------------------
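
The compare_diff_raw helper added to t4005 above hides the similarity
score before comparing, so that a change in the heuristics does not break
the test.  A minimal sketch of the same idea follows.  mask_score is a
hypothetical name, and unlike the helper in the patch this variant keeps
the surrounding space and tab so the masked record stays readable.

```sh
#!/bin/sh
# Sketch: replace the numeric score after a C or R status letter with "#".
# mask_score is a hypothetical helper, not part of the patch.
mask_score () {
	tab=$(printf '\t')
	sed -e "s/ \([CR]\)[0-9]*${tab}/ \1#${tab}/"
}

old=6ff87c4664981e4397625791c8ea3bbb5f2279a3
new=0603b3238a076dc6c8022aedc6648fa523a17178

# Two records that differ only in the score look identical once masked.
printf ":100644 100644 $old $new C1234\tCOPYING\tCOPYING.1\n" | mask_score
printf ":100644 100644 $old $new C68\tCOPYING\tCOPYING.1\n" | mask_score
```

Records with a plain status letter such as M carry no score and pass
through unchanged.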


^ permalink raw reply

* Re: [PATCH] show changed tree objects with recursive git-diff-tree
From: Nicolas Pitre @ 2005-05-23 21:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.58.0505202025480.2206@ppc970.osdl.org>


[catching up after a weekend away -- you guys have not been standing still]

On Fri, 20 May 2005, Linus Torvalds wrote:

> On Fri, 20 May 2005, Junio C Hamano wrote:
> >
> > Although I do not have immediate objections to what it tries to
> > do, I have to think about the intent of the patch and its
> > ramifications.
> 
> I really think it should be a totally separate flag to enable showing the
> sub-trees if the tree-blobification wants this.
> 
> In fact, I can pretty much _guarantee_ that the patch as posted is the
> wrong thing to do: it will do horribly wrong things for things like
> 
> 	git-whatchanged arch/i386/kernel/head.S
> 
> (but I haven't tried it - try it yourself. The correct output for the 
> kernel archive is just a single commit, and a single blob change in that 
> commit).

OK.  What about the following patch?  It outputs changed tree objects 
only if none of -p, -v or -s is specified, i.e. whenever what is really 
wanted is output of what changed at the object level.  This makes it 
more coherent with the non-recursive output as well.  Checked that 
git-diff-helper doesn't get confused.

If a separate flag is really needed, then consistency dictates that the 
non recursive output should provide output for tree objects only when 
that flag is given as well, which makes the non recursive output rather 
useless in most cases.  And IMHO this is just too much burden for little 
benefit (the extra flag not the recursive tree object output).

=====

This patch adds modified tree objects to the recursive output, just 
like the non-recursive output already does.  When -v, -s or -p is 
specified, the recursive output suppresses modified tree objects 
since they don't make much sense in that case.

Signed-off-by: Nicolas Pitre <nico@cam.org>

diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -127,6 +127,8 @@ static int compare_tree_entry(void *tree
 	if (recursive && S_ISDIR(mode1)) {
 		int retval;
 		char *newbase = malloc_base(base, path1, pathlen1);
+		if (!silent && !verbose_header && !show_root_diff)
+			diff_change(mode1, mode2, sha1, sha2, base, path1);
 		retval = diff_tree_sha1(sha1, sha2, newbase);
 		free(newbase);
 		return retval;

^ permalink raw reply

* Re: [PATCH] Make sure diff-helper can tell rename/copy in the new diff-raw format.
From: Linus Torvalds @ 2005-05-23 19:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0505231156210.2307@ppc970.osdl.org>



On Mon, 23 May 2005, Linus Torvalds wrote:
> 
> because having it there makes it just easier to parse, and means that we 
> can add "reasons" later without having to worry about ambiguities with 
> filenames.

Btw, putting the "what happened" into the fixed-format stuff before the
filenames allows you to have more than a fixed number of filenames even in
the machine-readable format. For example, it would allow the
machine-readable format to have just a single NUL character at the end of
the line for the "modified" case, since now it's unambiguous where the
line ends (thanks to the fact that "M" always has only one file).

Similarly, it allows us to later add a "combine" reason (take two files, 
combine them into a third), which needs three filenames. That would 
become very messy if the "reason" part is after the filenames, because it 
easily gets ambiguous.
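
To illustrate the point, a reader that sees the status first knows how
many path names to expect before it touches them.  The sketch below is
only an assumption about how such a reader could look.  raw_path_count is
a hypothetical name, and the status letters are the ones the updated
tests match (M, C, R, N, D, U).

```sh
#!/bin/sh
# Sketch: map a diff-raw status field to the number of path names that
# follow it. raw_path_count is a hypothetical helper.
raw_path_count () {
	case "$1" in
	[CR]*)	echo 2 ;;	# copy/rename (optional score): source and destination
	[MNDU])	echo 1 ;;	# modify, new, delete, unmerged: a single path
	*)	echo "unknown status: $1" >&2; return 1 ;;
	esac
}

raw_path_count M
raw_path_count R86
```

A later reason that needs three names would only add one more case arm
here; the path-splitting code would not have to guess.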

		Linus

^ permalink raw reply

* Re: git-diff-tree -z HEAD | git-diff-helper -z fails for me
From: Junio C Hamano @ 2005-05-23 19:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Glanzmann, GIT
In-Reply-To: <7vpsvh3hp0.fsf@assigned-by-dhcp.cox.net>

>>>>> "JCH" == Junio C Hamano <junkio@cox.net> writes:

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
LT> How about instead making sure that any "extra" text be NUL-terminated and
LT> never start with ':' after a NUL (which will automatically be true, since
LT> it's either "diff-tree " + ascii for the verbose case, or just the tree
LT> name).

JCH> Makes much more sense although it has certain amount of Yuck
JCH> factor ;-).

------------
NUL terminate diff-tree header lines under -z.

Thomas Glanzmann noticed that diff-tree -z HEAD piped to
diff-helper -z did not work.  Since diff-helper -z expects NUL
terminated lines, we should generate such.

The output side of the diff-helper should always be using '\n'
termination; earlier it used the same line_termination used for
the input side, which was a mistake.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** Against your head, not my head that already has the
*** rename/copy fix.
*** Only lightly tested with the exact command line Thomas used.
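
As a rough illustration of what the diff-tree change does to the header
under -z, the same transformation can be sketched in shell.
nul_terminate is a hypothetical name and the header text is made up.
Unlike the C loop in the patch, this sketch does not add a terminator to
a final line that lacks a newline.

```sh
#!/bin/sh
# Sketch: turn newline-terminated header lines into NUL-terminated records,
# which is what a "-z" reader expects. nul_terminate is a hypothetical name.
nul_terminate () {
	tr '\n' '\000'
}

# Count the NUL terminators produced for a two-line (made-up) header.
printf 'diff-tree example header\nsecond header line\n' |
nul_terminate | tr -cd '\000' | wc -c
```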

cd /opt/packrat/playpen/public/in-place/git/git.junio/
jit-diff
# - linus: Don't care about st_dev in the index file
# + (working tree)
diff --git a/diff-helper.c b/diff-helper.c
--- a/diff-helper.c
+++ b/diff-helper.c
@@ -121,7 +121,7 @@ int main(int ac, const char **av) {
 		if (status) {
 		unrecognized:
 			diff_flush(diff_output_style);
-			printf("%s%c", sb1.buf, line_termination);
+			printf("%s\n", sb1.buf);
 		}
 	}
 	if (detect_rename)
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -277,7 +277,18 @@ static int call_diff_flush(void)
 	if (nr_paths)
 		diffcore_pathspec(paths);
 	if (header) {
-		printf("%s", header);
+		if (diff_output_format == DIFF_FORMAT_MACHINE) {
+			const char *ep, *cp;
+			for (cp = header; *cp; cp = ep) {
+				ep = strchr(cp, '\n');
+				if (ep == 0) ep = cp + strlen(cp);
+				printf("%.*s%c", ep-cp, cp, 0);
+				if (*ep) ep++;
+			}
+		}
+		else {
+			printf("%s", header);
+		}
 		header = NULL;
 	}
 	diff_flush(diff_output_format);



^ permalink raw reply

* Re: [PATCH 1/1] bugfix for git-checkout-cache --prefix=/symlink/export_dir/ -a
From: Linus Torvalds @ 2005-05-23 19:09 UTC (permalink / raw)
  To: David Greaves; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0505231145190.2307@ppc970.osdl.org>



On Mon, 23 May 2005, Linus Torvalds wrote:
> > otherwise fails.
> 
> Hmm.. Does this alternative work for you instead?
> 
> [ Totally untested, please check for sanity first!! ]

Btw, I'm not going to apply this, and expect that David or somebody else 
can validate it and send it back to me as "tested".

		Linus

^ permalink raw reply

* Re: [PATCH] Make sure diff-helper can tell rename/copy in the new diff-raw format.
From: Linus Torvalds @ 2005-05-23 19:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vwtpp3hsa.fsf@assigned-by-dhcp.cox.net>



On Mon, 23 May 2005, Junio C Hamano wrote:
> 
> Human readable form should omit two->path and later fields
> altogether if one->path == two->path, so the above becomes:
> 
> in-place edit  :100644 100644 bcd1234... 0123456... file0
> copy-edit      :100644 100644 abcd123... 1234567... file1 file2 C 68
> rename-edit    :100644 100644 abcd123... 1234567... file1 file3 R 86
> create         :000000 100644 0000000... 1234567... file4
> delete         :100644 000000 1234567... 0000000... file5
> unmerged       :000000 000000 0000000... 0000000... file6

I'm ok with that format, although I'd actually prefer the "what happened"  
thing to come before the pathnames in the "fixed size" section, something
like

in-place edit  :100644 100644 bcd1234... 0123456... M file0
copy-edit      :100644 100644 abcd123... 1234567... C68 file1 file2
rename-edit    :100644 100644 abcd123... 1234567... R86 file1 file3
create         :000000 100644 0000000... 1234567... N file4
delete         :100644 000000 1234567... 0000000... D file5
unmerged       :000000 000000 0000000... 0000000... U file6

because having it there makes it just easier to parse, and means that we 
can add "reasons" later without having to worry about ambiguities with 
filenames. It means, for example, that the character that describes what 
that line does _always_ comes at the same byte offset in the line. That's 
just very convenient for everybody.

(In the above, I left the "similarity index" thing in, but it's not
important for the algorithm, and you could leave it out. It's nice for
debugging, and the major reason to not have it is the fact that it makes 
for non-constant format offsets for the first filename).
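
A minimal sketch of the "same byte offset" property, assuming full
40-hex object names as in the test vectors rather than the abbreviated
ones shown above: the record then starts with ":MODE1 MODE2 SHA1 SHA2 "
(97 bytes), so the status letter is always byte 98.  raw_status is a
hypothetical name.

```sh
#!/bin/sh
# Sketch: read the status letter of a diff-raw record by position alone.
# Assumes 6-digit modes and unabbreviated 40-hex object names.
# raw_status is a hypothetical helper.
raw_status () {
	printf '%s\n' "$1" | cut -c98
}

z40=0000000000000000000000000000000000000000
x40=ccba72ad3888a3520b39efcf780b9ee64167535d
tab=$(printf '\t')

raw_status ":100644 100644 $x40 $x40 M${tab}AA"
raw_status ":000000 100644 $z40 $x40 N${tab}DF"
raw_status ":100644 100644 $x40 $x40 R86${tab}file1${tab}file3"
```

Only the first path name moves when a score follows the letter, which is
the non-constant offset mentioned above.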

			Linus

^ permalink raw reply

* Re: git-diff-tree -z HEAD | git-diff-helper -z fails for me
From: Junio C Hamano @ 2005-05-23 18:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Glanzmann, GIT
In-Reply-To: <Pine.LNX.4.58.0505231119570.2307@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> How about instead making sure that any "extra" text be NUL-terminated and
LT> never start with ':' after a NUL (which will automatically be true, since
LT> it's either "diff-tree " + ascii for the verbose case, or just the tree
LT> name).

Makes much more sense although it has certain amount of Yuck
factor ;-).



^ permalink raw reply

* Re: [PATCH 1/1] bugfix for git-checkout-cache --prefix=/symlink/export_dir/ -a
From: Linus Torvalds @ 2005-05-23 18:45 UTC (permalink / raw)
  To: David Greaves; +Cc: git
In-Reply-To: <E1Da8d0-0002GR-4m@ash.dgreaves.com>



On Mon, 23 May 2005, David Greaves wrote:
>
> If there's a prefix then allow symlinks to directories in it.
> This fixes a bug where
>   git-checkout-cache --prefix=/symlink/export_dir/ -a
> otherwise fails.

Hmm.. Does this alternative work for you instead?

[ Totally untested, please check for sanity first!! ]

		Linus
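
The patch below swaps lstat() for stat() in create_directories().  The
difference can be seen from the shell, where "test -d" follows symlinks
and "test -L" looks at the link itself.  This is only a sketch with
hypothetical helper names and a throwaway temporary directory; it is not
a test of the patch.

```sh
#!/bin/sh
# Sketch: a symlink to a directory passes a stat-style check but is
# reported as a link by an lstat-style check. Helper names are hypothetical.
is_dir_following_links () { test -d "$1"; }	# like stat() + S_ISDIR
is_symlink () { test -L "$1"; }			# like lstat() + S_ISLNK

tmp=$(mktemp -d) || exit 1
mkdir "$tmp/export_dir"
ln -s export_dir "$tmp/symlink"

for p in "$tmp/export_dir" "$tmp/symlink"
do
	if is_dir_following_links "$p"; then d=yes; else d=no; fi
	if is_symlink "$p"; then l=yes; else l=no; fi
	echo "${p##*/}: directory=$d symlink=$l"
done
rm -rf "$tmp"
```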

---
diff --git a/checkout-cache.c b/checkout-cache.c
--- a/checkout-cache.c
+++ b/checkout-cache.c
@@ -37,6 +37,8 @@
 #include "cache.h"
 
 static int force = 0, quiet = 0, not_new = 0, refresh_cache = 0;
+static const char *base_dir = "";
+static int base_dir_len = 0;
 
 static void create_directories(const char *path)
 {
@@ -51,10 +53,10 @@ static void create_directories(const cha
 		if (mkdir(buf, 0755)) {
 			if (errno == EEXIST) {
 				struct stat st;
-				if (!lstat(buf, &st) && S_ISDIR(st.st_mode))
-					continue; /* ok */
-				if (force && !unlink(buf) && !mkdir(buf, 0755))
+				if (len > base_dir_len && force && !unlink(buf) && !mkdir(buf, 0755))
 					continue;
+				if (!stat(buf, &st) && S_ISDIR(st.st_mode))
+					continue; /* ok */
 			}
 			die("cannot create directory at %s", buf);
 		}
@@ -163,11 +165,11 @@ static int write_entry(struct cache_entr
 	return 0;
 }
 
-static int checkout_entry(struct cache_entry *ce, const char *base_dir)
+static int checkout_entry(struct cache_entry *ce)
 {
 	struct stat st;
 	static char path[MAXPATHLEN+1];
-	int len = strlen(base_dir);
+	int len = base_dir_len;
 
 	memcpy(path, base_dir, len);
 	strcpy(path + len, ce->name);
@@ -194,7 +196,7 @@ static int checkout_entry(struct cache_e
 	return write_entry(ce, path);
 }
 
-static int checkout_file(const char *name, const char *base_dir)
+static int checkout_file(const char *name)
 {
 	int pos = cache_name_pos(name, strlen(name));
 	if (pos < 0) {
@@ -209,10 +211,10 @@ static int checkout_file(const char *nam
 		}
 		return -1;
 	}
-	return checkout_entry(active_cache[pos], base_dir);
+	return checkout_entry(active_cache[pos]);
 }
 
-static int checkout_all(const char *base_dir)
+static int checkout_all(void)
 {
 	int i;
 
@@ -220,7 +222,7 @@ static int checkout_all(const char *base
 		struct cache_entry *ce = active_cache[i];
 		if (ce_stage(ce))
 			continue;
-		if (checkout_entry(ce, base_dir) < 0)
+		if (checkout_entry(ce) < 0)
 			return -1;
 	}
 	return 0;
@@ -229,7 +231,6 @@ static int checkout_all(const char *base
 int main(int argc, char **argv)
 {
 	int i, force_filename = 0;
-	const char *base_dir = "";
 	struct cache_file cache_file;
 	int newfd = -1;
 
@@ -241,7 +242,7 @@ int main(int argc, char **argv)
 		const char *arg = argv[i];
 		if (!force_filename) {
 			if (!strcmp(arg, "-a")) {
-				checkout_all(base_dir);
+				checkout_all();
 				continue;
 			}
 			if (!strcmp(arg, "--")) {
@@ -272,10 +273,11 @@ int main(int argc, char **argv)
 			}
 			if (!memcmp(arg, "--prefix=", 9)) {
 				base_dir = arg+9;
+				base_dir_len = strlen(base_dir);
 				continue;
 			}
 		}
-		if (base_dir[0]) {
+		if (base_dir_len) {
 			/* when --prefix is specified we do not
 			 * want to update cache.
 			 */
@@ -285,7 +287,7 @@ int main(int argc, char **argv)
 			}
 			refresh_cache = 0;
 		}
-		checkout_file(arg, base_dir);
+		checkout_file(arg);
 	}
 
 	if (0 <= newfd &&

^ permalink raw reply

* Re: [PATCH] Make sure diff-helper can tell rename/copy in the new diff-raw format.
From: Junio C Hamano @ 2005-05-23 18:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0505230736180.2307@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Btw, I still disagree with this notion that the order of the
LT> use of the names makes a difference.

Having slept over it, I think I tend to agree.  I do not mind
annotating diff-raw output with "is this copy or is this rename"
bit anymore.  While we are at it, I would also want to either
(1) add similarity index field to diff-raw output, or (2) drop
similarity index output from the built-in patch output.  I am
inclined to vote for the former right now (if only it is more
fun to watch), but I can easily be dissuaded.

The proposed diff-raw format, in its fully expanded form, is this:

in-place edit  :100644 100644 bcd1234... 0123456... file0 file0 . 0
copy-edit      :100644 100644 abcd123... 1234567... file1 file2 C 68
rename-edit    :100644 100644 abcd123... 1234567... file1 file3 R 86
create         :000000 100644 0000000... 1234567... file4 file4 . 0
delete         :100644 000000 1234567... 0000000... file5 file5 . 0
unmerged       :000000 000000 0000000... 0000000... file6 file6 . 0

The two columns added are rename/copy bit and similarity index.
When one->path and two->path are the same, they do not mean
anything but for parser simplicity's sake I'd like to have 0 in
the similarity index field and a dot in copy/rename bit field.
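
With every record carrying all eight fields, a consumer can stay very
small.  The following is only a sketch of that idea.  describe_raw is a
hypothetical name, and it assumes whitespace-separated fields and path
names without whitespace, which real path names do not guarantee.

```sh
#!/bin/sh
# Sketch: summarize records in the proposed fully expanded form
# (":mode1 mode2 sha1 sha2 path1 path2 bit score", without the label).
# describe_raw is a hypothetical helper; paths must not contain whitespace.
describe_raw () {
	awk '{
		kind = ($7 == "C") ? "copy" : (($7 == "R") ? "rename" : "same-path")
		printf "%s %s -> %s (similarity %d)\n", kind, $5, $6, $8
	}'
}

describe_raw <<\EOF
:100644 100644 bcd1234... 0123456... file0 file0 . 0
:100644 100644 abcd123... 1234567... file1 file2 C 68
:100644 100644 abcd123... 1234567... file1 file3 R 86
EOF
```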

Human readable form should omit two->path and later fields
altogether if one->path == two->path, so the above becomes:

in-place edit  :100644 100644 bcd1234... 0123456... file0
copy-edit      :100644 100644 abcd123... 1234567... file1 file2 C 68
rename-edit    :100644 100644 abcd123... 1234567... file1 file3 R 86
create         :000000 100644 0000000... 1234567... file4
delete         :100644 000000 1234567... 0000000... file5
unmerged       :000000 000000 0000000... 0000000... file6

This has a nice property that diff-helper, aside from its
diff-raw parsing part, can become quite simplified.  It should
lose rename/copy related flags (-M, -C) because they are already
detected by the tool in the upstream of the pipe; and because
rename-copy is an asymmetric operation, it should also lose the
-R flag.  I think it already does a wrong thing when you use
diff-tree brothers with -M or -C and feed diff-helper -R with
the output that contains already matched rename/copy.

The only thing diff-helper _will_ continue to do is to take a
diff-raw output prepared by diff-tree brothers, and generate
what the upstream tool would have generated if it were given
'-p' (and that should have been the case from the beginning).  

Although I think diffcore transformers other than rename/copy
may still be useful (like pickaxe) in diff-helper, that also can
be handled upstream.


^ permalink raw reply

