git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* updated design for the diff-raw format.
@ 2005-05-21 23:12 Junio C Hamano
  2005-05-21 23:16 ` Junio C Hamano
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-21 23:12 UTC (permalink / raw)
  To: git

After the introduction of the diff-core infrastructure, Linus
and I had a couple more back-and-forth, but by accident these
exchanges happened off the list.  I'll be replaying a couple of
messages to let people know about these because this change will
affect all the Porcalain implementations.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: updated design for the diff-raw format.
  2005-05-21 23:12 updated design for the diff-raw format Junio C Hamano
@ 2005-05-21 23:16 ` Junio C Hamano
  2005-05-21 23:17 ` Junio C Hamano
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-21 23:16 UTC (permalink / raw)
  To: git

(first of the replayed exchange)

To: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [PATCH 3/3] Diff overhaul, adding the other half of copy
 detection.
From: Junio C Hamano <junkio@cox.net>
Date: Sat, 21 May 2005 10:56:06 -0700
Message-ID: <7v4qcwihu1.fsf@assigned-by-dhcp.cox.net>

GIT_DIFF_OPTS=--unified=0 is good to me as well (GNU diffutils
2.8.1).

Now I think I am done with diff, except one thing.  And this is
quite an incompatible change so I do not know how well it would
work in practice.  I am not even advocating this.  It is more
like me thinking aloud.

The diff-raw format we have been dealing with (sorry about '\t'
vs ' ' gotcha again) is internally enhanced by diff-core.  It
first introduces entries for unmodified paths; '*' entries that
has the same mode/sha1 in from->to pair are such entries, and
that is what the change in the [PATCH 3/3] is about.

    *100644->100644 blob 233a250...->66818b4... file0
    *100755->100755 blob fc77389...->7b72d3d... file1
    +100644 blob 233a250... file2
    -100755 blob fc77389... file3
    *100644->100644 blob 233a250...->233a250... file4

Then diff-core internally extends the format to make things all
look like this ('*' and '-' are gone and each record acquires
the second path).

    100644->100644 233a250...->66818b4... file0 file0
    100755->100755 fc77389...->7b72d3d... file1 file1
    ______->100644 _______...->233a250... file2 file2
    100755->______ fc77389...->_______... file3 file3
    100644->100644 233a250...->233a250... file4 file4

Internally "______" above are represented with a separate flag
(file_valid), and denotes the absense of either src or dst.

The diff-core is all about manipulating this type of list and
changing one such list into a different list.

For example, rename-edit of file3 into file2 is detected by
diffcore-rename module and these entries:

    ______->100644 _______...->233a250... file2 file2
    100755->______ fc77389...->_______... file3 file3

become:

    100755->100644 fc77389...->233a250... file3 file2

What the diffcore-pickaxe does can also be explained clearly
with this model.  It takes such a list and works as a "grep".

Once we start to think of it this way, it becomes quite tempting
to change the diff-raw format to actually match the above
concept.  Meaning, (1) drop the operation letter +/-/*
(inferrable by looking at the both sides of ->); (2) drop
blob/tree (inferrable it from mode); (3) give two paths (usually
they are the same paths); (4) and perhaps replace '->' with the
same column separator.  Like this:

    100644 100644 233a250... 66818b4... file0 file0
    100755 100755 fc77389... 7b72d3d... file1 file1
    ______ 100644 _______... 233a250... file2 file2
    100755 ______ fc77389... _______... file3 file3
    100644 100644 233a250... 233a250... file4 file4

Again, I am not even advocating this.  It is more like me
still thinking aloud.







^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: updated design for the diff-raw format.
  2005-05-21 23:12 updated design for the diff-raw format Junio C Hamano
  2005-05-21 23:16 ` Junio C Hamano
@ 2005-05-21 23:17 ` Junio C Hamano
  2005-05-21 23:18 ` Junio C Hamano
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-21 23:17 UTC (permalink / raw)
  To: git

(second of the replayed message, with blessing from Linus)

Date: Sat, 21 May 2005 11:24:31 -0700 (PDT)
From: Linus Torvalds <torvalds@osdl.org>
To: Junio C Hamano <junkio@cox.net>
Subject: Re: [PATCH 3/3] Diff overhaul, adding the other half of copy detection.
Message-ID: <Pine.LNX.4.58.0505211107160.2206@ppc970.osdl.org>

On Sat, 21 May 2005, Junio C Hamano wrote:
> 
> Once we start to think of it this way, it becomes quite tempting
> to change the diff-raw format to actually match the above
> concept.

I agree, and I was going to suggest changing the "raw" diff output for all
the same reasons. So I think you should do it, as the old format was based
on not really knowing where this all would take us. I think your proposed
format is visually nicer, and it's obviously more flexible.

Small suggestion on termination of the thing:
 - add a "inter_name_termination" variable, which defaults to '\t' (the 
   same way "line_termination" defaults to '\n')
 - make "-z" set both "inter_name_termination" _and_ "line_termination" to 
   0.
 - make the spacing be fixed (and add a test for it, so that there is 
   never any confusion): regular spaces between the non-file-names, and 
   "inter_name_termination" before the filenames, and "line_termination" 
   after the second filename.

This has a few results:

 - the default output is perfectly readable, if long

 - "cut" (which defaults to TAB delimeter) can directly pick up the
   three fields from a line: "state", "file1" and "file2"

 - even if you use the "readable" output (as opposed to the "-z" 
   machine-readable one), spaces in filenames are unambiguous, and we only 
   screw up on TAB and NL.

   Spaces in names are normal in many things. NL/TAB really _are_ unusual, 
   and I could imagine that some porcelain could actually disallow them 
   (and if that happens, we could support that by add a flag to
   "update-cache" to refuse to touch such files, the same way we refuse 
   non-canonical filenames now).

 - the -z flag results in fairly unreadable output, but is at least 
   totally parseable with all filename characters allowed.

With that in place, the new format would be a lot _easier_ to parse than 
the old one, I think. And will be more flexible, and since it's a 
fixed-column format, it's actually pretty readable for humans too, as long 
as the terminal line is wide enough.

>     100644 100644 233a250... 66818b4... file0 file0
>     100755 100755 fc77389... 7b72d3d... file1 file1
>     ______ 100644 _______... 233a250... file2 file2
>     100755 ______ fc77389... _______... file3 file3
>     100644 100644 233a250... 233a250... file4 file4
> 
> Again, I am not even advocating this.  It is more like me
> still thinking aloud.

No, I think it's really good. The _one_ thing I'd do is to maybe put a 
special character at the beginning of the line, so that "diff-helper" has 
an ever easier time to know whether it should care or not. Something that 
normally wouldn't show up at the beginning of a line, like ':'.

(This would have the secondary advantage that yuou could run "diff-helper"  
multiple times and not care whether it was already expanded or not. Right
now that is impossible: diff-helper can't know the difference between an
already-expanded diff that has a line that begins with a '+' or '-' and a
eleted/new object line).

		Linus





^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: updated design for the diff-raw format.
  2005-05-21 23:12 updated design for the diff-raw format Junio C Hamano
  2005-05-21 23:16 ` Junio C Hamano
  2005-05-21 23:17 ` Junio C Hamano
@ 2005-05-21 23:18 ` Junio C Hamano
  2005-05-21 23:19 ` Junio C Hamano
  2005-05-22  2:40 ` [PATCH] Prepare diffcore interface for diff-tree header supression Junio C Hamano
  4 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-21 23:18 UTC (permalink / raw)
  To: git

(third of the replayed messages)

To: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [PATCH 3/3] Diff overhaul, adding the other half of copy
 detection.
From: Junio C Hamano <junkio@cox.net>
Date: Sat, 21 May 2005 13:36:25 -0700
Message-ID: <7vfywggvue.fsf@assigned-by-dhcp.cox.net>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> On Sat, 21 May 2005, Junio C Hamano wrote:
>> 
>> Once we start to think of it this way, it becomes quite tempting
>> to change the diff-raw format to actually match the above
>> concept.

LT> I agree, and I was going to suggest changing the "raw" diff output for all
LT> the same reasons. So I think you should do it, as the old format was based
LT> on not really knowing where this all would take us. I think your proposed
LT> format is visually nicer, and it's obviously more flexible.

LT> Small suggestion on termination of the thing:
LT>  - add a "inter_name_termination" variable, which defaults to '\t' (the 
LT>    same way "line_termination" defaults to '\n')
LT>  - make "-z" set both "inter_name_termination" _and_ "line_termination" to 
LT>    0.
LT>  - make the spacing be fixed (and add a test for it, so that there is 
LT>    never any confusion): regular spaces between the non-file-names, and 
LT>    "inter_name_termination" before the filenames, and "line_termination" 
LT>    after the second filename.

I am debating myself if I wanted to add this to the above list:

     - omit the inter_name_termination and second path if both
       paths are the same, only when doing human-readable
       (i.e. inter_name_termination != line_termination).

I'm not going to do this immediately though, for two reasons.

Somehow I failed to CC the GIT list the message you are
responding to.  Discussing a change with an impact of this scale
needs to be taken public before going further, so with your
permission I would like to repost both my original ("Once we
start to think of it this way...")  and your response to the GIT
list first.  At least I feel that Petr needs to be in the loop
about this one.

Another reason is that, as I said, I still have problems about
the diffcore interface, namely the lack of interface for the
applications to ask diffcore what the final outcome is.  The
"diff-tree not being to omit its header output when pickaxe says
the result is empty" problem is primarily what bothers me, but I
think we want a more generic interface for the application to
inspect the result (not just emptiness check), probably before
starting to feed the resulting diff list to the external diff.

Note that this interface needs to be inspection only---if the
application wants to further manipulate the result, then we
should extend list of diffcore transformations called from
diff_flush().  Which takes me to another point --- maybe the
list of diffcore transformations called from diff_flush() should
be made stackable, like streams.







^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: updated design for the diff-raw format.
  2005-05-21 23:12 updated design for the diff-raw format Junio C Hamano
                   ` (2 preceding siblings ...)
  2005-05-21 23:18 ` Junio C Hamano
@ 2005-05-21 23:19 ` Junio C Hamano
  2005-05-22  2:40 ` [PATCH] Prepare diffcore interface for diff-tree header supression Junio C Hamano
  4 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-21 23:19 UTC (permalink / raw)
  To: git

(fourth of the replayed messages)

Date: Sat, 21 May 2005 15:03:06 -0700 (PDT)
From: Linus Torvalds <torvalds@osdl.org>
To: Junio C Hamano <junkio@cox.net>
Subject: Re: [PATCH 3/3] Diff overhaul, adding the other half of copy detection.
Message-ID: <Pine.LNX.4.58.0505211452180.2206@ppc970.osdl.org>

On Sat, 21 May 2005, Junio C Hamano wrote:
> 
>      - omit the inter_name_termination and second path if both
>        paths are the same, only when doing human-readable
>        (i.e. inter_name_termination != line_termination).

Hmm. I guess that's fair enough, since it's still easily parseable even if
you don't want to use the -z version in scripts (as many tools are better
at handling line-terminated stuff than handling zero-terminations).

It might become another flag too, if somebody ends up caring.

> Somehow I failed to CC the GIT list the message you are
> responding to.  Discussing a change with an impact of this scale
> needs to be taken public before going further, so with your
> permission I would like to repost both my original ("Once we
> start to think of it this way...")  and your response to the GIT
> list first.  At least I feel that Petr needs to be in the loop
> about this one.

Sure. Although I doube people use the raw diff output except to (a) feed 
to diff-helper or (b) check that it's non-empty.

But absolutely, post the previous (and this) one.

> Another reason is that, as I said, I still have problems about the
> diffcore interface, namely the lack of interface for the applications to
> ask diffcore what the final outcome is.  The "diff-tree not being to
> omit its header output when pickaxe says the result is empty" problem is
> primarily what bothers me, but I think we want a more generic interface
> for the application to inspect the result (not just emptiness check),
> probably before starting to feed the resulting diff list to the external
> diff.

Why not just have a "is there anything pending" query before doing the 
flush? And always put _everything_ in the pending category, regardless of 
whether detect_rename/copy is in effect (but if it's not in effect, then 
flushing the pending queue is obviously just a "go through it and flush 
it" without any other complexity).

In other words, there would be four clear stages to this:

 1) diff_setup()
	remove "detect_rename" and "diff_score_opt" and "reverse_diff" 
	from this, since they are irrelevant until you _show_ the diffs

 2) diff_queue() *n
	tell diff-core about the files we are going to diff

 3) diff_detect_rename()
	this is what takes the "detect_rename" flag and "diff_score_opt", 
	and walks the list of diffs and potentially changes them into 
	"rename" and "copy" diffs.

 4) diff_flush()
	this just prints out the result (either as a raw diff or as 
	patches). This takes the "reverse_diff" flag that was removed from 
	diff_setup().

and then you can always query the state of the diff tree before stage 3 or 
before stage 4.

In fact, there's no reason not to even _change_ the diff-queue in magic 
ways before (3) or (4) depending on what you want to do. For example, your 
"-S" thing might want to do it's thing between stages (3) and (4).

		Linus





^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] Prepare diffcore interface for diff-tree header supression.
  2005-05-21 23:12 updated design for the diff-raw format Junio C Hamano
                   ` (3 preceding siblings ...)
  2005-05-21 23:19 ` Junio C Hamano
@ 2005-05-22  2:40 ` Junio C Hamano
  2005-05-22  2:42   ` [PATCH] The diff-raw format updates Junio C Hamano
  2005-05-22  9:41   ` [PATCH] Diffcore updates Junio C Hamano
  4 siblings, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22  2:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This does not actually supress the extra headers when pickaxe is
used, but prepares enough support for diff-tree to implement it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

diff-cache.c       |   10 ++++++----
diff-files.c       |   10 ++++++----
diff-helper.c      |   13 +++++++------
diff-tree.c        |   26 ++++++++++++++++++--------
diff.c             |   43 ++++++++++++++++++++++---------------------
diff.h             |   12 +++++++-----
diffcore-pickaxe.c |    3 ++-
diffcore-rename.c  |    4 ++--
diffcore.h         |    3 +--
9 files changed, 71 insertions(+), 53 deletions(-)

diff --git a/diff-cache.c b/diff-cache.c
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -214,9 +214,7 @@ int main(int argc, char **argv)
 	if (argc != 2 || get_sha1(argv[1], tree_sha1))
 		usage(diff_cache_usage);
 
-	diff_setup(detect_rename, diff_score_opt, pickaxe,
-		   reverse_diff, (generate_patch ? -1 : line_termination),
-		   NULL, 0);
+	diff_setup(reverse_diff, (generate_patch ? -1 : line_termination));
 
 	mark_merge_entries();
 
@@ -227,6 +225,10 @@ int main(int argc, char **argv)
 		die("unable to read tree object %s", argv[1]);
 
 	ret = diff_cache(active_cache, active_nr);
-	diff_flush();
+	if (detect_rename)
+		diff_detect_rename(detect_rename, diff_score_opt);
+	if (pickaxe)
+		diff_pickaxe(pickaxe);
+	diff_flush(NULL, 0);
 	return ret;
 }
diff --git a/diff-files.c b/diff-files.c
--- a/diff-files.c
+++ b/diff-files.c
@@ -92,9 +92,7 @@ int main(int argc, char **argv)
 		exit(1);
 	}
 
-	diff_setup(detect_rename, diff_score_opt, pickaxe,
-		   reverse_diff, (generate_patch ? -1 : line_termination),
-		   NULL, 0);
+	diff_setup(reverse_diff, (generate_patch ? -1 : line_termination));
 
 	for (i = 0; i < entries; i++) {
 		struct stat st;
@@ -136,6 +134,10 @@ int main(int argc, char **argv)
 		show_modified(oldmode, mode, ce->sha1, null_sha1,
 			      ce->name);
 	}
-	diff_flush();
+	if (detect_rename)
+		diff_detect_rename(detect_rename, diff_score_opt);
+	if (pickaxe)
+		diff_pickaxe(pickaxe);
+	diff_flush(NULL, 0);
 	return 0;
 }
diff --git a/diff-helper.c b/diff-helper.c
--- a/diff-helper.c
+++ b/diff-helper.c
@@ -127,10 +127,7 @@ int main(int ac, const char **av) {
 	}
 	/* the remaining parameters are paths patterns */
 
-	diff_setup(detect_rename, diff_score_opt, pickaxe,
-		   reverse, (generate_patch ? -1 : line_termination),
-		   av+1, ac-1);
-
+	diff_setup(reverse, (generate_patch ? -1 : line_termination));
 	while (1) {
 		int status;
 		read_line(&sb, stdin, line_termination);
@@ -138,11 +135,15 @@ int main(int ac, const char **av) {
 			break;
 		status = parse_diff_raw_output(sb.buf);
 		if (status) {
-			diff_flush();
+			diff_flush(av+1, ac-1);
 			printf("%s%c", sb.buf, line_termination);
 		}
 	}
 
-	diff_flush();
+	if (detect_rename)
+		diff_detect_rename(detect_rename, diff_score_opt);
+	if (pickaxe)
+		diff_pickaxe(pickaxe);
+	diff_flush(av+1, ac-1);
 	return 0;
 }
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -267,16 +267,28 @@ static int diff_tree_sha1(const unsigned
 	return retval;
 }
 
+static void call_diff_setup(void)
+{
+	diff_setup(reverse_diff, (generate_patch ? -1 : line_termination));
+}
+
+static void call_diff_flush(void)
+{
+	if (detect_rename)
+		diff_detect_rename(detect_rename, diff_score_opt);
+	if (pickaxe)
+		diff_pickaxe(pickaxe);
+	diff_flush(NULL, 0);
+}
+
 static int diff_tree_sha1_top(const unsigned char *old,
 			      const unsigned char *new, const char *base)
 {
 	int ret;
 
-	diff_setup(detect_rename, diff_score_opt, pickaxe,
-		   reverse_diff, (generate_patch ? -1 : line_termination),
-		   NULL, 0);
+	call_diff_setup();
 	ret = diff_tree_sha1(old, new, base);
-	diff_flush();
+	call_diff_flush();
 	return ret;
 }
 
@@ -286,15 +298,13 @@ static int diff_root_tree(const unsigned
 	void *tree;
 	unsigned long size;
 
-	diff_setup(detect_rename, diff_score_opt, pickaxe,
-		   reverse_diff, (generate_patch ? -1 : line_termination),
-		   NULL, 0);
+	call_diff_setup();
 	tree = read_object_with_reference(new, "tree", &size, NULL);
 	if (!tree)
 		die("unable to read root tree (%s)", sha1_to_hex(new));
 	retval = diff_tree("", 0, tree, size, base);
 	free(tree);
-	diff_flush();
+	call_diff_flush();
 	return retval;
 }
 
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -12,13 +12,10 @@
 static const char *diff_opts = "-pu";
 static unsigned char null_sha1[20] = { 0, };
 
-static int detect_rename;
 static int reverse_diff;
 static int diff_raw_output = -1;
 static const char **pathspec;
 static int speccnt;
-static const char *pickaxe;
-static int minimum_score;
 
 static const char *external_diff(void)
 {
@@ -512,21 +509,13 @@ int diff_scoreopt_parse(const char *opt)
 	return MAX_SCORE * num / scale;
 }
 
-void diff_setup(int detect_rename_, int minimum_score_,
-		const char *pickaxe_,
-		int reverse_diff_, int diff_raw_output_,
-		const char **pathspec_, int speccnt_)
+void diff_setup(int reverse_diff_, int diff_raw_output_)
 {
-	detect_rename = detect_rename_;
 	reverse_diff = reverse_diff_;
-	pathspec = pathspec_;
 	diff_raw_output = diff_raw_output_;
-	speccnt = speccnt_;
-	minimum_score = minimum_score_ ? : DEFAULT_MINIMUM_SCORE;
-	pickaxe = pickaxe_;
 }
 
-static struct diff_queue_struct queued_diff;
+struct diff_queue_struct diff_queued_diff;
 
 struct diff_filepair *diff_queue(struct diff_queue_struct *queue,
 				  struct diff_filespec *one,
@@ -636,15 +625,27 @@ static void diff_flush_one(struct diff_f
 		diff_flush_patch(p);
 }
 
-void diff_flush(void)
+int diff_queue_is_empty(void)
 {
-	struct diff_queue_struct *q = &queued_diff;
+	struct diff_queue_struct *q = &diff_queued_diff;
 	int i;
 
-	if (detect_rename)
-		diff_detect_rename(q, detect_rename, minimum_score);
-	if (pickaxe)
-		diff_pickaxe(q, pickaxe);
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		if (!identical(p->one, p->two))
+			return 0;
+	}
+	return 1;
+}
+
+void diff_flush(const char **pathspec_, int speccnt_)
+{
+	struct diff_queue_struct *q = &diff_queued_diff;
+	int i;
+
+	pathspec = pathspec_;
+	speccnt = speccnt_;
+
 	for (i = 0; i < q->nr; i++)
 		diff_flush_one(q->queue[i]);
 
@@ -693,7 +694,7 @@ void diff_addremove(int addremove, unsig
 	if (addremove != '-')
 		fill_filespec(two, sha1, mode);
 
-	diff_queue(&queued_diff, one, two);
+	diff_queue(&diff_queued_diff, one, two);
 }
 
 void diff_change(unsigned old_mode, unsigned new_mode,
@@ -716,7 +717,7 @@ void diff_change(unsigned old_mode, unsi
 	fill_filespec(one, old_sha1, old_mode);
 	fill_filespec(two, new_sha1, new_mode);
 
-	diff_queue(&queued_diff, one, two);
+	diff_queue(&diff_queued_diff, one, two);
 }
 
 void diff_unmerge(const char *path)
diff --git a/diff.h b/diff.h
--- a/diff.h
+++ b/diff.h
@@ -19,11 +19,13 @@ extern void diff_unmerge(const char *pat
 
 extern int diff_scoreopt_parse(const char *opt);
 
-extern void diff_setup(int detect_rename, int minimum_score,
-		       const char *pickaxe,
-		       int reverse, int raw_output,
-		       const char **spec, int cnt);
+extern void diff_setup(int reverse, int diff_raw_output);
 
-extern void diff_flush(void);
+extern void diff_detect_rename(int, int);
+extern void diff_pickaxe(const char *);
+
+extern int diff_queue_is_empty(void);
+
+extern void diff_flush(const char **, int);
 
 #endif /* DIFF_H */
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -21,8 +21,9 @@ static int contains(struct diff_filespec
 	return 0;
 }
 
-void diff_pickaxe(struct diff_queue_struct *q, const char *needle)
+void diff_pickaxe(const char *needle)
 {
+	struct diff_queue_struct *q = &diff_queued_diff;
 	unsigned long len = strlen(needle);
 	int i;
 	struct diff_queue_struct outq;
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -224,10 +224,10 @@ static int needs_to_stay(struct diff_que
 	return 0;
 }
 
-void diff_detect_rename(struct diff_queue_struct *q,
-			int detect_rename,
+void diff_detect_rename(int detect_rename,
 			int minimum_score)
 {
+	struct diff_queue_struct *q = &diff_queued_diff;
 	struct diff_queue_struct outq;
 	struct diff_rename_pool created, deleted, stay;
 	struct diff_rename_pool *(srcs[2]);
diff --git a/diffcore.h b/diffcore.h
--- a/diffcore.h
+++ b/diffcore.h
@@ -52,10 +52,9 @@ struct diff_queue_struct {
 	int nr;
 };
 
+extern struct diff_queue_struct diff_queued_diff;
 extern struct diff_filepair *diff_queue(struct diff_queue_struct *,
 					struct diff_filespec *,
 					struct diff_filespec *);
-extern void diff_detect_rename(struct diff_queue_struct *, int, int);
-extern void diff_pickaxe(struct diff_queue_struct *, const char *);
 
 #endif
------------------------------------------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] The diff-raw format updates.
  2005-05-22  2:40 ` [PATCH] Prepare diffcore interface for diff-tree header supression Junio C Hamano
@ 2005-05-22  2:42   ` Junio C Hamano
  2005-05-22  6:01     ` Linus Torvalds
  2005-05-22 18:35     ` [PATCH] The diff-raw format updates Linus Torvalds
  2005-05-22  9:41   ` [PATCH] Diffcore updates Junio C Hamano
  1 sibling, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22  2:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Update the diff-raw format as Linus and I discussed, except that
it does not use sequence of underscore '_' letters to express
nonexistence.  All '0' mode is used for that purpose instead.

The new diff-raw format can express rename/copy, and the earlier
restriction that -M and -C _must_ be used with the patch format
output is no longer necessary.  The patch makes -M and -C flags
independent of -p flag, so you need to say git-whatchanged -M -p
to get the diff/patch format.

Updated are both documentations and tests.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Documentation/diff-format.txt    |   29 +++--
Documentation/git-diff-cache.txt |    4 
Documentation/git-diff-files.txt |    4 
Documentation/git-diff-tree.txt  |    5 
diff-cache.c                     |   12 --
diff-files.c                     |   12 --
diff-helper.c                    |  153 ++++++++++++----------------
diff-tree.c                      |   13 +-
diff.c                           |  132 +++++++++++++-----------
diff.h                           |   12 ++
diffcore-pickaxe.c               |    6 -
diffcore-rename.c                |   20 +--
diffcore.h                       |    2 
t/t0000-basic.sh                 |   16 +-
t/t4002-diff-basic.sh            |  209 +++++++++++++++++++--------------------
t/t4003-diff-rename-1.sh         |    6 -
16 files changed, 322 insertions(+), 313 deletions(-)

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -16,25 +16,30 @@ git-diff-tree [-r] <tree-ish-1> <tree-is
 git-diff-files [<pattern>...]::
         compares the cache and the files on the filesystem.
 
-The following desription uses "old" and "new" to mean those
-compared entities.
 
-For files in old but not in new (i.e. removed):
+An output line is formatted this way:
 
-    -<mode> \t <type> \t <object> \t <path>
+  ':' <mode> ' ' <mode> ' ' <sha1> ' ' <sha1> I <path> I <path> L
 
-For files not in old but in new (i.e. added):
+By default, I and L are '\t' and '\n' respectively.  When '-z'
+flag is in effect, both I and L are '\0'.
 
-    +<mode> \t <type> \t <object> \t <path>
+In each <mode>, <sha1> and <path> pair, left hand side describes
+the left hand side of what is being compared (<tree-ish> in
+git-diff-cache, <tree-ish-1> in git-diff-tree, cache contents in
+git-diff-files).  Non-existence is shown by having 000000 in the
+<mode> column.  That is, 000000 appears as the first <mode> for
+newly created files, and as the second <mode> for deleted files.
 
-For files that differ:
+Usually two <path> are the same.  When rename/copy detection is
+used, however, an "create" and another "delete" records can be
+merged into a single record that has two <path>, old name and
+new name.
 
-    *<old-mode>-><new-mode> \t <type> \t <old-sha1>-><new-sha1> \t <path>
+<sha1> is shown as all 0's if new is a file on the filesystem
+and it is out of sync with the cache.  Example:
 
-<new-sha1> is shown as all 0's if new is a file on the
-filesystem and it is out of sync with the cache.  Example:
-
-  *100644->100644 blob    5be4a4.......->000000.......      file.c
+  :100644 100644 5be4a4...... 000000......    file.c    file.c
 
 
 Generating patches with -p
diff --git a/Documentation/git-diff-cache.txt b/Documentation/git-diff-cache.txt
--- a/Documentation/git-diff-cache.txt
+++ b/Documentation/git-diff-cache.txt
@@ -34,10 +34,10 @@ OPTIONS
 	\0 line termination on output
 
 -M::
-	Detect renames; implies -p.
+	Detect renames.
 
 -C::
-	Detect copies as well as renames; implies -p.
+	Detect copies as well as renames.
 
 -S<string>::
 	Look for differences that contains the change in <string>.
diff --git a/Documentation/git-diff-files.txt b/Documentation/git-diff-files.txt
--- a/Documentation/git-diff-files.txt
+++ b/Documentation/git-diff-files.txt
@@ -30,10 +30,10 @@ OPTIONS
 	Output diff in reverse.
 
 -M::
-	Detect renames; implies -p.
+	Detect renames.
 
 -C::
-	Detect copies as well as renames; implies -p.
+	Detect copies as well as renames.
 
 -S<string>::
 	Look for differences that contains the change in <string>.
diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt
--- a/Documentation/git-diff-tree.txt
+++ b/Documentation/git-diff-tree.txt
@@ -34,11 +34,10 @@ OPTIONS
 	git-diff-tree, this flag implies '-r' as well.
 
 -M::
-	Detect renames; implies -p, in turn implying also '-r'.
+	Detect renames.
 
 -C::
-	Detect copies as well as renames; implies -p, in turn
-	implying also '-r'.
+	Detect copies as well as renames.
 
 -R::
 	Output diff in reverse.
diff --git a/diff-cache.c b/diff-cache.c
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -2,9 +2,8 @@
 #include "diff.h"
 
 static int cached_only = 0;
-static int generate_patch = 0;
+static int diff_output_format = DIFF_FORMAT_HUMAN;
 static int match_nonexisting = 0;
-static int line_termination = '\n';
 static int detect_rename = 0;
 static int reverse_diff = 0;
 static int diff_score_opt = 0;
@@ -174,22 +173,21 @@ int main(int argc, char **argv)
 			continue;
 		}
 		if (!strcmp(arg, "-p")) {
-			generate_patch = 1;
+			diff_output_format = DIFF_FORMAT_PATCH;
 			continue;
 		}
 		if (!strncmp(arg, "-M", 2)) {
-			generate_patch = detect_rename = 1;
+			detect_rename = 1;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
 		if (!strncmp(arg, "-C", 2)) {
-			generate_patch = 1;
 			detect_rename = 2;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
 		if (!strcmp(arg, "-z")) {
-			line_termination = '\0';
+			diff_output_format = DIFF_FORMAT_MACHINE;
 			continue;
 		}
 		if (!strcmp(arg, "-R")) {
@@ -214,7 +212,7 @@ int main(int argc, char **argv)
 	if (argc != 2 || get_sha1(argv[1], tree_sha1))
 		usage(diff_cache_usage);
 
-	diff_setup(reverse_diff, (generate_patch ? -1 : line_termination));
+	diff_setup(reverse_diff, diff_output_format);
 
 	mark_merge_entries();
 
diff --git a/diff-files.c b/diff-files.c
--- a/diff-files.c
+++ b/diff-files.c
@@ -9,8 +9,7 @@
 static const char *diff_files_usage =
 "git-diff-files [-p] [-q] [-r] [-z] [-M] [-C] [-R] [-S<string>] [paths...]";
 
-static int generate_patch = 0;
-static int line_termination = '\n';
+static int diff_output_format = DIFF_FORMAT_HUMAN;
 static int detect_rename = 0;
 static int reverse_diff = 0;
 static int diff_score_opt = 0;
@@ -57,7 +56,7 @@ int main(int argc, char **argv)
 
 	while (1 < argc && argv[1][0] == '-') {
 		if (!strcmp(argv[1], "-p"))
-			generate_patch = 1;
+			diff_output_format = DIFF_FORMAT_PATCH;
 		else if (!strcmp(argv[1], "-q"))
 			silent = 1;
 		else if (!strcmp(argv[1], "-r"))
@@ -65,19 +64,18 @@ int main(int argc, char **argv)
 		else if (!strcmp(argv[1], "-s"))
 			; /* no-op */
 		else if (!strcmp(argv[1], "-z"))
-			line_termination = 0;
+			diff_output_format = DIFF_FORMAT_MACHINE;
 		else if (!strcmp(argv[1], "-R"))
 			reverse_diff = 1;
 		else if (!strcmp(argv[1], "-S"))
 			pickaxe = argv[1] + 2;
 		else if (!strncmp(argv[1], "-M", 2)) {
 			diff_score_opt = diff_scoreopt_parse(argv[1]);
-			detect_rename = generate_patch = 1;
+			detect_rename = 1;
 		}
 		else if (!strncmp(argv[1], "-C", 2)) {
 			diff_score_opt = diff_scoreopt_parse(argv[1]);
 			detect_rename = 2;
-			generate_patch = 1;
 		}
 		else
 			usage(diff_files_usage);
@@ -92,7 +90,7 @@ int main(int argc, char **argv)
 		exit(1);
 	}
 
-	diff_setup(reverse_diff, (generate_patch ? -1 : line_termination));
+	diff_setup(reverse_diff, diff_output_format);
 
 	for (i = 0; i < entries; i++) {
 		struct stat st;
diff --git a/diff-helper.c b/diff-helper.c
--- a/diff-helper.c
+++ b/diff-helper.c
@@ -8,88 +8,51 @@
 
 static int detect_rename = 0;
 static int diff_score_opt = 0;
-static int generate_patch = 1;
 static const char *pickaxe = NULL;
+static int diff_output_style = DIFF_FORMAT_PATCH;
+static int line_termination = '\n';
+static int inter_name_termination = '\t';
 
-static int parse_oneside_change(const char *cp, int *mode,
-				unsigned char *sha1, char *path)
+static int parse_diff_raw(char *buf1, char *buf2, char *buf3)
 {
-	int ch, m;
+	char old_path[PATH_MAX];
+	unsigned char old_sha1[20], new_sha1[20];
+	char *ep;
+	char *cp = buf1;
+	int ch, old_mode, new_mode;
 
-	m = 0;
-	while ((ch = *cp) && '0' <= ch && ch <= '7') {
-		m = (m << 3) | (ch - '0');
+	old_mode = new_mode = 0;
+	while ((ch = *cp) && ('0' <= ch && ch <= '7')) {
+		old_mode = (old_mode << 3) | (ch - '0');
+		cp++;
+	}
+	if (*cp++ != ' ')
+		return -1;
+	while ((ch = *cp) && ('0' <= ch && ch <= '7')) {
+		new_mode = (new_mode << 3) | (ch - '0');
 		cp++;
 	}
-	*mode = m;
-	if (strncmp(cp, "\tblob\t", 6) && strncmp(cp, " blob ", 6) &&
-	    strncmp(cp, "\ttree\t", 6) && strncmp(cp, " tree ", 6))
+	if (*cp++ != ' ')
 		return -1;
-	cp += 6;
-	if (get_sha1_hex(cp, sha1))
+	if (get_sha1_hex(cp, old_sha1))
 		return -1;
 	cp += 40;
-	if ((*cp != '\t') && *cp != ' ')
+	if (*cp++ != ' ')
 		return -1;
-	strcpy(path, ++cp);
-	return 0;
-}
-
-static int parse_diff_raw_output(const char *buf)
-{
-	char path[PATH_MAX];
-	unsigned char old_sha1[20], new_sha1[20];
-	const char *cp = buf;
-	int ch, old_mode, new_mode;
-
-	switch (*cp++) {
-	case 'U':
-		diff_unmerge(cp + 1);
-		break;
-	case '+':
-		if (parse_oneside_change(cp, &new_mode, new_sha1, path))
-			return -1;
-		diff_addremove('+', new_mode, new_sha1, path, NULL);
-		break;
-	case '-':
-		if (parse_oneside_change(cp, &old_mode, old_sha1, path))
-			return -1;
-		diff_addremove('-', old_mode, old_sha1, path, NULL);
-		break;
-	case '*':
-		old_mode = new_mode = 0;
-		while ((ch = *cp) && ('0' <= ch && ch <= '7')) {
-			old_mode = (old_mode << 3) | (ch - '0');
-			cp++;
-		}
-		if (strncmp(cp, "->", 2))
-			return -1;
-		cp += 2;
-		while ((ch = *cp) && ('0' <= ch && ch <= '7')) {
-			new_mode = (new_mode << 3) | (ch - '0');
-			cp++;
-		}
-		if (strncmp(cp, "\tblob\t", 6) && strncmp(cp, " blob ", 6) &&
-		    strncmp(cp, "\ttree\t", 6) && strncmp(cp, " tree ", 6))
-			return -1;
-		cp += 6;
-		if (get_sha1_hex(cp, old_sha1))
-			return -1;
-		cp += 40;
-		if (strncmp(cp, "->", 2))
-			return -1;
-		cp += 2;
-		if (get_sha1_hex(cp, new_sha1))
-			return -1;
-		cp += 40;
-		if ((*cp != '\t') && *cp != ' ')
-			return -1;
-		strcpy(path, ++cp);
-		diff_change(old_mode, new_mode, old_sha1, new_sha1, path, NULL);
-		break;
-	default:
+	if (get_sha1_hex(cp, new_sha1))
 		return -1;
-	}
+	cp += 40;
+	if (*cp++ != inter_name_termination)
+		return -1;
+	if (buf2)
+		cp = buf2;
+	ep = strchr(cp, inter_name_termination);
+	if (!ep)
+		return -1;
+	*ep++ = 0;
+	strcpy(old_path, cp);
+	diff_guif(old_mode, new_mode, old_sha1, new_sha1,
+		  old_path, buf3 ? buf3 : ep);
 	return 0;
 }
 
@@ -97,19 +60,22 @@ static const char *diff_helper_usage =
 	"git-diff-helper [-z] [-R] [-M] [-C] [-S<string>] paths...";
 
 int main(int ac, const char **av) {
-	struct strbuf sb;
-	int reverse = 0;
-	int line_termination = '\n';
+	struct strbuf sb1, sb2, sb3;
+	int reverse_diff = 0;
 
-	strbuf_init(&sb);
+	strbuf_init(&sb1);
+	strbuf_init(&sb2);
+	strbuf_init(&sb3);
 
 	while (1 < ac && av[1][0] == '-') {
 		if (av[1][1] == 'R')
-			reverse = 1;
+			reverse_diff = 1;
 		else if (av[1][1] == 'z')
-			line_termination = 0;
+			line_termination = inter_name_termination = 0;
 		else if (av[1][1] == 'p') /* hidden from the help */
-			generate_patch = 0;
+			diff_output_style = DIFF_FORMAT_HUMAN;
+		else if (av[1][1] == 'P') /* hidden from the help */
+			diff_output_style = DIFF_FORMAT_MACHINE;
 		else if (av[1][1] == 'M') {
 			detect_rename = 1;
 			diff_score_opt = diff_scoreopt_parse(av[1]);
@@ -127,19 +93,38 @@ int main(int ac, const char **av) {
 	}
 	/* the remaining parameters are paths patterns */
 
-	diff_setup(reverse, (generate_patch ? -1 : line_termination));
+	diff_setup(reverse_diff, diff_output_style);
 	while (1) {
 		int status;
-		read_line(&sb, stdin, line_termination);
-		if (sb.eof)
+		read_line(&sb1, stdin, line_termination);
+		if (sb1.eof)
 			break;
-		status = parse_diff_raw_output(sb.buf);
+		switch (sb1.buf[0]) {
+		case 'U':
+			diff_unmerge(sb1.buf + 2);
+			continue;
+		case ':':
+			break;
+		default:
+			goto unrecognized;
+		}
+		if (!line_termination) {
+			read_line(&sb2, stdin, line_termination);
+			if (sb2.eof)
+				break;
+			read_line(&sb3, stdin, line_termination);
+			if (sb3.eof)
+				break;
+			status = parse_diff_raw(sb1.buf+1, sb2.buf, sb3.buf);
+		}
+		else
+			status = parse_diff_raw(sb1.buf+1, NULL, NULL);
 		if (status) {
+		unrecognized:
 			diff_flush(av+1, ac-1);
-			printf("%s%c", sb.buf, line_termination);
+			printf("%s%c", sb1.buf, line_termination);
 		}
 	}
-
 	if (detect_rename)
 		diff_detect_rename(detect_rename, diff_score_opt);
 	if (pickaxe)
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -8,8 +8,7 @@ static int verbose_header = 0;
 static int ignore_merges = 1;
 static int recursive = 0;
 static int read_stdin = 0;
-static int line_termination = '\n';
-static int generate_patch = 0;
+static int diff_output_format = DIFF_FORMAT_HUMAN;
 static int detect_rename = 0;
 static int reverse_diff = 0;
 static int diff_score_opt = 0;
@@ -269,7 +268,7 @@ static int diff_tree_sha1(const unsigned
 
 static void call_diff_setup(void)
 {
-	diff_setup(reverse_diff, (generate_patch ? -1 : line_termination));
+	diff_setup(reverse_diff, diff_output_format);
 }
 
 static void call_diff_flush(void)
@@ -501,7 +500,8 @@ int main(int argc, char **argv)
 			continue;
 		}
 		if (!strcmp(arg, "-p")) {
-			recursive = generate_patch = 1;
+			diff_output_format = DIFF_FORMAT_PATCH;
+			recursive = 1;
 			continue;
 		}
 		if (!strncmp(arg, "-S", 2)) {
@@ -509,18 +509,17 @@ int main(int argc, char **argv)
 			continue;
 		}
 		if (!strncmp(arg, "-M", 2)) {
-			detect_rename = recursive = generate_patch = 1;
+			detect_rename = 1;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
 		if (!strncmp(arg, "-C", 2)) {
 			detect_rename = 2;
-			recursive = generate_patch = 1;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
 		if (!strcmp(arg, "-z")) {
-			line_termination = '\0';
+			diff_output_format = DIFF_FORMAT_MACHINE;
 			continue;
 		}
 		if (!strcmp(arg, "-m")) {
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -13,7 +13,9 @@ static const char *diff_opts = "-pu";
 static unsigned char null_sha1[20] = { 0, };
 
 static int reverse_diff;
-static int diff_raw_output = -1;
+static int generate_patch;
+static int line_termination = '\n';
+static int inter_name_termination = '\t';
 static const char **pathspec;
 static int speccnt;
 
@@ -163,20 +165,23 @@ struct diff_filespec *alloc_filespec(con
 	struct diff_filespec *spec = xmalloc(sizeof(*spec) + namelen + 1);
 	spec->path = (char *)(spec + 1);
 	strcpy(spec->path, path);
-	spec->should_free = spec->should_munmap = spec->file_valid = 0;
+	spec->should_free = spec->should_munmap = 0;
 	spec->xfrm_flags = 0;
 	spec->size = 0;
 	spec->data = 0;
+	spec->mode = 0;
+	memset(spec->sha1, 0, 20);
 	return spec;
 }
 
 void fill_filespec(struct diff_filespec *spec, const unsigned char *sha1,
 		   unsigned short mode)
 {
-	spec->mode = mode;
-	memcpy(spec->sha1, sha1, 20);
-	spec->sha1_valid = !!memcmp(sha1, null_sha1, 20);
-	spec->file_valid = 1;
+	if (mode) { /* just playing defensive */
+		spec->mode = mode;
+		memcpy(spec->sha1, sha1, 20);
+		spec->sha1_valid = !!memcmp(sha1, null_sha1, 20);
+	}
 }
 
 /*
@@ -231,7 +236,7 @@ static int work_tree_matches(const char 
 int diff_populate_filespec(struct diff_filespec *s)
 {
 	int err = 0;
-	if (!s->file_valid)
+	if (!DIFF_FILE_VALID(s))
 		die("internal error: asking to populate invalid file.");
 	if (S_ISDIR(s->mode))
 		return -1;
@@ -316,7 +321,7 @@ static void prepare_temp_file(const char
 			      struct diff_tempfile *temp,
 			      struct diff_filespec *one)
 {
-	if (!one->file_valid) {
+	if (!DIFF_FILE_VALID(one)) {
 	not_a_valid_file:
 		/* A '-' entry produces this for file-2, and
 		 * a '+' entry produces this for file-1.
@@ -509,10 +514,22 @@ int diff_scoreopt_parse(const char *opt)
 	return MAX_SCORE * num / scale;
 }
 
-void diff_setup(int reverse_diff_, int diff_raw_output_)
+void diff_setup(int reverse_diff_, int diff_output_style)
 {
 	reverse_diff = reverse_diff_;
-	diff_raw_output = diff_raw_output_;
+	generate_patch = 0;
+	switch (diff_output_style) {
+	case DIFF_FORMAT_HUMAN:
+		line_termination = '\n';
+		inter_name_termination = '\t';
+		break;
+	case DIFF_FORMAT_MACHINE:
+		line_termination = inter_name_termination = 0;
+		break;
+	case DIFF_FORMAT_PATCH:
+		generate_patch = 1;
+		break;
+	}
 }
 
 struct diff_queue_struct diff_queued_diff;
@@ -536,43 +553,17 @@ struct diff_filepair *diff_queue(struct 
 	return dp;
 }
 
-static const char *git_object_type(unsigned mode)
-{
-	return S_ISDIR(mode) ? "tree" : "blob";
-}
-
 static void diff_flush_raw(struct diff_filepair *p)
 {
-	struct diff_filespec *it;
-	int addremove;
-
-	/* raw output does not have a way to express rename nor copy */
-	if (strcmp(p->one->path, p->two->path))
-		return;
-
-	if (p->one->file_valid && p->two->file_valid) {
-		char hex[41];
-		strcpy(hex, sha1_to_hex(p->one->sha1));
-		printf("*%06o->%06o %s %s->%s %s%c",
-		       p->one->mode, p->two->mode,
-		       git_object_type(p->one->mode),
-		       hex, sha1_to_hex(p->two->sha1),
-		       p->one->path, diff_raw_output);
-		return;
-	}
-
-	if (p->one->file_valid) {
-		it = p->one;
-		addremove = '-';
-	} else {
-		it = p->two;
-		addremove = '+';
-	}
-
-	printf("%c%06o %s %s %s%c",
-	       addremove,
-	       it->mode, git_object_type(it->mode),
-	       sha1_to_hex(it->sha1), it->path, diff_raw_output);
+	/*
+	 * We used to reject rename/copy but new diff-raw can express them.
+	 */
+	printf(":%06o %06o %s ",
+	       p->one->mode, p->two->mode, sha1_to_hex(p->one->sha1));
+	printf("%s%c%s%c%s%c",
+	       sha1_to_hex(p->two->sha1), inter_name_termination,
+	       p->one->path, inter_name_termination,
+	       p->two->path, line_termination);
 }
 
 static void diff_flush_patch(struct diff_filepair *p)
@@ -581,8 +572,8 @@ static void diff_flush_patch(struct diff
 
 	name = p->one->path;
 	other = (strcmp(name, p->two->path) ? p->two->path : NULL);
-	if ((p->one->file_valid && S_ISDIR(p->one->mode)) ||
-	    (p->two->file_valid && S_ISDIR(p->two->mode)))
+	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
+	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
 		return; /* no tree diffs in patch format */ 
 
 	run_external_diff(name, other, p->one, p->two, p->xfrm_msg);
@@ -596,11 +587,12 @@ static int identical(struct diff_filespe
 	 * and filter and clean them up here before producing the output.
 	 */
 
-	if (!one->file_valid && !two->file_valid)
+	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two))
 		return 1; /* not interesting */
 
 	/* deletion, addition, mode change and renames are all interesting. */
-	if ((one->file_valid != two->file_valid) || (one->mode != two->mode) ||
+	if (DIFF_FILE_VALID(one) != DIFF_FILE_VALID(two) ||
+	    (one->mode != two->mode) ||
 	    strcmp(one->path, two->path))
 		return 0;
 
@@ -619,10 +611,10 @@ static void diff_flush_one(struct diff_f
 {
 	if (identical(p->one, p->two))
 		return;
-	if (0 <= diff_raw_output)
-		diff_flush_raw(p);
-	else
+	if (generate_patch)
 		diff_flush_patch(p);
+	else
+		diff_flush_raw(p);
 }
 
 int diff_queue_is_empty(void)
@@ -697,10 +689,35 @@ void diff_addremove(int addremove, unsig
 	diff_queue(&diff_queued_diff, one, two);
 }
 
+void diff_guif(unsigned old_mode,
+	       unsigned new_mode,
+	       const unsigned char *old_sha1,
+	       const unsigned char *new_sha1,
+	       const char *old_path,
+	       const char *new_path)
+{
+	struct diff_filespec *one, *two;
+
+	if (reverse_diff) {
+		unsigned tmp;
+		const unsigned char *tmp_c;
+		tmp = old_mode; old_mode = new_mode; new_mode = tmp;
+		tmp_c = old_sha1; old_sha1 = new_sha1; new_sha1 = tmp_c;
+	}
+	one = alloc_filespec(old_path);
+	two = alloc_filespec(new_path);
+	if (old_mode)
+		fill_filespec(one, old_sha1, old_mode);
+	if (new_mode)
+		fill_filespec(two, new_sha1, new_mode);
+	diff_queue(&diff_queued_diff, one, two);
+}
+
 void diff_change(unsigned old_mode, unsigned new_mode,
 		 const unsigned char *old_sha1,
 		 const unsigned char *new_sha1,
-		 const char *base, const char *path) {
+		 const char *base, const char *path) 
+{
 	char concatpath[PATH_MAX];
 	struct diff_filespec *one, *two;
 
@@ -722,9 +739,8 @@ void diff_change(unsigned old_mode, unsi
 
 void diff_unmerge(const char *path)
 {
-	if (0 <= diff_raw_output) {
-		printf("U %s%c", path, diff_raw_output);
-		return;
-	}
-	run_external_diff(path, NULL, NULL, NULL, NULL);
+	if (generate_patch)
+		run_external_diff(path, NULL, NULL, NULL, NULL);
+	else
+		printf("U %s%c", path, line_termination);
 }
diff --git a/diff.h b/diff.h
--- a/diff.h
+++ b/diff.h
@@ -15,11 +15,21 @@ extern void diff_change(unsigned mode1, 
 			     const unsigned char *sha2,
 			     const char *base, const char *path);
 
+extern void diff_guif(unsigned mode1,
+		      unsigned mode2,
+		      const unsigned char *sha1,
+		      const unsigned char *sha2,
+		      const char *path1,
+		      const char *path2);
+
 extern void diff_unmerge(const char *path);
 
 extern int diff_scoreopt_parse(const char *opt);
 
-extern void diff_setup(int reverse, int diff_raw_output);
+#define DIFF_FORMAT_HUMAN   0
+#define DIFF_FORMAT_MACHINE 1
+#define DIFF_FORMAT_PATCH   2
+extern void diff_setup(int reverse, int diff_output_style);
 
 extern void diff_detect_rename(int, int);
 extern void diff_pickaxe(const char *);
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -32,14 +32,14 @@ void diff_pickaxe(const char *needle)
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
-		if (!p->one->file_valid) {
-			if (!p->two->file_valid)
+		if (!DIFF_FILE_VALID(p->one)) {
+			if (!DIFF_FILE_VALID(p->two))
 				continue; /* ignore nonsense */
 			/* created */
 			if (contains(p->two, needle, len))
 				diff_queue(&outq, p->one, p->two);
 		}
-		else if (!p->two->file_valid) {
+		else if (!DIFF_FILE_VALID(p->two)) {
 			if (contains(p->one, needle, len))
 				diff_queue(&outq, p->one, p->two);
 		}
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -142,7 +142,7 @@ static void debug_filespec(struct diff_f
 	fprintf(stderr, "queue[%d] %s (%s) %s %06o %s\n",
 		x, one,
 		s->path,
-		s->file_valid ? "valid" : "invalid",
+		DIFF_FILE_VALID(s) ? "valid" : "invalid",
 		s->mode,
 		s->sha1_valid ? sha1_to_hex(s->sha1) : "");
 	fprintf(stderr, "queue[%d] %s size %lu flags %d\n",
@@ -210,7 +210,7 @@ static int needs_to_stay(struct diff_que
 	 */
 	while (i < q->nr) {
 		struct diff_filepair *p = q->queue[i++];
-		if (!p->two->file_valid)
+		if (!DIFF_FILE_VALID(p->two))
 			continue; /* removed is fine */
 		if (strcmp(p->one->path, it->path))
 			continue; /* not relevant */
@@ -247,12 +247,12 @@ void diff_detect_rename(int detect_renam
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
-		if (!p->one->file_valid)
-			if (!p->two->file_valid)
+		if (!DIFF_FILE_VALID(p->one))
+			if (!DIFF_FILE_VALID(p->two))
 				continue; /* ignore nonsense */
 			else
 				diff_rename_pool_add(&created, p->two);
-		else if (!p->two->file_valid)
+		else if (!DIFF_FILE_VALID(p->two))
 			diff_rename_pool_add(&deleted, p->one);
 		else if (1 < detect_rename) /* find copy, too */
 			diff_rename_pool_add(&stay, p->one);
@@ -340,15 +340,15 @@ void diff_detect_rename(int detect_renam
 	 */
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *dp, *p = q->queue[i];
-		if (!p->one->file_valid) {
-			if (p->two->file_valid) {
+		if (!DIFF_FILE_VALID(p->one)) {
+			if (DIFF_FILE_VALID(p->two)) {
 				/* creation */
 				dp = diff_queue(&outq, p->one, p->two);
 				dp->xfrm_work = 4;
 			}
 			/* otherwise it is a nonsense; just ignore it */
 		}
-		else if (!p->two->file_valid) {
+		else if (!DIFF_FILE_VALID(p->two)) {
 			/* deletion */
 			dp = diff_queue(&outq, p->one, p->two);
 			dp->xfrm_work = 2;
@@ -374,14 +374,14 @@ void diff_detect_rename(int detect_renam
 	/* Copy it out to q, removing duplicates. */
 	for (i = 0; i < outq.nr; i++) {
 		struct diff_filepair *p = outq.queue[i];
-		if (!p->one->file_valid) {
+		if (!DIFF_FILE_VALID(p->one)) {
 			/* created */
 			if (p->two->xfrm_flags & RENAME_DST_MATCHED)
 				; /* rename/copy created it already */
 			else
 				diff_queue(q, p->one, p->two);
 		}
-		else if (!p->two->file_valid) {
+		else if (!DIFF_FILE_VALID(p->two)) {
 			/* deleted */
 			if (p->one->xfrm_flags & RENAME_SRC_GONE)
 				; /* rename/copy deleted it already */
diff --git a/diffcore.h b/diffcore.h
--- a/diffcore.h
+++ b/diffcore.h
@@ -26,7 +26,7 @@ struct diff_filespec {
 				  * if false, use the name and read from
 				  * the filesystem.
 				  */
-	unsigned file_valid : 1; /* if false the file does not exist */
+#define DIFF_FILE_VALID(spec) (((spec)->mode) != 0)
 	unsigned should_free : 1; /* data should be free()'ed */
 	unsigned should_munmap : 1; /* data should be munmap()'ed */
 };
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -155,14 +155,14 @@ test_expect_success \
      test "$newtree" = "$tree"'
 
 cat >expected <<\EOF
-*100644->100644	blob	f87290f8eb2cbbea7857214459a0739927eab154->0000000000000000000000000000000000000000	path0
-*120000->120000	blob	15a98433ae33114b085f3eb3bb03b832b3180a01->0000000000000000000000000000000000000000	path0sym
-*100644->100644	blob	3feff949ed00a62d9f7af97c15cd8a30595e7ac7->0000000000000000000000000000000000000000	path2/file2
-*120000->120000	blob	d8ce161addc5173867a3c3c730924388daedbc38->0000000000000000000000000000000000000000	path2/file2sym
-*100644->100644	blob	0aa34cae68d0878578ad119c86ca2b5ed5b28376->0000000000000000000000000000000000000000	path3/file3
-*120000->120000	blob	8599103969b43aff7e430efea79ca4636466794f->0000000000000000000000000000000000000000	path3/file3sym
-*100644->100644	blob	00fb5908cb97c2564a9783c0c64087333b3b464f->0000000000000000000000000000000000000000	path3/subp3/file3
-*120000->120000	blob	6649a1ebe9e9f1c553b66f5a6e74136a07ccc57c->0000000000000000000000000000000000000000	path3/subp3/file3sym
+:100644 100644 f87290f8eb2cbbea7857214459a0739927eab154 0000000000000000000000000000000000000000	path0	path0
+:120000 120000 15a98433ae33114b085f3eb3bb03b832b3180a01 0000000000000000000000000000000000000000	path0sym	path0sym
+:100644 100644 3feff949ed00a62d9f7af97c15cd8a30595e7ac7 0000000000000000000000000000000000000000	path2/file2	path2/file2
+:120000 120000 d8ce161addc5173867a3c3c730924388daedbc38 0000000000000000000000000000000000000000	path2/file2sym	path2/file2sym
+:100644 100644 0aa34cae68d0878578ad119c86ca2b5ed5b28376 0000000000000000000000000000000000000000	path3/file3	path3/file3
+:120000 120000 8599103969b43aff7e430efea79ca4636466794f 0000000000000000000000000000000000000000	path3/file3sym	path3/file3sym
+:100644 100644 00fb5908cb97c2564a9783c0c64087333b3b464f 0000000000000000000000000000000000000000	path3/subp3/file3	path3/subp3/file3
+:120000 120000 6649a1ebe9e9f1c553b66f5a6e74136a07ccc57c 0000000000000000000000000000000000000000	path3/subp3/file3sym	path3/subp3/file3sym
 EOF
 test_expect_success \
     'validate git-diff-files output for a know cache/work tree state.' \
diff --git a/t/t4002-diff-basic.sh b/t/t4002-diff-basic.sh
--- a/t/t4002-diff-basic.sh
+++ b/t/t4002-diff-basic.sh
@@ -10,123 +10,122 @@ test_description='Test diff raw-output.
 . ../lib-read-tree-m-3way.sh
 
 cat >.test-plain-OA <<\EOF
-+100644 blob ccba72ad3888a3520b39efcf780b9ee64167535d AA
-+100644 blob 7e426fb079479fd67f6d81f984e4ec649a44bc25 AN
--100644 blob bcc68ef997017466d5c9094bcf7692295f588c9a DD
-+040000 tree 6d50f65d3bdab91c63444294d38f08aeff328e42 DF
--100644 blob 141c1f1642328e4bc46a7d801a71da392e66791e DM
--100644 blob 35abde1506ddf806572ff4d407bd06885d0f8ee9 DN
-+100644 blob 1d41122ebdd7a640f29d3c9cc4f9d70094374762 LL
-*100644->100644 blob 03f24c8c4700babccfd28b654e7e8eac402ad6cd->103d9f89b50b9aad03054b579be5e7aa665f2d57 MD
-*100644->100644 blob b258508afb7ceb449981bd9d63d2d3e971bf8d34->b431b272d829ff3aa4d1a5085f4394ab4d3305b6 MM
-*100644->100644 blob bd084b0c27c7b6cc34f11d6d0509a29be3caf970->a716d58de4a570e0038f5c307bd8db34daea021f MN
-*100644->100644 blob 40c959f984c8b89a2b02520d17f00d717f024397->2ac547ae9614a00d1b28275de608131f7a0e259f SS
-*100644->100644 blob 4ac13458899ab908ef3b1128fa378daefc88d356->4c86f9a85fbc5e6804ee2e17a797538fbe785bca TT
-*040000->040000 tree 7d670fdcdb9929f6c7dac196ff78689cd1c566a1->5e5f22072bb39f6e12cf663a57cb634c76eefb49 Z
+:000000 100644 0000000000000000000000000000000000000000 ccba72ad3888a3520b39efcf780b9ee64167535d	AA	AA
+:000000 100644 0000000000000000000000000000000000000000 7e426fb079479fd67f6d81f984e4ec649a44bc25	AN	AN
+:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000	DD	DD
+:000000 040000 0000000000000000000000000000000000000000 6d50f65d3bdab91c63444294d38f08aeff328e42	DF	DF
+:100644 000000 141c1f1642328e4bc46a7d801a71da392e66791e 0000000000000000000000000000000000000000	DM	DM
+:100644 000000 35abde1506ddf806572ff4d407bd06885d0f8ee9 0000000000000000000000000000000000000000	DN	DN
+:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762	LL	LL
+:100644 100644 03f24c8c4700babccfd28b654e7e8eac402ad6cd 103d9f89b50b9aad03054b579be5e7aa665f2d57	MD	MD
+:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 b431b272d829ff3aa4d1a5085f4394ab4d3305b6	MM	MM
+:100644 100644 bd084b0c27c7b6cc34f11d6d0509a29be3caf970 a716d58de4a570e0038f5c307bd8db34daea021f	MN	MN
+:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f	SS	SS
+:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 4c86f9a85fbc5e6804ee2e17a797538fbe785bca	TT	TT
+:040000 040000 7d670fdcdb9929f6c7dac196ff78689cd1c566a1 5e5f22072bb39f6e12cf663a57cb634c76eefb49	Z	Z
 EOF
 
 cat >.test-recursive-OA <<\EOF
-+100644 blob ccba72ad3888a3520b39efcf780b9ee64167535d AA
-+100644 blob 7e426fb079479fd67f6d81f984e4ec649a44bc25 AN
--100644 blob bcc68ef997017466d5c9094bcf7692295f588c9a DD
-+100644 blob 68a6d8b91da11045cf4aa3a5ab9f2a781c701249 DF/DF
--100644 blob 141c1f1642328e4bc46a7d801a71da392e66791e DM
--100644 blob 35abde1506ddf806572ff4d407bd06885d0f8ee9 DN
-+100644 blob 1d41122ebdd7a640f29d3c9cc4f9d70094374762 LL
-*100644->100644 blob 03f24c8c4700babccfd28b654e7e8eac402ad6cd->103d9f89b50b9aad03054b579be5e7aa665f2d57 MD
-*100644->100644 blob b258508afb7ceb449981bd9d63d2d3e971bf8d34->b431b272d829ff3aa4d1a5085f4394ab4d3305b6 MM
-*100644->100644 blob bd084b0c27c7b6cc34f11d6d0509a29be3caf970->a716d58de4a570e0038f5c307bd8db34daea021f MN
-*100644->100644 blob 40c959f984c8b89a2b02520d17f00d717f024397->2ac547ae9614a00d1b28275de608131f7a0e259f SS
-*100644->100644 blob 4ac13458899ab908ef3b1128fa378daefc88d356->4c86f9a85fbc5e6804ee2e17a797538fbe785bca TT
-+100644 blob 8acb8e9750e3f644bf323fcf3d338849db106c77 Z/AA
-+100644 blob 087494262084cefee7ed484d20c8dc0580791272 Z/AN
--100644 blob 879007efae624d2b1307214b24a956f0a8d686a8 Z/DD
--100644 blob 9b541b2275c06e3a7b13f28badf5294e2ae63df4 Z/DM
--100644 blob beb5d38c55283d280685ea21a0e50cfcc0ca064a Z/DN
-*100644->100644 blob d41fda41b7ec4de46b43cb7ea42a45001ae393d5->a79ac3be9377639e1c7d1edf1ae1b3a5f0ccd8a9 Z/MD
-*100644->100644 blob 4ca22bae2527d3d9e1676498a0fba3b355bd1278->61422ba9c2c873416061a88cd40a59a35b576474 Z/MM
-*100644->100644 blob b16d7b25b869f2beb124efa53467d8a1550ad694->a5c544c21cfcb07eb80a4d89a5b7d1570002edfd Z/MN
+:000000 100644 0000000000000000000000000000000000000000 ccba72ad3888a3520b39efcf780b9ee64167535d	AA	AA
+:000000 100644 0000000000000000000000000000000000000000 7e426fb079479fd67f6d81f984e4ec649a44bc25	AN	AN
+:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000	DD	DD
+:000000 100644 0000000000000000000000000000000000000000 68a6d8b91da11045cf4aa3a5ab9f2a781c701249	DF/DF	DF/DF
+:100644 000000 141c1f1642328e4bc46a7d801a71da392e66791e 0000000000000000000000000000000000000000	DM	DM
+:100644 000000 35abde1506ddf806572ff4d407bd06885d0f8ee9 0000000000000000000000000000000000000000	DN	DN
+:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762	LL	LL
+:100644 100644 03f24c8c4700babccfd28b654e7e8eac402ad6cd 103d9f89b50b9aad03054b579be5e7aa665f2d57	MD	MD
+:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 b431b272d829ff3aa4d1a5085f4394ab4d3305b6	MM	MM
+:100644 100644 bd084b0c27c7b6cc34f11d6d0509a29be3caf970 a716d58de4a570e0038f5c307bd8db34daea021f	MN	MN
+:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f	SS	SS
+:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 4c86f9a85fbc5e6804ee2e17a797538fbe785bca	TT	TT
+:000000 100644 0000000000000000000000000000000000000000 8acb8e9750e3f644bf323fcf3d338849db106c77	Z/AA	Z/AA
+:000000 100644 0000000000000000000000000000000000000000 087494262084cefee7ed484d20c8dc0580791272	Z/AN	Z/AN
+:100644 000000 879007efae624d2b1307214b24a956f0a8d686a8 0000000000000000000000000000000000000000	Z/DD	Z/DD
+:100644 000000 9b541b2275c06e3a7b13f28badf5294e2ae63df4 0000000000000000000000000000000000000000	Z/DM	Z/DM
+:100644 000000 beb5d38c55283d280685ea21a0e50cfcc0ca064a 0000000000000000000000000000000000000000	Z/DN	Z/DN
+:100644 100644 d41fda41b7ec4de46b43cb7ea42a45001ae393d5 a79ac3be9377639e1c7d1edf1ae1b3a5f0ccd8a9	Z/MD	Z/MD
+:100644 100644 4ca22bae2527d3d9e1676498a0fba3b355bd1278 61422ba9c2c873416061a88cd40a59a35b576474	Z/MM	Z/MM
+:100644 100644 b16d7b25b869f2beb124efa53467d8a1550ad694 a5c544c21cfcb07eb80a4d89a5b7d1570002edfd	Z/MN	Z/MN
 EOF
 cat >.test-plain-OB <<\EOF
-+100644 blob 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2 AA
--100644 blob bcc68ef997017466d5c9094bcf7692295f588c9a DD
-+100644 blob 71420ab81e254145d26d6fc0cddee64c1acd4787 DF
-*100644->100644 blob 141c1f1642328e4bc46a7d801a71da392e66791e->3c4d8de5fbad08572bab8e10eef8dbb264cf0231 DM
-+100644 blob 1d41122ebdd7a640f29d3c9cc4f9d70094374762 LL
--100644 blob 03f24c8c4700babccfd28b654e7e8eac402ad6cd MD
-*100644->100644 blob b258508afb7ceb449981bd9d63d2d3e971bf8d34->19989d4559aae417fedee240ccf2ba315ea4dc2b MM
-+100644 blob 15885881ea69115351c09b38371f0348a3fb8c67 NA
--100644 blob a4e179e4291e5536a5e1c82e091052772d2c5a93 ND
-*100644->100644 blob c8f25781e8f1792e3e40b74225e20553041b5226->cdb9a8c3da571502ac30225e9c17beccb8387983 NM
-*100644->100644 blob 40c959f984c8b89a2b02520d17f00d717f024397->2ac547ae9614a00d1b28275de608131f7a0e259f SS
-*100644->100644 blob 4ac13458899ab908ef3b1128fa378daefc88d356->c4e4a12231b9fa79a0053cb6077fcb21bb5b135a TT
-*040000->040000 tree 7d670fdcdb9929f6c7dac196ff78689cd1c566a1->1ba523955d5160681af65cb776411f574c1e8155 Z
+:000000 100644 0000000000000000000000000000000000000000 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2	AA	AA
+:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000	DD	DD
+:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787	DF	DF
+:100644 100644 141c1f1642328e4bc46a7d801a71da392e66791e 3c4d8de5fbad08572bab8e10eef8dbb264cf0231	DM	DM
+:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762	LL	LL
+:100644 000000 03f24c8c4700babccfd28b654e7e8eac402ad6cd 0000000000000000000000000000000000000000	MD	MD
+:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 19989d4559aae417fedee240ccf2ba315ea4dc2b	MM	MM
+:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67	NA	NA
+:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000	ND	ND
+:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983	NM	NM
+:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f	SS	SS
+:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 c4e4a12231b9fa79a0053cb6077fcb21bb5b135a	TT	TT
+:040000 040000 7d670fdcdb9929f6c7dac196ff78689cd1c566a1 1ba523955d5160681af65cb776411f574c1e8155	Z	Z
 EOF
 cat >.test-recursive-OB <<\EOF
-+100644 blob 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2 AA
--100644 blob bcc68ef997017466d5c9094bcf7692295f588c9a DD
-+100644 blob 71420ab81e254145d26d6fc0cddee64c1acd4787 DF
-*100644->100644 blob 141c1f1642328e4bc46a7d801a71da392e66791e->3c4d8de5fbad08572bab8e10eef8dbb264cf0231 DM
-+100644 blob 1d41122ebdd7a640f29d3c9cc4f9d70094374762 LL
--100644 blob 03f24c8c4700babccfd28b654e7e8eac402ad6cd MD
-*100644->100644 blob b258508afb7ceb449981bd9d63d2d3e971bf8d34->19989d4559aae417fedee240ccf2ba315ea4dc2b MM
-+100644 blob 15885881ea69115351c09b38371f0348a3fb8c67 NA
--100644 blob a4e179e4291e5536a5e1c82e091052772d2c5a93 ND
-*100644->100644 blob c8f25781e8f1792e3e40b74225e20553041b5226->cdb9a8c3da571502ac30225e9c17beccb8387983 NM
-*100644->100644 blob 40c959f984c8b89a2b02520d17f00d717f024397->2ac547ae9614a00d1b28275de608131f7a0e259f SS
-*100644->100644 blob 4ac13458899ab908ef3b1128fa378daefc88d356->c4e4a12231b9fa79a0053cb6077fcb21bb5b135a TT
-+100644 blob 6c0b99286d0bce551ac4a7b3dff8b706edff3715 Z/AA
--100644 blob 879007efae624d2b1307214b24a956f0a8d686a8 Z/DD
-*100644->100644 blob 9b541b2275c06e3a7b13f28badf5294e2ae63df4->d77371d15817fcaa57eeec27f770c505ba974ec1 Z/DM
--100644 blob d41fda41b7ec4de46b43cb7ea42a45001ae393d5 Z/MD
-*100644->100644 blob 4ca22bae2527d3d9e1676498a0fba3b355bd1278->697aad7715a1e7306ca76290a3dd4208fbaeddfa Z/MM
-+100644 blob d12979c22fff69c59ca9409e7a8fe3ee25eaee80 Z/NA
--100644 blob a18393c636b98e9bd7296b8b437ea4992b72440c Z/ND
-*100644->100644 blob 3fdbe17fd013303a2e981e1ca1c6cd6e72789087->7e09d6a3a14bd630913e8c75693cea32157b606d Z/NM
+:000000 100644 0000000000000000000000000000000000000000 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2	AA	AA
+:100644 000000 bcc68ef997017466d5c9094bcf7692295f588c9a 0000000000000000000000000000000000000000	DD	DD
+:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787	DF	DF
+:100644 100644 141c1f1642328e4bc46a7d801a71da392e66791e 3c4d8de5fbad08572bab8e10eef8dbb264cf0231	DM	DM
+:000000 100644 0000000000000000000000000000000000000000 1d41122ebdd7a640f29d3c9cc4f9d70094374762	LL	LL
+:100644 000000 03f24c8c4700babccfd28b654e7e8eac402ad6cd 0000000000000000000000000000000000000000	MD	MD
+:100644 100644 b258508afb7ceb449981bd9d63d2d3e971bf8d34 19989d4559aae417fedee240ccf2ba315ea4dc2b	MM	MM
+:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67	NA	NA
+:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000	ND	ND
+:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983	NM	NM
+:100644 100644 40c959f984c8b89a2b02520d17f00d717f024397 2ac547ae9614a00d1b28275de608131f7a0e259f	SS	SS
+:100644 100644 4ac13458899ab908ef3b1128fa378daefc88d356 c4e4a12231b9fa79a0053cb6077fcb21bb5b135a	TT	TT
+:000000 100644 0000000000000000000000000000000000000000 6c0b99286d0bce551ac4a7b3dff8b706edff3715	Z/AA	Z/AA
+:100644 000000 879007efae624d2b1307214b24a956f0a8d686a8 0000000000000000000000000000000000000000	Z/DD	Z/DD
+:100644 100644 9b541b2275c06e3a7b13f28badf5294e2ae63df4 d77371d15817fcaa57eeec27f770c505ba974ec1	Z/DM	Z/DM
+:100644 000000 d41fda41b7ec4de46b43cb7ea42a45001ae393d5 0000000000000000000000000000000000000000	Z/MD	Z/MD
+:100644 100644 4ca22bae2527d3d9e1676498a0fba3b355bd1278 697aad7715a1e7306ca76290a3dd4208fbaeddfa	Z/MM	Z/MM
+:000000 100644 0000000000000000000000000000000000000000 d12979c22fff69c59ca9409e7a8fe3ee25eaee80	Z/NA	Z/NA
+:100644 000000 a18393c636b98e9bd7296b8b437ea4992b72440c 0000000000000000000000000000000000000000	Z/ND	Z/ND
+:100644 100644 3fdbe17fd013303a2e981e1ca1c6cd6e72789087 7e09d6a3a14bd630913e8c75693cea32157b606d	Z/NM	Z/NM
 EOF
 cat >.test-plain-AB <<\EOF
-*100644->100644 blob ccba72ad3888a3520b39efcf780b9ee64167535d->6aa2b5335b16431a0ef71e5c0a28be69183cf6a2 AA
--100644 blob 7e426fb079479fd67f6d81f984e4ec649a44bc25 AN
-+100644 blob 71420ab81e254145d26d6fc0cddee64c1acd4787 DF
--040000 tree 6d50f65d3bdab91c63444294d38f08aeff328e42 DF
-+100644 blob 3c4d8de5fbad08572bab8e10eef8dbb264cf0231 DM
-+100644 blob 35abde1506ddf806572ff4d407bd06885d0f8ee9 DN
--100644 blob 103d9f89b50b9aad03054b579be5e7aa665f2d57 MD
-*100644->100644 blob b431b272d829ff3aa4d1a5085f4394ab4d3305b6->19989d4559aae417fedee240ccf2ba315ea4dc2b MM
-*100644->100644 blob a716d58de4a570e0038f5c307bd8db34daea021f->bd084b0c27c7b6cc34f11d6d0509a29be3caf970 MN
-+100644 blob 15885881ea69115351c09b38371f0348a3fb8c67 NA
--100644 blob a4e179e4291e5536a5e1c82e091052772d2c5a93 ND
-*100644->100644 blob c8f25781e8f1792e3e40b74225e20553041b5226->cdb9a8c3da571502ac30225e9c17beccb8387983 NM
-*100644->100644 blob 4c86f9a85fbc5e6804ee2e17a797538fbe785bca->c4e4a12231b9fa79a0053cb6077fcb21bb5b135a TT
-*040000->040000 tree 5e5f22072bb39f6e12cf663a57cb634c76eefb49->1ba523955d5160681af65cb776411f574c1e8155 Z
+:100644 100644 ccba72ad3888a3520b39efcf780b9ee64167535d 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2	AA	AA
+:100644 000000 7e426fb079479fd67f6d81f984e4ec649a44bc25 0000000000000000000000000000000000000000	AN	AN
+:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787	DF	DF
+:040000 000000 6d50f65d3bdab91c63444294d38f08aeff328e42 0000000000000000000000000000000000000000	DF	DF
+:000000 100644 0000000000000000000000000000000000000000 3c4d8de5fbad08572bab8e10eef8dbb264cf0231	DM	DM
+:000000 100644 0000000000000000000000000000000000000000 35abde1506ddf806572ff4d407bd06885d0f8ee9	DN	DN
+:100644 000000 103d9f89b50b9aad03054b579be5e7aa665f2d57 0000000000000000000000000000000000000000	MD	MD
+:100644 100644 b431b272d829ff3aa4d1a5085f4394ab4d3305b6 19989d4559aae417fedee240ccf2ba315ea4dc2b	MM	MM
+:100644 100644 a716d58de4a570e0038f5c307bd8db34daea021f bd084b0c27c7b6cc34f11d6d0509a29be3caf970	MN	MN
+:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67	NA	NA
+:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000	ND	ND
+:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983	NM	NM
+:100644 100644 4c86f9a85fbc5e6804ee2e17a797538fbe785bca c4e4a12231b9fa79a0053cb6077fcb21bb5b135a	TT	TT
+:040000 040000 5e5f22072bb39f6e12cf663a57cb634c76eefb49 1ba523955d5160681af65cb776411f574c1e8155	Z	Z
 EOF
 cat >.test-recursive-AB <<\EOF
-*100644->100644 blob ccba72ad3888a3520b39efcf780b9ee64167535d->6aa2b5335b16431a0ef71e5c0a28be69183cf6a2 AA
--100644 blob 7e426fb079479fd67f6d81f984e4ec649a44bc25 AN
-+100644 blob 71420ab81e254145d26d6fc0cddee64c1acd4787 DF
--100644 blob 68a6d8b91da11045cf4aa3a5ab9f2a781c701249 DF/DF
-+100644 blob 3c4d8de5fbad08572bab8e10eef8dbb264cf0231 DM
-+100644 blob 35abde1506ddf806572ff4d407bd06885d0f8ee9 DN
--100644 blob 103d9f89b50b9aad03054b579be5e7aa665f2d57 MD
-*100644->100644 blob b431b272d829ff3aa4d1a5085f4394ab4d3305b6->19989d4559aae417fedee240ccf2ba315ea4dc2b MM
-*100644->100644 blob a716d58de4a570e0038f5c307bd8db34daea021f->bd084b0c27c7b6cc34f11d6d0509a29be3caf970 MN
-+100644 blob 15885881ea69115351c09b38371f0348a3fb8c67 NA
--100644 blob a4e179e4291e5536a5e1c82e091052772d2c5a93 ND
-*100644->100644 blob c8f25781e8f1792e3e40b74225e20553041b5226->cdb9a8c3da571502ac30225e9c17beccb8387983 NM
-*100644->100644 blob 4c86f9a85fbc5e6804ee2e17a797538fbe785bca->c4e4a12231b9fa79a0053cb6077fcb21bb5b135a TT
-*100644->100644 blob 8acb8e9750e3f644bf323fcf3d338849db106c77->6c0b99286d0bce551ac4a7b3dff8b706edff3715 Z/AA
--100644 blob 087494262084cefee7ed484d20c8dc0580791272 Z/AN
-+100644 blob d77371d15817fcaa57eeec27f770c505ba974ec1 Z/DM
-+100644 blob beb5d38c55283d280685ea21a0e50cfcc0ca064a Z/DN
--100644 blob a79ac3be9377639e1c7d1edf1ae1b3a5f0ccd8a9 Z/MD
-*100644->100644 blob 61422ba9c2c873416061a88cd40a59a35b576474->697aad7715a1e7306ca76290a3dd4208fbaeddfa Z/MM
-*100644->100644 blob a5c544c21cfcb07eb80a4d89a5b7d1570002edfd->b16d7b25b869f2beb124efa53467d8a1550ad694 Z/MN
-+100644 blob d12979c22fff69c59ca9409e7a8fe3ee25eaee80 Z/NA
--100644 blob a18393c636b98e9bd7296b8b437ea4992b72440c Z/ND
-*100644->100644 blob 3fdbe17fd013303a2e981e1ca1c6cd6e72789087->7e09d6a3a14bd630913e8c75693cea32157b606d Z/NM
+:100644 100644 ccba72ad3888a3520b39efcf780b9ee64167535d 6aa2b5335b16431a0ef71e5c0a28be69183cf6a2	AA	AA
+:100644 000000 7e426fb079479fd67f6d81f984e4ec649a44bc25 0000000000000000000000000000000000000000	AN	AN
+:000000 100644 0000000000000000000000000000000000000000 71420ab81e254145d26d6fc0cddee64c1acd4787	DF	DF
+:100644 000000 68a6d8b91da11045cf4aa3a5ab9f2a781c701249 0000000000000000000000000000000000000000	DF/DF	DF/DF
+:000000 100644 0000000000000000000000000000000000000000 3c4d8de5fbad08572bab8e10eef8dbb264cf0231	DM	DM
+:000000 100644 0000000000000000000000000000000000000000 35abde1506ddf806572ff4d407bd06885d0f8ee9	DN	DN
+:100644 000000 103d9f89b50b9aad03054b579be5e7aa665f2d57 0000000000000000000000000000000000000000	MD	MD
+:100644 100644 b431b272d829ff3aa4d1a5085f4394ab4d3305b6 19989d4559aae417fedee240ccf2ba315ea4dc2b	MM	MM
+:100644 100644 a716d58de4a570e0038f5c307bd8db34daea021f bd084b0c27c7b6cc34f11d6d0509a29be3caf970	MN	MN
+:000000 100644 0000000000000000000000000000000000000000 15885881ea69115351c09b38371f0348a3fb8c67	NA	NA
+:100644 000000 a4e179e4291e5536a5e1c82e091052772d2c5a93 0000000000000000000000000000000000000000	ND	ND
+:100644 100644 c8f25781e8f1792e3e40b74225e20553041b5226 cdb9a8c3da571502ac30225e9c17beccb8387983	NM	NM
+:100644 100644 4c86f9a85fbc5e6804ee2e17a797538fbe785bca c4e4a12231b9fa79a0053cb6077fcb21bb5b135a	TT	TT
+:100644 100644 8acb8e9750e3f644bf323fcf3d338849db106c77 6c0b99286d0bce551ac4a7b3dff8b706edff3715	Z/AA	Z/AA
+:100644 000000 087494262084cefee7ed484d20c8dc0580791272 0000000000000000000000000000000000000000	Z/AN	Z/AN
+:000000 100644 0000000000000000000000000000000000000000 d77371d15817fcaa57eeec27f770c505ba974ec1	Z/DM	Z/DM
+:000000 100644 0000000000000000000000000000000000000000 beb5d38c55283d280685ea21a0e50cfcc0ca064a	Z/DN	Z/DN
+:100644 000000 a79ac3be9377639e1c7d1edf1ae1b3a5f0ccd8a9 0000000000000000000000000000000000000000	Z/MD	Z/MD
+:100644 100644 61422ba9c2c873416061a88cd40a59a35b576474 697aad7715a1e7306ca76290a3dd4208fbaeddfa	Z/MM	Z/MM
+:100644 100644 a5c544c21cfcb07eb80a4d89a5b7d1570002edfd b16d7b25b869f2beb124efa53467d8a1550ad694	Z/MN	Z/MN
+:000000 100644 0000000000000000000000000000000000000000 d12979c22fff69c59ca9409e7a8fe3ee25eaee80	Z/NA	Z/NA
+:100644 000000 a18393c636b98e9bd7296b8b437ea4992b72440c 0000000000000000000000000000000000000000	Z/ND	Z/ND
+:100644 100644 3fdbe17fd013303a2e981e1ca1c6cd6e72789087 7e09d6a3a14bd630913e8c75693cea32157b606d	Z/NM	Z/NM
 EOF
 
-
 x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
 x40="$x40$x40$x40$x40$x40$x40$x40$x40"
 z40='0000000000000000000000000000000000000000'
@@ -135,7 +134,7 @@ cmp_diff_files_output () {
     # object ID for the changed files because it wants you to look at the
     # filesystem.
     sed <"$2" >.test-tmp \
-	-e '/^+/d;/\^*/s/\( '$x40'->\)'$x40' /\1'$z40' /' &&
+	-e '/^:000000 /d;s/'$x40'	/'$z40'	/' &&
     diff "$1" .test-tmp
 }
 
diff --git a/t/t4003-diff-rename-1.sh b/t/t4003-diff-rename-1.sh
--- a/t/t4003-diff-rename-1.sh
+++ b/t/t4003-diff-rename-1.sh
@@ -26,7 +26,7 @@ test_expect_success \
 # both are slightly edited.  So we say you copy-and-edit one,
 # and rename-and-edit the other.
 
-GIT_DIFF_OPTS=--unified=0 git-diff-cache -M $tree |
+GIT_DIFF_OPTS=--unified=0 git-diff-cache -M -p $tree |
 sed -e 's/\([0-9][0-9]*\)/#/g' >current &&
 cat >expected <<\EOF
 diff --git a/COPYING b/COPYING.#
@@ -68,7 +68,7 @@ test_expect_success \
 # both are slightly edited.  So we say you edited one,
 # and copy-and-edit the other.
 
-GIT_DIFF_OPTS=--unified=0 git-diff-cache -C $tree |
+GIT_DIFF_OPTS=--unified=0 git-diff-cache -C -p $tree |
 sed -e 's/\([0-9][0-9]*\)/#/g' >current
 cat >expected <<\EOF
 diff --git a/COPYING b/COPYING.#
@@ -108,7 +108,7 @@ test_expect_success \
 # this is only possible because -C mode now reports the unmodified
 # file to the diff-core.
 
-GIT_DIFF_OPTS=--unified=0 git-diff-cache -C $tree |
+GIT_DIFF_OPTS=--unified=0 git-diff-cache -C -p $tree |
 sed -e 's/\([0-9][0-9]*\)/#/g' >current
 cat >expected <<\EOF
 diff --git a/COPYING b/COPYING.#
------------------------------------------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22  2:42   ` [PATCH] The diff-raw format updates Junio C Hamano
@ 2005-05-22  6:01     ` Linus Torvalds
  2005-05-22  6:33       ` Junio C Hamano
  2005-05-22  6:57       ` Junio C Hamano
  2005-05-22 18:35     ` [PATCH] The diff-raw format updates Linus Torvalds
  1 sibling, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2005-05-22  6:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List



On Sat, 21 May 2005, Junio C Hamano wrote:
>
> Update the diff-raw format as Linus and I discussed, except that
> it does not use sequence of underscore '_' letters to express
> nonexistence.  All '0' mode is used for that purpose instead.

Ok, it all looks very nice, but it looks like -C is still broken ;)

Do 

	git-whatchanged -C -p

and you'll see this hilarious entry as the first "copy" entry:

	diff --git a/Documentation/git-diff-cache.txt b/diffcore-pickaxe.c
	similarity index 0%
	copy from Documentation/git-diff-cache.txt
	copy to diffcore-pickaxe.c
	--- a/Documentation/git-diff-cache.txt
	+++ b/diffcore-pickaxe.c
	@@ -1,150 +1,56 @@
	..

it even says "similarity index 0%", adn that sure is accurate ;)

Doing "-C50" to ask for at least 50% similarity makes it ok (and it finds
the "git-pull-script" -> "git-resolve-script" thing), but there's
something strange going on, becuase -C10 and -C90 don't work, even if -C40
and -C50 _do_ work.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22  6:01     ` Linus Torvalds
@ 2005-05-22  6:33       ` Junio C Hamano
  2005-05-22  6:57       ` Junio C Hamano
  1 sibling, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22  6:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> On Sat, 21 May 2005, Junio C Hamano wrote:
>> 
>> Update the diff-raw format as Linus and I discussed, except that
>> it does not use sequence of underscore '_' letters to express
>> nonexistence.  All '0' mode is used for that purpose instead.

LT> Ok, it all looks very nice, but it looks like -C is still broken ;)

Well it is not _still_ broken, but rather was introduced by the
code reorganization to expose diff_rename_detect().  Sorry about
the lack of testing.

------------
Add the code to set default minimum score back in.

When the minimum score is specified as 0 (meaning "use default
value"), set it to the default as we are told.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
$ jit-diff
# - linus: [PATCH] The diff-raw format updates.
# + (working tree)
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -235,6 +235,8 @@ void diff_detect_rename(int detect_renam
 	int h, i, j;
 	int num_create, num_src, dst_cnt, src_cnt;
 
+	if (!minimum_score)
+		minimum_score = DEFAULT_MINIMUM_SCORE;
 	outq.queue = NULL;
 	outq.nr = outq.alloc = 0;
 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22  6:01     ` Linus Torvalds
  2005-05-22  6:33       ` Junio C Hamano
@ 2005-05-22  6:57       ` Junio C Hamano
  2005-05-22  8:31         ` [PATCH] Fix tweak in similarity estimator Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22  6:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Doing "-C50" to ask for at least 50% similarity makes it ok (and it finds
LT> the "git-pull-script" -> "git-resolve-script" thing), but there's
LT> something strange going on, becuase -C10 and -C90 don't work, even if -C40
LT> and -C50 _do_ work.

Plain -C (or -M) not working was a bug I now understand why,
and I would not be surprised that -C10 may give ridiculous
or hilarious results (false hits), but I am having trouble
reproducing the -C90 case.  Will do some more digging later.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] Fix tweak in similarity estimator.
  2005-05-22  6:57       ` Junio C Hamano
@ 2005-05-22  8:31         ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22  8:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

>>>>> "JCH" == Junio C Hamano <junkio@cox.net> writes:

JCH> Plain -C (or -M) not working was a bug I now understand why,
JCH> and I would not be surprised that -C10 may give ridiculous
JCH> or hilarious results (false hits), but I am having trouble
JCH> reproducing the -C90 case.  Will do some more digging later.

An embarrasing math thinko was causing this confusion.

------------
Fix tweak in similarity estimator.

There was a screwy math bug in the estimator that confused what
-C1 meant and what -C9 meant, only in one of the early "cheap"
check, which resulted in quite confusing behaviour.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

diffcore-rename.c |    4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -77,12 +77,12 @@ static int estimate_similarity(struct di
 
 	/* We would not consider edits that change the file size so
 	 * drastically.  delta_size must be smaller than
-	 * minimum_score/MAX_SCORE * min(src->size, dst->size).
+	 * (MAX_SCORE-minimum_score)/MAX_SCORE * min(src->size, dst->size).
 	 * Note that base_size == 0 case is handled here already
 	 * and the final score computation below would not have a
 	 * divide-by-zero issue.
 	 */
-	if (base_size * minimum_score < delta_size * MAX_SCORE)
+	if (base_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE)
 		return 0;
 
 	delta = diff_delta(src->data, src->size,


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] Diffcore updates.
  2005-05-22  2:40 ` [PATCH] Prepare diffcore interface for diff-tree header supression Junio C Hamano
  2005-05-22  2:42   ` [PATCH] The diff-raw format updates Junio C Hamano
@ 2005-05-22  9:41   ` Junio C Hamano
  2005-05-22 16:40     ` Linus Torvalds
  2005-05-22 17:04     ` Junio C Hamano
  1 sibling, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22  9:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus, this truly makes diff-tree with the pickaxe very pleasant to
use.  Please give it a try.

------------
This patch moves the path selection logic from individual
programs to a new diffcore transformer (diff-tree still needs
to have its own for performance reasons).  Also the header
printing code in diff-tree was tweaked not to produce anything
when pickaxe is in effect and there is nothing interesting to
report.  An interesting example is the following in the GIT
archive itself:

    $ git-whatchanged -p -C -S'or something in a real script'

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Makefile           |    9 +--
diff-cache.c       |   26 ++++++----
diff-files.c       |   37 +++-----------
diff-helper.c      |   16 +++---
diff-tree.c        |   43 ++++++++--------
diff.c             |  137 +++++++++++++++++++++--------------------------------
diff.h             |   20 ++++---
diffcore-pickaxe.c |   15 ++---
diffcore-rename.c  |   23 ++++++++
diffcore.h         |    4 +
10 files changed, 162 insertions(+), 168 deletions(-)

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -45,7 +45,7 @@ LIB_H += strbuf.h
 LIB_OBJS += strbuf.o
 
 LIB_H += diff.h
-LIB_OBJS += diff.o diffcore-rename.o diffcore-pickaxe.o
+LIB_OBJS += diff.o diffcore-rename.o diffcore-pickaxe.o diffcore-pathspec.o
 
 LIB_OBJS += gitenv.o
 
@@ -123,9 +123,10 @@ sha1_file.o: $(LIB_H)
 usage.o: $(LIB_H)
 strbuf.o: $(LIB_H)
 gitenv.o: $(LIB_H)
-diff.o: $(LIB_H)
-diffcore-rename.o : $(LIB_H)
-diffcore-pickaxe.o : $(LIB_H)
+diff.o: $(LIB_H) diffcore.h
+diffcore-rename.o : $(LIB_H) diffcore.h
+diffcore-pathspec.o : $(LIB_H) diffcore.h
+diffcore-pickaxe.o : $(LIB_H) diffcore.h
 
 test: all
 	make -C t/ all
diff --git a/diff-cache.c b/diff-cache.c
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -71,7 +71,7 @@ static int show_modified(struct cache_en
 
 	oldmode = old->ce_mode;
 	if (mode == oldmode && !memcmp(sha1, old->sha1, 20) &&
-	    detect_rename < 2)
+	    detect_rename < DIFF_DETECT_COPY)
 		return 0;
 
 	mode = ntohl(mode);
@@ -156,7 +156,7 @@ static void mark_merge_entries(void)
 static char *diff_cache_usage =
 "git-diff-cache [-p] [-r] [-z] [-m] [-M] [-C] [-R] [-S<string>] [--cached] <tree-ish>";
 
-int main(int argc, char **argv)
+int main(int argc, const char **argv)
 {
 	unsigned char tree_sha1[20];
 	void *tree;
@@ -165,7 +165,7 @@ int main(int argc, char **argv)
 
 	read_cache();
 	while (argc > 2) {
-		char *arg = argv[1];
+		const char *arg = argv[1];
 		argv++;
 		argc--;
 		if (!strcmp(arg, "-r")) {
@@ -177,12 +177,12 @@ int main(int argc, char **argv)
 			continue;
 		}
 		if (!strncmp(arg, "-M", 2)) {
-			detect_rename = 1;
+			detect_rename = DIFF_DETECT_RENAME;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
 		if (!strncmp(arg, "-C", 2)) {
-			detect_rename = 2;
+			detect_rename = DIFF_DETECT_COPY;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
@@ -209,10 +209,14 @@ int main(int argc, char **argv)
 		usage(diff_cache_usage);
 	}
 
-	if (argc != 2 || get_sha1(argv[1], tree_sha1))
+	if (argc < 2 || get_sha1(argv[1], tree_sha1))
 		usage(diff_cache_usage);
+	argv++;
+	argc--;
 
-	diff_setup(reverse_diff, diff_output_format);
+	/* The rest is for paths restriction. */
+
+	diff_setup(reverse_diff);
 
 	mark_merge_entries();
 
@@ -224,9 +228,11 @@ int main(int argc, char **argv)
 
 	ret = diff_cache(active_cache, active_nr);
 	if (detect_rename)
-		diff_detect_rename(detect_rename, diff_score_opt);
+		diffcore_rename(detect_rename, diff_score_opt);
 	if (pickaxe)
-		diff_pickaxe(pickaxe);
-	diff_flush(NULL, 0);
+		diffcore_pickaxe(pickaxe);
+	if (2 <= argc)
+		diffcore_pathspec(argv + 1);
+	diff_flush(diff_output_format);
 	return ret;
 }
diff --git a/diff-files.c b/diff-files.c
--- a/diff-files.c
+++ b/diff-files.c
@@ -16,21 +16,6 @@ static int diff_score_opt = 0;
 static const char *pickaxe = NULL;
 static int silent = 0;
 
-static int matches_pathspec(struct cache_entry *ce, char **spec, int cnt)
-{
-	int i;
-	int namelen = ce_namelen(ce);
-	for (i = 0; i < cnt; i++) {
-		int speclen = strlen(spec[i]);
-		if (! strncmp(spec[i], ce->name, speclen) &&
-		    speclen <= namelen &&
-		    (ce->name[speclen] == 0 ||
-		     ce->name[speclen] == '/'))
-			return 1;
-	}
-	return 0;
-}
-
 static void show_unmerge(const char *path)
 {
 	diff_unmerge(path);
@@ -48,7 +33,7 @@ static void show_modified(int oldmode, i
 	diff_change(oldmode, mode, old_sha1, sha1, path, NULL);
 }
 
-int main(int argc, char **argv)
+int main(int argc, const char **argv)
 {
 	static const unsigned char null_sha1[20] = { 0, };
 	int entries = read_cache();
@@ -71,11 +56,11 @@ int main(int argc, char **argv)
 			pickaxe = argv[1] + 2;
 		else if (!strncmp(argv[1], "-M", 2)) {
 			diff_score_opt = diff_scoreopt_parse(argv[1]);
-			detect_rename = 1;
+			detect_rename = DIFF_DETECT_RENAME;
 		}
 		else if (!strncmp(argv[1], "-C", 2)) {
 			diff_score_opt = diff_scoreopt_parse(argv[1]);
-			detect_rename = 2;
+			detect_rename = DIFF_DETECT_COPY;
 		}
 		else
 			usage(diff_files_usage);
@@ -90,7 +75,7 @@ int main(int argc, char **argv)
 		exit(1);
 	}
 
-	diff_setup(reverse_diff, diff_output_format);
+	diff_setup(reverse_diff);
 
 	for (i = 0; i < entries; i++) {
 		struct stat st;
@@ -98,10 +83,6 @@ int main(int argc, char **argv)
 		struct cache_entry *ce = active_cache[i];
 		int changed;
 
-		if (1 < argc &&
-		    ! matches_pathspec(ce, argv+1, argc-1))
-			continue;
-
 		if (ce_stage(ce)) {
 			show_unmerge(ce->name);
 			while (i < entries &&
@@ -122,7 +103,7 @@ int main(int argc, char **argv)
 			continue;
 		}
 		changed = ce_match_stat(ce, &st);
-		if (!changed && detect_rename < 2)
+		if (!changed && detect_rename < DIFF_DETECT_COPY)
 			continue;
 
 		oldmode = ntohl(ce->ce_mode);
@@ -133,9 +114,11 @@ int main(int argc, char **argv)
 			      ce->name);
 	}
 	if (detect_rename)
-		diff_detect_rename(detect_rename, diff_score_opt);
+		diffcore_rename(detect_rename, diff_score_opt);
 	if (pickaxe)
-		diff_pickaxe(pickaxe);
-	diff_flush(NULL, 0);
+		diffcore_pickaxe(pickaxe);
+	if (1 < argc)
+		diffcore_pathspec(argv + 1);
+	diff_flush(diff_output_format);
 	return 0;
 }
diff --git a/diff-helper.c b/diff-helper.c
--- a/diff-helper.c
+++ b/diff-helper.c
@@ -77,11 +77,11 @@ int main(int ac, const char **av) {
 		else if (av[1][1] == 'P') /* hidden from the help */
 			diff_output_style = DIFF_FORMAT_MACHINE;
 		else if (av[1][1] == 'M') {
-			detect_rename = 1;
+			detect_rename = DIFF_DETECT_RENAME;
 			diff_score_opt = diff_scoreopt_parse(av[1]);
 		}
 		else if (av[1][1] == 'C') {
-			detect_rename = 2;
+			detect_rename = DIFF_DETECT_COPY;
 			diff_score_opt = diff_scoreopt_parse(av[1]);
 		}
 		else if (av[1][1] == 'S') {
@@ -93,7 +93,7 @@ int main(int ac, const char **av) {
 	}
 	/* the remaining parameters are paths patterns */
 
-	diff_setup(reverse_diff, diff_output_style);
+	diff_setup(reverse_diff);
 	while (1) {
 		int status;
 		read_line(&sb1, stdin, line_termination);
@@ -121,14 +121,16 @@ int main(int ac, const char **av) {
 			status = parse_diff_raw(sb1.buf+1, NULL, NULL);
 		if (status) {
 		unrecognized:
-			diff_flush(av+1, ac-1);
+			diff_flush(diff_output_style);
 			printf("%s%c", sb1.buf, line_termination);
 		}
 	}
 	if (detect_rename)
-		diff_detect_rename(detect_rename, diff_score_opt);
+		diffcore_rename(detect_rename, diff_score_opt);
 	if (pickaxe)
-		diff_pickaxe(pickaxe);
-	diff_flush(av+1, ac-1);
+		diffcore_pickaxe(pickaxe);
+	if (ac)
+		diffcore_pathspec(av + 1);
+	diff_flush(diff_output_style);
 	return 0;
 }
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -18,7 +18,7 @@ static const char *header_prefix = "";
 
 // What paths are we interested in?
 static int nr_paths = 0;
-static char **paths = NULL;
+static const char **paths = NULL;
 static int *pathlens = NULL;
 
 static int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base);
@@ -67,11 +67,6 @@ static void show_file(const char *prefix
 	const char *path;
 	const unsigned char *sha1 = extract(tree, size, &path, &mode);
 
-	if (header) {
-		printf("%s", header);
-		header = NULL;
-	}
-
 	if (silent)
 		return;
 
@@ -137,10 +132,6 @@ static int compare_tree_entry(void *tree
 		return retval;
 	}
 
-	if (header) {
-		printf("%s", header);
-		header = NULL;
-	}
 	if (silent)
 		return 0;
 
@@ -268,16 +259,28 @@ static int diff_tree_sha1(const unsigned
 
 static void call_diff_setup(void)
 {
-	diff_setup(reverse_diff, diff_output_format);
+	diff_setup(reverse_diff);
 }
 
-static void call_diff_flush(void)
+static int call_diff_flush()
 {
 	if (detect_rename)
-		diff_detect_rename(detect_rename, diff_score_opt);
-	if (pickaxe)
-		diff_pickaxe(pickaxe);
-	diff_flush(NULL, 0);
+		diffcore_rename(detect_rename, diff_score_opt);
+	if (pickaxe) {
+		diffcore_pickaxe(pickaxe);
+		if (diff_queue_is_empty()) {
+			diff_flush(DIFF_FORMAT_NO_OUTPUT);
+			return 0;
+		}
+	}
+	if (nr_paths)
+		diffcore_pathspec(paths);
+	if (header) {
+		printf("%s", header);
+		header = NULL;
+	}
+	diff_flush(diff_output_format);
+	return 1;
 }
 
 static int diff_tree_sha1_top(const unsigned char *old,
@@ -462,7 +465,7 @@ static int diff_tree_stdin(char *line)
 static char *diff_tree_usage =
 "git-diff-tree [-p] [-r] [-z] [--stdin] [-M] [-C] [-R] [-S<string>] [-m] [-s] [-v] <tree-ish> <tree-ish>";
 
-int main(int argc, char **argv)
+int main(int argc, const char **argv)
 {
 	int nr_sha1;
 	char line[1000];
@@ -470,7 +473,7 @@ int main(int argc, char **argv)
 
 	nr_sha1 = 0;
 	for (;;) {
-		char *arg;
+		const char *arg;
 
 		argv++;
 		argc--;
@@ -509,12 +512,12 @@ int main(int argc, char **argv)
 			continue;
 		}
 		if (!strncmp(arg, "-M", 2)) {
-			detect_rename = 1;
+			detect_rename = DIFF_DETECT_RENAME;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
 		if (!strncmp(arg, "-C", 2)) {
-			detect_rename = 2;
+			detect_rename = DIFF_DETECT_COPY;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -16,8 +16,6 @@ static int reverse_diff;
 static int generate_patch;
 static int line_termination = '\n';
 static int inter_name_termination = '\t';
-static const char **pathspec;
-static int speccnt;
 
 static const char *external_diff(void)
 {
@@ -286,6 +284,12 @@ int diff_populate_filespec(struct diff_f
 	return 0;
 }
 
+void diff_free_filepair(struct diff_filepair *p)
+{
+	free(p->xfrm_msg);
+	free(p);
+}
+
 void diff_free_filespec_data(struct diff_filespec *s)
 {
 	if (s->should_free)
@@ -390,25 +394,6 @@ static void remove_tempfile_on_signal(in
 	remove_tempfile();
 }
 
-static int matches_pathspec(const char *name)
-{
-	int i;
-	int namelen;
-
-	if (speccnt == 0)
-		return 1;
-
-	namelen = strlen(name);
-	for (i = 0; i < speccnt; i++) {
-		int speclen = strlen(pathspec[i]);
-		if (! strncmp(pathspec[i], name, speclen) &&
-		    speclen <= namelen &&
-		    (name[speclen] == 0 || name[speclen] == '/'))
-			return 1;
-	}
-	return 0;
-}
-
 /* An external diff command takes:
  *
  * diff-cmd name infile1 infile1-sha1 infile1-mode \
@@ -426,9 +411,6 @@ static void run_external_diff(const char
 	int status;
 	static int atexit_asked = 0;
 
-	if (!matches_pathspec(name) && (!other || !matches_pathspec(other)))
-		return;
-
 	if (one && two) {
 		prepare_temp_file(name, &temp[0], one);
 		prepare_temp_file(other ? : name, &temp[1], two);
@@ -496,47 +478,26 @@ static void run_external_diff(const char
 	remove_tempfile();
 }
 
-int diff_scoreopt_parse(const char *opt)
+void diff_setup(int reverse_diff_)
 {
-	int diglen, num, scale, i;
-	if (opt[0] != '-' || (opt[1] != 'M' && opt[1] != 'C'))
-		return -1; /* that is not a -M nor -C option */
-	diglen = strspn(opt+2, "0123456789");
-	if (diglen == 0 || strlen(opt+2) != diglen)
-		return 0; /* use default */
-	sscanf(opt+2, "%d", &num);
-	for (i = 0, scale = 1; i < diglen; i++)
-		scale *= 10;
-
-	/* user says num divided by scale and we say internally that
-	 * is MAX_SCORE * num / scale.
-	 */
-	return MAX_SCORE * num / scale;
+	reverse_diff = reverse_diff_;
 }
 
-void diff_setup(int reverse_diff_, int diff_output_style)
+struct diff_queue_struct diff_queued_diff;
+
+void diff_q(struct diff_queue_struct *queue, struct diff_filepair *dp)
 {
-	reverse_diff = reverse_diff_;
-	generate_patch = 0;
-	switch (diff_output_style) {
-	case DIFF_FORMAT_HUMAN:
-		line_termination = '\n';
-		inter_name_termination = '\t';
-		break;
-	case DIFF_FORMAT_MACHINE:
-		line_termination = inter_name_termination = 0;
-		break;
-	case DIFF_FORMAT_PATCH:
-		generate_patch = 1;
-		break;
+	if (queue->alloc <= queue->nr) {
+		queue->alloc = alloc_nr(queue->alloc);
+		queue->queue = xrealloc(queue->queue,
+					sizeof(dp) * queue->alloc);
 	}
+	queue->queue[queue->nr++] = dp;
 }
 
-struct diff_queue_struct diff_queued_diff;
-
 struct diff_filepair *diff_queue(struct diff_queue_struct *queue,
-				  struct diff_filespec *one,
-				  struct diff_filespec *two)
+				 struct diff_filespec *one,
+				 struct diff_filespec *two)
 {
 	struct diff_filepair *dp = xmalloc(sizeof(*dp));
 	dp->one = one;
@@ -544,20 +505,16 @@ struct diff_filepair *diff_queue(struct 
 	dp->xfrm_msg = 0;
 	dp->orig_order = queue->nr;
 	dp->xfrm_work = 0;
-	if (queue->alloc <= queue->nr) {
-		queue->alloc = alloc_nr(queue->alloc);
-		queue->queue = xrealloc(queue->queue,
-				       sizeof(dp) * queue->alloc);
-	}
-	queue->queue[queue->nr++] = dp;
+	diff_q(queue, dp);
 	return dp;
 }
 
 static void diff_flush_raw(struct diff_filepair *p)
 {
-	/*
-	 * We used to reject rename/copy but new diff-raw can express them.
-	 */
+	if (DIFF_PAIR_UNMERGED(p)) {
+		printf("U %s%c", p->one->path, line_termination);
+		return;
+	}
 	printf(":%06o %06o %s ",
 	       p->one->mode, p->two->mode, sha1_to_hex(p->one->sha1));
 	printf("%s%c%s%c%s%c",
@@ -576,19 +533,26 @@ static void diff_flush_patch(struct diff
 	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
 		return; /* no tree diffs in patch format */ 
 
-	run_external_diff(name, other, p->one, p->two, p->xfrm_msg);
+	if (DIFF_PAIR_UNMERGED(p))
+		run_external_diff(name, NULL, NULL, NULL, NULL);
+	else
+		run_external_diff(name, other, p->one, p->two, p->xfrm_msg);
 }
 
-static int identical(struct diff_filespec *one, struct diff_filespec *two)
+static int interesting(struct diff_filepair *p)
 {
 	/* This function is written stricter than necessary to support
 	 * the currently implemented transformers, but the idea is to
 	 * let transformers to produce diff_filepairs any way they want,
 	 * and filter and clean them up here before producing the output.
 	 */
+	struct diff_filespec *one, *two;
+
+	if (DIFF_PAIR_UNMERGED(p))
+		return 0; /* unmerged is interesting */
 
-	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two))
-		return 1; /* not interesting */
+	one = p->one;
+	two = p->two;
 
 	/* deletion, addition, mode change and renames are all interesting. */
 	if (DIFF_FILE_VALID(one) != DIFF_FILE_VALID(two) ||
@@ -609,7 +573,7 @@ static int identical(struct diff_filespe
 
 static void diff_flush_one(struct diff_filepair *p)
 {
-	if (identical(p->one, p->two))
+	if (interesting(p))
 		return;
 	if (generate_patch)
 		diff_flush_patch(p);
@@ -624,23 +588,32 @@ int diff_queue_is_empty(void)
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
-		if (!identical(p->one, p->two))
+		if (!interesting(p))
 			return 0;
 	}
 	return 1;
 }
 
-void diff_flush(const char **pathspec_, int speccnt_)
+void diff_flush(int diff_output_style)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	int i;
 
-	pathspec = pathspec_;
-	speccnt = speccnt_;
-
+	generate_patch = 0;
+	switch (diff_output_style) {
+	case DIFF_FORMAT_HUMAN:
+		line_termination = '\n';
+		inter_name_termination = '\t';
+		break;
+	case DIFF_FORMAT_MACHINE:
+		line_termination = inter_name_termination = 0;
+		break;
+	case DIFF_FORMAT_PATCH:
+		generate_patch = 1;
+		break;
+	}
 	for (i = 0; i < q->nr; i++)
 		diff_flush_one(q->queue[i]);
-
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		diff_free_filespec_data(p->one);
@@ -669,7 +642,7 @@ void diff_addremove(int addremove, unsig
 	 * which but should not make any difference).
 	 * Feeding the same new and old to diff_change() should
 	 * also have the same effect.  diff_flush() should
-	 * filter the identical ones out at the final output
+	 * filter uninteresting ones out at the final output
 	 * stage.
 	 */
 	if (reverse_diff)
@@ -739,8 +712,8 @@ void diff_change(unsigned old_mode, unsi
 
 void diff_unmerge(const char *path)
 {
-	if (generate_patch)
-		run_external_diff(path, NULL, NULL, NULL, NULL);
-	else
-		printf("U %s%c", path, line_termination);
+	struct diff_filespec *one, *two;
+	one = alloc_filespec(path);
+	two = alloc_filespec(path);
+	diff_queue(&diff_queued_diff, one, two);
 }
diff --git a/diff.h b/diff.h
--- a/diff.h
+++ b/diff.h
@@ -26,16 +26,20 @@ extern void diff_unmerge(const char *pat
 
 extern int diff_scoreopt_parse(const char *opt);
 
-#define DIFF_FORMAT_HUMAN   0
-#define DIFF_FORMAT_MACHINE 1
-#define DIFF_FORMAT_PATCH   2
-extern void diff_setup(int reverse, int diff_output_style);
-
-extern void diff_detect_rename(int, int);
-extern void diff_pickaxe(const char *);
+#define DIFF_FORMAT_HUMAN	0
+#define DIFF_FORMAT_MACHINE	1
+#define DIFF_FORMAT_PATCH	2
+#define DIFF_FORMAT_NO_OUTPUT	3
+extern void diff_setup(int reverse);
+
+#define DIFF_DETECT_RENAME	1
+#define DIFF_DETECT_COPY	2
+extern void diffcore_rename(int rename_copy, int minimum_score);
+extern void diffcore_pickaxe(const char *needle);
+extern void diffcore_pathspec(const char **pathspec);
 
 extern int diff_queue_is_empty(void);
 
-extern void diff_flush(const char **, int);
+extern void diff_flush(int output_style);
 
 #endif /* DIFF_H */
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -21,7 +21,7 @@ static int contains(struct diff_filespec
 	return 0;
 }
 
-void diff_pickaxe(const char *needle)
+void diffcore_pickaxe(const char *needle)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	unsigned long len = strlen(needle);
@@ -32,24 +32,23 @@ void diff_pickaxe(const char *needle)
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
+		int onum = outq.nr;
 		if (!DIFF_FILE_VALID(p->one)) {
 			if (!DIFF_FILE_VALID(p->two))
 				continue; /* ignore nonsense */
 			/* created */
 			if (contains(p->two, needle, len))
-				diff_queue(&outq, p->one, p->two);
+				diff_q(&outq, p);
 		}
 		else if (!DIFF_FILE_VALID(p->two)) {
 			if (contains(p->one, needle, len))
-				diff_queue(&outq, p->one, p->two);
+				diff_q(&outq, p);
 		}
 		else if (contains(p->one, needle, len) !=
 			 contains(p->two, needle, len))
-			diff_queue(&outq, p->one, p->two);
-	}
-	for (i = 0; i < q->nr; i++) {
-		struct diff_filepair *p = q->queue[i];
-		free(p);
+			diff_q(&outq, p);
+		if (onum == outq.nr)
+			diff_free_filepair(p);
 	}
 	free(q->queue);
 	*q = outq;
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -224,8 +224,25 @@ static int needs_to_stay(struct diff_que
 	return 0;
 }
 
-void diff_detect_rename(int detect_rename,
-			int minimum_score)
+int diff_scoreopt_parse(const char *opt)
+{
+	int diglen, num, scale, i;
+	if (opt[0] != '-' || (opt[1] != 'M' && opt[1] != 'C'))
+		return -1; /* that is not a -M nor -C option */
+	diglen = strspn(opt+2, "0123456789");
+	if (diglen == 0 || strlen(opt+2) != diglen)
+		return 0; /* use default */
+	sscanf(opt+2, "%d", &num);
+	for (i = 0, scale = 1; i < diglen; i++)
+		scale *= 10;
+
+	/* user says num divided by scale and we say internally that
+	 * is MAX_SCORE * num / scale.
+	 */
+	return MAX_SCORE * num / scale;
+}
+
+void diffcore_rename(int detect_rename, int minimum_score)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct diff_queue_struct outq;
@@ -235,6 +252,8 @@ void diff_detect_rename(int detect_renam
 	int h, i, j;
 	int num_create, num_src, dst_cnt, src_cnt;
 
+	if (!minimum_score)
+		minimum_score = DEFAULT_MINIMUM_SCORE;
 	outq.queue = NULL;
 	outq.nr = outq.alloc = 0;
 
diff --git a/diffcore.h b/diffcore.h
--- a/diffcore.h
+++ b/diffcore.h
@@ -45,6 +45,8 @@ struct diff_filepair {
 	int orig_order; /* the original order of insertion into the queue */
 	int xfrm_work; /* for use by tramsformers, not by diffcore */
 };
+#define DIFF_PAIR_UNMERGED(p) \
+	(!DIFF_FILE_VALID((p)->one) && !DIFF_FILE_VALID((p)->two))
 
 struct diff_queue_struct {
 	struct diff_filepair **queue;
@@ -56,5 +58,7 @@ extern struct diff_queue_struct diff_que
 extern struct diff_filepair *diff_queue(struct diff_queue_struct *,
 					struct diff_filespec *,
 					struct diff_filespec *);
+extern void diff_q(struct diff_queue_struct *, struct diff_filepair *);
+extern void diff_free_filepair(struct diff_filepair *);
 
 #endif
------------------------------------------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Diffcore updates.
  2005-05-22  9:41   ` [PATCH] Diffcore updates Junio C Hamano
@ 2005-05-22 16:40     ` Linus Torvalds
  2005-05-22 16:47       ` Junio C Hamano
  2005-05-22 17:04     ` Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2005-05-22 16:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Sun, 22 May 2005, Junio C Hamano wrote:
>
> Linus, this truly makes diff-tree with the pickaxe very pleasant to
> use.  Please give it a try.

I can't. You split "diffcore_pathspec()" into a new common library file, 
but you didn't add that to the archive, so your patch doesn't contain it.

Maybe you should make "jit" warn about unadded files ;)

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Diffcore updates.
  2005-05-22 16:40     ` Linus Torvalds
@ 2005-05-22 16:47       ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22 16:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Maybe you should make "jit" warn about unadded files ;)

Sorry my bad.  I do not run jit-status often enough.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] Diffcore updates.
  2005-05-22  9:41   ` [PATCH] Diffcore updates Junio C Hamano
  2005-05-22 16:40     ` Linus Torvalds
@ 2005-05-22 17:04     ` Junio C Hamano
  2005-05-23  4:24       ` [PATCH] Be careful with symlinks when detecting renames and copies Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22 17:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This applies on top of the "math thinko" fix; would not conflict
if you are missing it, but you would see offsets.  I added the
file missing from the previous patch.  Sorry about the screwup.

------------
This patch moves the path selection logic from individual
programs to a new diffcore transformer (diff-tree still needs
to have its own for performance reasons).  Also the header
printing code in diff-tree was tweaked not to produce anything
when pickaxe is in effect and there is nothing interesting to
report.  An interesting example is the following in the GIT
archive itself:

    $ git-whatchanged -p -C -S'or something in a real script'

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

Makefile            |    9 +-
diff-cache.c        |   27 +++++---
diff-files.c        |   38 +++--------
diff-helper.c       |   17 +++--
diff-tree.c         |   44 +++++++------
diff.c              |  172 +++++++++++++++++++++++++++-------------------------
diff.h              |   22 ++++--
diffcore-pathspec.c |   63 +++++++++++++++++++
diffcore-pickaxe.c  |   15 ++--
diffcore-rename.c   |   23 ++++++
diffcore.h          |    4 +
11 files changed, 267 insertions(+), 167 deletions(-)
new file (100644): diffcore-pathspec.c

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -45,7 +45,7 @@ LIB_H += strbuf.h
 LIB_OBJS += strbuf.o
 
 LIB_H += diff.h
-LIB_OBJS += diff.o diffcore-rename.o diffcore-pickaxe.o
+LIB_OBJS += diff.o diffcore-rename.o diffcore-pickaxe.o diffcore-pathspec.o
 
 LIB_OBJS += gitenv.o
 
@@ -123,9 +123,10 @@ sha1_file.o: $(LIB_H)
 usage.o: $(LIB_H)
 strbuf.o: $(LIB_H)
 gitenv.o: $(LIB_H)
-diff.o: $(LIB_H)
-diffcore-rename.o : $(LIB_H)
-diffcore-pickaxe.o : $(LIB_H)
+diff.o: $(LIB_H) diffcore.h
+diffcore-rename.o : $(LIB_H) diffcore.h
+diffcore-pathspec.o : $(LIB_H) diffcore.h
+diffcore-pickaxe.o : $(LIB_H) diffcore.h
 
 test: all
 	make -C t/ all
diff --git a/diff-cache.c b/diff-cache.c
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -71,7 +71,7 @@ static int show_modified(struct cache_en
 
 	oldmode = old->ce_mode;
 	if (mode == oldmode && !memcmp(sha1, old->sha1, 20) &&
-	    detect_rename < 2)
+	    detect_rename < DIFF_DETECT_COPY)
 		return 0;
 
 	mode = ntohl(mode);
@@ -156,7 +156,7 @@ static void mark_merge_entries(void)
 static char *diff_cache_usage =
 "git-diff-cache [-p] [-r] [-z] [-m] [-M] [-C] [-R] [-S<string>] [--cached] <tree-ish>";
 
-int main(int argc, char **argv)
+int main(int argc, const char **argv)
 {
 	unsigned char tree_sha1[20];
 	void *tree;
@@ -165,7 +165,7 @@ int main(int argc, char **argv)
 
 	read_cache();
 	while (argc > 2) {
-		char *arg = argv[1];
+		const char *arg = argv[1];
 		argv++;
 		argc--;
 		if (!strcmp(arg, "-r")) {
@@ -177,12 +177,12 @@ int main(int argc, char **argv)
 			continue;
 		}
 		if (!strncmp(arg, "-M", 2)) {
-			detect_rename = 1;
+			detect_rename = DIFF_DETECT_RENAME;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
 		if (!strncmp(arg, "-C", 2)) {
-			detect_rename = 2;
+			detect_rename = DIFF_DETECT_COPY;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
@@ -209,10 +209,14 @@ int main(int argc, char **argv)
 		usage(diff_cache_usage);
 	}
 
-	if (argc != 2 || get_sha1(argv[1], tree_sha1))
+	if (argc < 2 || get_sha1(argv[1], tree_sha1))
 		usage(diff_cache_usage);
+	argv++;
+	argc--;
 
-	diff_setup(reverse_diff, diff_output_format);
+	/* The rest is for paths restriction. */
+
+	diff_setup(reverse_diff);
 
 	mark_merge_entries();
 
@@ -224,9 +228,12 @@ int main(int argc, char **argv)
 
 	ret = diff_cache(active_cache, active_nr);
 	if (detect_rename)
-		diff_detect_rename(detect_rename, diff_score_opt);
+		diffcore_rename(detect_rename, diff_score_opt);
+	diffcore_prune();
 	if (pickaxe)
-		diff_pickaxe(pickaxe);
-	diff_flush(NULL, 0);
+		diffcore_pickaxe(pickaxe);
+	if (2 <= argc)
+		diffcore_pathspec(argv + 1);
+	diff_flush(diff_output_format);
 	return ret;
 }
diff --git a/diff-files.c b/diff-files.c
--- a/diff-files.c
+++ b/diff-files.c
@@ -16,21 +16,6 @@ static int diff_score_opt = 0;
 static const char *pickaxe = NULL;
 static int silent = 0;
 
-static int matches_pathspec(struct cache_entry *ce, char **spec, int cnt)
-{
-	int i;
-	int namelen = ce_namelen(ce);
-	for (i = 0; i < cnt; i++) {
-		int speclen = strlen(spec[i]);
-		if (! strncmp(spec[i], ce->name, speclen) &&
-		    speclen <= namelen &&
-		    (ce->name[speclen] == 0 ||
-		     ce->name[speclen] == '/'))
-			return 1;
-	}
-	return 0;
-}
-
 static void show_unmerge(const char *path)
 {
 	diff_unmerge(path);
@@ -48,7 +33,7 @@ static void show_modified(int oldmode, i
 	diff_change(oldmode, mode, old_sha1, sha1, path, NULL);
 }
 
-int main(int argc, char **argv)
+int main(int argc, const char **argv)
 {
 	static const unsigned char null_sha1[20] = { 0, };
 	int entries = read_cache();
@@ -71,11 +56,11 @@ int main(int argc, char **argv)
 			pickaxe = argv[1] + 2;
 		else if (!strncmp(argv[1], "-M", 2)) {
 			diff_score_opt = diff_scoreopt_parse(argv[1]);
-			detect_rename = 1;
+			detect_rename = DIFF_DETECT_RENAME;
 		}
 		else if (!strncmp(argv[1], "-C", 2)) {
 			diff_score_opt = diff_scoreopt_parse(argv[1]);
-			detect_rename = 2;
+			detect_rename = DIFF_DETECT_COPY;
 		}
 		else
 			usage(diff_files_usage);
@@ -90,7 +75,7 @@ int main(int argc, char **argv)
 		exit(1);
 	}
 
-	diff_setup(reverse_diff, diff_output_format);
+	diff_setup(reverse_diff);
 
 	for (i = 0; i < entries; i++) {
 		struct stat st;
@@ -98,10 +83,6 @@ int main(int argc, char **argv)
 		struct cache_entry *ce = active_cache[i];
 		int changed;
 
-		if (1 < argc &&
-		    ! matches_pathspec(ce, argv+1, argc-1))
-			continue;
-
 		if (ce_stage(ce)) {
 			show_unmerge(ce->name);
 			while (i < entries &&
@@ -122,7 +103,7 @@ int main(int argc, char **argv)
 			continue;
 		}
 		changed = ce_match_stat(ce, &st);
-		if (!changed && detect_rename < 2)
+		if (!changed && detect_rename < DIFF_DETECT_COPY)
 			continue;
 
 		oldmode = ntohl(ce->ce_mode);
@@ -133,9 +114,12 @@ int main(int argc, char **argv)
 			      ce->name);
 	}
 	if (detect_rename)
-		diff_detect_rename(detect_rename, diff_score_opt);
+		diffcore_rename(detect_rename, diff_score_opt);
+	diffcore_prune();
 	if (pickaxe)
-		diff_pickaxe(pickaxe);
-	diff_flush(NULL, 0);
+		diffcore_pickaxe(pickaxe);
+	if (1 < argc)
+		diffcore_pathspec(argv + 1);
+	diff_flush(diff_output_format);
 	return 0;
 }
diff --git a/diff-helper.c b/diff-helper.c
--- a/diff-helper.c
+++ b/diff-helper.c
@@ -77,11 +77,11 @@ int main(int ac, const char **av) {
 		else if (av[1][1] == 'P') /* hidden from the help */
 			diff_output_style = DIFF_FORMAT_MACHINE;
 		else if (av[1][1] == 'M') {
-			detect_rename = 1;
+			detect_rename = DIFF_DETECT_RENAME;
 			diff_score_opt = diff_scoreopt_parse(av[1]);
 		}
 		else if (av[1][1] == 'C') {
-			detect_rename = 2;
+			detect_rename = DIFF_DETECT_COPY;
 			diff_score_opt = diff_scoreopt_parse(av[1]);
 		}
 		else if (av[1][1] == 'S') {
@@ -93,7 +93,7 @@ int main(int ac, const char **av) {
 	}
 	/* the remaining parameters are paths patterns */
 
-	diff_setup(reverse_diff, diff_output_style);
+	diff_setup(reverse_diff);
 	while (1) {
 		int status;
 		read_line(&sb1, stdin, line_termination);
@@ -121,14 +121,17 @@ int main(int ac, const char **av) {
 			status = parse_diff_raw(sb1.buf+1, NULL, NULL);
 		if (status) {
 		unrecognized:
-			diff_flush(av+1, ac-1);
+			diff_flush(diff_output_style);
 			printf("%s%c", sb1.buf, line_termination);
 		}
 	}
 	if (detect_rename)
-		diff_detect_rename(detect_rename, diff_score_opt);
+		diffcore_rename(detect_rename, diff_score_opt);
+	diffcore_prune();
 	if (pickaxe)
-		diff_pickaxe(pickaxe);
-	diff_flush(av+1, ac-1);
+		diffcore_pickaxe(pickaxe);
+	if (ac)
+		diffcore_pathspec(av + 1);
+	diff_flush(diff_output_style);
 	return 0;
 }
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -18,7 +18,7 @@ static const char *header_prefix = "";
 
 // What paths are we interested in?
 static int nr_paths = 0;
-static char **paths = NULL;
+static const char **paths = NULL;
 static int *pathlens = NULL;
 
 static int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base);
@@ -67,11 +67,6 @@ static void show_file(const char *prefix
 	const char *path;
 	const unsigned char *sha1 = extract(tree, size, &path, &mode);
 
-	if (header) {
-		printf("%s", header);
-		header = NULL;
-	}
-
 	if (silent)
 		return;
 
@@ -137,10 +132,6 @@ static int compare_tree_entry(void *tree
 		return retval;
 	}
 
-	if (header) {
-		printf("%s", header);
-		header = NULL;
-	}
 	if (silent)
 		return 0;
 
@@ -268,16 +259,29 @@ static int diff_tree_sha1(const unsigned
 
 static void call_diff_setup(void)
 {
-	diff_setup(reverse_diff, diff_output_format);
+	diff_setup(reverse_diff);
 }
 
-static void call_diff_flush(void)
+static int call_diff_flush()
 {
 	if (detect_rename)
-		diff_detect_rename(detect_rename, diff_score_opt);
-	if (pickaxe)
-		diff_pickaxe(pickaxe);
-	diff_flush(NULL, 0);
+		diffcore_rename(detect_rename, diff_score_opt);
+	diffcore_prune();
+	if (pickaxe) {
+		diffcore_pickaxe(pickaxe);
+		if (diff_queue_is_empty()) {
+			diff_flush(DIFF_FORMAT_NO_OUTPUT);
+			return 0;
+		}
+	}
+	if (nr_paths)
+		diffcore_pathspec(paths);
+	if (header) {
+		printf("%s", header);
+		header = NULL;
+	}
+	diff_flush(diff_output_format);
+	return 1;
 }
 
 static int diff_tree_sha1_top(const unsigned char *old,
@@ -462,7 +466,7 @@ static int diff_tree_stdin(char *line)
 static char *diff_tree_usage =
 "git-diff-tree [-p] [-r] [-z] [--stdin] [-M] [-C] [-R] [-S<string>] [-m] [-s] [-v] <tree-ish> <tree-ish>";
 
-int main(int argc, char **argv)
+int main(int argc, const char **argv)
 {
 	int nr_sha1;
 	char line[1000];
@@ -470,7 +474,7 @@ int main(int argc, char **argv)
 
 	nr_sha1 = 0;
 	for (;;) {
-		char *arg;
+		const char *arg;
 
 		argv++;
 		argc--;
@@ -509,12 +513,12 @@ int main(int argc, char **argv)
 			continue;
 		}
 		if (!strncmp(arg, "-M", 2)) {
-			detect_rename = 1;
+			detect_rename = DIFF_DETECT_RENAME;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
 		if (!strncmp(arg, "-C", 2)) {
-			detect_rename = 2;
+			detect_rename = DIFF_DETECT_COPY;
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -16,8 +16,6 @@ static int reverse_diff;
 static int generate_patch;
 static int line_termination = '\n';
 static int inter_name_termination = '\t';
-static const char **pathspec;
-static int speccnt;
 
 static const char *external_diff(void)
 {
@@ -286,6 +284,12 @@ int diff_populate_filespec(struct diff_f
 	return 0;
 }
 
+void diff_free_filepair(struct diff_filepair *p)
+{
+	free(p->xfrm_msg);
+	free(p);
+}
+
 void diff_free_filespec_data(struct diff_filespec *s)
 {
 	if (s->should_free)
@@ -390,25 +394,6 @@ static void remove_tempfile_on_signal(in
 	remove_tempfile();
 }
 
-static int matches_pathspec(const char *name)
-{
-	int i;
-	int namelen;
-
-	if (speccnt == 0)
-		return 1;
-
-	namelen = strlen(name);
-	for (i = 0; i < speccnt; i++) {
-		int speclen = strlen(pathspec[i]);
-		if (! strncmp(pathspec[i], name, speclen) &&
-		    speclen <= namelen &&
-		    (name[speclen] == 0 || name[speclen] == '/'))
-			return 1;
-	}
-	return 0;
-}
-
 /* An external diff command takes:
  *
  * diff-cmd name infile1 infile1-sha1 infile1-mode \
@@ -426,9 +411,6 @@ static void run_external_diff(const char
 	int status;
 	static int atexit_asked = 0;
 
-	if (!matches_pathspec(name) && (!other || !matches_pathspec(other)))
-		return;
-
 	if (one && two) {
 		prepare_temp_file(name, &temp[0], one);
 		prepare_temp_file(other ? : name, &temp[1], two);
@@ -496,47 +478,26 @@ static void run_external_diff(const char
 	remove_tempfile();
 }
 
-int diff_scoreopt_parse(const char *opt)
+void diff_setup(int reverse_diff_)
 {
-	int diglen, num, scale, i;
-	if (opt[0] != '-' || (opt[1] != 'M' && opt[1] != 'C'))
-		return -1; /* that is not a -M nor -C option */
-	diglen = strspn(opt+2, "0123456789");
-	if (diglen == 0 || strlen(opt+2) != diglen)
-		return 0; /* use default */
-	sscanf(opt+2, "%d", &num);
-	for (i = 0, scale = 1; i < diglen; i++)
-		scale *= 10;
-
-	/* user says num divided by scale and we say internally that
-	 * is MAX_SCORE * num / scale.
-	 */
-	return MAX_SCORE * num / scale;
+	reverse_diff = reverse_diff_;
 }
 
-void diff_setup(int reverse_diff_, int diff_output_style)
+struct diff_queue_struct diff_queued_diff;
+
+void diff_q(struct diff_queue_struct *queue, struct diff_filepair *dp)
 {
-	reverse_diff = reverse_diff_;
-	generate_patch = 0;
-	switch (diff_output_style) {
-	case DIFF_FORMAT_HUMAN:
-		line_termination = '\n';
-		inter_name_termination = '\t';
-		break;
-	case DIFF_FORMAT_MACHINE:
-		line_termination = inter_name_termination = 0;
-		break;
-	case DIFF_FORMAT_PATCH:
-		generate_patch = 1;
-		break;
+	if (queue->alloc <= queue->nr) {
+		queue->alloc = alloc_nr(queue->alloc);
+		queue->queue = xrealloc(queue->queue,
+					sizeof(dp) * queue->alloc);
 	}
+	queue->queue[queue->nr++] = dp;
 }
 
-struct diff_queue_struct diff_queued_diff;
-
 struct diff_filepair *diff_queue(struct diff_queue_struct *queue,
-				  struct diff_filespec *one,
-				  struct diff_filespec *two)
+				 struct diff_filespec *one,
+				 struct diff_filespec *two)
 {
 	struct diff_filepair *dp = xmalloc(sizeof(*dp));
 	dp->one = one;
@@ -544,20 +505,16 @@ struct diff_filepair *diff_queue(struct 
 	dp->xfrm_msg = 0;
 	dp->orig_order = queue->nr;
 	dp->xfrm_work = 0;
-	if (queue->alloc <= queue->nr) {
-		queue->alloc = alloc_nr(queue->alloc);
-		queue->queue = xrealloc(queue->queue,
-				       sizeof(dp) * queue->alloc);
-	}
-	queue->queue[queue->nr++] = dp;
+	diff_q(queue, dp);
 	return dp;
 }
 
 static void diff_flush_raw(struct diff_filepair *p)
 {
-	/*
-	 * We used to reject rename/copy but new diff-raw can express them.
-	 */
+	if (DIFF_PAIR_UNMERGED(p)) {
+		printf("U %s%c", p->one->path, line_termination);
+		return;
+	}
 	printf(":%06o %06o %s ",
 	       p->one->mode, p->two->mode, sha1_to_hex(p->one->sha1));
 	printf("%s%c%s%c%s%c",
@@ -576,19 +533,26 @@ static void diff_flush_patch(struct diff
 	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
 		return; /* no tree diffs in patch format */ 
 
-	run_external_diff(name, other, p->one, p->two, p->xfrm_msg);
+	if (DIFF_PAIR_UNMERGED(p))
+		run_external_diff(name, NULL, NULL, NULL, NULL);
+	else
+		run_external_diff(name, other, p->one, p->two, p->xfrm_msg);
 }
 
-static int identical(struct diff_filespec *one, struct diff_filespec *two)
+static int uninteresting(struct diff_filepair *p)
 {
 	/* This function is written stricter than necessary to support
 	 * the currently implemented transformers, but the idea is to
 	 * let transformers to produce diff_filepairs any way they want,
 	 * and filter and clean them up here before producing the output.
 	 */
+	struct diff_filespec *one, *two;
 
-	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two))
-		return 1; /* not interesting */
+	if (DIFF_PAIR_UNMERGED(p))
+		return 0; /* unmerged is interesting */
+
+	one = p->one;
+	two = p->two;
 
 	/* deletion, addition, mode change and renames are all interesting. */
 	if (DIFF_FILE_VALID(one) != DIFF_FILE_VALID(two) ||
@@ -607,9 +571,44 @@ static int identical(struct diff_filespe
 	return 0;
 }
 
+void diffcore_prune(void)
+{
+	/*
+	 * Although rename/copy detection wants to have "no-change"
+	 * entries fed into them, the downstream do not need to see
+	 * them.  This function removes such entries.
+	 *
+	 * The applications that use rename/copy should:
+	 *
+	 * (1) feed change and "no-change" entries via diff_queue().
+	 * (2) call diffcore_rename, and any other future diffcore_xxx
+	 *     that would benefit by still having "no-change" entries.
+	 * (3) call diffcore_prune
+	 * (4) call other diffcore_xxx that do not need to see
+	 *     "no-change" entries.
+	 */
+	struct diff_queue_struct *q = &diff_queued_diff;
+	struct diff_queue_struct outq;
+	int i;
+
+	outq.queue = NULL;
+	outq.nr = outq.alloc = 0;
+
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		if (!uninteresting(p))
+			diff_q(&outq, p);
+		else
+			diff_free_filepair(p);
+	}
+	free(q->queue);
+	*q = outq;
+	return;
+}
+
 static void diff_flush_one(struct diff_filepair *p)
 {
-	if (identical(p->one, p->two))
+	if (uninteresting(p))
 		return;
 	if (generate_patch)
 		diff_flush_patch(p);
@@ -624,23 +623,32 @@ int diff_queue_is_empty(void)
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
-		if (!identical(p->one, p->two))
+		if (!uninteresting(p))
 			return 0;
 	}
 	return 1;
 }
 
-void diff_flush(const char **pathspec_, int speccnt_)
+void diff_flush(int diff_output_style)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	int i;
 
-	pathspec = pathspec_;
-	speccnt = speccnt_;
-
+	generate_patch = 0;
+	switch (diff_output_style) {
+	case DIFF_FORMAT_HUMAN:
+		line_termination = '\n';
+		inter_name_termination = '\t';
+		break;
+	case DIFF_FORMAT_MACHINE:
+		line_termination = inter_name_termination = 0;
+		break;
+	case DIFF_FORMAT_PATCH:
+		generate_patch = 1;
+		break;
+	}
 	for (i = 0; i < q->nr; i++)
 		diff_flush_one(q->queue[i]);
-
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		diff_free_filespec_data(p->one);
@@ -669,7 +677,7 @@ void diff_addremove(int addremove, unsig
 	 * which but should not make any difference).
 	 * Feeding the same new and old to diff_change() should
 	 * also have the same effect.  diff_flush() should
-	 * filter the identical ones out at the final output
+	 * filter uninteresting ones out at the final output
 	 * stage.
 	 */
 	if (reverse_diff)
@@ -739,8 +747,8 @@ void diff_change(unsigned old_mode, unsi
 
 void diff_unmerge(const char *path)
 {
-	if (generate_patch)
-		run_external_diff(path, NULL, NULL, NULL, NULL);
-	else
-		printf("U %s%c", path, line_termination);
+	struct diff_filespec *one, *two;
+	one = alloc_filespec(path);
+	two = alloc_filespec(path);
+	diff_queue(&diff_queued_diff, one, two);
 }
diff --git a/diff.h b/diff.h
--- a/diff.h
+++ b/diff.h
@@ -26,16 +26,24 @@ extern void diff_unmerge(const char *pat
 
 extern int diff_scoreopt_parse(const char *opt);
 
-#define DIFF_FORMAT_HUMAN   0
-#define DIFF_FORMAT_MACHINE 1
-#define DIFF_FORMAT_PATCH   2
-extern void diff_setup(int reverse, int diff_output_style);
+#define DIFF_FORMAT_HUMAN	0
+#define DIFF_FORMAT_MACHINE	1
+#define DIFF_FORMAT_PATCH	2
+#define DIFF_FORMAT_NO_OUTPUT	3
+extern void diff_setup(int reverse);
 
-extern void diff_detect_rename(int, int);
-extern void diff_pickaxe(const char *);
+#define DIFF_DETECT_RENAME	1
+#define DIFF_DETECT_COPY	2
+
+extern void diffcore_rename(int rename_copy, int minimum_score);
+
+extern void diffcore_prune(void);
+
+extern void diffcore_pickaxe(const char *needle);
+extern void diffcore_pathspec(const char **pathspec);
 
 extern int diff_queue_is_empty(void);
 
-extern void diff_flush(const char **, int);
+extern void diff_flush(int output_style);
 
 #endif /* DIFF_H */
diff --git a/diffcore-pathspec.c b/diffcore-pathspec.c
new file mode 100644
--- /dev/null
+++ b/diffcore-pathspec.c
@@ -0,0 +1,63 @@
+/*
+ * Copyright (C) 2005 Junio C Hamano
+ */
+#include "cache.h"
+#include "diff.h"
+#include "diffcore.h"
+#include "delta.h"
+
+struct path_spec {
+	const char *spec;
+	int len;
+};
+
+static int matches_pathspec(const char *name, struct path_spec *s, int cnt)
+{
+	int i;
+	int namelen;
+
+	if (cnt == 0)
+		return 1;
+
+	namelen = strlen(name);
+	for (i = 0; i < cnt; i++) {
+		int len = s->len;
+		if (! strncmp(s->spec, name, len) &&
+		    len <= namelen &&
+		    (name[len] == 0 || name[len] == '/'))
+			return 1;
+	}
+	return 0;
+}
+
+void diffcore_pathspec(const char **pathspec)
+{
+	struct diff_queue_struct *q = &diff_queued_diff;
+	int i, speccnt;
+	struct diff_queue_struct outq;
+	struct path_spec *spec;
+
+	outq.queue = NULL;
+	outq.nr = outq.alloc = 0;
+
+	for (i = 0; pathspec[i]; i++)
+		;
+	speccnt = i;
+	spec = xmalloc(sizeof(*spec) * speccnt);
+	for (i = 0; pathspec[i]; i++) {
+		spec[i].spec = pathspec[i];
+		spec[i].len = strlen(pathspec[i]);
+	}
+
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		if (matches_pathspec(p->one->path, spec, speccnt) ||
+		    matches_pathspec(p->two->path, spec, speccnt))
+			diff_q(&outq, p);
+		else
+			diff_free_filepair(p);
+	}
+	free(q->queue);
+	*q = outq;
+	return;
+}
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -21,7 +21,7 @@ static int contains(struct diff_filespec
 	return 0;
 }
 
-void diff_pickaxe(const char *needle)
+void diffcore_pickaxe(const char *needle)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	unsigned long len = strlen(needle);
@@ -32,24 +32,23 @@ void diff_pickaxe(const char *needle)
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
+		int onum = outq.nr;
 		if (!DIFF_FILE_VALID(p->one)) {
 			if (!DIFF_FILE_VALID(p->two))
 				continue; /* ignore nonsense */
 			/* created */
 			if (contains(p->two, needle, len))
-				diff_queue(&outq, p->one, p->two);
+				diff_q(&outq, p);
 		}
 		else if (!DIFF_FILE_VALID(p->two)) {
 			if (contains(p->one, needle, len))
-				diff_queue(&outq, p->one, p->two);
+				diff_q(&outq, p);
 		}
 		else if (contains(p->one, needle, len) !=
 			 contains(p->two, needle, len))
-			diff_queue(&outq, p->one, p->two);
-	}
-	for (i = 0; i < q->nr; i++) {
-		struct diff_filepair *p = q->queue[i];
-		free(p);
+			diff_q(&outq, p);
+		if (onum == outq.nr)
+			diff_free_filepair(p);
 	}
 	free(q->queue);
 	*q = outq;
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -224,8 +224,25 @@ static int needs_to_stay(struct diff_que
 	return 0;
 }
 
-void diff_detect_rename(int detect_rename,
-			int minimum_score)
+int diff_scoreopt_parse(const char *opt)
+{
+	int diglen, num, scale, i;
+	if (opt[0] != '-' || (opt[1] != 'M' && opt[1] != 'C'))
+		return -1; /* that is not a -M nor -C option */
+	diglen = strspn(opt+2, "0123456789");
+	if (diglen == 0 || strlen(opt+2) != diglen)
+		return 0; /* use default */
+	sscanf(opt+2, "%d", &num);
+	for (i = 0, scale = 1; i < diglen; i++)
+		scale *= 10;
+
+	/* user says num divided by scale and we say internally that
+	 * is MAX_SCORE * num / scale.
+	 */
+	return MAX_SCORE * num / scale;
+}
+
+void diffcore_rename(int detect_rename, int minimum_score)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct diff_queue_struct outq;
@@ -235,6 +252,8 @@ void diff_detect_rename(int detect_renam
 	int h, i, j;
 	int num_create, num_src, dst_cnt, src_cnt;
 
+	if (!minimum_score)
+		minimum_score = DEFAULT_MINIMUM_SCORE;
 	outq.queue = NULL;
 	outq.nr = outq.alloc = 0;
 
diff --git a/diffcore.h b/diffcore.h
--- a/diffcore.h
+++ b/diffcore.h
@@ -45,6 +45,8 @@ struct diff_filepair {
 	int orig_order; /* the original order of insertion into the queue */
 	int xfrm_work; /* for use by tramsformers, not by diffcore */
 };
+#define DIFF_PAIR_UNMERGED(p) \
+	(!DIFF_FILE_VALID((p)->one) && !DIFF_FILE_VALID((p)->two))
 
 struct diff_queue_struct {
 	struct diff_filepair **queue;
@@ -56,5 +58,7 @@ extern struct diff_queue_struct diff_que
 extern struct diff_filepair *diff_queue(struct diff_queue_struct *,
 					struct diff_filespec *,
 					struct diff_filespec *);
+extern void diff_q(struct diff_queue_struct *, struct diff_filepair *);
+extern void diff_free_filepair(struct diff_filepair *);
 
 #endif
------------------------------------------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22  2:42   ` [PATCH] The diff-raw format updates Junio C Hamano
  2005-05-22  6:01     ` Linus Torvalds
@ 2005-05-22 18:35     ` Linus Torvalds
  2005-05-22 18:36       ` Niklas Hoglund
                         ` (2 more replies)
  1 sibling, 3 replies; 33+ messages in thread
From: Linus Torvalds @ 2005-05-22 18:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List



On Sat, 21 May 2005, Junio C Hamano wrote:
>
> Update the diff-raw format as Linus and I discussed, except that
> it does not use sequence of underscore '_' letters to express
> nonexistence.  All '0' mode is used for that purpose instead.
> 
> The new diff-raw format can express rename/copy

Having looked at this, I have to disagree.

It can _almost_ express rename/copy, but you can't tell the two apart. In 
both cases you have two different modes, two different SHA1's, and two 
different filenames.

Also, while you can trivially tell whether a file is deleted or new (look 
at the 000000... SHA1), it is pretty illogical to give a "filename" for 
the non-existent side, as in the line

:000000 100644 0000000000000000000000000000000000000000 25ab9eda939ad92bb746c2419d083b1e52117a56	diffcore-pathspec.c	diffcore-pathspec.c

Finally, having now looked at it some more, I've come to realize that it's 
actually pretty hard to tell the different cases apart visually (is it a 
rename or just a change), because the full pathnames can be so long that 
it's not at all immediately obvious.

Anyway, I think we can trivially tweak the filename output to handle all 
these problems.

I'd suggest:

 - we'd continue to have two "filename" fields, with the existing
   termination, but they aren't pure filenames any more, they are just
   tab-separated (or zero-separated) "source" and "destination" fields.

 - if no filename exists (ie the source side of a new file, or the 
   destination side of a deleted file), output "/dev/null". In other
   words, a nonexistent file is _always_ associated with mode 000000, SHA1
   00000..  and a "name field" of "/dev/null".

 - ONLY IF HUMAN-READABLE: if the destination filename is the same as the
   source, drop it (and the tab) completely. This just makes things so
   much more readable, and it's still parseable, because the 
   line-termination is different from the inter-file termination.

   NOTE! In the zero-terminated format, you can't do this, since you 
   wouldn't know where the line ended. You might drop the name completely, 
   but you'd have to have two NUL bytes. I'd argue that since this format 
   isn't human-readable anyway, you might just want to keep the filename 
   the same.

 - in all other cases: if the file is new, prepend a "+", if the file is 
   old, prepend a "*", and if the file goes away, prepend a "-". In other 
   words, the actual pathname (if it exists) always starts at the second
   character and is always prepended by _something_ (ie there is no 
   ambiguoity with pathnames that start in -/+/*).

The above hass a few nice properties, notably you can parse the first
character of the name field, and you always know what's up:

 - '/' is always "/dev/null" aka "no file"
 - '+' is always "added file"
 - '-' is always "removed file"
 - '*' is always "existing file"
 - '\0' (ie empty) is always "same filename as source"

So for the above "create" event, it would look like

	:000000 100644 0000.. 25ab..	/dev/null	+diffcore-pathspec.c

which is visually quite obviously a create. Similarly, deletes are also 
visually pretty obvious:

	:100644 000000 25ab.. 0000..	-diffcore-pathspec.c	/dev/null

while a "copy" would be (git-pull-script stays around, so it gets a "*"):

	:100755 100755 bd89.. 17f2..	*git-pull-script	+git-fetch-script

and a "rename" would be:

	:100644 100644 51bb.. 51bb..	-diff-tree-helper.c	+diff-helper.c

(ie the difference is in the source file having a "-" in front of it 
instead of a "*").

A regular modification would be

	:100644 100644 bcd3.. c05b..	*Documentation/git-fsck-cache.txt

which is also very visually distinct from the other cases.

What do you think?

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 18:35     ` [PATCH] The diff-raw format updates Linus Torvalds
@ 2005-05-22 18:36       ` Niklas Hoglund
  2005-05-22 19:15         ` Junio C Hamano
  2005-05-22 18:42       ` Thomas Glanzmann
  2005-05-22 19:13       ` [PATCH] The diff-raw format updates Junio C Hamano
  2 siblings, 1 reply; 33+ messages in thread
From: Niklas Hoglund @ 2005-05-22 18:36 UTC (permalink / raw)
  To: git

On Sun, May 22, 2005 at 11:35:34AM -0700, Linus Torvalds wrote:
> :000000 100644 0000000000000000000000000000000000000000 25ab9eda939ad92bb746c2419d083b1e52117a56	diffcore-pathspec.c	diffcore-pathspec.c

Don't you think it would be easier to read if the SHA1 field was, say, a
dash and 19 spaces, instead of an obviously bogus SHA1?

This feels a bit like using 0 where NULL would be more appropriate...

Best regards,
Niklas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 18:35     ` [PATCH] The diff-raw format updates Linus Torvalds
  2005-05-22 18:36       ` Niklas Hoglund
@ 2005-05-22 18:42       ` Thomas Glanzmann
  2005-05-22 19:05         ` Linus Torvalds
  2005-05-22 19:13       ` [PATCH] The diff-raw format updates Junio C Hamano
  2 siblings, 1 reply; 33+ messages in thread
From: Thomas Glanzmann @ 2005-05-22 18:42 UTC (permalink / raw)
  To: Git Mailing List

Hello,

>  - in all other cases: if the file is new, prepend a "+", if the file is 
>    old, prepend a "*", and if the file goes away, prepend a "-". In other 
>    words, the actual pathname (if it exists) always starts at the second
>    character and is always prepended by _something_ (ie there is no 
>    ambiguoity with pathnames that start in -/+/*).

I guess that this is only on human readable but not on the machine
format, right?

> What do you think?

Sounds good. Especially that we keep the fixed fields for the machine
parsable stuff.

	Thomas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 18:42       ` Thomas Glanzmann
@ 2005-05-22 19:05         ` Linus Torvalds
  2005-05-22 19:05           ` Thomas Glanzmann
  2005-05-22 19:20           ` Junio C Hamano
  0 siblings, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2005-05-22 19:05 UTC (permalink / raw)
  To: Thomas Glanzmann; +Cc: Git Mailing List



On Sun, 22 May 2005, Thomas Glanzmann wrote:
> 
> >  - in all other cases: if the file is new, prepend a "+", if the file is 
> >    old, prepend a "*", and if the file goes away, prepend a "-". In other 
> >    words, the actual pathname (if it exists) always starts at the second
> >    character and is always prepended by _something_ (ie there is no 
> >    ambiguoity with pathnames that start in -/+/*).
> 
> I guess that this is only on human readable but not on the machine
> format, right?

The machine readable format has the same issue: it needs to be able to 
distinguish between a "copy" (where the source remains) and a "rename" 
(where the source is removed). So you still need the "*/-" thing, and then 
you're better off doing "+" and "/" too in the machine-readable format, to 
make the differences be as small as possible _and_ to avoid confusion if a 
pathname starts with '*' or '-'.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 19:05         ` Linus Torvalds
@ 2005-05-22 19:05           ` Thomas Glanzmann
  2005-05-22 19:20           ` Junio C Hamano
  1 sibling, 0 replies; 33+ messages in thread
From: Thomas Glanzmann @ 2005-05-22 19:05 UTC (permalink / raw)
  To: Git Mailing List

Hello,

> The machine readable format has the same issue: it needs to be able to 
> distinguish between a "copy" (where the source remains) and a "rename" 
> (where the source is removed). So you still need the "*/-" thing, and then 
> you're better off doing "+" and "/" too in the machine-readable format, to 
> make the differences be as small as possible _and_ to avoid confusion if a 
> pathname starts with '*' or '-'.

thanks, I see the point now.

	Thomas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 18:35     ` [PATCH] The diff-raw format updates Linus Torvalds
  2005-05-22 18:36       ` Niklas Hoglund
  2005-05-22 18:42       ` Thomas Glanzmann
@ 2005-05-22 19:13       ` Junio C Hamano
  2 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22 19:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

I have to have some time to ruminate about this, but I am
leaving for the day and won't be back till late afternoon (PDT).

My feeling, knee-jerk reaction, to what you are proposing is
that you are trying to go half-way back to the earlier format
that had addremove/change distinction.

I have to disagree that having path for the non-existent side is
_illogical_.  I ask "what is different between tree A and tree
B", and the diff-raw answers "there was no path0 in tree A but
path0 exists in tree B; there was path1 in tree A but now path2
is there instead in tree B".  To me, the answer would be more
illogical if the diff-raw said "there was no /dev/null in tree
A, and instead path0 exists in tree B.", which is what your
suggestion makes it say if I am reading you right.  Also, again
this is my knee-jerk reaction without thinking things through,
you are hiding (not losing, because having a rename/copy entry
that removes a path _is_ illogical) information by not saying
what no longer exists by replacing the right hand side with
/dev/null and having the reader to infer what is removed.  I
want the reader to be able to say "I can look at only right hand
side to see what is in the right hand side tree".  With
/dev/null munging, it becomes necessary to sometimes look at the
other side.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 18:36       ` Niklas Hoglund
@ 2005-05-22 19:15         ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22 19:15 UTC (permalink / raw)
  To: Niklas Hoglund; +Cc: git

>>>>> "NH" == Niklas Hoglund <nhoglund@gmail.com> writes:

NH> On Sun, May 22, 2005 at 11:35:34AM -0700, Linus Torvalds wrote:
>> :000000 100644 0000000000000000000000000000000000000000 25ab9eda939ad92bb746c2419d083b1e52117a56	diffcore-pathspec.c	diffcore-pathspec.c

NH> Don't you think it would be easier to read if the SHA1 field was, say, a
NH> dash and 19 spaces, instead of an obviously bogus SHA1?

My understanding of the purpose of the diff-raw format is
primarily about machine readablility; 0{40} was far easier to
code cleanly---please see why in diff-helper.c.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 19:05         ` Linus Torvalds
  2005-05-22 19:05           ` Thomas Glanzmann
@ 2005-05-22 19:20           ` Junio C Hamano
  2005-05-22 19:35             ` Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22 19:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Glanzmann, Git Mailing List

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> The machine readable format has the same issue: it needs to be able to 
LT> distinguish between a "copy" (where the source remains) and a "rename" 
LT> (where the source is removed).

Why?  If the same path appears later as the left hand side then
it is a copy otherwise it is a rename.  Please see what
diffcore-rename does when assigning to dp->xfrm_msg.  

What I've been trying hard so far was to keep diff_filepair not
to be too specific to rename/copy.  What I do not like about
what is being proposed is that it would force me to introduce
this "is this a copy or a rename" flag to that structure.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 19:20           ` Junio C Hamano
@ 2005-05-22 19:35             ` Junio C Hamano
  2005-05-22 20:24               ` Linus Torvalds
  2005-05-23  4:26               ` [PATCH] Rename/copy detection fix Junio C Hamano
  0 siblings, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22 19:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Glanzmann, Git Mailing List

>>>>> "JCH" == Junio C Hamano <junkio@cox.net> writes:
>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
LT> The machine readable format has the same issue: it needs to be able to 
LT> distinguish between a "copy" (where the source remains) and a "rename" 
LT> (where the source is removed).

JCH> Why?  If the same path appears later as the left hand side then
JCH> it is a copy otherwise it is a rename.  Please see what
JCH> diffcore-rename does when assigning to dp->xfrm_msg.  

JCH> What I've been trying hard so far was to keep diff_filepair not
JCH> to be too specific to rename/copy.  What I do not like about
JCH> what is being proposed is that it would force me to introduce
JCH> this "is this a copy or a rename" flag to that structure.

Linus, another way of saying the above is this.

Think about the example in test that munge COPYING file.  What
was recorded in the two trees as the result of

    sed $munge_1 <COPYING >COPYING.1
    sed $munge_2 <COPYING COPYING.2
    rm COPYING

will be expressed as copy-edit of COPYING.1 and rename-edit of
COPYING.2 (both from COPYING).  But that is just _one_
interpretation diffcore-rename _arbitrarily_ makes.

It could have been copy-edit of COPYING.2 and rename-edit of
COPYING.1 (both from COPYING).  If you make the change you are
proposing to diff-raw format, reordering diff-raw entries stops
making sense, while the current format allows us to.  

It is similar to your argument of not _recording_ renames on
commit but _tracking_ it upon inspection.  I deliberatly chose
not to record rename/copy distinction in diff-raw --- it is to
be inferred from which entry touches the src _last_.  Everything
but the last one to touch the same path is copy, and the last
one is rename.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 19:35             ` Junio C Hamano
@ 2005-05-22 20:24               ` Linus Torvalds
  2005-05-22 23:01                 ` Junio C Hamano
  2005-05-23  4:26               ` [PATCH] Rename/copy detection fix Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2005-05-22 20:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Glanzmann, Git Mailing List



On Sun, 22 May 2005, Junio C Hamano wrote:
>
> I deliberatly chose not to record rename/copy distinction in diff-raw
> --- it is to be inferred from which entry touches the src _last_.  
> Everything but the last one to touch the same path is copy, and the last
> one is rename.

My counter-example: there is no rename at all, just a copy.

The current raw-diff output simply _cannot_ distinguish this from the case
where there is a rename, as far as I can tell. 

Basically, imagine doing something like this:

	mkdir test-copy
	cd test-copy
	git-init-db 
	cp ~/v2.6/linux/kernel/sched.c .
	git-update-cache --add sched.c 
	echo duh | git-commit-tree $(git-write-tree) > .git/HEAD
	sed 's/sched\.c/other\.c/g' < sched.c > other.c
	git-update-cache --add other.c 
	echo duh2 | git-commit-tree $(git-write-tree) -p HEAD

and dammit, if "other.c" doesn't show up as a copy (with small
differences) of sched.c, then the whole diff algorithm is broken and
clearly isn't doing copy-diffs right at all (right now it doesn't do it
since diff-tree doesn't even do the "check copies against old", but that's
a different issue).

In other words, how do you propose to _describe_ that copy?

The whole point to the "raw diff" format is that you should be able to 
get the same output with

	git-diff-tree -C HEAD | git-diff-helper
	git-diff-tree -C -p HEAD

and right now I don't see how you can do that.. And if you can't do that,
then the raw diff format is pointless. It's not "raw diff" at all, it's a
"less capable diff" format.

THAT is why I think the raw diff output is broken right now. All the rest 
was about just trying to also make it a bit more readable while at it.

Try it. I added in the "&& detect_rename < 2" thing to diff-tree.c, and I 
get:

	git-whatchanged -C --root | git-diff-helper | less -S

resulting in:

	diff-tree eb773598aa29bb642d6016a0d5961c80628ce490 (from 8c63fcad1f27ceed6825110aa806349dad99d89f)
	Author: Linus Torvalds <torvalds@ppc970.osdl.org>
	Date:   Sun May 22 13:17:40 2005 -0700
	    
	    duh2
	
	diff --git a/sched.c b/other.c
	--- a/sched.c
	+++ b/other.c
	@@ -1,5 +1,5 @@
	 /*
	- *  kernel/sched.c
	+ *  kernel/other.c
	  *
	  *  Kernel scheduler and related syscalls
	  *

ie it does NOT say it is a copy, because it _cannot_ say it is a copy. It 
just doesn't know. A copy and a rename look exactly the same in the raw 
diff output, and that's a BUG.

		Linus


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 20:24               ` Linus Torvalds
@ 2005-05-22 23:01                 ` Junio C Hamano
  2005-05-22 23:14                   ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-05-22 23:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Glanzmann, Git Mailing List

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> On Sun, 22 May 2005, Junio C Hamano wrote:
>> 
>> I deliberatly chose not to record rename/copy distinction in diff-raw
>> --- it is to be inferred from which entry touches the src _last_.  
>> Everything but the last one to touch the same path is copy, and the last
>> one is rename.

LT> My counter-example: there is no rename at all, just a copy.

My wording was wrong.  If you do not use the helper, you should
be able to get copy and in-place edit or no-modification (I
collectively call them "stay" in diffcore-rename.c).  If it does
not work then you have spotted a bug in the implementation but
not the design.  Anyway, I should have said:

    Everything but the last are copies.  If the last one have
    different src and dst, then it is a rename.  Otherwise it
    is a "stay".

LT> Try it. I added in the "&& detect_rename < 2" thing to diff-tree.c, and I 
LT> get:

LT> 	git-whatchanged -C --root | git-diff-helper | less -S

LT> resulting in:

That's not a counter-example.  You are agreeing to what I said in
this message:

    To: Linus Torvalds <torvalds@osdl.org>
    Cc: git@vger.kernel.org
    Subject: [PATCH] Teach diff-tree to report unmodified paths for -C option.
    Date: Sat, 21 May 2005 03:11:49 -0700
    Message-ID: <7vpsvkj3bu.fsf_-_@assigned-by-dhcp.cox.net>

    ...

    Another useless comment.  For obvious reasons, there is nothing
    we can do about the diff-helper to add "the other half of copy
    detection information", because what it can tell diff-core is
    limited to its input, which usually is just differences prepared
    by somebody else, and it cannot know anything about unchanged
    files.  When I started pushing '-p' flag to diff-tree family, I
    remember that your reaction was neutral to moderately negative
    ("I'd tolerate, although I think it is redundant and you are not
    even generating diff yourself anyway" as opposed to "That's just
    great").  I think now you would thank me for shoving the diff
    interface into them ;-).

If you want the diff-helper to be able to the full scale copy
detection, you must _feed_ the full information including "stay"
entries to it.  Unfortunately, the current diffcore interface
does not let the callers (diff-tree family) to tell it to keep
the "stay" entries in its output.  I've been working on that
part this morning before this discussion started, so that
electively they can tell the diffcore layer not to do the "stay"
pruning ("stay pruning" will simply become another diffcore
transformation).

I have to leave again now, but I promise you'll hear back from
me on this one later tonight (or tomorrow evening at the latest
if things do not work out).


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 23:01                 ` Junio C Hamano
@ 2005-05-22 23:14                   ` Linus Torvalds
  2005-05-23  0:35                     ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2005-05-22 23:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Glanzmann, Git Mailing List



On Sun, 22 May 2005, Junio C Hamano wrote:
> 
>     Everything but the last are copies.  If the last one have
>     different src and dst, then it is a rename.  Otherwise it
>     is a "stay".

If so, I disagree. Totally.

You seem to think that it's a feature that you can't get the same output 
out of git-diff-helper, and I think that's not a feature, but a total bug.

> LT> resulting in:
> 
> That's not a counter-example.  You are agreeing to what I said in
> this message:

No, I'm not agreeing at all. I'm saying that this is unacceptable, and if 
this was intentional, as you seem to be saying, then it was in my opinion 
a bad idea. We might as well go back to the original diff format, which 
had other problems, but they were no worse than the new one.

Basically, with the new format as-is, renames and copies cannot be 
described sanely. That was exactly the same problem as the old format had, 
except the old format was less verbose. So why do the new format at all?

I'm arguing that we should consider it a _requirement_ that "raw diffs" 
can be translated into te same thing the "-p" flag internally does.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-22 23:14                   ` Linus Torvalds
@ 2005-05-23  0:35                     ` Junio C Hamano
  2005-05-23  1:07                       ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-05-23  0:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Glanzmann, Git Mailing List

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> You seem to think that it's a feature that you can't get the same output 
LT> out of git-diff-helper, and I think that's not a feature, but a total bug.

I do agree that it is a real problem.

And I think you've filled some holes in the idea I originally
had while working on the following (the part you omitted from
the quote):

    ...  Unfortunately, the current diffcore interface does not
    let the callers (diff-tree family) to tell it to keep the
    "stay" entries in its output.  I've been working on that
    part this morning before this discussion started, so that
    electively they can tell the diffcore layer not to do the
    "stay" pruning ("stay pruning" will simply become another
    diffcore transformation).

The change I am making right now is essentially what the above
paragraph says, with your "hole-filling", to do something like
this:

 (1) diff-tree family feeds unmodified entries to the diffcore,
     in addition to changes.  diffcore-rename matches and merges
     them into rename/copy; diffcore-pickaxe filters; other
     diffcore transformations may later be written (e.g
     reordering the patch so that .h files come first before .c
     files).  The result of the sequence of these
     transformations is fed to the diff_flush() for output.
     Essentially this is the same architecture as we already
     have.

 (2) diff_flush() currently prunes _all_ the unmodified entries
     just before output happens.  This is a _bug_ as you said.
     To fix this, first, separate that pruning part (which is
     currently embedded in diff_flush_one as the first
     statement) out, so this pruning happens independently and
     before diff_flush() happens; and make the pruning more
     careful, not to prune entries that we _need_ to express
     rename/copy distinction in diff-raw output.  Namely, it
     should keep the unmodified entry for a path _if_ the path
     is to be used as the source of a rename or copy entry.
     Also diff-patch output needs to be taught to take notice
     and ignore such entries.

     Optionally, the callers may want to tell diffcore not to do
     _any_ pruning, making diff-raw output similar to a merge of
     ls-files --stage outputs of two trees.

LT> I'm arguing that we should consider it a _requirement_ that "raw diffs" 
LT> can be translated into te same thing the "-p" flag internally does.

To this, I 100% agree, and I think it is doable.  Give me a bit
more time for testing and tweaking.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-23  0:35                     ` Junio C Hamano
@ 2005-05-23  1:07                       ` Linus Torvalds
  2005-05-23  1:33                         ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2005-05-23  1:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Glanzmann, Git Mailing List



On Sun, 22 May 2005, Junio C Hamano wrote:
> 
>  (2) diff_flush() currently prunes _all_ the unmodified entries
>      just before output happens.  This is a _bug_ as you said.
>      To fix this, first, separate that pruning part


I don't understand why this is so hard.

The built-in diff knows to do the right thing. That means that the 
information is there, in xfrm_msg. 

Instead of making a big deal out of this, why not admit that "xfrm_msg" 
was a mistake, and it should be replaced with just a flag about what that 
thing is. Make the flag be "diff", "copy" or "rename". Add another field 
to say how good a match it was, and then let the people who now use 
xfrm_msg generate the string instead. 

The fact is, we alreadt _have_ the information, but it's been corrupted
into a string too early, making it hard to get at. No?

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] The diff-raw format updates.
  2005-05-23  1:07                       ` Linus Torvalds
@ 2005-05-23  1:33                         ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-23  1:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Thomas Glanzmann, Git Mailing List

Delaying the message generation to just before output is part of
the fix I've been working on, yes.  But there are some other
issues.  Give me some more time as I asked.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] Be careful with symlinks when detecting renames and copies.
  2005-05-22 17:04     ` Junio C Hamano
@ 2005-05-23  4:24       ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-23  4:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus, this is *not* based on the fixed rename/copy fix I am
going to send out shortly, but comes before that due to patch
dependency (the order I happened to have done things, that is).

Please apply this one before the rename/copy fix.

------------
Earlier round was not treating symbolic links carefully enough,
and would have produced diff output that renamed/copied then
edited the contents of a symbolic link, which made no practical
sense.  Change it to detect only pure renames.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

diffcore-rename.c              |   24 ++++++++------
t/t4004-diff-rename-symlink.sh |   66 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 80 insertions(+), 10 deletions(-)
new file (100755): t/t4004-diff-rename-symlink.sh

diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -20,7 +20,7 @@ static void diff_rename_pool_add(struct 
 				 struct diff_filespec *s)
 {
 	if (S_ISDIR(s->mode))
-		return;  /* rename/copy patch for tree does not make sense. */
+		return;  /* no trees, please */
 
 	if (pool->alloc <= pool->nr) {
 		pool->alloc = alloc_nr(pool->alloc);
@@ -71,6 +71,13 @@ static int estimate_similarity(struct di
 	unsigned long delta_size, base_size;
 	int score;
 
+	/* We deal only with regular files.  Symlink renames are handled
+	 * only when they are exact matches --- in other words, no edits
+	 * after renaming.
+	 */
+	if (!S_ISREG(src->mode) || !S_ISREG(dst->mode))
+		return 0;
+
 	delta_size = ((src->size < dst->size) ?
 		      (dst->size - src->size) : (src->size - dst->size));
 	base_size = ((src->size < dst->size) ? src->size : dst->size);
@@ -268,7 +275,7 @@ void diffcore_rename(int detect_rename, 
 		struct diff_filepair *p = q->queue[i];
 		if (!DIFF_FILE_VALID(p->one))
 			if (!DIFF_FILE_VALID(p->two))
-				continue; /* ignore nonsense */
+				continue; /* unmerged */
 			else
 				diff_rename_pool_add(&created, p->two);
 		else if (!DIFF_FILE_VALID(p->two))
@@ -360,12 +367,9 @@ void diffcore_rename(int detect_rename, 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *dp, *p = q->queue[i];
 		if (!DIFF_FILE_VALID(p->one)) {
-			if (DIFF_FILE_VALID(p->two)) {
-				/* creation */
-				dp = diff_queue(&outq, p->one, p->two);
-				dp->xfrm_work = 4;
-			}
-			/* otherwise it is a nonsense; just ignore it */
+			/* creation or unmerged entries */
+			dp = diff_queue(&outq, p->one, p->two);
+			dp->xfrm_work = 4;
 		}
 		else if (!DIFF_FILE_VALID(p->two)) {
 			/* deletion */
@@ -394,7 +398,7 @@ void diffcore_rename(int detect_rename, 
 	for (i = 0; i < outq.nr; i++) {
 		struct diff_filepair *p = outq.queue[i];
 		if (!DIFF_FILE_VALID(p->one)) {
-			/* created */
+			/* created or unmerged */
 			if (p->two->xfrm_flags & RENAME_DST_MATCHED)
 				; /* rename/copy created it already */
 			else
@@ -443,7 +447,7 @@ void diffcore_rename(int detect_rename, 
 		else
 			/* otherwise it is a modified (or stayed) entry */
 			diff_queue(q, p->one, p->two);
-		free(p);
+		diff_free_filepair(p);
 	}
 
 	free(outq.queue);
diff --git a/t/t4004-diff-rename-symlink.sh b/t/t4004-diff-rename-symlink.sh
new file mode 100755
--- /dev/null
+++ b/t/t4004-diff-rename-symlink.sh
@@ -0,0 +1,66 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='More rename detection tests.
+
+The rename detection logic should be able to detect pure rename or
+copy of symbolic links, but should not produce rename/copy followed
+by an edit for them.
+'
+. ./test-lib.sh
+
+test_expect_success \
+    'prepare reference tree' \
+    'echo xyzzy | tr -d '\\\\'012 >yomin &&
+     ln -s xyzzy frotz &&
+    git-update-cache --add frotz yomin &&
+    tree=$(git-write-tree) &&
+    echo $tree'
+
+test_expect_success \
+    'prepare work tree' \
+    'mv frotz rezrov &&
+     rm -f yomin &&
+     ln -s xyzzy nitfol &&
+     ln -s xzzzy bozbar &&
+    git-update-cache --add --remove frotz rezrov nitfol bozbar yomin'
+
+# tree has frotz pointing at xyzzy, and yomin that contains xyzzy to
+# confuse things.  work tree has rezrov (xyzzy) nitfol (xyzzy) and
+# bozbar (xzzzy).
+# rezrov and nitfol are rename/copy of frotz and bozbar should be
+# a new creation.
+
+GIT_DIFF_OPTS=--unified=0 git-diff-cache -M -p $tree >current
+cat >expected <<\EOF
+diff --git a/frotz b/nitfol
+similarity index 100%
+copy from frotz
+copy to nitfol
+diff --git a/frotz b/rezrov
+similarity index 100%
+rename old frotz
+rename new rezrov
+diff --git a/yomin b/yomin
+deleted file mode 100644
+--- a/yomin
++++ /dev/null
+@@ -1 +0,0 @@
+-xyzzy
+\ No newline at end of file
+diff --git a/bozbar b/bozbar
+new file mode 120000
+--- /dev/null
++++ b/bozbar
+@@ -0,0 +1 @@
++xzzzy
+\ No newline at end of file
+EOF
+
+test_expect_success \
+    'validate diff output' \
+    'diff -u current expected'
+
+test_done
------------------------------------------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] Rename/copy detection fix.
  2005-05-22 19:35             ` Junio C Hamano
  2005-05-22 20:24               ` Linus Torvalds
@ 2005-05-23  4:26               ` Junio C Hamano
  2005-05-23  4:38                 ` Comments on "Rename/copy detection fix." Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-05-23  4:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

The rename/copy detection logic in earlier round was only good
enough to show patch output and discussion on the mailing list
about the diff-raw format updates revealed many problems with
it.  This patch fixes all the ones known to me, without making
things I want to do later impossible, mostly related to patch
reordering.

 (1) Earlier rename/copy detector determined which one is rename
     and which one is copy too early, which made it impossible
     to later introduce diffcore transformers to reorder
     patches.  This patch fixes it by moving that logic to the
     very end of the processing.

 (2) Earlier output routine diff_flush() was pruning all the
     "no-change" entries indiscriminatingly.  This was done due
     to my false assumption that one of the requirements in the
     diff-raw output was not to show such an entry (which
     resulted in my incorrect comment about "diff-helper never
     being able to be equivalent to built-in diff driver").  My
     special thanks go to Linus for correcting me about this.
     When we produce diff-raw output, for the downstream to be
     able to tell renames from copies, sometimes it _is_
     necessary to output "no-change" entries, and this patch
     adds diffcore_prune() function for doing it.

 (3) Earlier diff_filepair structure was trying to be not too
     specific about rename/copy operations, but the purpose of
     the structure was to record one or two paths, which _was_
     indeed about rename/copy.  This patch discards xfrm_msg
     field which was trying to be generic for this wrong reason,
     and introduces a couple of fields (rename_score and
     rename_rank) that are explicitly specific to rename/copy
     logic.  One thing to note is that the information in a
     single diff_filepair structure _still_ does not distinguish
     renames from copies, and it is deliberately so.  This is to
     allow patches to be reordered in later stages.

 (4) This patch also adds some tests about diff-raw format
     output and makes sure that necessary "no-change" entries
     appear on the output.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

diff.c                   |  145 +++++++++++++++++++++++++++++++----------------
diffcore-pathspec.c      |    2 
diffcore-pickaxe.c       |    2 
diffcore-rename.c        |  112 +++++++++++-------------------------
diffcore.h               |   13 +++-
t/t4003-diff-rename-1.sh |   26 ++++----
t/t4005-diff-rename-2.sh |   82 ++++++++++++++++++++++++++
7 files changed, 244 insertions(+), 138 deletions(-)
new file (100644): t/t4005-diff-rename-2.sh

diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -283,12 +283,6 @@ int diff_populate_filespec(struct diff_f
 	return 0;
 }
 
-void diff_free_filepair(struct diff_filepair *p)
-{
-	free(p->xfrm_msg);
-	free(p);
-}
-
 void diff_free_filespec_data(struct diff_filespec *s)
 {
 	if (s->should_free)
@@ -501,9 +495,9 @@ struct diff_filepair *diff_queue(struct 
 	struct diff_filepair *dp = xmalloc(sizeof(*dp));
 	dp->one = one;
 	dp->two = two;
-	dp->xfrm_msg = NULL;
+	dp->score = 0;
 	dp->orig_order = queue->nr;
-	dp->xfrm_work = 0;
+	dp->rename_rank = 0;
 	diff_q(queue, dp);
 	return dp;
 }
@@ -522,23 +516,7 @@ static void diff_flush_raw(struct diff_f
 	       p->two->path, line_termination);
 }
 
-static void diff_flush_patch(struct diff_filepair *p)
-{
-	const char *name, *other;
-
-	name = p->one->path;
-	other = (strcmp(name, p->two->path) ? p->two->path : NULL);
-	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
-	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
-		return; /* no tree diffs in patch format */ 
-
-	if (DIFF_PAIR_UNMERGED(p))
-		run_external_diff(name, NULL, NULL, NULL, NULL);
-	else
-		run_external_diff(name, other, p->one, p->two, p->xfrm_msg);
-}
-
-static int uninteresting(struct diff_filepair *p)
+int diff_unmodified_pair(struct diff_filepair *p)
 {
 	/* This function is written stricter than necessary to support
 	 * the currently implemented transformers, but the idea is to
@@ -570,12 +548,70 @@ static int uninteresting(struct diff_fil
 	return 0;
 }
 
+static void diff_flush_patch(struct diff_filepair *p, const char *msg)
+{
+	const char *name, *other;
+
+	/* diffcore_prune() keeps "stay" entries for diff-raw
+	 * copy/rename detection, but when we are generating
+	 * patches we do not need them.
+	 */
+	if (diff_unmodified_pair(p))
+		return;
+
+	name = p->one->path;
+	other = (strcmp(name, p->two->path) ? p->two->path : NULL);
+	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
+	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
+		return; /* no tree diffs in patch format */ 
+
+	if (DIFF_PAIR_UNMERGED(p))
+		run_external_diff(name, NULL, NULL, NULL, NULL);
+	else
+		run_external_diff(name, other, p->one, p->two, msg);
+}
+
+int diff_needs_to_stay(struct diff_queue_struct *q, int i,
+		       struct diff_filespec *it)
+{
+	/* If it will be used in later entry (either stay or used
+	 * as the source of rename/copy), we need to copy, not rename.
+	 */
+	while (i < q->nr) {
+		struct diff_filepair *p = q->queue[i++];
+		if (!DIFF_FILE_VALID(p->two))
+			continue; /* removed is fine */
+		if (strcmp(p->one->path, it->path))
+			continue; /* not relevant */
+
+		/* p has its src set to *it and it is not a delete;
+		 * it will be used for in-place change, rename/copy,
+		 * or just stays there.  We cannot rename it out.
+		 */
+		return 1;
+	}
+	return 0;
+}
+
+static int diff_used_as_source(struct diff_queue_struct *q, int lim,
+			       struct diff_filespec *it)
+{
+	int i;
+	for (i = 0; i < lim; i++) {
+		struct diff_filepair *p = q->queue[i++];
+		if (!strcmp(p->one->path, it->path))
+			return 1;
+	}
+	return 0;
+}
+
 void diffcore_prune(void)
 {
 	/*
 	 * Although rename/copy detection wants to have "no-change"
 	 * entries fed into them, the downstream do not need to see
-	 * them.  This function removes such entries.
+	 * them, unless we had rename/copy for the same path earlier.
+	 * This function removes such entries.
 	 *
 	 * The applications that use rename/copy should:
 	 *
@@ -585,6 +621,7 @@ void diffcore_prune(void)
 	 * (3) call diffcore_prune
 	 * (4) call other diffcore_xxx that do not need to see
 	 *     "no-change" entries.
+	 * (5) call diff_flush().
 	 */
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct diff_queue_struct outq;
@@ -595,22 +632,21 @@ void diffcore_prune(void)
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
-		if (!uninteresting(p))
+		if (!diff_unmodified_pair(p) ||
+		    diff_used_as_source(q, i, p->one))
 			diff_q(&outq, p);
 		else
-			diff_free_filepair(p);
+			free(p);
 	}
 	free(q->queue);
 	*q = outq;
 	return;
 }
 
-static void diff_flush_one(struct diff_filepair *p)
+static void diff_flush_one(struct diff_filepair *p, const char *msg)
 {
-	if (uninteresting(p))
-		return;
 	if (generate_patch)
-		diff_flush_patch(p);
+		diff_flush_patch(p, msg);
 	else
 		diff_flush_raw(p);
 }
@@ -618,14 +654,7 @@ static void diff_flush_one(struct diff_f
 int diff_queue_is_empty(void)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
-	int i;
-
-	for (i = 0; i < q->nr; i++) {
-		struct diff_filepair *p = q->queue[i];
-		if (!uninteresting(p))
-			return 0;
-	}
-	return 1;
+	return q->nr == 0;
 }
 
 void diff_flush(int diff_output_style)
@@ -646,13 +675,35 @@ void diff_flush(int diff_output_style)
 		generate_patch = 1;
 		break;
 	}
-	for (i = 0; i < q->nr; i++)
-		diff_flush_one(q->queue[i]);
+	for (i = 0; i < q->nr; i++) {
+		char msg_[PATH_MAX*2+200], *msg = NULL;
+		struct diff_filepair *p = q->queue[i];
+		if (strcmp(p->one->path, p->two->path)) {
+			/* This is rename or copy.  Which one is it? */
+			if (diff_needs_to_stay(q, i+1, p->one)) {
+				sprintf(msg_,
+					"similarity index %d%%\n"
+					"copy from %s\n"
+					"copy to %s\n",
+					(int)(0.5 + p->score * 100/MAX_SCORE),
+					p->one->path, p->two->path);
+			}
+			else
+				sprintf(msg_,
+					"similarity index %d%%\n"
+					"rename old %s\n"
+					"rename new %s\n",
+					(int)(0.5 + p->score * 100/MAX_SCORE),
+					p->one->path, p->two->path);
+			msg = msg_;
+		}
+		diff_flush_one(p, msg);
+	}
+
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		diff_free_filespec_data(p->one);
 		diff_free_filespec_data(p->two);
-		free(p->xfrm_msg);
 		free(p);
 	}
 	free(q->queue);
@@ -674,10 +725,10 @@ void diff_addremove(int addremove, unsig
 	 * entries to the diff-core.  They will be prefixed
 	 * with something like '=' or '*' (I haven't decided
 	 * which but should not make any difference).
-	 * Feeding the same new and old to diff_change() should
-	 * also have the same effect.  diff_flush() should
-	 * filter uninteresting ones out at the final output
-	 * stage.
+	 * Feeding the same new and old to diff_change() 
+	 * also has the same effect.  diffcore_prune() should
+	 * be used to filter uninteresting ones out before the
+	 * final output happens.
 	 */
 	if (reverse_diff)
 		addremove = (addremove == '+' ? '-' :
diff --git a/diffcore-pathspec.c b/diffcore-pathspec.c
--- a/diffcore-pathspec.c
+++ b/diffcore-pathspec.c
@@ -55,7 +55,7 @@ void diffcore_pathspec(const char **path
 		    matches_pathspec(p->two->path, spec, speccnt))
 			diff_q(&outq, p);
 		else
-			diff_free_filepair(p);
+			free(p);
 	}
 	free(q->queue);
 	*q = outq;
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -48,7 +48,7 @@ void diffcore_pickaxe(const char *needle
 			 contains(p->two, needle, len))
 			diff_q(&outq, p);
 		if (onum == outq.nr)
-			diff_free_filepair(p);
+			free(p);
 	}
 	free(q->queue);
 	*q = outq;
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -119,27 +119,34 @@ static void record_rename_pair(struct di
 			       int rank,
 			       int score)
 {
-	/* The rank is used to sort the final output, because there
-	 * are certain dependencies.
+	/*
+	 * These ranks are used to sort the final output, because there
+	 * are certain dependencies:
 	 *
-	 *  - rank #0 depends on deleted ones.
-	 *  - rank #1 depends on kept files before they are modified.
-	 *  - rank #2 depends on kept files after they are modified;
-	 *    currently not used.
-	 *
-	 * Therefore, the final output order should be:
-	 *
-	 *  1. rank #0 rename/copy diffs.
+	 *  1. rename/copy that depends on deleted ones.
 	 *  2. deletions in the original.
-	 *  3. rank #1 rename/copy diffs.
-	 *  4. additions and modifications in the original.
-	 *  5. rank #2 rename/copy diffs; currently not used.
+	 *  3. rename/copy that depends on the pre-edit image of kept files.
+	 *  4. additions, modifications and no-modifications in the original.
+	 *  5. rename/copy that depends on the post-edit image of kept files
+	 *     (note that we currently do not detect such rename/copy).
 	 *
-	 * To achieve this sort order, we give xform_work the number
-	 * above.
+	 * The downstream diffcore transformers are free to reorder
+	 * the entries as long as they keep file pairs that has the
+	 * same p->one->path in earlier rename_rank to appear before
+	 * later ones.  This ordering is used by the diff_flush()
+	 * logic to tell renames from copies, and also used by the
+	 * diffcore_prune() logic to omit unnecessary
+	 * "no-modification" entries.
+	 *
+	 * To the final output routine, and in the diff-raw format
+	 * output, a rename/copy that is based on a path that has a
+	 * later entry that shares the same p->one->path and is not a
+	 * deletion is a copy.  Otherwise it is a rename.
 	 */
+
 	struct diff_filepair *dp = diff_queue(outq, src, dst);
-	dp->xfrm_work = (rank * 2 + 1) | (score<<RENAME_SCORE_SHIFT);
+	dp->rename_rank = rank * 2 + 1;
+	dp->score = score;
 	dst->xfrm_flags |= RENAME_DST_MATCHED;
 }
 
@@ -161,10 +168,8 @@ static void debug_filepair(const struct 
 {
 	debug_filespec(p->one, i, "one");
 	debug_filespec(p->two, i, "two");
-	fprintf(stderr, "pair flags %d, orig order %d, score %d\n",
-		(p->xfrm_work & ((1<<RENAME_SCORE_SHIFT) - 1)),
-		p->orig_order,
-		(p->xfrm_work >> RENAME_SCORE_SHIFT));
+	fprintf(stderr, "pair rank %d, orig order %d, score %d\n",
+		p->rename_rank, p->orig_order, p->score);
 }
 
 static void debug_queue(const char *msg, struct diff_queue_struct *q)
@@ -191,8 +196,8 @@ static int rank_compare(const void *a_, 
 {
 	const struct diff_filepair *a = *(const struct diff_filepair **)a_;
 	const struct diff_filepair *b = *(const struct diff_filepair **)b_;
-	int a_rank = a->xfrm_work & ((1<<RENAME_SCORE_SHIFT) - 1);
-	int b_rank = b->xfrm_work & ((1<<RENAME_SCORE_SHIFT) - 1);
+	int a_rank = a->rename_rank;
+	int b_rank = b->rename_rank;
 
 	if (a_rank != b_rank)
 		return a_rank - b_rank;
@@ -209,28 +214,6 @@ static int score_compare(const void *a_,
 	return b->score - a->score;
 }
 
-static int needs_to_stay(struct diff_queue_struct *q, int i,
-			 struct diff_filespec *it)
-{
-	/* If it will be used in later entry (either stay or used
-	 * as the source of rename/copy), we need to copy, not rename.
-	 */
-	while (i < q->nr) {
-		struct diff_filepair *p = q->queue[i++];
-		if (!DIFF_FILE_VALID(p->two))
-			continue; /* removed is fine */
-		if (strcmp(p->one->path, it->path))
-			continue; /* not relevant */
-
-		/* p has its src set to *it and it is not a delete;
-		 * it will be used for in-place change or rename/copy,
-		 * so we cannot rename it out.
-		 */
-		return 1;
-	}
-	return 0;
-}
-
 int diff_scoreopt_parse(const char *opt)
 {
 	int diglen, num, scale, i;
@@ -359,27 +342,24 @@ void diffcore_rename(int detect_rename, 
 	 * downstream, so we assign the sort keys in this loop.
 	 *
 	 * See comments at the top of record_rename_pair for numbers used
-	 * to assign xfrm_work.
-	 *
-	 * Note that we have not annotated the diff_filepair with any comment
-	 * so there is nothing other than p to free.
+	 * to assign rename_rank.
 	 */
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *dp, *p = q->queue[i];
 		if (!DIFF_FILE_VALID(p->one)) {
 			/* creation or unmerged entries */
 			dp = diff_queue(&outq, p->one, p->two);
-			dp->xfrm_work = 4;
+			dp->rename_rank = 4;
 		}
 		else if (!DIFF_FILE_VALID(p->two)) {
 			/* deletion */
 			dp = diff_queue(&outq, p->one, p->two);
-			dp->xfrm_work = 2;
+			dp->rename_rank = 2;
 		}
 		else {
 			/* modification, or stay as is */
 			dp = diff_queue(&outq, p->one, p->two);
-			dp->xfrm_work = 4;
+			dp->rename_rank = 4;
 		}
 		free(p);
 	}
@@ -415,39 +395,21 @@ void diffcore_rename(int detect_rename, 
 			/* rename or copy */
 			struct diff_filepair *dp =
 				diff_queue(q, p->one, p->two);
-			int msglen = (strlen(p->one->path) +
-				      strlen(p->two->path) + 100);
-			int score = (p->xfrm_work >> RENAME_SCORE_SHIFT);
-			dp->xfrm_msg = xmalloc(msglen);
+			dp->score = p->score;
 
 			/* if we have a later entry that is a rename/copy
 			 * that depends on p->one, then we copy here.
 			 * otherwise we rename it.
 			 */
-			if (needs_to_stay(&outq, i+1, p->one)) {
-				/* copy it */
-				sprintf(dp->xfrm_msg,
-					"similarity index %d%%\n"
-					"copy from %s\n"
-					"copy to %s\n",
-					(int)(0.5 + score * 100 / MAX_SCORE),
-					p->one->path, p->two->path);
-			}
-			else {
-				/* rename it, and mark it as gone. */
+			if (!diff_needs_to_stay(&outq, i+1, p->one))
+				/* this is the last one, so mark it as gone.
+				 */
 				p->one->xfrm_flags |= RENAME_SRC_GONE;
-				sprintf(dp->xfrm_msg,
-					"similarity index %d%%\n"
-					"rename old %s\n"
-					"rename new %s\n",
-					(int)(0.5 + score * 100 / MAX_SCORE),
-					p->one->path, p->two->path);
-			}
 		}
 		else
-			/* otherwise it is a modified (or stayed) entry */
+			/* otherwise it is a modified (or "stay") entry */
 			diff_queue(q, p->one, p->two);
-		diff_free_filepair(p);
+		free(p);
 	}
 
 	free(outq.queue);
diff --git a/diffcore.h b/diffcore.h
--- a/diffcore.h
+++ b/diffcore.h
@@ -41,13 +41,18 @@ extern void diff_free_filespec_data(stru
 struct diff_filepair {
 	struct diff_filespec *one;
 	struct diff_filespec *two;
-	char *xfrm_msg;
+	int score; /* only valid when one and two are different paths */
 	int orig_order; /* the original order of insertion into the queue */
-	int xfrm_work; /* for use by tramsformers, not by diffcore */
+	int rename_rank; /* rename/copy dependency needs to enforce
+			  * certain ordering of patches that later
+			  * diffcore transformations should not break.
+			  */
 };
 #define DIFF_PAIR_UNMERGED(p) \
 	(!DIFF_FILE_VALID((p)->one) && !DIFF_FILE_VALID((p)->two))
 
+extern int diff_unmodified_pair(struct diff_filepair *);
+
 struct diff_queue_struct {
 	struct diff_filepair **queue;
 	int alloc;
@@ -59,6 +64,8 @@ extern struct diff_filepair *diff_queue(
 					struct diff_filespec *,
 					struct diff_filespec *);
 extern void diff_q(struct diff_queue_struct *, struct diff_filepair *);
-extern void diff_free_filepair(struct diff_filepair *);
+
+extern int diff_needs_to_stay(struct diff_queue_struct *, int,
+			      struct diff_filespec *);
 
 #endif
diff --git a/t/t4003-diff-rename-1.sh b/t/t4003-diff-rename-1.sh
--- a/t/t4003-diff-rename-1.sh
+++ b/t/t4003-diff-rename-1.sh
@@ -11,7 +11,8 @@ test_description='More rename detection
 test_expect_success \
     'prepare reference tree' \
     'cat ../../COPYING >COPYING &&
-    git-update-cache --add COPYING &&
+     echo frotz >rezrov &&
+    git-update-cache --add COPYING rezrov &&
     tree=$(git-write-tree) &&
     echo $tree'
 
@@ -22,9 +23,10 @@ test_expect_success \
     rm -f COPYING &&
     git-update-cache --add --remove COPYING COPYING.?'
 
-# tree has COPYING.  work tree has COPYING.1 and COPYING.2,
-# both are slightly edited.  So we say you copy-and-edit one,
-# and rename-and-edit the other.
+# tree has COPYING and rezrov.  work tree has COPYING.1 and COPYING.2,
+# both are slightly edited, and unchanged rezrov.  So we say you
+# copy-and-edit one, and rename-and-edit the other.  We do not say
+# anything about rezrov.
 
 GIT_DIFF_OPTS=--unified=0 git-diff-cache -M -p $tree |
 sed -e 's/\([0-9][0-9]*\)/#/g' >current &&
@@ -64,9 +66,10 @@ test_expect_success \
     'mv COPYING.2 COPYING &&
      git-update-cache --add --remove COPYING COPYING.1 COPYING.2'
 
-# tree has COPYING.  work tree has COPYING and COPYING.1,
-# both are slightly edited.  So we say you edited one,
-# and copy-and-edit the other.
+# tree has COPYING and rezrov.  work tree has COPYING and COPYING.1,
+# both are slightly edited, and unchanged rezrov.  So we say you
+# edited one, and copy-and-edit the other.  We do not say
+# anything about rezrov.
 
 GIT_DIFF_OPTS=--unified=0 git-diff-cache -C -p $tree |
 sed -e 's/\([0-9][0-9]*\)/#/g' >current
@@ -103,10 +106,11 @@ test_expect_success \
     'cat ../../COPYING >COPYING &&
      git-update-cache --add --remove COPYING COPYING.1'
 
-# tree has COPYING.  work tree has the same COPYING and COPYING.1,
-# but COPYING is not edited.  We say you copy-and-edit COPYING.1;
-# this is only possible because -C mode now reports the unmodified
-# file to the diff-core.
+# tree has COPYING and rezrov.  work tree has COPYING and COPYING.1,
+# but COPYING is not edited.  We say you copy-and-edit COPYING.1; this
+# is only possible because -C mode now reports the unmodified file to
+# the diff-core.  Unchanged rezrov, although being fed to
+# git-diff-cache as well, should not be mentioned.
 
 GIT_DIFF_OPTS=--unified=0 git-diff-cache -C -p $tree |
 sed -e 's/\([0-9][0-9]*\)/#/g' >current
diff --git a/t/t4005-diff-rename-2.sh b/t/t4005-diff-rename-2.sh
new file mode 100644
--- /dev/null
+++ b/t/t4005-diff-rename-2.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='Same rename detection as t4003 but testing diff-raw.
+
+'
+. ./test-lib.sh
+
+test_expect_success \
+    'prepare reference tree' \
+    'cat ../../COPYING >COPYING &&
+     echo frotz >rezrov &&
+    git-update-cache --add COPYING rezrov &&
+    tree=$(git-write-tree) &&
+    echo $tree'
+
+test_expect_success \
+    'prepare work tree' \
+    'sed -e 's/HOWEVER/However/' <COPYING >COPYING.1 &&
+    sed -e 's/GPL/G.P.L/g' <COPYING >COPYING.2 &&
+    rm -f COPYING &&
+    git-update-cache --add --remove COPYING COPYING.?'
+
+# tree has COPYING and rezrov.  work tree has COPYING.1 and COPYING.2,
+# both are slightly edited, and unchanged rezrov.  We say COPYING.1
+# and COPYING.2 are based on COPYING, and do not say anything about
+# rezrov.
+
+git-diff-cache -M $tree >current
+
+cat >expected <<\EOF
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178	COPYING	COPYING.1
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 06c67961bbaed34a127f76d261f4c0bf73eda471	COPYING	COPYING.2
+EOF
+
+test_expect_success \
+    'validate output from rename/copy detection' \
+    'diff -u current expected'
+
+test_expect_success \
+    'prepare work tree again' \
+    'mv COPYING.2 COPYING &&
+     git-update-cache --add --remove COPYING COPYING.1 COPYING.2'
+
+# tree has COPYING and rezrov.  work tree has COPYING and COPYING.1,
+# both are slightly edited, and unchanged rezrov.  We say COPYING.1
+# is based on COPYING and COPYING is still there, and do not say anything
+# about rezrov.
+
+git-diff-cache -C $tree >current
+cat >expected <<\EOF
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178	COPYING	COPYING.1
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 06c67961bbaed34a127f76d261f4c0bf73eda471	COPYING	COPYING
+EOF
+
+test_expect_success \
+    'validate output from rename/copy detection' \
+    'diff -u current expected'
+
+test_expect_success \
+    'prepare work tree once again' \
+    'cat ../../COPYING >COPYING &&
+     git-update-cache --add --remove COPYING COPYING.1'
+
+# tree has COPYING and rezrov.  work tree has the same COPYING and
+# copy-edited COPYING.1, and unchanged rezrov.  We should see
+# unmodified COPYING in the output, so that downstream diff-helper can
+# notice.  We should not say anything about rezrov.
+
+git-diff-cache -C $tree >current
+cat >expected <<\EOF
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0603b3238a076dc6c8022aedc6648fa523a17178	COPYING	COPYING.1
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 6ff87c4664981e4397625791c8ea3bbb5f2279a3	COPYING	COPYING
+EOF
+
+test_expect_success \
+    'validate output from rename/copy detection' \
+    'diff -u current expected'
+
+test_done
------------------------------------------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Comments on "Rename/copy detection fix."
  2005-05-23  4:26               ` [PATCH] Rename/copy detection fix Junio C Hamano
@ 2005-05-23  4:38                 ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-05-23  4:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

I talked about patch reordering in the commit log without giving
specifics, because I did not think it is good to cast specifics
of uncoded ideas into stone in a commit log message.

But to throw it out open in the public, here are two of the
obvious ones.

 (1) Give users a way to specify patches about which files
     should come before which other files.  Things like "*.h
     files before *.c files", or "README and Documentation first
     before anything else".

 (2) Currently when pickaxe is used, it shows only diffs about
     the files involved in the change we looked for.  The user
     may sometimes want to see such a diff in the context of
     whole patches and I plan to add an option for doing so.  It
     may be useful in such a case to show the patch about the
     file that triggered the commit to be selected by the
     pickaxe first before patches for all other files.

These may or may not be useful, but this kind of reordering
would make it pretty much pointless to record which is a rename
and which is a copy in the diff_filepair structure in the
diff_queue.  It all depends on which one comes first and which
one comes later in the final ordering, which is something
diffcore-rename cannot (and should not) know about.



^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2005-05-23  4:37 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-05-21 23:12 updated design for the diff-raw format Junio C Hamano
2005-05-21 23:16 ` Junio C Hamano
2005-05-21 23:17 ` Junio C Hamano
2005-05-21 23:18 ` Junio C Hamano
2005-05-21 23:19 ` Junio C Hamano
2005-05-22  2:40 ` [PATCH] Prepare diffcore interface for diff-tree header supression Junio C Hamano
2005-05-22  2:42   ` [PATCH] The diff-raw format updates Junio C Hamano
2005-05-22  6:01     ` Linus Torvalds
2005-05-22  6:33       ` Junio C Hamano
2005-05-22  6:57       ` Junio C Hamano
2005-05-22  8:31         ` [PATCH] Fix tweak in similarity estimator Junio C Hamano
2005-05-22 18:35     ` [PATCH] The diff-raw format updates Linus Torvalds
2005-05-22 18:36       ` Niklas Hoglund
2005-05-22 19:15         ` Junio C Hamano
2005-05-22 18:42       ` Thomas Glanzmann
2005-05-22 19:05         ` Linus Torvalds
2005-05-22 19:05           ` Thomas Glanzmann
2005-05-22 19:20           ` Junio C Hamano
2005-05-22 19:35             ` Junio C Hamano
2005-05-22 20:24               ` Linus Torvalds
2005-05-22 23:01                 ` Junio C Hamano
2005-05-22 23:14                   ` Linus Torvalds
2005-05-23  0:35                     ` Junio C Hamano
2005-05-23  1:07                       ` Linus Torvalds
2005-05-23  1:33                         ` Junio C Hamano
2005-05-23  4:26               ` [PATCH] Rename/copy detection fix Junio C Hamano
2005-05-23  4:38                 ` Comments on "Rename/copy detection fix." Junio C Hamano
2005-05-22 19:13       ` [PATCH] The diff-raw format updates Junio C Hamano
2005-05-22  9:41   ` [PATCH] Diffcore updates Junio C Hamano
2005-05-22 16:40     ` Linus Torvalds
2005-05-22 16:47       ` Junio C Hamano
2005-05-22 17:04     ` Junio C Hamano
2005-05-23  4:24       ` [PATCH] Be careful with symlinks when detecting renames and copies Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).