Git development
* Re: More gitweb queries..
From: Thomas Glanzmann @ 2005-05-29 23:56 UTC
  To: Junio C Hamano, Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <20050529234606.GF12290@cip.informatik.uni-erlangen.de>

Hello,
here is the current version of my merge function. However, I have one
question left: when I do an automatic merge or a three-way merge,
should I set the head for the next round to the remote one, leave
it as the current one (this is the current behaviour), or make it
configurable? Note: at the moment the *local* head used for the merge-base is
only modified when the first head is set *or* on a fast-forward condition. I
have no clue. I am going to write the 'more on merging' chapter after I have
figured this out. ;-)

sub
merge
{

	my $message      = undef;
	my $head         = undef;
	my $last_tree    = undef;
	my $fh;
	my @heads        = ();

	foreach my $r (@_) {
		my $current_head = @{$r}[0];
		my $current_url  = @{$r}[1];

		print "current_head => $current_head\ncurrent_url => $current_url\n";

		push(@heads, '-p', ${current_head});

		if (! defined($last_tree)) {
			$message    = "=> ${current_url}\n";
			$head       = $current_head;
			$last_tree  = $current_head;
			
			if (@_ == 1) {
				head($head);
				return;
			}

			next;
		}

		my $merge_base = gitcmdout('git-merge-base', $head, $current_head)
				 || die ("no merge-base");
		chomp($merge_base);

		print "head => $head\nremote => $current_head\nbase => $merge_base\n";
	
		if ($merge_base eq $current_head) {
			$message .= "<= ${current_url} (nothing to merge)\n";

			$#heads -= 2;

			next;
		}

		if ($merge_base eq $head) {
			$message   .= "<= ${current_url} (bringing head ahead)\n";
			$head       = ${current_head};
			$last_tree  = ${current_head};

			$#heads -= 4;
			push(@heads, '-p', $current_head);

			next;
		}

		gitcmd('git-read-tree', '-m', $merge_base, $last_tree, $current_head);
		if (! defined($last_tree = write_tree())) {
			system('git-merge-cache', '-o', 'git-merge-one-file-script', '-a');
			if (! defined($last_tree = write_tree())) {
				# FIXME: Make manual intervention possible
				# --tg 23:11 05-05-29
				die("Couldn't merge automatically: Call 'git resolve'");
			}
			$message .= "<= ${current_url} (threeway merge)\n";

		} else {
			$message .= "<= ${current_url} (automatic merge)\n";
		}
	}

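	# @heads holds ('-p', <sha1>) pairs, so a single remaining head
	# means two elements; the sha1 sits at index 1.
	if (@heads == 2) {
		head($heads[1]);
		return;
	}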

	open($fh, "+>", undef);
	print $fh $message;
	seek($fh, 0, 0);
	$head = gitcmdinout($fh, 'git-commit-tree', $last_tree, @heads);
	chomp($head);
	close $fh;

	head($head);
	return;
}

	Thomas


* [PATCH 3/3] diff: code clean-up and removal of rename hack.
From: Junio C Hamano @ 2005-05-29 23:56 UTC
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7vmzqdiore.fsf_-_@assigned-by-dhcp.cox.net>

A new macro, DIFF_PAIR_RENAME(), is introduced to distinguish a
filepair that is a rename/copy (the definition of which is src
and dst are different paths, of course).  This removes the hack
used in the record_rename_pair() to always put a non-zero value
in the score field.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

diff.c            |    6 +++---
diffcore-rename.c |    2 +-
diffcore.h        |    6 +++---
3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -796,7 +796,7 @@ static void diff_resolve_rename_copy(voi
 			for (j = 0; j < q->nr; j++) {
 				pp = q->queue[j];
 				if (!strcmp(p->one->path, pp->one->path) &&
-				    pp->score) {
+				    DIFF_PAIR_RENAME(pp)) {
 					/* rename/copy are always valid
 					 * so we do not say DIFF_FILE_VALID()
 					 * on pp->one and pp->two.
@@ -815,7 +815,7 @@ static void diff_resolve_rename_copy(voi
 		 * whose both sides are valid and of the same type, i.e.
 		 * either in-place edit or rename/copy edit.
 		 */
-		else if (p->score) {
+		else if (DIFF_PAIR_RENAME(p)) {
 			if (p->source_stays) {
 				p->status = 'C';
 				continue;
@@ -828,7 +828,7 @@ static void diff_resolve_rename_copy(voi
 				pp = q->queue[j];
 				if (strcmp(pp->one->path, p->one->path))
 					continue; /* not us */
-				if (!pp->score)
+				if (!DIFF_PAIR_RENAME(pp))
 					continue; /* not a rename/copy */
 				/* pp is a rename/copy from the same source */
 				p->status = 'C';
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -207,7 +207,7 @@ static void record_rename_pair(struct di
 	fill_filespec(two, dst->sha1, dst->mode);
 
 	dp = diff_queue(renq, one, two);
-	dp->score = score ? : 1; /* make sure it is at least 1 */
+	dp->score = score;
 	dp->source_stays = rename_src[src_index].src_stays;
 	rename_dst[dst_index].pair = dp;
 }
diff --git a/diffcore.h b/diffcore.h
--- a/diffcore.h
+++ b/diffcore.h
@@ -39,15 +39,15 @@ extern void diff_free_filespec_data(stru
 struct diff_filepair {
 	struct diff_filespec *one;
 	struct diff_filespec *two;
-	unsigned short int score; /* only valid when one and two are
-				   * different paths
-				   */
+	unsigned short int score;
 	char source_stays; /* all of R/C are copies */
 	char status; /* M C R N D U (see Documentation/diff-format.txt) */
 };
 #define DIFF_PAIR_UNMERGED(p) \
 	(!DIFF_FILE_VALID((p)->one) && !DIFF_FILE_VALID((p)->two))
 
+#define DIFF_PAIR_RENAME(p) (strcmp((p)->one->path, (p)->two->path))
+
 #define DIFF_PAIR_TYPE_CHANGED(p) \
 	((S_IFMT & (p)->one->mode) != (S_IFMT & (p)->two->mode))
 


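
To make the effect of the new macro concrete, here is a minimal sketch in C. The two structs are trimmed-down stand-ins for the ones in diffcore.h (only the fields used here are kept), and is_rename_or_copy() is a hypothetical helper added for illustration; only DIFF_PAIR_RENAME() itself is taken from the patch.

```c
#include <string.h>

/* Trimmed-down stand-ins for the diffcore.h structures (assumption:
 * only the fields needed for this illustration are kept). */
struct diff_filespec {
	const char *path;
};

struct diff_filepair {
	struct diff_filespec *one;
	struct diff_filespec *two;
	unsigned short score;
};

/* Same definition as in the patch: non-zero when the two paths differ. */
#define DIFF_PAIR_RENAME(p) (strcmp((p)->one->path, (p)->two->path))

/* Hypothetical helper: a pair counts as rename/copy purely because its
 * paths differ, regardless of what the score field holds. */
static int is_rename_or_copy(const struct diff_filepair *p)
{
	return DIFF_PAIR_RENAME(p) != 0;
}
```

With this definition a pair whose score is 0 is still recognized as a rename/copy as long as its two paths differ, which is why record_rename_pair() no longer has to force the score to at least 1.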

* Re: Problem with cg-diff <file>
From: Junio C Hamano @ 2005-05-30  0:19 UTC
  To: Petr Baudis; +Cc: Marcel Holtmann, GIT Mailing List
In-Reply-To: <20050529233840.GY1036@pasky.ji.cz>

>>>>> "PB" == Petr Baudis <pasky@ucw.cz> writes:

PB> ... git-diff-tree (in contrast to
PB> git-diff-cache) won't take the pathspec as its trailing arguments,

PB> Junio, is there any specific reason for that, or is the end of
PB> git-diff-tree argument list the right spot for the pathspec stuff?

Baffled.  Are you at Linus tip?

Linus correctly decided that diff-tree does not have to call
diffcore_pathspec(), which may be what confused you to make the
comment "... in contrast to git-diff-cache) won't take ...".
But it does not call it only because it does not need to.  It
filters the filepairs itself on the input side using the
trailing arguments; since diffcore_pathspec filters as the first
one in the chain as the input filter, calling it from diff-tree
would not cull anything further.

Here is what I am getting from the Linus tip binary, between my
HEAD and Linus tip:

$ git-diff-tree -r linus HEAD   >.all    ;# everything
$ git-diff-tree -r linus HEAD t >.t-only ;# limiting to the test suite
$ wc -l .all .t-only                     ;# count results
  12 .all
   3 .t-only
  15 total
$ cat .t-only				 ;# show what's in "t" output
:100644 100644 a51985518251f6c3f677438c3cb51b9716c20296 5ac29d1f98438d3530bbc8b0bdaced58200aca37 M	t/t4005-diff-rename-2.sh
:100644 100644 518892b90c7cbb3fb193d6bfb622046aff0f4431 76ae7201f0d19b7933ca44958b7c468193ec9778 M	t/t4007-rename-3.sh
:000000 100755 0000000000000000000000000000000000000000 01d276692669f2241471b8ad611b17d51e2a98ab N	t/t4009-diff-rename-4.sh
$ head -n 1 .all                         ;# prove that "t" filtered. 
:100644 100644 f85a605f0a336f506cf5cf46476a43e4c56b3e66 1d92a01a02543e55d0feb3541ee594fbc638136c M	Documentation/diff-format.txt
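
The input-side filtering described above can be pictured with a small C sketch. This is not the diff-tree code; path_under_spec() and path_is_interesting() are hypothetical names, and the matching rule (a pathspec selects itself and everything below it as a directory) is a simplifying assumption.

```c
#include <string.h>

/* Hypothetical: returns 1 if path is spec itself or lies below the
 * directory named by spec, 0 otherwise. */
static int path_under_spec(const char *path, const char *spec)
{
	size_t len = strlen(spec);

	if (strncmp(path, spec, len))
		return 0;	/* spec is not a prefix of path */
	/* Accept an exact match, a match at a directory boundary, or a
	 * spec that already ends in a slash. */
	return path[len] == '\0' || path[len] == '/' ||
	       (len > 0 && spec[len - 1] == '/');
}

/* Hypothetical: with no pathspec every path passes; otherwise a path
 * passes if at least one spec selects it. */
static int path_is_interesting(const char *path, const char **specs, int nr)
{
	int i;

	if (nr == 0)
		return 1;
	for (i = 0; i < nr; i++)
		if (path_under_spec(path, specs[i]))
			return 1;
	return 0;
}
```

Under these assumptions the pathspec "t" keeps t/t4005-diff-rename-2.sh and drops Documentation/diff-format.txt, matching the .t-only output shown above. Because the filepairs are culled before they enter the queue, a later pathspec filter would have nothing left to remove.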



* Re: [PATCH] Add -O<orderfile> option to diff-* brothers.
From: Junio C Hamano @ 2005-05-30  0:23 UTC
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0505291154030.10545@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

I realize I did not answer your question.

LT> In other words: what is the problem this is trying to solve?

To produce a patch that is easier to review, using customized
patch order list for projects.  I envision that Porcelain
noticing the existence of ${GIT-.git}/patch-order file and
adding -O to its diff-* argument would make the world a better
place.



* Re: Problem with cg-diff <file>
From: Petr Baudis @ 2005-05-30  0:32 UTC
  To: Junio C Hamano; +Cc: Marcel Holtmann, GIT Mailing List
In-Reply-To: <7vis11ftvm.fsf@assigned-by-dhcp.cox.net>

Dear diary, on Mon, May 30, 2005 at 02:19:09AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> told me that...
> >>>>> "PB" == Petr Baudis <pasky@ucw.cz> writes:
> 
> PB> ... git-diff-tree (in contrast to
> PB> git-diff-cache) won't take the pathspec as its trailing arguments,
> 
> PB> Junio, is there any specific reason for that, or is the end of
> PB> git-diff-tree argument list the right spot for the pathspec stuff?
> 
> Baffled.  Are you at Linus tip?
> 
> Linus correctly decided that diff-tree does not have to call
> diffcore_pathspec(), which may be what confused you to make the
> comment "... in contrast to git-diff-cache) won't take ...".
> But it does not call it only because it does not need to.  It
> filters the filepairs itself on the input side using the
> trailing arguments; since diffcore_pathspec filters as the first
> one in the chain as the input filter, calling it from diff-tree
> would not cull anything further.

Ok, so this is what you get when you mix: sleepiness, performing only
mental experiments not verified in practice, and inattentive reading of
the code.

I'm sorry for bothering. Instruct yourself from my bad example, please.
:-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: More gitweb queries..
From: Junio C Hamano @ 2005-05-30  0:50 UTC
  To: Thomas Glanzmann; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <20050529235630.GG12290@cip.informatik.uni-erlangen.de>

Instead of inflicting a Perl script on us, maybe writing a
textual specification of what you want it to do would help to
clarify your thinking and help us understand the problem you are
trying to describe a lot better.  I think Linus publicly stated
he does not do Perl much.  I am OK with Perl but I'd rather
answer questions posed in a more reader-friendly manner, rather
than having to guess what the caller is expected to give this
"merge" sub, which you do not document well.

I think I've already asked you something quite similar when you
posted another part of your script for parsing the new diff-raw
format, to which I responded with something like: "Without knowing
how this sub is supposed to be called, I think you are stripping
a leading colon from a filename if there is one".  Anyhow.

Are you trying to implement an Octopus-capable N-way merger?
If so, the way I would do it would be something like this:

 - Accept N parameters, which are heads being merged.

 - Sanity check that given heads are commits, and N <= 16.

 - Initialize a set, HTM (heads to be merged), to contain all of
   the supplied heads.

 - Remove one commit from HTM, call it H0.

 - Initialize a variable, BASE, with H0.  This variable
   determines the base of the merge in the commit topology.

 - Initialize a variable, T, with tree associated with H0.  This
   variable holds the "current intermediate merge result" tree.

 - While HTM is not empty, loop over the following:

   - Remove one commit out of HTM; call it H1.

   - MB = git-merge-base BASE H1;

   - If MB is either BASE or H1, then you have a fast forward.
     Take either BASE or H1 that is not MB and update variable
     BASE with it, and update variable T with the tree
     associated with it.  Continue with the loop (i.e. Perl
     "next").

   - Run your usual read-tree -m MB T H1 and git-merge-cache; as
     Linus explained, if this step ends up involving any
     non-trivial merges, you should not do an Octopus.  So in
     such a case, if HTM is not empty yet, barf (i.e. Perl
     "die", or at least "last").

   - Do not touch your ${GIT-.git}/HEAD in any way at this
     moment.

   - Update variable T with git-write-tree of the resolved cache
     contents.

   - Update variable BASE with MB.

   - Continue with the loop. 

 - We exited the loop by now.  HTM being empty means that T has
   the result of N-way merge.  Create a single commit object
   that has all the commits you have merged as its parents, and
   register T as its associated tree.  I would imagine recording
   that commit in ${GIT-.git}/HEAD is what the user usually
   wants, but there may be use cases where that is not
   appropriate (I do not do Porcelain so I do not know).



* Re: More gitweb queries..
From: Junio C Hamano @ 2005-05-30  0:57 UTC
  To: Thomas Glanzmann; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <7vsm05bkps.fsf@assigned-by-dhcp.cox.net>

>>>>> "JCH" == Junio C Hamano <junkio@cox.net> writes:

JCH>    - If MB is either BASE or H1, then you have a fast forward.
JCH>      Take either BASE or H1 that is not MB and update variable
JCH>      BASE with it, and update variable T with the tree
JCH>      associated with it.  Continue with the loop (i.e. Perl
JCH>      "next").

Chuck this part please.  I was not thinking.



* Re: More gitweb queries..
From: Thomas Glanzmann @ 2005-05-30  1:30 UTC
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <7vsm05bkps.fsf@assigned-by-dhcp.cox.net>

Hello,
okay let me try again.

I have a function merge which gets a sorted array of heads. The number of
heads is unlimited at this point because some of the heads can be included
in other heads (they're a subset) and so they don't show up in the commit
object. I call this array MERGE_HEADS.

Note: If I pull into an empty tree (no HEAD) there is only one head in
this array which corresponds to the remote_head. Otherwise the first
element is *always* the local HEAD.

After that I start looping over MERGE_HEADS. The first thing I
have to do is take the first element out of this array and save it
for later reference; I call this 'head'. I also have to push this head onto
another array called COMMIT_HEADS, which will be used to create the final
commit object later on. The push is done on every loop pass. next;

Note: If I leave the loop because there are no more MERGE_HEADS to
work on and my COMMIT_HEADS array consists of only *one* member, I don't
create a COMMIT object but save that member as the new HEAD, because we're
in a fast-forward condition (this could be pulling into an empty tree, or
having many fast-forward objects where the remote is ahead of or included
in the current 'head'). If, on the contrary, I have *more* than one object,
I call commit-tree with COMMIT_HEADS as arguments and save the new head
returned from this call.

Now I start processing the second HEAD from MERGE_HEADS (CURRENT_HEAD). I
use merge_base to find the MERGE_BASE. If MERGE_BASE == head then the
remote is ahead of us (fast-forward condition), so CURRENT_HEAD becomes
the new head and I delete the element before the last one from
COMMIT_HEADS (but leave CURRENT_HEAD in COMMIT_HEADS). next;
If MERGE_BASE == CURRENT_HEAD then CURRENT_HEAD is already included in
our history, so there is nothing to do except take it out of COMMIT_HEADS
again. next; If it is neither a fast-forward nor an already-included case,
we do an automatic/threeway/manual merge and save the resulting tree for
the next automatic/threeway/manual merge that may follow, and of course
leave CURRENT_HEAD in COMMIT_HEADS. FIXME: Do we need to update
our 'head' to the REMOTE_HEAD? next;

Oh and of course the sanity check: I can't commit-tree more than 16
parents at a time. (16 is of course the define mentioned by Linus
before).

That's it.

	Thomas
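
A minimal C sketch of the three cases described above, for readers who do not want to read the Perl. The names classify_merge(), record_parent() and MAX_PARENTS are hypothetical; the order of the checks and the COMMIT_HEADS bookkeeping mirror the posted Perl sub, not a verified merge strategy, and commit ids are plain strings here.

```c
#include <string.h>

#define MAX_PARENTS 16	/* the commit-tree parent limit mentioned above */

enum merge_kind {
	MERGE_NOTHING,		/* remote is already part of our history */
	MERGE_FAST_FORWARD,	/* our head is an ancestor of the remote */
	MERGE_THREEWAY		/* histories diverged, a real merge is needed */
};

/* Classify one round from the merge base, the local head and the
 * remote head, checking the cases in the same order as the Perl. */
static enum merge_kind classify_merge(const char *base, const char *head,
				      const char *remote)
{
	if (!strcmp(base, remote))
		return MERGE_NOTHING;
	if (!strcmp(base, head))
		return MERGE_FAST_FORWARD;
	return MERGE_THREEWAY;
}

/* Update the parent list (COMMIT_HEADS) for one round.  Returns the new
 * parent count, or -1 if the list is empty on a fast forward or the
 * 16-parent limit would be exceeded. */
static int record_parent(const char *parents[], int n,
			 enum merge_kind kind, const char *remote)
{
	switch (kind) {
	case MERGE_NOTHING:
		return n;			/* do not record the remote */
	case MERGE_FAST_FORWARD:
		if (n == 0)
			return -1;
		parents[n - 1] = remote;	/* remote replaces the last entry */
		return n;
	default:
		if (n >= MAX_PARENTS)
			return -1;
		parents[n] = remote;		/* genuine additional parent */
		return n + 1;
	}
}
```

If the list still holds a single entry after all rounds, no commit object is needed and that entry becomes the new HEAD; otherwise the entries are the parents handed to commit-tree. Whether 'head' should advance after a three-way merge is the open question from the mail and is deliberately left out of the sketch.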


* Re: More gitweb queries..
From: Thomas Glanzmann @ 2005-05-30  1:33 UTC
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <7voeatbkey.fsf@assigned-by-dhcp.cox.net>

Hello,

> JCH>    - If MB is either BASE or H1, then you have a fast forward.
> JCH>      Take either BASE or H1 that is not MB and update variable
> JCH>      BASE with it, and update variable T with the tree
> JCH>      associated with it.  Continue with the loop (i.e. Perl
> JCH>      "next").

> Chuck this part please.  I was not thinking.

No, I won't, because I think this is exactly what I have to do here. And
yes, there can be fast forwards. :-)

But note: On a fast-forward condition we have to remove an element from
COMMIT_HEADS: the last one or the one before it (last - 1), depending on
whether 'local' == MERGE_BASE or 'remote' == MERGE_BASE.

	Thomas


* [PATCH / RESEND] mkdelta enhancements (take 2)
From: Nicolas Pitre @ 2005-05-30  1:52 UTC
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.62.0505271511450.16151@localhost.localdomain>


Linus, please apply.

The current version of mkdelta has the potential to corrupt a repository
and I'd like that problem to go away ASAP before it causes too much
damage.

==========

Although it was described as such, git-mkdelta didn't really attempt to
find the best delta against any previous object in the list, but was
only able to create a delta against the preceding object.  This patch
reworks the code to fix that limitation and hopefully makes it a bit
clearer than before, including fixing the delta loop detection, which was
broken.

This means that

	git-mkdelta sha1 sha2 sha3 sha4 sha5 sha6

will now create a sha2 delta against sha1, a sha3 delta against either
sha2 or sha1 and keep the best one, a sha4 delta against either sha3,
sha2 or sha1, etc.  The --max-behind argument limits that search for the
best delta to the specified number of previous objects in the list.  If
no limit is specified it is unlimited (note: it might run out of 
memory with long object lists).

Also added a -q (quiet) switch so it is possible to have 3 levels of
output: -q for nothing, -v for verbose, and if neither -q nor -v is
specified then only actual changes to the object database are shown.

Finally the git-deltafy-script has been updated accordingly, and some 
bugs fixed (thanks to Stephen C. Tweedie for spotting them).

This version has been thoroughly tested and I think it is ready
for public consumption.

Signed-off-by: Nicolas Pitre <nico@cam.org>

diff --git a/git-deltafy-script b/git-deltafy-script
old mode 100644
new mode 100755
--- a/git-deltafy-script
+++ b/git-deltafy-script
@@ -1,40 +1,67 @@
 #!/bin/bash
 
-# Script to deltafy an entire GIT repository based on the commit list.
+# Example script to deltafy an entire GIT repository based on the commit list.
 # The most recent version of a file is the reference and previous versions
 # are made delta against the best earlier version available. And so on for
-# successive versions going back in time.  This way the delta overhead is
-# pushed towards older version of any given file.
-#
-# NOTE: the "best earlier version" is not implemented in mkdelta yet
-#       and therefore only the next eariler version is used at this time.
-#
-# TODO: deltafy tree objects as well.
+# successive versions going back in time.  This way the increasing delta
+# overhead is pushed towards older versions of any given file.
 #
 # The -d argument allows to provide a limit on the delta chain depth.
-# If 0 is passed then everything is undeltafied.
+# If 0 is passed then everything is undeltafied.  Limiting the delta
+# depth is meaningful for subsequent access performance to old revisions.
+# A value of 16 might be a good compromize between performance and good
+# space saving.  Current default is unbounded.
+#
+# The --max-behind=30 argument is passed to git-mkdelta so to keep
+# combinations and memory usage bounded a bit.  If you have lots of memory
+# and CPU power you may remove it (or set to 0) to let git-mkdelta find the
+# best delta match regardless of the number of revisions for a given file.
+# You can also make the value smaller to make it faster and less
+# memory hungry.  A value of 5 ought to still give pretty good results.
+# When set to 0 or ommitted then look behind is unbounded.  Note that
+# git-mkdelta might die with a segmentation fault in that case if it
+# runs out of memory.  Note that the GIT repository will still be consistent
+# even if git-mkdelta dies unexpectedly.
 
 set -e
 
 depth=
 [ "$1" == "-d" ] && depth="--max-depth=$2" && shift 2
 
+function process_list() {
+	if [ "$list" ]; then
+		echo "Processing $curr_file"
+		echo "$head $list" | xargs git-mkdelta $depth --max-behind=30 -v
+	fi
+}
+
 curr_file=""
 
 git-rev-list HEAD |
-git-diff-tree -r --stdin |
-awk '/^:/ { if ($5 == "M" || $5 == "N") print $4, $6 }' |
+git-diff-tree -r -t --stdin |
+awk '/^:/ { if ($5 == "M" || $5 == "N") print $4, $6;
+            if ($5 == "M") print $3, $6 }' |
 LC_ALL=C sort -s -k 2 | uniq |
 while read sha1 file; do
 	if [ "$file" == "$curr_file" ]; then
 		list="$list $sha1"
 	else
-		if [ "$list" ]; then
-			echo "Processing $curr_file"
-			echo "$head $list" | xargs git-mkdelta $depth -v
-		fi
+		process_list
 		curr_file="$file"
 		list=""
 		head="$sha1"
 	fi
 done
+process_list
+
+curr_file="root directory"
+head=""
+list="$(
+	git-rev-list HEAD |
+	while read commit; do
+		git-cat-file commit $commit |
+		sed -n 's/tree //p;Q'
+	done
+	)"
+process_list
+
diff --git a/mkdelta.c b/mkdelta.c
--- a/mkdelta.c
+++ b/mkdelta.c
@@ -98,21 +98,16 @@ static void *create_delta_object(char *b
 	return create_object(buf, len, hdr, hdrlen, size);
 }
 
-static unsigned long get_object_size(unsigned char *sha1)
-{
-	struct stat st;
-	if (stat(sha1_file_name(sha1), &st))
-		die("%s: %s", sha1_to_hex(sha1), strerror(errno));
-	return st.st_size;
-}
-
-static void *get_buffer(unsigned char *sha1, char *type, unsigned long *size)
+static void *get_buffer(unsigned char *sha1, char *type,
+			unsigned long *size, unsigned long *compsize)
 {
 	unsigned long mapsize;
 	void *map = map_sha1_file(sha1, &mapsize);
 	if (map) {
 		void *buffer = unpack_sha1_file(map, mapsize, type, size);
 		munmap(map, mapsize);
+		if (compsize)
+			*compsize = mapsize;
 		if (buffer)
 			return buffer;
 	}
@@ -120,198 +115,249 @@ static void *get_buffer(unsigned char *s
 	return NULL;
 }
 
-static void *expand_delta(void *delta, unsigned long delta_size, char *type,
-			  unsigned long *size, unsigned int *depth, char *head)
+static void *expand_delta(void *delta, unsigned long *size, char *type,
+			  unsigned int *depth, unsigned char **links)
 {
 	void *buf = NULL;
-	*depth++;
-	if (delta_size < 20) {
+	unsigned int level = (*depth)++;
+	if (*size < 20) {
 		error("delta object is bad");
 		free(delta);
 	} else {
 		unsigned long ref_size;
-		void *ref = get_buffer(delta, type, &ref_size);
+		void *ref = get_buffer(delta, type, &ref_size, NULL);
 		if (ref && !strcmp(type, "delta"))
-			ref = expand_delta(ref, ref_size, type, &ref_size,
-					   depth, head);
-		else
-			memcpy(head, delta, 20);
-		if (ref)
-			buf = patch_delta(ref, ref_size, delta+20,
-					  delta_size-20, size);
-		free(ref);
+			ref = expand_delta(ref, &ref_size, type, depth, links);
+		else if (ref)
+		{
+			*links = xmalloc(*depth * 20);
+		}
+		if (ref) {
+			buf = patch_delta(ref, ref_size, delta+20, *size-20, size);
+			free(ref);
+			if (buf)
+				memcpy(*links + level*20, delta, 20);
+			else
+				free(*links);
+		}
 		free(delta);
 	}
 	return buf;
 }
 
 static char *mkdelta_usage =
-"mkdelta [ --max-depth=N ] <reference_sha1> <target_sha1> [ <next_sha1> ... ]";
+"mkdelta [--max-depth=N] [--max-behind=N] <reference_sha1> <target_sha1> [<next_sha1> ...]";
 
+struct delta {
+	unsigned char sha1[20];		/* object sha1 */
+	unsigned long size;		/* object size */
+	void *buf;			/* object content */
+	unsigned char *links;		/* delta reference links */
+	unsigned int depth;		/* delta depth */
+};
+	
 int main(int argc, char **argv)
 {
-	unsigned char sha1_ref[20], sha1_trg[20], head_ref[20], head_trg[20];
-	char type_ref[20], type_trg[20];
-	void *buf_ref, *buf_trg, *buf_delta;
-	unsigned long size_ref, size_trg, size_orig, size_delta;
-	unsigned int depth_ref, depth_trg, depth_max = -1;
-	int i, verbose = 0;
+	struct delta *ref, trg;
+	char ref_type[20], trg_type[20], *skip_reason;
+	void *best_buf;
+	unsigned long best_size, orig_size, orig_compsize;
+	unsigned int r, orig_ref, best_ref, nb_refs, next_ref, max_refs = 0;
+	unsigned int i, duplicate, skip_lvl, verbose = 0, quiet = 0;
+	unsigned int max_depth = -1;
 
 	for (i = 1; i < argc; i++) {
 		if (!strcmp(argv[i], "-v")) {
 			verbose = 1;
+			quiet = 0;
+		} else if (!strcmp(argv[i], "-q")) {
+			quiet = 1;
+			verbose = 0;
 		} else if (!strcmp(argv[i], "-d") && i+1 < argc) {
-			depth_max = atoi(argv[++i]);
+			max_depth = atoi(argv[++i]);
 		} else if (!strncmp(argv[i], "--max-depth=", 12)) {
-			depth_max = atoi(argv[i]+12);
+			max_depth = atoi(argv[i]+12);
+		} else if (!strcmp(argv[i], "-b") && i+1 < argc) {
+			max_refs = atoi(argv[++i]);
+		} else if (!strncmp(argv[i], "--max-behind=", 13)) {
+			max_refs = atoi(argv[i]+13);
 		} else
 			break;
 	}
 
-	if (i + (depth_max != 0) >= argc)
+	if (i + (max_depth != 0) >= argc)
 		usage(mkdelta_usage);
 
-	if (get_sha1(argv[i], sha1_ref))
-		die("bad sha1 %s", argv[i]);
-	depth_ref = 0;
-	buf_ref = get_buffer(sha1_ref, type_ref, &size_ref);
-	if (buf_ref && !strcmp(type_ref, "delta"))
-		buf_ref = expand_delta(buf_ref, size_ref, type_ref,
-				       &size_ref, &depth_ref, head_ref);
-	else
-		memcpy(head_ref, sha1_ref, 20);
-	if (!buf_ref)
-		die("unable to obtain initial object %s", argv[i]);
-
-	if (depth_ref > depth_max) {
-		if (restore_original_object(buf_ref, size_ref, type_ref, sha1_ref))
-			die("unable to restore %s", argv[i]);
-		if (verbose)
-			printf("undelta %s (depth was %d)\n", argv[i], depth_ref);
-		depth_ref = 0;
-	}
-
-	/*
-	 * TODO: deltafication should be tried against any early object
-	 * in the object list and not only the previous object.
-	 */
+	if (!max_refs || max_refs > argc - i)
+		max_refs = argc - i;
+	ref = xmalloc(max_refs * sizeof(*ref));
+	for (r = 0; r < max_refs; r++)
+		ref[r].buf = ref[r].links = NULL;
+	next_ref = nb_refs = 0;
 
-	while (++i < argc) {
-		if (get_sha1(argv[i], sha1_trg))
+	do {
+		if (get_sha1(argv[i], trg.sha1))
 			die("bad sha1 %s", argv[i]);
-		depth_trg = 0;
-		buf_trg = get_buffer(sha1_trg, type_trg, &size_trg);
-		if (buf_trg && !size_trg) {
+		trg.buf = get_buffer(trg.sha1, trg_type, &trg.size, &orig_compsize);
+		if (trg.buf && !trg.size) {
 			if (verbose)
 				printf("skip    %s (object is empty)\n", argv[i]);
 			continue;
 		}
-		size_orig = size_trg;
-		if (buf_trg && !strcmp(type_trg, "delta")) {
-			if (!memcmp(buf_trg, sha1_ref, 20)) {
-				/* delta already in place */
-				depth_ref++;
-				memcpy(sha1_ref, sha1_trg, 20);
-				buf_ref = patch_delta(buf_ref, size_ref,
-						      buf_trg+20, size_trg-20,
-						      &size_ref);
-				if (!buf_ref)
-					die("unable to apply delta %s", argv[i]);
-				if (depth_ref > depth_max) {
-					if (restore_original_object(buf_ref, size_ref,
-								    type_ref, sha1_ref))
-						die("unable to restore %s", argv[i]);
-					if (verbose)
-						printf("undelta %s (depth was %d)\n", argv[i], depth_ref);
-					depth_ref = 0;
-					continue;
-				}
-				if (verbose)
-					printf("skip    %s (delta already in place)\n", argv[i]);
-				continue;
+		orig_size = trg.size;
+		orig_ref = -1;
+		trg.depth = 0;
+		trg.links = NULL;
+		if (trg.buf && !strcmp(trg_type, "delta")) {
+			for (r = 0; r < nb_refs; r++)
+				if (!memcmp(trg.buf, ref[r].sha1, 20))
+					break;
+			if (r < nb_refs) {
+				/* no need to reload the reference object */
+				trg.depth = ref[r].depth + 1;
+				trg.links = xmalloc(trg.depth*20);
+				memcpy(trg.links, trg.buf, 20);
+				memcpy(trg.links+20, ref[r].links, ref[r].depth*20);
+				trg.buf = patch_delta(ref[r].buf, ref[r].size,
+						      trg.buf+20, trg.size-20,
+						      &trg.size);
+				strcpy(trg_type, ref_type);
+				orig_ref = r;
+			} else {
+				trg.buf = expand_delta(trg.buf, &trg.size, trg_type,
+						       &trg.depth, &trg.links);
 			}
-			buf_trg = expand_delta(buf_trg, size_trg, type_trg,
-					       &size_trg, &depth_trg, head_trg);
-		} else
-			memcpy(head_trg, sha1_trg, 20);
-		if (!buf_trg)
-			die("unable to read target object %s", argv[i]);
-
-		if (depth_trg > depth_max) {
-			if (restore_original_object(buf_trg, size_trg, type_trg, sha1_trg))
-				die("unable to restore %s", argv[i]);
-			if (verbose)
-				printf("undelta %s (depth was %d)\n", argv[i], depth_trg);
-			depth_trg = 0;
-			size_orig = size_trg;
 		}
+		if (!trg.buf)
+			die("unable to read target object %s", argv[i]);
 
-		if (depth_max == 0)
-			goto skip;
-
-		if (strcmp(type_ref, type_trg))
+		if (!nb_refs) {
+			strcpy(ref_type, trg_type);
+		} else if (max_depth && strcmp(ref_type, trg_type)) {
 			die("type mismatch for object %s", argv[i]);
-
-		if (!size_ref) {
-			if (verbose)
-				printf("skip    %s (initial object is empty)\n", argv[i]);
-			goto skip;
-		}
-		
-		if (depth_ref + 1 > depth_max) {
-			if (verbose)
-				printf("skip    %s (exceeding max link depth)\n", argv[i]);
-			goto skip;
 		}
 
-		if (!memcmp(head_ref, sha1_trg, 20)) {
-			if (verbose)
-				printf("skip    %s (would create a loop)\n", argv[i]);
-			goto skip;
+		duplicate = 0;
+		best_buf = NULL;
+		best_size = -1;
+		best_ref = -1;
+		skip_lvl = 0;
+		skip_reason = NULL;
+		for (r = 0; max_depth && r < nb_refs; r++) {
+			void *delta_buf, *comp_buf;
+			unsigned long delta_size, comp_size;
+			unsigned int l;
+
+			duplicate = !memcmp(trg.sha1, ref[r].sha1, 20);
+			if (duplicate) {
+				skip_reason = "already seen";
+				break;
+			}
+			if (ref[r].depth >= max_depth) {
+				if (skip_lvl < 1) {
+					skip_reason = "exceeding max link depth";
+					skip_lvl = 1;
+				}
+				continue;
+			}
+			for (l = 0; l < ref[r].depth; l++)
+				if (!memcmp(trg.sha1, ref[r].links + l*20, 20))
+					break;
+			if (l != ref[r].depth) {
+				if (skip_lvl < 2) {
+					skip_reason = "would create a loop";
+					skip_lvl = 2;
+				}
+				continue;
+			}
+			if (trg.depth < max_depth && r == orig_ref) {
+				if (skip_lvl < 3) {
+					skip_reason = "delta already in place";
+					skip_lvl = 3;
+				}
+				continue;
+			}
+			delta_buf = diff_delta(ref[r].buf, ref[r].size,
+					       trg.buf, trg.size, &delta_size);
+			if (!delta_buf)
+				die("out of memory");
+			if (trg.depth < max_depth &&
+			    delta_size+20 >= orig_size) {
+				/* no need to even try to compress if original
+				   object is smaller than this delta */
+				free(delta_buf);
+				if (skip_lvl < 4) {
+					skip_reason = "no size reduction";
+					skip_lvl = 4;
+				}
+				continue;
+			}
+			comp_buf = create_delta_object(delta_buf, delta_size,
+						       ref[r].sha1, &comp_size);
+			if (!comp_buf)
+				die("out of memory");
+			free(delta_buf);
+			if (trg.depth < max_depth &&
+			    comp_size >= orig_compsize) {
+				free(comp_buf);
+				if (skip_lvl < 5) {
+					skip_reason = "no size reduction";
+					skip_lvl = 5;
+				}
+				continue;
+			}
+			if ((comp_size < best_size) ||
+			    (comp_size == best_size &&
+			     ref[r].depth < ref[best_ref].depth)) {
+				free(best_buf);
+				best_buf = comp_buf;
+				best_size = comp_size;
+				best_ref = r;
+			}
 		}
 
-		buf_delta = diff_delta(buf_ref, size_ref, buf_trg, size_trg, &size_delta);
-		if (!buf_delta)
-			die("out of memory");
-
-		/* no need to even try to compress if original
-		   uncompressed is already smaller */
-		if (size_delta+20 < size_orig) {
-			void *buf_obj;
-			unsigned long size_obj;
-			buf_obj = create_delta_object(buf_delta, size_delta,
-						      sha1_ref, &size_obj);
-			free(buf_delta);
-			size_orig = get_object_size(sha1_trg);
-			if (size_obj >= size_orig) {
-				free(buf_obj);
-				if (verbose)
-					printf("skip    %s (original is smaller)\n", argv[i]);
-				goto skip;
-			}
-			if (replace_object(buf_obj, size_obj, sha1_trg))
+		if (best_buf) {
+			if (replace_object(best_buf, best_size, trg.sha1))
 				die("unable to write delta for %s", argv[i]);
-			free(buf_obj);
-			depth_ref++;
-			if (verbose)
-				printf("delta   %s (size=%ld.%02ld%%, depth=%d)\n",
-				       argv[i], size_obj*100 / size_orig,
-				       (size_obj*10000 / size_orig)%100,
-				       depth_ref);
-		} else {
-			free(buf_delta);
-			if (verbose)
-				printf("skip    %s (original is smaller)\n", argv[i]);
-			skip:
-			depth_ref = depth_trg;
-			memcpy(head_ref, head_trg, 20);
+			free(best_buf);
+			free(trg.links);
+			trg.depth = ref[best_ref].depth + 1;
+			trg.links = xmalloc(trg.depth*20);
+			memcpy(trg.links, ref[best_ref].sha1, 20);
+			memcpy(trg.links+20, ref[best_ref].links, ref[best_ref].depth*20);
+			if (!quiet)
+				printf("delta   %s (size=%ld.%02ld%% depth=%d dist=%d)\n",
+				       argv[i], best_size*100 / orig_compsize,
+				       (best_size*10000 / orig_compsize)%100,
+				       trg.depth,
+				       (next_ref - best_ref + max_refs)
+				       % (max_refs + 1) + 1);
+		} else if (trg.depth > max_depth) {
+			if (restore_original_object(trg.buf, trg.size, trg_type, trg.sha1))
+				die("unable to restore %s", argv[i]);
+			if (!quiet)
+				printf("undelta %s (depth was %d)\n",
+				       argv[i], trg.depth);
+			trg.depth = 0;
+			free(trg.links);
+			trg.links = NULL;
+		} else if (skip_reason && verbose) {
+			printf("skip    %s (%s)\n", argv[i], skip_reason);
 		}
 
-		free(buf_ref);
-		buf_ref = buf_trg;
-		size_ref = size_trg;
-		memcpy(sha1_ref, sha1_trg, 20);
-	}
+		if (!duplicate) {
+			free(ref[next_ref].buf);
+			free(ref[next_ref].links);
+			ref[next_ref] = trg;
+			if (++next_ref > nb_refs)
+				nb_refs = next_ref;
+			if (next_ref == max_refs)
+				next_ref = 0;
+		} else {
+			free(trg.buf);
+			free(trg.links);
+		}
+	} while (++i < argc);
 
 	return 0;
 }


* [PATCH] cg-pull: summarize the number of pulled objects
From: Jonas Fonseca @ 2005-05-30  1:56 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Show cg-pull progress by summarizing the very verbose output of the pull
backends into a continuously updated line specifying the number of
objects which have already been pulled.
		     
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
---

Straight from the bloat department, perhaps, but it is nice to not have
the terminal backlog ruined and the object count is quite nice too. :)

Interestingly, it counts 4950 objects when pulling over rsync and 4454
objects when pulling locally. I didn't test HTTP pulling other than to see
if the "got <sha>" lines were matched correctly.
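A side note on the pull_http and pull_local hunks in this patch: in POSIX
shell, `2>&1 /dev/null` merges stderr into stdout and then hands /dev/null to
the command as one more argument; nothing is discarded. The sketch below
makes that visible with a hypothetical stand-in function, show_args, which is
not part of Cogito. Whether the extra argument matters to the real pull
backends has not been checked here.

```shell
# show_args is a hypothetical stand-in for git-http-pull / git-local-pull:
# it reports its argument count on stdout and writes one line to stderr.
show_args () {
	echo "argc=$#"
	echo "progress line" >&2
}

# As written in the patch: /dev/null is an ordinary word, so it becomes
# a second argument, while stderr is merged into stdout.
with_word=$( (show_args x 2>&1 /dev/null) )

# Probably what was intended: only merge stderr into the pipe.
merged=$( (show_args x 2>&1) )

echo "$with_word"
echo "$merged"
```

If that reading is right, dropping the trailing /dev/null would be enough for
pull_progress to see the "got <sha>" lines.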

 cg-pull |   29 ++++++++++++++++++++++++++---
 1 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/cg-pull b/cg-pull
--- a/cg-pull
+++ b/cg-pull
@@ -29,6 +29,29 @@ if echo "$uri" | grep -q '#'; then
 	uri=$(echo $uri | cut -d '#' -f 1)
 fi
 
+pull_progress() {
+	objects=0
+	last_objects=0
+
+	while read line; do
+		case "$line" in
+		link*| symlink*| \
+		[a-f0-9][a-f0-9]/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]*| \
+		"got "[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]*)
+			objects=$(($objects + 1));
+			echo -ne "Pulling objects: $objects\r"
+			;;
+		*)
+			if [ "$last_objects" != "$objecst" ]; then
+				last_objects=$objects
+				echo;
+			fi
+			echo "$line"
+			;;
+		esac 
+	done;
+	[ "$last_objects" != "$objecst" ] && echo
+}
 
 fetch_rsync () {
 	redir=
@@ -62,7 +85,7 @@ fetch_rsync () {
 }
 
 pull_rsync () {
-	fetch_rsync -s -u -d "$2/objects" "$_git_objects"
+	fetch_rsync -s -u -d "$2/objects" "$_git_objects" | pull_progress
 }
 
 
@@ -107,7 +130,7 @@ fetch_http () {
 }
 
 pull_http () {
-	git-http-pull -a -v "$(cat "$_git/refs/heads/$1")" "$2/"
+	(git-http-pull -a -v "$(cat "$_git/refs/heads/$1")" "$2/" 2>&1 /dev/null) | pull_progress
 }
 
 
@@ -170,7 +193,7 @@ fetch_local () {
 }
 
 pull_local () {
-	git-local-pull -a -l -v "$(cat "$_git/refs/heads/$1")" "$2"
+	(git-local-pull -a -l -v "$(cat "$_git/refs/heads/$1")" "$2" 2>&1 /dev/null) | pull_progress
 }
 
 if echo "$uri" | grep -q "^http://"; then
-- 
Jonas Fonseca


* [PATCH] Cleanup cogito command usage reporting
From: Jonas Fonseca @ 2005-05-30  2:36 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Add usage utility function which uses the recently introduced USAGE variables.
Minor fix and improvement of cg-clone documentation.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
---

 cg-Xlib       |    3 +++
 cg-add        |    2 +-
 cg-branch-add |    2 +-
 cg-clone      |    5 +++--
 cg-export     |    2 +-
 cg-merge      |    4 ++--
 cg-rm         |    2 +-
 cg-tag        |    2 +-
 8 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/cg-Xlib b/cg-Xlib
--- a/cg-Xlib
+++ b/cg-Xlib
@@ -17,6 +17,9 @@ die () {
 	exit 1
 }
 
+usage() {
+	die "usage: $USAGE"
+}
 
 mktemp () {
 	if [ ! "$BROKEN_MKTEMP" ]; then
diff --git a/cg-add b/cg-add
--- a/cg-add
+++ b/cg-add
@@ -22,7 +22,7 @@ USAGE="cg-add FILE..."
 
 . ${COGITO_LIB}cg-Xlib
 
-[ "$1" ] || die "usage: cg-add FILE..."
+[ "$1" ] || usage
 
 TMPFILE=$(mktemp -t gitadd.XXXXXX)
 find "$@" -type f > $TMPFILE || die "not all files exist, nothing added"
diff --git a/cg-branch-add b/cg-branch-add
--- a/cg-branch-add
+++ b/cg-branch-add
@@ -42,7 +42,7 @@ USAGE="cg-branch-add BRANCH LOCATION"
 name=$1
 location=$2
 
-([ "$name" ] && [ "$location" ]) || die "usage: cg-branch-add NAME SOURCE_LOC"
+([ "$name" ] && [ "$location" ]) || usage
 (echo $name | egrep -qv '[^a-zA-Z0-9_.@!:-]') || \
 	die "name contains invalid characters"
 if [ "$name" = "this" ] || [ "$name" = "HEAD" ]; then
diff --git a/cg-clone b/cg-clone
--- a/cg-clone
+++ b/cg-clone
@@ -15,11 +15,12 @@
 # -------
 # -s::
 #	Clone in the current directory instead of creating a new one.
+#	Specifying both -s and a destination directory makes no sense.
 #
 # -h, --help::
 #	Print usage help
 
-USAGE="cg-clone [-s] LOCATION [<directory>]"
+USAGE="cg-clone [-s] LOCATION [DESTINATION]"
 
 . ${COGITO_LIB}cg-Xlib
 
@@ -30,7 +31,7 @@ if [ "$1" = "-s" ]; then
 fi
 
 location=$1
-[ "$location" ] || die "usage: cg-clone [-s] SOURCE_LOC [DESTDIR]"
+[ "$location" ] || usage
 location=${location%/}
 
 destdir=$2
diff --git a/cg-export b/cg-export
--- a/cg-export
+++ b/cg-export
@@ -22,7 +22,7 @@ USAGE="cg-export DESTINATION [TREE]"
 dest=$1
 id=$(tree-id $2)
 
-([ "$dest" ] && [ "$id" ]) || die "usage: cg-export DEST [TREE_ID]"
+([ "$dest" ] && [ "$id" ]) || usage
 
 [ -e "$dest" ] && die "$dest already exists."
 
diff --git a/cg-merge b/cg-merge
--- a/cg-merge
+++ b/cg-merge
@@ -38,11 +38,11 @@ fi
 base=
 if [ "$1" = "-b" ]; then
 	shift
-	[ "$1" ] || die "usage: cg-merge [-c] [-b BASE_ID] FROM_ID"
+	[ "$1" ] || usage
 	base=$(commit-id "$1") || exit 1; shift
 fi
 
-[ "$1" ] || die "usage: cg-merge [-c] [-b BASE_ID] FROM_ID"
+[ "$1" ] || usage
 branchname="$1"
 branch=$(commit-id "$branchname") || exit 1
 
diff --git a/cg-rm b/cg-rm
--- a/cg-rm
+++ b/cg-rm
@@ -15,7 +15,7 @@ USAGE="cg-rm FILE..."
 
 . ${COGITO_LIB}cg-Xlib
 
-[ "$1" ] || die "usage: cg-rm FILE..."
+[ "$1" ] || usage
 
 rm -f "$@"
 git-update-cache --remove -- "$@"
diff --git a/cg-tag b/cg-tag
--- a/cg-tag
+++ b/cg-tag
@@ -20,7 +20,7 @@ USAGE="cg-tag TAG [REVISION]"
 name=$1
 id=$2
 
-[ "$name" ] || die "usage: cg-tag TNAME [COMMIT_ID]"
+[ "$name" ] || usage
 [ "$id" ] || id=$(commit-id)
 
 (echo $name | egrep -qv '[^a-zA-Z0-9_.@!:-]') || \
-- 
Jonas Fonseca


* Re: [PATCH] Cleanup cogito command usage reporting
From: Jonas Fonseca @ 2005-05-30  3:05 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20050530023603.GC10715@diku.dk>

[ Sorry for the resend. This should bring all usage strings in sync. I
  forgot to add the new options. ]

 - Synchronize usage strings with those in cg-help. The command
   identifiers are still not as descriptive, though.
 - Add usage utility function which uses the recently introduced USAGE
   variables.
 - Minor fix and improvement of cg-clone documentation.
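The usage helper described in the list above can be sketched on its own as
follows. This is a minimal illustration, not the Cogito source: the body of
die is an assumption (the patch context only shows its final `exit 1`), and
cg-example and check_args are hypothetical names.

```shell
# Assumed shape of die: print a message to stderr and exit non-zero.
die () {
	echo "error: $*" >&2
	exit 1
}

# usage reports whatever the calling script stored in USAGE.
usage () {
	die "usage: $USAGE"
}

# Each script sets USAGE once, before sourcing the library.
USAGE="cg-example FILE..."

# Hypothetical argument check in the style the patch introduces.
check_args () {
	[ "$1" ] || usage
	echo "ok: $*"
}

# Run in subshells so that the exit inside die does not end this script.
(check_args one two)
(check_args) || echo "exit status: $?"
```

Keeping the string in one variable means the synopsis in the header comment,
the USAGE assignment, and the error message cannot drift apart as easily as
the hand-copied strings did.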

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>

---

 cg-Xlib       |    3 +++
 cg-add        |    2 +-
 cg-branch-add |    2 +-
 cg-clone      |    5 +++--
 cg-commit     |    4 ++--
 cg-diff       |    2 +-
 cg-export     |    2 +-
 cg-help       |    2 +-
 cg-log        |    2 +-
 cg-merge      |    4 ++--
 cg-mkpatch    |    2 +-
 cg-rm         |    2 +-
 cg-tag        |    2 +-
 13 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/cg-Xlib b/cg-Xlib
--- a/cg-Xlib
+++ b/cg-Xlib
@@ -17,6 +17,9 @@ die () {
 	exit 1
 }
 
+usage() {
+	die "usage: $USAGE"
+}
 
 mktemp () {
 	if [ ! "$BROKEN_MKTEMP" ]; then
diff --git a/cg-add b/cg-add
--- a/cg-add
+++ b/cg-add
@@ -22,7 +22,7 @@ USAGE="cg-add FILE..."
 
 . ${COGITO_LIB}cg-Xlib
 
-[ "$1" ] || die "usage: cg-add FILE..."
+[ "$1" ] || usage
 
 TMPFILE=$(mktemp -t gitadd.XXXXXX)
 find "$@" -type f > $TMPFILE || die "not all files exist, nothing added"
diff --git a/cg-branch-add b/cg-branch-add
--- a/cg-branch-add
+++ b/cg-branch-add
@@ -42,7 +42,7 @@ USAGE="cg-branch-add BRANCH LOCATION"
 name=$1
 location=$2
 
-([ "$name" ] && [ "$location" ]) || die "usage: cg-branch-add NAME SOURCE_LOC"
+([ "$name" ] && [ "$location" ]) || usage
 (echo $name | egrep -qv '[^a-zA-Z0-9_.@!:-]') || \
 	die "name contains invalid characters"
 if [ "$name" = "this" ] || [ "$name" = "HEAD" ]; then
diff --git a/cg-clone b/cg-clone
--- a/cg-clone
+++ b/cg-clone
@@ -15,11 +15,12 @@
 # -------
 # -s::
 #	Clone in the current directory instead of creating a new one.
+#	Specifying both -s and a destination directory makes no sense.
 #
 # -h, --help::
 #	Print usage help
 
-USAGE="cg-clone [-s] LOCATION [<directory>]"
+USAGE="cg-clone [-s] LOCATION [DESTINATION]"
 
 . ${COGITO_LIB}cg-Xlib
 
@@ -30,7 +31,7 @@ if [ "$1" = "-s" ]; then
 fi
 
 location=$1
-[ "$location" ] || die "usage: cg-clone [-s] SOURCE_LOC [DESTDIR]"
+[ "$location" ] || usage
 location=${location%/}
 
 destdir=$2
diff --git a/cg-commit b/cg-commit
--- a/cg-commit
+++ b/cg-commit
@@ -20,7 +20,7 @@
 #	Note, this is used internally by Cogito when merging. This option
 #	does not make sense when files are given on the command line.
 #
-# -m<message>::
+# -mMESSAGE::
 #	Specify the commit message, which is used instead of starting
 #	up an editor (if the input is not `stdin`, the input is appended
 #	after all the '-m' messages). Multiple '-m' parameters are appended
@@ -75,7 +75,7 @@
 # EDITOR::
 #	The editor used for entering revision log information.
 
-USAGE="cg-commit [-m<message>]... [-C] [-e | -E] [FILE]..."
+USAGE="cg-commit [-mMESSAGE]... [-C] [-e | -E] [FILE]..."
 
 . ${COGITO_LIB}cg-Xlib
 
diff --git a/cg-diff b/cg-diff
--- a/cg-diff
+++ b/cg-diff
@@ -30,7 +30,7 @@
 # -h, --help::
 #	Print usage help.
 
-USAGE="cg-diff [-p] [-r REVISION[:REVISION]] [FILE]..."
+USAGE="cg-diff [-c] [-m] [-p] [-r REVISION[:REVISION]] [FILE]..."
 
 . ${COGITO_LIB}cg-Xlib
 
diff --git a/cg-export b/cg-export
--- a/cg-export
+++ b/cg-export
@@ -22,7 +22,7 @@ USAGE="cg-export DESTINATION [TREE]"
 dest=$1
 id=$(tree-id $2)
 
-([ "$dest" ] && [ "$id" ]) || die "usage: cg-export DEST [TREE_ID]"
+([ "$dest" ] && [ "$id" ]) || usage
 
 [ -e "$dest" ] && die "$dest already exists."
 
diff --git a/cg-help b/cg-help
--- a/cg-help
+++ b/cg-help
@@ -43,7 +43,7 @@ Available commands:
 	cg-export	DEST [TREE_ID]
 	cg-help		[COMMAND]
 	cg-init
-	cg-log		[-c] [-f] [-m] [-r FROM_ID[:TO_ID]] [FILE]...
+	cg-log		[-c] [-f] [-m] [-uUSERNAME] [-r FROM_ID[:TO_ID]] [FILE]...
 	cg-ls		[TREE_ID]
 	cg-merge	[-c] [-b BASE_ID] FROM_ID
 	cg-mkpatch	[-m] [-s] [-r FROM_ID[:TO_ID]]
diff --git a/cg-log b/cg-log
--- a/cg-log
+++ b/cg-log
@@ -58,7 +58,7 @@
 #
 #	$ cg-log -r releasetag-0.9:releasetag-0.10
 
-USAGE="cg-log [-c] [-f] [-uUSERNAME] [-r REVISION[:REVISION]] FILE..."
+USAGE="cg-log [-c] [-f] [-m] [-uUSERNAME] [-r REVISION[:REVISION]] FILE..."
 
 . ${COGITO_LIB}cg-Xlib
 # Try to fix the annoying "Broken pipe" output. May not help, but apparently
diff --git a/cg-merge b/cg-merge
--- a/cg-merge
+++ b/cg-merge
@@ -38,11 +38,11 @@ fi
 base=
 if [ "$1" = "-b" ]; then
 	shift
-	[ "$1" ] || die "usage: cg-merge [-c] [-b BASE_ID] FROM_ID"
+	[ "$1" ] || usage
 	base=$(commit-id "$1") || exit 1; shift
 fi
 
-[ "$1" ] || die "usage: cg-merge [-c] [-b BASE_ID] FROM_ID"
+[ "$1" ] || usage
 branchname="$1"
 branch=$(commit-id "$branchname") || exit 1
 
diff --git a/cg-mkpatch b/cg-mkpatch
--- a/cg-mkpatch
+++ b/cg-mkpatch
@@ -37,7 +37,7 @@
 # the line
 # `!-------------------------------------------------------------flip-`
 
-USAGE="cg-mkpatch [-s] [-r REVISION[:REVISION]]"
+USAGE="cg-mkpatch [-m] [-s] [-r REVISION[:REVISION]]"
 
 . ${COGITO_LIB}cg-Xlib
 
diff --git a/cg-rm b/cg-rm
--- a/cg-rm
+++ b/cg-rm
@@ -15,7 +15,7 @@ USAGE="cg-rm FILE..."
 
 . ${COGITO_LIB}cg-Xlib
 
-[ "$1" ] || die "usage: cg-rm FILE..."
+[ "$1" ] || usage
 
 rm -f "$@"
 git-update-cache --remove -- "$@"
diff --git a/cg-tag b/cg-tag
--- a/cg-tag
+++ b/cg-tag
@@ -20,7 +20,7 @@ USAGE="cg-tag TAG [REVISION]"
 name=$1
 id=$2
 
-[ "$name" ] || die "usage: cg-tag TNAME [COMMIT_ID]"
+[ "$name" ] || usage
 [ "$id" ] || id=$(commit-id)
 
 (echo $name | egrep -qv '[^a-zA-Z0-9_.@!:-]') || \
-- 
Jonas Fonseca


* Re: change of git-diff-tree and symlinks
From: Sebastian Kuzminsky @ 2005-05-30  3:17 UTC (permalink / raw)
  To: Jochen Roemling; +Cc: Kay Sievers, Git Mailing List
In-Reply-To: <4299E88E.7090306@roemling.net>

Jochen Roemling <jochen@roemling.net> wrote:
> Sebastian, could you include a matching gitweb.cgi into your 
> Debian-Package?


Looks like gitweb's already packaged for Debian.  Andres Salomon is
doing it:

    http://marc.theaimsgroup.com/?l=git&m=111661740226054&w=2




-- 
Sebastian Kuzminsky


* Re: [PATCH] cg-pull: summarize the number of pulled objects
From: Frank Sorenson @ 2005-05-30  3:48 UTC (permalink / raw)
  To: Jonas Fonseca; +Cc: Petr Baudis, git
In-Reply-To: <20050530015650.GB10715@diku.dk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jonas Fonseca wrote:
> +			if [ "$last_objects" != "$objecst" ]; then
                                                 ^^^^^^^^
Did you mean 'objects' ???

> +				last_objects=$objects
> +				echo;
> +			fi
> +			echo "$line"
> +			;;
> +		esac 
> +	done;
> +	[ "$last_objects" != "$objecst" ] && echo
                               ^^^^^^^
Here too?

Frank
- --
Frank Sorenson - KD7TZK
Systems Manager, Computer Science Department
Brigham Young University
frank@tuxrocks.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCmo0haI0dwg4A47wRAkJHAJ98i7xUZZVd3rzXdlor9f5+Lly7SgCfc4qK
dXkHJegPZLxP3CKzvm7SFHM=
=sKlv
-----END PGP SIGNATURE-----
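
For reference, a sketch of pull_progress with the two misspelled variables
from the review above changed to $objects. The use of printf instead of
`echo -ne`, and of `read -r`, are portability choices made for this sketch
and are not part of the posted patch.

```shell
pull_progress () {
	objects=0
	last_objects=0

	while read -r line; do
		case "$line" in
		link*|symlink*|\
		[a-f0-9][a-f0-9]/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]*|\
		"got "[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]*)
			# One more object pulled: redraw the counter in place.
			objects=$((objects + 1))
			printf 'Pulling objects: %d\r' "$objects"
			;;
		*)
			# Any other line: end the counter line first, but only
			# if the counter moved since the last such break.
			if [ "$last_objects" != "$objects" ]; then
				last_objects=$objects
				echo
			fi
			echo "$line"
			;;
		esac
	done
	# Terminate a counter line that is still open at end of input.
	if [ "$last_objects" != "$objects" ]; then
		echo
	fi
}

# Example input: two object lines followed by an ordinary message.
printf 'got abcdef0123\nlink foo\nhello\n' | pull_progress
```

With the misspelled name the comparison is always against an empty string,
so a newline is emitted before every ordinary line, even when no object line
was counted in between.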


* Re: [PATCH] Do not show empty diff in diff-cache uncached.
From: Linus Torvalds @ 2005-05-30  5:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7v3bs5k8d1.fsf@assigned-by-dhcp.cox.net>



On Sun, 29 May 2005, Junio C Hamano wrote:
> 
> Please disregard the patches you have already discarded so far;
> this request-to-discard includes -O and -B enhancements.

I actually like -B, it's just that that patch depended on -O and also came 
with a separate patch that was the reason I liked -B in the first place..

IOW, if we have

	/* start out with files "a" and "b" */
	mv b c
	mv a b
	git-update-cache --add --remove a b c
	git-diff-cache -B HEAD

then I think you're 100% right that it should show up as two renames in
the diffs, and "-B" would catch it. I think that's a great thing.

		Linus


* Re: [PATCH] Do not show empty diff in diff-cache uncached.
From: Junio C Hamano @ 2005-05-30  5:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505292231580.10545@ppc970.osdl.org>

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> On Sun, 29 May 2005, Junio C Hamano wrote:
>> 
>> Please disregard the patches you have already discarded so far;
>> this request-to-discard includes -O and -B enhancements.

LT> I actually like -B, it's just that that patch depended on -O and also came 
LT> with a separate patch that was the reason I liked -B in the first place..

Good.  I actually just finished testing the complete reorder of
the patches and am about to start the final review before throwing
the bundle at you again.  In the new set, -B comes before -O.



* [PATCH 0/4]
From: Junio C Hamano @ 2005-05-30  6:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7vmzqdiore.fsf_-_@assigned-by-dhcp.cox.net>

Linus,

	as promised, I am sending a couple of further cleanups,
and a pair of new diffcore routines.

I am assuming that you would have applied the three clean-up
patches I sent after you rejected some of my 12-piece set,
before you use these four.  They come on top of the previous
three.  Although I do not think they depend on the previous
three, I am letting you know that that was how I tested these
four:

    [PATCH 1/4] diff: further cleanup.
    [PATCH 2/4] diff: fix the culling of unneeded delete record.
    [PATCH 3/4] Add -B flag to diff-* brothers.
    [PATCH 4/4] Add -O<orderfile> option to diff-* brothers.

The third one is the gem of this series.  I think I covered the
basics I can think of in the new test script, but there could
still be cases where the rename/copy detector does something
interesting when broken pairs are involved.  Please give it a
good beating before you use it for anything important.  This
being a diff routine, it obviously cannot corrupt your data,
though.

The fourth one is the one about which both you and Petr expressed
reluctance, although Thomas was supportive.  I admit it is in the
"nice to have" category, not the "great, we need to have it inside"
category, but it is my favorite.

Oh, before I forget, I was wondering if you want me to mark
broken pairs in any special way, just like I mark matched
rename/copy pairs.  Something along the lines of:

    diff --git a/foo b/foo
    dissimilarity index 100%
    deleted file mode 100644
    --- a/foo
    +++ /dev/null
    @@ ...
    diff --git a/foo b/foo
    dissimilarity index 100%
    new file mode 100644
    --- /dev/null
    +++ a/foo
    @@ ...




* [PATCH 1/4] diff: further cleanup.
From: Junio C Hamano @ 2005-05-30  7:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7vekbp8ajm.fsf_-_@assigned-by-dhcp.cox.net>

When preparing data to feed the external diff, we should give
the mode we obtained from the caller, even when we are dealing
with a file with 0{40} SHA1 (i.e. the caller said "look at the
filesystem"), since the mode passed by the caller via
diff_addremove() or diff_change() is always trustworthy.

This is _not_ a bugfix --- the existing code calls stat() on the
file itself and does the same computation on st.st_mode, computing
the mode the same way the caller did to give the original mode.
We cannot remove the stat() call from here, but the extra
computation to create the mode value is unnecessary.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 diff.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

9c88ea020142346a56e36cea87e349814675f3f5 (from f96e0e2250f29f5ba2ae06c6b401a83fa1b828b4)
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -421,8 +421,13 @@ static void prepare_temp_file(const char
 				strcpy(temp->hex, sha1_to_hex(null_sha1));
 			else
 				strcpy(temp->hex, sha1_to_hex(one->sha1));
-			sprintf(temp->mode, "%06o",
-				S_IFREG |ce_permissions(st.st_mode));
+			/* Even though we may sometimes borrow the
+			 * contents from the work tree, we always want
+			 * one->mode.  mode is trustworthy even when
+			 * !(one->sha1_valid), as long as
+			 * DIFF_FILE_VALID(one).
+			 */
+			sprintf(temp->mode, "%06o", one->mode);
 		}
 		return;
 	}



* [PATCH 2/4] diff: fix the culling of unneeded delete record.
From: Junio C Hamano @ 2005-05-30  7:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7vekbp8ajm.fsf_-_@assigned-by-dhcp.cox.net>

The commit 15d061b435a7e3b6bead39df3889f4af78c4b00a

    [PATCH] Fix the way diffcore-rename records unremoved source.

still leaves unneeded delete records in its output stream by
mistake, which was covered up by having an extra check to turn
such a delete into a no-op downstream.  Fix the check in the
diffcore-rename to simplify the output routine.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 diff.c            |   23 ++---------------------
 diffcore-rename.c |   44 +++++++++++++++++++++++++++++++++-----------
 2 files changed, 35 insertions(+), 32 deletions(-)

3fde726e95a828e680c95297f185d6d25fcf853a (from 9c88ea020142346a56e36cea87e349814675f3f5)
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -792,27 +792,8 @@ static void diff_resolve_rename_copy(voi
 			p->status = 'U';
 		else if (!DIFF_FILE_VALID(p->one))
 			p->status = 'N';
-		else if (!DIFF_FILE_VALID(p->two)) {
-			/* Deleted entry may have been picked up by
-			 * another rename-copy entry.  So we scan the
-			 * queue and if we find one that uses us as the
-			 * source we do not say delete for this entry.
-			 */
-			for (j = 0; j < q->nr; j++) {
-				pp = q->queue[j];
-				if (!strcmp(p->one->path, pp->one->path) &&
-				    DIFF_PAIR_RENAME(pp)) {
-					/* rename/copy are always valid
-					 * so we do not say DIFF_FILE_VALID()
-					 * on pp->one and pp->two.
-					 */
-					p->status = 'X';
-					break;
-				}
-			}
-			if (!p->status)
-				p->status = 'D';
-		}
+		else if (!DIFF_FILE_VALID(p->two))
+			p->status = 'D';
 		else if (DIFF_PAIR_TYPE_CHANGED(p))
 			p->status = 'T';
 
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -328,26 +328,48 @@ void diffcore_rename(int detect_rename, 
 	outq.nr = outq.alloc = 0;
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
-		struct diff_rename_dst *dst = locate_rename_dst(p->two, 0);
 		struct diff_filepair *pair_to_free = NULL;
 
-		if (dst) {
-			/* creation */
-			if (dst->pair) {
-				/* renq has rename/copy to produce
-				 * this file already, so we do not
-				 * emit the creation record in the
-				 * output.
-				 */
+		if (!DIFF_FILE_VALID(p->one) && DIFF_FILE_VALID(p->two)) {
+			/*
+			 * Creation
+			 *
+			 * We would output this create record if it has
+			 * not been turned into a rename/copy already.
+			 */
+			struct diff_rename_dst *dst =
+				locate_rename_dst(p->two, 0);
+			if (dst && dst->pair) {
 				diff_q(&outq, dst->pair);
 				pair_to_free = p;
 			}
 			else
-				/* no matching rename/copy source, so record
-				 * this as a creation.
+				/* no matching rename/copy source, so
+				 * record this as a creation.
 				 */
 				diff_q(&outq, p);
 		}
+		else if (DIFF_FILE_VALID(p->one) && !DIFF_FILE_VALID(p->two)) {
+			/*
+			 * Deletion
+			 *
+			 * We would output this delete record if renq
+			 * does not have a rename/copy to move
+			 * p->one->path out.
+			 */
+			for (j = 0; j < renq.nr; j++)
+				if (!strcmp(renq.queue[j]->one->path,
+					    p->one->path))
+					break;
+			if (j < renq.nr)
+				/* this path remains */
+				pair_to_free = p;
+
+			if (pair_to_free)
+				;
+			else
+				diff_q(&outq, p);
+		}
 		else if (!diff_unmodified_pair(p))
 			/* all the usual ones need to be kept */
 			diff_q(&outq, p);



* [PATCH 3/4] Add -B flag to diff-* brothers.
From: Junio C Hamano @ 2005-05-30  7:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7vekbp8ajm.fsf_-_@assigned-by-dhcp.cox.net>

A new diffcore transformation, diffcore-break.c, is introduced.

When the -B flag is given, a patch that represents a complete
rewrite is broken into a deletion followed by a creation.  This
makes it easier to review such a complete rewrite patch.

The -B flag takes the same syntax as the -M and -C flags to
specify the minimum amount of non-source material the resulting
file needs to have to be considered a complete rewrite, and
defaults to 99% if not specified.

As the new test t4008-diff-break-rewrite.sh demonstrates, if a
file is a complete rewrite, it is broken into a delete/create
pair, which can further be subjected to the usual rename
detection if -M or -C is used.  For example, if file0 gets
completely rewritten to make it as if it were rather based on
file1 which itself disappeared, the following happens:

    The original change looks like this:

	file0     --> file0' (quite different from file0)
	file1     --> /dev/null

    After diffcore-break runs, it would become this:

	file0     --> /dev/null
	/dev/null --> file0'
	file1     --> /dev/null

    Then diffcore-rename matches them up:

	file1     --> file0'

The internal score values are finer grained now.  The earlier
maximum of 10000 has been raised to 60000; there are no
user-visible changes, but there is no reason to waste available bits.
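
To illustrate why the larger scale stays invisible to users: the raw output
code in this patch prints a score as a percentage using
(int)(0.5 + p->score * 100.0/MAX_SCORE). Below is a minimal shell sketch of
the same rounding in integer arithmetic, assuming MAX_SCORE is the 60000
mentioned above. The helper name score_to_percent is hypothetical.

```shell
# Assumed value of the internal maximum after this patch.
MAX_SCORE=60000

# Round score * 100 / MAX_SCORE to the nearest integer, halves rounding up.
# Adding MAX_SCORE / 2 before the integer division is equivalent to the
# "0.5 +" in the C expression for non-negative scores.
score_to_percent () {
	echo $(( ($1 * 100 + MAX_SCORE / 2) / MAX_SCORE ))
}

score_to_percent 60000
score_to_percent 30000
score_to_percent 300
score_to_percent 299
```

The internal value only has more resolution between two adjacent percentage
points; what gets printed is still a number from 0 to 100.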

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 Documentation/git-diff-cache.txt |    5 
 Documentation/git-diff-files.txt |    5 
 Documentation/git-diff-tree.txt  |    5 
 Makefile                         |    3 
 diff-cache.c                     |   11 +-
 diff-files.c                     |    8 +
 diff-tree.c                      |    8 +
 diff.c                           |   21 +++
 diff.h                           |    5 
 diffcore-break.c                 |  127 +++++++++++++++++++++++
 diffcore-rename.c                |   45 ++++++--
 diffcore.h                       |   12 +-
 t/t4008-diff-break-rewrite.sh    |  207 +++++++++++++++++++++++++++++++++++++++
 13 files changed, 433 insertions(+), 29 deletions(-)

96c0ba825059e65743bfce718e54413c4583ff35 (from 3fde726e95a828e680c95297f185d6d25fcf853a)
diff --git a/Documentation/git-diff-cache.txt b/Documentation/git-diff-cache.txt
--- a/Documentation/git-diff-cache.txt
+++ b/Documentation/git-diff-cache.txt
@@ -9,7 +9,7 @@ git-diff-cache - Compares content and mo
 
 SYNOPSIS
 --------
-'git-diff-cache' [-p] [-r] [-z] [-m] [-M] [-R] [-C] [-S<string>] [--pickaxe-all] [--cached] <tree-ish> [<path>...]
+'git-diff-cache' [-p] [-r] [-z] [-m] [-B] [-M] [-R] [-C] [-S<string>] [--pickaxe-all] [--cached] <tree-ish> [<path>...]
 
 DESCRIPTION
 -----------
@@ -35,6 +35,9 @@ OPTIONS
 -z::
 	\0 line termination on output
 
+-B::
+	Break complete rewrite changes into pairs of delete and create.
+
 -M::
 	Detect renames.
 
diff --git a/Documentation/git-diff-files.txt b/Documentation/git-diff-files.txt
--- a/Documentation/git-diff-files.txt
+++ b/Documentation/git-diff-files.txt
@@ -9,7 +9,7 @@ git-diff-files - Compares files in the w
 
 SYNOPSIS
 --------
-'git-diff-files' [-p] [-q] [-r] [-z] [-M] [-C] [-R] [-S<string>] [--pickaxe-all] [<pattern>...]
+'git-diff-files' [-p] [-q] [-r] [-z] [-B] [-M] [-C] [-R] [-S<string>] [--pickaxe-all] [<pattern>...]
 
 DESCRIPTION
 -----------
@@ -29,6 +29,9 @@ OPTIONS
 -R::
 	Output diff in reverse.
 
+-B::
+	Break complete rewrite changes into pairs of delete and create.
+
 -M::
 	Detect renames.
 
diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt
--- a/Documentation/git-diff-tree.txt
+++ b/Documentation/git-diff-tree.txt
@@ -9,7 +9,7 @@ git-diff-tree - Compares the content and
 
 SYNOPSIS
 --------
-'git-diff-tree' [-p] [-r] [-z] [--stdin] [-M] [-R] [-C] [-S<string>] [--pickaxe-all] [-m] [-s] [-v] [-t] <tree-ish> <tree-ish> [<pattern>]\*
+'git-diff-tree' [-p] [-r] [-z] [--stdin] [-B] [-M] [-R] [-C] [-S<string>] [--pickaxe-all] [-m] [-s] [-v] [-t] <tree-ish> <tree-ish> [<pattern>]\*
 
 DESCRIPTION
 -----------
@@ -33,6 +33,9 @@ OPTIONS
 	generate patch (see section on generating patches).  For
 	git-diff-tree, this flag implies '-r' as well.
 
+-B::
+	Break complete rewrite changes into pairs of delete and create.
+
 -M::
 	Detect renames.
 
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -48,7 +48,7 @@ LIB_OBJS += strbuf.o
 
 LIB_H += diff.h count-delta.h
 LIB_OBJS += diff.o diffcore-rename.o diffcore-pickaxe.o diffcore-pathspec.o \
-	count-delta.o
+	count-delta.o diffcore-break.o
 
 LIB_OBJS += gitenv.o
 
@@ -130,6 +130,7 @@ diff.o: $(LIB_H) diffcore.h
 diffcore-rename.o : $(LIB_H) diffcore.h
 diffcore-pathspec.o : $(LIB_H) diffcore.h
 diffcore-pickaxe.o : $(LIB_H) diffcore.h
+diffcore-break.o : $(LIB_H) diffcore.h
 
 test: all
 	$(MAKE) -C t/ all
diff --git a/diff-cache.c b/diff-cache.c
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -9,6 +9,7 @@ static int diff_setup_opt = 0;
 static int diff_score_opt = 0;
 static const char *pickaxe = NULL;
 static int pickaxe_opts = 0;
+static int diff_break_opt = -1;
 
 /* A file entry went away or appeared */
 static void show_file(const char *prefix, struct cache_entry *ce, unsigned char *sha1, unsigned int mode)
@@ -188,6 +189,10 @@ int main(int argc, const char **argv)
 			diff_output_format = DIFF_FORMAT_PATCH;
 			continue;
 		}
+		if (!strncmp(arg, "-B", 2)) {
+			diff_break_opt = diff_scoreopt_parse(arg);
+			continue;
+		}
 		if (!strncmp(arg, "-M", 2)) {
 			detect_rename = DIFF_DETECT_RENAME;
 			diff_score_opt = diff_scoreopt_parse(arg);
@@ -240,9 +245,11 @@ int main(int argc, const char **argv)
 		die("unable to read tree object %s", tree_name);
 
 	ret = diff_cache(active_cache, active_nr);
-	diffcore_std(pathspec,
+
+	diffcore_std(pathspec ? : NULL,
 		     detect_rename, diff_score_opt,
-		     pickaxe, pickaxe_opts);
+		     pickaxe, pickaxe_opts,
+		     diff_break_opt);
 	diff_flush(diff_output_format, 1);
 	return ret;
 }
diff --git a/diff-files.c b/diff-files.c
--- a/diff-files.c
+++ b/diff-files.c
@@ -15,6 +15,7 @@ static int diff_setup_opt = 0;
 static int diff_score_opt = 0;
 static const char *pickaxe = NULL;
 static int pickaxe_opts = 0;
+static int diff_break_opt = -1;
 static int silent = 0;
 
 static void show_unmerge(const char *path)
@@ -57,6 +58,8 @@ int main(int argc, const char **argv)
 			pickaxe = argv[1] + 2;
 		else if (!strcmp(argv[1], "--pickaxe-all"))
 			pickaxe_opts = DIFF_PICKAXE_ALL;
+		else if (!strncmp(argv[1], "-B", 2))
+			diff_break_opt = diff_scoreopt_parse(argv[1]);
 		else if (!strncmp(argv[1], "-M", 2)) {
 			diff_score_opt = diff_scoreopt_parse(argv[1]);
 			detect_rename = DIFF_DETECT_RENAME;
@@ -116,9 +119,10 @@ int main(int argc, const char **argv)
 		show_modified(oldmode, mode, ce->sha1, null_sha1,
 			      ce->name);
 	}
-	diffcore_std(argv + 1,
+	diffcore_std((1 < argc) ? argv + 1 : NULL,
 		     detect_rename, diff_score_opt,
-		     pickaxe, pickaxe_opts);
+		     pickaxe, pickaxe_opts,
+		     diff_break_opt);
 	diff_flush(diff_output_format, 1);
 	return 0;
 }
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -14,6 +14,7 @@ static int diff_setup_opt = 0;
 static int diff_score_opt = 0;
 static const char *pickaxe = NULL;
 static int pickaxe_opts = 0;
+static int diff_break_opt = -1;
 static const char *header = NULL;
 static const char *header_prefix = "";
 
@@ -263,7 +264,8 @@ static int call_diff_flush(void)
 {
 	diffcore_std(0,
 		     detect_rename, diff_score_opt,
-		     pickaxe, pickaxe_opts);
+		     pickaxe, pickaxe_opts,
+		     diff_break_opt);
 	if (diff_queue_is_empty()) {
 		diff_flush(DIFF_FORMAT_NO_OUTPUT, 0);
 		return 0;
@@ -523,6 +525,10 @@ int main(int argc, const char **argv)
 			diff_score_opt = diff_scoreopt_parse(arg);
 			continue;
 		}
+		if (!strncmp(arg, "-B", 2)) {
+			diff_break_opt = diff_scoreopt_parse(arg);
+			continue;
+		}
 		if (!strcmp(arg, "-z")) {
 			diff_output_format = DIFF_FORMAT_MACHINE;
 			continue;
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -603,6 +603,7 @@ struct diff_filepair *diff_queue(struct 
 	dp->two = two;
 	dp->score = 0;
 	dp->source_stays = 0;
+	dp->broken_pair = 0;
 	diff_q(queue, dp);
 	return dp;
 }
@@ -637,6 +638,16 @@ static void diff_flush_raw(struct diff_f
 		sprintf(status, "%c%03d", p->status,
 			(int)(0.5 + p->score * 100.0/MAX_SCORE));
 		break;
+	case 'N': case 'D':
+		two_paths = 0;
+		if (p->score)
+			sprintf(status, "%c%03d", p->status,
+				(int)(0.5 + p->score * 100.0/MAX_SCORE));
+		else {
+			status[0] = p->status;
+			status[1] = 0;
+		}
+		break;
 	default:
 		two_paths = 0;
 		status[0] = p->status;
@@ -760,8 +771,9 @@ void diff_debug_filepair(const struct di
 {
 	diff_debug_filespec(p->one, i, "one");
 	diff_debug_filespec(p->two, i, "two");
-	fprintf(stderr, "score %d, status %c source_stays %d\n",
-		p->score, p->status ? : '?', p->source_stays);
+	fprintf(stderr, "score %d, status %c stays %d broken %d\n",
+		p->score, p->status ? : '?',
+		p->source_stays, p->broken_pair);
 }
 
 void diff_debug_queue(const char *msg, struct diff_queue_struct *q)
@@ -875,10 +887,13 @@ void diff_flush(int diff_output_style, i
 
 void diffcore_std(const char **paths,
 		  int detect_rename, int rename_score,
-		  const char *pickaxe, int pickaxe_opts)
+		  const char *pickaxe, int pickaxe_opts,
+		  int break_opt)
 {
 	if (paths && paths[0])
 		diffcore_pathspec(paths);
+	if (0 <= break_opt)
+		diffcore_break(break_opt);
 	if (detect_rename)
 		diffcore_rename(detect_rename, rename_score);
 	if (pickaxe)
diff --git a/diff.h b/diff.h
--- a/diff.h
+++ b/diff.h
@@ -43,9 +43,12 @@ extern void diffcore_pickaxe(const char 
 
 extern void diffcore_pathspec(const char **pathspec);
 
+extern void diffcore_break(int);
+
 extern void diffcore_std(const char **paths,
 			 int detect_rename, int rename_score,
-			 const char *pickaxe, int pickaxe_opts);
+			 const char *pickaxe, int pickaxe_opts,
+			 int break_opt);
 
 extern int diff_queue_is_empty(void);
 
diff --git a/diffcore-break.c b/diffcore-break.c
new file mode 100644
--- /dev/null
+++ b/diffcore-break.c
@@ -0,0 +1,127 @@
+/*
+ * Copyright (C) 2005 Junio C Hamano
+ */
+#include "cache.h"
+#include "diff.h"
+#include "diffcore.h"
+#include "delta.h"
+#include "count-delta.h"
+
+static int very_different(struct diff_filespec *src,
+			  struct diff_filespec *dst,
+			  int min_score)
+{
+	/* dst is recorded as a modification of src.  Are they so
+	 * different that we are better off recording this as a pair
+	 * of delete and create?  min_score is the minimum amount of
+	 * new material that must exist in the dst and not in src for
+	 * the pair to be considered a complete rewrite, and recommended
+	 * to be set to a very high value, 99% or so.
+	 *
+	 * The value we return represents the amount of new material
+	 * that is in dst and not in src.  We return 0 when we do not
+	 * want to get the filepair broken.
+	 */
+	void *delta;
+	unsigned long delta_size, base_size;
+
+	if (!S_ISREG(src->mode) || !S_ISREG(dst->mode))
+		return 0; /* leave symlink rename alone */
+
+	if (diff_populate_filespec(src, 1) || diff_populate_filespec(dst, 1))
+		return 0; /* error but caught downstream */
+
+	delta_size = ((src->size < dst->size) ?
+		      (dst->size - src->size) : (src->size - dst->size));
+
+	/* Notice that we use max of src and dst as the base size,
+	 * unlike rename similarity detection.  This is so that we do
+	 * not mistake a large addition as a complete rewrite.
+	 */
+	base_size = ((src->size < dst->size) ? dst->size : src->size);
+
+	/*
+	 * If file size difference is too big compared to the
+	 * base_size, we declare this a complete rewrite.
+	 */
+	if (base_size * min_score < delta_size * MAX_SCORE)
+		return MAX_SCORE;
+
+	if (diff_populate_filespec(src, 0) || diff_populate_filespec(dst, 0))
+		return 0; /* error but caught downstream */
+
+	delta = diff_delta(src->data, src->size,
+			   dst->data, dst->size,
+			   &delta_size);
+
+	/* A delta that has a lot of literal additions would have
+	 * big delta_size no matter what else it does.
+	 */
+	if (base_size * min_score < delta_size * MAX_SCORE)
+		return MAX_SCORE;
+
+	/* Estimate the edit size by interpreting delta. */
+	delta_size = count_delta(delta, delta_size);
+	free(delta);
+	if (delta_size == UINT_MAX)
+		return 0; /* error in delta computation */
+
+	if (base_size < delta_size)
+		return MAX_SCORE;
+
+	return delta_size * MAX_SCORE / base_size; 
+}
+
+void diffcore_break(int min_score)
+{
+	struct diff_queue_struct *q = &diff_queued_diff;
+	struct diff_queue_struct outq;
+	int i;
+
+	if (!min_score)
+		min_score = DEFAULT_BREAK_SCORE;
+
+	outq.nr = outq.alloc = 0;
+	outq.queue = NULL;
+
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		int score;
+
+		/* We deal only with in-place edit of non directory.
+		 * We do not break anything else.
+		 */
+		if (DIFF_FILE_VALID(p->one) && DIFF_FILE_VALID(p->two) &&
+		    !S_ISDIR(p->one->mode) && !S_ISDIR(p->two->mode) &&
+		    !strcmp(p->one->path, p->two->path)) {
+			score = very_different(p->one, p->two, min_score);
+			if (min_score <= score) {
+				/* Split this into delete and create */
+				struct diff_filespec *null_one, *null_two;
+				struct diff_filepair *dp;
+
+				/* deletion of one */
+				null_one = alloc_filespec(p->one->path);
+				dp = diff_queue(&outq, p->one, null_one);
+				dp->score = score;
+				dp->broken_pair = 1;
+
+				/* creation of two */
+				null_two = alloc_filespec(p->two->path);
+				dp = diff_queue(&outq, null_two, p->two);
+				dp->score = score;
+				dp->broken_pair = 1;
+
+				free(p); /* not diff_free_filepair(), we are
+					  * reusing one and two here.
+					  */
+				continue;
+			}
+		}
+		diff_q(&outq, p);
+	}
+	free(q->queue);
+	*q = outq;
+
+	return;
+}
diff --git a/diffcore-rename.c b/diffcore-rename.c
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -225,8 +225,8 @@ static int score_compare(const void *a_,
 int diff_scoreopt_parse(const char *opt)
 {
 	int diglen, num, scale, i;
-	if (opt[0] != '-' || (opt[1] != 'M' && opt[1] != 'C'))
-		return -1; /* that is not a -M nor -C option */
+	if (opt[0] != '-' || (opt[1] != 'M' && opt[1] != 'C' && opt[1] != 'B'))
+		return -1; /* that is not a -M, -C nor -B option */
 	diglen = strspn(opt+2, "0123456789");
 	if (diglen == 0 || strlen(opt+2) != diglen)
 		return 0; /* use default */
@@ -249,7 +249,7 @@ void diffcore_rename(int detect_rename, 
 	int num_create, num_src, dst_cnt;
 
 	if (!minimum_score)
-		minimum_score = DEFAULT_MINIMUM_SCORE;
+		minimum_score = DEFAULT_RENAME_SCORE;
 	renq.queue = NULL;
 	renq.nr = renq.alloc = 0;
 
@@ -353,17 +353,36 @@ void diffcore_rename(int detect_rename, 
 			/*
 			 * Deletion
 			 *
-			 * We would output this delete record if renq
-			 * does not have a rename/copy to move
-			 * p->one->path out.
+			 * We would output this delete record if:
+			 *
+			 * (1) this is a broken delete and the counterpart
+			 *     broken create remains in the output; or
+			 * (2) this is not a broken delete, and renq does
+			 *     not have a rename/copy to move p->one->path
+			 *     out.
+			 *
+			 * Otherwise, the counterpart broken create
+			 * has been turned into a rename-edit; or
+			 * delete did not have a matching create to
+			 * begin with.
 			 */
-			for (j = 0; j < renq.nr; j++)
-				if (!strcmp(renq.queue[j]->one->path,
-					    p->one->path))
-					break;
-			if (j < renq.nr)
-				/* this path remains */
-				pair_to_free = p;
+			if (DIFF_PAIR_BROKEN(p)) {
+				/* broken delete */
+				struct diff_rename_dst *dst =
+					locate_rename_dst(p->one, 0);
+				if (dst && dst->pair)
+					/* counterpart is now rename/copy */
+					pair_to_free = p;
+			}
+			else {
+				for (j = 0; j < renq.nr; j++)
+					if (!strcmp(renq.queue[j]->one->path,
+						    p->one->path))
+						break;
+				if (j < renq.nr)
+					/* this path remains */
+					pair_to_free = p;
+			}
 
 			if (pair_to_free)
 				;
diff --git a/diffcore.h b/diffcore.h
--- a/diffcore.h
+++ b/diffcore.h
@@ -8,8 +8,9 @@
  * (e.g. diffcore-rename, diffcore-pickaxe).  Never include this header
  * in anything else.
  */
-#define MAX_SCORE 10000
-#define DEFAULT_MINIMUM_SCORE 5000
+#define MAX_SCORE 60000
+#define DEFAULT_RENAME_SCORE 30000 /* rename/copy similarity minimum (50%) */
+#define DEFAULT_BREAK_SCORE  59400 /* minimum for break to happen (99%)*/
 
 #define RENAME_DST_MATCHED 01
 
@@ -40,14 +41,19 @@ struct diff_filepair {
 	struct diff_filespec *one;
 	struct diff_filespec *two;
 	unsigned short int score;
-	char source_stays; /* all of R/C are copies */
 	char status; /* M C R N D U (see Documentation/diff-format.txt) */
+	unsigned source_stays : 1; /* all of R/C are copies */
+	unsigned broken_pair : 1;
 };
 #define DIFF_PAIR_UNMERGED(p) \
 	(!DIFF_FILE_VALID((p)->one) && !DIFF_FILE_VALID((p)->two))
 
 #define DIFF_PAIR_RENAME(p) (strcmp((p)->one->path, (p)->two->path))
 
+#define DIFF_PAIR_BROKEN(p) \
+	( (!DIFF_FILE_VALID((p)->one) != !DIFF_FILE_VALID((p)->two)) && \
+	  ((p)->broken_pair != 0) )
+
 #define DIFF_PAIR_TYPE_CHANGED(p) \
 	((S_IFMT & (p)->one->mode) != (S_IFMT & (p)->two->mode))
 
diff --git a/t/t4008-diff-break-rewrite.sh b/t/t4008-diff-break-rewrite.sh
new file mode 100755
--- /dev/null
+++ b/t/t4008-diff-break-rewrite.sh
@@ -0,0 +1,207 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='Break and then rename
+
+We have two very different files, file0 and file1, registered in a tree.
+
+We update file1 so drastically that it is more similar to file0, and
+then remove file0.  With -B, changes to file1 should be broken into
+separate delete and create, resulting in removal of file0, removal of
+original file1 and creation of completely rewritten file1.
+
+Further, with -B and -M together, these three modifications should
+turn into rename-edit of file0 into file1.
+
+Starting from the same two files in the tree, we swap file0 and file1.
+With -B, this should be detected as two complete rewrites, resulting in
+four changes in total.
+
+Further, with -B and -M together, these should turn into two renames.
+'
+. ./test-lib.sh
+
+_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
+_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
+sanitize_diff_raw='s/ '"$_x40"' '"$_x40"' \([CDNR]\)[0-9]*	/ X X \1#	/'
+compare_diff_raw () {
+    # When heuristics are improved, the score numbers would change.
+    # Ignore them while comparing.
+    # Also we do not check SHA1 hash generation in this test, which
+    # is a job for t0000-basic.sh
+
+    sed -e "$sanitize_diff_raw" <"$1" >.tmp-1
+    sed -e "$sanitize_diff_raw" <"$2" >.tmp-2
+    diff -u .tmp-1 .tmp-2 && rm -f .tmp-1 .tmp-2
+}
+
+test_expect_success \
+    setup \
+    'cat ../../README >file0 &&
+     cat ../../COPYING >file1 &&
+    git-update-cache --add file0 file1 &&
+    tree=$(git-write-tree) &&
+    echo "$tree"'
+
+test_expect_success \
+    'change file1 with copy-edit of file0 and remove file0' \
+    'sed -e "s/git/GIT/" file0 >file1 &&
+     rm -f file0 &&
+    git-update-cache --remove file0 file1'
+
+test_expect_success \
+    'run diff with -B' \
+    'git-diff-cache -B --cached "$tree" >current'
+
+cat >expected <<\EOF
+:100644 000000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 0000000000000000000000000000000000000000 D	file0
+:100644 000000 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0000000000000000000000000000000000000000 D100	file1
+:000000 100644 0000000000000000000000000000000000000000 11e331465a89c394dc25c780de230043750c1ec8 N100	file1
+EOF
+
+test_expect_success \
+    'validate result of -B (#1)' \
+    'compare_diff_raw current expected'
+
+test_expect_success \
+    'run diff with -B and -M' \
+    'git-diff-cache -B -M "$tree" >current'
+
+cat >expected <<\EOF
+:100644 100644 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 08bb2fb671deff4c03a4d4a0a1315dff98d5732c R100	file0	file1
+EOF
+
+test_expect_success \
+    'validate result of -B -M (#2)' \
+    'compare_diff_raw current expected'
+
+test_expect_success \
+    'swap file0 and file1' \
+    'rm -f file0 file1 &&
+     git-read-tree -m $tree &&
+     git-checkout-cache -f -u -a &&
+     mv file0 tmp &&
+     mv file1 file0 &&
+     mv tmp file1 &&
+     git-update-cache file0 file1'
+
+test_expect_success \
+    'run diff with -B' \
+    'git-diff-cache -B "$tree" >current'
+
+cat >expected <<\EOF
+:100644 000000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 0000000000000000000000000000000000000000 D100	file0
+:000000 100644 0000000000000000000000000000000000000000 6ff87c4664981e4397625791c8ea3bbb5f2279a3 N100	file0
+:100644 000000 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0000000000000000000000000000000000000000 D100	file1
+:000000 100644 0000000000000000000000000000000000000000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 N100	file1
+EOF
+
+test_expect_success \
+    'validate result of -B (#3)' \
+    'compare_diff_raw current expected'
+
+test_expect_success \
+    'run diff with -B and -M' \
+    'git-diff-cache -B -M "$tree" >current'
+
+cat >expected <<\EOF
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 6ff87c4664981e4397625791c8ea3bbb5f2279a3 R100	file1	file0
+:100644 100644 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 R100	file0	file1
+EOF
+
+test_expect_success \
+    'validate result of -B -M (#4)' \
+    'compare_diff_raw current expected'
+
+test_expect_success \
+    'make file0 into something completely different' \
+    'rm -f file0 &&
+     ln -s frotz file0 &&
+     git-update-cache file0 file1'
+
+test_expect_success \
+    'run diff with -B' \
+    'git-diff-cache -B "$tree" >current'
+
+cat >expected <<\EOF
+:100644 120000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 67be421f88824578857624f7b3dc75e99a8a1481 T	file0
+:100644 000000 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0000000000000000000000000000000000000000 D100	file1
+:000000 100644 0000000000000000000000000000000000000000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 N100	file1
+EOF
+
+test_expect_success \
+    'validate result of -B (#5)' \
+    'compare_diff_raw current expected'
+
+test_expect_success \
+    'run diff with -B and -M' \
+    'git-diff-cache -B -M "$tree" >current'
+
+# This should not mistake file0 as the copy source of new file1
+# due to type differences.
+cat >expected <<\EOF
+:100644 120000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 67be421f88824578857624f7b3dc75e99a8a1481 T	file0
+:100644 000000 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0000000000000000000000000000000000000000 D100	file1
+:000000 100644 0000000000000000000000000000000000000000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 N100	file1
+EOF
+
+test_expect_success \
+    'validate result of -B -M (#6)' \
+    'compare_diff_raw current expected'
+
+test_expect_success \
+    'run diff with -M' \
+    'git-diff-cache -M "$tree" >current'
+
+# This should not mistake file0 as the copy source of new file1
+# due to type differences.
+cat >expected <<\EOF
+:100644 120000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 67be421f88824578857624f7b3dc75e99a8a1481 T	file0
+:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 M	file1
+EOF
+
+test_expect_success \
+    'validate result of -M (#7)' \
+    'compare_diff_raw current expected'
+
+test_expect_success \
+    'file1 edited to look like file0 and file0 rename-edited to file2' \
+    'rm -f file0 file1 &&
+     git-read-tree -m $tree &&
+     git-checkout-cache -f -u -a &&
+     sed -e "s/git/GIT/" file0 >file1 &&
+     sed -e "s/git/GET/" file0 >file2 &&
+     rm -f file0 &&
+     git-update-cache --add --remove file0 file1 file2'
+
+test_expect_success \
+    'run diff with -B' \
+    'git-diff-cache -B "$tree" >current'
+
+cat >expected <<\EOF
+:100644 000000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 0000000000000000000000000000000000000000 D	file0
+:100644 000000 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0000000000000000000000000000000000000000 D100	file1
+:000000 100644 0000000000000000000000000000000000000000 08bb2fb671deff4c03a4d4a0a1315dff98d5732c N100	file1
+:000000 100644 0000000000000000000000000000000000000000 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 N	file2
+EOF
+
+test_expect_success \
+    'validate result of -B (#8)' \
+    'compare_diff_raw current expected'
+
+test_expect_success \
+    'run diff with -B -M' \
+    'git-diff-cache -B -M "$tree" >current'
+
+cat >expected <<\EOF
+:100644 100644 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 08bb2fb671deff4c03a4d4a0a1315dff98d5732c C095	file0	file1
+:100644 100644 f5deac7be59e7eeab8657fd9ae706fd6a57daed2 59f832e5c8b3f7e486be15ad0cd3e95ba9af8998 R095	file0	file2
+EOF
+
+test_expect_success \
+    'validate result of -B -M (#9)' \
+    'compare_diff_raw current expected'
+
+test_done
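
The break decision in very_different() and diffcore_break() boils down
to a little integer arithmetic.  What follows is only a sketch of that
arithmetic, not the code in the patch: break_score(), score_percent()
and should_break() are hypothetical names, and edit_size stands in for
whatever count_delta() would report.  The size-difference and raw
delta-size shortcuts are left out.

```c
#include <assert.h>

#define MAX_SCORE 60000            /* value from diffcore.h in this patch */
#define DEFAULT_BREAK_SCORE 59400  /* 99% of MAX_SCORE */

/*
 * Hypothetical helper: given the two blob sizes and an estimate of the
 * new material in dst, return a score in the range 0..MAX_SCORE.
 */
static unsigned long break_score(unsigned long src_size,
				 unsigned long dst_size,
				 unsigned long edit_size)
{
	/* the larger side is the base, so a big addition is not a rewrite */
	unsigned long base_size = (src_size < dst_size) ? dst_size : src_size;

	if (!base_size)
		return 0; /* two empty files: nothing to break */
	if (base_size < edit_size)
		return MAX_SCORE;
	return edit_size * MAX_SCORE / base_size;
}

/* The three digits that follow D or N in the raw output. */
static int score_percent(unsigned long score)
{
	assert(score <= MAX_SCORE);
	return (int)(0.5 + score * 100.0 / MAX_SCORE);
}

/* A min_score of zero means "use the default", as in diffcore_break(). */
static int should_break(unsigned long score, unsigned long min_score)
{
	if (!min_score)
		min_score = DEFAULT_BREAK_SCORE;
	return min_score <= score;
}
```

With two 1000-byte files and 1000 bytes of new material the score is
MAX_SCORE, which is shown as D100/N100; 500 bytes of new material
would score 30000 (50%) and stay below the default 99% threshold.
Note that edit_size * MAX_SCORE can overflow a 32-bit unsigned long
for large blobs; the sketch does not guard against that.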



* [PATCH 4/4] Add -O<orderfile> option to diff-* brothers.
From: Junio C Hamano @ 2005-05-30  7:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7vekbp8ajm.fsf_-_@assigned-by-dhcp.cox.net>

A new diffcore filter, diffcore-order, is introduced.  It takes
a text file in which each line is a shell glob pattern.  Patches
that match a glob pattern on an earlier line in the file are
output before patches that match a later line, and patches that
do not match any glob pattern are output last.

A typical orderfile for the git project would probably look like
this:

    README
    Makefile
    Documentation
    *.h
    *.c
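
How a path is ranked against such a file can be sketched as below.
This is a simplified illustration under stated assumptions, not the
diffcore-order.c in this patch: order_rank() is a hypothetical name,
the patterns are passed in as an array instead of being read from the
orderfile, and the buffer size is arbitrary.  A path is tried as a
whole first and then each of its leading directories, which is why
the single line "Documentation" catches everything under that
directory.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/*
 * Return the index of the first pattern that matches path or one of
 * its leading directories.  Return npat when nothing matches, so that
 * unmatched paths sort after every matched one.
 */
static int order_rank(const char *const *pat, int npat, const char *path)
{
	char p[4096]; /* assumed large enough for this sketch */
	int i;

	assert(path != NULL);
	if (strlen(path) >= sizeof(p))
		return npat; /* too long to examine; treat as unmatched */

	for (i = 0; i < npat; i++) {
		strcpy(p, path);
		while (p[0]) {
			char *cp;
			if (!fnmatch(pat[i], p, 0))
				return i;
			cp = strrchr(p, '/');
			if (!cp)
				break;
			*cp = 0; /* retry with the parent directory */
		}
	}
	return npat;
}
```

With the five patterns above, README ranks 0,
Documentation/git-diff-tree.txt ranks 2, diff.h ranks 3, and
t/t4008-diff-break-rewrite.sh ranks 5, i.e. last.  Because fnmatch()
is called without FNM_PATHNAME, "*.c" also matches a path such as
t/helper.c.  Sorting on (rank, original position) keeps the output
stable among paths of equal rank.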

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 Documentation/git-diff-cache.txt |    6 +
 Documentation/git-diff-files.txt |    6 +
 Documentation/git-diff-tree.txt  |    6 +
 Makefile                         |    3 
 diff-cache.c                     |    8 ++
 diff-files.c                     |    6 +
 diff-tree.c                      |    8 ++
 diff.c                           |    5 +
 diff.h                           |    7 +-
 diffcore-order.c                 |  122 +++++++++++++++++++++++++++++++++++++++
 10 files changed, 167 insertions(+), 10 deletions(-)

6575b8d66ea48891aaba3fba3eb8a4fddc596c2c (from 96c0ba825059e65743bfce718e54413c4583ff35)
diff --git a/Documentation/git-diff-cache.txt b/Documentation/git-diff-cache.txt
--- a/Documentation/git-diff-cache.txt
+++ b/Documentation/git-diff-cache.txt
@@ -9,7 +9,7 @@ git-diff-cache - Compares content and mo
 
 SYNOPSIS
 --------
-'git-diff-cache' [-p] [-r] [-z] [-m] [-B] [-M] [-R] [-C] [-S<string>] [--pickaxe-all] [--cached] <tree-ish> [<path>...]
+'git-diff-cache' [-p] [-r] [-z] [-m] [-B] [-M] [-R] [-C] [-O<orderfile>] [-S<string>] [--pickaxe-all] [--cached] <tree-ish> [<path>...]
 
 DESCRIPTION
 -----------
@@ -52,6 +52,10 @@ OPTIONS
 	changeset, not just the files that contains the change
 	in <string>.
 
+-O<orderfile>::
+	Output the patch in the order specified in the
+	<orderfile>, which has one shell glob pattern per line.
+
 -R::
 	Output diff in reverse.
 
diff --git a/Documentation/git-diff-files.txt b/Documentation/git-diff-files.txt
--- a/Documentation/git-diff-files.txt
+++ b/Documentation/git-diff-files.txt
@@ -9,7 +9,7 @@ git-diff-files - Compares files in the w
 
 SYNOPSIS
 --------
-'git-diff-files' [-p] [-q] [-r] [-z] [-B] [-M] [-C] [-R] [-S<string>] [--pickaxe-all] [<pattern>...]
+'git-diff-files' [-p] [-q] [-r] [-z] [-B] [-M] [-C] [-R] [-O<orderfile>] [-S<string>] [--pickaxe-all] [<pattern>...]
 
 DESCRIPTION
 -----------
@@ -46,6 +46,10 @@ OPTIONS
 	changeset, not just the files that contains the change
 	in <string>.
 
+-O<orderfile>::
+	Output the patch in the order specified in the
+	<orderfile>, which has one shell glob pattern per line.
+
 -r::
 	This flag does not mean anything.  It is there only to match
 	git-diff-tree.  Unlike git-diff-tree, git-diff-files always looks
diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt
--- a/Documentation/git-diff-tree.txt
+++ b/Documentation/git-diff-tree.txt
@@ -9,7 +9,7 @@ git-diff-tree - Compares the content and
 
 SYNOPSIS
 --------
-'git-diff-tree' [-p] [-r] [-z] [--stdin] [-B] [-M] [-R] [-C] [-S<string>] [--pickaxe-all] [-m] [-s] [-v] [-t] <tree-ish> <tree-ish> [<pattern>]\*
+'git-diff-tree' [-p] [-r] [-z] [--stdin] [-B] [-M] [-R] [-C] [-O<orderfile>] [-S<string>] [--pickaxe-all] [-m] [-s] [-v] [-t] <tree-ish> <tree-ish> [<pattern>]\*
 
 DESCRIPTION
 -----------
@@ -53,6 +53,10 @@ OPTIONS
 	changeset, not just the files that contains the change
 	in <string>.
 
+-O<orderfile>::
+	Output the patch in the order specified in the
+	<orderfile>, which has one shell glob pattern per line.
+
 -r::
 	recurse
 
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -48,7 +48,7 @@ LIB_OBJS += strbuf.o
 
 LIB_H += diff.h count-delta.h
 LIB_OBJS += diff.o diffcore-rename.o diffcore-pickaxe.o diffcore-pathspec.o \
-	count-delta.o diffcore-break.o
+	count-delta.o diffcore-break.o diffcore-order.o
 
 LIB_OBJS += gitenv.o
 
@@ -131,6 +131,7 @@ diffcore-rename.o : $(LIB_H) diffcore.h
 diffcore-pathspec.o : $(LIB_H) diffcore.h
 diffcore-pickaxe.o : $(LIB_H) diffcore.h
 diffcore-break.o : $(LIB_H) diffcore.h
+diffcore-order.o : $(LIB_H) diffcore.h
 
 test: all
 	$(MAKE) -C t/ all
diff --git a/diff-cache.c b/diff-cache.c
--- a/diff-cache.c
+++ b/diff-cache.c
@@ -10,6 +10,7 @@ static int diff_score_opt = 0;
 static const char *pickaxe = NULL;
 static int pickaxe_opts = 0;
 static int diff_break_opt = -1;
+static const char *orderfile = NULL;
 
 /* A file entry went away or appeared */
 static void show_file(const char *prefix, struct cache_entry *ce, unsigned char *sha1, unsigned int mode)
@@ -215,6 +216,10 @@ int main(int argc, const char **argv)
 			pickaxe = arg + 2;
 			continue;
 		}
+		if (!strncmp(arg, "-O", 2)) {
+			orderfile = arg + 2;
+			continue;
+		}
 		if (!strcmp(arg, "--pickaxe-all")) {
 			pickaxe_opts = DIFF_PICKAXE_ALL;
 			continue;
@@ -249,7 +254,8 @@ int main(int argc, const char **argv)
 	diffcore_std(pathspec ? : NULL,
 		     detect_rename, diff_score_opt,
 		     pickaxe, pickaxe_opts,
-		     diff_break_opt);
+		     diff_break_opt,
+		     orderfile);
 	diff_flush(diff_output_format, 1);
 	return ret;
 }
diff --git a/diff-files.c b/diff-files.c
--- a/diff-files.c
+++ b/diff-files.c
@@ -16,6 +16,7 @@ static int diff_score_opt = 0;
 static const char *pickaxe = NULL;
 static int pickaxe_opts = 0;
 static int diff_break_opt = -1;
+static const char *orderfile = NULL;
 static int silent = 0;
 
 static void show_unmerge(const char *path)
@@ -56,6 +57,8 @@ int main(int argc, const char **argv)
 			diff_setup_opt |= DIFF_SETUP_REVERSE;
 		else if (!strncmp(argv[1], "-S", 2))
 			pickaxe = argv[1] + 2;
+		else if (!strncmp(argv[1], "-O", 2))
+			orderfile = argv[1] + 2;
 		else if (!strcmp(argv[1], "--pickaxe-all"))
 			pickaxe_opts = DIFF_PICKAXE_ALL;
 		else if (!strncmp(argv[1], "-B", 2))
@@ -122,7 +125,8 @@ int main(int argc, const char **argv)
 	diffcore_std((1 < argc) ? argv + 1 : NULL,
 		     detect_rename, diff_score_opt,
 		     pickaxe, pickaxe_opts,
-		     diff_break_opt);
+		     diff_break_opt,
+		     orderfile);
 	diff_flush(diff_output_format, 1);
 	return 0;
 }
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -15,6 +15,7 @@ static int diff_score_opt = 0;
 static const char *pickaxe = NULL;
 static int pickaxe_opts = 0;
 static int diff_break_opt = -1;
+static const char *orderfile = NULL;
 static const char *header = NULL;
 static const char *header_prefix = "";
 
@@ -265,7 +266,8 @@ static int call_diff_flush(void)
 	diffcore_std(0,
 		     detect_rename, diff_score_opt,
 		     pickaxe, pickaxe_opts,
-		     diff_break_opt);
+		     diff_break_opt,
+		     orderfile);
 	if (diff_queue_is_empty()) {
 		diff_flush(DIFF_FORMAT_NO_OUTPUT, 0);
 		return 0;
@@ -511,6 +513,10 @@ int main(int argc, const char **argv)
 			pickaxe = arg + 2;
 			continue;
 		}
+		if (!strncmp(arg, "-O", 2)) {
+			orderfile = arg + 2;
+			continue;
+		}
 		if (!strcmp(arg, "--pickaxe-all")) {
 			pickaxe_opts = DIFF_PICKAXE_ALL;
 			continue;
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -888,7 +888,8 @@ void diff_flush(int diff_output_style, i
 void diffcore_std(const char **paths,
 		  int detect_rename, int rename_score,
 		  const char *pickaxe, int pickaxe_opts,
-		  int break_opt)
+		  int break_opt,
+		  const char *orderfile)
 {
 	if (paths && paths[0])
 		diffcore_pathspec(paths);
@@ -898,6 +899,8 @@ void diffcore_std(const char **paths,
 		diffcore_rename(detect_rename, rename_score);
 	if (pickaxe)
 		diffcore_pickaxe(pickaxe, pickaxe_opts);
+	if (orderfile)
+		diffcore_order(orderfile);
 }
 
 void diff_addremove(int addremove, unsigned mode,
diff --git a/diff.h b/diff.h
--- a/diff.h
+++ b/diff.h
@@ -43,12 +43,15 @@ extern void diffcore_pickaxe(const char 
 
 extern void diffcore_pathspec(const char **pathspec);
 
-extern void diffcore_break(int);
+extern void diffcore_order(const char *orderfile);
+
+extern void diffcore_break(int min_score);
 
 extern void diffcore_std(const char **paths,
 			 int detect_rename, int rename_score,
 			 const char *pickaxe, int pickaxe_opts,
-			 int break_opt);
+			 int break_opt,
+			 const char *orderfile);
 
 extern int diff_queue_is_empty(void);
 
diff --git a/diffcore-order.c b/diffcore-order.c
new file mode 100644
--- /dev/null
+++ b/diffcore-order.c
@@ -0,0 +1,122 @@
+/*
+ * Copyright (C) 2005 Junio C Hamano
+ */
+#include "cache.h"
+#include "diff.h"
+#include "diffcore.h"
+#include <fnmatch.h>
+
+static char **order;
+static int order_cnt;
+
+static void prepare_order(const char *orderfile)
+{
+	int fd, cnt, pass;
+	void *map;
+	char *cp, *endp;
+	struct stat st;
+
+	if (order)
+		return;
+
+	fd = open(orderfile, O_RDONLY);
+	if (fd < 0)
+		return;
+	if (fstat(fd, &st)) {
+		close(fd);
+		return;
+	}
+	map = mmap(NULL, st.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
+	close(fd);
+	if (-1 == (int)(long)map)
+		return;
+	endp = map + st.st_size;
+	for (pass = 0; pass < 2; pass++) {
+		cnt = 0;
+		cp = map;
+		while (cp < endp) {
+			char *ep;
+			for (ep = cp; ep < endp && *ep != '\n'; ep++)
+				;
+			/* cp to ep has one line */
+			if (*cp == '\n' || *cp == '#')
+				; /* comment */
+			else if (pass == 0)
+				cnt++;
+			else {
+				if (*ep == '\n') {
+					*ep = 0;
+					order[cnt] = cp;
+				}
+				else {
+					order[cnt] = xmalloc(ep-cp+1);
+					memcpy(order[cnt], cp, ep-cp);
+					order[cnt][ep-cp] = 0;
+				}
+				cnt++;
+			}
+			if (ep < endp)
+				ep++;
+			cp = ep;
+		}
+		if (pass == 0) {
+			order_cnt = cnt;
+			order = xmalloc(sizeof(*order) * cnt);
+		}
+	}
+}
+
+struct pair_order {
+	struct diff_filepair *pair;
+	int orig_order;
+	int order;
+};
+
+static int match_order(const char *path)
+{
+	int i;
+	char p[PATH_MAX];
+
+	for (i = 0; i < order_cnt; i++) {
+		strcpy(p, path);
+		while (p[0]) {
+			char *cp;
+			if (!fnmatch(order[i], p, 0))
+				return i;
+			cp = strrchr(p, '/');
+			if (!cp)
+				break;
+			*cp = 0;
+		}
+	}
+	return order_cnt;
+}
+
+static int compare_pair_order(const void *a_, const void *b_)
+{
+	struct pair_order const *a, *b;
+	a = (struct pair_order const *)a_;
+	b = (struct pair_order const *)b_;
+	if (a->order != b->order)
+		return a->order - b->order;
+	return a->orig_order - b->orig_order;
+}
+
+void diffcore_order(const char *orderfile)
+{
+	struct diff_queue_struct *q = &diff_queued_diff;
+	struct pair_order *o = xmalloc(sizeof(*o) * q->nr);
+	int i;
+
+	prepare_order(orderfile);
+	for (i = 0; i < q->nr; i++) {
+		o[i].pair = q->queue[i];
+		o[i].orig_order = i;
+		o[i].order = match_order(o[i].pair->two->path);
+	}
+	qsort(o, q->nr, sizeof(*o), compare_pair_order);
+	for (i = 0; i < q->nr; i++)
+		q->queue[i] = o[i].pair;
+	free(o);
+	return;
+}



* Re: [PATCH] Pickaxe fixes.
From: Junio C Hamano @ 2005-05-30  7:35 UTC (permalink / raw)
  To: Thomas Glanzmann; +Cc: git
In-Reply-To: <20050528162257.GE4881@cip.informatik.uni-erlangen.de>

>>>>> "TG" == Thomas Glanzmann <sithglan@stud.uni-erlangen.de> writes:

TG> ... However at the moment I don't
TG> have an opinion on this, I have to use it a bit longer. But it is a good
TG> thing that I know by now that it limits its view to the subdirectory
TG> after your patch-train is applied.

Your opinion on this would not count anymore ;-).  Pathspec is now
at the beginning of the processing chain.



* Re: -p diff output and the 'Index:' line
From: Junio C Hamano @ 2005-05-30  7:42 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20050529190305.GP1036@pasky.ji.cz>

>>>>> "PB" == Petr Baudis <pasky@ucw.cz> writes:

PB> It's just something along the lines of "Me Og. Og sees /^+/. Og makes
PB> the line green." written in gawk (actually I'm not sure if pure awk
PB> wouldn't do, but I actually don't know the language), so I don't think
PB> the external diff thing would've helped me with that in any way.

Ah, I see.  I thought you were talking about the Index: and
separator lines.  To colorize the diff/patch part, you need to
parse the diff output with sed/awk/perl and annotate it anyway,
and it does not matter whether you annotate within
GIT_EXTERNAL_DIFF or outside.  I agree with you that using the
GIT_EXTERNAL_DIFF mechanism would not help you here.
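
As an illustration only, the kind of line-by-line annotation being
discussed can be sketched in C.  This is not Petr's gawk script; the
function names and the choice of colors are assumptions made for the
example.  It looks only at the start of each line, so a removed line
whose content begins with "--" would be mistaken for a file header,
and lines longer than the buffer are colored piecewise.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ANSI escape for a diff line, or "" to leave the line uncolored. */
static const char *diff_line_color(const char *line)
{
	assert(line != NULL);
	if (!strncmp(line, "+++", 3) || !strncmp(line, "---", 3))
		return "\033[1m";  /* file header: bold */
	if (line[0] == '+')
		return "\033[32m"; /* added line: green */
	if (line[0] == '-')
		return "\033[31m"; /* removed line: red */
	if (!strncmp(line, "@@", 2))
		return "\033[36m"; /* hunk header: cyan */
	return "";
}

/* Copy a patch from in to out, wrapping lines in color escapes. */
static void colorize(FILE *in, FILE *out)
{
	char buf[4096];

	while (fgets(buf, sizeof(buf), in)) {
		const char *color = diff_line_color(buf);
		size_t len = strlen(buf);
		int has_nl = (len && buf[len - 1] == '\n');

		if (has_nl)
			buf[len - 1] = 0; /* keep the reset before the newline */
		fprintf(out, "%s%s%s%s", color, buf,
			*color ? "\033[0m" : "", has_nl ? "\n" : "");
	}
}
```

Calling colorize(stdin, stdout) from main() turns this into a filter
that the output of a diff command with -p can be piped through.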






* Re: Problem with cg-diff <file>
From: Junio C Hamano @ 2005-05-30  7:54 UTC (permalink / raw)
  To: Petr Baudis; +Cc: GIT Mailing List
In-Reply-To: <20050530003242.GA1036@pasky.ji.cz>

>>>>> "PB" == Petr Baudis <pasky@ucw.cz> writes:

PB> Ok, so this is what you get when you mix: sleepiness, performing only
PB> mental experiments not verified in practice, and inattentive reading of
PB> the code.

PB> I'm sorry for bothering. Instruct yourself from my bad example, please.
PB> :-)

If you forbid people to ask for help whenever the person being
asked might find the question groundless or based on "only
mental experiments not verified in practice and inattentive
reading of the code", the value of having a community diminishes.

We ask questions and ask for help because we know others know
more about things we do not know offhand, not necessarily
because we would not ever be able to figure them out ourselves.

If you know somebody else would know the answer immediately for
something that may take you a day or so to figure out, asking
for help is the right thing to do --- your time is better spent
on what you do best (e.g. improving Cogito).  I should not feel
bothered by your questions, and I am certainly not feeling
bothered at all (well, at least until seeing the last sentence,
and wondering what you really meant ;-)).

Always glad to be of help.



