git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Handling merge conflicts a bit more gracefully..
@ 2005-06-08 20:55 Linus Torvalds
  2005-06-08 23:07 ` Junio C Hamano
  2005-06-09  3:07 ` Jeff Garzik
  0 siblings, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2005-06-08 20:55 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano, Jeff Garzik


Ok, Jeff reported that whenever there is a merge conflict, he ends up 
really punting on it and doing it all with diffs, which clearly meant that 
I had to fix up my silly things for this. Which I think I've done now.

What happens now in the case of a merge conflict is:
 - the merge is obviously not committed
 - we do all the successful merges, and update the index file for them
 - for the files that conflict, we force the index to contain the old
   version of the file (ie we remove the merge from the index), and we
   write the (failed) output of the merge into the working directory, and
   we complain loudly:

	Auto-merging xyzzy.
	merge: warning: conflicts during merge
	ERROR: Merge conflict in xyzzy.
	fatal: merge program failed
	Automatic merge failed, fix up by hand

at which point a normal "git-diff-files -p xyzzy" will show the incomplete
merge results (as relative to the original BRANCH you started with), and
in fact you can also do "git-diff-cache -p MERGE_HEAD xyzzy" to see the
same thing (but relative to the branch you tried to merge).

You then fix up the merge failure by hand (exactly the way you'd do with 
CVS), and you do a "git-update-cache xyzzy" when you're happy with the end 
result. Then a simple "git commit" should do the right thing.

If you decide that the merge is too hard to undo, you'd do:

	git-read-tree -u -m HEAD
	rm .git/MERGE_HEAD

and use git-checkout-cache judiciously to remove any edits the merge did.

This is definitely not perfect, but it's a hell of a lot more usable than
it used to be, and not really worse than what CVS people are used to (and
usually a lot better, since git will obviously get the origin of a
three-way merge right, unlike CVS).

Comments? It would be good to have people test this and maybe even write a 
few automated tests that it all works as expected..

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-08 20:55 Handling merge conflicts a bit more gracefully Linus Torvalds
@ 2005-06-08 23:07 ` Junio C Hamano
  2005-06-08 23:35   ` Linus Torvalds
  2005-06-09  3:07 ` Jeff Garzik
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-06-08 23:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> What happens now in the case of a merge conflict is:
LT>  - the merge is obviously not committed
LT>  - we do all the successful merges, and update the index file for them
LT>  - for the files that conflict, we force the index to contain the old
LT>    version of the file (ie we remove the merge from the index), and we
LT>    write the (failed) output of the merge into the working directory, and
LT>    we complain loudly:

LT> Comments? It would be good to have people test this and maybe even write a 
LT> few automated tests that it all works as expected..

OK, I'll bite.  Other than some minor details, the work tree
seems to be updated with the result of the merge, either
successful one or failed one.

2a68a8659f7dc55fd285d235ae2d19e7a8116c30 \
(from f9e7750621ca5e067f58a679caff5ff2f9881c4c)
diff --git a/git-merge-one-file-script b/git-merge-one-file-script
--- a/git-merge-one-file-script
+++ b/git-merge-one-file-script
@@ -19,22 +19,25 @@ case "${1:-.}${2:-.}${3:-.}" in
 # Deleted in both.
 #
 "$1..")
-	echo "ERROR: $4 is removed in both branches."
-	echo "ERROR: This is a potential rename conflict."
-	exit 1;;
+	echo "WARNING: $4 is removed in both branches."
+	echo "WARNING: This is a potential rename conflict."
+	exec git-update-cache --remove -- "$4" ;;

Making sure that the path does not exist in the work tree with
test -f "$4" would be more sensible, before running --remove.

 #
 # Deleted in one and unchanged in the other.
 #
 "$1.." | "$1.$1" | "$1$1.")
 	echo "Removing $4"
-	exec git-update-cache --force-remove "$4" ;;
+	rm -f -- "$4"
+	exec git-update-cache --remove -- "$4" ;;

Make sure "$4" is not a directory, perhaps?  At least barf if
that 'rm -f -- "$4"' fails?

 #
 # Modified in both, but differently.
 #

@@ -55,19 +60,21 @@ case "${1:-.}${2:-.}${3:-.}" in
 	orig=`git-unpack-file $1`
 	src1=`git-unpack-file $2`
 	src2=`git-unpack-file $3`
-	merge "$src2" "$orig" "$src1"
+	merge -p "$src1" "$orig" "$src2" > "$4"
 	ret=$?
+	rm -f -- "$orig" "$src1" "$src2"
 	if [ "$6" != "$7" ]; then
 		echo "ERROR: Permissions $5->$6->$7 don't match."
+		ret=1
 	fi
 	if [ $ret -ne 0 ]; then
-		echo "ERROR: Leaving conflict merge in $src2."
+		# Reset the index to the first branch, making
+		# git-diff-file useful
+		git-update-cache --add --cacheinfo "$6" "$2" "$4"
+		echo "ERROR: Merge conflict in $4."
 		exit 1
 	fi
-	sha1=`git-write-blob "$src2"` || {
-		echo "ERROR: Leaving conflict merge in $src2."
-	}
-	exec git-update-cache --add --cacheinfo "$6" $sha1 "$4" ;;
+	exec git-update-cache --add -- "$4" ;;
 *)
 	echo "ERROR: Not handling case $4: $1 -> $2 -> $3" ;;
 esac

Again, make sure "$4" is not a directory before redirecting into
it from merge, so that you can tell merge failures from it?


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-08 23:07 ` Junio C Hamano
@ 2005-06-08 23:35   ` Linus Torvalds
  2005-06-09  0:03     ` Junio C Hamano
                       ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Linus Torvalds @ 2005-06-08 23:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Wed, 8 Jun 2005, Junio C Hamano wrote:
>  # Deleted in both.
> 
> Making sure that the path does not exist in the work tree with
> test -f "$4" would be more sensible, before running --remove.

Yeah, my (broken) thinking was that since it wasn't in both, it wasn't in 
the working directory either, but you're right, that's just crazy talk. 
There could be a stale file there.

Made it do a

	rm -f -- "$4" || exit 1

instead (and changed the other one to do the "|| exit 1" too, since you're 
also obviously right on the directory issue).

>  # Modified in both, but differently.
> +	merge -p "$src1" "$orig" "$src2" > "$4"
> 
> Again, make sure "$4" is not a directory before redirecting into
> it from merge, so that you can tell merge failures from it?

Hmm.. What's the cleanest way to check for redirection errors, but still
be able to distinguish those cleanly from "merge" itself returning an
error?

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-08 23:35   ` Linus Torvalds
@ 2005-06-09  0:03     ` Junio C Hamano
  2005-06-09  0:41       ` Linus Torvalds
  2005-06-09  0:11     ` Junio C Hamano
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  0:03 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

>> # Modified in both, but differently.
>> +	merge -p "$src1" "$orig" "$src2" > "$4"
>> 
>> Again, make sure "$4" is not a directory before redirecting into
>> it from merge, so that you can tell merge failures from it?

LT> Hmm.. What's the cleanest way to check for redirection errors, but still
LT> be able to distinguish those cleanly from "merge" itself returning an
LT> error?

I do not think you can, unless you are willing to parse shell
error messages, which I do not want you to be willing to ;-).

    : siamese; ls -dlF junk j.py
    ----------  1 junio junio  845 May  7  2004 j.py
    drwxrwxr-x  2 junio junio 4096 May  4 22:31 junk/
    : siamese; echo foo >j.py ; echo $?
    bash: j.py: Permission denied
    1
    : siamese; echo foo >junk ; echo $?
    bash: junk: Is a directory
    1

I think you have a bigger problem of leading paths, BTW.

Since we would want to have the merge result file at that path,
and not being able to create such is an error, how about doing
dumb and simple, like:

    d=`dirname "$4"` &&
    mkdir -p "$d" &&
    rm -f -- "$4" &&
    : >"$4" || {
        echo "barf"
        exit 1
    }
    merge -p "$src1" "$orig" "$src2" >"$4"
    ret=$?


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-08 23:35   ` Linus Torvalds
  2005-06-09  0:03     ` Junio C Hamano
@ 2005-06-09  0:11     ` Junio C Hamano
  2005-06-09  1:08       ` Linus Torvalds
  2005-06-09  7:02     ` [PATCH 0/3] " Junio C Hamano
  2005-06-18  0:15     ` Handling merge conflicts a bit more gracefully Herbert Xu
  3 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  0:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

While I have your attention, I have been thinking about a
problem at lower level than what is being discussed.

Consider the following two command sequences:

     (1) git-read-tree -m $H $M	&& git-write-tree

     (2) I=`git-write-tree` &&
         git-read-tree -m $H $I $M &&
         git-merge-cache -o git-merge-one-file-script -a &&
         git-write-tree

I think they should be equivalent in that:

   - when (1) refuses to run, (2) should either cause
     git-read-tree to refuse, or at least should result in an
     unmerged cache and git-write-tree phase should fail;

   - when (1) succeeds, (2) should also succeed, and the
     resulting tree from both should be the same.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-09  0:03     ` Junio C Hamano
@ 2005-06-09  0:41       ` Linus Torvalds
  2005-06-09  1:04         ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2005-06-09  0:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Wed, 8 Jun 2005, Junio C Hamano wrote:
> 
> I do not think you can, unless you are willing to parse shell
> error messages, which I do not want you to be willing to ;-).

Yeah, no. 

> I think you have a bigger problem of leading paths, BTW.

Gotcha. I committed a largely untested fix that hopefully does this all 
right.

I'm currently using your suggested thing (inside a function), but I think 
I'll instead make it do

	git-update-cache --add --cacheinfo ... &&
		git-checkout-cache -u -f "$4"

which seems even simpler.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-09  0:41       ` Linus Torvalds
@ 2005-06-09  1:04         ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  1:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> I'll instead make it do

LT> 	git-update-cache --add --cacheinfo ... &&
LT> 		git-checkout-cache -u -f "$4"

LT> which seems even simpler.

I like that one much much better.  Consistently using
checkout-cache -f everywhere is much preferred.  It creates the
leading paths itself, and even nukes interfering files it finds
while creating leading directories, which the verify_path using
mkdir -p would not give you.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-09  0:11     ` Junio C Hamano
@ 2005-06-09  1:08       ` Linus Torvalds
  2005-06-09  2:15         ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2005-06-09  1:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Wed, 8 Jun 2005, Junio C Hamano wrote:
> 
> Consider the following two command sequences:
> 
>      (1) git-read-tree -m $H $M	&& git-write-tree
> 
>      (2) I=`git-write-tree` &&
>          git-read-tree -m $H $I $M &&
>          git-merge-cache -o git-merge-one-file-script -a &&
>          git-write-tree
> 
> I think they should be equivalent in that:
> 
>    - when (1) refuses to run, (2) should either cause
>      git-read-tree to refuse, or at least should result in an
>      unmerged cache and git-write-tree phase should fail;
> 
>    - when (1) succeeds, (2) should also succeed, and the
>      resulting tree from both should be the same.

I think that sounds reasonable. Is it not the case now?

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-09  1:08       ` Linus Torvalds
@ 2005-06-09  2:15         ` Junio C Hamano
  2005-06-09  2:48           ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  2:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> I think that sounds reasonable. Is it not the case now?

Well, except that $I may validly be an empty tree ;-), so not
quite.

In case it was not clear, where I am headed is this.  I would
like to rip out the two-tree "carry forward" implementation from
read-tree, and replace it with:

    read_cache() -- current goes to stage0
    read_tree(H) -- H goes to stage1
    read_tree(M) -- M goes to stage3
    for each path
        if it appears in stage0, copy it to stage2
        else if it appears in stage1, copy it to stage2
    threeway_merge() !!

And then the resulting possibly unmerged cache can be resolved
exactly the same way with merge-cache.

The trouble I feel with the current "carry forward" code is that
when it works it does sensible thing, but otherwise does not
help the end user at all.  With all the work going into making
merge-one-file-script nicer today, I think leveraging three-way
merge support for two-tree fast forward case would make a lot
more sense than keeping the all-or-nothing carry forward code I
recently added to it.

When/if that happens, then the current fast-forward code would
need to be changed from:

    read-tree -m $H $M && echo $M >.git/HEAD

to

    read-tree -m $H $M &&
    if unmerged paths in the resulting cache
    then
        merge-cache -o merge-one-file-script -a
    fi &&
    echo $M >.git/HEAD

and the user's local changes since H when fast forwarding to M
would be handled with the same workflow as the three-way case.

Hmm.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-09  2:15         ` Junio C Hamano
@ 2005-06-09  2:48           ` Linus Torvalds
  2005-06-09  4:35             ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2005-06-09  2:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Wed, 8 Jun 2005, Junio C Hamano wrote:
> 
> Well, except that $I may validly be an empty tree ;-), so not
> quite.

Yeah, ok, so the fact that we allow missing things in the index (which was 
debatable to start with) makes for exceptions. 

We could certainly be stricter about the index contents, and require that
they match the branch we're merging from exactly, rather than be a subset.  
That said, I'm not convinced that it's worth it, even if it means that a
two-way merge ends up acceping merges that a three-way one never would.

> In case it was not clear, where I am headed is this.  I would
> like to rip out the two-tree "carry forward" implementation from
> read-tree, and replace it with:

Yeah, I see that, I'm just not entirely convinced it's a good idea.

The thing is, a two-way merge really _is_ very different from a three-way
one, in that it's a fast-forward, and the fact that it allows for things
that the more complex "full" case wouldn't allow I feel is something of an
advantage. And it _can_ allow them exactly because it's not the full case.

I like your concept:

> When/if that happens, then the current fast-forward code would
> need to be changed from:
> 
>     read-tree -m $H $M && echo $M >.git/HEAD
> 
> to
> 
>     read-tree -m $H $M &&
>     if unmerged paths in the resulting cache
>     then
>         merge-cache -o merge-one-file-script -a
>     fi &&
>     echo $M >.git/HEAD
> 
> and the user's local changes since H when fast forwarding to M
> would be handled with the same workflow as the three-way case.

but that one doesn't really help the case of stuff he hasn't marked 
up-dated, so I think that's actually a special case that _isn't_ the 
important one. I think the case that is more important (and more likely to 
hit people) is when they have something in their working tree that 
conflicts with the merge, and then what you want is really that the 
current "update" code do the three-way merge in the working directory, not 
that it's done on the index file contents.

So I see where you are coming from, but I don't think the index file is 
the most important case. The more important case is the one that the 
three-way merge doesn't handle either!

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-08 20:55 Handling merge conflicts a bit more gracefully Linus Torvalds
  2005-06-08 23:07 ` Junio C Hamano
@ 2005-06-09  3:07 ` Jeff Garzik
  2005-06-09  4:11   ` Linus Torvalds
  1 sibling, 1 reply; 33+ messages in thread
From: Jeff Garzik @ 2005-06-09  3:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano

Linus Torvalds wrote:
> Comments? It would be good to have people test this and maybe even write a 
> few automated tests that it all works as expected..

I've got a few libata branches I have been putting off updating to the 
latest kernel, because of merge conflicts ('chs-support' and 'passthru' 
branches of libata-dev.git).

If this merge-gracefully stuff is all checked into git.git, I can 
definitely give it some real-world testing.

	Jeff



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-09  3:07 ` Jeff Garzik
@ 2005-06-09  4:11   ` Linus Torvalds
  0 siblings, 0 replies; 33+ messages in thread
From: Linus Torvalds @ 2005-06-09  4:11 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Git Mailing List, Junio C Hamano



On Wed, 8 Jun 2005, Jeff Garzik wrote:
> 
> If this merge-gracefully stuff is all checked into git.git, I can 
> definitely give it some real-world testing.

Yup, all there. Not a _ton_ of testing exactly, but I did actually test
both the content conflict case and the "new directory" case. At least 
once.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-09  2:48           ` Linus Torvalds
@ 2005-06-09  4:35             ` Junio C Hamano
  2005-06-09  4:54               ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  4:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Yeah, ok, so the fact that we allow missing things in the
LT> index (which was debatable to start with) makes for
LT> exceptions.

Not just that.  Another big difference is that we allow _extra_
things in the index in two-tree case (i.e. local additions).
But I do not think these exceptions are necessarily bad.

And you are right that two-tree is _very_ different from
three-way merge.

LT> We could certainly be stricter about the index contents, and
LT> require that they match the branch we're merging from
LT> exactly, rather than be a subset.

I guess great minds do not always think alike.  I was going in
quite the opposite direction.  I vaguely recall saying this
before on this list ;-)

With the current three-way code, if I rewrite two-way merge
using the three-way "read-tree -m H I-mixed-with-H M" (emulated
two-tree fast forward, where "I" denotes "tree that would have
resulted from the original cache"), it would give quite
different results from the "carry forward" two-way code we have.
So in that sense, three-way and two-way are quite different.

I have, however, not convinced myself that this difference is
coming from some fundamental difference between two-tree fast
forward and three-way merge.  If desirable results fall out
naturally for the "emulated two-way" case by handling three-way
case more carefully (e.g. not having stricter index requirements
than necessary), that would be wonderful.  I think, for example,
there are places where we have too strict index requirements in
three-way merge (grep for '(ALT)' in t/t1000*.sh test file).

I probably am dreaming, though.

LT> I think the case that is more important (and more likely to
LT> hit people) is when they have something in their working
LT> tree that conflicts with the merge, and then what you want
LT> is really that the current "update" code do the three-way
LT> merge in the working directory, not that it's done on the
LT> index file contents.

LT> ..., but I don't think the index file is the most important
LT> case. The more important case is the one that the three-way
LT> merge doesn't handle either!

I agree with all of the above.  Their working tree has changes
from H, and merging M into H conflicts with those changes.  That
means, although they did not actually make a formal commit, what
they have is essentially this:

         cache contents
         is here
         v
      ---I---
     /       ^work tree contents is here
  --H
     \
      ----------M

which means we are exactly in the same situation as "merge I and
M pivoting on H" three-way merge, with a dirty work tree.  Any
solution and help we would give to the end-user for the
three-way case would automatically help this two-way case,
wouldn't it?

I do not think index file is important either; maybe I am not
really understanding your argument.  I fully accept the new
world order with today's merge-one-file-script changes, that the
merge result will be left in the work tree for the user to
verify and sort out.  What I am trying to do in the above
picture is to help the end-user forward-porting differences in I
since H (along with the work tree changes since I) when doing a
fast-forward from H to M happens, using the files in the work
tree.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-09  4:35             ` Junio C Hamano
@ 2005-06-09  4:54               ` Linus Torvalds
  2005-06-09  5:15                 ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2005-06-09  4:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Wed, 8 Jun 2005, Junio C Hamano wrote:
> >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
> 
> LT> Yeah, ok, so the fact that we allow missing things in the
> LT> index (which was debatable to start with) makes for
> LT> exceptions.
> 
> Not just that.  Another big difference is that we allow _extra_
> things in the index in two-tree case (i.e. local additions).
> But I do not think these exceptions are necessarily bad.

Well, they'd be bad in a three-way merge.

The reason they aren't bad in a two-way merge is that you don't commit the 
result - the commits have been done already. 

That's really the big conceptual difference between two-way and three-way:  
never mind the merge algorithm itself.

(In fact, in many ways, two-way merges are really just the same as a 
one-way merge, except it now knows where it came from, so it can do sanity 
checking).

As to working tree changes:

> which means we are exactly in the same situation as "merge I and
> M pivoting on H" three-way merge, with a dirty work tree.  Any
> solution and help we would give to the end-user for the
> three-way case would automatically help this two-way case,
> wouldn't it?

Yes.

In fact, there's a fairly simple solution, which is to remove the current 
check for "verify_uptodate()" and instead replace it with the "update" 
phase not just writing the file, but actually doing a three-way merge on 
it.

NOTE! This would not affect the resulting _tree_ in any way at all. It 
would literally only affect how we write out the working directory. Right 
now we just fail when the working file isn't up-to-date, and that could be 
replaced with instead doing a

	merge W I M

where "W" is the working file, "I" is the index file, and "M" is the merge 
result that we currently just write out directly.

In the special case of I == M, we already do _exactly_ this: we know that
since I=M, the merge will be W, so we don't do the update at all.

So in fact, doing a 3-way merge is really a generalization of what we
already do, and removes a failure case.

NOTE! This 3way merge is fundamentally _different_ from the 3-way merge
that is done by "git-merge-one-file-script" that we already do. _That_
3-way merge is done not on the working files, but on the results in the
trees, while this new 3way merge would be done purely in the working
directory (ie it wouldn't make sense without the "-u" flag).

If we do this, I'd personally suggest it be another flag, possibly "-u3" 
instead of just plain "-u".

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-09  4:54               ` Linus Torvalds
@ 2005-06-09  5:15                 ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  5:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> On Wed, 8 Jun 2005, Junio C Hamano wrote:
>> >>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
>> 
LT> Yeah, ok, so the fact that we allow missing things in the
LT> index (which was debatable to start with) makes for
LT> exceptions.
>> 
>> Not just that.  Another big difference is that we allow _extra_
>> things in the index in two-tree case (i.e. local additions).
>> But I do not think these exceptions are necessarily bad.

LT> Well, they'd be bad in a three-way merge.

No question about it.  I am not proposing to conditionally
accept extra entries in 3-way case.

But when "read-tree -m H I-mixed-with-H M" 3-way merge is
emulating "read-tree -m H M", it does not need to accept any
extra entries in the cache, because in this case "our head" tree
is "I-mixed-with-H", which by definition contains everything in
the current cache (remember, "I-mixed-with-H" is built by
looking at each path and if it has stage0 then copy it to stage2
otherwise if it has stage1 then copy it to stage2, after reading
the index file into stage0, H into stage1, and M into stage 3).

And after such "3-way merge emulating 2-way fast-forward," you
can commit---the commit will have a single parent, M, and the
difference it contains is the changes the user made while he was
working off of H, rebased to M.

Of course this all assumes that we have a perfectly working
three-way merge ;-).


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 0/3] Handling merge conflicts a bit more gracefully
  2005-06-08 23:35   ` Linus Torvalds
  2005-06-09  0:03     ` Junio C Hamano
  2005-06-09  0:11     ` Junio C Hamano
@ 2005-06-09  7:02     ` Junio C Hamano
  2005-06-09  7:04       ` [PATCH 1/3] read-tree.c: rename local variables used in 3-way merge code Junio C Hamano
                         ` (2 more replies)
  2005-06-18  0:15     ` Handling merge conflicts a bit more gracefully Herbert Xu
  3 siblings, 3 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  7:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This series consists of three patches.

  [PATCH 1/3] read-tree.c: rename local variables used in 3-way merge code.
  [PATCH 2/3] read-tree -m 3-way: loosen index requirements that is too strict.
  [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally

You may have noticed that I already described some "alternative
semantics" in the 3-way merge test script t1000.  This set of
patches implements some of them, namely the following 5 cases:

     O       A       B         result      index requirements
-------------------------------------------------------------------
  5  missing exists  A==B      take A      must match A, if exists.
 ------------------------------------------------------------------
  6  exists  missing missing   remove      must not exist.
 ------------------------------------------------------------------
  8  exists  missing O==B      remove      must not exist.
 ------------------------------------------------------------------
 10  exists  O==A    missing   remove      must match A and be
                                           up-to-date, if exists.
 ------------------------------------------------------------------
 14  exists  O==A    O!=B      take B      if exists, must either (1)
                                           match A and be up-to-date,
                                           or (2) match B.
-------------------------------------------------------------------

The first patch is to match the local variable names used in the
functions involved to the names used in the case matrix.

Case #14 is resolved identically as the old code does, but the
index requirement old code placed on this case was stricter than
necessary.  In this case, satisfying the usual rule of "match A
and be up-to-date if exists" is certainly OK, but additionally,
if the original index matches the tree being merged (without
even being up-to-date) is also permissible, because there would
be no information loss or work-tree clobbering if we allowed it.
The second patch in the series corrects this.

Case #5, #6, #8, and #10 were traditionally kept unmerged in the
index file when read-tree is done, and resolving them was left
to the script.  By resolving these internally, we can loosen the
index requirements without compromising correctness for case #5.
Other three cases could still be left for the "script policy"
because this change does not affect the index requirements for
these cases, but it was simple enough to implement them and this
would not be too controversial a change.  The third patch in the
series implements these changes.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/3] read-tree.c: rename local variables used in 3-way merge code.
  2005-06-09  7:02     ` [PATCH 0/3] " Junio C Hamano
@ 2005-06-09  7:04       ` Junio C Hamano
  2005-06-09  7:05       ` [PATCH 2/3] read-tree -m 3-way: loosen index requirements that is too strict Junio C Hamano
  2005-06-09  7:06       ` [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
  2 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  7:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

I'd hate to do this, but every time I try to touch this code and
validate what it does against the case matrix in t1000 test, I
get confused.  The variable names are renamed to match the case
matrix.  Now they are named as:

    i -- entry from the index file (formerly known as "old")
    o -- merge base (formerly known as "a")
    a -- our head (formerly known as "b")
    b -- merge head (formerly known as "c")

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 read-tree.c |   40 ++++++++++++++++++++--------------------
 1 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -40,9 +40,9 @@ static int same(struct cache_entry *a, s
  * This removes all trivial merges that don't change the tree
  * and collapses them to state 0.
  */
-static struct cache_entry *merge_entries(struct cache_entry *a,
-					 struct cache_entry *b,
-					 struct cache_entry *c)
+static struct cache_entry *merge_entries(struct cache_entry *o,
+					 struct cache_entry *a,
+					 struct cache_entry *b)
 {
 	/*
 	 * Ok, all three entries describe the same
@@ -58,16 +58,16 @@ static struct cache_entry *merge_entries
 	 * The "all entries exactly the same" case falls out as
 	 * a special case of any of the "two same" cases.
 	 *
-	 * Here "a" is "original", and "b" and "c" are the two
+	 * Here "o" is "original", and "a" and "b" are the two
 	 * trees we are merging.
 	 */
-	if (a && b && c) {
-		if (same(b,c))
-			return c;
+	if (o && a && b) {
 		if (same(a,b))
-			return c;
-		if (same(a,c))
 			return b;
+		if (same(o,a))
+			return b;
+		if (same(o,b))
+			return a;
 	}
 	return NULL;
 }
@@ -126,29 +126,29 @@ static int merged_entry(struct cache_ent
 
 static int threeway_merge(struct cache_entry *stages[4], struct cache_entry **dst)
 {
-	struct cache_entry *old = stages[0];
-	struct cache_entry *a = stages[1], *b = stages[2], *c = stages[3];
+	struct cache_entry *i = stages[0];
+	struct cache_entry *o = stages[1], *a = stages[2], *b = stages[3];
 	struct cache_entry *merge;
 	int count;
 
 	/*
-	 * If we have an entry in the index cache ("old"), then we want
+	 * If we have an entry in the index cache ("i"), then we want
 	 * to make sure that it matches any entries in stage 2 ("first
-	 * branch", aka "b").
+	 * branch", aka "a").
 	 */
-	if (old) {
-		if (!b || !same(old, b))
+	if (i) {
+		if (!a || !same(i, a))
 			return -1;
 	}
-	merge = merge_entries(a, b, c);
+	merge = merge_entries(o, a, b);
 	if (merge)
-		return merged_entry(merge, old, dst);
-	if (old)
-		verify_uptodate(old);
+		return merged_entry(merge, i, dst);
+	if (i)
+		verify_uptodate(i);
 	count = 0;
+	if (o) { *dst++ = o; count++; }
 	if (a) { *dst++ = a; count++; }
 	if (b) { *dst++ = b; count++; }
-	if (c) { *dst++ = c; count++; }
 	return count;
 }
 
------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 2/3] read-tree -m 3-way: loosen index requirements that is too strict.
  2005-06-09  7:02     ` [PATCH 0/3] " Junio C Hamano
  2005-06-09  7:04       ` [PATCH 1/3] read-tree.c: rename local variables used in 3-way merge code Junio C Hamano
@ 2005-06-09  7:05       ` Junio C Hamano
  2005-06-09  7:06       ` [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
  2 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  7:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This patch teaches "read-tree -m O A B" that, when only "the
other tree" changed a path, and if the work tree already has
that change, we are not in a situation that would clobber the
cache and the working tree, and lets the merge succeed; this is
case #14ALT in t1000 test.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 read-tree.c                 |   16 ++++++++++++++++
 t/t1000-read-tree-m-3way.sh |    9 +++++++++
 2 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -131,6 +131,22 @@ static int threeway_merge(struct cache_e
 	struct cache_entry *merge;
 	int count;
 
+	/* The case #14ALT is special in that it allows "i" to match
+	 * the "merged branch", aka "b" and even be dirty, as an
+	 * alternative to the usual 'must match "a" and be up-to-date'
+	 * rule.
+	 */
+	if (o && a && b && same(o, a) && !same(o, b)) {
+		if (i) {
+			if (same(i, b))
+				; /* case #14ALT exception */
+			else if (same(i, a))
+				verify_uptodate(i);
+			else
+				return -1;
+		}
+	}
+	else /* otherwise the original rule applies */
 	/*
 	 * If we have an entry in the index cache ("i"), then we want
 	 * to make sure that it matches any entries in stage 2 ("first
diff --git a/t/t1000-read-tree-m-3way.sh b/t/t1000-read-tree-m-3way.sh
--- a/t/t1000-read-tree-m-3way.sh
+++ b/t/t1000-read-tree-m-3way.sh
@@ -455,6 +455,15 @@ test_expect_success \
      git-read-tree -m $tree_O $tree_A $tree_B &&
      check_result"
 
+test_expect_success \
+    '14ALT - in O && A && B && O==A && O!=B case, matching B is also OK' \
+    "rm -f .git/index NM &&
+     cp .orig-B/NM NM &&
+     git-update-cache --add NM &&
+     echo extra >>NM &&
+     git-read-tree -m $tree_O $tree_A $tree_B &&
+     check_result"
+
 test_expect_failure \
     '14 (fail) - must match and be up-to-date in O && A && B && O==A && O!=B case' \
     "rm -f .git/index NM &&
------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally
  2005-06-09  7:02     ` [PATCH 0/3] " Junio C Hamano
  2005-06-09  7:04       ` [PATCH 1/3] read-tree.c: rename local variables used in 3-way merge code Junio C Hamano
  2005-06-09  7:05       ` [PATCH 2/3] read-tree -m 3-way: loosen index requirements that is too strict Junio C Hamano
@ 2005-06-09  7:06       ` Junio C Hamano
  2005-06-09 15:15         ` Linus Torvalds
  2 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09  7:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This patch teaches "read-tree -m O A B" that some more trivial
cases can be handled internally.  This allows us to loosen
otherwise too strict index requirements in case #5ALT, where
both branches create a new file identically --- the previous
code required index to be up-to-date and aborted the merge when
it is not, but there is no reason to require it to be up-to-date
in this case.

The test vector has been updated to match the new behaviour as
well.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 read-tree.c                 |   16 ++++++++++++++++
 t/t1000-read-tree-m-3way.sh |   27 +++++++++------------------
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -69,6 +69,12 @@ static struct cache_entry *merge_entries
 		if (same(o,b))
 			return a;
 	}
+	/* #5ALT */
+	if (!o && a && b && same(a,b)) {
+		/* Match what git-merge-one-file-script does */
+		printf("Adding %s\n", a->name);
+		return a;
+	}
 	return NULL;
 }
 
@@ -161,6 +167,16 @@ static int threeway_merge(struct cache_e
 		return merged_entry(merge, i, dst);
 	if (i)
 		verify_uptodate(i);
+
+	/* #6ALT, #8ALT, and #10ALT */
+	if ((o && !a && !b) ||
+	    (o && !a && b && same(o, b)) ||
+	    (o && a && !b && same(o, a))) {
+		/* Match what git-merge-one-file-script does */
+		printf("Removing %s\n", o->name); 
+		return 0;
+	}
+
 	count = 0;
 	if (o) { *dst++ = o; count++; }
 	if (a) { *dst++ = a; count++; }
diff --git a/t/t1000-read-tree-m-3way.sh b/t/t1000-read-tree-m-3way.sh
--- a/t/t1000-read-tree-m-3way.sh
+++ b/t/t1000-read-tree-m-3way.sh
@@ -75,21 +75,18 @@ In addition:
 . ../lib-read-tree-m-3way.sh
 
 ################################################################
-# This is the "no trivial merge unless all three exists" table.
+# Trivial "majority when 3 stages exist" merge plus #5ALT, #6ALT,
+# #8ALT, #10ALT trivial merges.
 
 cat >expected <<\EOF
 100644 X 2	AA
 100644 X 3	AA
 100644 X 2	AN
-100644 X 1	DD
 100644 X 3	DF
 100644 X 2	DF/DF
 100644 X 1	DM
 100644 X 3	DM
-100644 X 1	DN
-100644 X 3	DN
-100644 X 2	LL
-100644 X 3	LL
+100644 X 0	LL
 100644 X 1	MD
 100644 X 2	MD
 100644 X 1	MM
@@ -97,8 +94,6 @@ cat >expected <<\EOF
 100644 X 3	MM
 100644 X 0	MN
 100644 X 3	NA
-100644 X 1	ND
-100644 X 2	ND
 100644 X 0	NM
 100644 X 0	NN
 100644 X 0	SS
@@ -108,11 +103,8 @@ cat >expected <<\EOF
 100644 X 2	Z/AA
 100644 X 3	Z/AA
 100644 X 2	Z/AN
-100644 X 1	Z/DD
 100644 X 1	Z/DM
 100644 X 3	Z/DM
-100644 X 1	Z/DN
-100644 X 3	Z/DN
 100644 X 1	Z/MD
 100644 X 2	Z/MD
 100644 X 1	Z/MM
@@ -120,8 +112,6 @@ cat >expected <<\EOF
 100644 X 3	Z/MM
 100644 X 0	Z/MN
 100644 X 3	Z/NA
-100644 X 1	Z/ND
-100644 X 2	Z/ND
 100644 X 0	Z/NM
 100644 X 0	Z/NN
 EOF
@@ -289,23 +279,24 @@ test_expect_failure \
      git-read-tree -m $tree_O $tree_A $tree_B"
 
 test_expect_success \
-    '5 - must match and be up-to-date in !O && A && B && A==B case.' \
+    '5 - must match in !O && A && B && A==B case.' \
     "rm -f .git/index LL &&
      cp .orig-A/LL LL &&
      git-update-cache --add LL &&
      git-read-tree -m $tree_O $tree_A $tree_B &&
      check_result"
 
-test_expect_failure \
-    '5 (fail) - must match and be up-to-date in !O && A && B && A==B case.' \
+test_expect_success \
+    '5 - must match in !O && A && B && A==B case.' \
     "rm -f .git/index LL &&
      cp .orig-A/LL LL &&
      git-update-cache --add LL &&
      echo extra >>LL &&
-     git-read-tree -m $tree_O $tree_A $tree_B"
+     git-read-tree -m $tree_O $tree_A $tree_B &&
+     check_result"
 
 test_expect_failure \
-    '5 (fail) - must match and be up-to-date in !O && A && B && A==B case.' \
+    '5 (fail) - must match in !O && A && B && A==B case.' \
     "rm -f .git/index LL &&
      cp .orig-A/LL LL &&
      echo extra >>LL &&
------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally
  2005-06-09  7:06       ` [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
@ 2005-06-09 15:15         ` Linus Torvalds
  2005-06-09 17:26           ` Junio C Hamano
  2005-06-09 20:35           ` [PATCH 3/3] " Junio C Hamano
  0 siblings, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2005-06-09 15:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Thu, 9 Jun 2005, Junio C Hamano wrote:
>
> This patch teaches "read-tree -m O A B" that some more trivial
> cases can be handled internally.

No, I think this is quite possibly wrong for several reasons.

For one, it makes the rest of the system unaware of the deleted files, so 
nothing ever deletes them from the working directory.

And that's not entirely trivial to fix either, since the obvious fix 
(which is to just do a

	if (update)
		unlink(b->name);

or something like that) is wrong. It's wrong because we must not do the
update until the very end, when we've either merged all entries or we've
failed on an entry that couldn't be merged (that's why I did the extra
CE_UPDATE flag, instead of updating as we go along).

Now, you could fix that by creating a separate list of files to be 
deleted (so this is not fundamental, it's just more complicated than the 
trivial case), but that doesn't help, because there's _another_ reason why 
read-tree shouldn't handle these cases.

Namely that read-tree doesn't have a frigging clue about renames, and 
shouldn't have.

But a real merge program _could_ have a frigging clue, and might notice 
patterns like

 - file got modified in one branch, removed in the other
 - a file got added in the other branch
 - "Hey, that added file looks like the removed one!"
 - Let's merge the modifications from the first branch into the move of 
   the second branch!

See? Now, git-read-tree won't handle the first case anyway, but your 
change _does_ make it handle the "file got added" case, which means that 
now the added file is invisible the the "smart merger", and the smart 
merger can't really tell that it was a rename any more.

So our current stupid file-by-file "git-merge-cache" will never do this, 
but that's a limitation of me being less than the intellectual giant I 
wish I was. So I just do the stupid merges. But I _know_ they are stupid, 
and I would like to leave the door open for somebody else to fix up the 
cases I don't handle.

You're basically closing that door.

Now, you can (validly) argue that you could still just look at the
original trees ("git-diff-tree -C $O $M") and grep for copies/movement and
do it by hand _there_ instead of looking at the result of the read-tree, 
and you may well be right. So again, this is not a _fundamental_ problem, 
although it's a bit more fundamental than the first one. 

So if you want to convince me that it's better to do the rename detection
outside of the index file, go wild. Alternatively, you can argue that we
can always undo this later, when once we _do_ have rename and copy
detection and can try to merge things automatically (what _do_ you do if a
file is copied in one branch and modified in the other? Just warn the poor
user, I guess).

So I just need a little convincing that this is a good idea.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally
  2005-06-09 15:15         ` Linus Torvalds
@ 2005-06-09 17:26           ` Junio C Hamano
  2005-06-09 17:37             ` Linus Torvalds
  2005-06-09 20:35           ` [PATCH 3/3] " Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09 17:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> No, I think this is quite possibly wrong for several reasons.

I agree with everything you said.

I need to regurgitate other points you raised, but one immediate
comment on the "lost remove" case.  The current two-way code has
the same brokenness in that it does not unlink removed files
under "-u".  We either need the "list of files to be removed",
or we need to make two-way abort if we see these "remove" cases.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally
  2005-06-09 17:26           ` Junio C Hamano
@ 2005-06-09 17:37             ` Linus Torvalds
       [not found]               ` <7vbr6fnzf0.fsf@assigned-by-dhcp.cox.net>
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2005-06-09 17:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Thu, 9 Jun 2005, Junio C Hamano wrote:
> 
> I need to regurgitate other points you raised, but one immediate
> comment on the "lost remove" case.  The current two-way code has
> the same brokenness in that it does not unlink removed files
> under "-u".  We either need the "list of files to be removed",
> or we need to make two-way abort if we see these "remove" cases.

Yes, you're right.

Ho humm. I'll think about it. There's no "next" pointer in a struct 
cache-struct, and because we use the on-disk layout (good or bad, I dunno, 
but it does remove the need for copying megabytes of data for some cases) 
we can't just add one. So to generate a list of "deleted" files we'd have 
to make a separate array or something.

Not hard, but it's a bit ugly. I don't see any alternative, though, unless
we really do end up using the same "leave it in the different stages and
force people to run git-merge-cache on the result" thing that the
three-way merge does.

The fact that the three-way merge _might_ also like to remove the entries,
and that the two-way merge already handles the addition of new files, does
kind of argue that we should do it. For symmetry witht he "file add" case, 
if nothing else.

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally
  2005-06-09 15:15         ` Linus Torvalds
  2005-06-09 17:26           ` Junio C Hamano
@ 2005-06-09 20:35           ` Junio C Hamano
  2005-06-09 22:13             ` [PATCH] Add git-diff-stages command Junio C Hamano
  2005-06-10 19:59             ` [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
  1 sibling, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09 20:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Namely that read-tree doesn't have a frigging clue about renames, and 
LT> shouldn't have.

LT> But a real merge program _could_ have a frigging clue, and might notice 
LT> patterns like

LT>  - file got modified in one branch, removed in the other
LT>  - a file got added in the other branch
LT>  - "Hey, that added file looks like the removed one!"
LT>  - Let's merge the modifications from the first branch into the move of 
LT>    the second branch!

LT> Now, you can (validly) argue that you could still just look at the
LT> original trees ("git-diff-tree -C $O $M") and grep for copies/movement and
LT> do it by hand _there_ instead of looking at the result of the read-tree, 
LT> and you may well be right. So again, this is not a _fundamental_ problem, 
LT> although it's a bit more fundamental than the first one. 

My knee-jerk reaction was "No, I would refuse to make that
argument, because making the merge mechanism to examine trees
itself would take us full-circle back to where we started
[*1*]".

I agree we can, as the zeroth order approximation, run two
"diff-tree -B --find-copies-harder -C" [*3*, *4*] on (O,A) and
(O,B) pairs, and compare their output to cover the rename case
[*2*] you described.  I think we also can write a simple program
that reads an unmerged index file and do the equivalent of these
two diff-tree commands.

However, what I suspect to happen in practice is that the lines
of development leading to A and B may have so much modification
to those renamed or copied files since they forked at O that we
may not recognize renames or copies as such by only looking at
(O,A) and (O,B).  In order to do a reasonable job while merging,
we may end up needing to run "diff-tree --stdin -B -C" on the
output of "rev-list O A" to fully follow the rename/copy trail
[*5*].

What all this means is that the simple three-stage information
read-tree -m gives us, which is about only three trees, might
not be enough to handle renames and copies intelligently, when
we need to deal with a pair of trees that have diverged for too
long.

Once we go down this path, arguing against making "read-tree -m"
results useless for such an intelligent merge logic (because it
forces the merge logic to look at the trees and commits
involved) ceases to make much sense, because such an intelligent
merge logic needs to look at more than three trees _anyway_.

What "read-tree -m" gives us, while being very efficient,
elegant and effective in "merge small and merge often" use
pattern we recommend, may not be so useful to implement such an
intelligent merge logic, and instead we would do better if we
did it the hard way by inspecting individual commits.  I do not
have problem with that approach.  It would be a much longer-term
project, though.

So, yes I ended up arguing that the intelligent merge logic
could and probably needs to look at the trees involved ;-).


Among the three-way cases, the only case I think that may make a
practical difference is the case #5ALT, which deals with "a file
added identically in both branches" case.  This is what happens
when a widely accepted patch has been applied independently to
both trees recently (eh, "since they forked").  New files tend
to get updated more often, and allowing the file to be locally
modified, instead of failing the merge in read-tree phase, would
help the workflow.  If the file were modified in the user's
repository, and checked in, then the current 3-way merge code
cannot help the user that much, because we would be in !O && A
&& B && A!=B situation.  I have a suspicion that we could
probably help this case by looking at not just merge base but
the edge commits as well.

I consider #14ALT an improvement, but at the same time I doubt
that particular one would make much practical difference.  It is
more or less "while we are at it" kind of change.  All others,
including the "remove" cases (I botched -u but as you point out
it is correctable), do not contribute to loosening the index
requirements, but I suspect they might help me later unify
two-way fast forward and three-way merge.  Yes, I am still
looking at "read-tree -m H I-mixed-with-H M" that emulates
"read-tree H M".


[Footnotes]

*1* Remember merge-trees Perl script, which I did before you
invented the multi-stage read-tree?  Boy it feels like it was so
distant past...

*2* A casual reader may notice that we are arguing about renames
after both of us publicly stated that "renames do not matter".
Here is a clarification.  We both consider "recording renames at
commit time" does not matter, but we do take "tracking and
handling the renames" seriously.  There is a difference.

*3* Oops.  There is not --find-copies-harder yet ;-).

*4* This would be further helped if we had a --show-rename-only
diffcore filter.  The operation is similar to the pickaxe, but
it would prune changesets down only to renames and copies.  I
actually wrote and threw away such a filter back when I was
trying to find good test cases in linux-2.6 repository.

*5* And the development line leading to A or B may not even be
linear, in which case it may be easier to first decompose the
chain between O and A into individual epochs.  Jon Seymour's
"rev-list --merge-order O A" would be very handy for this.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] Add git-diff-stages command.
  2005-06-09 20:35           ` [PATCH 3/3] " Junio C Hamano
@ 2005-06-09 22:13             ` Junio C Hamano
  2005-06-09 22:30               ` Linus Torvalds
  2005-06-11  1:44               ` [PATCH] diff-stages: unuglify the too big main() function Junio C Hamano
  2005-06-10 19:59             ` [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
  1 sibling, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09 22:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

The diff-* brothers acquired a sibling, git-diff-stages.  With
an unmerged index file, you specify two stage numbers and it
shows the differences between them.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** ... I think we also can write a simple program that reads an
*** unmerged index file and do the equivalent of these two
*** diff-tree commands.
***
*** Only lightly tested.

 Makefile      |    3 +-
 diff-stages.c |  112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -33,7 +33,7 @@ PROG=   git-update-cache git-diff-files 
 	git-http-pull git-ssh-push git-ssh-pull git-rev-list git-mktag \
 	git-diff-helper git-tar-tree git-local-pull git-write-blob \
 	git-get-tar-commit-id git-mkdelta git-apply git-stripspace \
-	git-cvs2git
+	git-cvs2git git-diff-stages
 
 all: $(PROG)
 
@@ -117,6 +117,7 @@ git-write-blob: write-blob.c
 git-mkdelta: mkdelta.c
 git-stripspace: stripspace.c
 git-cvs2git: cvs2git.c
+git-diff-stages: diff-stages.c
 
 git-http-pull: LIBS += -lcurl
 git-rev-list: LIBS += -lssl
diff --git a/diff-stages.c b/diff-stages.c
new file mode 100644
--- /dev/null
+++ b/diff-stages.c
@@ -0,0 +1,112 @@
+/*
+ * Copyright (c) 2005 Junio C Hamano
+ */
+
+#include "cache.h"
+#include "diff.h"
+
+static int diff_output_format = DIFF_FORMAT_HUMAN;
+static int detect_rename = 0;
+static int diff_setup_opt = 0;
+static int diff_score_opt = 0;
+static const char *pickaxe = NULL;
+static int pickaxe_opts = 0;
+static int diff_break_opt = -1;
+static const char *orderfile = NULL;
+
+static char *diff_stages_usage =
+"git-diff-stages [-p] [-r] [-z] [-M] [-C] [-R] [-S<string>] [-O<orderfile>] <stage1> <stage2> [<path>...]";
+
+int main(int ac, const char **av)
+{
+	int stage1, stage2, i;
+
+	read_cache();
+	while (1 < ac && av[1][0] == '-') {
+		const char *arg = av[1];
+		if (!strcmp(arg, "-r"))
+			; /* as usual */
+		else if (!strcmp(arg, "-p"))
+			diff_output_format = DIFF_FORMAT_PATCH;
+		else if (!strncmp(arg, "-B", 2)) {
+			if ((diff_break_opt = diff_scoreopt_parse(arg)) == -1)
+				usage(diff_stages_usage);
+		}
+		else if (!strncmp(arg, "-M", 2)) {
+			detect_rename = DIFF_DETECT_RENAME;
+			if ((diff_score_opt = diff_scoreopt_parse(arg)) == -1)
+				usage(diff_stages_usage);
+		}
+		else if (!strncmp(arg, "-C", 2)) {
+			detect_rename = DIFF_DETECT_COPY;
+			if ((diff_score_opt = diff_scoreopt_parse(arg)) == -1)
+				usage(diff_stages_usage);
+		}
+		else if (!strcmp(arg, "-z"))
+			diff_output_format = DIFF_FORMAT_MACHINE;
+		else if (!strcmp(arg, "-R"))
+			diff_setup_opt |= DIFF_SETUP_REVERSE;
+		else if (!strncmp(arg, "-S", 2))
+			pickaxe = arg + 2;
+		else if (!strncmp(arg, "-O", 2))
+			orderfile = arg + 2;
+		else if (!strcmp(arg, "--pickaxe-all"))
+			pickaxe_opts = DIFF_PICKAXE_ALL;
+		else
+			usage(diff_stages_usage);
+		ac--; av++;
+	}
+
+	if (ac < 3 ||
+	    sscanf(av[1], "%d", &stage1) != 1 ||
+	    ! (0 <= stage1 && stage1 <= 3) ||
+	    sscanf(av[2], "%d", &stage2) != 1 ||
+	    ! (0 <= stage2 && stage2 <= 3))
+		usage(diff_stages_usage);
+
+	av += 3; /* The rest from av[0] are for paths restriction. */
+	diff_setup(diff_setup_opt);
+
+	i = 0;
+	while (i < active_nr) {
+		struct cache_entry *ce, *stages[4] = { NULL, };
+		struct cache_entry *one, *two;
+		const char *name;
+		int len;
+		ce = active_cache[i];
+		len = ce_namelen(ce);
+		name = ce->name;
+		for (;;) {
+			int stage = ce_stage(ce);
+			stages[stage] = ce;
+			if (active_nr <= ++i)
+				break;
+			ce = active_cache[i];
+			if (ce_namelen(ce) != len ||
+			    memcmp(name, ce->name, len))
+				break;
+		}
+		one = stages[stage1];
+		two = stages[stage2];
+		if (!one && !two)
+			continue;
+		if (!one)
+			diff_addremove('+', ntohl(two->ce_mode),
+				       two->sha1, name, NULL);
+		else if (!two)
+			diff_addremove('-', ntohl(one->ce_mode),
+				       one->sha1, name, NULL);
+		else if (memcmp(one->sha1, two->sha1, 20) ||
+			 (one->ce_mode != two->ce_mode))
+			 diff_change(ntohl(one->ce_mode), ntohl(two->ce_mode),
+				     one->sha1, two->sha1, name, NULL);
+	}
+
+	diffcore_std(av,
+		     detect_rename, diff_score_opt,
+		     pickaxe, pickaxe_opts,
+		     diff_break_opt,
+		     orderfile);
+	diff_flush(diff_output_format, 1);
+	return 0;
+}
------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] Add git-diff-stages command.
  2005-06-09 22:13             ` [PATCH] Add git-diff-stages command Junio C Hamano
@ 2005-06-09 22:30               ` Linus Torvalds
  2005-06-11  1:44               ` [PATCH] diff-stages: unuglify the too big main() function Junio C Hamano
  1 sibling, 0 replies; 33+ messages in thread
From: Linus Torvalds @ 2005-06-09 22:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Thu, 9 Jun 2005, Junio C Hamano wrote:
>
> The diff-* brothers acquired a sibling, git-diff-stages.  With
> an unmerged index file, you specify two stage numbers and it
> shows the differences between them.

I hate how you do one big "main()" function that does it all.

I'll apply the patch, but really, this is pretty ugly.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] read-tree.c: rename local variables used in 3-way merge code.
       [not found]                 ` <Pine.LNX.4.58.0506091152530.2286@ppc970.osdl.org>
@ 2005-06-09 22:45                   ` Junio C Hamano
  2005-06-09 22:47                   ` [PATCH] Handle entry removals during merge correctly Junio C Hamano
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09 22:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

I'd hate to do this, but every time I try to touch this code and
validate what it does against the case matrix in t1000 test I
get confused.  The variable names are renamed to match the case
matrix.  Now they are named as:

    i -- entry from the index file (formerly known as "old")
    o -- merge base (formerly known as "a")
    a -- our head (formerly known as "b")
    b -- merge head (formerly known as "c")

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** Re-submit.  I've rebased the series from the last night.
*** This is the first of them.

 read-tree.c |   40 ++++++++++++++++++++--------------------
 1 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -40,9 +40,9 @@ static int same(struct cache_entry *a, s
  * This removes all trivial merges that don't change the tree
  * and collapses them to state 0.
  */
-static struct cache_entry *merge_entries(struct cache_entry *a,
-					 struct cache_entry *b,
-					 struct cache_entry *c)
+static struct cache_entry *merge_entries(struct cache_entry *o,
+					 struct cache_entry *a,
+					 struct cache_entry *b)
 {
 	/*
 	 * Ok, all three entries describe the same
@@ -58,16 +58,16 @@ static struct cache_entry *merge_entries
 	 * The "all entries exactly the same" case falls out as
 	 * a special case of any of the "two same" cases.
 	 *
-	 * Here "a" is "original", and "b" and "c" are the two
+	 * Here "o" is "original", and "a" and "b" are the two
 	 * trees we are merging.
 	 */
-	if (a && b && c) {
-		if (same(b,c))
-			return c;
+	if (o && a && b) {
 		if (same(a,b))
-			return c;
-		if (same(a,c))
 			return b;
+		if (same(o,a))
+			return b;
+		if (same(o,b))
+			return a;
 	}
 	return NULL;
 }
@@ -126,29 +126,29 @@ static int merged_entry(struct cache_ent
 
 static int threeway_merge(struct cache_entry *stages[4], struct cache_entry **dst)
 {
-	struct cache_entry *old = stages[0];
-	struct cache_entry *a = stages[1], *b = stages[2], *c = stages[3];
+	struct cache_entry *i = stages[0];
+	struct cache_entry *o = stages[1], *a = stages[2], *b = stages[3];
 	struct cache_entry *merge;
 	int count;
 
 	/*
-	 * If we have an entry in the index cache ("old"), then we want
+	 * If we have an entry in the index cache ("i"), then we want
 	 * to make sure that it matches any entries in stage 2 ("first
-	 * branch", aka "b").
+	 * branch", aka "a").
 	 */
-	if (old) {
-		if (!b || !same(old, b))
+	if (i) {
+		if (!a || !same(i, a))
 			return -1;
 	}
-	merge = merge_entries(a, b, c);
+	merge = merge_entries(o, a, b);
 	if (merge)
-		return merged_entry(merge, old, dst);
-	if (old)
-		verify_uptodate(old);
+		return merged_entry(merge, i, dst);
+	if (i)
+		verify_uptodate(i);
 	count = 0;
+	if (o) { *dst++ = o; count++; }
 	if (a) { *dst++ = a; count++; }
 	if (b) { *dst++ = b; count++; }
-	if (c) { *dst++ = c; count++; }
 	return count;
 }
 
------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] Handle entry removals during merge correctly.
       [not found]                 ` <Pine.LNX.4.58.0506091152530.2286@ppc970.osdl.org>
  2005-06-09 22:45                   ` [PATCH] read-tree.c: rename local variables used in 3-way merge code Junio C Hamano
@ 2005-06-09 22:47                   ` Junio C Hamano
  2005-06-09 22:48                   ` [PATCH] read-tree -m 3-way: loosen an index requirement that was too strict Junio C Hamano
  2005-06-09 22:49                   ` [PATCH] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
  3 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09 22:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

From: Linus Torvalds <torvalds@osdl.org>

We could handle delete the same way - to set the ce_mode to zero
and add them to the "dst" array, and teach write-cache not to
write them out. Then the same loop that goes around doing the
CE_UPDATE thing could check for the delete case.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** Linus, did I get the patch submission format right?  This is
*** essentially what you wrote (with one correction) but not
*** really "a forwarded e-mail".

 read-cache.c |   10 ++++++++--
 read-tree.c  |   30 ++++++++++++++++++++----------
 2 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/read-cache.c b/read-cache.c
--- a/read-cache.c
+++ b/read-cache.c
@@ -440,11 +440,15 @@ int write_cache(int newfd, struct cache_
 {
 	SHA_CTX c;
 	struct cache_header hdr;
-	int i;
+	int i, removed;
+
+	for (i = removed = 0; i < entries; i++)
+		if (!cache[i]->ce_mode)
+			removed++;
 
 	hdr.hdr_signature = htonl(CACHE_SIGNATURE);
 	hdr.hdr_version = htonl(2);
-	hdr.hdr_entries = htonl(entries);
+	hdr.hdr_entries = htonl(entries - removed);
 
 	SHA1_Init(&c);
 	if (ce_write(&c, newfd, &hdr, sizeof(hdr)) < 0)
@@ -452,6 +456,8 @@ int write_cache(int newfd, struct cache_
 
 	for (i = 0; i < entries; i++) {
 		struct cache_entry *ce = cache[i];
+		if (!ce->ce_mode)
+			continue;
 		if (ce_write(&c, newfd, ce, ce_size(ce)) < 0)
 			return -1;
 	}
diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -124,6 +124,15 @@ static int merged_entry(struct cache_ent
 	return 1;
 }
 
+static int deleted_entry(struct cache_entry *ce, struct cache_entry *old, struct cache_entry **dst)
+{
+	if (old)
+		verify_uptodate(old);
+	ce->ce_mode = 0;
+	*dst++ = ce;
+	return 1;
+}
+
 static int threeway_merge(struct cache_entry *stages[4], struct cache_entry **dst)
 {
 	struct cache_entry *i = stages[0];
@@ -181,25 +190,21 @@ static int twoway_merge(struct cache_ent
 			*dst++ = current;
 			return 1;
 		}
-		else if (oldtree && !newtree && same(current, oldtree)) {
+		else if (oldtree && !newtree && same(current, oldtree))
 			/* 10 or 11 */
-			verify_uptodate(current);
-			return 0;
-		}
+			return deleted_entry(oldtree, current, dst);
 		else if (oldtree && newtree &&
-			 same(current, oldtree) && !same(current, newtree)) {
+			 same(current, oldtree) && !same(current, newtree))
 			/* 20 or 21 */
-			verify_uptodate(current);
-			return merged_entry(newtree, NULL, dst);
-		}
+			return merged_entry(newtree, current, dst);
 		else
 			/* all other failures */
 			return -1;
 	}
 	else if (newtree)
-		return merged_entry(newtree, NULL, dst);
+		return merged_entry(newtree, current, dst);
 	else
-		return 0;
+		return deleted_entry(oldtree, current, dst);
 }
 
 /*
@@ -236,6 +241,11 @@ static void check_updates(struct cache_e
 	unsigned short mask = htons(CE_UPDATE);
 	while (nr--) {
 		struct cache_entry *ce = *src++;
+		if (!ce->ce_mode) {
+			if (update)
+				unlink(ce->name);
+			continue;
+		}
 		if (ce->ce_flags & mask) {
 			ce->ce_flags &= ~mask;
 			if (update)
------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] read-tree -m 3-way: loosen an index requirement that was too strict.
       [not found]                 ` <Pine.LNX.4.58.0506091152530.2286@ppc970.osdl.org>
  2005-06-09 22:45                   ` [PATCH] read-tree.c: rename local variables used in 3-way merge code Junio C Hamano
  2005-06-09 22:47                   ` [PATCH] Handle entry removals during merge correctly Junio C Hamano
@ 2005-06-09 22:48                   ` Junio C Hamano
  2005-06-09 22:49                   ` [PATCH] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
  3 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09 22:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This patch teaches "read-tree -m O A B" that, when only "the
other tree" changed a path, and if the work tree already has
that change, we are not in a situation that would clobber the
cache and the working tree, and lets the merge succeed; this is
case #14ALT in t1000 test.  It does not change the result of the
merge, but prevents it from failing when it should not.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** Rebased one from the last night.

 read-tree.c                 |   16 ++++++++++++++++
 t/t1000-read-tree-m-3way.sh |    9 +++++++++
 2 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -140,6 +140,22 @@ static int threeway_merge(struct cache_e
 	struct cache_entry *merge;
 	int count;
 
+	/* The case #14ALT is special in that it allows "i" to match
+	 * the "merged branch", aka "b" and even be dirty, as an
+	 * alternative to the usual 'must match "a" and be up-to-date'
+	 * rule.
+	 */
+	if (o && a && b && same(o, a) && !same(o, b)) {
+		if (i) {
+			if (same(i, b))
+				; /* case #14ALT exception */
+			else if (same(i, a))
+				verify_uptodate(i);
+			else
+				return -1;
+		}
+	}
+	else /* otherwise the original rule applies */
 	/*
 	 * If we have an entry in the index cache ("i"), then we want
 	 * to make sure that it matches any entries in stage 2 ("first
diff --git a/t/t1000-read-tree-m-3way.sh b/t/t1000-read-tree-m-3way.sh
--- a/t/t1000-read-tree-m-3way.sh
+++ b/t/t1000-read-tree-m-3way.sh
@@ -455,6 +455,15 @@ test_expect_success \
      git-read-tree -m $tree_O $tree_A $tree_B &&
      check_result"
 
+test_expect_success \
+    '14ALT - in O && A && B && O==A && O!=B case, matching B is also OK' \
+    "rm -f .git/index NM &&
+     cp .orig-B/NM NM &&
+     git-update-cache --add NM &&
+     echo extra >>NM &&
+     git-read-tree -m $tree_O $tree_A $tree_B &&
+     check_result"
+
 test_expect_failure \
     '14 (fail) - must match and be up-to-date in O && A && B && O==A && O!=B case' \
     "rm -f .git/index NM &&
------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] read-tree -m 3-way: handle more trivial merges internally.
       [not found]                 ` <Pine.LNX.4.58.0506091152530.2286@ppc970.osdl.org>
                                     ` (2 preceding siblings ...)
  2005-06-09 22:48                   ` [PATCH] read-tree -m 3-way: loosen an index requirement that was too strict Junio C Hamano
@ 2005-06-09 22:49                   ` Junio C Hamano
  3 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-09 22:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This patch teaches "read-tree -m O A B" that some more trivial
cases can be handled internally.  This allows us to loosen
otherwise too strict index requirements in case #5ALT, where
both branches create a new file identically.  The previous
code required index to be up-to-date and aborted the merge when
it is not, but there is no reason to require it to be up-to-date
in this case; it only needs to match A.

The test vector has been updated to match the new behaviour.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

*** This has the "removal" fixes.

 read-tree.c                 |   16 ++++++++++++++++
 t/t1000-read-tree-m-3way.sh |   27 +++++++++------------------
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -69,6 +69,12 @@ static struct cache_entry *merge_entries
 		if (same(o,b))
 			return a;
 	}
+	/* #5ALT */
+	if (!o && a && b && same(a,b)) {
+		/* Match what git-merge-one-file-script does */
+		printf("Adding %s\n", a->name);
+		return a;
+	}
 	return NULL;
 }
 
@@ -170,6 +176,16 @@ static int threeway_merge(struct cache_e
 		return merged_entry(merge, i, dst);
 	if (i)
 		verify_uptodate(i);
+
+	/* #6ALT, #8ALT, and #10ALT */
+	if ((o && !a && !b) ||
+	    (o && !a && b && same(o, b)) ||
+	    (o && a && !b && same(o, a))) {
+		/* Match what git-merge-one-file-script does */
+		printf("Removing %s\n", o->name); 
+		return deleted_entry(o, i, dst);
+	}
+
 	count = 0;
 	if (o) { *dst++ = o; count++; }
 	if (a) { *dst++ = a; count++; }
diff --git a/t/t1000-read-tree-m-3way.sh b/t/t1000-read-tree-m-3way.sh
--- a/t/t1000-read-tree-m-3way.sh
+++ b/t/t1000-read-tree-m-3way.sh
@@ -75,21 +75,18 @@ In addition:
 . ../lib-read-tree-m-3way.sh
 
 ################################################################
-# This is the "no trivial merge unless all three exists" table.
+# Trivial "majority when 3 stages exist" merge plus #5ALT, #6ALT,
+# #8ALT, #10ALT trivial merges.
 
 cat >expected <<\EOF
 100644 X 2	AA
 100644 X 3	AA
 100644 X 2	AN
-100644 X 1	DD
 100644 X 3	DF
 100644 X 2	DF/DF
 100644 X 1	DM
 100644 X 3	DM
-100644 X 1	DN
-100644 X 3	DN
-100644 X 2	LL
-100644 X 3	LL
+100644 X 0	LL
 100644 X 1	MD
 100644 X 2	MD
 100644 X 1	MM
@@ -97,8 +94,6 @@ cat >expected <<\EOF
 100644 X 3	MM
 100644 X 0	MN
 100644 X 3	NA
-100644 X 1	ND
-100644 X 2	ND
 100644 X 0	NM
 100644 X 0	NN
 100644 X 0	SS
@@ -108,11 +103,8 @@ cat >expected <<\EOF
 100644 X 2	Z/AA
 100644 X 3	Z/AA
 100644 X 2	Z/AN
-100644 X 1	Z/DD
 100644 X 1	Z/DM
 100644 X 3	Z/DM
-100644 X 1	Z/DN
-100644 X 3	Z/DN
 100644 X 1	Z/MD
 100644 X 2	Z/MD
 100644 X 1	Z/MM
@@ -120,8 +112,6 @@ cat >expected <<\EOF
 100644 X 3	Z/MM
 100644 X 0	Z/MN
 100644 X 3	Z/NA
-100644 X 1	Z/ND
-100644 X 2	Z/ND
 100644 X 0	Z/NM
 100644 X 0	Z/NN
 EOF
@@ -289,23 +279,24 @@ test_expect_failure \
      git-read-tree -m $tree_O $tree_A $tree_B"
 
 test_expect_success \
-    '5 - must match and be up-to-date in !O && A && B && A==B case.' \
+    '5 - must match in !O && A && B && A==B case.' \
     "rm -f .git/index LL &&
      cp .orig-A/LL LL &&
      git-update-cache --add LL &&
      git-read-tree -m $tree_O $tree_A $tree_B &&
      check_result"
 
-test_expect_failure \
-    '5 (fail) - must match and be up-to-date in !O && A && B && A==B case.' \
+test_expect_success \
+    '5 - must match in !O && A && B && A==B case.' \
     "rm -f .git/index LL &&
      cp .orig-A/LL LL &&
      git-update-cache --add LL &&
      echo extra >>LL &&
-     git-read-tree -m $tree_O $tree_A $tree_B"
+     git-read-tree -m $tree_O $tree_A $tree_B &&
+     check_result"
 
 test_expect_failure \
-    '5 (fail) - must match and be up-to-date in !O && A && B && A==B case.' \
+    '5 (fail) - must match in !O && A && B && A==B case.' \
     "rm -f .git/index LL &&
      cp .orig-A/LL LL &&
      echo extra >>LL &&
------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally
  2005-06-09 20:35           ` [PATCH 3/3] " Junio C Hamano
  2005-06-09 22:13             ` [PATCH] Add git-diff-stages command Junio C Hamano
@ 2005-06-10 19:59             ` Junio C Hamano
  1 sibling, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-10 19:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

>>>>> "JCH" == Junio C Hamano <junkio@cox.net> writes:

JCH> So, yes I ended up arguing that the intelligent merge logic
JCH> could and probably needs to look at the trees involved ;-).

"Could look at, and probably be better off looking at," would
have been a better wording.

Linus, please discard the patches from me that you have not
applied about the "loosening of too strict index requirements"
(yesterday and the day before).  I think I am finally getting
somewhere but the solution, if it works, would be somewhat
different from what I have been sending you.





^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] diff-stages: unuglify the too big main() function.
  2005-06-09 22:13             ` [PATCH] Add git-diff-stages command Junio C Hamano
  2005-06-09 22:30               ` Linus Torvalds
@ 2005-06-11  1:44               ` Junio C Hamano
  1 sibling, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2005-06-11  1:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Split the core of the program, diff_stage, from one big "main()"
function that does it all and leave only the parameter parsing,
setup and finalize part in the main().

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 diff-stages.c |   75 ++++++++++++++++++++++++++++++---------------------------
 1 files changed, 40 insertions(+), 35 deletions(-)

diff --git a/diff-stages.c b/diff-stages.c
--- a/diff-stages.c
+++ b/diff-stages.c
@@ -17,9 +17,47 @@ static const char *orderfile = NULL;
 static char *diff_stages_usage =
 "git-diff-stages [-p] [-r] [-z] [-M] [-C] [-R] [-S<string>] [-O<orderfile>] <stage1> <stage2> [<path>...]";
 
+static void diff_stages(int stage1, int stage2)
+{
+	int i = 0;
+	while (i < active_nr) {
+		struct cache_entry *ce, *stages[4] = { NULL, };
+		struct cache_entry *one, *two;
+		const char *name;
+		int len;
+		ce = active_cache[i];
+		len = ce_namelen(ce);
+		name = ce->name;
+		for (;;) {
+			int stage = ce_stage(ce);
+			stages[stage] = ce;
+			if (active_nr <= ++i)
+				break;
+			ce = active_cache[i];
+			if (ce_namelen(ce) != len ||
+			    memcmp(name, ce->name, len))
+				break;
+		}
+		one = stages[stage1];
+		two = stages[stage2];
+		if (!one && !two)
+			continue;
+		if (!one)
+			diff_addremove('+', ntohl(two->ce_mode),
+				       two->sha1, name, NULL);
+		else if (!two)
+			diff_addremove('-', ntohl(one->ce_mode),
+				       one->sha1, name, NULL);
+		else if (memcmp(one->sha1, two->sha1, 20) ||
+			 (one->ce_mode != two->ce_mode))
+			 diff_change(ntohl(one->ce_mode), ntohl(two->ce_mode),
+				     one->sha1, two->sha1, name, NULL);
+	}
+}
+
 int main(int ac, const char **av)
 {
-	int stage1, stage2, i;
+	int stage1, stage2;
 
 	read_cache();
 	while (1 < ac && av[1][0] == '-') {
@@ -67,40 +105,7 @@ int main(int ac, const char **av)
 	av += 3; /* The rest from av[0] are for paths restriction. */
 	diff_setup(diff_setup_opt);
 
-	i = 0;
-	while (i < active_nr) {
-		struct cache_entry *ce, *stages[4] = { NULL, };
-		struct cache_entry *one, *two;
-		const char *name;
-		int len;
-		ce = active_cache[i];
-		len = ce_namelen(ce);
-		name = ce->name;
-		for (;;) {
-			int stage = ce_stage(ce);
-			stages[stage] = ce;
-			if (active_nr <= ++i)
-				break;
-			ce = active_cache[i];
-			if (ce_namelen(ce) != len ||
-			    memcmp(name, ce->name, len))
-				break;
-		}
-		one = stages[stage1];
-		two = stages[stage2];
-		if (!one && !two)
-			continue;
-		if (!one)
-			diff_addremove('+', ntohl(two->ce_mode),
-				       two->sha1, name, NULL);
-		else if (!two)
-			diff_addremove('-', ntohl(one->ce_mode),
-				       one->sha1, name, NULL);
-		else if (memcmp(one->sha1, two->sha1, 20) ||
-			 (one->ce_mode != two->ce_mode))
-			 diff_change(ntohl(one->ce_mode), ntohl(two->ce_mode),
-				     one->sha1, two->sha1, name, NULL);
-	}
+	diff_stages(stage1, stage2);
 
 	diffcore_std(av,
 		     detect_rename, diff_score_opt,
------------


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-08 23:35   ` Linus Torvalds
                       ` (2 preceding siblings ...)
  2005-06-09  7:02     ` [PATCH 0/3] " Junio C Hamano
@ 2005-06-18  0:15     ` Herbert Xu
  2005-06-18  0:26       ` Linus Torvalds
  3 siblings, 1 reply; 33+ messages in thread
From: Herbert Xu @ 2005-06-18  0:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: junkio, git

Linus Torvalds <torvalds@osdl.org> wrote:
> 
>>  # Modified in both, but differently.
>> +     merge -p "$src1" "$orig" "$src2" > "$4"
>> 
>> Again, make sure "$4" is not a directory before redirecting into
>> it from merge, so that you can tell merge failures from it?
> 
> Hmm.. What's the cleanest way to check for redirection errors, but still
> be able to distinguish those cleanly from "merge" itself returning an
> error?

I don't know whether this is the cleanest, but this is one way:

redir=failed
{
	redir=ok
	merge -p "$src1" "$orig" "$src2"
} > "$4" || err=$?

if [ $redir = failed ]; then
	...
fi

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Handling merge conflicts a bit more gracefully..
  2005-06-18  0:15     ` Handling merge conflicts a bit more gracefully Herbert Xu
@ 2005-06-18  0:26       ` Linus Torvalds
  0 siblings, 0 replies; 33+ messages in thread
From: Linus Torvalds @ 2005-06-18  0:26 UTC (permalink / raw)
  To: Herbert Xu; +Cc: junkio, git



On Sat, 18 Jun 2005, Herbert Xu wrote:
> 
> I don't know whether this is the cleanest, but this is one way:

Oh, wow.

One thing I have to say is that I've learnt a lot more shell tricks. 

Now I'll just have to unlearn them, so that I won't have nightmares.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2005-06-18  0:18 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-06-08 20:55 Handling merge conflicts a bit more gracefully Linus Torvalds
2005-06-08 23:07 ` Junio C Hamano
2005-06-08 23:35   ` Linus Torvalds
2005-06-09  0:03     ` Junio C Hamano
2005-06-09  0:41       ` Linus Torvalds
2005-06-09  1:04         ` Junio C Hamano
2005-06-09  0:11     ` Junio C Hamano
2005-06-09  1:08       ` Linus Torvalds
2005-06-09  2:15         ` Junio C Hamano
2005-06-09  2:48           ` Linus Torvalds
2005-06-09  4:35             ` Junio C Hamano
2005-06-09  4:54               ` Linus Torvalds
2005-06-09  5:15                 ` Junio C Hamano
2005-06-09  7:02     ` [PATCH 0/3] " Junio C Hamano
2005-06-09  7:04       ` [PATCH 1/3] read-tree.c: rename local variables used in 3-way merge code Junio C Hamano
2005-06-09  7:05       ` [PATCH 2/3] read-tree -m 3-way: loosen index requirements that is too strict Junio C Hamano
2005-06-09  7:06       ` [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
2005-06-09 15:15         ` Linus Torvalds
2005-06-09 17:26           ` Junio C Hamano
2005-06-09 17:37             ` Linus Torvalds
     [not found]               ` <7vbr6fnzf0.fsf@assigned-by-dhcp.cox.net>
     [not found]                 ` <Pine.LNX.4.58.0506091152530.2286@ppc970.osdl.org>
2005-06-09 22:45                   ` [PATCH] read-tree.c: rename local variables used in 3-way merge code Junio C Hamano
2005-06-09 22:47                   ` [PATCH] Handle entry removals during merge correctly Junio C Hamano
2005-06-09 22:48                   ` [PATCH] read-tree -m 3-way: loosen an index requirement that was too strict Junio C Hamano
2005-06-09 22:49                   ` [PATCH] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
2005-06-09 20:35           ` [PATCH 3/3] " Junio C Hamano
2005-06-09 22:13             ` [PATCH] Add git-diff-stages command Junio C Hamano
2005-06-09 22:30               ` Linus Torvalds
2005-06-11  1:44               ` [PATCH] diff-stages: unuglify the too big main() function Junio C Hamano
2005-06-10 19:59             ` [PATCH 3/3] read-tree -m 3-way: handle more trivial merges internally Junio C Hamano
2005-06-18  0:15     ` Handling merge conflicts a bit more gracefully Herbert Xu
2005-06-18  0:26       ` Linus Torvalds
2005-06-09  3:07 ` Jeff Garzik
2005-06-09  4:11   ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).