Comments on recursive merge..

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Comments on recursive merge..
@ 2005-11-07 16:48 Linus Torvalds
  2005-11-07 16:56 ` Linus Torvalds
  2005-11-07 22:58 ` Comments on recursive merge Fredrik Kuivinen
  0 siblings, 2 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-11-07 16:48 UTC (permalink / raw)
  To: Junio C Hamano, Fredrik Kuivinen; +Cc: Git Mailing List

Guys,

  I just hit my first real rename conflict, and very timidly tried the 
"recursive" strategy in the hopes that I wouldn't need to do things by 
hand.

It resolved things beautifully. Good job. 

My only worry is that I don't read python, so I don't really know how it 
does what it does, which makes me nervous. Can somebody (Fredrik?) add 
some documentation about the merge strategy and how it works.

Considering that the stupid resolve strategy really requires you to know 
how git works when rename conflicts happen (things left in unmerged state 
are really quite hard to handle by hand unless you know exactly what 
you're doing), I'd almost suggest making "recursive" the default. I'm a 
bit nervous about it, but knowing how it works would probably put most of 
that to rest.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-07 16:48 Comments on recursive merge Linus Torvalds
@ 2005-11-07 16:56 ` Linus Torvalds
  2005-11-07 23:19   ` [PATCH] merge-recursive: Only print relevant rename messages Fredrik Kuivinen
  2005-11-07 22:58 ` Comments on recursive merge Fredrik Kuivinen
  1 sibling, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-11-07 16:56 UTC (permalink / raw)
  To: Junio C Hamano, Fredrik Kuivinen; +Cc: Git Mailing List

On Mon, 7 Nov 2005, Linus Torvalds wrote:
> 
>   I just hit my first real rename conflict, and very timidly tried the 
> "recursive" strategy in the hopes that I wouldn't need to do things by 
> hand.
> 
> It resolved things beautifully. Good job. 

Btw, one thing that it does is print out too much information.

In particular, I had renames on both sides of the merge (in case anybody 
wants to see which one I'm talking about: it's the current top-of-head 
commit in the kernel archives: 333c47c847c90aaefde8b593054d9344106333b5).

Now, renames that you've done yourself you really don't want to hear 
about, at least if the other side didn't change anything in that file.

Renames that the _other_ side has done (the one you're merging) you may or 
may not want to know about, regardless of whether they happened to files 
that are changed. But since "git pull" will do a "git-apply --stat" at the 
end and show the renames there, I'd argue that the merge strategy itself 
should be quiet about any renames that are trivial.

So how about talking about renames only if you end up also doing a 
file-level merge? As it is, doing the merge talked about renames that I 
had merged earlier in my own branch, which is just confusing.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH] merge-recursive: Only print relevant rename messages
  2005-11-07 16:56 ` Linus Torvalds
@ 2005-11-07 23:19   ` Fredrik Kuivinen
  2005-11-07 23:54     ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Fredrik Kuivinen @ 2005-11-07 23:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Fredrik Kuivinen, Git Mailing List

On Mon, Nov 07, 2005 at 08:56:07AM -0800, Linus Torvalds wrote:
> 
> Btw, one thing that it does is print out too much information.
> 
> In particular, I had renames on both sides of the merge (in case anybody 
> wants to see which one I'm talking about: it's the current top-of-head 
> commit in the kernel archives: 333c47c847c90aaefde8b593054d9344106333b5).
> 
> Now, renames that you've done yourself you really don't want to hear 
> about, at least if the other side didn't change anything in that file.
> 
> Renames that the _other_ side has done (the one you're merging) you may or 
> may not want to know about, regardless of whether they happened to files 
> that are changed. But since "git pull" will do a "git-apply --stat" at the 
> end and show the renames there, I'd argue that the merge strategy itself 
> should be quiet about any renames that are trivial.
> 
> So how about talking about renames only if you end up also doing a 
> file-level merge? As it is, doing the merge talked about renames that I 
> had merged earlier in my own branch, which is just confusing.
> 

Sounds like a good idea. How about something like the following?

--

It isn't really interesting to know about the renames that have
already been committed to the branch you are working on. Furthermore,
the 'git-apply --stat' at the end of git-(merge|pull) will tell us
about any renames in the other branch.

With this commit only renames which require a file-level merge will
be printed.

Signed-off-by: Fredrik Kuivinen <freku045@student.liu.se>


---

 git-merge-recursive.py |   22 +++++++++++++++-------
 1 files changed, 15 insertions(+), 7 deletions(-)

applies-to: 5af1b5b93257ecfe993bb24975bf596faa342758
89c029b439603630a53ee4e4d0cb7931111afd2a
diff --git a/git-merge-recursive.py b/git-merge-recursive.py
index 626d854..9983cd9 100755
--- a/git-merge-recursive.py
+++ b/git-merge-recursive.py
@@ -162,10 +162,13 @@ def mergeTrees(head, merge, common, bran
 # Low level file merging, update and removal
 # ------------------------------------------
 
+MERGE_NONE = 0
+MERGE_TRIVIAL = 1
+MERGE_3WAY = 2
 def mergeFile(oPath, oSha, oMode, aPath, aSha, aMode, bPath, bSha, bMode,
               branch1Name, branch2Name):
 
-    merge = False
+    merge = MERGE_NONE
     clean = True
 
     if stat.S_IFMT(aMode) != stat.S_IFMT(bMode):
@@ -178,7 +181,7 @@ def mergeFile(oPath, oSha, oMode, aPath,
             sha = bSha
     else:
         if aSha != oSha and bSha != oSha:
-            merge = True
+            merge = MERGE_TRIVIAL
 
         if aMode == oMode:
             mode = bMode
@@ -207,7 +210,8 @@ def mergeFile(oPath, oSha, oMode, aPath,
             os.unlink(orig)
             os.unlink(src1)
             os.unlink(src2)
-            
+
+            merge = MERGE_3WAY
             clean = (code == 0)
         else:
             assert(stat.S_ISLNK(aMode) and stat.S_ISLNK(bMode))
@@ -577,14 +581,16 @@ def processRenames(renamesA, renamesB, b
                 updateFile(False, ren1.dstSha, ren1.dstMode, dstName1)
                 updateFile(False, ren2.dstSha, ren2.dstMode, dstName2)
             else:
-                print 'Renaming', fmtRename(path, ren1.dstName)
                 [resSha, resMode, clean, merge] = \
                          mergeFile(ren1.srcName, ren1.srcSha, ren1.srcMode,
                                    ren1.dstName, ren1.dstSha, ren1.dstMode,
                                    ren2.dstName, ren2.dstSha, ren2.dstMode,
                                    branchName1, branchName2)
 
-                if merge:
+                if merge or not clean:
+                    print 'Renaming', fmtRename(path, ren1.dstName)
+
+                if merge == MERGE_3WAY:
                     print 'Auto-merging', ren1.dstName
 
                 if not clean:
@@ -653,14 +659,16 @@ def processRenames(renamesA, renamesB, b
                 tryMerge = True
 
             if tryMerge:
-                print 'Renaming', fmtRename(ren1.srcName, ren1.dstName)
                 [resSha, resMode, clean, merge] = \
                          mergeFile(ren1.srcName, ren1.srcSha, ren1.srcMode,
                                    ren1.dstName, ren1.dstSha, ren1.dstMode,
                                    ren1.srcName, srcShaOtherBranch, srcModeOtherBranch,
                                    branchName1, branchName2)
 
-                if merge:
+                if merge or not clean:
+                    print 'Renaming', fmtRename(ren1.srcName, ren1.dstName)
+
+                if merge == MERGE_3WAY:
                     print 'Auto-merging', ren1.dstName
 
                 if not clean:

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] merge-recursive: Only print relevant rename messages
  2005-11-07 23:19   ` [PATCH] merge-recursive: Only print relevant rename messages Fredrik Kuivinen
@ 2005-11-07 23:54     ` Junio C Hamano
  2005-11-09 10:36       ` Fredrik Kuivinen
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-07 23:54 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: Linus Torvalds, Git Mailing List

Fredrik Kuivinen <freku045@student.liu.se> writes:

> @@ -178,7 +181,7 @@ def mergeFile(oPath, oSha, oMode, aPath,
>              sha = bSha
>      else:
>          if aSha != oSha and bSha != oSha:
> -            merge = True
> +            merge = MERGE_TRIVIAL

The rest looks good to me, but are you sure about this part?  I
have a feeling that the above "and" should be "or", meaning, we
check to see if there is _any_ change, and default to TRIVIAL,
but later we would find that we need a real merge and then
promote it to MERGE_3WAY.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] merge-recursive: Only print relevant rename messages
  2005-11-07 23:54     ` Junio C Hamano
@ 2005-11-09 10:36       ` Fredrik Kuivinen
  0 siblings, 0 replies; 58+ messages in thread
From: Fredrik Kuivinen @ 2005-11-09 10:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Kuivinen, Linus Torvalds, Git Mailing List

On Mon, Nov 07, 2005 at 03:54:57PM -0800, Junio C Hamano wrote:
> Fredrik Kuivinen <freku045@student.liu.se> writes:
> 
> > @@ -178,7 +181,7 @@ def mergeFile(oPath, oSha, oMode, aPath,
> >              sha = bSha
> >      else:
> >          if aSha != oSha and bSha != oSha:
> > -            merge = True
> > +            merge = MERGE_TRIVIAL
> 
> The rest looks good to me, but are you sure about this part?  I
> have a feeling that the above "and" should be "or", meaning, we
> check to see if there is _any_ change, and default to TRIVIAL,
> but later we would find that we need a real merge and then
> promote it to MERGE_3WAY.
> 

You are right. The code actually do the right thing, but it does it by
accident. Please apply the following patch.

---

merge-recursive: Fix limited output of rename messages

The previous code did the right thing, but it did it by accident.

Signed-off-by: Fredrik Kuivinen <freku045@student.liu.se>


---

 git-merge-recursive.py |   12 ++++--------
 1 files changed, 4 insertions(+), 8 deletions(-)

applies-to: bb7dd65e1d945edbe0137a761ebc388c7394067a
f56613498cd7fb7013f532a04e63b580314ed957
diff --git a/git-merge-recursive.py b/git-merge-recursive.py
index 9983cd9..3657875 100755
--- a/git-merge-recursive.py
+++ b/git-merge-recursive.py
@@ -162,13 +162,10 @@ def mergeTrees(head, merge, common, bran
 # Low level file merging, update and removal
 # ------------------------------------------
 
-MERGE_NONE = 0
-MERGE_TRIVIAL = 1
-MERGE_3WAY = 2
 def mergeFile(oPath, oSha, oMode, aPath, aSha, aMode, bPath, bSha, bMode,
               branch1Name, branch2Name):
 
-    merge = MERGE_NONE
+    merge = False
     clean = True
 
     if stat.S_IFMT(aMode) != stat.S_IFMT(bMode):
@@ -181,7 +178,7 @@ def mergeFile(oPath, oSha, oMode, aPath,
             sha = bSha
     else:
         if aSha != oSha and bSha != oSha:
-            merge = MERGE_TRIVIAL
+            merge = True
 
         if aMode == oMode:
             mode = bMode
@@ -211,7 +208,6 @@ def mergeFile(oPath, oSha, oMode, aPath,
             os.unlink(src1)
             os.unlink(src2)
 
-            merge = MERGE_3WAY
             clean = (code == 0)
         else:
             assert(stat.S_ISLNK(aMode) and stat.S_ISLNK(bMode))
@@ -590,7 +586,7 @@ def processRenames(renamesA, renamesB, b
                 if merge or not clean:
                     print 'Renaming', fmtRename(path, ren1.dstName)
 
-                if merge == MERGE_3WAY:
+                if merge:
                     print 'Auto-merging', ren1.dstName
 
                 if not clean:
@@ -668,7 +664,7 @@ def processRenames(renamesA, renamesB, b
                 if merge or not clean:
                     print 'Renaming', fmtRename(ren1.srcName, ren1.dstName)
 
-                if merge == MERGE_3WAY:
+                if merge:
                     print 'Auto-merging', ren1.dstName
 
                 if not clean:
---
0.99.9.GIT

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-07 16:48 Comments on recursive merge Linus Torvalds
  2005-11-07 16:56 ` Linus Torvalds
@ 2005-11-07 22:58 ` Fredrik Kuivinen
  2005-11-08  0:13   ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Fredrik Kuivinen @ 2005-11-07 22:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Fredrik Kuivinen, Git Mailing List

On Mon, Nov 07, 2005 at 08:48:06AM -0800, Linus Torvalds wrote:
> 
> Guys,
> 
>   I just hit my first real rename conflict, and very timidly tried the 
> "recursive" strategy in the hopes that I wouldn't need to do things by 
> hand.
> 
> It resolved things beautifully. Good job. 

I'm glad that it worked.

> My only worry is that I don't read python, so I don't really know how it 
> does what it does, which makes me nervous. Can somebody (Fredrik?) add 
> some documentation about the merge strategy and how it works.

I will write something up.

> Considering that the stupid resolve strategy really requires you to know 
> how git works when rename conflicts happen (things left in unmerged state 
> are really quite hard to handle by hand unless you know exactly what 
> you're doing), I'd almost suggest making "recursive" the default. I'm a 
> bit nervous about it, but knowing how it works would probably put most of 
> that to rest.

It would be great if the recursive strategy could get some more
testing. I have tested it on a thousand commits or so in a few kernel
repositories and haven't found any bugs, but it could be due to errors
in the test setup, testing the wrong repositories or just being lucky. Some
real-world testing would be great.

- Fredrik

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-07 22:58 ` Comments on recursive merge Fredrik Kuivinen
@ 2005-11-08  0:13   ` Junio C Hamano
  2005-11-08  0:33     ` Linus Torvalds
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-08  0:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Fredrik Kuivinen

Fredrik Kuivinen <freku045@student.liu.se> writes:

> On Mon, Nov 07, 2005 at 08:48:06AM -0800, Linus Torvalds wrote:
>> 
>> Guys,
>> 
>>   I just hit my first real rename conflict, and very timidly tried the 
>> "recursive" strategy in the hopes that I wouldn't need to do things by 
>> hand.
>> 
>> It resolved things beautifully. Good job. 
>
> I'm glad that it worked.

This is the first time I see you pleased by something in git
that was done without very close supervision from you.  All the
credits for this one goes to Fredrik, of course, but it is a
small victory for me as the maintainer as well, and I am very
happy about it.

>> ..., I'd almost suggest making "recursive" the default. I'm a
>> bit nervous about it, but knowing how it works would probably
>> put most of that to rest.

Another thing to consider is if it is fast enough for everyday
trivial merges.

In any case, I've been thinking about teaching git-merge to look
into .git/config to make it overridable which strategy to use by
default.  This would eliminate the hardcoded 'resolve for
two-head, octopus for more' rule from git-pull.  Then we could
ship git-merge with the default rule of 'recursive for two-head,
octopus for more', and if it turns out to be premature, you can
update your config file to use resolve for two-head case while
we sort things out.  That way, recursive would get wider test
coverage, and people who really need a working merge this minute
can choose to run resolve in emergency without specifying '-s
resolve' on the command line of 'git pull' every time.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08  0:13   ` Junio C Hamano
@ 2005-11-08  0:33     ` Linus Torvalds
  2005-11-08  0:59       ` Junio C Hamano
                         ` (3 more replies)
  0 siblings, 4 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-11-08  0:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Fredrik Kuivinen

On Mon, 7 Nov 2005, Junio C Hamano wrote:
> 
> This is the first time I see you pleased by something in git
> that was done without very close supervision from you.

That sounds like a backhanded way of saying that I'm micromanagering, 
picky and difficult to work with ;)

> Another thing to consider is if it is fast enough for everyday
> trivial merges.

Hmm. True. The _really_ trivial in-index case triggers for me pretty 
often, but I haven't done any statistics. It might be only 50% of the 
time.

Is the recursive thing noticeably slower for the "easy" cases (ie things 
that the old regular resolve strategy does well)?

It's certainly an option to just do what I just did, namely use the 
default one until it breaks, and then just do "git reset --hard" and re-do 
the pull with "-s recursive". A bit sad, and it would be good to have 
coverage on the recursive strategy..

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08  0:33     ` Linus Torvalds
@ 2005-11-08  0:59       ` Junio C Hamano
  2005-11-08 11:58       ` Johannes Schindelin
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-08  0:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Fredrik Kuivinen

Linus Torvalds <torvalds@osdl.org> writes:

> On Mon, 7 Nov 2005, Junio C Hamano wrote:
>> 
>> This is the first time I see you pleased by something in git
>> that was done without very close supervision from you.
>
> That sounds like a backhanded way of saying that I'm micromanagering, 
> picky and difficult to work with ;)

Sorry, that is not what I meant to say at all.

You used to micromanage, but it was _very_ good for git back
then.  I admit that only once I found you too picky and
difficult to work with while I was fixing a bad premature-free
bug in the diffcore-rename code, but overall your attention to
detail well paid off.

Since I inherited the project, we added quite a lot of stuff,
but I was still unsure if we are making good progress, or just
stagnating with only small enhancements and obvious fixes.

But Fredrik merge turns out to be a spectacular success as you
found out, which is a triumph for Fredrik, but at the same time
it means I was not doing too bad myself ;-).

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08  0:33     ` Linus Torvalds
  2005-11-08  0:59       ` Junio C Hamano
@ 2005-11-08 11:58       ` Johannes Schindelin
  2005-11-08 21:02         ` Fredrik Kuivinen
  2005-11-08 16:21       ` [RFC/PATCH] Make git-recursive the default strategy for git-pull Junio C Hamano
  2005-11-11 22:25       ` Comments on recursive merge Junio C Hamano
  3 siblings, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2005-11-08 11:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git, Fredrik Kuivinen

Hi,

On Mon, 7 Nov 2005, Linus Torvalds wrote:

> Is the recursive thing noticeably slower for the "easy" cases (ie things 
> that the old regular resolve strategy does well)?

IIRC recursive does nothing else than recursively merging the merge-bases 
(granted, in a clever way). So if there is only one merge-base, the only 
slow-down would be the startup of python (which is probably worth it, 
anyway).

> It's certainly an option to just do what I just did, namely use the 
> default one until it breaks, and then just do "git reset --hard" and re-do 
> the pull with "-s recursive". A bit sad, and it would be good to have 
> coverage on the recursive strategy..

We already have a fallback list: after really-trivial, try automatic, ...,
try resolve. Why not just add recursive? So, if even resolve failed, just 
try once more, with recursive.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08 11:58       ` Johannes Schindelin
@ 2005-11-08 21:02         ` Fredrik Kuivinen
  2005-11-08 21:47           ` Junio C Hamano
                             ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Fredrik Kuivinen @ 2005-11-08 21:02 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, Junio C Hamano, git, Fredrik Kuivinen

On Tue, Nov 08, 2005 at 12:58:50PM +0100, Johannes Schindelin wrote:
> Hi,
> 
> On Mon, 7 Nov 2005, Linus Torvalds wrote:
> 
> > Is the recursive thing noticeably slower for the "easy" cases (ie things 
> > that the old regular resolve strategy does well)?
> 
> IIRC recursive does nothing else than recursively merging the merge-bases 
> (granted, in a clever way). So if there is only one merge-base, the only 
> slow-down would be the startup of python (which is probably worth it, 
> anyway).
> 

I haven't done any real measurements but my feeling is that the
recursive strategy is at least not very much slower than the resolve
strategy.

In the single-common-ancestor case I can think of the following things
which may make a difference speed wise:

* The recursive strategy is written in Python
* The code for finding common ancestors is also written in Python and
  is probably a bit slower than git-merge-base.
* git-diff-tree -M --diff-filter=R <common ancestor> <branch> is
  executed twice, once for each branch.

On the positive side the code which corresponds to git-merge-one-file
in the git-resolve case is also written in python, we can therefore
avoid some forks and execs.

> > It's certainly an option to just do what I just did, namely use the 
> > default one until it breaks, and then just do "git reset --hard" and re-do 
> > the pull with "-s recursive". A bit sad, and it would be good to have 
> > coverage on the recursive strategy..
> 
> We already have a fallback list: after really-trivial, try automatic, ...,
> try resolve. Why not just add recursive? So, if even resolve failed, just 
> try once more, with recursive.
> 

I don't think this is a very good idea for two reasons. The first one
is that there are some merge scenarios involving renames which should
be conflicts but are cleanly merged by git-resolve.

The second reason is that with the fall back list the recursive
strategy will only be used in the strange corner cases and will thus
not get nearly the same amount of testing it would get if it was the
first choice (or directly after the really-trivial merge).

- Fredrik

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08 21:02         ` Fredrik Kuivinen
@ 2005-11-08 21:47           ` Junio C Hamano
  2005-11-08 21:52           ` Linus Torvalds
  2005-11-08 23:04           ` Comments on recursive merge Johannes Schindelin
  2 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-08 21:47 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: Johannes Schindelin, Linus Torvalds, git

Fredrik Kuivinen <freku045@student.liu.se> writes:

> The second reason is that with the fall back list the recursive
> strategy will only be used in the strange corner cases and will thus
> not get nearly the same amount of testing it would get if it was the
> first choice (or directly after the really-trivial merge).

There are two reasons to avoid git-merge choose from more than
one strategy.

1. The whole idea that git-merge implements "goodness" metric is
   bogus.  It does not know what merge strategy is good and that
   is the reason it punts and has the user choose his preferred
   strategy.

2. When it is going to loop over more than one strategy, it
   stashes away the current working tree state, so that the
   second and subsequent strategies can begin from a clean slate
   (including local modifications since the current head).  If
   we try only one, there is no such cost involved.

I think the patch I sent out last night to change the recursive
as the default strategy and make it overridable from the
configuration mechanism would be a better way to give people
more exposure to the greatness of recursive while protecting
them from potential glitches if any.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08 21:02         ` Fredrik Kuivinen
  2005-11-08 21:47           ` Junio C Hamano
@ 2005-11-08 21:52           ` Linus Torvalds
  2005-11-08 22:36             ` Fredrik Kuivinen
  2005-11-08 23:04           ` Comments on recursive merge Johannes Schindelin
  2 siblings, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-11-08 21:52 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: Johannes Schindelin, Junio C Hamano, git



On Tue, 8 Nov 2005, Fredrik Kuivinen wrote:
>
> * The code for finding common ancestors is also written in Python and
>   is probably a bit slower than git-merge-base.

Btw, what part of git-merge-bases is it that makes it not be practical?

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08 21:52           ` Linus Torvalds
@ 2005-11-08 22:36             ` Fredrik Kuivinen
  2005-11-08 23:05               ` Linus Torvalds
  0 siblings, 1 reply; 58+ messages in thread
From: Fredrik Kuivinen @ 2005-11-08 22:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Fredrik Kuivinen, Johannes Schindelin, Junio C Hamano, git

On Tue, Nov 08, 2005 at 01:52:06PM -0800, Linus Torvalds wrote:
> 
> 
> On Tue, 8 Nov 2005, Fredrik Kuivinen wrote:
> >
> > * The code for finding common ancestors is also written in Python and
> >   is probably a bit slower than git-merge-base.
> 
> Btw, what part of git-merge-bases is it that makes it not be practical?
> 

The problem is in the multiple-common-ancestors case. If we have three
common ancestors, A, B and C, we will start with merging A with B. The
result is a new 'virtual' commit object (not stored in the object
database), lets call it V. We are then going to merge V with C. To do
that we need to get the common ancestor(s) of V and C, and as V
doesn't exist in the database we can't use git-merge-base.

I haven't given it a lot of thought though, it might be possible to
use git-merge-base in some way and get the same results as we get now.

It would certainly be possible to use git-merge-base in the first
iteration and use the python code only when we actually have any
'virtual' commit objects.

- Fredrik

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08 22:36             ` Fredrik Kuivinen
@ 2005-11-08 23:05               ` Linus Torvalds
  2005-11-08 23:18                 ` Johannes Schindelin
  2005-11-09  0:32                 ` Petr Baudis
  0 siblings, 2 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-11-08 23:05 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: Johannes Schindelin, Junio C Hamano, git

On Tue, 8 Nov 2005, Fredrik Kuivinen wrote:
> 
> The problem is in the multiple-common-ancestors case. If we have three
> common ancestors, A, B and C, we will start with merging A with B. The
> result is a new 'virtual' commit object (not stored in the object
> database), lets call it V. We are then going to merge V with C. To do
> that we need to get the common ancestor(s) of V and C, and as V
> doesn't exist in the database we can't use git-merge-base.

Hmm. That's really the same as the merge-base of "C _and_ (A _or_ B)", 
isn't it?

So we should be able to do that even without ever seeing the virtual 
merge. In fact, it really is pretty trivial from a technical standpoint: 
the "A _or_ B" part is really just inserting both A and B with the same 
"flags" value (see the "merge_base()" function in merge-base.c).

So in general, "merge-base" could trivially be extended to have any number 
of "OR commits" on either side, as long as there is just one "and".

It could also be extended to have multiple "and" cases, but that has 
actually already been done by "git-show-branch", I think. It's all the 
same logic, except it uses more than just two bits.

So _technically_ it should be easy to do, it would just need some sane 
command line syntax to specify the grouping.

> I haven't given it a lot of thought though, it might be possible to
> use git-merge-base in some way and get the same results as we get now.
> 
> It would certainly be possible to use git-merge-base in the first
> iteration and use the python code only when we actually have any
> 'virtual' commit objects.

Just handling the first case specially would be sufficient for 99% of all 
uses.  And if the multi-parent cases are slightly slower, I don't think 
anybody cares.

In fact, we _always_ do the first git-merge-base in git-merge.sh anyway 
(actually, not git-merge-base, but "git-show-branch --merge-base"). So 
that's really done already for the "test if it's trivial" case..

Junio, that points out that "git-merge-base" is another program that could 
just be removed, since it's really supreceded by git-show-branch. Or did I 
miss something?

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08 23:05               ` Linus Torvalds
@ 2005-11-08 23:18                 ` Johannes Schindelin
  2005-11-09  0:18                   ` Linus Torvalds
  2005-11-09  0:32                 ` Petr Baudis
  1 sibling, 1 reply; 58+ messages in thread
From: Johannes Schindelin @ 2005-11-08 23:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Fredrik Kuivinen, Junio C Hamano, git

Hi,

On Tue, 8 Nov 2005, Linus Torvalds wrote:

> Junio, that points out that "git-merge-base" is another program that could 
> just be removed, since it's really supreceded by git-show-branch. Or did I 
> miss something?

IIRC, git-show-branch has a limit on the number of refs it can take.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08 23:18                 ` Johannes Schindelin
@ 2005-11-09  0:18                   ` Linus Torvalds
  2005-11-09  6:10                     ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-11-09  0:18 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Fredrik Kuivinen, Junio C Hamano, git

On Wed, 9 Nov 2005, Johannes Schindelin wrote:
> 
> On Tue, 8 Nov 2005, Linus Torvalds wrote:
> 
> > Junio, that points out that "git-merge-base" is another program that could 
> > just be removed, since it's really supreceded by git-show-branch. Or did I 
> > miss something?
> 
> IIRC, git-show-branch has a limit on the number of refs it can take.

Well, git-merge-base does too. git-merge-base only takes two refs ;)

In general, you need to keep track of one bit per ref, and since we have 
a 32-bit "flags" word and need a couple of bits for other maintenance 
info, pretty much anything that figures out common heads will be limited 
some way. 

This is only a limit for the "and" logic - the "or" logic (if we implement 
it) will just share the same status bit for all the refs that are "ored 
together" and thus has no limits. 

Oh, and the "and" logic can be extended by running the program multiple 
times, so it's not a "hard" limit, it's just an issue of convenience.

That said, anybody who ever does an octopus of more than just a few heads 
deserves to be shot, so I don't think the limit should matter. The 
recursive strategy should only add the "or" kind of refs, and it 
shouldn't be a problem (apart from just how to describe them).

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09  0:18                   ` Linus Torvalds
@ 2005-11-09  6:10                     ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-09  6:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> In general, you need to keep track of one bit per ref, and since we have 
> a 32-bit "flags" word and need a couple of bits for other maintenance 
> info, pretty much anything that figures out common heads will be limited 
> some way. 
>
> This is only a limit for the "and" logic - the "or" logic (if we implement 
> it) will just share the same status bit for all the refs that are "ored 
> together" and thus has no limits. 
>
> Oh, and the "and" logic can be extended by running the program multiple 
> times, so it's not a "hard" limit, it's just an issue of convenience.
>
> That said, anybody who ever does an octopus of more than just a few heads 
> deserves to be shot, so I don't think the limit should matter. The 
> recursive strategy should only add the "or" kind of refs, and it 
> shouldn't be a problem (apart from just how to describe them).

Come to think of it, git-merge-octopus does AND.  If I am
merging topic branches 1, 2, 3,... N into my master, internally
it does an equivalent of merging 1 into master, then 2 into the
result, then C into that result,..., and it uses merge-base of
all the heads merged so far and the original master to pivot on.

And I think this is *wrong*.  The merge base of each step when
merging head N does not have to be older than merge base of the
original master and head N, but currently that is not what it
does.  I should be ORing them ideally, but even if I do not, I
should be able to just use the merge base of head N and original
master.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08 23:05               ` Linus Torvalds
  2005-11-08 23:18                 ` Johannes Schindelin
@ 2005-11-09  0:32                 ` Petr Baudis
  2005-11-09  0:51                   ` Linus Torvalds
  1 sibling, 1 reply; 58+ messages in thread
From: Petr Baudis @ 2005-11-09  0:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Fredrik Kuivinen, Johannes Schindelin, Junio C Hamano, git

Dear diary, on Wed, Nov 09, 2005 at 12:05:43AM CET, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> Junio, that points out that "git-merge-base" is another program that could 
> just be removed, since it's really supreceded by git-show-branch. Or did I 
> miss something?

Wow, I didn't know git-show-branch could do that (even though it's a bit
unnatural to expect this from command named this way; then again,
there's git-rev-parse...).

BTW, git-show-branch is also by orders of magnitude faster (not that
this would be any major timesaver). Median 0.006s vs. median 0.124s.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09  0:32                 ` Petr Baudis
@ 2005-11-09  0:51                   ` Linus Torvalds
  2005-11-09  0:59                     ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-11-09  0:51 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Fredrik Kuivinen, Johannes Schindelin, Junio C Hamano, git

On Wed, 9 Nov 2005, Petr Baudis wrote:
> 
> BTW, git-show-branch is also by orders of magnitude faster (not that
> this would be any major timesaver). Median 0.006s vs. median 0.124s.

Ouch. That makes me suspicious. One reason git-merge-base is slow is 
because it's being pretty careful about some pathological examples of 
dates being just the wrong way around, and it might just be that the 
reason git-show-branch is faster is because it isn't doing that part 
right.

So yes, git-merge-base does extra work, but it does so because I think it 
needs to.

Junio? You even wrote the comment about the case in git-merge-base, I'm 
wondering whether it's a bug that we use the fast-and-cheap algorithm in 
git-show-branch..

Of course, arguably you can first try the fast-and-cheap thing, and if 
that gives a merge parent that is acceptable, why not? So maybe it's the 
right thing for the "let's see if this is trivial" case, but I think it 
might _think_ some cases are trivial that really shouldn't, because they 
actually have two merge parents.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09  0:51                   ` Linus Torvalds
@ 2005-11-09  0:59                     ` Junio C Hamano
  2005-11-09  1:22                       ` Linus Torvalds
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-09  0:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> Junio? You even wrote the comment about the case in git-merge-base, I'm 
> wondering whether it's a bug that we use the fast-and-cheap algorithm in 
> git-show-branch..

I did show-branch soon after we worked on those pathlogical
merge-base fix, so I would be a bit surprised if I did it
without using all the knowledge from that exercise, but I do not
remember offhand.  The core logic should be simple
generalization of two-head merge-base to N heads.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09  0:59                     ` Junio C Hamano
@ 2005-11-09  1:22                       ` Linus Torvalds
  2005-11-09  1:42                         ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-11-09  1:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 8 Nov 2005, Junio C Hamano wrote:
> 
> I did show-branch soon after we worked on those pathlogical
> merge-base fix, so I would be a bit surprised if I did it
> without using all the knowledge from that exercise, but I do not
> remember offhand.  The core logic should be simple
> generalization of two-head merge-base to N heads.

Hmm.

Look at the "join_revs()" logic, and tell me I'm crazy.

It does:

	struct commit *commit = pop_one_commit(list_p);
	int still_interesting = !!interesting(*list_p);

in that order: it looks whether there are any interesting commits left 
_after_ it has popped the top-of-stack.

Which means that "still_interesting" can go down to zero if we just popped 
the last interesting thing off the stack.

Which seems wrong, because the thing we just popped off the stack could 
easily itself be interesting (in fact, it should be so, 99% of the time), 
and can cause other interesting commits to be populated back onto the 
list. So the "still_interesting" flag seems to be wrongly computed: the 
way it is computed now, it's meaningless.

In contrast, the "merge_base()" thing does

	while (interesting(list)) {
		..
	}

which means that we really will walk the list until there is nothing 
interesting left. Which is admittedly expensive, but it was how we got rid 
of the pathological case.

But maybe I'm just missing something really subtle. Maybe git-show-branch 
does some really clever optimization that is valid.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09  1:22                       ` Linus Torvalds
@ 2005-11-09  1:42                         ` Junio C Hamano
  2005-11-09 10:20                           ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-09  1:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> It does:
>
> 	struct commit *commit = pop_one_commit(list_p);
> 	int still_interesting = !!interesting(*list_p);
>
> in that order: it looks whether there are any interesting commits left 
> _after_ it has popped the top-of-stack.

Ahhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.  You are right.

The problem is most of the time hidden, because we usually do
one extra round (extra usually starts from 0 and we break out
after we say "not interesting anymore" and extra < 0).

Obviously, I was not thinking clearly.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09  1:42                         ` Junio C Hamano
@ 2005-11-09 10:20                           ` Junio C Hamano
  2005-11-09 14:59                             ` Petr Baudis
  2005-11-09 16:30                             ` Linus Torvalds
  0 siblings, 2 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-09 10:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Junio C Hamano <junkio@cox.net> writes:

> Linus Torvalds <torvalds@osdl.org> writes:
>
>> It does:
>>
>> 	struct commit *commit = pop_one_commit(list_p);
>> 	int still_interesting = !!interesting(*list_p);
>>
>> in that order: it looks whether there are any interesting commits left 
>> _after_ it has popped the top-of-stack.
>
> The problem is most of the time hidden,...

As you pointed out, still_interesting means "after we are done
with this commit, do we still have something interesting to be
processed?", and the later "extra < 0" check compensates for
this.  After I pop the last interesting commit, I still look at
its parents and push them back into the list.

It seems to be doing the right thing after all.  I hate to admit
it, but I have been having hard time figuring out how this thing
works X-<.  In the meantime, I've checked commits from linux-2.6
history that have more than one merge-base candidates.
"git-merge-base --all" and "git-show-branch --merge-base" give
the same answer to all of them [*1*].

I do not think "git-show-branch --merge-base" can be any more
efficient than "git-merge-base --all".  It does _more_ things
(probably unnecessary things as well).  Pasky's number could be
just an artifact of hot/cold cache difference.

[Footnote]

*1* Here are the commits I used from linux-2.6 repository that
have more than one commits:

    ba9b543d5bec0a7605952e2ba501fb8b0f3b6407
    84ffa747520edd4556b136bdfc9df9eb1673ce12
    da28c12089dfcfb8695b6b555cdb8e03dda2b690
    3190186362466658f01b2e354e639378ce07e1a9
    0c168775709faa74c1b87f1e61046e0c51ade7f3
    0e396ee43e445cb7c215a98da4e76d0ce354d9d7
    467ca22d3371f132ee225a5591a1ed0cd518cb3d

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09 10:20                           ` Junio C Hamano
@ 2005-11-09 14:59                             ` Petr Baudis
  2005-11-09 16:30                             ` Linus Torvalds
  1 sibling, 0 replies; 58+ messages in thread
From: Petr Baudis @ 2005-11-09 14:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

Dear diary, on Wed, Nov 09, 2005 at 11:20:22AM CET, I got a letter
where Junio C Hamano <junkio@cox.net> said that...
> I do not think "git-show-branch --merge-base" can be any more
> efficient than "git-merge-base --all".  It does _more_ things
> (probably unnecessary things as well).  Pasky's number could be
> just an artifact of hot/cold cache difference.

Certainly not that. But I've fetched in the meantime and now show-branch
takes much longer - median 0.078s (git-merge-base's median still stays
around 0.128s). So possibly git-show-branch did some smart optimization
right away in the previous case. I can try to track down the particular
commits if there's any interest.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09 10:20                           ` Junio C Hamano
  2005-11-09 14:59                             ` Petr Baudis
@ 2005-11-09 16:30                             ` Linus Torvalds
  2005-11-09 20:13                               ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-11-09 16:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, 9 Nov 2005, Junio C Hamano wrote:
> 
> As you pointed out, still_interesting means "after we are done
> with this commit, do we still have something interesting to be
> processed?", and the later "extra < 0" check compensates for
> this.  After I pop the last interesting commit, I still look at
> its parents and push them back into the list.

That "extra" check only helps once. If we ever hit the "extra--", it's 
gone.

In other words, follow this:

 - we start out with "extra = 0" (default value)
 - we've got one "interesting" commit left, and we just popped it.
 - we now have "still_interesting = 0"
 - the commit has just one parent, and it's not something we've seen 
   before, so we add it to the seen list and decrement "extra", which is 
   now -1. We then insert it back to the list.
 - we go back up, pop the thing we just got, and now there are again no 
   interesting commits on the list any more, so "still_interesting = 0".
 - now "extra" is -1, and we break out of the loop without ever 
   percolating the flags of this commit to its parents.

No?

> It seems to be doing the right thing after all.  I hate to admit it, but 
> I have been having hard time figuring out how this thing works X-<.  In 
> the meantime, I've checked commits from linux-2.6 history that have more 
> than one merge-base candidates.

I'm not very impressed by "it works for the seven cases I tried".

It's entirely possible that there _is_ some reason it always works, but if 
so, I'd like to understand it. More likely, it works in _practice_ because 
the only way to trigger anything else is likely such a perverse commit 
history that you'd never see it, but hey..

Also, I don't think this has necessarily anything to do with "multiple 
merge bases". As far as I can tell, we can find a potential "merge base" 
that starts the culling of uniniteresting things, but some other branch 
(that we haven't followed yet - perhaps the one we just broke out of 
early) may end up causing an _earlier_ commit to turn out to also be a 
merge-base, and the merge-base we found originally turns out to be a 
parent of the new one, and thus totally uninteresting.

See what I'm saying? Even with just _one_ well-defined merge base, we 
might hit it.

It so happens that because we traverse the commit history in date order, 
we almost never (but the keyword here is _almost_) hit the case where a 
child of a commit ends up being parsed _after_ the commit that is its 
parent. That only happens when there are non-synchronized clocks etc, and 
there are very few cases of that in the kernel tree.

Just to see how rare that is, do this:

	git-rev-list --pretty=raw HEAD |
		grep '^committer' |
		cut -d'>' -f2 |
		cut -d' ' -f2 > date-list

which basically generates the list of dates of commits in the kernel tree, 
sorted in the natural order that we always traverse the commits in.

Now, do

	sort -nr date-list | diff -u date-list -

to see how often the dates are off. I'm seeing only _three_ commits that 
have time-warps (ie they were "earlier" than one of their parents). Out of 
13,000+.

So walking things in date order _almost_ always does the right thing just 
by mistake (well, it's not "mistake", of course. It's by design: it's the 
closest we can get to a nice balanced walk. But the point is that it's 
still just a heuristic, not something we can absolutely depend on).

And THAT was the reason for the problem with the original git-merge-base 
algorithm. Not multiple merge-bases (which was admittedly another 
problem), but the fact that it didn't give the right merge-base at all due 
to time warps.

(Again - it may be that there's something in show-branch that makes the 
optimization valid, but I just don't understand it).

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09 16:30                             ` Linus Torvalds
@ 2005-11-09 20:13                               ` Junio C Hamano
  2005-11-09 21:58                                 ` Linus Torvalds
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-09 20:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> That "extra" check only helps once. If we ever hit the "extra--", it's 
> gone.

I think you are right here, but while digging into this I found
an interesting case.

The current show-branch code does the same as merge-base in the
pathological example depicted in merge-base.c, but they seem to
do different things to this picture (commit grows from bottom to
top, time flows alphabetically; find base between G and H).

                 H
                / \
           G   A   \
           |\ /     \ 
           | B       \
           |  \       \
            \  C       F
             \  \     / 
              \  D   /   
               \ |  /
                \| /
		 E

"git-merge-base --all" says the merge bases are B and E, while
"show-branch --merge-base" mentions only B.  In this case the
latter is probably the better answer.  Actually git-merge-base
without --all only mentions E.  This is because we give up when
we find the list elements are all uninteresting.  And this is
very expensive to fix (I recall mentioning "horizon effect" last
time we worked on this --- around August 12th).

G gets bit 1 and H gets bit 2.  Here is what happens in each
iteration:

	List			A B C D E F G H		Result
	G1 H2			- - - - - - 1 2
	H2 E1 B1		- 1 - - 1 - 1 2
	F2 E1 B1 A2		2 1 - - 1 2 1 2
	E3 B1 A2		2 1 - - 3 2 1 2		E3
	B1 A2			2 1 - - 3 2 1 2		E3
	C1 A2			2 1 1 - 3 2 1 2		E3
	D1 A2			2 1 1 1 3 2 1 2		E3
	A2			2 1 1 1 3 2 1 2		E3
	B3			2 3 1 1 3 2 1 2		E3 B3
	C7			2 3 7 1 3 2 1 2		E3 B3

We popped B with flag 3, and started contaminating the well by
reinjecting its parent C with flag 7.  That is all good, but
"while (interesting(list))" check stops us from going further.
Ideally the following two steps would have found out that E is
also uninteresting.

	D7			2 3 7 7 3 2 1 2
	E7			2 3 7 7 7 2 1 2

But that is expensive -- we would not know when to stop.

A reproduction recipe is attached here, primarily so I do not
have to worry about losing it from /var/tmp/.

-- >8 -- cut here -- >8 --
#!/bin/sh

rm -fr .git && git-init-db
T=$(git-write-tree)

M=1130000000
Z=+0000

export GIT_COMMITTER_EMAIL=git@comm.iter.xz
export GIT_COMMITTER_NAME='C O Mmiter'
export GIT_AUTHOR_NAME='A U Thor'
export GIT_AUTHOR_EMAIL=git@au.thor.xz

doit() {
	OFFSET=$1; shift
	NAME=$1; shift
	PARENTS=
	for P
	do
		PARENTS="${PARENTS}-p $P "
	done
	GIT_COMMITTER_DATE="$(($M + $OFFSET)) $Z"
	GIT_AUTHOR_DATE=$GIT_COMMITTER_DATE
	export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
	commit=$(echo $NAME | git-commit-tree $T $PARENTS)
	echo $commit >.git/refs/tags/$NAME
	echo $commit
}

checkit() {
    echo MB
    git-merge-base --all "$@" | xargs git-name-rev
    echo SB
    git-show-branch --merge-base "$@" | xargs git-name-rev
    git-show-branch --sha1-name --more=99 "$@"
}

E=$(doit 5 E)
D=$(doit 4 D $E)
F=$(doit 6 F $E)
C=$(doit 3 C $D)
B=$(doit 2 B $C)
A=$(doit 1 A $B)
G=$(doit 7 G $B $E)
H=$(doit 8 H $A $F)

checkit $G $H

exit

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09 20:13                               ` Junio C Hamano
@ 2005-11-09 21:58                                 ` Linus Torvalds
  2005-11-09 22:56                                   ` Junio C Hamano
  2005-11-11  2:58                                   ` merge-base: fully contaminate the well Junio C Hamano
  0 siblings, 2 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-11-09 21:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, 9 Nov 2005, Junio C Hamano wrote:
> 
> The current show-branch code does the same as merge-base in the
> pathological example depicted in merge-base.c, but they seem to
> do different things to this picture (commit grows from bottom to
> top, time flows alphabetically; find base between G and H).
> 
>                  H
>                 / \
>            G   A   \
>            |\ /     \ 
>            | B       \
>            |  \       \
>             \  C       F
>              \  \     / 
>               \  D   /   
>                \ |  /
>                 \| /
> 		   E
> 
> "git-merge-base --all" says the merge bases are B and E, while
> "show-branch --merge-base" mentions only B.  In this case the
> latter is probably the better answer.

I don't agree.

Sure, B _may_ be the right answer for a particular merge strategt, but 
there's no way of knowing. Maybe all the big changes came in through F, 
and H is the merge that sorted that out, and E actually ends up being the 
better base.

So I think from a correctness standpoint, the only thing that matters is 
"git-merge-base --all", and anything that doesn't know to return both E 
and B looks potentially buggy.

> Actually git-merge-base without --all only mentions E.

Well, we should really consider anything that doesn't take them all into 
account to be a bug waiting to happen (or rather, a merge waiting for a 
disaster), but E is the right one, since it's the more recent one).

Now, this case obviously depends on history being almost maximally insane 
(ie pretty much _all_ the dates are wrong). So in practice we probably 
don't care.

So maybe "git-show-branch --merge-base" ends up acceptable as a faster way 
to do the quick "let's see if we can find _some_ merge-base to do the 
in-index merge with", but personally I'd much rather always do a 
"git-merge-base --all", and only do the fast index merge if we only have 
one potential parent.

That way there would never any question about what the "quick merge" does.

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09 21:58                                 ` Linus Torvalds
@ 2005-11-09 22:56                                   ` Junio C Hamano
  2005-11-09 23:34                                     ` Linus Torvalds
  2005-11-11  2:58                                   ` merge-base: fully contaminate the well Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-09 22:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

>> 
>>                  H
>>                 / \
>>            G   A   \
>>            |\ /     \ 
>>            | B       \
>>            |  \       \
>>             \  C       F
>>              \  \     / 
>>               \  D   /   
>>                \ |  /
>>                 \| /
>> 		   E
>> 
> So I think from a correctness standpoint, the only thing that matters is 
> "git-merge-base --all", and anything that doesn't know to return both E 
> and B looks potentially buggy.

But the point of well-poisoning you did in merge-base was to
detect that E is an ancestor of B and exclude it in the first
place.  If it matters what F does, it means checking ancestry
among B C D E and declare that B is a better ancestor than C, D,
E does not help or is sometimes harmful.  No question that B is
always superiour ancestor than C and D, but arguably the
presence of F _might_ change situation for B vs E.

I however do not see merge-base trying to take that into account
and treat E differently from C and D in any way.  Only because F
and E had newer timestamp than C and D, we ended up finding E
first and did not poison E through B, and that's why you got
both B and E.  I think it was just an accident.  If F were older
than B, I suspect the result would have been very different.

> Now, this case obviously depends on history being almost maximally insane 
> (ie pretty much _all_ the dates are wrong). So in practice we probably 
> don't care.

I agree.  The above example was to answer my own question in
this message:

	http://marc.theaimsgroup.com/?l=git&m=112382448222823

> ... personally I'd much rather always do a 
> "git-merge-base --all", and only do the fast index merge if we only have 
> one potential parent.
>
> That way there would never any question about what the "quick merge" does.

I agree we should try to stay away from "heuristic" and make
things safer, but after seeing the above, I'd need a bit more
time to convince myself that what 'git-merge-base --all' does is
*the* safe approach.  Right now, it looks to me that both are
heuristic that work most of the time (merge-base --all 99.99999%
of the time, show-branch 99% of the time, or something like
that).

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-09 22:56                                   ` Junio C Hamano
@ 2005-11-09 23:34                                     ` Linus Torvalds
  0 siblings, 0 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-11-09 23:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, 9 Nov 2005, Junio C Hamano wrote:

> Linus Torvalds <torvalds@osdl.org> writes:
> 
> >> 
> >>                  H
> >>                 / \
> >>            G   A   \
> >>            |\ /     \ 
> >>            | B       \
> >>            |  \       \
> >>             \  C       F
> >>              \  \     / 
> >>               \  D   /   
> >>                \ |  /
> >>                 \| /
> >> 		   E
> >> 
> > So I think from a correctness standpoint, the only thing that matters is 
> > "git-merge-base --all", and anything that doesn't know to return both E 
> > and B looks potentially buggy.
> 
> But the point of well-poisoning you did in merge-base was to
> detect that E is an ancestor of B and exclude it in the first
> place.

Ahh, you're right, and I'm wrong. That "E" is not a real merge-base, since 
there _is_ a valid merge-base that is a direct descendant of it and thus 
objectively better.

And as to why git-merge-base returns E in the first place: it really 
shouldn't, but when it sees B, it can decide that C is uninteresting, and 
so there are no interesting commits left. So it never continues to walk D 
and thus never notices that D covers E and E is _also_ uninteresting.

So it thinks both E and B are interesting, and since E has a more recent 
date, it will select that one when only showing one (and then show both 
when asked to).

> I however do not see merge-base trying to take that into account
> and treat E differently from C and D in any way.

It doesn't. git-merge-base simply walks the chain in as close to date 
order as it can, and when it decides that the rest of the chain is 
provably uninteresting, it stops.

Which means that sometimes it can stop with too _many_ merge heads, just 
because it hasn't realized that they are reachable through a chain that is 
otherwise provably uninteresting.

This is because we define "uninteresting" as meaning "cannot reach any 
more _new_ merge-heads". Which is true. The fact that such a chain could 
reach some heads we found earlier and mark them as being pointless never 
enters the picture ;)

> I agree we should try to stay away from "heuristic" and make
> things safer, but after seeing the above, I'd need a bit more
> time to convince myself that what 'git-merge-base --all' does is
> *the* safe approach.

Well, "git-merge-base --all" will be "safer" in the sense that it's 
guaranteed to give a superset of the merge-heads (which itself is "safe" 
in that it flags potentially interesting cases early).

Then, the recursive merge strategy could notice (in fact, _will_ notice, 
if it tries to merge the merge-heads) that the merge of such a pair of 
merge-heads is one of the heads itself (just a fast-forward), and thus the 
recursive strategy should correctly have chosen "B" as the merge-head.

So yes, "git-merge-base --all" really is safe. Sometimes (under fairly odd 
circumstances) a bit unnecessarily conservative, but always safe.

> Right now, it looks to me that both are heuristic that work most of the 
> time (merge-base --all 99.99999% of the time, show-branch 99% of the 
> time, or something like that).

The thing is, I don't see what guarantees that the show-branch brhaviour 
is safe or conservative. It happened to pick B in this case, which was the 
right choice, but I don't see how that was anything but just luck and 
happenstance.

IOW, I can see that "git-merge-base --all" can return some unnecessary 
heads, but I can also argue for how that becomes safe and fixes itself. 
With git-show-branch --merge-base, I don't know what that argument is, 
because I can't see how it _guarantees_ that it would always pick B over 
E.

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* merge-base: fully contaminate the well.
  2005-11-09 21:58                                 ` Linus Torvalds
  2005-11-09 22:56                                   ` Junio C Hamano
@ 2005-11-11  2:58                                   ` Junio C Hamano
  2005-11-11  5:36                                     ` Linus Torvalds
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-11  2:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

The discussion on the list demonstrated a pathological case where
an ancestor of a merge-base can be left interesting.  This commit
introduces a postprocessing phase to fix it.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

  Linus Torvalds <torvalds@osdl.org> writes:

  > On Wed, 9 Nov 2005, Junio C Hamano wrote:
  >
  >> But the point of well-poisoning you did in merge-base was to
  >> detect that E is an ancestor of B and exclude it in the first
  >> place.
  >
  > Ahh, you're right, and I'm wrong. That "E" is not a real merge-base, since 
  > there _is_ a valid merge-base that is a direct descendant of it and thus 
  > objectively better.
  >
  > Which means that sometimes it can stop with too _many_ merge heads, just 
  > because it hasn't realized that they are reachable through a chain that is 
  > otherwise provably uninteresting.

  I am not particularly proud of this change, but here is an
  attempt to fully contaminate the well without going all the
  way down to root.  It adds a postprocessing phase which does
  not parse any new commits.

  > The thing is, I don't see what guarantees that the show-branch brhaviour 
  > is safe or conservative.

  You are right about this; I have a separate patch to fix it.


 merge-base.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 77 insertions(+), 1 deletions(-)

applies-to: 8eacf17303188e55375f76bea8051555ba1baf02
a2fd94f707128f3f390362725c8d8b0802940111
diff --git a/merge-base.c b/merge-base.c
index 286bf0e..43a6818 100644
--- a/merge-base.c
+++ b/merge-base.c
@@ -80,6 +80,45 @@ static struct commit *interesting(struct
  * Now, list does not have any interesting commit.  So we find the newest
  * commit from the result list that is not marked uninteresting.  Which is
  * commit B.
+ *
+ *
+ * Another pathological example how this thing can fail to mark an ancestor
+ * of a merge base as UNINTERESTING without the postprocessing phase.
+ *
+ *		  2
+ *		  H
+ *	    1    / \
+ *	    G   A   \
+ *	    |\ /     \ 
+ *	    | B       \
+ *	    |  \       \
+ *	     \  C       F
+ *	      \  \     / 
+ *	       \  D   /   
+ *		\ |  /
+ *		 \| /
+ *		  E
+ *
+ *	 list			A B C D E F G H
+ *	 G1 H2			- - - - - - 1 2
+ *	 H2 E1 B1		- 1 - - 1 - 1 2
+ *	 F2 E1 B1 A2		2 1 - - 1 2 1 2
+ *	 E3 B1 A2		2 1 - - 3 2 1 2
+ *	 B1 A2			2 1 - - 3 2 1 2
+ *	 C1 A2			2 1 1 - 3 2 1 2
+ *	 D1 A2			2 1 1 1 3 2 1 2
+ *	 A2			2 1 1 1 3 2 1 2
+ *	 B3			2 3 1 1 3 2 1 2
+ *	 C7			2 3 7 1 3 2 1 2
+ *
+ * At this point, unfortunately, everybody in the list is
+ * uninteresting, so we fail to complete the following two
+ * steps to fully marking uninteresting commits.
+ *
+ *	 D7			2 3 7 7 3 2 1 2
+ *	 E7			2 3 7 7 7 2 1 2
+ *
+ * and we end up showing E as an interesting merge base.
  */
 
 static int show_all = 0;
@@ -88,6 +127,7 @@ static int merge_base(struct commit *rev
 {
 	struct commit_list *list = NULL;
 	struct commit_list *result = NULL;
+	struct commit_list *tmp = NULL;
 
 	if (rev1 == rev2) {
 		printf("%s\n", sha1_to_hex(rev1->object.sha1));
@@ -104,9 +144,10 @@ static int merge_base(struct commit *rev
 
 	while (interesting(list)) {
 		struct commit *commit = list->item;
-		struct commit_list *tmp = list, *parents;
+		struct commit_list *parents;
 		int flags = commit->object.flags & 7;
 
+		tmp = list;
 		list = list->next;
 		free(tmp);
 		if (flags == 3) {
@@ -130,6 +171,41 @@ static int merge_base(struct commit *rev
 	if (!result)
 		return 1;
 
+	/*
+	 * Postprocess to fully contaminate the well.
+	 */
+	for (tmp = result; tmp; tmp = tmp->next) {
+		struct commit *c = tmp->item;
+		/* Reinject uninteresting ones to list,
+		 * so we can scan their parents.
+		 */
+		if (c->object.flags & UNINTERESTING)
+			commit_list_insert(c, &list);
+	}
+	while (list) {
+		struct commit *c = list->item;
+		struct commit_list *parents;
+
+		tmp = list;
+		list = list->next;
+		free(tmp);
+
+		/* Anything taken out of the list is uninteresting, so
+		 * mark all its parents uninteresting.  We do not
+		 * parse new ones (we already parsed all the relevant
+		 * ones).
+		 */
+		parents = c->parents;
+		while (parents) {
+			struct commit *p = parents->item;
+			parents = parents->next;
+			if (!(p->object.flags & UNINTERESTING)) {
+				p->object.flags |= UNINTERESTING;
+				commit_list_insert(p, &list);
+			}
+		}
+	}
+
 	while (result) {
 		struct commit *commit = result->item;
 		result = result->next;
---
0.99.9.GIT

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: merge-base: fully contaminate the well.
  2005-11-11  2:58                                   ` merge-base: fully contaminate the well Junio C Hamano
@ 2005-11-11  5:36                                     ` Linus Torvalds
  2005-11-11  6:04                                       ` Junio C Hamano
  2005-11-11  8:28                                       ` Junio C Hamano
  0 siblings, 2 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-11-11  5:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, 10 Nov 2005, Junio C Hamano wrote:
>
> The discussion on the list demonstrated a pathological case where
> an ancestor of a merge-base can be left interesting.  This commit
> introduces a postprocessing phase to fix it.

Hmm. I'd suggest only doing this for the (relatively unlikely) case of 
there being more than one merge-base..

You don't even have to be exact about it: just see if the result list is 
bigger than one. (ie "result->next != NULL"). Yeah, sometimes there are 
result commits that get turned UNINTERESTING after they are added to the 
result list, and your "contaminate" phase would be strictly not needed
then, but that's already a pretty unusual case.

So the cheap test is to just say

	/* Do we have multiple results? */
	if (result->next)
		contaminate_well(result);

no?

Btw, I don't think your contamination logic is necessarily complete. We 
may not even have parsed some of the commits that end up being on that 
strange corner case. I think you catch the particular case you tried, but 
I think that in theory, with long chains of commits out of date order, you 
could be in the situation of having determined that everything was 
uninteresting before you even parsed enough to see the chain from one 
merge-base to another.

In fact, I think it would happen with your pathological example if it had 
just one more commit out-of-order in the E-D-C-B chain. But I didn't walk 
it through.

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: merge-base: fully contaminate the well.
  2005-11-11  5:36                                     ` Linus Torvalds
@ 2005-11-11  6:04                                       ` Junio C Hamano
  2005-11-11 16:18                                         ` Linus Torvalds
  2005-11-11  8:28                                       ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-11  6:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> So the cheap test is to just say
>
> 	/* Do we have multiple results? */
> 	if (result->next)
> 		contaminate_well(result);
>
> no?

Correct.  And this is only for really artificial corner case so
we should try to avoid for normal cases as much as possible,
cheaply.

> Btw, I don't think your contamination logic is necessarily complete. We 
> may not even have parsed some of the commits that end up being on that 
> strange corner case.

I haven't tried walking any other test cases, but wouldn't that
be arguing that the our assumption that the current merge-base
is at least complete if not optimum?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: merge-base: fully contaminate the well.
  2005-11-11  6:04                                       ` Junio C Hamano
@ 2005-11-11 16:18                                         ` Linus Torvalds
  0 siblings, 0 replies; 58+ messages in thread
From: Linus Torvalds @ 2005-11-11 16:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, 10 Nov 2005, Junio C Hamano wrote:
> 
> > Btw, I don't think your contamination logic is necessarily complete. We 
> > may not even have parsed some of the commits that end up being on that 
> > strange corner case.
> 
> I haven't tried walking any other test cases, but wouldn't that
> be arguing that the our assumption that the current merge-base
> is at least complete if not optimum?

Oh yes, it's always complete, even though it may not be optimal. And it's 
going to be optimal in all realistic cases.

And even in the unrealistic cases if we end up returning a commit that is 
actually reachable by another one (through at least two levels of other 
commits that were in the wrong date-order with the _other_ ways of 
reaching that commit), the recursive strategy of merging the merge-bases 
will always end up boiling it doing to the optimal thing.

So I think this is absolutely 100% correct.

			Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: merge-base: fully contaminate the well.
  2005-11-11  5:36                                     ` Linus Torvalds
  2005-11-11  6:04                                       ` Junio C Hamano
@ 2005-11-11  8:28                                       ` Junio C Hamano
  1 sibling, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-11  8:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> Btw, I don't think your contamination logic is necessarily complete. We 
> may not even have parsed some of the commits that end up being on that 
> strange corner case....

You are right.  And the situation seems really bad.

The full-contaminator is not full at all, and fails miserably in
not so pathlogical case.  If we have something like this:

	1	2	List		A B C D E F G
	F	E	F1 E2		- - - - 2 1 -
        |\     /|	G1 E2 D1 C1	- - 1 1 2 1 1
        \ \   / |	E2 D1 C1	- - 1 1 2 1 1
        |\  D  /|	G3 D3 C3	- - 3 3 2 1 3
        | \ | / |	D3 C3		- - 3 3 2 1 3
        |   C   |	C7		- - 7 3 2 1 3
        |   |   |
        |   B   |
        |   |   /
         \  A  /
          \ | /
            G

we would end up finding D and G and stop there, without ever
seeing A or B.  B _might_ be touched when we look at C at the
last round, but there is no way for us to find G is reachable
from D (or C) without parsing more than what we parsed in the
main loop.

The worst part of this is that you can indefinitely extend C-B-A
chain trivially, and all it takes is the one, initial commit G,
that has a screwed-up timestamp.  All the other commits in this
example are in the right time order.  Very sad.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08 21:02         ` Fredrik Kuivinen
  2005-11-08 21:47           ` Junio C Hamano
  2005-11-08 21:52           ` Linus Torvalds
@ 2005-11-08 23:04           ` Johannes Schindelin
  2 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2005-11-08 23:04 UTC (permalink / raw)
  To: Fredrik Kuivinen; +Cc: Linus Torvalds, Junio C Hamano, git

Hi,

On Tue, 8 Nov 2005, Fredrik Kuivinen wrote:

> On Tue, Nov 08, 2005 at 12:58:50PM +0100, Johannes Schindelin wrote:
> > 
> > We already have a fallback list: after really-trivial, try automatic, ...,
> > try resolve. Why not just add recursive? So, if even resolve failed, just 
> > try once more, with recursive.
> > 
> 
> I don't think this is a very good idea for two reasons. The first one
> is that there are some merge scenarios involving renames which should
> be conflicts but are cleanly merged by git-resolve.
> 
> The second reason is that with the fall back list the recursive
> strategy will only be used in the strange corner cases and will thus
> not get nearly the same amount of testing it would get if it was the
> first choice (or directly after the really-trivial merge).

Two very valid points. You convinced me.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [RFC/PATCH] Make git-recursive the default strategy for git-pull.
  2005-11-08  0:33     ` Linus Torvalds
  2005-11-08  0:59       ` Junio C Hamano
  2005-11-08 11:58       ` Johannes Schindelin
@ 2005-11-08 16:21       ` Junio C Hamano
  2005-11-11 22:25       ` Comments on recursive merge Junio C Hamano
  3 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-08 16:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

This does two things:

 - It changes the hardcoded default merge strategy for two-head
   git-pull from resolve to recursive.

 - .git/config file acquires two configuration items.
   pull.twohead names the strategy for two-head case, and
   pull.octopus names the strategy for octopus merge.

IOW you are paranoid, you can have the following lines in your
.git/config file and keep using git-merge-resolve when pulling
one remote:

	[pull]
		twohead = resolve

OTOH, you can say this:

	[pull]
		twohead = resolve
		twohead = recursive

to try quicker resolve first, and when it fails, fall back to
recursive.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

  Linus Torvalds <torvalds@osdl.org> writes:

  > Hmm. True. The _really_ trivial in-index case triggers for me pretty 
  > often, but I haven't done any statistics. It might be only 50% of the 
  > time.
  >...
  > It's certainly an option to just do what I just did, namely use the 
  > default one until it breaks, and then just do "git reset --hard" and re-do 
  > the pull with "-s recursive". A bit sad, and it would be good to have 
  > coverage on the recursive strategy..

  Hopefully something like this would make people aware of
  recursive and give it a wider coverage and chance to mature.

 git-pull.sh |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

applies-to: 75922cf23cc070e2d5220d961a8f645f1bc8bb60
3acc20beaf0df9ce11a1b7aabf8c9dc7507a9b44
diff --git a/git-pull.sh b/git-pull.sh
index 2358af6..3b875ad 100755
--- a/git-pull.sh
+++ b/git-pull.sh
@@ -79,10 +79,22 @@ case "$merge_head" in
 	exit 0
 	;;
 ?*' '?*)
-	strategy_default_args='-s octopus'
+	var=`git-var -l | sed -ne 's/^pull\.octopus=/-s /p'`
+	if test '' = "$var"
+	then
+		strategy_default_args='-s octopus'
+	else
+		strategy_default_args=$var
+	fi
 	;;
 *)
-	strategy_default_args='-s resolve'
+	var=`git-var -l | sed -ne 's/^pull\.twohead=/-s /p'`
+	if test '' = "$var"
+	then
+		strategy_default_args='-s recursive'
+	else
+		strategy_default_args=$var
+	fi
 	;;
 esac
 
---
0.99.9.GIT

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-08  0:33     ` Linus Torvalds
                         ` (2 preceding siblings ...)
  2005-11-08 16:21       ` [RFC/PATCH] Make git-recursive the default strategy for git-pull Junio C Hamano
@ 2005-11-11 22:25       ` Junio C Hamano
  2005-11-11 22:53         ` Linus Torvalds
  2005-11-12  6:35         ` Ryan Anderson
  3 siblings, 2 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-11 22:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

>> Another thing to consider is if it is fast enough for everyday
>> trivial merges.
>
> Hmm. True. The _really_ trivial in-index case triggers for me pretty 
> often, but I haven't done any statistics. It might be only 50% of the 
> time.

Just for fun, I randomly picked two heads/master commits from
linux-2.6 repository (one was when I happened to have pulled the
last time, and the other was when I thought this might be an
interesting exercise and pulled again), and fed the commits
between the two to a little script that looks at commits and
tries to stat what they did (the script ignores renames so they
appear as deletes and adds).

Here is what the script spitted out:

        Total commit objects: 3957
        Trivial Merges: 72 (1.82%)
        Merges: 225 (5.69%)
        Number of paths touched by non-merge commits:
                average 4.50, median 2, min 2, max 199
        Number of merge parents:
                average 2.00, median 2, min 2, max 2
        Number of merge bases:
                average 1.00, median 1, min 1, max 1
        File level merges:
                average 37.61, median 8, min 0, max 555
        Number of changed paths from the first parent:
                average 379.09, median 66, min 1, max 7553
        File level 3-ways:
                average 1.96, median 1, min 0, max 37
        Paths deleted:
                average 47.56, median 15, min 0, max 554

This counts what happened in individual devleoper's trees,
subsystem maintainer trees and your tree, not just what you saw
yourself.

Some observations.

 - Trivial Merges count is surprisingly high.  About 1/3 of
   merges are pure in-index merges.

 - Most of the commits (developer commits, not merges) are
   small and touches only a couple of paths.

 - Nobody does octopus ;-).

 - We did not have multi-base merge case during the period
   looked at (but the sample count is very low).

 - merge-one-file was called for only a handful (median 8)
   files, which is negligibly small compared to the total 17K
   files in the kernel tree, and fairly small compared to the
   number of changed paths from the first parent (meaning,
   read-tree trivial collapsing helped majorly).  Among them,
   the number of paths that needed real file-level 3-way merges
   were even smaller (avg 1.96).

   All three of these points together is a fine demonstration
   that you designed git really right.

The samples were between these two commits:

commit 6693e74a16ef563960764bd963f1048392135c3c
Author: Linus Torvalds <torvalds@g5.osdl.org>
Date:   Tue Oct 25 20:40:09 2005 -0700

commit 388f7ef720a982f49925e7b4e96f216f208f8c03
Author: Linus Torvalds <torvalds@g5.osdl.org>
Date:   Fri Nov 11 09:26:39 2005 -0800

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-11 22:25       ` Comments on recursive merge Junio C Hamano
@ 2005-11-11 22:53         ` Linus Torvalds
  2005-11-12  0:42           ` Junio C Hamano
  2005-11-12  6:35         ` Ryan Anderson
  1 sibling, 1 reply; 58+ messages in thread
From: Linus Torvalds @ 2005-11-11 22:53 UTC (permalink / raw)
  To: Junio C Hamano, Paul Mackerras; +Cc: Git Mailing List

On Fri, 11 Nov 2005, Junio C Hamano wrote:
> 
> Some observations.
> 
>  - Trivial Merges count is surprisingly high.  About 1/3 of
>    merges are pure in-index merges.

I actually don't think that is surprisingly high, and would actually have 
expected it to be closer to 50%.

On the other hand, the merges that end up being pure fast-forwards aren't 
counted as merges at all (since they don't show up as commits), so maybe 
that's what skews my preception of a big percentage of merges as being 
really trivial.

>  - Most of the commits (developer commits, not merges) are
>    small and touches only a couple of paths.

This is something where I think the kernel is perhaps unusual, especially 
for a big project. We really do encourage people to make lots of small and 
well-defined changes, and the whole flow of development has been geared 
towards it. 

>  - Nobody does octopus ;-).

I do think octopus is really cool, and still think seeing that five-way 
octopus-merge in gitk in the git history was really really cool.

It doesn't look as good any more, btw: do "gitk" on the current git tree, 
and search for "Octopus merge", and you'll see some of the history lines 
crossing each other. Paul?

But yeah, it's a pretty special thing. I think its coolness factor way 
outweighs its usefullness factor ;^p

>  - We did not have multi-base merge case during the period
>    looked at (but the sample count is very low).

Again, this is possibly because the kernel has already had a few years of 
distributed SCM usage under its belt, and we've tried to not only merge 
in a timely manner, but also try to keep history reasonably clean and not 
have a lot of cross-merging back and forth. That cuts down on multi-base 
possibilities.

>  - merge-one-file was called for only a handful (median 8)
>    files, which is negligibly small compared to the total 17K
>    files in the kernel tree, and fairly small compared to the
>    number of changed paths from the first parent (meaning,
>    read-tree trivial collapsing helped majorly).  Among them,
>    the number of paths that needed real file-level 3-way merges
>    were even smaller (avg 1.96).

I definitely think this is true for any big project.

Small projects will inevitably have changes that modify large portions of 
the source base. But with small projects, it doesn't really matter _what_ 
you do, you can do it fast.

Big projects (at least the sane kind) will never have lots of changes that 
modify a very big percentage of the source-tree. It's just too painful 
(and I'm not talking from a SCM angle, just from a developer angle).

>    All three of these points together is a fine demonstration
>    that you designed git really right.

Well, it's self-re-inforcing. It was designed for the kernel usage 
patterns, so using the kernel to confirm that it's the "right design" is a 
bit self-serving. Sure, it's a good sign that my mental model of what the 
usage patters are does actually match reality, but at the same time it 
might be more interesting to see if other projects that use git end up 
using it the same way and/or have different statistics.

I do expect that the size of the project will impact the statistics a lot. 

		Linus

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-11 22:53         ` Linus Torvalds
@ 2005-11-12  0:42           ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-12  0:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> I actually don't think that is surprisingly high, and would actually have 
> expected it to be closer to 50%.
>
> On the other hand, the merges that end up being pure fast-forwards aren't 
> counted as merges at all (since they don't show up as commits), so maybe 
> that's what skews my preception of a big percentage of merges as being 
> really trivial.

That is what I meant by "it does not show just what you saw".
It shows the whole community experience by everybody who has
commit there.  Being _the_ integration point, I imagine that
most of the pure fast-forwards you saw were real merges for
somebody else, and that merge being fast/safe/convenient
counts, not for you but for your subsystem people.

>>    All three of these points together is a fine demonstration
>>    that you designed git really right.
>
> Well, it's self-re-inforcing. It was designed for the kernel
> usage patterns, so using the kernel to confirm that it's the
> "right design" is a bit self-serving.

That is true.  My point was that it's not like git was done
right only for _you_, sacrificing subsystem people.  The sample
4k commits show that the assumption of the usage pattern git is
optimized for actually holds (commits being small, merges being
mostly trivial) for kernel people other than you.

It is yet to be seen if the same assumption holds for other
projects.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: Comments on recursive merge..
  2005-11-11 22:25       ` Comments on recursive merge Junio C Hamano
  2005-11-11 22:53         ` Linus Torvalds
@ 2005-11-12  6:35         ` Ryan Anderson
  2005-11-12  7:44           ` [PATCH] GIT commit statistics Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Ryan Anderson @ 2005-11-12  6:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

[-- Attachment #1: Type: text/plain, Size: 875 bytes --]

Junio C Hamano wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
> 
> 
>>>Another thing to consider is if it is fast enough for everyday
>>>trivial merges.
>>
>>Hmm. True. The _really_ trivial in-index case triggers for me pretty 
>>often, but I haven't done any statistics. It might be only 50% of the 
>>time.
> 
> 
> Just for fun, I randomly picked two heads/master commits from
> linux-2.6 repository (one was when I happened to have pulled the
> last time, and the other was when I thought this might be an
> interesting exercise and pulled again), and fed the commits
> between the two to a little script that looks at commits and
> tries to stat what they did (the script ignores renames so they
> appear as deletes and adds).

Mind sharing the script?

It'be nice to know if these stats are typical, or unusual when you get
numbers from a variety of other trees.



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 256 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH] GIT commit statistics.
  2005-11-12  6:35         ` Ryan Anderson
@ 2005-11-12  7:44           ` Junio C Hamano
  2005-11-12 12:19             ` Martin Langhoff
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-12  7:44 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: Linus Torvalds, git

Ryan Anderson <ryan@michonline.com> writes:

> Junio C Hamano wrote:
>
>> Just for fun, I randomly picked two heads/master commits from
>> linux-2.6 repository ... and fed the commits
>> between the two to a little script that looks at commits and
>> tries to stat what they did (the script ignores renames so they
>> appear as deletes and adds).
>
> Mind sharing the script?
>
> It'be nice to know if these stats are typical, or unusual when you get
> numbers from a variety of other trees.

Very unpolished but here they are.

I misread the trivial count in my original message.  Trivial and
Merge are counted separately, so among 3957 commits, merges were
297 (72 trivials and 225 others).

-- >8 -- cut here -- >8 --
Subject: [PATCH] GIT commit statistics

A set of scripts that read the existing commit history, and
show various stats.

Sample usage:

    # Arguments are given to git-rev-list; defaults to ORIG..
    # if not given, to retrace what was just pulled.
    $ ./contrib/jc-git-stat-1.sh v0.99.9g..maint |
      ./contrib/jc-git-stat-1-log.perl
    Total commit objects: 43
    Trivial Merges: 1 (2.33%)
    Merges: 1 (2.33%)
    Number of paths touched by non-merge commits:
	    average 3.00, median 2, min 2, max 18
    Number of merge parents:
	    average 2.50, median 3, min 2, max 3
    Number of merge bases:
	    average 1.00, median 1, min 1, max 1
    File level merges:
	    average 0.50, median 1, min 0, max 1
    Number of changed paths from the first parent:
	    average 28.00, median 52, min 4, max 52
    File level 3-ways:
	    average 1.00, median 1, min 1, max 1

 * "Trivial Merges" are the ones done by read-tree --trivial;
 * "Merges" are other merges;
 * "File level merges" are paths not collapsed by read-tree 3-way (i.e.
   given to merge-one-file);
 * "File level 3-ways" are paths merge-one-file would have run 'merge';

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 contrib/jc-git-stat-1-log.perl |   87 ++++++++++++++++++++++++++++++++++++++++
 contrib/jc-git-stat-1-mof.sh   |   59 +++++++++++++++++++++++++++
 contrib/jc-git-stat-1.sh       |   74 ++++++++++++++++++++++++++++++++++
 3 files changed, 220 insertions(+), 0 deletions(-)
 create mode 100755 contrib/jc-git-stat-1-log.perl
 create mode 100755 contrib/jc-git-stat-1-mof.sh
 create mode 100755 contrib/jc-git-stat-1.sh

applies-to: 9a0f0c748316751fbf593a21f2b16bcdd975095a
2cb3da4b260ed82dc379a11d91f55fe774a2ea49
diff --git a/contrib/jc-git-stat-1-log.perl b/contrib/jc-git-stat-1-log.perl
new file mode 100755
index 0000000..b70af2b
--- /dev/null
+++ b/contrib/jc-git-stat-1-log.perl
@@ -0,0 +1,87 @@
+#!/usr/bin/perl
+
+my ($patches, $failures, $merges, $trivials) = (0, 0, 0, 0);
+my (@patch_paths,
+    @parent_counts,
+    @base_counts,
+    @merge_counts,
+    @path_counts,
+    @res_counts,
+    @merge_m, 
+    @merge_a,
+    @merge_d, 
+    @merge_c, 
+    @merge_u);
+
+sub avg_median {
+    my ($ary) = shift;
+    my ($msg) = shift;
+    my @a = sort { $a <=> $b } @$ary;
+    my $sum = 0;
+    for (@a) { $sum += $_ }
+    return unless (@a && $sum);
+    my ($avg, $med) = ($sum/@a, $a[(@a/2)]);
+    my ($min, $max) = ($a[0], $a[$#a]);
+    printf "%s:\n\taverage %.2f, median %d, min %d, max %d\n",
+    	$msg, $avg, $med, $min, $max;
+}
+
+while (<>) {
+    next unless (s/^([MCFT]) [0-9a-f]{40} //);
+    chomp;
+    my $type = $1;
+    if ($type eq 'F') {
+	$failures++;
+	next;
+    }
+    if ($type eq 'C') {
+	$patches++;
+	push @patch_paths, $_;
+	next;
+    }
+    if ($type eq 'M') {
+	$merges++;
+    }
+    elsif ($type eq 'T') {
+	$trivials++;
+    }
+    else {
+	die "?? $type";
+    }
+    s/^(\d+) (\d+) (\d+) (\d+) (\d+) *//;
+    push @parent_counts, $1;
+    push @base_counts, $2;
+    push @merge_counts, $3;
+    push @path_counts, $4;
+    push @res_counts, $5;
+    if ($type eq 'M') {
+	/M=(\d+) A=(\d+) D=(\d+) C=(\d+) U=(\d+)/ or die;
+	push @merge_m, $1;
+	push @merge_a, $2;
+	push @merge_d, $3;
+	push @merge_c, $4;
+	push @merge_u, $5;
+    }
+}
+
+my $total = ($failures+$patches+$merges+$trivials);
+print "Total commit objects: $total\n";
+printf "Trivial Merges: $trivials (%.2f%%)\n", ($trivials * 100.0/$total);
+printf "Merges: $merges (%.2f%%)\n", ($merges * 100.0/$total);
+if ($failures) {
+    print "Failures: $failures\n";
+}
+
+avg_median(\@patch_paths, "Number of paths touched by non-merge commits");
+avg_median(\@parent_counts, "Number of merge parents");
+avg_median(\@base_counts, "Number of merge bases");
+avg_median(\@merge_counts, "File level merges");
+avg_median(\@path_counts, "Number of changed paths from the first parent");
+#avg_median(\@res_counts, "");
+avg_median(\@merge_m, "File level 3-ways");
+avg_median(\@merge_a, "Paths added");
+avg_median(\@merge_d, "Paths deleted");
+avg_median(\@merge_c, "Paths identically added with wrong permission");
+avg_median(\@merge_u, "Paths added differently");
+
+
diff --git a/contrib/jc-git-stat-1-mof.sh b/contrib/jc-git-stat-1-mof.sh
new file mode 100755
index 0000000..2be6d8b
--- /dev/null
+++ b/contrib/jc-git-stat-1-mof.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+#
+# Copyright (c) Linus Torvalds, 2005
+# Copyright (c) Junio C Hamano, 2005
+#
+# This is modified from the git per-file merge script, called with
+#
+#   $1 - original file SHA1 (or empty)
+#   $2 - file in branch1 SHA1 (or empty)
+#   $3 - file in branch2 SHA1 (or empty)
+#   $4 - pathname in repository
+#   $5 - orignal file mode (or empty)
+#   $6 - file in branch1 mode (or empty)
+#   $7 - file in branch2 mode (or empty)
+#
+# Handle some trivial cases.. The _really_ trivial cases have
+# been handled already by git-read-tree, but that one doesn't
+# do any merges that might change the tree layout.
+
+case "${1:-.}${2:-.}${3:-.}" in
+#
+# Deleted in both or deleted in one and unchanged in the other
+#
+"$1.." | "$1.$1" | "$1$1.")
+	echo D
+	;;
+
+#
+# Added in one.
+#
+".$2." | "..$3" )
+	echo A
+	;;
+
+#
+# Added in both (check for same permissions).
+#
+".$3$2")
+	if [ "$6" != "$7" ]; then
+		echo C
+	else
+		echo A
+	fi
+	;;
+
+#
+# Modified in both, but differently.
+#
+"$1$2$3")
+	echo M
+	;;
+
+".$2$3")
+	echo U
+	;;
+*)
+	echo C
+	;;
+esac
diff --git a/contrib/jc-git-stat-1.sh b/contrib/jc-git-stat-1.sh
new file mode 100755
index 0000000..b03c69d
--- /dev/null
+++ b/contrib/jc-git-stat-1.sh
@@ -0,0 +1,74 @@
+#!/bin/sh
+
+MOF=`dirname "$0"`/jc-git-stat-1-mof.sh
+
+GIT_INDEX_FILE=.tmp-index
+export GIT_INDEX_FILE
+LF='
+'
+
+check_merge () {
+	rm -f $GIT_INDEX_FILE
+	commit=$1
+	shift
+
+	case "$#" in
+	2)
+		MB=$(git-merge-base --all "$@")
+		;;
+	*)
+		MB=$(git-show-branch --merge-base "$@")
+		;;
+	esac
+	basecnt=$(echo $MB | wc -l)
+
+	if git-read-tree --trivial -m $MB "$@" 2>/dev/null
+	then
+		type=T
+		pathcnt=$(git-diff-index --cached --name-status "$1" | wc -l)
+		rescnt=$(git-diff-index --cached --name-status "$commit" | wc -l)
+		echo "T $commit $# $basecnt 0 $pathcnt $rescnt"
+	elif git-read-tree -m $MB "$@" 2>/dev/null
+	then
+	        script='s/^ *\([0-9]*\) *\([A-Z]\)/\2=\1/'
+		type=M
+		mergecnt=$(git-ls-files --unmerged | sort -k 4,4 -u | wc -l)
+		pathcnt=$(git-diff-index --cached --name-status "$1" | wc -l)
+
+		C=0 A=0 M=0 U=0 D=0
+		eval `git-merge-index -o "$MOF" -a |
+			sort |
+			uniq -c |
+			sed -e "$script"`
+		rescnt=$(git-diff-index --cached --name-status "$commit" | wc -l)
+		echo "M $commit $# $basecnt $mergecnt $pathcnt $rescnt M=$M A=$A D=$D C=$C U=$U"
+	else
+		echo "F $commit $# $basecnt"
+	fi
+}
+
+check_patch () {
+	pathcnt=$(git-diff-tree --name-status -r "$1" | wc -l)
+	echo "C $1 $pathcnt"
+}
+
+case "$#" in
+0)
+	set ORIG_HEAD.. ;;
+esac
+
+git-rev-list --parents "$@" |
+while read commit parents
+do
+	case "$parents" in
+	?*' '?*)
+		# Merge
+		check_merge $commit $parents
+		;;
+	*)
+		# Change
+		check_patch $commit $parents
+		;;
+	esac
+done
+
---
0.99.9.GIT

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-12  7:44           ` [PATCH] GIT commit statistics Junio C Hamano
@ 2005-11-12 12:19             ` Martin Langhoff
  2005-11-12 12:53               ` Petr Baudis
                                 ` (3 more replies)
  0 siblings, 4 replies; 58+ messages in thread
From: Martin Langhoff @ 2005-11-12 12:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ryan Anderson, Linus Torvalds, git

On 11/12/05, Junio C Hamano <junkio@cox.net> wrote:
> Ryan Anderson <ryan@michonline.com> writes:
>
> > Junio C Hamano wrote:
> >
> >> Just for fun, I randomly picked two heads/master commits from
> >> linux-2.6 repository ... and fed the commits
> >> between the two to a little script that looks at commits and
> >> tries to stat what they did (the script ignores renames so they
> >> appear as deletes and adds).

Related to this, I've been wondering whether it'd be possible to teach
git to rebase local patches, even if that means rewriting local
history. When you are dealing with team shared repo, the sequences of
pull/push end up being quite messy, full of little meaningless merges.
Similarly, when dealing with an upstream, my tree gets slowly out of
sync and slightly messy. Eventually I get a new checkout, and rebase
any pending patches with git-format-patch and git-am.

The same process would be much easier if I could just cg-update from
the repo and get it to try and actually rebase my local commits --
rewriting history as if I had committed them after the update. Of
course, it'd be cheating... but we cheat all the time anyway, we only
sweat harder at it ;-)

cheers,

martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-12 12:19             ` Martin Langhoff
@ 2005-11-12 12:53               ` Petr Baudis
  2005-11-15 10:04                 ` Catalin Marinas
  2005-11-12 19:04               ` Johannes Schindelin
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 58+ messages in thread
From: Petr Baudis @ 2005-11-12 12:53 UTC (permalink / raw)
  To: Martin Langhoff
  Cc: Junio C Hamano, Ryan Anderson, Linus Torvalds, git,
	catalin.marinas

Dear diary, on Sat, Nov 12, 2005 at 01:19:45PM CET, I got a letter
where Martin Langhoff <martin.langhoff@gmail.com> said that...
> The same process would be much easier if I could just cg-update from
> the repo and get it to try and actually rebase my local commits --
> rewriting history as if I had committed them after the update. Of
> course, it'd be cheating... but we cheat all the time anyway, we only
> sweat harder at it ;-)

I'm a bit reluctant about this functionality available in cg-update, but
then people will start to want commit stack and stuff, while they should
be just already long using StGIT for tracking their patches.

Actually, I wanted to also implement e-mail functionality to cg-mkpatch,
but I'm not sure now - perhaps people wanting that should really just
use StGIT. Cogito or GIT core is not very suitable for keeping your
patches against someone else's tree if he is not going to GIT-merge with
you, exactly because it's not really very convenient to update your
patches.

On the same note, I would like StGIT to drop functionality not really
belonging to patch stack manager (stg add, stg rm, stg status, ...) so
that its commandset gets smaller and more focused - but before I would
suggest dropping stg status, cg-status must be able to do conflicts
tracking, so I will dedicate another mail to this sometime in the
future, with a more detailed proposal.

So, is there any reason why you want this in GIT/Cogito and don't want
to use StGIT?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-12 12:53               ` Petr Baudis
@ 2005-11-15 10:04                 ` Catalin Marinas
  2005-11-15 15:29                   ` Chuck Lever
  0 siblings, 1 reply; 58+ messages in thread
From: Catalin Marinas @ 2005-11-15 10:04 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Martin Langhoff, Junio C Hamano, Ryan Anderson, Linus Torvalds,
	git

On 12/11/05, Petr Baudis <pasky@suse.cz> wrote:
> On the same note, I would like StGIT to drop functionality not really
> belonging to patch stack manager (stg add, stg rm, stg status, ...) so
> that its commandset gets smaller and more focused

This was the case with the first StGIT implementations but I slowly
began to want to only use StGIT and not switch to something else for
trivial SCM operations. I eventually added 'stg commit' which stores
the patches permanently into the base of the stack to enable some kind
of maintainer mode for StGIT. My main use for this was to import
patches directly into the main branch and not keep a separate one and
pull between them.

> - but before I would
> suggest dropping stg status, cg-status must be able to do conflicts
> tracking, so I will dedicate another mail to this sometime in the
> future, with a more detailed proposal.

The gitmergeonefile.py script in StGIT adds every conflict to the
.git/conflicts file which is read by 'stg status'. My goal is not to
leave any unmerged entried in the index even if there are conflicts.
Maybe this could be changed and .git/conflicts file avoided entirely.

Anyway, while I'll try not to add more SCM functionality to StGIT, I
don't think I should remove the existing add/rm/status functionality.
It's just handy not to use a different command when you want a new
file added to a patch.

--
Catalin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-15 10:04                 ` Catalin Marinas
@ 2005-11-15 15:29                   ` Chuck Lever
  0 siblings, 0 replies; 58+ messages in thread
From: Chuck Lever @ 2005-11-15 15:29 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Catalin Marinas, Martin Langhoff, Junio C Hamano, Ryan Anderson,
	Linus Torvalds, git

[-- Attachment #1: Type: text/plain, Size: 1324 bytes --]

Catalin Marinas wrote:
> On 12/11/05, Petr Baudis <pasky@suse.cz> wrote:
> 
>>On the same note, I would like StGIT to drop functionality not really
>>belonging to patch stack manager (stg add, stg rm, stg status, ...) so
>>that its commandset gets smaller and more focused
> 
> 
> This was the case with the first StGIT implementations but I slowly
> began to want to only use StGIT and not switch to something else for
> trivial SCM operations. I eventually added 'stg commit' which stores
> the patches permanently into the base of the stack to enable some kind
> of maintainer mode for StGIT. My main use for this was to import
> patches directly into the main branch and not keep a separate one and
> pull between them.

  ...

> Anyway, while I'll try not to add more SCM functionality to StGIT, I
> don't think I should remove the existing add/rm/status functionality.
> It's just handy not to use a different command when you want a new
> file added to a patch.

petr,

currently it isn't recommended to use StGIT with other porcelains.

so either: make it completely safe to use StGIT with other porcelains, 
or add a minimal amount of SCM-like functionality so users don't miss it 
and try to use other porcelains and trash their repositories.

overall i agree with catalin-- it's just easier to use a single tool.

[-- Attachment #2: cel.vcf --]
[-- Type: text/x-vcard, Size: 439 bytes --]

begin:vcard
fn:Chuck Lever
n:Lever;Charles
org:Network Appliance, Incorporated;Linux NFS Client Development
adr:535 West William Street, Suite 3100;;Center for Information Technology Integration;Ann Arbor;MI;48103-4943;USA
email;internet:cel@citi.umich.edu
title:Member of Technical Staff
tel;work:+1 734 763-4415
tel;fax:+1 734 763 4434
tel;home:+1 734 668-1089
x-mozilla-html:FALSE
url:http://www.monkey.org/~cel/
version:2.1
end:vcard


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-12 12:19             ` Martin Langhoff
  2005-11-12 12:53               ` Petr Baudis
@ 2005-11-12 19:04               ` Johannes Schindelin
  2005-11-13 10:59               ` Junio C Hamano
  2005-11-13 11:11               ` Petr Baudis
  3 siblings, 0 replies; 58+ messages in thread
From: Johannes Schindelin @ 2005-11-12 19:04 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git

Hi,

On Sun, 13 Nov 2005, Martin Langhoff wrote:

> [...] I've been wondering whether it'd be possible to teach git to 
> rebase local patches, even if that means rewriting local history. When 
> you are dealing with team shared repo, the sequences of pull/push end up 
> being quite messy, full of little meaningless merges. Similarly, when 
> dealing with an upstream, my tree gets slowly out of sync and slightly 
> messy. Eventually I get a new checkout, and rebase any pending patches 
> with git-format-patch and git-am.

I thought about this problem for a while. You described a typical use case 
(in a small team with one central repo), where pushes are done only from a 
cleaned up branch, but the work branches stay.

If the tree corresponding to the central HEAD matches the tree 
corresponding of your private merge branch at a given stage, a simple 
graft should be your solution.

Example: You pull and push from/to origin on a central server. Your 
(dirty) work branch is master. Now, at a given time, 
origin^{tree}==master~15^{tree}. You could then generate a graft which 
tells git that all parents of origin are also parents of master~15.

If at a given stage, origin^{tree}==master^{tree}, that graft would make 
your next merge a fast forward, effectively cleaning up your branch 
without loosing your history.

The graft could be generated by a simple script.

Would this help you?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-12 12:19             ` Martin Langhoff
  2005-11-12 12:53               ` Petr Baudis
  2005-11-12 19:04               ` Johannes Schindelin
@ 2005-11-13 10:59               ` Junio C Hamano
  2005-11-13 20:42                 ` Martin Langhoff
  2005-11-13 11:11               ` Petr Baudis
  3 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-13 10:59 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git

Martin Langhoff <martin.langhoff@gmail.com> writes:

> Similarly, when dealing with an upstream, my tree gets slowly out of
> sync and slightly messy. Eventually I get a new checkout, and rebase
> any pending patches with git-format-patch and git-am.

The key is not to let your tree go "slowly" out of sync, from my
experience.  When Linus was the maintainer, I used to do the
equivalent of the following all the time to keep up with his
tree while keeping my history clean [*1*].

Frequently [*2*], I tried to see if Linus made something new and
interesting.  My "origin" branch was always copy of Linus head.

	$ git fetch origin

The command would say "Fast forward".  So what did he do?

	$ git show-branch master origin
        ! [origin] Separate LDFLAGS and CFLAGS.
         * [master] Rename lost+found to lost-found.
        --
         + [master] Rename lost+found to lost-found.
         + [master^] Fix compilation warnings in pack-redundant.c
         + [master~2] Debian: build-depend on libexpat-dev.
        +  [origin] Separate LDFLAGS and CFLAGS.
        ++ [master~3] Split gitk into seperate RPM package

Ah, the commit master~3 was what he had the last time I pulled
from him, and since then he made a commit while I did three.  I
could do "git pull . origin" at this point, but that would
result in a useless mini-merge.  My tree is not public so I can
freely rebase to clean things up.

	$ git rebase origin
	$ git show-branch
        ! [origin] Separate LDFLAGS and CFLAGS.
         * [master] Rename lost+found to lost-found.
        --
         + [master] Rename lost+found to lost-found.
         + [master^] Fix compilation warnings in pack-redundant.c
         + [master~2] Debian: build-depend on libexpat-dev.
        ++ [origin] Separate LDFLAGS and CFLAGS.

Now I am fast-forward, so I could ask him to pull from me [*3*].

I think each of your developers can do the same, treating the
"project shared repository" as "Linus repository" and pull that
into the "origin" branch, and when the "master" is ready, push
it back into the shared repository (which is equivalent of Linus
pulling everything from me while doing nothing else in his
repository).

For a sizable change that deserves a topic branch with a long
sequence of commits, rebasing is not always the optimum
solution; and you may want to keep the full merge history of
such a branch pushed into the public repository as is.  But for
simpler cases that 'git rebase' can handle easily without
conflicts, the above procedure would help you keeping the
history of your shared repository less cluttered.

[Footnotes]

*1* Back then we did not have multi-head fetch, show-branch nor
rebase, so I did these using a homebrew Porcelain.

*2* Unlike CVS which always mucks with the working tree, 'git
fetch' into a branch that is not current one is an operation and
can be done even when I am in the middle of a heavy hackery.
Being able to peek into what others are up even when your tree
is in a messy state (the fetch is often followed by log and
diff) helps you to avoid doing duplicated work or going in a
wrong direction, which was great.

*3* Even back then almost all changes were fed via e-mail to
the maintainer.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-13 10:59               ` Junio C Hamano
@ 2005-11-13 20:42                 ` Martin Langhoff
  2005-11-14  3:33                   ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Martin Langhoff @ 2005-11-13 20:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 11/13/05, Junio C Hamano <junkio@cox.net> wrote:
> Ah, the commit master~3 was what he had the last time I pulled
> from him, and since then he made a commit while I did three.  I
> could do "git pull . origin" at this point, but that would
> result in a useless mini-merge.  My tree is not public so I can
> freely rebase to clean things up.
>
>         $ git rebase origin
>         $ git show-branch

What happens if there are conflicts during git-rebase? I'm thinking of
adding an '-r' option to cg-update that will rebase instead of
merging, if the rebase is clean.

Is there a cheap way to ask from a shell script whether the merge is
truly trivial? I thought git-diff-tree would help me here, but it
doesn't...


martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-13 20:42                 ` Martin Langhoff
@ 2005-11-14  3:33                   ` Junio C Hamano
  2005-11-14  4:01                     ` Martin Langhoff
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-14  3:33 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git

Martin Langhoff <martin.langhoff@gmail.com> writes:

> On 11/13/05, Junio C Hamano <junkio@cox.net> wrote:
>> ....  I
>> could do "git pull . origin" at this point, but that would
>> result in a useless mini-merge.  My tree is not public so I can
>> freely rebase to clean things up.
>>
>>         $ git rebase origin
>>         $ git show-branch
>
> What happens if there are conflicts during git-rebase?

Well, obviously you could resolve them ;-).  But if you are
rebasing just to reduce trivial mini-merges, it might make more
sense to honestly record the merge if the rebase involves
conflict resolution.  After all, the reason rebase got conflicts
is because the development trail by somebody else that has been
already committed to the shared "master" branch overlapped what
you were doing in your "master" branch, isn't it?

In your message you indicated that you use "format-patch" piped
to "am".  I think that is a better approach than "rebase" these
days; the conflict can be handled easier with that approach, and
if you use "--3way" flag you do not even have to worry about
patches in your branch that is already there in the shared
"master" (your "origin") branch.

So instead of running "git rebase origin" at this point, I may
do something like this [*1*]:

	$ git-reset --hard origin
        $ git-format-patch -k --stdout origin ORIG_HEAD | git am -3 -k

The first step rewinds my "master" (the original is stored in
ORIG_HEAD), and the second step extracts the commits that were
in my master but not in origin in a patch form, an replay them
on top of the "master" (which was rewound to "origin").

"git-am" would stop at the first unapplicable patch if there is
a conflict, leaving the conflicting patch in .dotest/patch.
I have to fix it up before going further.  Here is how.

1. "git am" 3-way fallback would have kicked in, because I have
   all the blobs the patch is supposed to apply to, and my
   working tree and index is in a state just like when I am
   resolving a conflicting merge after a pull.  Clean up the
   conflict in the working tree, build-test and all as usual.

2. Run "git diff HEAD >.dotest/patch" to record what the patch
   should have been if it were to apply cleanly on top of the
   previous state.  If I did a noteworthy adjustment to the
   patch, I might also edit .dotest/final-commit to update the
   commit log message.

3. Then reset the working tree and index before the failed
   application of this patch with "git reset --hard".

After that:

	$ git am -3

would let me restart from that commit that did not replay well.

> Is there a cheap way to ask from a shell script whether the merge is
> truly trivial? I thought git-diff-tree would help me here, but it
> doesn't...

This was recently added by Linus to help git-merge do that:

	git-read-tree --trivial -m -u $O $A $B

The command exits with a non-zero status, without touching index
nor working tree, when the merge is not "truly trivial".
Otherwise it does its thing -- the trivial in-index merge is
done, files in working tree updated and the only thing left for
you to do is to create a commit having parent $A and $B.

Would that help?

[Footnote]

*1* This is what the "make rebase restartable" comment in TODO
list is about, and I wanted to rewrite "rebase" to do exactly
these two commands, but I got distracted ;-).

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-14  3:33                   ` Junio C Hamano
@ 2005-11-14  4:01                     ` Martin Langhoff
  2005-11-14  6:06                       ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Martin Langhoff @ 2005-11-14  4:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 11/14/05, Junio C Hamano <junkio@cox.net> wrote:
> In your message you indicated that you use "format-patch" piped
> to "am".  I think that is a better approach than "rebase" these
> days

Hmmm. But doesn't deal well with binary changes. We deal with a large
set of projects, and while we don't manage that many binary files, it
is just enough that I'll have to pass on only using format-patch.

> This was recently added by Linus to help git-merge do that:
>
>         git-read-tree --trivial -m -u $O $A $B

Cool! Abusing that, perhaps I could teach git-rebase to take a
'--trivial-only' flag, and then cg-update --rebase could try
git-rebase --trivial-only, and fall back on cg-merge...

cheers,

martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-14  4:01                     ` Martin Langhoff
@ 2005-11-14  6:06                       ` Junio C Hamano
  2005-11-14  8:51                         ` Martin Langhoff
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2005-11-14  6:06 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git

Martin Langhoff <martin.langhoff@gmail.com> writes:

> On 11/14/05, Junio C Hamano <junkio@cox.net> wrote:
>> In your message you indicated that you use "format-patch" piped
>> to "am".  I think that is a better approach than "rebase" these
>> days
>
> Hmmm. But doesn't deal well with binary changes. We deal with a large
> set of projects, and while we don't manage that many binary files, it
> is just enough that I'll have to pass on only using format-patch.

It shouldn't be too tricky to enhance "git am" (git-apply called
at around line 49 in it) to grok binary differences for this
purpose, because you would have both pre- and post-image blob in
your object database, because the patch is being used only to
replay what you have in your reository, and it records their
abbreviated SHA1 name.

I've never felt need to "merge" the binary files myself and had
never got around doing this, but if you are interested, it would
go something like this:

. Make "index %.7s..%.7s" that abbreviates pre- and post- image
  blob SHA1s in diff.c configurable to spit out full 40 bytes.
  Call that option --full-index-sha1.

. The updated "git rebase" that uses the "format-patch | am" I
  outlined would pass the --full-index-sha1 to format-patch
  (which is pased onto underlying diff-tree -p).

. In apply.c, check if all of the following holds:

    * we have both the full 40-byte old_sha1_prefix[] and
      new_sha1_prefix[]; and

    * what the index records matches old_sha1_prefix[]; and

    * the new blob is found in the object database;

  and for such a path:

    * change parse_chunk() not to barf even on a binary patch.

    * change apply_data() to just declare the patch application
      result is the new blob recorded in the patch.

Hmm.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-14  6:06                       ` Junio C Hamano
@ 2005-11-14  8:51                         ` Martin Langhoff
  2005-11-14  9:25                           ` Petr Baudis
                                             ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Martin Langhoff @ 2005-11-14  8:51 UTC (permalink / raw)
  To: Junio C Hamano, Petr Baudis; +Cc: git

On 11/14/05, Junio C Hamano <junkio@cox.net> wrote:
> It shouldn't be too tricky to enhance "git am" (git-apply called
> at around line 49 in it) to grok binary differences for this
> purpose, because you would have both pre- and post-image blob in
> your object database, because the patch is being used only to
> replay what you have in your reository, and it records their
> abbreviated SHA1 name.

I'm curious. What would be the advantages of this over git-read-tree
-m for use within a single repo? I keep thinking that I need an
intra-repo way of doing it (arguably faster and more reliable),
instead of git-format-patch|git-am, which is less reliable and slower.

OTOH,  if this is heading towards teaching git-am how to apply changes
to binary files based on known SHA1s, this will give birth to a type
of patch that applies only if you have the objects beforehand. Is that
enough to get by? Perhaps we need a format to fully describe binary
files?

> I've never felt need to "merge" the binary files myself and had
> never got around doing this, but if you are interested, it would
> go something like this:

That's quite a bit of C hacking... I'm game for all the Perl and shell
scripts in git, but I know better than start learning C in _this_
project with you, Linus and the whole list watching me make the fool
;-)

So noone hacks this bit of C you are proposing I'll eventually get
something done with cg-update and git-rebase.

In the meantime, the user experience of working with a small team and
a shared repo has improved significantly by switching from cg-update
to cg-fetch && git-rebase origin. So much so that I suspect that it'd
be a big win for cg-update to default to rebase on merges from
'origin'.

cheers,

martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-14  8:51                         ` Martin Langhoff
@ 2005-11-14  9:25                           ` Petr Baudis
  2005-11-14 21:25                             ` Martin Langhoff
  2005-11-14  9:27                           ` Junio C Hamano
  2005-11-15  3:00                           ` Junio C Hamano
  2 siblings, 1 reply; 58+ messages in thread
From: Petr Baudis @ 2005-11-14  9:25 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Junio C Hamano, git

Dear diary, on Mon, Nov 14, 2005 at 09:51:29AM CET, I got a letter
where Martin Langhoff <martin.langhoff@gmail.com> said that...
> In the meantime, the user experience of working with a small team and
> a shared repo has improved significantly by switching from cg-update
> to cg-fetch && git-rebase origin. So much so that I suspect that it'd
> be a big win for cg-update to default to rebase on merges from
> 'origin'.

I still don't understand what's the point. And it is confusing for
the user, since the history suddenly doesn't reflect how the GIT
development happened (which is what the history is about), and it is
downright deadly as soon as more than one branch gets around, which
makes it even more confusing for the user (you can do cg-update, yeah
- oh wait, unless you do X, then you need to remember to always do
cg-update -R or whatever).

All the woes just to get rid of the merge commits. What's wrong on merge
commits? If they irritate you so much, cg-log -M. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-14  9:25                           ` Petr Baudis
@ 2005-11-14 21:25                             ` Martin Langhoff
  0 siblings, 0 replies; 58+ messages in thread
From: Martin Langhoff @ 2005-11-14 21:25 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Junio C Hamano, git

On 11/14/05, Petr Baudis <pasky@suse.cz> wrote:
> All the woes just to get rid of the merge commits. What's wrong on merge
> commits? If they irritate you so much, cg-log -M. ;-)

Well, if you have a team of 5 working closely, doing commit/update
several times a day, soon the following things happen:

 - everyone has sightly different histories, even if they all have the same tree
 - in the repo, there are 'update' merges galore.
 - as soon as you _actually_ have branches, it's really hard to
distinguish branch merges from 'same branch update before commit'
merges.

by replacing the daily/hourly intra-team, self-branch `cg-update` with
`cg-fetch && git-rebase` our history makes much more sense _and_
looking at gitk I can see clearly again the interesting merges (ie:
merges with other branches).

In practice, it's exactly what happens with git's history -- the
merges are actually from rebased patches, that's why the history is
"readable".

does that make sense?

cheers,

martin

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-14  8:51                         ` Martin Langhoff
  2005-11-14  9:25                           ` Petr Baudis
@ 2005-11-14  9:27                           ` Junio C Hamano
  2005-11-15  3:00                           ` Junio C Hamano
  2 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-14  9:27 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Petr Baudis, git

Martin Langhoff <martin.langhoff@gmail.com> writes:

> I'm curious. What would be the advantages of this over git-read-tree
> -m for use within a single repo?

On a large tree I had an impression that applying patch and then
falling back on 3-way is faster, but other than that, nothing,
really.  Just being able to use a single mechanism, which does
not buy us much.

> OTOH,  if this is heading towards teaching git-am how to apply changes
> to binary files based on known SHA1s, this will give birth to a type
> of patch that applies only if you have the objects beforehand. Is that
> enough to get by? Perhaps we need a format to fully describe binary
> files?

I'd rather not see our "patch" go in the direction of recording
both pre- and post-image of blob for binary files, which is what
we would end up doing if we really want to do binary flexibly.

Well, that may be nice as an option, but not by default.

An option halfway in between would be to record the pre-image
SHA1 and post- blob, perhaps compressed-uuencoded.  This would
limit us to the case that the recipient has not touched the
binary file and replacing it, but in practice that might be
enough.

I'm willing to do the C hackery myself if there is enough
interest in it.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-14  8:51                         ` Martin Langhoff
  2005-11-14  9:25                           ` Petr Baudis
  2005-11-14  9:27                           ` Junio C Hamano
@ 2005-11-15  3:00                           ` Junio C Hamano
  2 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2005-11-15  3:00 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git

Martin Langhoff <martin.langhoff@gmail.com> writes:

>> I've never felt need to "merge" the binary files myself and had
>> never got around doing this, but if you are interested, it would
>> go something like this:
>
> That's quite a bit of C hacking...

I'll be pushing out something along the lines I said in the
proposed updates branch tonight.  I've even run 1 (one) test
;-).  The relevant commits are these:

+  [pu~5^2] rebase: make it usable for binary files as well.
+  [pu~5^2^] diff: --full-index
+  [pu~5^2~2] apply: allow-binary-replacement.
+  [pu~5^2~3] Rewrite rebase to use git-format-patch piped to git-am.

Note that this does not make generated "binary patch" usable
across repositories yet.  Since our patches are reversible out
of principle, making it usable across repositories involves
recording both the pre- and post-image of binary blobs in the
patch output, which may be a useful option in some cases but not
necessary for the immediate application of intra-repo rebasing.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH] GIT commit statistics.
  2005-11-12 12:19             ` Martin Langhoff
                                 ` (2 preceding siblings ...)
  2005-11-13 10:59               ` Junio C Hamano
@ 2005-11-13 11:11               ` Petr Baudis
  3 siblings, 0 replies; 58+ messages in thread
From: Petr Baudis @ 2005-11-13 11:11 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Junio C Hamano, Ryan Anderson, Linus Torvalds, git

Dear diary, on Sat, Nov 12, 2005 at 01:19:45PM CET, I got a letter
where Martin Langhoff <martin.langhoff@gmail.com> said that...
> Similarly, when dealing with an upstream, my tree gets slowly out of
> sync and slightly messy.

I've been replying only to this, but missed the team shared repo case,
where StGIT obviously does not make much sense.

> Related to this, I've been wondering whether it'd be possible to teach
> git to rebase local patches, even if that means rewriting local
> history. When you are dealing with team shared repo, the sequences of
> pull/push end up being quite messy, full of little meaningless merges.

Well, only pulls make up for merges. But what's wrong with that? This is
just what you get when using distributed VCS, and what's the point in
skewing the history to look different than how did it happen? I'd just
get used to it. :-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2005-11-15 15:30 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-11-07 16:48 Comments on recursive merge Linus Torvalds
2005-11-07 16:56 ` Linus Torvalds
2005-11-07 23:19   ` [PATCH] merge-recursive: Only print relevant rename messages Fredrik Kuivinen
2005-11-07 23:54     ` Junio C Hamano
2005-11-09 10:36       ` Fredrik Kuivinen
2005-11-07 22:58 ` Comments on recursive merge Fredrik Kuivinen
2005-11-08  0:13   ` Junio C Hamano
2005-11-08  0:33     ` Linus Torvalds
2005-11-08  0:59       ` Junio C Hamano
2005-11-08 11:58       ` Johannes Schindelin
2005-11-08 21:02         ` Fredrik Kuivinen
2005-11-08 21:47           ` Junio C Hamano
2005-11-08 21:52           ` Linus Torvalds
2005-11-08 22:36             ` Fredrik Kuivinen
2005-11-08 23:05               ` Linus Torvalds
2005-11-08 23:18                 ` Johannes Schindelin
2005-11-09  0:18                   ` Linus Torvalds
2005-11-09  6:10                     ` Junio C Hamano
2005-11-09  0:32                 ` Petr Baudis
2005-11-09  0:51                   ` Linus Torvalds
2005-11-09  0:59                     ` Junio C Hamano
2005-11-09  1:22                       ` Linus Torvalds
2005-11-09  1:42                         ` Junio C Hamano
2005-11-09 10:20                           ` Junio C Hamano
2005-11-09 14:59                             ` Petr Baudis
2005-11-09 16:30                             ` Linus Torvalds
2005-11-09 20:13                               ` Junio C Hamano
2005-11-09 21:58                                 ` Linus Torvalds
2005-11-09 22:56                                   ` Junio C Hamano
2005-11-09 23:34                                     ` Linus Torvalds
2005-11-11  2:58                                   ` merge-base: fully contaminate the well Junio C Hamano
2005-11-11  5:36                                     ` Linus Torvalds
2005-11-11  6:04                                       ` Junio C Hamano
2005-11-11 16:18                                         ` Linus Torvalds
2005-11-11  8:28                                       ` Junio C Hamano
2005-11-08 23:04           ` Comments on recursive merge Johannes Schindelin
2005-11-08 16:21       ` [RFC/PATCH] Make git-recursive the default strategy for git-pull Junio C Hamano
2005-11-11 22:25       ` Comments on recursive merge Junio C Hamano
2005-11-11 22:53         ` Linus Torvalds
2005-11-12  0:42           ` Junio C Hamano
2005-11-12  6:35         ` Ryan Anderson
2005-11-12  7:44           ` [PATCH] GIT commit statistics Junio C Hamano
2005-11-12 12:19             ` Martin Langhoff
2005-11-12 12:53               ` Petr Baudis
2005-11-15 10:04                 ` Catalin Marinas
2005-11-15 15:29                   ` Chuck Lever
2005-11-12 19:04               ` Johannes Schindelin
2005-11-13 10:59               ` Junio C Hamano
2005-11-13 20:42                 ` Martin Langhoff
2005-11-14  3:33                   ` Junio C Hamano
2005-11-14  4:01                     ` Martin Langhoff
2005-11-14  6:06                       ` Junio C Hamano
2005-11-14  8:51                         ` Martin Langhoff
2005-11-14  9:25                           ` Petr Baudis
2005-11-14 21:25                             ` Martin Langhoff
2005-11-14  9:27                           ` Junio C Hamano
2005-11-15  3:00                           ` Junio C Hamano
2005-11-13 11:11               ` Petr Baudis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).