git-rerere observations and feature suggestions

* git-rerere observations and feature suggestions
@ 2008-06-16 11:01 Ingo Molnar
  2008-06-16 11:09 ` Mike Hommey
                   ` (5 more replies)
  0 siblings, 6 replies; 45+ messages in thread
From: Ingo Molnar @ 2008-06-16 11:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

We are running a rather complex Git tree with heavy use of git-rerere 
(the -tip kernel tree, with more than 80 topic branches). git-rerere is 
really nice in that it caches conflict resolutions, but there are a few 
areas where it would be nice to have improvements:

 - Fixing resolutions: currently, when i do an incorrect conflict
   resolution, and fix it on the next run, git-rerere does not pick up
   the new resolution but uses the old (buggy) one on the next run. To
   fix it up i have to find the right entries in .git/rr-cache/* and
   manually erase them. Would be nice to have "git-rerere gc <pathspec>"
   to flush out a single bad resolution.

 - File deletion: would be nice if git-rerere picked up git-rm
   resolutions. We hit this every now and then and right now i know 
   which ones need an extra git-rm pass.

 - Automation: would be nice to have a git-rerere modus operandi where
   it would auto-commit things if and only if all conflicting files were 
   resolved.

 - Sharing .git/rr-cache. It's quite a PITA to share the .git/rr-cache
   amongst -tip maintainers right now. It seems to have dependencies on 
   the index file, so if we want to share the conflict resolution data, 
   we have to copy our index file (which is dangerous anyway and assumes 
   very similar repositories).

   It would be much nicer if we could share conflict resolutions with 
   each other - and with others as well. For example linux-next could 
   re-use our conflict resolution data as well - often Stephen Rothwell 
   has to re-do the same conflict resolution as well, creating 
   duplicated work.

   ( Also, it's a GPL nitpicky issue: the conflict resolution database 
     can be argued to be part of "source code" and as such it should be 
     shared with everyone who asks. With trivial merges the data is
     probably not copyrightable hence probably falls outside the scope 
     of the GPL, but with a complex topic tree like -tip with dozens of 
     conflict resolutions, the boundary is perhaps more blurred. )

	Ingo

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: git-rerere observations and feature suggestions
  2008-06-16 11:01 git-rerere observations and feature suggestions Ingo Molnar
@ 2008-06-16 11:09 ` Mike Hommey
  2008-06-16 15:48   ` Pierre Habouzit
  2008-06-16 11:26 ` David Kastrup
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 45+ messages in thread
From: Mike Hommey @ 2008-06-16 11:09 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git, Junio C Hamano

- At least, compress the data in the rr-cache. It can grow big quite
  easily. Also, I wonder if keeping the entire files is not overkill...

Mike


* Re: git-rerere observations and feature suggestions
  2008-06-16 11:01 git-rerere observations and feature suggestions Ingo Molnar
  2008-06-16 11:09 ` Mike Hommey
@ 2008-06-16 11:26 ` David Kastrup
  2008-06-16 11:27 ` Theodore Tso
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 45+ messages in thread
From: David Kastrup @ 2008-06-16 11:26 UTC (permalink / raw)
  To: git

Ingo Molnar <mingo@elte.hu> writes:

>    ( Also, it's a GPL nitpicky issue: the conflict resolution database 
>      can be argued to be part of "source code" and as such it should be 
>      shared with everyone who asks.

I don't think that interpretation holds water.  Not even the version
control history is part of the _corresponding_ source code AFAICT.  If
it were, GPLed software distributions would be a nightmare since you
would have to deliver everything with complete history.

Only very nonstandard usage of version control might make the
_corresponding_ source code be contained in more than HEAD of the
release branch.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


* Re: git-rerere observations and feature suggestions
  2008-06-16 11:01 git-rerere observations and feature suggestions Ingo Molnar
  2008-06-16 11:09 ` Mike Hommey
  2008-06-16 11:26 ` David Kastrup
@ 2008-06-16 11:27 ` Theodore Tso
  2008-06-16 12:38   ` David Kastrup
  2008-06-16 19:52   ` Ingo Molnar
  2008-06-16 18:46 ` Junio C Hamano
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 45+ messages in thread
From: Theodore Tso @ 2008-06-16 11:27 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git, Junio C Hamano

On Mon, Jun 16, 2008 at 01:01:13PM +0200, Ingo Molnar wrote:
>    ( Also, it's a GPL nitpicky issue: the conflict resolution database 
>      can be argued to be part of "source code" and as such it should be 
>      shared with everyone who asks. With trivial merges the data is
>      probably not copyrightable hence probably falls outside the scope 
>      of the GPL, but with a complex topic tree like -tip with dozens of 
>      conflict resolutions, the boundary is perhaps more blurred. )

For a more complex merge resolution, granted that it rises to the
level of being "copyrightable", but I think it would be a huge stretch
to call the rr-cache the "preferred form for modifications"!  :-)

   	    	     	 	    	     - Ted


* Re: git-rerere observations and feature suggestions
  2008-06-16 11:27 ` Theodore Tso
@ 2008-06-16 12:38   ` David Kastrup
  2008-06-16 19:52   ` Ingo Molnar
  1 sibling, 0 replies; 45+ messages in thread
From: David Kastrup @ 2008-06-16 12:38 UTC (permalink / raw)
  To: git

Theodore Tso <tytso@mit.edu> writes:

> On Mon, Jun 16, 2008 at 01:01:13PM +0200, Ingo Molnar wrote:
>>    ( Also, it's a GPL nitpicky issue: the conflict resolution database 
>>      can be argued to be part of "source code" and as such it should be 
>>      shared with everyone who asks. With trivial merges the data is
>>      probably not copyrightable hence probably falls outside the scope 
>>      of the GPL, but with a complex topic tree like -tip with dozens of 
>>      conflict resolutions, the boundary is perhaps more blurred. )
>
> For a more complex merge resolution, granted that it rises to the
> level of being "copyrightable", but I think it would be a huge stretch
> to call the rr-cache the "preferred form for modifications"!  :-)

The GPL just calls for all "corresponding" source code, not all
"interesting" source code.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


* Re: git-rerere observations and feature suggestions
  2008-06-16 11:09 ` Mike Hommey
@ 2008-06-16 15:48   ` Pierre Habouzit
  2008-06-16 15:57     ` Pierre Habouzit
  0 siblings, 1 reply; 45+ messages in thread
From: Pierre Habouzit @ 2008-06-16 15:48 UTC (permalink / raw)
  To: Mike Hommey; +Cc: Ingo Molnar, git, Junio C Hamano


On Mon, Jun 16, 2008 at 11:09:18AM +0000, Mike Hommey wrote:
> - At least, compress the data in the rr-cache. It can grow big quite
>   easily. Also, I wonder if keeping the entire files is not overkill...

  Actually it would be rather straightforward to put it in the usual git
store, and represent the current rr-cache with a flat file that points
to the in-git preimage/postimages, and make git-gc aware of those.

  This would deal with the huge number of files + compression quite
easily. I'm quite sure it's pretty straightforward actually :)
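
One way to picture the flat-file idea is sketched below in Python. This is only an illustration under stated assumptions: the line format (`<conflict-id> <preimage-blob> <postimage-blob>`) and the helper names `blob_id` and `index_line` are hypothetical and do not come from the thread. The one fixed fact it relies on is how Git names a blob object: the SHA-1 of the header `blob <size>\0` followed by the contents.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Object name Git gives a blob: SHA-1 over 'blob <size>\\0' + contents."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def index_line(conflict_id: str, preimage: bytes, postimage: bytes) -> str:
    """Hypothetical flat-file record pointing at the two stored blobs."""
    return f"{conflict_id} {blob_id(preimage)} {blob_id(postimage)}"


# Made-up conflict id and contents, purely for demonstration.
example = index_line("ab" * 20, b"conflicted text\n", b"resolved text\n")
```

Because the record holds only object names, the images themselves would be compressed and packed like any other blob, which is the saving the message above is after.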

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org



* Re: git-rerere observations and feature suggestions
  2008-06-16 15:48   ` Pierre Habouzit
@ 2008-06-16 15:57     ` Pierre Habouzit
  2008-06-16 16:18       ` Sverre Rabbelier
  0 siblings, 1 reply; 45+ messages in thread
From: Pierre Habouzit @ 2008-06-16 15:57 UTC (permalink / raw)
  To: Mike Hommey, Ingo Molnar, git, Junio C Hamano


On Mon, Jun 16, 2008 at 03:48:51PM +0000, Pierre Habouzit wrote:
> On Mon, Jun 16, 2008 at 11:09:18AM +0000, Mike Hommey wrote:
> > - At least, compress the data in the rr-cache. It can grow big quite
> >   easily. Also, I wonder if keeping the entire files is not overkill...
> 
>   Actually it would be rather straightforward to put it in the usual git
> store, and represent the current rr-cache with a flat file that points
> to the in-git preimage/postimages, and make git-gc aware of those.

  Actually, this is probably a required step in the direction of sharing
such things btw.

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org



* Re: git-rerere observations and feature suggestions
  2008-06-16 15:57     ` Pierre Habouzit
@ 2008-06-16 16:18       ` Sverre Rabbelier
  2008-06-17  7:37         ` Karl Hasselström
  0 siblings, 1 reply; 45+ messages in thread
From: Sverre Rabbelier @ 2008-06-16 16:18 UTC (permalink / raw)
  To: Pierre Habouzit, Mike Hommey, Ingo Molnar, git, Junio C Hamano

On Mon, Jun 16, 2008 at 5:57 PM, Pierre Habouzit <madcoder@debian.org> wrote:
> On Mon, Jun 16, 2008 at 03:48:51PM +0000, Pierre Habouzit wrote:
>>   Actually it would be rather straightforward to put it in the usual git
>> store, and represent the current rr-cache with a flat file that points
>> to the in-git preimage/postimages, and make git-gc aware of those.
>
>  Actually, this is probably a required step in the direction of sharing
> such things btw.

Perhaps an approach similar to the 'notes' implementation can be used,
in which a separate branch is created to contain the notes. This way
the rerere information (being the 'rerere' branch) can be shared
easily (by just pulling the branch), and as said we get free
compression. Another advantage would be that you automagically get the
ability to unlearn a bad rerere by simply (partially) reverting a
commit on the rerere branch!

-- 
Cheers,

Sverre Rabbelier


* Re: git-rerere observations and feature suggestions
  2008-06-16 11:01 git-rerere observations and feature suggestions Ingo Molnar
                   ` (2 preceding siblings ...)
  2008-06-16 11:27 ` Theodore Tso
@ 2008-06-16 18:46 ` Junio C Hamano
  2008-06-16 19:09   ` Ingo Molnar
                     ` (2 more replies)
  2008-06-16 20:11 ` Jakub Narebski
  2008-06-17 10:24 ` Johannes Schindelin
  5 siblings, 3 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-06-16 18:46 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git

Ingo Molnar <mingo@elte.hu> writes:

> We are running a rather complex Git tree with heavy use of git-rerere 
> (the -tip kernel tree, with more than 80 topic branches). git-rerere is 
> really nice in that it caches conflict resolutions, but there are a few 
> areas where it would be nice to have improvements:
>
>  - Fixing resolutions: currently, when i do an incorrect conflict
>    resolution, and fix it on the next run, git-rerere does not pick up
>    the new resolution but uses the old (buggy) one on the next run. To
>    fix it up i have to find the right entries in .git/rr-cache/* and
>    manually erase them. Would be nice to have "git-rerere gc <pathspec>"
>    to flush out a single bad resolution.

I agree this is a real issue (I sometimes know that the resolution is iffy
and say "rerere clear" to choose not to record it, but that is working
around the issue with a perfect foresight and is not a solution).

I think (and I think you would agree) "gc" is not the right word but
rather you would want to more actively discard the wrong one.

I agree that it is the right UI to do this to specify paths right after
you found that a bad resolution that was recorded previously was used by
rerere (I think that is what you are suggesting).  Upon such a request, we
should undo the bad resolution and bring the working tree copy to the
original conflicted state, and clear the bad rerere entry.
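
The last step, clearing one bad entry, can be sketched in Python as follows. It is a minimal illustration, not how rerere itself is implemented: it assumes the layout described elsewhere in this thread (one directory per conflict signature under rr-cache, holding `preimage` and `postimage` files), the function names are hypothetical, and restoring the working tree to its conflicted state is not shown.

```python
import shutil
import tempfile
from pathlib import Path


def has_resolution(entry: Path) -> bool:
    """An entry counts as resolved once a postimage has been recorded."""
    return (entry / "postimage").is_file()


def forget_resolution(rr_cache: Path, conflict_id: str) -> bool:
    """Delete one cached entry; return False if no such entry exists."""
    entry = rr_cache / conflict_id
    if not entry.is_dir():
        return False
    shutil.rmtree(entry)
    return True


# Demo on a throwaway directory standing in for .git/rr-cache.
demo_cache = Path(tempfile.mkdtemp())
bad = demo_cache / ("ab" * 20)  # made-up 40-hex conflict signature
bad.mkdir()
(bad / "preimage").write_text("conflicted text\n")
(bad / "postimage").write_text("buggy resolution\n")

resolved_before = has_resolution(bad)
removed = forget_resolution(demo_cache, bad.name)
removed_again = forget_resolution(demo_cache, bad.name)
```

Only the one entry is touched, so every other recorded resolution stays available for the next run.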

>  - File deletion: would be nice if git-rerere picked up git-rm
>    resolutions. We hit this every now and then and right now i know 
>    which ones need an extra git-rm pass.

I originally did not have need for anything other than three-way conflict
resolving to a result.  I do not know how safe reapplying a removal to
a different context is, though.

>  - Automation: would be nice to have a git-rerere modus operandi where
>    it would auto-commit things if and only if all conflicting files were 
>    resolved.

I am not sure how safe this is.  rerere as originally designed does not
even update the index with merge results so that the application of
earlier resolution can be manually inspected, and this is exactly because
I consider a blind textual reapplication of previous resolution always
iffy, even though I invented the whole mechanism.


* Re: git-rerere observations and feature suggestions
  2008-06-16 18:46 ` Junio C Hamano
@ 2008-06-16 19:09   ` Ingo Molnar
  2008-06-16 20:50     ` Junio C Hamano
  2008-06-18 10:57     ` git-rerere observations and feature suggestions Ingo Molnar
  2008-06-16 19:10   ` Junio C Hamano
  2008-06-23  9:49   ` Ingo Molnar
  2 siblings, 2 replies; 45+ messages in thread
From: Ingo Molnar @ 2008-06-16 19:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

* Junio C Hamano <gitster@pobox.com> wrote:

> >  - Automation: would be nice to have a git-rerere modus operandi where
> >    it would auto-commit things if and only if all conflicting files were 
> >    resolved.
> 
> I am not sure how safe this is.  rerere as originally designed does 
> not even update the index with merge results so that the application 
> of earlier resolution can be manually inspected, and this is exactly 
> because I consider a blind textual reapplication of previous 
> resolution always iffy, even though I invented the whole mechanism.

We use a 'safe, lazy integration' method in -tip, that basically has 
external checks against any integration bugs.

Basically, we integrate only about once a day, and we advance the topic 
branches but do not reintegrate on every topic merge. We merge commits 
_both_ to their target topic branches, and to the (previous) integration 
branch.

Then once a day (or every second day) we 'reintegrate': we propagate the 
topic branches to the linux-next auto-*-next branches [recreating them 
from scratch] and flush out the messy criss-cross merges from the 
integration tree.

But that is always an identity transformation as far as the integration 
result is concerned: the result of the integration run must be exactly 
the same content (obviously it results in a very different tree 
structure) as the previous one. We only run it on a perfectly tested 
tree so we know none of our previous merges were wrong, and we want the 
git-rerere result to be the same. We repeat the integration until the 
end result matches.
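
The "identity transformation" check can be sketched in Python like this. It is an assumption-laden illustration rather than the actual -tip script: it compares the tree object of the old and the new integration result via `git rev-parse <rev>^{tree}`, and the function names and the injectable `run` parameter are hypothetical.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def git_output(argv: List[str]) -> str:
    """Run a git command and return its standard output as text."""
    return subprocess.check_output(argv, text=True)


def tree_id(rev: str, run: Runner = git_output) -> str:
    """Name of the tree a revision points at; history is ignored."""
    return run(["git", "rev-parse", "--verify", f"{rev}^{{tree}}"]).strip()


def same_content(old_rev: str, new_rev: str, run: Runner = git_output) -> bool:
    """True when both revisions hold byte-identical content."""
    return tree_id(old_rev, run) == tree_id(new_rev, run)
```

Two commits with very different merge structure still share one tree id when their content is identical, so a mismatch means the reintegration changed something and has to be repeated.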

In fact sometimes git-rerere is able to pick up a conflict resolution 
from our 'messy' delta-merge into the integration tree, which is an 
added bonus. (this doesn't always work if the merge order differs from 
integration order)

Anyway, the gist is that in this workflow it does not hurt at all if 
git-rerere is "unsafe", and we'd love to have the integration as fast as 
possible. Right now most of my manual overhead is in making sure that 
git-rerere has not missed some file.

With ~100 conflicting files tracked, that is rather error-prone, and i'd 
love to have further automation here besides a rather lame method of 
grepping for:

  "Resolved 'kernel/Makefile' using previous resolution."

type of patterns in git-merge output.
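
The grepping described above could be automated along these lines. This Python sketch is an assumption, not part of the -tip scripts: it takes the "Resolved ... using previous resolution." line quoted above, additionally assumes that conflicts are announced as `CONFLICT (content): Merge conflict in <path>`, and ignores other conflict types. The function name is hypothetical.

```python
import re
from typing import List

# Assumed format of the line git-merge prints for a content conflict.
CONFLICT_RE = re.compile(r"^CONFLICT \(content\): Merge conflict in (.+)$")
# Format of the rerere line quoted in the message above.
RESOLVED_RE = re.compile(r"^Resolved '(.+)' using previous resolution\.$")


def unresolved_paths(merge_output: str) -> List[str]:
    """Conflicted paths for which no rerere resolution was reported."""
    conflicted: List[str] = []
    resolved = set()
    for line in merge_output.splitlines():
        conflict = CONFLICT_RE.match(line)
        if conflict:
            conflicted.append(conflict.group(1))
            continue
        hit = RESOLVED_RE.match(line)
        if hit:
            resolved.add(hit.group(1))
    return [path for path in conflicted if path not in resolved]


sample = """Auto-merging kernel/Makefile
CONFLICT (content): Merge conflict in kernel/Makefile
Auto-merging kernel/sched.c
CONFLICT (content): Merge conflict in kernel/sched.c
Resolved 'kernel/Makefile' using previous resolution.
Automatic merge failed; fix conflicts and then commit the result.
"""
missed = unresolved_paths(sample)
```

An empty result would mean every reported conflict was covered by a cached resolution; anything left in the list still needs a manual pass.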

So i'd not mind if git-rerere was safe by default, but it would be nice 
to have some knob to turn it into something fast and automatic. For us 
it would be much _safer_, because right now most of our manual energy is 
spent on checking something that could be automated.

We could in theory avoid git-rerere altogether by creating separate 
conflict resolution branches and automating their handling - but we 
thought git-rerere was pretty nice as well and kept the branch count 
down.

And while asking for an arm i'd also like to ask for a leg, if i may: 
i'd love it if a "slightly conflicting" octopus merge of 85 topic trees 
would not result in one huge conflict commit that merges together 1000 
commits into a single commit ;-)

So right now our -tip scripts work around this issue: we 'serialize' 
the topic merges despite having very nice opportunities for higher-order 
octopus merges. The integration would be a lot faster if we could use 
octopus merges and automated git-rerere. (Octopus merges would look much 
nicer in graphical representation as well, which counts too :-) )

	Ingo


* Re: git-rerere observations and feature suggestions
  2008-06-16 18:46 ` Junio C Hamano
  2008-06-16 19:09   ` Ingo Molnar
@ 2008-06-16 19:10   ` Junio C Hamano
  2008-06-16 19:44     ` Ingo Molnar
  2008-06-23  9:49   ` Ingo Molnar
  2 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2008-06-16 19:10 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Ingo Molnar <mingo@elte.hu> writes:
> ...
>>  - Automation: would be nice to have a git-rerere modus operandi where
>>    it would auto-commit things if and only if all conflicting files were 
>>    resolved.
>
> I am not sure how safe this is.  rerere as originally designed does not
> even update the index with merge results so that the application of
> earlier resolution can be manually inspected, and this is exactly because
> I consider a blind textual reapplication of previous resolution always
> iffy, even though I invented the whole mechanism.

By the way, this safety is not a theoretical issue but has been a real
one.  I had two topics that changed the calling convention of the same
function in different ways, and when they were merged to 'pu', the
declaration, definition, and call sites that existed on both of these
branches were handled beautifully by rerere.

Recording autoresolution would have been a wrong thing to do.  One of the
branches added a new call site to a file that was not among the ones that
conflicted in the merge between the two branches.  That call site, which
uses the calling convention of one branch, needed to be adjusted to
accommodate the change of calling convention from the other branch (from
textual merge's point of view, this has to be an evil merge).  I had to
make and keep a mental note about that new call site until both topics
graduated to 'master' (similar to your need to remember a particular merge
is resolved to removal right now).

To safely automate reapplication of such a merge, rerere needs to become
much more clever.

The conflicts rerere notices and records are strictly per blob.  A
conflicted merge to a blob is inspected and a "conflict signature", which
becomes the directory name under rr-cache, is computed.  We record the
conflicted blob as a whole as the preimage, and your hand resolution as a
whole as the postimage.  Next time when you have a conflicted merge to a
blob, and the conflict has the exact same conflict signature, we run
three-way merge between the recorded preimage, postimage and the new
conflicted result.
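
To make the "conflict signature" idea concrete, here is a simplified Python sketch. It is not rerere's actual algorithm: it assumes the signature depends only on the text between the conflict markers, hashes the two sides of each hunk in sorted order so that swapping "ours" and "theirs" yields the same value, and ignores marker labels and surrounding context. The function name is hypothetical.

```python
import hashlib
from typing import List


def conflict_signature(conflicted_text: str) -> str:
    """Hash only the conflict hunks of a blob, independent of side order."""
    digest = hashlib.sha1()
    side_a: List[str] = []
    side_b: List[str] = []
    state = "context"
    for line in conflicted_text.splitlines(keepends=True):
        if line.startswith("<<<<<<< "):
            state, side_a, side_b = "ours", [], []
        elif state == "ours" and line.startswith("======="):
            state = "theirs"
        elif state == "theirs" and line.startswith(">>>>>>> "):
            # Sort the two sides so the merge direction does not matter.
            first, second = sorted(["".join(side_a), "".join(side_b)])
            digest.update(first.encode() + b"\0" + second.encode() + b"\0")
            state = "context"
        elif state == "ours":
            side_a.append(line)
        elif state == "theirs":
            side_b.append(line)
    return digest.hexdigest()


one_way = "int x;\n<<<<<<< HEAD\nfoo(1);\n=======\nfoo(1, 2);\n>>>>>>> topic\nint y;\n"
other_way = "int z;\n<<<<<<< topic\nfoo(1, 2);\n=======\nfoo(1);\n>>>>>>> HEAD\n"
different = "<<<<<<< HEAD\nbar(1);\n=======\nbar(2);\n>>>>>>> topic\n"
```

In this sketch `one_way` and `other_way` map to the same signature even though their context and marker labels differ, while `different` does not; that is the property that lets a later merge find the recorded preimage and postimage again.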

If we want to handle new call sites added only on a single side, you
should be able to express something like "when a merge has a conflicted
blob with this conflict signature, look in the whole tree, even outside
the set of conflicted paths, and change this text to that".  This is too
much automation and I somehow think the potential for errors (both from
the tool and from the user) is too high.


* Re: git-rerere observations and feature suggestions
  2008-06-16 19:10   ` Junio C Hamano
@ 2008-06-16 19:44     ` Ingo Molnar
  0 siblings, 0 replies; 45+ messages in thread
From: Ingo Molnar @ 2008-06-16 19:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

* Junio C Hamano <gitster@pobox.com> wrote:

> > Ingo Molnar <mingo@elte.hu> writes:
> > ...
> >>  - Automation: would be nice to have a git-rerere modus operandi where
> >>    it would auto-commit things if and only if all conflicting files were 
> >>    resolved.
> >
> > I am not sure how safe this is.  rerere as originally designed does 
> > not even update the index with merge results so that the application 
> > of earlier resolution can be manually inspected, and this is exactly 
> > because I consider a blind textual reapplication of previous 
> > resolution always iffy, even though I invented the whole mechanism.
> 
> By the way, this safety is not a theoretical issue but has been a real 
> one.  I had two topics that changed the calling convention of the same 
> function in different ways, and when they were merged to 'pu', the 
> declaration, definition, and call sites that existed on both of these 
> branches were handled beautifully by rerere.
> 
> Recording autoresolution would have been a wrong thing to do.  One of 
> the branches added a new call site to a file that was not among the 
> ones that conflicted in the merge between the two branches.  That call 
> site, which uses the calling convention of one branch, needed to be 
> adjusted to accommodate the change of calling convention from the other 
> branch (from textual merge's point of view, this has to be an evil 
> merge).  I had to make and keep a mental note about that new call site 
> until both topics graduated to 'master' (similar to your need to 
> remember a particular merge is resolved to removal right now).
> 
> To safely automate reapplication of such a merge, rerere needs to 
> become much more clever.

in our workflow, we don't ever do any semantic things during the 
integration run. I.e. we don't put more complex merge changes into the 
integration merge commits.

Such integration effects do come up occasionally (especially when a 
topic changes some widely used infrastructure), and we handle them via 
separate merge branches. The current ones in -tip are 
tip/tracing/ftrace-mergefixups and tip/tracing/mmiotrace-mergefixups.

They are one or two orders of magnitude more rare than regular 
conflicts, and they show up immediately during testing. (or we 
anticipate them beforehand)

i.e. we'd like to have a 'dumb' phase of integration, as much cached and 
automated as possible. Things that need more thought need to go into 
separate branches anyway, for better reviewability - merge commits are 
rather hard to debug as they hide their true contents, so we try to keep 
them simple and contextual only.

	Ingo


* Re: git-rerere observations and feature suggestions
  2008-06-16 11:27 ` Theodore Tso
  2008-06-16 12:38   ` David Kastrup
@ 2008-06-16 19:52   ` Ingo Molnar
  2008-06-16 20:25     ` Junio C Hamano
  1 sibling, 1 reply; 45+ messages in thread
From: Ingo Molnar @ 2008-06-16 19:52 UTC (permalink / raw)
  To: Theodore Tso; +Cc: git, Junio C Hamano

* Theodore Tso <tytso@mit.edu> wrote:

> On Mon, Jun 16, 2008 at 01:01:13PM +0200, Ingo Molnar wrote:
> >    ( Also, it's a GPL nitpicky issue: the conflict resolution database 
> >      can be argued to be part of "source code" and as such it should be 
> >      shared with everyone who asks. With trivial merges the data is
> >      probably not copyrightable hence probably falls outside the scope 
> >      of the GPL, but with a complex topic tree like -tip with dozens of 
> >      conflict resolutions, the boundary is perhaps more blurred. )
> 
> For a more complex merge resolution, granted that it rises to the 
> level of being "copyrightable", but I think it would be a huge stretch 
> to call the rr-cache the "preferred form for modifications"!  :-)

yeah - i'm not really arguing any detail of the GPL here. I'm arguing 
the principle: there should be no technical asymmetry between maintainer 
and contributor. So if i am able to run an effort-free integration of 85 
topic branches, i'd like contributors (who will eventually grow up into 
co-maintainer roles in the future) to be able to do the same, if they 
want to do so.

right now that is simply not possible technically - it's even very hard 
to share a .git/rr-cache with a co-maintainer whom i can trust with my 
index file. (which is an otherwise unsafe private binary cache that i'd 
not put into a public repository as it could in theory contain lots of 
unrelated data and is not endian-safe, etc.)

	Ingo


* Re: git-rerere observations and feature suggestions
  2008-06-16 11:01 git-rerere observations and feature suggestions Ingo Molnar
                   ` (3 preceding siblings ...)
  2008-06-16 18:46 ` Junio C Hamano
@ 2008-06-16 20:11 ` Jakub Narebski
  2008-06-17 10:24 ` Johannes Schindelin
  5 siblings, 0 replies; 45+ messages in thread
From: Jakub Narebski @ 2008-06-16 20:11 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git, Junio C Hamano

Ingo Molnar <mingo@elte.hu> writes:

> We are running a rather complex Git tree with heavy use of git-rerere 
> (the -tip kernel tree, with more than 80 topic branches). git-rerere is 
> really nice in that it caches conflict resolutions, but there are a few 
> areas where it would be nice to have improvements:
[...]

>  - File deletion: would be nice if git-rerere picked up git-rm
>    resolutions. We hit this every now and then and right now i know 
>    which ones need an extra git-rm pass.

From what I remember, some time ago on the git mailing list there was an
idea for git-rerere2, which would record resolutions at the tree level,
i.e. record file renames.  It could probably record file deletions as
well, if someone were to implement it and it did not remain a loose idea.

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: git-rerere observations and feature suggestions
  2008-06-16 19:52   ` Ingo Molnar
@ 2008-06-16 20:25     ` Junio C Hamano
  2008-06-16 20:46       ` Ingo Molnar
  0 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2008-06-16 20:25 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Theodore Tso, git

Ingo Molnar <mingo@elte.hu> writes:

> right now that is simply not possible technically - it's even very hard 
> to share a .git/rr-cache with a co-maintainer whom i can trust with my 
> index file. (which is an otherwise unsafe private binary cache that i'd 
> not put into a public repository as it could in theory contain lots of 
> unrelated data and is not endian-safe, etc.)

Where did you get the idea that .git/index is involved in any way, I
wonder...


* Re: git-rerere observations and feature suggestions
  2008-06-16 20:25     ` Junio C Hamano
@ 2008-06-16 20:46       ` Ingo Molnar
  2008-06-16 21:37         ` Junio C Hamano
  0 siblings, 1 reply; 45+ messages in thread
From: Ingo Molnar @ 2008-06-16 20:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Theodore Tso, git


* Junio C Hamano <gitster@pobox.com> wrote:

> Ingo Molnar <mingo@elte.hu> writes:
> 
> > right now that is simply not possible technically - it's even very 
> > hard to share a .git/rr-cache with a co-maintainer whom i can trust 
> > with my index file. (which is an otherwise unsafe private binary 
> > cache that i'd not put into a public repository as it could in 
> > theory contain lots of unrelated data and is not endian-safe, etc.)
> 
> Where did you get the idea that .git/index is involved in any way, I 
> wonder...

so it's only the rr-cache metadata that is involved? We had a few cases 
where git-rerere sessions were not repeatable by copying the 
.git/rr-cache, so i just assumed that there's some extra metadata in the 
index file. When that happened i took a look at git/builtin-rerere.c:

 static int find_conflict(struct path_list *conflict)
 {
        int i;
        if (read_cache() < 0)
                return error("Could not read index");

and (mistakenly) assumed that git-rerere depends on having something in 
the index file - but on a second look it just checks out the conflicting 
file(s) from the index file, right?

	Ingo


* Re: git-rerere observations and feature suggestions
  2008-06-16 19:09   ` Ingo Molnar
@ 2008-06-16 20:50     ` Junio C Hamano
  2008-06-22  9:47       ` [PATCH 1/5] rerere: rerere_created_at() and has_resolution() abstraction Junio C Hamano
                         ` (4 more replies)
  2008-06-18 10:57     ` git-rerere observations and feature suggestions Ingo Molnar
  1 sibling, 5 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-06-16 20:50 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git

Ingo Molnar <mingo@elte.hu> writes:

> * Junio C Hamano <gitster@pobox.com> wrote:
>
>> >  - Automation: would be nice to have a git-rerere modus operandi where
>> >    it would auto-commit things if and only if all conflicting files were 
>> >    resolved.
>> 
>> I am not sure how safe this is.  rerere as originally designed does 
>> not even update the index with merge results so that the application 
>> of earlier resolution can be manually inspected, and this is exactly 
>> because I consider a blind textual reapplication of previous 
>> resolution always iffy, even though I invented the whole mechanism.
> ...
> So i'd not mind if git-rerere was safe by default, but it would be nice 
> to have some knob to turn it into something fast and automatic. For us 
> it would be much _safer_, because right now most of our manual energy is 
> spent on checking something that could be automated.

Oh, "unsafe switch" that is off by default will not hurt anybody, and I do
not mind it as a new feature.  We are in agreement in that sense.

Perhaps the way forward would be (and this is independent of the issue of
recording removal as a possible form of resolution):

 (1) Introduce a new configuration rerere.autoupdate that is off by
     default, but when it is on, paths cleanly resolved by rerere will
     also be updated in the index (if we have capability to record
     removal, this may remove such a path from the index as the result).

 (2) The callers of rerere that expect rerere to resolve need to be
     changed to see if the resulting index after rerere is fully merged,
     and continue.  Currently the callers are "merge", "rebase" and "am",
     I think.  This step might be a bit more involved than you might
     think, as rerere currently happens in the codepath that knows the
     caller does _not_ go further than leaving the failed conflict to be
     sorted out by the user (rerere is designed as merely a way to help).

     Also you _might_ want a separate configuration rerere.autocommit to
     control this --- the user (but not you) might be willing to allow
     autoupdate but you may still want to eyeball the result.
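
The "is the index fully merged" test in step (2) can be sketched in Python as below. This is an illustration under assumptions, not the proposed implementation: it takes the text printed by `git ls-files -u` (one `<mode> <object> <stage>TAB<path>` line per unmerged stage entry) as its input, and the function names are hypothetical.

```python
from typing import List


def unmerged_paths(ls_files_u_output: str) -> List[str]:
    """Distinct paths that still have unmerged stage entries in the index."""
    paths: List[str] = []
    for line in ls_files_u_output.splitlines():
        if not line:
            continue
        _meta, path = line.split("\t", 1)
        if path not in paths:  # one path shows up once per stage
            paths.append(path)
    return paths


def safe_to_autocommit(ls_files_u_output: str) -> bool:
    """Proceed only if rerere left nothing unmerged behind."""
    return not unmerged_paths(ls_files_u_output)


# Made-up object names; stages 1-3 are base, ours and theirs.
sample = (
    "100644 " + "a" * 40 + " 1\tkernel/sched.c\n"
    "100644 " + "b" * 40 + " 2\tkernel/sched.c\n"
    "100644 " + "c" * 40 + " 3\tkernel/sched.c\n"
)
```

A caller that wants the "all or nothing" behaviour would continue to the commit only when `safe_to_autocommit` returns true, and otherwise stop and leave the remaining conflicts to the user as today.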

Independent of the above, we have two potential new features:

 * Introduce "git rerere revert paths..."  that brings the index and
   working tree back to the conflicted state after a previous resolution
   is applied, because that resolution is incorrect.  The old resolution
   cached in rr-cache is also removed.

   This however will become much less useful if you allow autoresolution
   to be committed automatically, as the caller will move ahead without
   giving you a chance to say "oh, that one is bad -- do not proceed".

 * Somehow record the fact that the resolution for a particular conflict
   signature is to remove the resulting path.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: git-rerere observations and feature suggestions
  2008-06-16 20:46       ` Ingo Molnar
@ 2008-06-16 21:37         ` Junio C Hamano
  0 siblings, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-06-16 21:37 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Theodore Tso, git

Ingo Molnar <mingo@elte.hu> writes:

> * Junio C Hamano <gitster@pobox.com> wrote:
>
>> Ingo Molnar <mingo@elte.hu> writes:
>> 
>> > right now that is simply not possible technically - it's even very 
>> > hard to share a .git/rr-cache with a co-maintainer whom i can trust 
>> > with my index file. (which is an otherwise unsafe private binary 
>> > cache that i'd not put into a public repository as it could in 
>> > theory contain lots of unrelated data and is not endian-safe, etc.)
>> 
>> Where did you get the idea that .git/index is involved in any way, I 
>> wonder...
>
> so it's only the rr-cache metadata that is involved?

The binary part of the index should be in network byte order and endian
safe.  But it is not necessary to share the index.  Well, if you think
about it, it would be mighty silly if index had any long term effect on
the operation of rerere, which is all about "I've done many conflict
resolutions in the past.  My work tree state (including the index) came
back to a state similar to the conflicted state I saw some time ago.
Let's reuse the previous resolution if we can."  You might have switched
branches, run "reset --hard" and done 47 thousand different things to your
index since you resolved the conflict you are about to re-resolve ;-).

The replay and conflict recording codepath of rerere goes like this:

 * read the index, list the paths that have conflicts;

 * inspect the conflicted blob to compute the conflict signature $sig and
   store the sig and path in MERGE_RR;

 * look into rr-cache/$sig; does it already have a conflict resolution
   recorded?

   - If so, apply to the file in the working tree, by 3-way merge, the
     same change that turns rr-cache/$sig/preimage into
     rr-cache/$sig/postimage.

   - if not, record the file in the working tree as rr-cache/$sig/preimage

The resolution recording codepath goes like:

 * see if any path listed in MERGE_RR is resolved in the index;

 * look into rr-cache/$sig for each such resolved path.  Does it already
   record a resolution?

   - If not, we have a new resolution we can use.  Record it as
     rr-cache/$sig/postimage for later use.

So rerere _does_ look at the index to decide what entries in rr-cache are
relevant and applicable.  But other than that, it is not used.  I do not
think there is any reason to copy the index to be able to reuse rr-cache.
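
The two codepaths above can be modelled in a few lines of Python. This is a
toy sketch under stated assumptions, not rerere's code: the class and
function names are made up, the signature is a plain SHA-1 over the lines
between conflict markers (a stand-in for whatever rerere really hashes), and
the replay step reuses a resolution only when the file matches the preimage
exactly instead of doing a 3-way merge.

```python
import hashlib


def conflict_signature(text):
    """Hash only the lines inside conflict markers (simplified stand-in)."""
    h = hashlib.sha1()
    inside = False
    for line in text.splitlines():
        if line.startswith("<<<<<<<"):
            inside = True
        elif line.startswith(">>>>>>>"):
            inside = False
        elif inside:
            h.update(line.encode() + b"\n")
    return h.hexdigest()


class RerereModel:
    def __init__(self):
        self.rr_cache = {}  # sig -> {"preimage": text, "postimage": text}
        self.merge_rr = {}  # path -> sig, like the MERGE_RR file

    def replay_or_record_conflict(self, worktree, conflicted_paths):
        """conflicted_paths stands in for "read the index, list conflicts"."""
        for path in conflicted_paths:
            sig = conflict_signature(worktree[path])
            self.merge_rr[path] = sig
            entry = self.rr_cache.setdefault(sig, {})
            if "postimage" in entry:
                # Simplification: exact match only; rerere uses a 3-way merge.
                if worktree[path] == entry["preimage"]:
                    worktree[path] = entry["postimage"]
            else:
                entry["preimage"] = worktree[path]

    def record_resolutions(self, worktree, resolved_paths):
        """resolved_paths stands in for "paths now resolved in the index"."""
        for path in resolved_paths:
            sig = self.merge_rr.get(path)
            entry = self.rr_cache.get(sig, {})
            if "preimage" in entry and "postimage" not in entry:
                entry["postimage"] = worktree[path]
```

Note that the model keeps no index state between runs: the index only
supplies the list of conflicted or resolved paths at the time of the call,
so handing rr_cache alone to a fresh RerereModel is enough to replay a
resolution, even under a different path.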


* Re: git-rerere observations and feature suggestions
  2008-06-16 16:18       ` Sverre Rabbelier
@ 2008-06-17  7:37         ` Karl Hasselström
  0 siblings, 0 replies; 45+ messages in thread
From: Karl Hasselström @ 2008-06-17  7:37 UTC (permalink / raw)
  To: sverre
  Cc: Pierre Habouzit, Mike Hommey, Ingo Molnar, git, Junio C Hamano,
	Catalin Marinas

On 2008-06-16 18:18:40 +0200, Sverre Rabbelier wrote:

> On Mon, Jun 16, 2008 at 5:57 PM, Pierre Habouzit <madcoder@debian.org> wrote:
>
> > On Mon, Jun 16, 2008 at 03:48:51PM +0000, Pierre Habouzit wrote:
> >
> > > Actually it would be rather straightforward to put it in the
> > > usual git store, and represent the current rr-cache with a flat
> > > file that points to the in-git preimage/postimages, and make
> > > git-gc aware of those.
> >
> > Actually, this is probably a required step in the direction of
> > sharing such things btw.
>
> Perhaps an approach similar to the 'notes' implementation can be
> used, in which a separate branch is created to contain the notes.
> This way the rerere information (being the 'rerere' branch) can be
> shared easily (by just pulling the branch), and as said we get free
> compression. Another advantage would be that you automagically get
> the ability to unlearn a bad rerere by simply (partially) reverting
> a commit on the rerere branch!

FWIW, StGit is well on its way to store its patch metadata in a git
branch, for much the same reasons.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle


* Re: git-rerere observations and feature suggestions
  2008-06-16 11:01 git-rerere observations and feature suggestions Ingo Molnar
                   ` (4 preceding siblings ...)
  2008-06-16 20:11 ` Jakub Narebski
@ 2008-06-17 10:24 ` Johannes Schindelin
  5 siblings, 0 replies; 45+ messages in thread
From: Johannes Schindelin @ 2008-06-17 10:24 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git, Junio C Hamano

Hi,

On Mon, 16 Jun 2008, Ingo Molnar wrote:
>
>  - Sharing .git/rr-cache. It's quite a PITA to share the .git/rr-cache 
>    amongst -tip maintainers right now. It seems to have dependencies on 
>    the index file, so if we want to share the conflict resolution data, 
>    we have to copy our index file (which is dangerous anyway and assumes 
>    very similar repositories).

I was dreaming about having "git rerere infer-from <merge-commit>".  This 
would be

- more versatile, as you do not have to ask the guy to share the cache,

- would avoid transmitting lots of data that can be inferred from the 
  data,

- would avoid relying on the honesty of the person sharing the cache, and

- it would put all license wieners^Wissues at rest.

FWIW this is in my TODO list, but I am unlikely to get to it, least of all 
before 1.5.6 comes out.

Ciao,
Dscho


* Re: git-rerere observations and feature suggestions
  2008-06-16 19:09   ` Ingo Molnar
  2008-06-16 20:50     ` Junio C Hamano
@ 2008-06-18 10:57     ` Ingo Molnar
  2008-06-18 11:29       ` Miklos Vajna
                         ` (2 more replies)
  1 sibling, 3 replies; 45+ messages in thread
From: Ingo Molnar @ 2008-06-18 10:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


* Ingo Molnar <mingo@elte.hu> wrote:

> And while asking for an arm i'd also like to ask for a leg, if i may: 
> i'd love it if a "slightly conflicting" octopus merge of 85 topic 
> trees would not result in one huge conflict commit that merges 
> together 1000 commits into a single commit ;-)
> 
> So right now in our -tip scripts work around this issue: we 
> 'serialize' the topic merges despite having very nice opportunities 
> for higher-order octopus merges. The integration would be a lot faster 
> if we could use octopus merges and automated git-rerere. (Octopus 
> merges would look much nicer as well in graphical representation as 
> well, which counts too :-) )

just to demonstrate it, i tried today to do an octopus merge of 87 topic 
branches:

git-merge build checkme core/checkme core/debugobjects core/futex-64bit 
core/iter-div core/kill-the-BKL core/locking core/misc core/percpu 
core/printk core/rcu core/rodata core/softirq core/softlockup 
core/stacktrace core/topology core/urgent cpus4096 genirq kmemcheck 
kmemcheck2 mm/xen out-of-tree pci-for-jesse safe-poison-pointers sched 
sched-devel scratch stackprotector timers/clockevents timers/hpet 
timers/hrtimers timers/nohz timers/posixtimers tip tracing/ftrace 
tracing/ftrace-mergefixups tracing/immediates tracing/markers 
tracing/mmiotrace tracing/mmiotrace-mergefixups tracing/nmisafe 
tracing/sched_markers tracing/stopmachine-allcpus tracing/sysprof 
tracing/textedit x86/apic x86/apm x86/bitops x86/build x86/checkme 
x86/cleanups x86/cpa x86/cpu x86/defconfig x86/delay x86/gart x86/i8259 
x86/idle x86/intel x86/irq x86/irqstats x86/kconfig x86/ldt x86/mce 
x86/memtest x86/mmio x86/mpparse x86/nmi x86/numa x86/numa-fixes x86/pat 
x86/pebs x86/ptemask x86/resumetrace x86/scratch x86/setup x86/smpboot 
x86/threadinfo x86/timers x86/urgent x86/urgent-undo-ioapic x86/uv 
x86/vdso x86/xen x86/xsave

it failed miserably:

 warning: ignoring 066519068ad2fbe98c7f45552b1f592903a9c8c8; cannot 
 handle more than 25 refs
 [...]
 fatal: merge program failed
 Automated merge did not work.
 Should not be doing an Octopus.
 Merge with strategy octopus failed.

this wasn't even for purposes of an integration run: all i wanted to do 
was to pick up 2-3 new commits i have queued into 2-3 topic branches, 
into the (throw-away) integration branch. All the other branches were 
unmodified and already merged into the integration branch.

Hence i believe that the suggestions above by Git that i'm doing 
something wrong are ... wrong :-)

My scripting around this would be a lot faster (less than 10 seconds 
runtime versus a minute currently) and more robust if we could do such 
higher-order octopus merges.

	Ingo


* Re: git-rerere observations and feature suggestions
  2008-06-18 10:57     ` git-rerere observations and feature suggestions Ingo Molnar
@ 2008-06-18 11:29       ` Miklos Vajna
  2008-06-18 18:43         ` Ingo Molnar
  2008-06-18 11:36       ` Ingo Molnar
  2008-06-18 22:01       ` Jakub Narebski
  2 siblings, 1 reply; 45+ messages in thread
From: Miklos Vajna @ 2008-06-18 11:29 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Junio C Hamano, git


On Wed, Jun 18, 2008 at 12:57:31PM +0200, Ingo Molnar <mingo@elte.hu> wrote:
> just to demonstrate it, i tried today to do an octopus merge of 87 topic 
> branches:
> 
> git-merge build checkme core/checkme core/debugobjects core/futex-64bit 
> core/iter-div core/kill-the-BKL core/locking core/misc core/percpu 
> core/printk core/rcu core/rodata core/softirq core/softlockup 
> core/stacktrace core/topology core/urgent cpus4096 genirq kmemcheck 
> kmemcheck2 mm/xen out-of-tree pci-for-jesse safe-poison-pointers sched 
> sched-devel scratch stackprotector timers/clockevents timers/hpet 
> timers/hrtimers timers/nohz timers/posixtimers tip tracing/ftrace 
> tracing/ftrace-mergefixups tracing/immediates tracing/markers 
> tracing/mmiotrace tracing/mmiotrace-mergefixups tracing/nmisafe 
> tracing/sched_markers tracing/stopmachine-allcpus tracing/sysprof 
> tracing/textedit x86/apic x86/apm x86/bitops x86/build x86/checkme 
> x86/cleanups x86/cpa x86/cpu x86/defconfig x86/delay x86/gart x86/i8259 
> x86/idle x86/intel x86/irq x86/irqstats x86/kconfig x86/ldt x86/mce 
> x86/memtest x86/mmio x86/mpparse x86/nmi x86/numa x86/numa-fixes x86/pat 
> x86/pebs x86/ptemask x86/resumetrace x86/scratch x86/setup x86/smpboot 
> x86/threadinfo x86/timers x86/urgent x86/urgent-undo-ioapic x86/uv 
> x86/vdso x86/xen x86/xsave
> 
> it failed miserably:
> 
>  warning: ignoring 066519068ad2fbe98c7f45552b1f592903a9c8c8; cannot 
>  handle more than 25 refs

The upcoming builtin-merge won't have this problem. I have added a
testcase for this in my working branch:

http://repo.or.cz/w/git/vmiklos.git?a=commit;h=7eef40b3cd772692c6eb7520686300533f35f10c


* Re: git-rerere observations and feature suggestions
  2008-06-18 10:57     ` git-rerere observations and feature suggestions Ingo Molnar
  2008-06-18 11:29       ` Miklos Vajna
@ 2008-06-18 11:36       ` Ingo Molnar
  2008-06-18 22:01       ` Jakub Narebski
  2 siblings, 0 replies; 45+ messages in thread
From: Ingo Molnar @ 2008-06-18 11:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


* Ingo Molnar <mingo@elte.hu> wrote:

> just to demonstrate it, i tried today to do an octopus merge of 87 
> topic branches:
> 
> git-merge build checkme core/checkme core/debugobjects core/futex-64bit 
> core/iter-div core/kill-the-BKL core/locking core/misc core/percpu 
> core/printk core/rcu core/rodata core/softirq core/softlockup 
> core/stacktrace core/topology core/urgent cpus4096 genirq kmemcheck 
> kmemcheck2 mm/xen out-of-tree pci-for-jesse safe-poison-pointers sched 
> sched-devel scratch stackprotector timers/clockevents timers/hpet 
> timers/hrtimers timers/nohz timers/posixtimers tip tracing/ftrace 
> tracing/ftrace-mergefixups tracing/immediates tracing/markers 
> tracing/mmiotrace tracing/mmiotrace-mergefixups tracing/nmisafe 
> tracing/sched_markers tracing/stopmachine-allcpus tracing/sysprof 
> tracing/textedit x86/apic x86/apm x86/bitops x86/build x86/checkme 
> x86/cleanups x86/cpa x86/cpu x86/defconfig x86/delay x86/gart x86/i8259 
> x86/idle x86/intel x86/irq x86/irqstats x86/kconfig x86/ldt x86/mce 
> x86/memtest x86/mmio x86/mpparse x86/nmi x86/numa x86/numa-fixes x86/pat 
> x86/pebs x86/ptemask x86/resumetrace x86/scratch x86/setup x86/smpboot 
> x86/threadinfo x86/timers x86/urgent x86/urgent-undo-ioapic x86/uv 
> x86/vdso x86/xen x86/xsave
> 
> it failed miserably:
> 
>  warning: ignoring 066519068ad2fbe98c7f45552b1f592903a9c8c8; cannot 
>  handle more than 25 refs
>  [...]
>  fatal: merge program failed
>  Automated merge did not work.
>  Should not be doing an Octopus.
>  Merge with strategy octopus failed.
> 
> this wasn't even for purposes of an integration run: all i wanted to do 
> was to pick up 2-3 new commits i have queued into 2-3 topic branches, 
> into the (throw-away) integration branch. All the other branches were 
> unmodified and already merged into the integration branch.
> 
> Hence i believe that the suggestions above by Git that i'm doing 
> something wrong are ... wrong :-)
> 
> My scripting around this would be a lot faster (less than 10 seconds 
> runtime versus a minute currently) and more robust if we could do such 
> higher-order octopus merges.

some hard numbers. Doing a scripted loop of 80 git-merges takes 16.2 
seconds:

 earth4:~/tip> time ( for N in $(cat 11 12 13 14); do git-merge $N; done )
 [...]
 Already up-to-date.

 real    0m16.211s
 user    0m10.719s
 sys     0m5.604s

doing 4 octopus merges of 20 branches each takes 11.6 seconds:

 earth4:~/tip> time ( for N in 1 2 3 4; do git-merge $(cat 1$N); done )
 Already up-to-date. Yeeah!
 Already up-to-date. Yeeah!
 Already up-to-date. Yeeah!
 Already up-to-date. Yeeah!

 real    0m11.580s
 user    0m8.617s
 sys     0m2.895s

a 40% speedup - and would be another 10% faster with an order-of-80 
merge as well i think. Not to be sniffed at.
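
For reference, the arithmetic behind the "40%" figure, worked from the two
"real" timings above. This is just a small Python check; the variable names
are made up for the illustration.

```python
serial = 16.211   # seconds: 80 merges, one branch at a time
batched = 11.580  # seconds: 4 octopus merges of 20 branches each

speedup = serial / batched         # how many times faster the batched run is
time_saved = 1 - batched / serial  # fraction of wall-clock time saved

print(f"speedup: {speedup:.2f}x ({(speedup - 1) * 100:.0f}% faster)")
print(f"wall-clock time saved: {time_saved * 100:.1f}%")
```

So the batched run is about 1.40 times as fast, which is the 40% quoted
here, while the wall-clock time itself drops by roughly 28.6%.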

	Ingo


* Re: git-rerere observations and feature suggestions
  2008-06-18 11:29       ` Miklos Vajna
@ 2008-06-18 18:43         ` Ingo Molnar
  2008-06-18 19:53           ` Miklos Vajna
  0 siblings, 1 reply; 45+ messages in thread
From: Ingo Molnar @ 2008-06-18 18:43 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: Junio C Hamano, git


* Miklos Vajna <vmiklos@frugalware.org> wrote:

> On Wed, Jun 18, 2008 at 12:57:31PM +0200, Ingo Molnar <mingo@elte.hu> wrote:
> > just to demonstrate it, i tried today to do an octopus merge of 87 topic 
> > branches:
> > 
> > git-merge build checkme core/checkme core/debugobjects core/futex-64bit 
> > core/iter-div core/kill-the-BKL core/locking core/misc core/percpu 
> > core/printk core/rcu core/rodata core/softirq core/softlockup 
> > core/stacktrace core/topology core/urgent cpus4096 genirq kmemcheck 
> > kmemcheck2 mm/xen out-of-tree pci-for-jesse safe-poison-pointers sched 
> > sched-devel scratch stackprotector timers/clockevents timers/hpet 
> > timers/hrtimers timers/nohz timers/posixtimers tip tracing/ftrace 
> > tracing/ftrace-mergefixups tracing/immediates tracing/markers 
> > tracing/mmiotrace tracing/mmiotrace-mergefixups tracing/nmisafe 
> > tracing/sched_markers tracing/stopmachine-allcpus tracing/sysprof 
> > tracing/textedit x86/apic x86/apm x86/bitops x86/build x86/checkme 
> > x86/cleanups x86/cpa x86/cpu x86/defconfig x86/delay x86/gart x86/i8259 
> > x86/idle x86/intel x86/irq x86/irqstats x86/kconfig x86/ldt x86/mce 
> > x86/memtest x86/mmio x86/mpparse x86/nmi x86/numa x86/numa-fixes x86/pat 
> > x86/pebs x86/ptemask x86/resumetrace x86/scratch x86/setup x86/smpboot 
> > x86/threadinfo x86/timers x86/urgent x86/urgent-undo-ioapic x86/uv 
> > x86/vdso x86/xen x86/xsave
> > 
> > it failed miserably:
> > 
> >  warning: ignoring 066519068ad2fbe98c7f45552b1f592903a9c8c8; cannot 
> >  handle more than 25 refs
> 
> The upcoming builtin-merge won't have this problem. I have added a 
> testcase for this in my working branch:
> 
> http://repo.or.cz/w/git/vmiklos.git?a=commit;h=7eef40b3cd772692c6eb7520686300533f35f10c

cool, thanks a ton!

stupid question: does this mean that if i install the latest Git devel 
snapshot (v1.5.6-rc3-21-g8c6b578 or later), i'll be able to experiment 
around with it right now?

	Ingo


* Re: git-rerere observations and feature suggestions
  2008-06-18 18:43         ` Ingo Molnar
@ 2008-06-18 19:53           ` Miklos Vajna
  0 siblings, 0 replies; 45+ messages in thread
From: Miklos Vajna @ 2008-06-18 19:53 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Junio C Hamano, git


On Wed, Jun 18, 2008 at 08:43:29PM +0200, Ingo Molnar <mingo@elte.hu> wrote:
> cool, thanks a ton!
> 
> stupid question: does this mean that if i install the latest Git devel 
> snapshot (v1.5.6-rc3-21-g8c6b578 or later), i'll be able to experiment 
> around with it right now?

Nope. It is currently in the 'builtin-merge' branch of
git://repo.or.cz/git/vmiklos.git, and I'm working on getting it merged
after 1.5.6 is out.


* Re: git-rerere observations and feature suggestions
  2008-06-18 10:57     ` git-rerere observations and feature suggestions Ingo Molnar
  2008-06-18 11:29       ` Miklos Vajna
  2008-06-18 11:36       ` Ingo Molnar
@ 2008-06-18 22:01       ` Jakub Narebski
  2008-06-18 22:38         ` Miklos Vajna
  2 siblings, 1 reply; 45+ messages in thread
From: Jakub Narebski @ 2008-06-18 22:01 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Junio C Hamano, git

Ingo Molnar <mingo@elte.hu> writes:

> * Ingo Molnar <mingo@elte.hu> wrote:
> 
> > And while asking for an arm i'd also like to ask for a leg, if i may: 
> > i'd love it if a "slightly conflicting" octopus merge of 85 topic 
> > trees would not result in one huge conflict commit that merges 
> > together 1000 commits into a single commit ;-)
> > 
> > So right now in our -tip scripts work around this issue: we 
> > 'serialize' the topic merges despite having very nice opportunities 
> > for higher-order octopus merges. The integration would be a lot faster 
> > if we could use octopus merges and automated git-rerere. (Octopus 
> > merges would look much nicer as well in graphical representation as 
> > well, which counts too :-) )
> 
> just to demonstrate it, i tried today to do an octopus merge of 87 topic 
> branches:
> 
> git-merge build checkme core/checkme core/debugobjects core/futex-64bit 
> core/iter-div core/kill-the-BKL core/locking core/misc core/percpu 
> core/printk core/rcu core/rodata core/softirq core/softlockup 
> core/stacktrace core/topology core/urgent cpus4096 genirq kmemcheck 
> kmemcheck2 mm/xen out-of-tree pci-for-jesse safe-poison-pointers sched 
> sched-devel scratch stackprotector timers/clockevents timers/hpet 
> timers/hrtimers timers/nohz timers/posixtimers tip tracing/ftrace 
> tracing/ftrace-mergefixups tracing/immediates tracing/markers 
> tracing/mmiotrace tracing/mmiotrace-mergefixups tracing/nmisafe 
> tracing/sched_markers tracing/stopmachine-allcpus tracing/sysprof 
> tracing/textedit x86/apic x86/apm x86/bitops x86/build x86/checkme 
> x86/cleanups x86/cpa x86/cpu x86/defconfig x86/delay x86/gart x86/i8259 
> x86/idle x86/intel x86/irq x86/irqstats x86/kconfig x86/ldt x86/mce 
> x86/memtest x86/mmio x86/mpparse x86/nmi x86/numa x86/numa-fixes x86/pat 
> x86/pebs x86/ptemask x86/resumetrace x86/scratch x86/setup x86/smpboot 
> x86/threadinfo x86/timers x86/urgent x86/urgent-undo-ioapic x86/uv 
> x86/vdso x86/xen x86/xsave
> 
> it failed miserably:
> 
>  warning: ignoring 066519068ad2fbe98c7f45552b1f592903a9c8c8; cannot 
>  handle more than 25 refs
>  [...]
>  fatal: merge program failed
>  Automated merge did not work.
>  Should not be doing an Octopus.
>  Merge with strategy octopus failed.
> 
> this wasn't even for purposes of an integration run: all i wanted to do 
> was to pick up 2-3 new commits i have queued into 2-3 topic branches, 
> into the (throw-away) integration branch. All the other branches were 
> unmodified and already merged into the integration branch.
> 
> Hence i believe that the suggestions above by Git that i'm doing 
> something wrong are ... wrong :-)
> 
> My scripting around this would be a lot faster (less than 10 seconds 
> runtime versus a minute currently) and more robust if we could do such 
> higher-order octopus merges.

As part of a patch series introducing new fast-forward strategies
(--ff=never, --ff=only) there was a patch which did merge reduction
before selecting the merge strategy, by Sverre Hvammen Johansen
  "[PATCH 4/5] Head reduction before selecting merge strategy"
  http://thread.gmane.org/gmane.comp.version-control.git/80288/focus=80335
(I'm not sure if the link above is to the newest version of the patch series).

It is now part of the 'pu' branch, as commit 59171adb9c.  It didn't make
it into 'next' as it conflicts with the builtin merge by Miklos Vajna, which
(as he wrote) also includes head reduction.

So you would have to either compile git from the builtin-merge repository,
compile git from 'pu', just use git-merge.sh from the 'pu' branch, or
apply or cherry-pick the appropriate commit and compile git.

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: git-rerere observations and feature suggestions
  2008-06-18 22:01       ` Jakub Narebski
@ 2008-06-18 22:38         ` Miklos Vajna
  2008-06-19  7:23           ` Karl Hasselström
  0 siblings, 1 reply; 45+ messages in thread
From: Miklos Vajna @ 2008-06-18 22:38 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Ingo Molnar, Junio C Hamano, git


On Wed, Jun 18, 2008 at 03:01:24PM -0700, Jakub Narebski <jnareb@gmail.com> wrote:
> As part of a patch series introducing new fast-forward strategies
> (--ff=never, --ff=only) there was a patch which did merge reduction
> before selecting the merge strategy, by Sverre Hvammen Johansen
>   "[PATCH 4/5] Head reduction before selecting merge strategy"
>   http://thread.gmane.org/gmane.comp.version-control.git/80288/focus=80335
> (I'm not sure if the link above is to the newest version of the patch series).

Side note: builtin-merge does not have a problem with merging 25+ refs
even when every ref contains "new" commits.

The patch by Sverre Hvammen Johansen is useful if some of the refs have
no "new" commits, so it will help here, but I think it does not help in
all cases.
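
The idea of dropping refs without "new" commits can be sketched as below.
This is a hedged Python illustration of the general technique, not the code
from the patch or from builtin-merge: the function names and the dict-based
commit graph are assumptions made for the example.

```python
def ancestors(commit, parents):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def reduce_heads(head, candidates, parents):
    """Drop merge candidates that would bring in no new commits.

    head       -- the commit we are merging into
    candidates -- commits named on the merge command line
    parents    -- dict mapping each commit to a list of its parents
    """
    candidates = list(dict.fromkeys(candidates))  # drop exact duplicates
    kept = []
    for c in candidates:
        others = [head] + [o for o in candidates if o != c]
        # A candidate is redundant if it is already contained in HEAD or
        # in one of the other candidates.
        if any(c in ancestors(o, parents) for o in others):
            continue
        kept.append(c)
    return kept
```

In the scenario described earlier in the thread, where all but 2-3 of the
87 branches were already merged into the integration branch, such a
reduction would leave only those 2-3 heads. As noted above, it does not help
when every ref really does carry new commits.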


* Re: git-rerere observations and feature suggestions
  2008-06-18 22:38         ` Miklos Vajna
@ 2008-06-19  7:23           ` Karl Hasselström
  2008-06-19  7:29             ` Miklos Vajna
  2008-06-19  7:30             ` Junio C Hamano
  0 siblings, 2 replies; 45+ messages in thread
From: Karl Hasselström @ 2008-06-19  7:23 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: Jakub Narebski, Ingo Molnar, Junio C Hamano, git

On 2008-06-19 00:38:21 +0200, Miklos Vajna wrote:

> On Wed, Jun 18, 2008 at 03:01:24PM -0700, Jakub Narebski
> <jnareb@gmail.com> wrote:
>
> > As part of a patch series introducing new fast-forward strategies
> > (--ff=never, --ff=only) there was a patch which did merge reduction
> > before selecting the merge strategy, by Sverre Hvammen Johansen
> >   "[PATCH 4/5] Head reduction before selecting merge strategy"
> >   http://thread.gmane.org/gmane.comp.version-control.git/80288/focus=80335
> > (I'm not sure if the link above is to the newest version of the patch
> > series).
>
> Side note: builtin-merge does not have a problem with merging 25+ refs
> even when every ref contains "new" commits.

So how many parents can a commit have, exactly? Is there a hard limit
somewhere, or just a point beyond which some git tools will start
behaving strangely?

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle


* Re: git-rerere observations and feature suggestions
  2008-06-19  7:23           ` Karl Hasselström
@ 2008-06-19  7:29             ` Miklos Vajna
  2008-06-19  7:30             ` Junio C Hamano
  1 sibling, 0 replies; 45+ messages in thread
From: Miklos Vajna @ 2008-06-19  7:29 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: Jakub Narebski, Ingo Molnar, Junio C Hamano, git


On Thu, Jun 19, 2008 at 09:23:08AM +0200, Karl Hasselström <kha@treskal.com> wrote:
> > Side note: builtin-merge does not have a problem with merging 25+ refs
> > even when every ref contains "new" commits.
> 
> So how many parents can a commit have, exactly? Is there a hard limit
> somewhere, or just a point beyond which some git tools will start
> behaving strangely?

AFAIK there is no limit at a core level. git-show-branch has a limit of
25 refs (it can't show more than 25 refs at one time) and git-merge.sh
uses show-branch, while builtin-merge does not.


* Re: git-rerere observations and feature suggestions
  2008-06-19  7:23           ` Karl Hasselström
  2008-06-19  7:29             ` Miklos Vajna
@ 2008-06-19  7:30             ` Junio C Hamano
  2008-06-19  8:21               ` Karl Hasselström
  1 sibling, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2008-06-19  7:30 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: Miklos Vajna, Jakub Narebski, Ingo Molnar, git

Karl Hasselström <kha@treskal.com> writes:

> So how many parents can a commit have, exactly? Is there a hard limit
> somewhere, or just a point beyond which some git tools will start
> behaving strangely?

There is no hard limit at the data structure level.

git-commit-tree has a hard limit of accepting 16 parents.  git-blame has
the same 16-parent limit while following the history (but the one in
'next' has lifted the latter limitation).

But that is purely academic.  Anybody who does an octopus with more than 8
legs should get his head examined ;-).


* Re: git-rerere observations and feature suggestions
  2008-06-19  7:30             ` Junio C Hamano
@ 2008-06-19  8:21               ` Karl Hasselström
  2008-06-19  8:33                 ` Miklos Vajna
  0 siblings, 1 reply; 45+ messages in thread
From: Karl Hasselström @ 2008-06-19  8:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Miklos Vajna, Jakub Narebski, Ingo Molnar, git, Catalin Marinas

On 2008-06-19 00:30:43 -0700, Junio C Hamano wrote:

> Karl Hasselström <kha@treskal.com> writes:
>
> > So how many parents can a commit have, exactly? Is there a hard
> > limit somewhere, or just a point beyond which some git tools will
> > start behaving strangely?
>
> There is no hard limit at the data structure level.
>
> git-commit-tree has a hard limit of accepting 16 parents. git-blame
> has the same 16-parent limit while following the history (but the
> one in 'next' has lifted the latter limitation).

Thanks.

> But that is purely academic. Anybody who does an octopus with more
> than 8 legs should get his head examined ;-).

Catalin and I are tossing ideas around for how to represent the
history of an StGit patch stack (using a git commit for each log
entry). One complication is that we have to keep references to all
unapplied patches so that gc will leave them alone (and so that they
will get carried along during a pull, in the future). And the number
of unapplied patches is potentially large, so I thought we'd be going
to have to make a tree of "merge" commits to connect them all up.

(What we'd really like, of course, is a way to refer to a set of
commits such that they are guaranteed to be reachable (in the gc and
pull sense), but not considered "parents".)

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle


* Re: git-rerere observations and feature suggestions
  2008-06-19  8:21               ` Karl Hasselström
@ 2008-06-19  8:33                 ` Miklos Vajna
  2008-06-19  9:19                   ` Karl Hasselström
  0 siblings, 1 reply; 45+ messages in thread
From: Miklos Vajna @ 2008-06-19  8:33 UTC (permalink / raw)
  To: Karl Hasselström
  Cc: Junio C Hamano, Jakub Narebski, Ingo Molnar, git, Catalin Marinas


On Thu, Jun 19, 2008 at 10:21:56AM +0200, Karl Hasselström <kha@treskal.com> wrote:
> Catalin and I are tossing ideas around for how to represent the
> history of an StGit patch stack (using a git commit for each log
> entry). One complication is that we have to keep references to all
> unapplied patches so that gc will leave them alone (and so that they
> will get carried along during a pull, in the future). And the number
> of unapplied patches is potentially large, so I thought we'd be going
> to have to make a tree of "merge" commits to connect them all up.
> 
> (What we'd really like, of course, is a way to refer to a set of
> commits such that they are guaranteed to be reachable (in the gc and
> pull sense), but not considered "parents".)

I had a similar problem in git/vmiklos.git on repo.or.cz, while working
on builtin-rebase: I squash several patches using rebase -i before
sending a series, but it's nice to have the old long list of small
patches in case I would need them later.

What I did is to have a rebase-history branch: each commit in it is an
octopus merge:

- The first parent is the previous rebase-history ref

- The second is the old HEAD

- The third is the new HEAD

This way I can use git rebase -i without worrying about losing history,
even if reflogs are not shared among machines.

(It may or may not be a good idea to do something like this in StGit, I
just thought I'd share this idea here.)


* Re: git-rerere observations and feature suggestions
  2008-06-19  8:33                 ` Miklos Vajna
@ 2008-06-19  9:19                   ` Karl Hasselström
  2008-06-19 10:06                     ` Miklos Vajna
  0 siblings, 1 reply; 45+ messages in thread
From: Karl Hasselström @ 2008-06-19  9:19 UTC (permalink / raw)
  To: Miklos Vajna
  Cc: Junio C Hamano, Jakub Narebski, Ingo Molnar, git, Catalin Marinas

On 2008-06-19 10:33:56 +0200, Miklos Vajna wrote:

> On Thu, Jun 19, 2008 at 10:21:56AM +0200, Karl Hasselström
> <kha@treskal.com> wrote:
>
> > Catalin and I are tossing ideas around for how to represent the
> > history of an StGit patch stack (using a git commit for each log
> > entry). One complication is that we have to keep references to all
> > unapplied patches so that gc will leave them alone (and so that
> > they will get carried along during a pull, in the future). And the
> > number of unapplied patches is potentially large, so I thought
> > we'd be going to have to make a tree of "merge" commits to connect
> > them all up.
> >
> > (What we'd really like, of course, is a way to refer to a set of
> > commits such that they are guaranteed to be reachable (in the gc
> > and pull sense), but not considered "parents".)
>
> I had a similar problem in git/vmiklos.git on repo.or.cz, while
> working on builtin-rebase: I squash several patches using rebase -i
> before sending a series, but it's nice to have the old long list of
> small patches in case I would need them later.
>
> What I did is to have a rebase-history branch: each commit in it is
> an octopus merge:
>
> - The first parent is the previous rebase-history ref
>
> - The second is the old HEAD
>
> - The third is the new HEAD
>
> This way I can use git rebase -i without worrying about losing
> history, even if reflogs are not shared among machines.
>
> (It may or may not be a good idea to do something like this in
> StGit, I just thought I'd share this idea here.)

What you're describing is pretty much what we're thinking about doing
-- have a log branch where each commit contains enough metadata to
recreate the complete patch stack state at that point in time, and has
all the parents it needs to be safe from gc.

The particular problem I'm asking about here is that due to StGit's
concept of "unapplied" patches that are by definition not reachable
from the current branch head, a given log entry might have to keep an
unbounded number of commits from being gc'ed. Thus my question about
what would blow up if we were to make a commit with 50 parents. Or
100. Or 1000, if our users are crazy enough. (The alternative being,
of course, to make a tree of octopuses with a fixed maximum fan-out.)
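
To put rough numbers on the fixed fan-out alternative, here is a small
arithmetic sketch. The function name octopus_tree_size and the fan-out
of 16 in the example are assumptions chosen for illustration; nothing
in this thread settles on a particular limit.

```shell
#!/bin/sh
# octopus_tree_size N K: print "<merge-commits> <depth>" needed to keep N
# commits reachable from a single root when no commit may have more than
# K parents.  Pure arithmetic; no git objects are created.
octopus_tree_size () {
    n=$1 k=$2 total=0 depth=0
    if [ "$k" -lt 2 ]
    then
        echo "fan-out must be at least 2" >&2
        return 1
    fi
    while :
    do
        # One layer of merge commits, each covering up to K commits below.
        n=$(( (n + k - 1) / k ))
        total=$(( total + n ))
        depth=$(( depth + 1 ))
        [ "$n" -le 1 ] && break
    done
    echo "$total $depth"
}
```

For example, "octopus_tree_size 1000 16" prints "68 3": 63 + 4 + 1
merge commits arranged three levels deep, so even a very large set of
unapplied patches costs only a modest number of extra commits.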

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

* Re: git-rerere observations and feature suggestions
  2008-06-19  9:19                   ` Karl Hasselström
@ 2008-06-19 10:06                     ` Miklos Vajna
  2008-06-19 10:35                       ` Karl Hasselström
  0 siblings, 1 reply; 45+ messages in thread
From: Miklos Vajna @ 2008-06-19 10:06 UTC (permalink / raw)
  To: Karl Hasselström
  Cc: Junio C Hamano, Jakub Narebski, Ingo Molnar, git, Catalin Marinas

On Thu, Jun 19, 2008 at 11:19:03AM +0200, Karl Hasselström <kha@treskal.com> wrote:
> What you're describing is pretty much what we're thinking about doing
> -- have a log branch where each commit contains enough metadata to
> recreate the complete patch stack state at that point in time, and has
> all the parents it needs to be safe from gc.
> 
> The particular problem I'm asking about here is that due to StGit's
> concept of "unapplied" patches that are by definition not reachable
> from the current branch head, a given log entry might have to keep an
> unbounded number of commits from being gc'ed. Thus my question about
> what would blow up if we were to make a commit with 50 parents. Or
> 100. Or 1000, if our users are crazy enough. (The alternative being,
> of course, to make a tree of octopuses with a fixed maximum fan-out.)

I may be missing something, but you have (at least) two options to store
"patches".

You can store them as a blob, make a tree of them and make a commit in
the log branch point to the tree. This one has the advantage of being
able to do a 'git log' on a particular patch of the patch set.

The other one is to create n+1 trees (and commits, where the first
commit has no parent) for n patches, and point to the last commit from
the log branch.

* Re: git-rerere observations and feature suggestions
  2008-06-19 10:06                     ` Miklos Vajna
@ 2008-06-19 10:35                       ` Karl Hasselström
  0 siblings, 0 replies; 45+ messages in thread
From: Karl Hasselström @ 2008-06-19 10:35 UTC (permalink / raw)
  To: Miklos Vajna
  Cc: Junio C Hamano, Jakub Narebski, Ingo Molnar, git, Catalin Marinas

On 2008-06-19 12:06:37 +0200, Miklos Vajna wrote:

> You can store them as a blob, make a tree of them and make a commit
> in the log branch point to the tree. This one has the advantage of
> being able to do a 'git log' on a particular patch of the patch set.

If I don't store the pre or post tree in its entirety, I lose the
ability to do patch application by three-way merge. (The current StGit
design assumes that we can always make a three-way merge as a last
resort when applying patches. Basically, StGit is just a fancy way to
rebase.)

But yes, this is a viable idea. (Though once I have to store one of
the trees, I believe it's actually simpler and cheaper to just store
the other tree as well, instead of having to compute the diff and
store that in a blob.)

> The other one is to create n+1 trees (and commits, where the first
> commit has no parent) for n patches, and point to the last commit
> from the log branch.

There's actually no point in making more than one commit. A tree can
easily hold a lot of sub-trees.

I have an existing implementation that stores the pre and post tree
for each patch, plus some metadata (message, author). The issue with
this format is that every time we write a new log entry (that is, for
every StGit command), we have to call git multiple times in order to
write several new trees and blobs.

StGit normally represents each patch by a commit object, so it should
be faster to simply write a single new commit to the log that has some
metadata in its commit message and just refers to all the patches'
commit objects (by having them as parents). Which is why I was
inquiring about the maximum number of parents of a commit object.

( Some background: At a given point in time, your StGit stack consists
  of a few applied patches, and a few unapplied patches. The applied
  patches are just a linear sequence of commits at the top of your
  current branch, so we can trivially save them all from the garbage
  collector by making the stack top a parent of our log commit. The
  unapplied patches, however, are commits that are not reachable from
  the stack top -- they can be "pushed" onto the stack by rebasing, at
  which point they become applied, but until then we can't make any
  assumptions about them being ancestors of anything. So a log commit
  potentially has to have _every_ unapplied patch as a parent. (If we
  know that the commit of an unapplied patch used to be applied, we
  know that it's reachable from previous log commits, but we don't
  always know that.) )
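
The single-commit log entry described above could look roughly like the
following sketch. It is not StGit's actual format: the function name
write_log_entry, the argument order and the commit message are all
assumptions made for illustration.

```shell
#!/bin/sh
# write_log_entry PREV_LOG STACK_TOP [UNAPPLIED...]
# Print the id of a new log commit whose parents are the previous log
# commit (PREV_LOG may be empty), the stack top, and every unapplied
# patch commit.  (Hypothetical helper, not part of StGit.)
write_log_entry () {
    prev=$1 top=$2
    shift 2
    tree=$(git rev-parse --verify "$top^{tree}") || return 1

    # Rewrite the positional parameters from "c1 c2 ..." into
    # "-p c1 -p c2 ...": append a pair, then drop the original word.
    for c in "$@"
    do
        set -- "$@" -p "$c"
        shift
    done

    if [ -n "$prev" ]
    then
        set -- -p "$prev" -p "$top" "$@"
    else
        set -- -p "$top" "$@"
    fi
    git commit-tree "$tree" "$@" -m "stack log entry"
}
```

The applied patches need only the stack top as a parent, while each
unapplied patch contributes one "-p" of its own, which is where the
question about the maximum number of parents comes from.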

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

* [PATCH 1/5] rerere: rerere_created_at() and has_resolution() abstraction
  2008-06-16 20:50     ` Junio C Hamano
@ 2008-06-22  9:47       ` Junio C Hamano
  2008-06-22  9:47       ` [PATCH 2/5] git-rerere: detect unparsable conflicts Junio C Hamano
                         ` (3 subsequent siblings)
  4 siblings, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-06-22  9:47 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git

There were too many places in the code that hard-coded what an entry in
the rerere database looks like, and the garbage_collect() function that
iterates over subdirectories of the rr-cache directory was the worst
offender.

Introduce two helper functions, rerere_created_at() and has_resolution(),
to abstract out the logic a bit better.

Incidentally this fixes a small memory leak in the garbage_collect()
function.  The path list that collects the entries to be pruned was defined
to strdup the paths, but the caller was feeding it a path after making an
extra copy.  Because the list does not have to be sorted by conflict
signature hash, we use path_list_append() instead of path_list_insert().

While we are at it, make the conflicted hunk comparison in handle_file() a
bit easier to read.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 * So this is a series to handle the first point in the message I am
   replying to.

 builtin-rerere.c |   57 ++++++++++++++++++++++++-----------------------------
 1 files changed, 26 insertions(+), 31 deletions(-)

diff --git a/builtin-rerere.c b/builtin-rerere.c
index 85222d9..610b96a 100644
--- a/builtin-rerere.c
+++ b/builtin-rerere.c
@@ -23,6 +23,18 @@ static const char *rr_path(const char *name, const char *file)
 	return git_path("rr-cache/%s/%s", name, file);
 }
 
+static time_t rerere_created_at(const char *name)
+{
+	struct stat st;
+	return stat(rr_path(name, "preimage"), &st) ? (time_t) 0 : st.st_mtime;
+}
+
+static int has_resolution(const char *name)
+{
+	struct stat st;
+	return !stat(rr_path(name, "postimage"), &st);
+}
+
 static void read_rr(struct path_list *rr)
 {
 	unsigned char sha1[20];
@@ -98,13 +110,10 @@ static int handle_file(const char *path,
 		else if (!prefixcmp(buf, "======="))
 			hunk = 2;
 		else if (!prefixcmp(buf, ">>>>>>> ")) {
-			int cmp = strbuf_cmp(&one, &two);
-
+			if (strbuf_cmp(&one, &two) > 0)
+				strbuf_swap(&one, &two);
 			hunk_no++;
 			hunk = 0;
-			if (cmp > 0) {
-				strbuf_swap(&one, &two);
-			}
 			if (out) {
 				fputs("<<<<<<<\n", out);
 				fwrite(one.buf, one.len, 1, out);
@@ -201,33 +210,24 @@ static void unlink_rr_item(const char *name)
 static void garbage_collect(struct path_list *rr)
 {
 	struct path_list to_remove = { NULL, 0, 0, 1 };
-	char buf[1024];
 	DIR *dir;
 	struct dirent *e;
-	int len, i, cutoff;
+	int i, cutoff;
 	time_t now = time(NULL), then;
 
-	strlcpy(buf, git_path("rr-cache"), sizeof(buf));
-	len = strlen(buf);
-	dir = opendir(buf);
-	strcpy(buf + len++, "/");
+	dir = opendir(git_path("rr-cache"));
 	while ((e = readdir(dir))) {
 		const char *name = e->d_name;
-		struct stat st;
-		if (name[0] == '.' && (name[1] == '\0' ||
-					(name[1] == '.' && name[2] == '\0')))
+		if (name[0] == '.' &&
+		    (name[1] == '\0' || (name[1] == '.' && name[2] == '\0')))
 			continue;
-		i = snprintf(buf + len, sizeof(buf) - len, "%s", name);
-		strlcpy(buf + len + i, "/preimage", sizeof(buf) - len - i);
-		if (stat(buf, &st))
+		then = rerere_created_at(name);
+		if (!then)
 			continue;
-		then = st.st_mtime;
-		strlcpy(buf + len + i, "/postimage", sizeof(buf) - len - i);
-		cutoff = stat(buf, &st) ? cutoff_noresolve : cutoff_resolve;
-		if (then < now - cutoff * 86400) {
-			buf[len + i] = '\0';
-			path_list_insert(xstrdup(name), &to_remove);
-		}
+		cutoff = (has_resolution(name)
+			  ? cutoff_resolve : cutoff_noresolve);
+		if (then < now - cutoff * 86400)
+			path_list_append(name, &to_remove);
 	}
 	for (i = 0; i < to_remove.nr; i++)
 		unlink_rr_item(to_remove.items[i].path);
@@ -306,13 +306,11 @@ static int do_plain_rerere(struct path_list *rr, int fd)
 	 */
 
 	for (i = 0; i < rr->nr; i++) {
-		struct stat st;
 		int ret;
 		const char *path = rr->items[i].path;
 		const char *name = (const char *)rr->items[i].util;
 
-		if (!stat(rr_path(name, "preimage"), &st) &&
-				!stat(rr_path(name, "postimage"), &st)) {
+		if (has_resolution(name)) {
 			if (!merge(name, path)) {
 				fprintf(stderr, "Resolved '%s' using "
 						"previous resolution.\n", path);
@@ -410,11 +408,8 @@ int cmd_rerere(int argc, const char **argv, const char *prefix)
 		return do_plain_rerere(&merge_rr, fd);
 	else if (!strcmp(argv[1], "clear")) {
 		for (i = 0; i < merge_rr.nr; i++) {
-			struct stat st;
 			const char *name = (const char *)merge_rr.items[i].util;
-			if (!stat(git_path("rr-cache/%s", name), &st) &&
-					S_ISDIR(st.st_mode) &&
-					stat(rr_path(name, "postimage"), &st))
+			if (!has_resolution(name))
 				unlink_rr_item(name);
 		}
 		unlink(merge_rr_path);
-- 
1.5.6.12.g73f03
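
For readers who want to inspect the rerere database by hand, the two
helpers introduced by this patch have a simple shell equivalent. The
sketch below is an illustration only; the function names rr_list and
rr_has_resolution are made up, and it assumes the rr-cache layout that
the patch relies on (one directory per conflict, holding "preimage" and
optionally "postimage").

```shell
#!/bin/sh
# rr_has_resolution CACHE NAME: succeed if the entry has a postimage,
# like has_resolution() in the patch.
rr_has_resolution () {
    test -e "$1/$2/postimage"
}

# rr_list CACHE: print "<name> resolved" or "<name> unresolved" for each
# entry that has a preimage.  Directories without one are skipped, the
# same way garbage_collect() ignores them.
rr_list () {
    for dir in "$1"/*/
    do
        test -e "${dir}preimage" || continue
        name=$(basename "$dir")
        if rr_has_resolution "$1" "$name"
        then
            echo "$name resolved"
        else
            echo "$name unresolved"
        fi
    done
}
```

A typical call would be rr_list "$(git rev-parse --git-dir)/rr-cache".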

* [PATCH 2/5] git-rerere: detect unparsable conflicts
  2008-06-16 20:50     ` Junio C Hamano
  2008-06-22  9:47       ` [PATCH 1/5] rerere: rerere_created_at() and has_resolution() abstraction Junio C Hamano
@ 2008-06-22  9:47       ` Junio C Hamano
  2008-06-22  9:47       ` [PATCH 3/5] rerere: remove dubious "tail_optimization" Junio C Hamano
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-06-22  9:47 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git

rerere did not detect the case where <<< === >>> markers did not match.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-rerere.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/builtin-rerere.c b/builtin-rerere.c
index 610b96a..addc5c7 100644
--- a/builtin-rerere.c
+++ b/builtin-rerere.c
@@ -144,6 +144,11 @@ static int handle_file(const char *path,
 		fclose(out);
 	if (sha1)
 		SHA1_Final(sha1, &ctx);
+	if (hunk) {
+		if (output)
+			unlink(output);
+		return error("Could not parse conflict hunks in %s", path);
+	}
 	return hunk_no;
 }
 
-- 
1.5.6.12.g73f03
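
The check added here amounts to "the file must not end in the middle of
a hunk". A simplified shell rendering of that state machine, using awk,
is sketched below; the function name count_conflict_hunks is an
assumption, and the sketch ignores everything else handle_file() does
(hashing, normalising the two sides, writing the preimage).

```shell
#!/bin/sh
# count_conflict_hunks FILE: print the number of complete conflict hunks.
# Fails (non-zero exit, nothing printed) when the file ends while still
# inside a hunk, which is the case the patch reports as unparsable.
count_conflict_hunks () {
    awk '
        /^<<<<<<< / { state = 1; next }              # first side starts
        /^=======/  { state = 2; next }              # second side starts
        /^>>>>>>> / { hunks++; state = 0; next }     # hunk complete
        END {
            if (state != 0)
                exit 1
            print hunks + 0
        }
    ' "$1"
}
```

Like the C code, this treats any line starting with "=======" as a
separator, so a file that legitimately contains such a line outside a
conflict is reported as unparsable as well.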

* [PATCH 3/5] rerere: remove dubious "tail_optimization"
  2008-06-16 20:50     ` Junio C Hamano
  2008-06-22  9:47       ` [PATCH 1/5] rerere: rerere_created_at() and has_resolution() abstraction Junio C Hamano
  2008-06-22  9:47       ` [PATCH 2/5] git-rerere: detect unparsable conflicts Junio C Hamano
@ 2008-06-22  9:47       ` Junio C Hamano
  2008-06-22  9:48       ` [PATCH 4/5] t4200: fix rerere test Junio C Hamano
  2008-06-22  9:48       ` [PATCH 5/5] rerere.autoupdate Junio C Hamano
  4 siblings, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-06-22  9:47 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git

It is dubious if it is cheaper to shift entries repeatedly using memmove()
to collect entries that need to be written out at the front of an array than
simply marking the entries to be skipped.  In addition, the label called this
"tail optimization", but this obviously is not what people usually mean by
that name.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-rerere.c |   19 +++++++++----------
 1 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/builtin-rerere.c b/builtin-rerere.c
index addc5c7..0eec1f9 100644
--- a/builtin-rerere.c
+++ b/builtin-rerere.c
@@ -66,8 +66,12 @@ static int write_rr(struct path_list *rr, int out_fd)
 {
 	int i;
 	for (i = 0; i < rr->nr; i++) {
-		const char *path = rr->items[i].path;
-		int length = strlen(path) + 1;
+		const char *path;
+		int length;
+		if (!rr->items[i].util)
+			continue;
+		path = rr->items[i].path;
+		length = strlen(path) + 1;
 		if (write_in_full(out_fd, rr->items[i].util, 40) != 40 ||
 		    write_in_full(out_fd, "\t", 1) != 1 ||
 		    write_in_full(out_fd, path, length) != length)
@@ -319,7 +323,7 @@ static int do_plain_rerere(struct path_list *rr, int fd)
 			if (!merge(name, path)) {
 				fprintf(stderr, "Resolved '%s' using "
 						"previous resolution.\n", path);
-				goto tail_optimization;
+				goto mark_resolved;
 			}
 		}
 
@@ -330,13 +334,8 @@ static int do_plain_rerere(struct path_list *rr, int fd)
 
 		fprintf(stderr, "Recorded resolution for '%s'.\n", path);
 		copy_file(rr_path(name, "postimage"), path, 0666);
-tail_optimization:
-		if (i < rr->nr - 1)
-			memmove(rr->items + i,
-				rr->items + i + 1,
-				sizeof(rr->items[0]) * (rr->nr - i - 1));
-		rr->nr--;
-		i--;
+	mark_resolved:
+		rr->items[i].util = NULL;
 	}
 
 	return write_rr(rr, fd);
-- 
1.5.6.12.g73f03

* [PATCH 4/5] t4200: fix rerere test
  2008-06-16 20:50     ` Junio C Hamano
                         ` (2 preceding siblings ...)
  2008-06-22  9:47       ` [PATCH 3/5] rerere: remove dubious "tail_optimization" Junio C Hamano
@ 2008-06-22  9:48       ` Junio C Hamano
  2008-06-22  9:48       ` [PATCH 5/5] rerere.autoupdate Junio C Hamano
  4 siblings, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-06-22  9:48 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git

The test used "diff-files -q" which is not about reporting if there is
a difference at all.  Instead, make sure that the path remains as
conflicting in the index after rerere autoresolves it, as we will be
adding rerere.autoupdate configuration with the next patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t4200-rerere.sh |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index 85d7e3e..afb3e3d 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -193,9 +193,9 @@ test_expect_success 'resolution was recorded properly' '
 	echo Bello > file3 &&
 	git add file3 &&
 	git commit -m version2 &&
-	! git merge fifth &&
-	git diff-files -q &&
-	test Cello = "$(cat file3)"
+	test_must_fail git merge fifth &&
+	test Cello = "$(cat file3)" &&
+	test 0 != $(git ls-files -u | wc -l)
 '
 
 test_done
-- 
1.5.6.12.g73f03

* [PATCH 5/5] rerere.autoupdate
  2008-06-16 20:50     ` Junio C Hamano
                         ` (3 preceding siblings ...)
  2008-06-22  9:48       ` [PATCH 4/5] t4200: fix rerere test Junio C Hamano
@ 2008-06-22  9:48       ` Junio C Hamano
  4 siblings, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-06-22  9:48 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: git

When this configuration is set, paths that are autoresolved by git-rerere
are updated in the index as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt |    5 +++++
 builtin-rerere.c         |   37 +++++++++++++++++++++++++++++++++++++
 t/t4200-rerere.sh        |   10 ++++++++++
 3 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5331b45..0c7cf61 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -650,6 +650,11 @@ gc.rerereunresolved::
 	kept for this many days when `git rerere gc` is run.
 	The default is 15 days.  See linkgit:git-rerere[1].
 
+rerere.autoupdate::
+	When set to true, `git-rerere` updates the index with the
+	resulting contents after it cleanly resolves conflicts using
+	previously recorded resolution.  Defaults to false.
+
 rerere.enabled::
 	Activate recording of resolved conflicts, so that identical
 	conflict hunks can be resolved automatically, should they
diff --git a/builtin-rerere.c b/builtin-rerere.c
index 0eec1f9..839b26e 100644
--- a/builtin-rerere.c
+++ b/builtin-rerere.c
@@ -16,6 +16,9 @@ static int cutoff_resolve = 60;
 /* if rerere_enabled == -1, fall back to detection of .git/rr-cache */
 static int rerere_enabled = -1;
 
+/* automatically update cleanly resolved paths to the index */
+static int rerere_autoupdate;
+
 static char *merge_rr_path;
 
 static const char *rr_path(const char *name, const char *file)
@@ -276,9 +279,36 @@ static int diff_two(const char *file1, const char *label1,
 	return 0;
 }
 
+static struct lock_file index_lock;
+
+static int update_paths(struct path_list *update)
+{
+	int i;
+	int fd = hold_locked_index(&index_lock, 0);
+	int status = 0;
+
+	if (fd < 0)
+		return -1;
+
+	for (i = 0; i < update->nr; i++) {
+		struct path_list_item *item = &update->items[i];
+		if (add_file_to_cache(item->path, ADD_CACHE_IGNORE_ERRORS))
+			status = -1;
+	}
+
+	if (!status && active_cache_changed) {
+		if (write_cache(fd, active_cache, active_nr) ||
+		    commit_locked_index(&index_lock))
+			die("Unable to write new index file");
+	} else if (fd >= 0)
+		rollback_lock_file(&index_lock);
+	return status;
+}
+
 static int do_plain_rerere(struct path_list *rr, int fd)
 {
 	struct path_list conflict = { NULL, 0, 0, 1 };
+	struct path_list update = { NULL, 0, 0, 1 };
 	int i;
 
 	find_conflict(&conflict);
@@ -323,6 +353,8 @@ static int do_plain_rerere(struct path_list *rr, int fd)
 			if (!merge(name, path)) {
 				fprintf(stderr, "Resolved '%s' using "
 						"previous resolution.\n", path);
+				if (rerere_autoupdate)
+					path_list_insert(path, &update);
 				goto mark_resolved;
 			}
 		}
@@ -338,6 +370,9 @@ static int do_plain_rerere(struct path_list *rr, int fd)
 		rr->items[i].util = NULL;
 	}
 
+	if (update.nr)
+		update_paths(&update);
+
 	return write_rr(rr, fd);
 }
 
@@ -349,6 +384,8 @@ static int git_rerere_config(const char *var, const char *value, void *cb)
 		cutoff_noresolve = git_config_int(var, value);
 	else if (!strcmp(var, "rerere.enabled"))
 		rerere_enabled = git_config_bool(var, value);
+	else if (!strcmp(var, "rerere.autoupdate"))
+		rerere_autoupdate = git_config_bool(var, value);
 	else
 		return git_default_config(var, value, cb);
 	return 0;
diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index afb3e3d..a64727d 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -193,9 +193,19 @@ test_expect_success 'resolution was recorded properly' '
 	echo Bello > file3 &&
 	git add file3 &&
 	git commit -m version2 &&
+	git tag version2 &&
 	test_must_fail git merge fifth &&
 	test Cello = "$(cat file3)" &&
 	test 0 != $(git ls-files -u | wc -l)
 '
 
+test_expect_success 'rerere.autoupdate' '
+	git config rerere.autoupdate true &&
+	git reset --hard &&
+	git checkout version2 &&
+	test_must_fail git merge fifth &&
+	test 0 = $(git ls-files -u | wc -l)
+
+'
+
 test_done
-- 
1.5.6.12.g73f03

* Re: git-rerere observations and feature suggestions
  2008-06-16 18:46 ` Junio C Hamano
  2008-06-16 19:09   ` Ingo Molnar
  2008-06-16 19:10   ` Junio C Hamano
@ 2008-06-23  9:49   ` Ingo Molnar
  2008-06-23 14:19     ` Peter Zijlstra
  2008-06-23 15:12     ` Jeff King
  2 siblings, 2 replies; 45+ messages in thread
From: Ingo Molnar @ 2008-06-23  9:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Peter Zijlstra, Chris Mason, Thomas Gleixner

another git-rerere observation: occasionally it happens that i 
accidentally commit a merge marker into the source code.

That's obviously stupid, and it normally gets found by testing quickly, 
but still it would be a really useful avoid-shoot-self-in-foot feature 
if git-commit could warn about such stupidities of mine.

( and if i could configure git-commit to outright reject a commit like 
  that - i never want to commit lines with <<<<<< or >>>>> markers)
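
Until git-commit does this itself, a pre-commit hook can approximate
it. The following is a minimal sketch, not an existing git feature: the
function names has_conflict_markers and precommit_check are
assumptions, only full 7-character "<<<<<<<" and ">>>>>>>" markers are
matched, and the files are checked as they appear in the work tree
rather than as staged.

```shell
#!/bin/sh
# has_conflict_markers FILE...: print every line that starts with a
# conflict marker and succeed if there is at least one.  /dev/null is
# added so grep always prefixes matches with the file name.
has_conflict_markers () {
    grep -n -E '^(<<<<<<<|>>>>>>>)( |$)' "$@" /dev/null
}

# precommit_check: intended body of .git/hooks/pre-commit.  Looks at the
# added, copied and modified paths in the index and refuses the commit
# if any of them still carries a marker.
precommit_check () {
    bad=$(git diff --cached --name-only --diff-filter=ACM |
        while IFS= read -r path
        do
            has_conflict_markers "$path"
        done)
    if [ -n "$bad" ]
    then
        echo "conflict markers found, commit rejected:" >&2
        echo "$bad" >&2
        return 1
    fi
}
```

Saved as .git/hooks/pre-commit with a final line calling
precommit_check, a non-zero exit makes git-commit abort. The git-diff
documentation also describes --check as warning about conflict markers,
so "git diff --cached --check" may be a simpler alternative in the hook.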

Another merge conflict observation is that Git is much worse at figuring 
out the right merge resolution than our previous Quilt based workflow 
was. I eventually found it to be mainly due to the following detail: 
sometimes it's more useful to first apply the merged branch and then 
attempt to merge HEAD, as a patch.

I've got a script for that which also combines it with the "rej" tool, 
and in about 70%-80% of the cases where Git is unable to resolve a merge 
automatically it figures things out. ('rej' is obviously a more relaxed 
merge utility, but it's fairly robust in my experience, with a very low 
false positive rate.)

The ad-hoc "tip-mergetool" script we are using is attached below. It's 
really just for demonstration purposes - it doesn't work when there's a 
rename related conflict, etc.

Peter Zijlstra also wrote a git-mergetool extension for the 'rej' tool 
btw., he might want to post that patch. I've attached Chris Mason's rej 
tool too.

	Ingo


[ "$#" = 0 ] && {
  SRC=`git-ls-files -u | cut -f2 | head -1`
} || {
  SRC=`git-ls-files -u | grep $1 | cut -f2 | head -1`
}

[ "$SRC" = "" -o ! -f "$SRC" ] && { echo "$1 has no conflicts!"; exit -1; }

SRC_SED=`echo $SRC | sed 's/\//\\\\\//g'`

SHA_1=`git-ls-files -u | grep $SRC | grep '^.* .* 1\>' | cut -d' ' -f2`
SHA_2=`git-ls-files -u | grep $SRC | grep '^.* .* 2\>' | cut -d' ' -f2`
SHA_3=`git-ls-files -u | grep $SRC | grep '^.* .* 3\>' | cut -d' ' -f2`

mv -b $SRC $SRC.automerge             || { echo error1; exit -1; }

git-diff $SHA_1 $SHA_2 | sed "s/$SHA_1/$SRC_SED/g" |
   sed "s/$SHA_2/$SRC_SED/g" > $SRC.diff-v1 || { echo error2; exit -1; }

git-diff $SHA_1 $SHA_3 | sed "s/$SHA_1/$SRC_SED/g" |
   sed "s/$SHA_3/$SRC_SED/g" > $SRC.diff-v2 || { echo error3; exit -1; }

git-cat-file -p $SHA_1 > $SRC         || { echo error4; exit -1; }

ls -l $SRC.automerge $SRC $SRC.diff-v1 $SRC.diff-v2

patch -p1 < $SRC.diff-v2 || { echo error5; exit -1; }
patch -p1 < $SRC.diff-v1 || {
    echo "reject file ..."
    ls -l $SRC.rej
    echo "trying auto-merge ..."
    OK=$(rej -dry-run $SRC.rej 2>&1 | grep ', 0 conflicts remain')
    [ "$OK" = "" ] && { echo "$OK"; exit -1; }
    rej -a $SRC.rej
}

echo "adding $SRC to the commit"
git-add $SRC
echo 'if happy with the result, do: git-commit -m "Manual merge of conflicts."'

echo "merge successful!"
exit 0


[-- Attachment #2: rej --]
[-- Type: text/plain, Size: 29404 bytes --]

#!/usr/bin/perl
# Reject merging program, released under GPLv2
# Contact Chris Mason <mason@suse.com> with bugs or patches
#

use strict;
use Getopt::Long qw(:config no_ignore_case);
use File::Temp;
use IO::File;
use POSIX ":sys_wait_h";

my @file; 			# array with the contents of the file
my @merged_file; 		# temporary copy of the file for merging
my $diff_mode = 0;		# running a multi-file diff instead of a reject
my @hunks;			# array of hashes for all the hunks
my $hunk_num = 0;  		# total number of hunks
my $rejfh;			# handle for the reject file
my $filefh;			# handle for the file
my $mergefh;			# handle for the merge stream
my $VERSION = "0.15";
my $context = 0;		# prefer context from the reject?
my $merge_prog = "gvimdiff";    # merge program to run
my $source_file;		# name of the source file
my $reject_file;		# name of the reject file
my $orig_reject_file;		# original name of the reject file
my $output_file;		# name of merge output (skips merge prog)
my $auto;			# auto mode, put results into source file
my $interactive = 0;		# go into command mode
my $editor = "gvim";		# selected editor for opening the reject
my $open_reject = 0;		# should the reject be opened?
my $report_only = 0;		# only try to place the hunks
my $exit_value = 0;		# our return value;
my $fully_matched_hunks = 0;	# lines that match the -/+ lines in hunk
my $strip_level = 0;		# how many path components to strip in diffmode
my $last_diff_line;		# for buffering input lines
my $total_hunks;		# total hunks found
my $reverse_patch = 0;		# should we reverse the input patch?
my $no_forward_search = 0;      # only try to match reverse hunks
my $quick = 0;			# stop trying after the first conflict
my $skip_merge = 0;		# don't run the merge prog at all
my $global_matched_hunks = 0;   # total for whole diff
my $global_reject_hunks = 0;	# 
my $global_conflicts = 0;	# number of conflicts for the last rej run


# string equality ignoring whitespace
# the first string has a leading control char from the reject,
# either ' ', '-' or '+'
sub fuzzy_eq($$) {
    my ($hunk, $file) = @_;
    my $line;

    $line = $hunk;
    $line =~ s/^.//;

    if ($line eq $file) {
        return 1;
    }
    # strip all whitespace and try again
    $line =~ s/\s//g;
    $file =~ s/\s//g;
    if ($line eq $file) {
        return 1;
    }
    # no match, even when ignoring whitespace
    return 0;
}

# places the aligned hunk into the merge stream.
# in context mode, or if this was a forward match
# all context is taken from the reject file
#
# otherwise, this tries to take as much context as possible from the source
# file.
sub place_hunk($$$) {
    my ($hunk, $start, $end) = @_;
    my $fhunk = \@{$hunk->{'fwhunk'}};
    my $rhunk = $hunk->{'aligned_hunk'};
    my $size = scalar(@file);
    my $i;
    my $k;
    my $merged_index = 0;
    my $hunk_index = 0;
    my $hunk_len = scalar(@$fhunk);
    my $line;
    my $file_hunk = $hunk->{'file_hunk'};
    my @tmp_hunk;
    my $file_index;
    my $c;
    my $fline;

    @merged_file = ();
    # check for a change that is already applied.  print a warning as well
    # since this is not a 100% reliable check.
    #
    if ($hunk->{'method'} eq "forward" && 
        $hunk->{'hunk_match'} == count_hunk_changed($hunk, "forward")) {
	print STDERR "WARNING: skipping hunk already applied: $hunk->{'desc'}";
        return;
    }
    # try to maximize context from the reject file 
    # if we matched the forward hunk, it means the file
    # already had some form of the change applied.  Don't
    # try to be smart, just use the context from the reject
    if ($hunk->{'options'} =~ m/context/ || $context || 
        $hunk->{'method'} eq "forward") {
        @merged_file = @file [ 0 .. $start - 1];
	
	foreach $line (@$fhunk) {
	    $line =~ s/^.//;
	    push @merged_file, $line;
	}
	push @merged_file, @file[$end+1 .. scalar(@file)];
	@file = @merged_file;
	return;
    }
    # align to the start of the reverse hunk
    while($hunk_index < scalar(@$rhunk)) {
        $line = $rhunk->[$hunk_index];    
	$fline = $file_hunk->[0];
	$fline =~ s/^.//;
	last if (fuzzy_eq($line, $fline));
	$hunk_index++;
    }

    $file_index = 0;
    while($file_index < scalar(@$file_hunk)) {
        $line = $rhunk->[$hunk_index];    
	$fline = $file_hunk->[$file_index];
	if ($line =~ m/^\|/) {
	    $fline =~ s/^./\+/;
	    push @tmp_hunk, $fline;
	} elsif ($line =~ m/^ /) {
	    push @tmp_hunk, $fline;
	} elsif ($line =~ m/^-/) {
	    my $tline = $fline;
	    $tline =~ s/^.//;
	    if (!fuzzy_eq($line, $tline)) {
		my $cline = $line;
		$cline =~ s/^.//;
	        push @tmp_hunk, "+<<<<<<< delete $cline";
	    }
	}
	$file_index++;
	$hunk_index++;
	if ($hunk_index > scalar(@$rhunk)) {
	    print STDERR "warning: hunk_index is $hunk_index, limit " . 
	                  scalar(@$rhunk) . "\n";
	}
    }
    @$file_hunk = @tmp_hunk;
    @tmp_hunk = ();
    $size = scalar(@$file_hunk);
    $hunk_index = 0;
    $file_index = 0;
    while($hunk_index < scalar(@$fhunk)) {
        $line = $fhunk->[$hunk_index];    
	$fline = $file_hunk->[0];
	if (!($fline =~ m/^\|/)) {
	    $fline =~ s/^.//;
	    last if (fuzzy_eq($line, $fline));
	}
	last if ($line =~ m/^\+/) ;
	$hunk_index++;
    }
    while($file_index < $size || $hunk_index < scalar(@$fhunk)) {
	$fline = $file_hunk->[$file_index];
	if ($fline =~ m/^\+/ || $hunk_index >= scalar(@$fhunk)) {
	    $fline =~ s/^.//;
	    push @tmp_hunk, $fline;
	    $file_index++;
	    next;
	}
        $line = $fhunk->[$hunk_index];    
	if ($line =~ m/^\+/) {
	    $line =~ s/^.//;
	    push @tmp_hunk, $line;
	} elsif ($file_index < scalar(@$file_hunk)) {
	    if ($fline =~ m/^ /) {
		$fline =~ s/^.//;
		push @tmp_hunk, $fline;
	    }
	    $file_index++;
	}
	$hunk_index++;
    }
    @merged_file = @file[ 0 .. $start -1];
    push @merged_file, @tmp_hunk;
    push @merged_file, @file [ $end+1 .. scalar(@file) ];
    @file = @merged_file;
}

# try to find the next line in the file or the hunk that
# match.  The entire hunk is searched for a line matching the
# file, 20 lines forward are searched in the file.
#
# $aligned_hunk and $file_hunk are pointers to arrays, a line
# containing "|\n" is inserted to indicate missing lines in the
# file or hunk:
#
# FILE				HUNK:
#  A				 A
# |				 B
#  C				 C
#  D				|
# There is a leading control char on each line in each of the aligned
# arrays
#
sub align_match($$$$$$) {
    my ($hunk, $aligned_hunk, $file_hunk, $hindex, $findex, $hunk_changed) = @_;
    my $hunk_len = scalar(@$hunk);
    my $i;
    my $hline;
    my $fline;
    my $s;
    my $limit = 20;
    if ($$findex + $limit > scalar(@file)) {
        $limit = scalar(@file) - $$findex;
    }

    # look forward in the hunk for this line from the file
    $fline = $file[$$findex];
    for ($i = $$hindex; $i < $hunk_len; $i++) {
        if ($i != $$hindex && $hunk->[$i] =~ m/^\S/) {
	    $hunk_changed++;
	}
	if (fuzzy_eq($hunk->[$i], $fline)) {
	    for ($s = 0 ; $s < $i - $$hindex ; $s++) {
		push @$file_hunk, "|\n";
		push @$aligned_hunk, $hunk->[$$hindex + $s];		
	    }	    
	    push @$file_hunk, " $fline";
	    push @$aligned_hunk, $hunk->[$i];
	    $$hindex = $i;
	    return 1;
	}
    }
    # look forward in the file for this line from the hunk
    $hline = $hunk->[$$hindex];
    for ($i = $$findex; $i < $$findex + $limit; $i++) {
	if (fuzzy_eq($hline, $file[$i])) {
	    for ($s = 0 ; $s < $i - $$findex ; $s++) {
		push @$aligned_hunk, "|\n";
		push @$file_hunk, " $file[$$findex + $s]";
	    }
	    push @$file_hunk, " $file[$i]";
	    push @$aligned_hunk, $hline;
	    $$findex = $i;
	    return 1;
	}
    }
    return 0;
}

# once a matching line is found between the hunk and the file,
# test_match walks through both trying to find out how many total
# matching lines there are.
sub test_match($$$$) {
    my ($hunk, $hunk_line, $file_line, $direction) = @_;
    my $line;
    my $i;
    my $file_len = scalar(@file);
    my $hunk_len = scalar(@$hunk);
    my @file_hunk = ();
    my @aligned_hunk = ();
    my $match_len = 0;
    my $last_match_line = -1;
    my $score = 0;
    my $consec = 0;
    my $ws_match = 0;
    my $hunk_match = 0;
    my $hunk_changed = 0;
    while($hunk_line < $hunk_len) {
	$line = $hunk->[$hunk_line];
	if ($line =~ m/^\S/) {
	    $hunk_changed++;
	}
	if (fuzzy_eq($line,$file[$file_line])) {
	    # each line in the file hunk has one char for 
	    # special chars
	    #
	    push @file_hunk, " $file[$file_line]";
	    push @aligned_hunk, $line;
	    $match_len++;
	    $last_match_line = $file_line;
	    # bump the score if we're matching a non-context line
	    # or if this is a consecutive match
	    if ($line =~ m/^\S/) {
	        $hunk_match++;
		$score++;
	    }
	    if ($consec) {
	        $score++;
	    }
	    $consec = 1;
	    # count matches that are whitespace alone.  These are
	    # very unreliable.
	    if ($file[$file_line] =~ m/^\s*$/) {
	        $ws_match++;
	    }
	} else {
	    # walk the hunk and the file trying to find the next match
	    $consec = 0;
	    if (align_match($hunk, \@aligned_hunk, \@file_hunk, 
	                \$hunk_line, \$file_line, \$hunk_changed)) {
		$match_len++;
		if ($hunk->[$hunk_line] =~ m/^\S/) {
		    $score++;
		    $hunk_match++;
		}
	    } else {
		if ($file_line < scalar(@file)) {
		    push @file_hunk, " $file[$file_line]";
		} else {
		    push @file_hunk, "|\n";
		}
		push @aligned_hunk, $line;
	    }
	    $last_match_line = $file_line;
	}
	$file_line++;
	$hunk_line++;
    }
    
    if ($match_len == $ws_match) {
        $match_len = 0;
	$score = 0;
	$last_match_line = -1;
    }
    
    return ($match_len, $hunk_match, $score, $last_match_line, 
            \@aligned_hunk, \@file_hunk)
}

sub count_hunk_changed($$) {
    my ($hunk, $method) = @_;
    my $fhunk;
    my $changed = 0;
    
    if ($method eq "forward") {
	$fhunk = \@{$hunk->{'fwhunk'}};
    } else {
	$fhunk = \@{$hunk->{'revhunk'}};
    }
    foreach my $l (@$fhunk) {
	if ($l =~ m/^\S/) {
	    $changed++;
	}
    }
    return $changed;
}
# start of the fuzzy matching engine, try to find matching lines
# between the hunk and the file.
sub _find_hunk($$$) {
    my ($struct, $hunk, $direction) = @_;
    my $hunk_len = scalar(@$hunk);
    my $file_len = scalar(@file);
    my $i;
    my $k;
    my $hunk_line;

    # for each line in the hunk, try to fuzz it into the file
    for ($i = 0 ; $i < $file_len; $i++) {
        for ($k = 0; $k < $hunk_len; $k++) {
	    $hunk_line = $hunk->[$k];
	    # if fewer hunk lines remain than the best match
	    # so far, we're done
	    #
	    if (scalar(@$hunk) - $k < $struct->{'match_count'}) {
	        last;
	    }
	    # don't try too far into the hunk, after a while there's
	    # no chance we'll find any useful match
	    if ($k > 10) {
	        last;
	    }
	    if (fuzzy_eq($hunk_line, $file[$i])) {
		my ($match_len, $hunk_match, $score, $file_end, 
		    $aligned_hunk, $file_hunk) = test_match($hunk, $k, $i, 
		    					    $direction);
		if ($match_len > $struct->{'match_count'} || 
		    ($match_len == $struct->{'match_count'} && 
		    $score > $struct->{'score'})) {
		    # don't pick the forward hunk over the reverse hunk
		    # if the reverse hunk made any matches to non-context lines
		    # and we don't have a perfect forward match
		    if ($direction eq "forward" && $struct->{'method'} eq
		        "reverse") {
			if ($hunk_match != 
			    count_hunk_changed($struct,"forward")) {
			    my $rch = count_hunk_changed($struct, "reverse");
			    if ($rch > 0 && $struct->{'hunk_match'} > 0) {
				next;
			    } elsif ($rch == 0 && $struct->{'match_count'} > 2){
			        next;
			    } elsif ($rch == $struct->{'hunk_match'}) {
			        next;
			    }
			}

		    }
		    $struct->{'match_count'} = $match_len;
		    $struct->{'score'} = $score;
		    $struct->{'start'} = $i;
		    $struct->{'end'} = $file_end;
		    $struct->{'method'} = $direction;
		    $struct->{'aligned_hunk'} = $aligned_hunk;
		    $struct->{'file_hunk'} = $file_hunk;
		    $struct->{'hunk_match'} = $hunk_match;
		}
		# when deciding we've perfectly placed the change
		# allow for fuzz of two on either side, unless
		# there are no non-context lines in this part of the patch.
		# in that case, be fuzz free.
		# this code needs a little more work, disabled for now
		#my $fuzz = 4;
		#if ($struct->{'hunk_match'} == 0) {
		#    $fuzz = 0;
		#}
		#if ($struct->{'match_count'} >= (scalar(@$hunk) - $fuzz) &&
		#    $struct->{'hunk_match'} >= 
		#    count_hunk_changed($struct,$direction)) {
		#    #return;
		#}
		last;
	    }
	}
    }
}

# figures out where the hunk should go into the file, and
# inserts it into the merge stream.  If no suitable location is found
# the hunk is merged at the top.
#
sub find_hunk($) {
    my ($hunk) = @_;
    my $fhunk = \@{$hunk->{'fwhunk'}};
    my $hc;
    my $ret = 0;
    if (!($hunk->{'options'} =~ m/forward/)) {
        _find_hunk($hunk, \@{$hunk->{'revhunk'}}, "reverse");
    }
    if (!$no_forward_search && !($hunk->{'options'} =~ m/reverse/)) {
        _find_hunk($hunk, $fhunk, "forward");
    }
    $hc = count_hunk_changed($hunk, "reverse");
    if ($hunk->{'method'} eq "reverse" &&
        $hunk->{'hunk_match'} == count_hunk_changed($hunk, "reverse")) {
	# if hunk_match is zero, then our patch is only adding new lines.
	# make sure we've found some reasonable context in the patch
	# before calling it a fully matched hunk
	if ($hunk->{'hunk_match'} > 0 || $hunk->{'match_count'} > 2) {
            $fully_matched_hunks++;
	    $ret = 1;
        }
    }
    if ($hunk->{'match_count'} >= 2) {
        place_hunk($hunk, $hunk->{'start'}, $hunk->{'end'});
    } else {
	$hunk->{'method'} = "forward";
        place_hunk($hunk, 0, 0);
    }
    return $ret;
}

sub print_interactive_usage {
    print "[c]ontext: toggle the context command line parameter off/on\n";
    print "[d]one: exit interactive mode\n";
    print "[h]elp: this screen\n";
    print "[m]erge: run the merge program again\n";
    print "[p]rocess: process the hunks again.  This will write over the output file\n";
    print "[r]eject: open the reject in \$REJEDITOR or \$EDITOR\n";
    print "\t\$REJEDITOR is used first if it exists\n";
    print "[t]empreject: copy the reject to a temp file and open for editing\n";
    print "\tany later process commands will use the temp file\n";
    print "restore: restore backup copy of source file in auto mode\n";
    print "\n";
}
sub run_interactive($$$) {
    my ($pid, $file, $file2) = @_;
    my $editor_pid = 0;
    my $tfh;

    print ">> rej $VERSION interactive mode (type help for help)\n";
    print ">> ";
    while(<STDIN>) {
        chomp;
	if (m/^(h|help)$/) {
	    print_interactive_usage();
	} elsif (m/^restore$/) {
	    if ($auto) {
	        `cp $source_file.mergebackup $source_file`;
		if ($?) {
		    print "cp $source_file.mergebackup $source_file \n";
		    # parenthesize the shift: "." binds tighter than ">>"
		    print "exited with " . ($? >> 8) . "\n";
		}
	    }
	} elsif (m/^(r|reject)$/) {
	    open_reject();
	} elsif (m/^(t|tempreject)$/) {
	    $rejfh = new IO::File;
	    $rejfh->open("<$orig_reject_file") || 
	            die "Unable to open $orig_reject_file";
	    $tfh = new IO::File;
	    $tfh->open(">$orig_reject_file.tmp") || 
	          die "Unable to open $orig_reject_file.tmp";
	    while(<$rejfh>) {
	        print $tfh $_;    
	    }
	    close($rejfh);
	    close($tfh);
	    $reject_file = "$orig_reject_file.tmp";
	    print "Switching reject to $orig_reject_file.tmp\n";
	    open_reject();
	} elsif (m/^(c|context)$/) {
	    $context = ($context + 1) % 2;
	    print "context mode toggled to $context\n";
	} elsif (m/^(p|process)$/) {
	    print "processing reject again\n";
	    process_reject();
	} elsif (m/^(m|merge|mergewindow)$/) {
	    waitpid(-1, WNOHANG);
	    $pid = fork();
	    if (!$pid) {
	        _run_merge($file, $file2);
		exit 0;
	    }
	} elsif (m/^(d|done|quit)$/) {
	    last;
	}
	print ">> ";
    }
    print "waiting for merge windows\n";
    wait;
    if (!$auto) {
        unlink($file2);
    }
}

sub open_reject {
    if (defined($editor)) {
	print "opening reject $reject_file in $editor\n";
	system("$editor $reject_file");
    } else {
	print "please define \$EDITOR or \$REJEDITOR in your env\n";
    }
}

sub _run_merge($$) {
    my ($file, $file2) = @_;
    my $ret;
    my $pid;

    if ($merge_prog eq "kdiff3") {
	$ret = system("kdiff3 -o $file $file $file2");
    } elsif ($merge_prog eq "tkdiff") {
        $ret = system("tkdiff -o $file $file $file2");
    } elsif ($merge_prog =~ m/vimdiff/) {
        $ret = system("$merge_prog -f $file $file2");
    } else {
        $ret = system("$merge_prog $file $file2");
    }
    if ($ret) {
        $ret = $ret >> 8;
	print STDERR "warning: $merge_prog exited with $ret\n";
    }
}

# run the merge program, with a little customization for each kind
#
sub run_merge($$) {
    my ($file, $file2) = @_;
    my $ret;
    my $pid;

    if ($interactive) {
        $pid = fork();
	if ($pid) {
	    run_interactive($pid, $file, $file2);
	    return;
	}
    }
    _run_merge($file, $file2);

    if (!$auto && !$interactive) {
        unlink $file2;
    }
    if ($interactive) {
        exit 0;
    }
}

# look forward in the hunk for three more consecutive context lines
# this is used to split a large hunk into smaller ones
sub three_more_context($$$$) {
    my ($rev, $fw, $rindex, $findex) = @_;
    my $revctx = 0;
    my $fwctx = 0;
    
    while($rindex < scalar(@$rev) && $findex < scalar(@$fw)) {
        if ($rev->[$rindex++] =~ m/^ /) {
	    $revctx++;
	} else {
	    $revctx = 0;
	}
        if ($fw->[$findex++] =~ m/^ /) {
	    $fwctx++;
	} else {
	    $fwctx = 0;
	}
	last if($revctx > 3 && $fwctx > 3);
    }
    return ($revctx > 3 && $fwctx > 3);
}

# walk a hunk and divide it into smaller pieces.  The smaller pieces should
# be easier to place in the file.
#
sub split_and_push_hunk($$) {
    my ($hunks, $hunk) = @_;
    my $rev;
    my $fw;
    my $line;
    my $i;
    my @tmp;
    my $tmp_hash;
    my $fw_index;
    my $context;
    my $nonctx;
again:
    @tmp = ();
    $tmp_hash = {};
    $fw_index = 0;
    $context = 0;
    $i = 0;
    $rev = \@{$hunk->{'revhunk'}};
    $fw = \@{$hunk->{'fwhunk'}};
    $nonctx = 0;

    while($i < scalar(@$rev) && $fw_index < scalar(@$fw)) {
	$line = $rev->[$i];
        if ($line =~ m/^ / && $fw->[$fw_index] =~ m/^ /) {
	    $context++;
	} else {
	    $nonctx = 1;
	    # walk both arrays forward until we get to the next bit
	    # of context in both
	    while($fw_index < scalar(@$fw) && !($fw->[$fw_index] =~ m/^ /)) {
	        $fw_index++;
	    }
	    while($i < scalar(@$rev) && !($rev->[$i] =~ m/^ /)) {
	        $i++;
	    }
	    $context = 1;
	}

	# split the hunk if we've seen a non-context line,
	# we've seen three context lines already, and the hunk
	# still has three context lines in a row later on.
	if ($nonctx && $context >= 3 && 
	    three_more_context($rev, $fw, $i, $fw_index)) {
	    my $t;
	    # for both the rev and forward arrays, copy the 
	    # first part of the hunk to a tmp array and assign it
	    # back into the old hunk.  
	    #
	    # Then make a new hunk comprised
	    # of the remaining parts of both arrays.
	    # Make sure the context we've found is put into both
	    # old and new hunks.
	    for ($t = $i - $context + 1 ; $t < scalar(@$rev); $t++) {
	        push @tmp, $rev->[$t];
	    }
	    $tmp_hash->{'revhunk'} = [@tmp];
	    @tmp = ();
	    for ($t = 0; $t <= $i; $t++) {
	        push @tmp, $rev->[$t];
	    }
	    $hunk->{'revhunk'} = [@tmp];
	    $tmp_hash->{'desc'} = $tmp[0];

	    @tmp = ();
	    for ($t = $fw_index - $context + 1 ; $t < scalar(@$fw); $t++) {
	        push @tmp, $fw->[$t];
	    }
	    $tmp_hash->{'fwhunk'} = [@tmp];
	    @tmp = ();
	    for ($t = 0; $t <= $fw_index; $t++) {
	        push @tmp, $fw->[$t];
	    }
	    $hunk->{'fwhunk'} = [@tmp];
	    push @$hunks, $hunk;
	    $hunk = {%$tmp_hash};
	    goto again;
	}
	$fw_index++;
	$i++;
    }
    push @$hunks, $hunk;
}

sub print_usage {
    print STDERR "usage: rej [-acdeiFMqrR] [-p num] [-o file] [-m prog] file file.rej\n";
    print STDERR "\t-a replace file with merged result.  Use with care!\n";
    print STDERR "\t\tbackup of original file created as file.mergebackup\n";
    print STDERR "\t-c maximize context from the reject.  This makes it\n";
    print STDERR "\t\teasier to figure out stubborn rejects\n";
    print STDERR "\t-dry-run check for conflicts, but do nothing\n";
    print STDERR "\t-i interactive mode\n";
    print STDERR "\t-F don't check for already applied changes\n";
    print STDERR "\t-M don't run the external merge program at all\n";
    print STDERR "\t-p num: strip num levels off the paths found in the diff\n";
    print STDERR "\t-q in dry-run mode, exit once a conflict is found\n";
    print STDERR "\t\totherwise, only run merge prog when there are conflicts\n";
    print STDERR "\t-r open the reject with \$REJEDITOR or \$EDITOR\n";
    print STDERR "\t-R reverse the diff or reject\n";
    print STDERR "\t-o file specify output file for merge\n";
    print STDERR "\t\tThe merge program will not be run in this case\n";
    print STDERR "\t-m prog specify merge program.  You can use:\n";
    print STDERR "\t\t[g]vimdiff (default), kdiff3, tkdiff and meld\n";
    print STDERR "\t\tothers called as: mergeprog foo.c foo.c.tmp\n";
    print STDERR "\t\tThe REJMERGE environment var specifies the merge program as well\n";
    print STDERR "\n";
    exit 1;
}

sub strip_path($) {
    my ($p) = @_;

    #if ($p =~ m/(.*\/){$strip_level}?(.*)/) {
    if ($p =~ m/([^\/]+\/){$strip_level}?(.*)/) {
        $p = $2;
	return $p;
    }
    return undef;
}

sub reverse_line($) {
    my ($line) = @_;
    my $orig_line = $line;
    if (!$reverse_patch) {
        return $line;
    }
    if ($line =~ s/^-([^-].*)/\+$1/) {
        return $line;    
    } elsif ($line =~ s/^\+([^\+].*)/-$1/) {
        return $line;
    }
    return $line;
}

sub process_reject {
    if (!defined($rejfh) || !$diff_mode) {
	$rejfh = new IO::File;
	$rejfh->open("<$reject_file") || die "Unable to open $reject_file";
    }
    # in diff mode, find the first file indicator
    if ($diff_mode) {
	if ($rejfh->eof()) {
	    return 1;
	}
        while(defined($last_diff_line) || !$rejfh->eof()) {
	    if (defined($last_diff_line))  {
	        $_ = $last_diff_line;
		undef $last_diff_line;
	    } else {
	        $_ = <$rejfh>;
	    }
	    if (m/^--- (\S*)/) {
	        my $fname = $1;
		$fname = strip_path($fname);
		if ( -f $fname) {
		    $source_file = $fname;
		    last;
		} else {
		    my $t = <$rejfh>;
		    if ($t =~ m/^\+\+\+ (\S*)/) {
		        $fname = strip_path($1);
			if (-f $fname) {
			    $source_file = $fname;
			    last;
			}
		    }
		}
	    }
	}
    }

    $filefh = new IO::File;
    $filefh->open("<$source_file") || die "Unable to open $source_file";

    # struct hunk {
    #     @fwhunk;
    #     @revhunk;
    #     $hunk_offset; offset in hunk where matching started
    #     $start; 
    #     $end;
    #     $score;
    #     $match_count;
    #     $desc; @@ line
    #	  $options; # string with the special options for this hunk
    #     $aligned_hunk; points to array of @revhunk aligned with file
    #     $file_hunk; points to array of file lines aligned with aligned_hunk
    #     $method; "forward" or "reverse" defines how the hunk was matched
    # }

    #build the arrays of hunks
    my $hunk;
    my @fw;
    my @rev;
    my $more;
    my $last = 0;
    my $exclude = 0;
    my $hunk_opt = "";
    @hunks = ();
    $fully_matched_hunks = 0;
    $total_hunks = 0;
    $hunk_num = 0;
    while(<$rejfh>) {
	if ($diff_mode && m/^--- /) {
	    $last_diff_line = $_;
	    last;
	}
	$_ = reverse_line($_);
	if (m/^(@@|\*\*\* )/) {
again:
	    last if ($last);
	    # special options string
	    $hunk_opt = "";
	    if (m/###(.*)/) {
	        $hunk_opt = $1;
		if ($hunk_opt =~ m/only/) {
		    @hunks = ();
		    $hunk_num = 0;
		    $last = 1;
		} 
		if ($hunk_opt =~ m/exclude/) {
		    while(<$rejfh>) {
			if ($diff_mode && m/^--- /) {
			    $last_diff_line = $_;
			    goto read_file;
			}
			if (m/^(@@|\*\*\* )/) {
			    goto again;
			}
		    }
		} 
		if ($hunk_opt =~ m/last/) {
		    $last = 1;
		}
	    }
	    if ($hunk_num > 0) {
		$hunk->{'fwhunk'} = [@fw];
		$hunk->{'revhunk'} = [@rev];
		split_and_push_hunk(\@hunks, $hunk);
	    }
	    @fw = ();
	    @rev = ();
	    $hunk = {};
	    $hunk_num++;
	    $hunk->{'desc'} = $_;
	    $hunk->{'options'} = $hunk_opt;
	    # not a unified diff?
	    # process the whole thing right here
	    if (!m/^@@/) {
		if ($diff_mode) {
		   die "Unable to handle multi file context diffs";
		}
		$more = 0;
		while(<$rejfh>) {
		    $_ = reverse_line($_);
		    last if (m/^--- \d+,\d+ ---/);
		    s/(^.)./$1/;
		    if ($reverse_patch) {
			push @fw, $_;
		    } else {
			push @rev, $_;
		    }
		}
		while(<$rejfh>) {
		    $_ = reverse_line($_);
		    if (m/^\*\*\* /) {
			$more = 1;
			last;
		    }
		    if (!(m/^\*/)) {
			s/(^.)./$1/;
			if ($reverse_patch) {
			    push @rev, $_;
			} else {
			    push @fw, $_;
			}
		    }
		}
		if ($more) {
		    goto again;
		}
	    }
	} elsif (m/^-[^-]/) {
	    push @rev, $_;
	} elsif (m/^\+[^\+]/) {
	    push @fw, $_;
	} elsif (m/^ /) {
	    push @fw, $_;
	    push @rev, $_;
	}
    }
read_file:
    if (!$diff_mode) {
	$rejfh->close();
    }

    # push any leftover hunks from the loop above
    if (defined($hunk->{'desc'})) {
	$hunk->{'fwhunk'} = [@fw];
	$hunk->{'revhunk'} = [@rev];
	split_and_push_hunk(\@hunks, $hunk);
    }

    @file = ();
    # build the file array
    while(<$filefh>) {
	push @file, $_;
    }
    $filefh->close();

    $total_hunks += scalar(@hunks);
    # try to place each hunk into the file
    my $ret;
    foreach my $href (@hunks) {
	$ret = find_hunk($href);
	if ($report_only && $quick && $ret == 0) {
	    last;
	}
    }
    my $conflicts = $total_hunks - $fully_matched_hunks;
    $global_matched_hunks += $fully_matched_hunks;
    $global_reject_hunks += $total_hunks;
    $global_conflicts = $conflicts;
    if ($conflicts > 0) {
        $exit_value = 1;
    }
    if ($report_only && $quick && $conflicts > 0) {
        $conflicts = "some";
    }
    print STDERR "\t$source_file: $fully_matched_hunks matched, $conflicts conflicts remain\n";
    if ($report_only) {
	if ($quick && $exit_value) {
	    return 1;
	}
        return 0;
    }

    if (!defined($mergefh)) {
	# from here down either copies the merge result to $output_file or
	# runs the merge program
	if (defined($auto)) {
	    my $ret;
	    $ret = rename $source_file, "$source_file.mergebackup";
	    if (!$ret) {
		die "Unable to rename $source_file to $source_file.mergebackup";
	    }
	    $output_file = $source_file;
	}
	if (defined($output_file)) {
	    $mergefh = new IO::File;
	    $mergefh->open(">$output_file") || 
	              die "Unable to open $output_file";
	} else {
	    $mergefh = new File::Temp(TEMPLATE => "$source_file.XXXXX", 
				      UNLINK => 0) ||
				      die "Unable to create temp file";
	}
    } else {
	# mergefh is only defined when we're reloading.  
	# Just truncate and seek to 0
	if ($output_file) {
	    $mergefh->open(">$output_file")||die "Unable to open $output_file";
	} else {
	    $mergefh->truncate(0); 
	    seek $mergefh, 0, SEEK_SET;
	}
    }
    foreach my $l (@file) {
	print $mergefh $l;
    }
    $mergefh->flush();
    if ($output_file) {
        $mergefh->close();
    }
    return 0;
}

if (defined($ENV{'REJMERGE'})) {
    $merge_prog = $ENV{'REJMERGE'};
}
if (defined($ENV{'REJEDITOR'})) {
    $editor = $ENV{'REJEDITOR'};
} elsif (defined($ENV{'EDITOR'})) {
    $editor = $ENV{'EDITOR'};
}

GetOptions("context" => \$context,
	   "auto" => \$auto,
	   "dry-run" => \$report_only,
	   "out=s" => \$output_file,
	   "F|no-forward" => \$no_forward_search,
	   "interactive" => \$interactive,
	   "reject" => \$open_reject,
	   "Reverse" => \$reverse_patch,
	   "p|strip-level=s" => \$strip_level,
	   "quick" => \$quick,
	   "M|no-merge" => \$skip_merge,
           "merge=s" => \$merge_prog) || print_usage();

$source_file = $ARGV[0];
$reject_file = $ARGV[1];
if (scalar(@ARGV) < 2) {
    if ($ARGV[0] =~ m/\.rej$/) {
        $reject_file = $ARGV[0];
	$source_file = $reject_file;
	$source_file =~ s/\.rej$//;
    } elsif (-f $source_file && -f "$source_file.rej") {
        $reject_file = "$source_file.rej";
    } elsif (-f $source_file) {
	$reject_file = $source_file;
	undef($source_file);
        $diff_mode = 1;
    } else {
        print_usage();
    }
}

$orig_reject_file = $reject_file;

if (!$diff_mode) {
    foreach my $f ($source_file, $reject_file) {
	if (! -f $f) {
	    print STDERR "Unable to find $f\n";
	    exit 1;
	}
    }
}

while(1) {
    if (process_reject()) {
        if ($diff_mode) {
	    print STDERR "$reject_file: total of $global_matched_hunks / $global_reject_hunks matched\n";
	}
        last;
    }
    if (!$report_only) {
	if ($open_reject) {
	    open_reject();
	}
	if (!$skip_merge && (!$quick || $global_conflicts > 0)) {
	    if (!defined($output_file)) {
		run_merge($source_file, $mergefh);
	    } elsif ($auto) {
		run_merge($source_file, "$source_file.mergebackup");
	    }
	}
    }
    if ($diff_mode) {
        undef($mergefh);
    } else {
        last;
    }
}

exit $exit_value;

* Re: git-rerere observations and feature suggestions
  2008-06-23  9:49   ` Ingo Molnar
@ 2008-06-23 14:19     ` Peter Zijlstra
  2008-06-23 14:26       ` Peter Zijlstra
  2008-06-23 15:12     ` Jeff King
  1 sibling, 1 reply; 45+ messages in thread
From: Peter Zijlstra @ 2008-06-23 14:19 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Junio C Hamano, git, Chris Mason, Thomas Gleixner

On Mon, 2008-06-23 at 11:49 +0200, Ingo Molnar wrote:
> another git-rerere observation: occasionally it happens that i 
> accidentally commit a merge marker into the source code.
> 
> That's obviously stupid, and it normally gets found by testing quickly, 
> but still it would be a really useful avoid-shoot-self-in-foot feature 
> if git-commit could warn about such stupidities of mine.
> 
> ( and if i could configure git-commit to outright reject a commit like 
>   that - i never want to commit lines with <<<<<< or >>>>> markers)
> 
> Another merge conflict observation is that Git is much worse at figuring 
> out the right merge resolution than our previous Quilt based workflow 
> was. I eventually found it to be mainly due to the following detail: 
> sometimes it's more useful to first apply the merged branch and then 
> attempt to merge HEAD, as a patch.
> 
> I've got a script for that which also combines it with the "rej" tool, 
> and in about 70%-80% of the cases where Git is unable to resolve a merge 
> automatically it figures things out. ('rej' is obviously a more relaxed 
> merge utility, but it's fairly robust in my experience, with a very low 
> false positive rate.)
> 
> The ad-hoc "tip-mergetool" script we are using is attached below. It's 
> really just for demonstration purposes - it doesnt work when there's a 
> rename related conflict, etc.
> 
> Peter Zijstra also wrote a git-mergetool extension for the 'rej' tool 
> btw., he might want to post that patch. I've attached Chris Mason's rej 
> tool too.

This is what I run with.

I added the cp to the 3-way merge tools because I think it's stupid to
see the messed up merge markers instead of the original file.

The rej target takes the local version, applies the diff between base
and remote to it as a patch, and upon failure invokes rej to fix up the
mess.
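
The flow described above can be sketched as a standalone shell function.
This is only an illustration under assumptions: apply_remote_as_patch and
its argument order are made-up names, not part of git-mergetool, and it
assumes diff and patch are installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of git-mergetool): start from the local
# version and replay the base->remote changes on top of it as a patch.
# Usage: apply_remote_as_patch BASE LOCAL REMOTE OUT
apply_remote_as_patch() {
    base=$1
    local_file=$2
    remote=$3
    out=$4

    # Begin with our own side of the merge.
    cp -- "$local_file" "$out" || return 2

    # diff exits 1 when the files differ, which is the normal case here;
    # the pipeline's status is that of patch, the last command.
    diff -u -- "$base" "$remote" | patch -s "$out"
}
```

A caller would fall back to rej when the function fails, for example
apply_remote_as_patch "$BASE" "$LOCAL" "$REMOTE" "$path" || rej "$path",
which mirrors the one-liner in the patch below.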

--- /usr/bin/git-mergetool	2008-04-08 19:01:37.000000000 +0200
+++ git-mergetool	2008-06-02 19:00:55.000000000 +0200
@@ -214,12 +214,14 @@ merge_file () {
 	    ;;
 	meld|vimdiff)
 	    touch "$BACKUP"
+	    cp -- "$BASE" "$path"
 	    "$merge_tool_path" -- "$LOCAL" "$path" "$REMOTE"
 	    check_unchanged
 	    save_backup
 	    ;;
 	gvimdiff)
 		touch "$BACKUP"
+		cp -- "$BASE" "$path"
 		"$merge_tool_path" -f -- "$LOCAL" "$path" "$REMOTE"
 		check_unchanged
 		save_backup
@@ -271,6 +273,13 @@ merge_file () {
 	    status=$?
 	    save_backup
 	    ;;
+        rej)
+	    touch "$BACKUP"
+	    cp -- "$LOCAL" "$path"
+	    diff -up "$BASE" "$REMOTE" | patch "$path" || rej "$path"
+	    check_unchanged
+	    save_backup
+	    ;;
     esac
     if test "$status" -ne 0; then
 	echo "merge of $path failed" 1>&2
@@ -311,7 +320,7 @@ done
 
 valid_tool() {
 	case "$1" in
-		kdiff3 | tkdiff | xxdiff | meld | opendiff | emerge | vimdiff | gvimdiff | ecmerge)
+		kdiff3 | tkdiff | xxdiff | meld | opendiff | emerge | vimdiff | gvimdiff | ecmerge | rej)
 			;; # happy
 		*)
 			return 1

* Re: git-rerere observations and feature suggestions
  2008-06-23 14:19     ` Peter Zijlstra
@ 2008-06-23 14:26       ` Peter Zijlstra
  0 siblings, 0 replies; 45+ messages in thread
From: Peter Zijlstra @ 2008-06-23 14:26 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Junio C Hamano, git, Chris Mason, Thomas Gleixner

On Mon, 2008-06-23 at 16:20 +0200, Peter Zijlstra wrote:
> On Mon, 2008-06-23 at 11:49 +0200, Ingo Molnar wrote:
> > another git-rerere observation: occasionally it happens that i 
> > accidentally commit a merge marker into the source code.
> > 
> > That's obviously stupid, and it normally gets found by testing quickly, 
> > but still it would be a really useful avoid-shoot-self-in-foot feature 
> > if git-commit could warn about such stupidities of mine.
> > 
> > ( and if i could configure git-commit to outright reject a commit like 
> >   that - i never want to commit lines with <<<<<< or >>>>> markers)
> > 
> > Another merge conflict observation is that Git is much worse at figuring 
> > out the right merge resolution than our previous Quilt based workflow 
> > was. I eventually found it to be mainly due to the following detail: 
> > sometimes it's more useful to first apply the merged branch and then 
> > attempt to merge HEAD, as a patch.
> > 
> > I've got a script for that which also combines it with the "rej" tool, 
> > and in about 70%-80% of the cases where Git is unable to resolve a merge 
> > automatically it figures things out. ('rej' is obviously a more relaxed 
> > merge utility, but it's fairly robust in my experience, with a very low 
> > false positive rate.)
> > 
> > The ad-hoc "tip-mergetool" script we are using is attached below. It's 
> > really just for demonstration purposes - it doesnt work when there's a 
> > rename related conflict, etc.
> > 
> > Peter Zijstra also wrote a git-mergetool extension for the 'rej' tool 
> > btw., he might want to post that patch. I've attached Chris Mason's rej 
> > tool too.
> 
> This is what I run with.
> 
> I added the cp to the 3-way merge tools because I think its stupid to
> see the messed up merge markers instead of the original file.

While we're on the subject, I only found one tool that 'digs' these
merge markers and that is xxdiff --unmerge.

One would think more tools understand these merge markers, but I
couldn't find any.

* Re: git-rerere observations and feature suggestions
  2008-06-23  9:49   ` Ingo Molnar
  2008-06-23 14:19     ` Peter Zijlstra
@ 2008-06-23 15:12     ` Jeff King
  2008-06-23 15:22       ` Ingo Molnar
  1 sibling, 1 reply; 45+ messages in thread
From: Jeff King @ 2008-06-23 15:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Junio C Hamano, git, Peter Zijlstra, Chris Mason, Thomas Gleixner

On Mon, Jun 23, 2008 at 11:49:06AM +0200, Ingo Molnar wrote:

> another git-rerere observation: occasionally it happens that i 
> accidentally commit a merge marker into the source code.
> 
> That's obviously stupid, and it normally gets found by testing quickly, 
> but still it would be a really useful avoid-shoot-self-in-foot feature 
> if git-commit could warn about such stupidities of mine.
> 
> ( and if i could configure git-commit to outright reject a commit like 
>   that - i never want to commit lines with <<<<<< or >>>>> markers)

The right place for this is in a pre-commit hook, which can look at what
you are about to commit and decide if it is OK. In fact, the default
pre-commit hook that ships with git performs this exact check. You just
need to turn it on with:

  chmod +x .git/hooks/pre-commit

-Peff
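
For readers who want to see what such a check amounts to, here is a
minimal sketch of a marker test that could be called from a pre-commit
hook. It is not the hook that ships with git; adds_conflict_marker is a
hypothetical helper name, and the pattern assumes the usual
seven-character markers. A line of seven '=' characters can also occur
legitimately (for example as a heading underline), so treat this as a
starting point only.

```shell
#!/bin/sh
# Sketch of a pre-commit check for leftover merge markers.
# adds_conflict_marker is a hypothetical helper name, not part of git.

# Succeeds (exit 0) when the unified diff on stdin adds a line that
# starts with a seven-character conflict marker.
adds_conflict_marker() {
    grep -Eq '^\+(<<<<<<<|=======|>>>>>>>)( |$)'
}

# Intended use inside .git/hooks/pre-commit:
#   if git diff --cached | adds_conflict_marker; then
#       echo "error: conflict marker in staged changes" >&2
#       exit 1
#   fi
```

Current versions of git also document "git diff --cached --check",
which warns when staged changes introduce conflict markers or
whitespace errors, so a hook can simply call that instead.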

* Re: git-rerere observations and feature suggestions
  2008-06-23 15:12     ` Jeff King
@ 2008-06-23 15:22       ` Ingo Molnar
  0 siblings, 0 replies; 45+ messages in thread
From: Ingo Molnar @ 2008-06-23 15:22 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, git, Peter Zijlstra, Chris Mason, Thomas Gleixner


* Jeff King <peff@peff.net> wrote:

> > ( and if i could configure git-commit to outright reject a commit like 
> >   that - i never want to commit lines with <<<<<< or >>>>> markers)
> 
> The right place for this is in a pre-commit hook, which can look at 
> what you are about to commit and decide if it is OK. In fact, the 
> default pre-commit hook that ships with git performs this exact check. 
> You just need to turn it on with:
> 
>   chmod +x .git/hooks/pre-commit

cool, thanks :-)

	Ingo

end of thread, other threads:[~2008-06-23 15:24 UTC | newest]

Thread overview: 45+ messages
2008-06-16 11:01 git-rerere observations and feature suggestions Ingo Molnar
2008-06-16 11:09 ` Mike Hommey
2008-06-16 15:48   ` Pierre Habouzit
2008-06-16 15:57     ` Pierre Habouzit
2008-06-16 16:18       ` Sverre Rabbelier
2008-06-17  7:37         ` Karl Hasselström
2008-06-16 11:26 ` David Kastrup
2008-06-16 11:27 ` Theodore Tso
2008-06-16 12:38   ` David Kastrup
2008-06-16 19:52   ` Ingo Molnar
2008-06-16 20:25     ` Junio C Hamano
2008-06-16 20:46       ` Ingo Molnar
2008-06-16 21:37         ` Junio C Hamano
2008-06-16 18:46 ` Junio C Hamano
2008-06-16 19:09   ` Ingo Molnar
2008-06-16 20:50     ` Junio C Hamano
2008-06-22  9:47       ` [PATCH 1/5] rerere: rerere_created_at() and has_resolution() abstraction Junio C Hamano
2008-06-22  9:47       ` [PATCH 2/5] git-rerere: detect unparsable conflicts Junio C Hamano
2008-06-22  9:47       ` [PATCH 3/5] rerere: remove dubious "tail_optimization" Junio C Hamano
2008-06-22  9:48       ` [PATCH 4/5] t4200: fix rerere test Junio C Hamano
2008-06-22  9:48       ` [PATCH 5/5] rerere.autoupdate Junio C Hamano
2008-06-18 10:57     ` git-rerere observations and feature suggestions Ingo Molnar
2008-06-18 11:29       ` Miklos Vajna
2008-06-18 18:43         ` Ingo Molnar
2008-06-18 19:53           ` Miklos Vajna
2008-06-18 11:36       ` Ingo Molnar
2008-06-18 22:01       ` Jakub Narebski
2008-06-18 22:38         ` Miklos Vajna
2008-06-19  7:23           ` Karl Hasselström
2008-06-19  7:29             ` Miklos Vajna
2008-06-19  7:30             ` Junio C Hamano
2008-06-19  8:21               ` Karl Hasselström
2008-06-19  8:33                 ` Miklos Vajna
2008-06-19  9:19                   ` Karl Hasselström
2008-06-19 10:06                     ` Miklos Vajna
2008-06-19 10:35                       ` Karl Hasselström
2008-06-16 19:10   ` Junio C Hamano
2008-06-16 19:44     ` Ingo Molnar
2008-06-23  9:49   ` Ingo Molnar
2008-06-23 14:19     ` Peter Zijlstra
2008-06-23 14:26       ` Peter Zijlstra
2008-06-23 15:12     ` Jeff King
2008-06-23 15:22       ` Ingo Molnar
2008-06-16 20:11 ` Jakub Narebski
2008-06-17 10:24 ` Johannes Schindelin
