git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Merge-Recursive Improvements
@ 2008-02-12 22:16 Voltage Spike
  2008-02-12 23:03 ` Stefan Monnier
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Voltage Spike @ 2008-02-12 22:16 UTC (permalink / raw)
  To: git

I would like to make a series of significant improvements to the
merge-recursive mechanism in git, but I was hoping to solicit some early
feedback before submitting patches.

First, git is overly zealous at merging differences and two functions  
added
at the same point in a file become intertwined during the merge. A  
trivial
example of this behavior:

   <<<<<<< HEAD:file.txt
   void newfunc1()
   =======
   void newfunc2()
   >>>>>>> merge:file.txt
   {
     int err;
   <<<<<<< HEAD:file.txt
     err = doSomething();
   =======
     err = doSomethingElse();
   >>>>>>> merge:file.txt

Second, git doesn't tell me the original code inside the conflict  
markers so
I almost always resort to "MERGE_HEAD...ORIG_HEAD" and
"ORIG_HEAD...MERGE_HEAD" diffs to see what was going on. I could use an
external diff tool (yuck), but I would like to modify the conflict  
markers
to resemble those of Perforce:

   >>>>>>> merge-base:file.txt
   Original code.
   ======= HEAD:file.txt
   Head code.
   ======= merge:file.txt
   Merged code.
   <<<<<<<

Third, git doesn't appear to have any sense of context when performing a
merge. Another contrived example which wouldn't be flagged as a merge
conflict:

   ptr = malloc(len); // Added in HEAD.
   init();            // Included in merge-base.
   ptr = malloc(len); // Added in "merge".

Fourth, git doesn't provide a mechanism for merges to ignore whitespace
changes.

I resolved issues the first and the fourth through the introduction  
of new
configuration variables and trivial modifications to the manner in  
which we
call xdl_merge. I suspect the second and third issue may also be  
simple to
solve but would require that I modify libxdiff directly.

Are these changes something other people might be interested in? (It  
seems
odd that nobody is complaining about these really irritating flaws.)  
Should
I concern myself with writing a custom merge driver rather than  
modify core
behavior (even if the change is configurable)?  If I should focus on an
external driver, under what circumstances would merge.*.recursive  
come into
play (i.e., when do I have to worry about poor behavior for an "internal
merge")?

Thank you in advance for the feedback.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Merge-Recursive Improvements
  2008-02-12 22:16 Merge-Recursive Improvements Voltage Spike
@ 2008-02-12 23:03 ` Stefan Monnier
  2008-02-12 23:11 ` Junio C Hamano
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Stefan Monnier @ 2008-02-12 23:03 UTC (permalink / raw)
  To: git

> "ORIG_HEAD...MERGE_HEAD" diffs to see what was going on. I could use an
> external diff tool (yuck), but I would like to modify the conflict markers
> to resemble those of Perforce:

>>>>>>>> merge-base:file.txt
>   Original code.
>   ======= HEAD:file.txt
>   Head code.
>   ======= merge:file.txt
>   Merged code.
>   <<<<<<<

Having such 3-parts conflicts helps tremendously when you have to do
the merge by hand, so I'm 100% in favor of such a change.

BUT Please, please, pretty please, don't follow Perforce who blindly
disregards previous standards.  Instead use the format used by diff3
which has been there for ages:

   <<<<<<< foo
   original text
   ||||||| bar
   ancestor
   =======
   new text
   >>>>>>> baz

> Third, git doesn't appear to have any sense of context when performing a
> merge. Another contrived example which wouldn't be flagged as a merge
> conflict:

>   ptr = malloc(len); // Added in HEAD.
>   init();            // Included in merge-base.
>   ptr = malloc(len); // Added in "merge".

Yes, that's nasty.

> Fourth, git doesn't provide a mechanism for merges to ignore whitespace
> changes.

I can live with that.  As long as the conflict is clearly marked with
all 3 parts, I can use any external tool I want to resolve the conflict.


        Stefan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Merge-Recursive Improvements
  2008-02-12 22:16 Merge-Recursive Improvements Voltage Spike
  2008-02-12 23:03 ` Stefan Monnier
@ 2008-02-12 23:11 ` Junio C Hamano
  2008-02-12 23:48 ` Linus Torvalds
  2008-02-13  7:39 ` Merge-Recursive Improvements Johannes Sixt
  3 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2008-02-12 23:11 UTC (permalink / raw)
  To: Voltage Spike; +Cc: git

Voltage Spike <voltspike@gmail.com> writes:

> I would like to make a series of significant improvements to the
> merge-recursive mechanism in git, but I was hoping to solicit some early
> feedback before submitting patches.
>
> First, git is overly zealous at merging differences and two functions
> added
> at the same point in a file become intertwined during the merge. A
> trivial
> example of this behavior:
>
>   <<<<<<< HEAD:file.txt
>   void newfunc1()
>   =======
>   void newfunc2()
>   >>>>>>> merge:file.txt
>   {
>     int err;
>   <<<<<<< HEAD:file.txt
>     err = doSomething();
>   =======
>     err = doSomethingElse();
>   >>>>>>> merge:file.txt

This lacks illustration of what you change that example to, which
makes the proposal harder to judge.

I suspect you are saying that you would want to coalesce
adjacent hunks that have too small number of lines between '>>>'
of the previous hunk and '<<<' of the current hunk by duplicate
the common hunks, like this:

   <<<<<<< HEAD:file.txt
   void newfunc1()
   {
     int err;
     err = doSomething();
   =======
   void newfunc2()
   {
     int err;
     err = doSomethingElse();
   >>>>>>> merge:file.txt

(here, two lines that are "{" and "int err;" are taken as "too small").

I think it makes sense.

> Second, git doesn't tell me the original code inside the conflict
>
>   >>>>>>> merge-base:file.txt
>   Original code.
>   ======= HEAD:file.txt
>   Head code.
>   ======= merge:file.txt
>   Merged code.
>   <<<<<<<

This is a much harder sell, as external tool like git-mergetool
that inspect the result depend on the current output.

And it is not as useful as an alternative.

In case you did not know, you can get a much better picture by:

    $ git log --left-right -p --merge

because you would then see not just the merge base version but
the changes _and the reasons for the changes_ in between.

> Third, git doesn't appear to have any sense of context when performing a
> merge. Another contrived example which wouldn't be flagged as a merge
> conflict:
>
>   ptr = malloc(len); // Added in HEAD.
>   init();            // Included in merge-base.
>   ptr = malloc(len); // Added in "merge".

Are you saying it a problem to report or not to report?  In
either case, I decline to comment on this one, as I do not have
a strong opinion either way.

> Fourth, git doesn't provide a mechanism for merges to ignore whitespace
> changes.

That would be a good change.

I can immediately say that 1 and 4 are worthwhile things to do,
as long as they are contained to xdl_merge().  It would help
other users of the merge logic.

I've started working on rewriting revert to directly use
xdl_merge(), bypassing major parts of merge-recursive, and I
imagine such a change you propose would be useful without
affecting the callers.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Merge-Recursive Improvements
  2008-02-12 22:16 Merge-Recursive Improvements Voltage Spike
  2008-02-12 23:03 ` Stefan Monnier
  2008-02-12 23:11 ` Junio C Hamano
@ 2008-02-12 23:48 ` Linus Torvalds
  2008-02-13  0:05   ` Johannes Schindelin
  2008-02-13  7:39 ` Merge-Recursive Improvements Johannes Sixt
  3 siblings, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2008-02-12 23:48 UTC (permalink / raw)
  To: Voltage Spike; +Cc: git



On Tue, 12 Feb 2008, Voltage Spike wrote:
> 
> First, git is overly zealous at merging differences and two functions added
> at the same point in a file become intertwined during the merge. A trivial
> example of this behavior:

Hmm. Have you tested what happens if you use XDL_MERGE_EAGER instead of 
XDL_MERGE_ZEALOUS in the "level" argument to xdl_merge() in 
merge-recursive.c?

(No, I didn't test it myself, but it may get you the behaviour you want, 
and we could make it a config option for people who want a less aggressive 
merge)

			Linus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Merge-Recursive Improvements
  2008-02-12 23:48 ` Linus Torvalds
@ 2008-02-13  0:05   ` Johannes Schindelin
  2008-02-13  1:10     ` [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Johannes Schindelin
  0 siblings, 1 reply; 23+ messages in thread
From: Johannes Schindelin @ 2008-02-13  0:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Voltage Spike, git

Hi,

On Tue, 12 Feb 2008, Linus Torvalds wrote:

> On Tue, 12 Feb 2008, Voltage Spike wrote:
> > 
> > First, git is overly zealous at merging differences and two functions 
> > added at the same point in a file become intertwined during the merge. 
> > A trivial example of this behavior:
> 
> Hmm. Have you tested what happens if you use XDL_MERGE_EAGER instead of 
> XDL_MERGE_ZEALOUS in the "level" argument to xdl_merge() in 
> merge-recursive.c?
> 
> (No, I didn't test it myself, but it may get you the behaviour you want, 
> and we could make it a config option for people who want a less 
> aggressive merge)

Actually, I have this in my ever-growing TODO:

XDL_MERGE_ZEALOUS_ALNUM: require an alnum in the common code; otherwise do 
not de-conflict it.

In other words, if there is a hunk consisting of conflicting lines, which 
are identical, but have no letter and no number in it, then keep them as 
conflicts.

But I never got around to try it.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-13  0:05   ` Johannes Schindelin
@ 2008-02-13  1:10     ` Johannes Schindelin
  2008-02-13  1:34       ` Junio C Hamano
  2008-02-13  2:06       ` [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Linus Torvalds
  0 siblings, 2 replies; 23+ messages in thread
From: Johannes Schindelin @ 2008-02-13  1:10 UTC (permalink / raw)
  To: Linus Torvalds, gitster; +Cc: Voltage Spike, git


When a merge conflicts, there are often common lines that are not really
common, such as empty lines or lines containing a single curly bracket.

With XDL_MERGE_ZEALOUS_ALNUM, we use the following heuristics: when a
hunk does not contain any letters or digits, it is treated as conflicting.

In other words, a conflict which used to look like this:

	<<<<<<<
	if (a == 1)
	=======
	if (a == 2)
	>>>>>>>
	{
	<<<<<<<
		b = 2;
	=======
		b = 1;
	>>>>>>>

will look like this with ZEALOUS_ALNUM:

	<<<<<<<
	if (a == 1)
	{
		b = 2;
	=======
	if (a == 2)
	{
		b = 1;
	>>>>>>>

To demonstrate this, git-merge-file has been switched from
XDL_MERGE_ZEALOUS to XDL_MERGE_ZEALOUS_ALNUM.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	On Wed, 13 Feb 2008, Johannes Schindelin wrote:

	> On Tue, 12 Feb 2008, Linus Torvalds wrote:
	> 
	> > On Tue, 12 Feb 2008, Voltage Spike wrote:
	> > > 
	> > > First, git is overly zealous at merging differences and two 
	> > > functions added at the same point in a file become 
	> > > intertwined during the merge.  A trivial example of this 
	> > > behavior:
	> > 
	> > Hmm. Have you tested what happens if you use XDL_MERGE_EAGER 
	> > instead of XDL_MERGE_ZEALOUS in the "level" argument to 
	> > xdl_merge() in merge-recursive.c?
	> > 
	> > (No, I didn't test it myself, but it may get you the behaviour 
	> > you want, and we could make it a config option for people who 
	> > want a less aggressive merge)
	> 
	> Actually, I have this in my ever-growing TODO:
	> 
	> XDL_MERGE_ZEALOUS_ALNUM: require an alnum in the common code; 
	> otherwise do not de-conflict it.
	> 
	> In other words, if there is a hunk consisting of conflicting 
	> lines, which are identical, but have no letter and no number in 
	> it, then keep them as conflicts.
	> 
	> But I never got around to try it.

	I just could not resist.  But now I HEAD for bed.

 builtin-merge-file.c  |    2 +-
 t/t6023-merge-file.sh |   40 ++++++++++++++++++++++++++++++++++
 xdiff/xdiff.h         |    1 +
 xdiff/xmerge.c        |   57 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 99 insertions(+), 1 deletions(-)

diff --git a/builtin-merge-file.c b/builtin-merge-file.c
index 58deb62..adce6d4 100644
--- a/builtin-merge-file.c
+++ b/builtin-merge-file.c
@@ -46,7 +46,7 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
 	}
 
 	ret = xdl_merge(mmfs + 1, mmfs + 0, names[0], mmfs + 2, names[2],
-			&xpp, XDL_MERGE_ZEALOUS, &result);
+			&xpp, XDL_MERGE_ZEALOUS_ALNUM, &result);
 
 	for (i = 0; i < 3; i++)
 		free(mmfs[i].ptr);
diff --git a/t/t6023-merge-file.sh b/t/t6023-merge-file.sh
index 8641996..7e72b8d 100755
--- a/t/t6023-merge-file.sh
+++ b/t/t6023-merge-file.sh
@@ -139,4 +139,44 @@ test_expect_success 'binary files cannot be merged' '
 	grep "Cannot merge binary files" merge.err
 '
 
+cat > a1.c << EOF
+int main()
+{
+	return 1;
+}
+EOF
+
+cat > a2.c << EOF
+int main2()
+{
+	return 0;
+}
+EOF
+
+cat > a3.c << EOF
+int main3()
+{
+	return 2;
+}
+EOF
+
+cat > expect << EOF
+<<<<<<< a2.c
+int main2()
+{
+	return 0;
+}
+=======
+int main3()
+{
+	return 2;
+}
+>>>>>>> a3.c
+EOF
+
+test_expect_success 'ZEALOUS_ALNUM' '
+	! git merge-file -p a2.c a1.c a3.c > merge.out &&
+	git diff expect merge.out
+'
+
 test_done
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index c00ddaa..413082e 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -53,6 +53,7 @@ extern "C" {
 #define XDL_MERGE_MINIMAL 0
 #define XDL_MERGE_EAGER 1
 #define XDL_MERGE_ZEALOUS 2
+#define XDL_MERGE_ZEALOUS_ALNUM 3
 
 typedef struct s_mmfile {
 	char *ptr;
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index b83b334..330121b 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -248,10 +248,63 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
 	return 0;
 }
 
+static int line_contains_alnum(const char *ptr, long size)
+{
+	while (size--)
+		if (isalnum(*(ptr++)))
+			return 1;
+	return 0;
+}
+
+static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
+{
+	for (; chg; chg--, i++)
+		if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
+				xe->xdf2.recs[i]->size))
+			return 1;
+	return 0;
+}
+
+/*
+ * This function merges m and m->next, marking everything between those hunks
+ * as conflicting, too.
+ */
+static void xdl_merge_two_conflicts(xdmerge_t *m)
+{
+	xdmerge_t *next_m = m->next;
+	m->chg1 += next_m->i1 + next_m->chg1 - m->i1;
+	m->chg2 += next_m->i2 + next_m->chg2 - m->i2;
+	m->next = next_m->next;
+	free(next_m);
+}
+
+static int xdl_non_alnum_conflicts(xdfenv_t *xe1, xdmerge_t *m)
+{
+	int result = 0;
+
+	for (;;) {
+		xdmerge_t *next_m = m->next;
+
+		if (!next_m)
+			return result;
+
+		if (lines_contain_alnum(xe1, m->i1 + m->chg1,
+				next_m->i1 + next_m->chg1 - 1
+				- m->i1 - m->chg1))
+			m = next_m;
+		else {
+			result++;
+			xdl_merge_two_conflicts(m);
+		}
+	}
+}
+
 /*
  * level == 0: mark all overlapping changes as conflict
  * level == 1: mark overlapping changes as conflict only if not identical
  * level == 2: analyze non-identical changes for minimal conflict set
+ * level == 3: analyze non-identical changes for minimal conflict set, but
+ *             treat hunks not containing any letter or number as conflicting
  *
  * returns < 0 on error, == 0 for no conflicts, else number of conflicts
  */
@@ -359,6 +412,10 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1, const char *name1,
 		xdl_cleanup_merge(changes);
 		return -1;
 	}
+	if (level > 2 && xdl_non_alnum_conflicts(xe1, changes) < 0) {
+		xdl_cleanup_merge(changes);
+		return -1;
+	}
 	/* output */
 	if (result) {
 		int size = xdl_fill_merge_buffer(xe1, name1, xe2, name2,
-- 
1.5.4.1.1321.g633fc8

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-13  1:10     ` [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Johannes Schindelin
@ 2008-02-13  1:34       ` Junio C Hamano
  2008-02-13 11:16         ` Johannes Schindelin
  2008-02-13  2:06       ` [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Linus Torvalds
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2008-02-13  1:34 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, Voltage Spike, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> When a merge conflicts, there are often common lines that are not really
> common, such as empty lines or lines containing a single curly bracket.
>
> With XDL_MERGE_ZEALOUS_ALNUM, we use the following heuristics: when a
> hunk does not contain any letters or digits, it is treated as conflicting.

I like the general direction.

This might need to be loosened further if we want to cover
Voltage's case where the inconveniently common hunk had another
line, "int err;", which had alnums.  Perhaps we would want to
say "max N alnums" instead of "no alnums".

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-13  1:10     ` [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Johannes Schindelin
  2008-02-13  1:34       ` Junio C Hamano
@ 2008-02-13  2:06       ` Linus Torvalds
  2008-02-13 11:22         ` Johannes Schindelin
  1 sibling, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2008-02-13  2:06 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: gitster, Voltage Spike, git



On Wed, 13 Feb 2008, Johannes Schindelin wrote:
> 
> With XDL_MERGE_ZEALOUS_ALNUM, we use the following heuristics: when a
> hunk does not contain any letters or digits, it is treated as conflicting.

Well, I think this is interesting in itself, but..

To some degree it would be even more interesting to at least partially 
separate the issue of "what conflicts" with the issue of "how do we 
express things when they _do_ conflict".

IOW, it's quite possible that we want to have the ZEALOUS algorithm for 
doing conflict resolution (on the assumption that we want aggressively 
merge), but then when conflicts happen _despite_ being zealous in the 
resolver, print out the resulting conflict with near-by conflicts merged 
into bigger block.

> In other words, a conflict which used to look like this:
> 
> 	<<<<<<<
> 	if (a == 1)
> 	=======
> 	if (a == 2)
> 	>>>>>>>
> 	{
> 	<<<<<<<
> 		b = 2;
> 	=======
> 		b = 1;
> 	>>>>>>>
> 
> will look like this with ZEALOUS_ALNUM:
> 
> 	<<<<<<<
> 	if (a == 1)
> 	{
> 		b = 2;
> 	=======
> 	if (a == 2)
> 	{
> 		b = 1;
> 	>>>>>>>

I think this is an improvement already, but to take the example that 
voltspike had:

 <<<<<<< HEAD:file.txt
 void newfunc1()
 =======
 void newfunc2()
 >>>>>>> merge:file.txt
 {
   int err;
 <<<<<<< HEAD:file.txt
   err = doSomething();
 =======
   err = doSomethingElse();
 >>>>>>> merge:file.txt

this does have alnum's in the shared region ("int err") so it wouldn't 
have been modified by this, but it would be nice to notice: "there were 
just two small lines between two conflicts, and we could actually make the 
final conflict marker _smaller_ by merging them", and just doing the 
reverse of xdl_refine_conflicts(), and do a "xdl_merge_conflicts()" before 
printout, and get

 <<<<<<< HEAD:file.txt
 void newfunc1()
 {
   int err;
   err = doSomething();
 =======
 void newfunc2()
 {
   int err;
   err = doSomethingElse();
 >>>>>>> merge:file.txt

(note how this really *is* smaller: it's 11 lines rather than 12 lines, 
because while we had to duplicate the two common lines in between the 
conflicts (+2), we got rid of the three marker lines (-3), giving us a net 
win of one line.

So the "merge adjacent conflicts" logic should actually be pretty simple: 
if there is less than three lines between two conflicts, the conflicts 
should always be merged, because the end result is smaller.

(And with three lines in between the end result is as many lines, but 
arguably simpler, so it's probably better to merge then too).

Hmm? What do you think?

			Linus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Merge-Recursive Improvements
  2008-02-12 22:16 Merge-Recursive Improvements Voltage Spike
                   ` (2 preceding siblings ...)
  2008-02-12 23:48 ` Linus Torvalds
@ 2008-02-13  7:39 ` Johannes Sixt
  2008-02-13  8:17   ` Steffen Prohaska
                     ` (2 more replies)
  3 siblings, 3 replies; 23+ messages in thread
From: Johannes Sixt @ 2008-02-13  7:39 UTC (permalink / raw)
  To: Voltage Spike; +Cc: git

Voltage Spike schrieb:
> Third, git doesn't appear to have any sense of context when performing a
> merge. Another contrived example which wouldn't be flagged as a merge
> conflict:
> 
>   ptr = malloc(len); // Added in HEAD.
>   init();            // Included in merge-base.
>   ptr = malloc(len); // Added in "merge".

You seem to say that you want this to result in a merge conflict.

I'm opposed to this: It means that you would mark a conflict if there is a
single unchanged line between the two changes that come from the merged
branches. So far it has happened for me much more frequently that such
merges were correct, and I should not be bothered with conflict markers. I
conciously prefer to pay the price that such a merge is incorrect on occasion.

You also need to draw a border line: a single unchanged line between the
changes? Or better also conflict at 2 lines? Or 3?

-- Hannes

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Merge-Recursive Improvements
  2008-02-13  7:39 ` Merge-Recursive Improvements Johannes Sixt
@ 2008-02-13  8:17   ` Steffen Prohaska
  2008-02-13  8:21   ` Voltage Spike
  2008-02-15 19:21   ` Junio C Hamano
  2 siblings, 0 replies; 23+ messages in thread
From: Steffen Prohaska @ 2008-02-13  8:17 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Voltage Spike, git


On Feb 13, 2008, at 8:39 AM, Johannes Sixt wrote:

> Voltage Spike schrieb:
>> Third, git doesn't appear to have any sense of context when  
>> performing a
>> merge. Another contrived example which wouldn't be flagged as a merge
>> conflict:
>>
>>   ptr = malloc(len); // Added in HEAD.
>>   init();            // Included in merge-base.
>>   ptr = malloc(len); // Added in "merge".
>
> You seem to say that you want this to result in a merge conflict.
>
> I'm opposed to this: It means that you would mark a conflict if  
> there is a
> single unchanged line between the two changes that come from the  
> merged
> branches. So far it has happened for me much more frequently that such
> merges were correct, and I should not be bothered with conflict  
> markers. I
> conciously prefer to pay the price that such a merge is incorrect  
> on occasion.
>
> You also need to draw a border line: a single unchanged line  
> between the
> changes? Or better also conflict at 2 lines? Or 3?

Maybe git could try various numbers and print a certainty
measure that tells the user how far appart non-conflicting
changes are.  If changes are near git would print a low
certainty and the user could decide to review the merge in
more detail than he would usually do.

	Steffen

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Merge-Recursive Improvements
  2008-02-13  7:39 ` Merge-Recursive Improvements Johannes Sixt
  2008-02-13  8:17   ` Steffen Prohaska
@ 2008-02-13  8:21   ` Voltage Spike
  2008-02-13  8:46     ` Johannes Sixt
  2008-02-15 19:21   ` Junio C Hamano
  2 siblings, 1 reply; 23+ messages in thread
From: Voltage Spike @ 2008-02-13  8:21 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git

On Feb 13, 2008, at 12:39 AM, Johannes Sixt wrote:

> Voltage Spike schrieb:
>> Third, git doesn't appear to have any sense of context when  
>> performing a
>> merge. Another contrived example which wouldn't be flagged as a merge
>> conflict:
>>
>>   ptr = malloc(len); // Added in HEAD.
>>   init();            // Included in merge-base.
>>   ptr = malloc(len); // Added in "merge".
>
> You seem to say that you want this to result in a merge conflict.

Yes, it appears that I wasn't clear that I see the above as a conflict.

> I'm opposed to this: It means that you would mark a conflict if  
> there is a
> single unchanged line between the two changes that come from the  
> merged
> branches. So far it has happened for me much more frequently that such
> merges were correct, and I should not be bothered with conflict  
> markers. I
> conciously prefer to pay the price that such a merge is incorrect  
> on occasion.

That is why I'm hoping to make it configurable. I know that we have  
more information than during a simple patch, but it seems odd that  
changes can be occurring all around your local modifications and  
you'll never be notified.

Which leads to a different point: does this lessen the value of  
falling back to a 3-way merge during a rebase?

> You also need to draw a border line: a single unchanged line  
> between the
> changes? Or better also conflict at 2 lines? Or 3?

I naturally assumed the default number of context lines: 3. If I  
recall correctly, this isn't typically configurable.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Merge-Recursive Improvements
  2008-02-13  8:21   ` Voltage Spike
@ 2008-02-13  8:46     ` Johannes Sixt
  0 siblings, 0 replies; 23+ messages in thread
From: Johannes Sixt @ 2008-02-13  8:46 UTC (permalink / raw)
  To: Voltage Spike; +Cc: git

Voltage Spike schrieb:
> On Feb 13, 2008, at 12:39 AM, Johannes Sixt wrote:
> 
>> Voltage Spike schrieb:
>>> Third, git doesn't appear to have any sense of context when performing a
>>> merge. Another contrived example which wouldn't be flagged as a merge
>>> conflict:
>>>
>>>   ptr = malloc(len); // Added in HEAD.
>>>   init();            // Included in merge-base.
>>>   ptr = malloc(len); // Added in "merge".
>>
>> You seem to say that you want this to result in a merge conflict.
> 
> Yes, it appears that I wasn't clear that I see the above as a conflict.
> 
>> I'm opposed to this: It means that you would mark a conflict if there
>> is a
>> single unchanged line between the two changes that come from the merged
>> branches. So far it has happened for me much more frequently that such
>> merges were correct, and I should not be bothered with conflict
>> markers. I
>> conciously prefer to pay the price that such a merge is incorrect on
>> occasion.
> 
> That is why I'm hoping to make it configurable. I know that we have more
> information than during a simple patch, but it seems odd that changes
> can be occurring all around your local modifications and you'll never be
> notified.
> 
> Which leads to a different point: does this lessen the value of falling
> back to a 3-way merge during a rebase?

The current non-conflicting merges are invaluable for my workflow, which
involves lots and lots of rebasing and cherry-picking.

>> You also need to draw a border line: a single unchanged line between the
>> changes? Or better also conflict at 2 lines? Or 3?
> 
> I naturally assumed the default number of context lines: 3. If I recall
> correctly, this isn't typically configurable.

Nawww... Guess how many, many more conflicts this would report?

Practically all merges that I do are during rebase and cherry-pick. During
this work I often have changes that are separated by only a single line.
The potential merge conflicts that fall in the above category I know in
advance because I've made the changes just two minutes ago, and I can fix
them even without being reminded by a merge conflict.

IOW: I don't need conflict markers in this case - I need them not to
conflict at all.

-- Hannes

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-13  1:34       ` Junio C Hamano
@ 2008-02-13 11:16         ` Johannes Schindelin
  2008-02-15 17:32           ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Johannes Schindelin @ 2008-02-13 11:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Voltage Spike, git

Hi,

On Tue, 12 Feb 2008, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > When a merge conflicts, there are often common lines that are not 
> > really common, such as empty lines or lines containing a single curly 
> > bracket.
> >
> > With XDL_MERGE_ZEALOUS_ALNUM, we use the following heuristics: when a 
> > hunk does not contain any letters or digits, it is treated as 
> > conflicting.
> 
> I like the general direction.
> 
> This might need to be loosened further if we want to cover Voltage's 
> case where the inconveniently common hunk had another line, "int err;", 
> which had alnums.  Perhaps we would want to say "max N alnums" instead 
> of "no alnums".

Maybe even both.  

As Linus has stated in the other reply, up to three lines between two 
conflicts could be "merged" with the conflicts by default, because less or 
equally much screen estate would used.

So I am thinking about an interface that is not too painful.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-13  2:06       ` [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Linus Torvalds
@ 2008-02-13 11:22         ` Johannes Schindelin
  0 siblings, 0 replies; 23+ messages in thread
From: Johannes Schindelin @ 2008-02-13 11:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: gitster, Voltage Spike, git

Hi,

On Tue, 12 Feb 2008, Linus Torvalds wrote:

> On Wed, 13 Feb 2008, Johannes Schindelin wrote:
> > 
> > With XDL_MERGE_ZEALOUS_ALNUM, we use the following heuristics: when a 
> > hunk does not contain any letters or digits, it is treated as 
> > conflicting.
> 
> [...]
> 
> So the "merge adjacent conflicts" logic should actually be pretty 
> simple: if there is less than three lines between two conflicts, the 
> conflicts should always be merged, because the end result is smaller.
> 
> (And with three lines in between the end result is as many lines, but 
> arguably simpler, so it's probably better to merge then too).
> 
> Hmm? What do you think?

Makes sense.  As I said to Junio, I'll think about an interface to do 
this.  The obvious choice is to have a struct, but that has to be memset() 
to 0 for future compatibility.

OTOH there's xpparam_t already, and we could just have that as a member of 
the new struct, something like

typedef struct s_xmergeparam {
	xpparam_t xpp;
	enum {
		XDL_MERGE_MINIMAL = 0,
		XDL_MERGE_EAGER,
		XDL_MERGE_ZEALOUS,
		XDL_MERGE_ZEALOUS_ALNUM
	} mode;
	/* minimum number of inter-conflict lines goes here */
} xmergeparam_t;

Hmm.  This has to simmer in my head a bit.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-13 11:16         ` Johannes Schindelin
@ 2008-02-15 17:32           ` Junio C Hamano
  2008-02-15 18:17             ` Linus Torvalds
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2008-02-15 17:32 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, Voltage Spike, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Maybe even both.  
>
> As Linus has stated in the other reply, up to three lines between two 
> conflicts could be "merged" with the conflicts by default, because less or 
> equally much screen estate would used.
>
> So I am thinking about an interface that is not too painful.

I think there is no excuse not to coalesce hunks separated by
three lines or less, so we can first get immediate improvement
without any configuration or tweaking.  My "less than N alnums"
was ill thought out overengineering, and Linus's improvement is
much cleaner.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-15 17:32           ` Junio C Hamano
@ 2008-02-15 18:17             ` Linus Torvalds
  2008-02-15 18:23               ` Johannes Schindelin
  2008-02-17 19:06               ` Johannes Schindelin
  0 siblings, 2 replies; 23+ messages in thread
From: Linus Torvalds @ 2008-02-15 18:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Voltage Spike, git



On Fri, 15 Feb 2008, Junio C Hamano wrote:
> 
> I think there is no excuse not to coalesce hunks separated by
> three lines or less

Well, I think the two line limit is the "unquestionable" one, since that's 
the one that actually results in fewer lines over-all. 

The three-line case gets a bit less obvious since the line count doesn't 
change, and if the unchanged lines are complex, it might well be better to 
leave them as shared. What's not uncommon at all is that you have a small 
change that results in a new variable or similar, and then it's quite 
possible that the first conflict comes from a new variable declaration, 
and the second conflict is the "real code" change, and if there are three 
complex lines in between, it probably makes sense to keep them unmodified, 
and have two much simpler choices.

In fact, in many ways, maybe we'd be better off counting (non-space) bytes 
rather than lines. That gets the "complexity" argument mostly right.

			Linus

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-15 18:17             ` Linus Torvalds
@ 2008-02-15 18:23               ` Johannes Schindelin
  2008-02-17 19:06               ` Johannes Schindelin
  1 sibling, 0 replies; 23+ messages in thread
From: Johannes Schindelin @ 2008-02-15 18:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Voltage Spike, git

Hi,

On Fri, 15 Feb 2008, Linus Torvalds wrote:

> In fact, in many ways, maybe we'd be better off counting (non-space) 
> bytes rather than lines. That gets the "complexity" argument mostly 
> right.

I don't like it.  It's not simple enough.  Let's stay with 3 lines, and if 
it turns out to be a bad choice, change it to two.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Merge-Recursive Improvements
  2008-02-13  7:39 ` Merge-Recursive Improvements Johannes Sixt
  2008-02-13  8:17   ` Steffen Prohaska
  2008-02-13  8:21   ` Voltage Spike
@ 2008-02-15 19:21   ` Junio C Hamano
  2 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2008-02-15 19:21 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Voltage Spike, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> Voltage Spike schrieb:
>> Third, git doesn't appear to have any sense of context when performing a
>> merge. Another contrived example which wouldn't be flagged as a merge
>> conflict:
>> 
>>   ptr = malloc(len); // Added in HEAD.
>>   init();            // Included in merge-base.
>>   ptr = malloc(len); // Added in "merge".
>
> You seem to say that you want this to result in a merge conflict.
>
> I'm opposed to this: It means that you would mark a conflict if there is a
> single unchanged line between the two changes that come from the merged
> branches. So far it has happened for me much more frequently that such
> merges were correct, and I should not be bothered with conflict markers. I
> conciously prefer to pay the price that such a merge is incorrect on occasion.

Actually I think we really should mark this as conflict.  The
tool should resolve only the most unquestionable cases and keep
humans in the loop to validate the result if there is any
uncertainty.  Resolving the above example automatically without
warning is most likely a problem waiting to happen.

Such a merge being more often correct than not is not an
argument for resolving them silently.  It's rare mismerge cases
that will bite you later, and we should really be careful,
especially when a mismerge is less common.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-15 18:17             ` Linus Torvalds
  2008-02-15 18:23               ` Johannes Schindelin
@ 2008-02-17 19:06               ` Johannes Schindelin
  2008-02-17 19:07                 ` [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler Johannes Schindelin
  1 sibling, 1 reply; 23+ messages in thread
From: Johannes Schindelin @ 2008-02-17 19:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Voltage Spike, git

Hi,

On Fri, 15 Feb 2008, Linus Torvalds wrote:

> On Fri, 15 Feb 2008, Junio C Hamano wrote:
> > 
> > I think there is no excuse not to coalesce hunks separated by three 
> > lines or less
> 
> Well, I think the two line limit is the "unquestionable" one, since 
> that's the one that actually results in fewer lines over-all.

Well, I hit a problem.  It is visible in t7201-co.sh:

Suppose you have these files

new1	orig	new2
1	1	1
2	2	3
3	3	4
4	4	5
5	5	6
7	6	7
8	7	8
	8

In other words: if the "6" was removed in the first case, and the "2" in 
the second case, all of a sudden changes which were not really conflicting 
(if one side was unchanged, it is considered a resolvable "conflict") now 
_will_ conflict.

In the upcoming patch, I now restrict this merging of conflicts to 
non-resolvable conflicts only.

Will send it out shortly.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler
  2008-02-17 19:06               ` Johannes Schindelin
@ 2008-02-17 19:07                 ` Johannes Schindelin
  2008-02-17 19:07                   ` [PATCH(RFC) 2/2] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Johannes Schindelin
  2008-02-18  8:35                   ` [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler Junio C Hamano
  0 siblings, 2 replies; 23+ messages in thread
From: Johannes Schindelin @ 2008-02-17 19:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Voltage Spike, git


When a merge conflicts, there are often less than three common lines
between two conflicting regions.

Since a conflict takes up as many lines as are conflicting, plus three
lines for the commit markers,  the output will be shorter (and thus,
simpler) in this case, if the common lines will be merged into the
conflicting regions.

This patch merges up to three common lines into the conflicts.

For example, what looked like this before this patch:

	<<<<<<<
	if (a == 1)
	=======
	if (a != 0)
	>>>>>>>
	{
		int i;
	<<<<<<<
		a = 0;
	=======
		a = !a;
	>>>>>>>

will now look like this:

	<<<<<<<
	if (a == 1)
	{
		int i;
		a = 0;
	=======
	if (a != 0)
	{
		int i;
		a = !a;
	>>>>>>>

Suggested Linus (based on ideas by "Voltage Spike" -- if that name is
real, it is mighty cool).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t6023-merge-file.sh |   10 ++++++++++
 xdiff/xmerge.c        |   47 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 56 insertions(+), 1 deletions(-)

diff --git a/t/t6023-merge-file.sh b/t/t6023-merge-file.sh
index 8641996..869e8d5 100755
--- a/t/t6023-merge-file.sh
+++ b/t/t6023-merge-file.sh
@@ -139,4 +139,14 @@ test_expect_success 'binary files cannot be merged' '
 	grep "Cannot merge binary files" merge.err
 '
 
+sed -e "s/deerit.$/deerit;/" -e "s/me;$/me./" < new5.txt > new6.txt
+sed -e "s/deerit.$/deerit,/" -e "s/me;$/me,/" < new5.txt > new7.txt
+
+test_expect_success 'MERGE_ZEALOUS simplifies non-conflicts' '
+
+	! git merge-file -p new6.txt new5.txt new7.txt > output &&
+	test 1 = $(grep ======= < output | wc -l)
+
+'
+
 test_done
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index b83b334..9cd448c 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -249,6 +249,49 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
 }
 
 /*
+ * This function merges m and m->next, marking everything between those hunks
+ * as conflicting, too.
+ */
+static void xdl_merge_two_conflicts(xdmerge_t *m)
+{
+	xdmerge_t *next_m = m->next;
+	m->chg1 = next_m->i1 + next_m->chg1 - m->i1;
+	m->chg2 = next_m->i2 + next_m->chg2 - m->i2;
+	m->next = next_m->next;
+	free(next_m);
+}
+
+/*
+ * If there are less than 3 non-conflicting lines between conflicts,
+ * it appears simpler -- because it takes up less (or as many) lines --
+ * if the lines are moved into the conflicts.
+ */
+static int xdl_simplify_non_conflicts(xdfenv_t *xe1, xdmerge_t *m)
+{
+	int result = 0;
+
+	if (!m)
+		return result;
+	for (;;) {
+		xdmerge_t *next_m = m->next;
+		int begin, end;
+
+		if (!next_m)
+			return result;
+
+		begin = m->i1 + m->chg1;
+		end = next_m->i1;
+
+		if (m->mode != 0 || next_m->mode != 0 || end - begin > 3)
+			m = next_m;
+		else {
+			result++;
+			xdl_merge_two_conflicts(m);
+		}
+	}
+}
+
+/*
  * level == 0: mark all overlapping changes as conflict
  * level == 1: mark overlapping changes as conflict only if not identical
  * level == 2: analyze non-identical changes for minimal conflict set
@@ -355,7 +398,9 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1, const char *name1,
 	if (!changes)
 		changes = c;
 	/* refine conflicts */
-	if (level > 1 && xdl_refine_conflicts(xe1, xe2, changes, xpp) < 0) {
+	if (level > 1 &&
+			(xdl_refine_conflicts(xe1, xe2, changes, xpp) < 0 ||
+			 xdl_simplify_non_conflicts(xe1, changes) < 0)) {
 		xdl_cleanup_merge(changes);
 		return -1;
 	}
-- 
1.5.4.1.1396.g177d-dirty

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH(RFC) 2/2] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM
  2008-02-17 19:07                 ` [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler Johannes Schindelin
@ 2008-02-17 19:07                   ` Johannes Schindelin
  2008-02-18  8:35                   ` [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler Junio C Hamano
  1 sibling, 0 replies; 23+ messages in thread
From: Johannes Schindelin @ 2008-02-17 19:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Voltage Spike, git


When a merge conflicts, there are often common lines that are not really
common, such as empty lines or lines containing a single curly bracket.

With XDL_MERGE_ZEALOUS_ALNUM, we use the following heuristics: when a
hunk does not contain any letters or digits, it is treated as conflicting.

In other words, a conflict which used to look like this:

	<<<<<<<
					a = 1;
	=======
					output();
	>>>>>>>
				}
			}
		}

	<<<<<<<
		output();
	=======
		b = 1;
	>>>>>>>

will look like this with ZEALOUS_ALNUM:

	<<<<<<<
					a = 1;
				}
			}
		}

		output();
	=======
					output();
				}
			}
		}

		b = 1;
	>>>>>>>

To demonstrate this, git-merge-file has been switched from
XDL_MERGE_ZEALOUS to XDL_MERGE_ZEALOUS_ALNUM.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

Conflicts:

	t/t6023-merge-file.sh
---
 builtin-merge-file.c  |    2 +-
 t/t6023-merge-file.sh |   10 ++++++++++
 xdiff/xdiff.h         |    1 +
 xdiff/xmerge.c        |   31 ++++++++++++++++++++++++++++---
 4 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/builtin-merge-file.c b/builtin-merge-file.c
index 58deb62..adce6d4 100644
--- a/builtin-merge-file.c
+++ b/builtin-merge-file.c
@@ -46,7 +46,7 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
 	}
 
 	ret = xdl_merge(mmfs + 1, mmfs + 0, names[0], mmfs + 2, names[2],
-			&xpp, XDL_MERGE_ZEALOUS, &result);
+			&xpp, XDL_MERGE_ZEALOUS_ALNUM, &result);
 
 	for (i = 0; i < 3; i++)
 		free(mmfs[i].ptr);
diff --git a/t/t6023-merge-file.sh b/t/t6023-merge-file.sh
index 869e8d5..79dc58b 100755
--- a/t/t6023-merge-file.sh
+++ b/t/t6023-merge-file.sh
@@ -149,4 +149,14 @@ test_expect_success 'MERGE_ZEALOUS simplifies non-conflicts' '
 
 '
 
+sed -e 's/deerit./&\n\n\n\n/' -e "s/locavit,/locavit;/" < new6.txt > new8.txt
+sed -e 's/deerit./&\n\n\n\n/' -e "s/locavit,/locavit --/" < new7.txt > new9.txt
+
+test_expect_success 'ZEALOUS_ALNUM' '
+
+	! git merge-file -p new8.txt new5.txt new9.txt > merge.out &&
+	test 1 = $(grep ======= < merge.out | wc -l)
+
+'
+
 test_done
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index c00ddaa..413082e 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -53,6 +53,7 @@ extern "C" {
 #define XDL_MERGE_MINIMAL 0
 #define XDL_MERGE_EAGER 1
 #define XDL_MERGE_ZEALOUS 2
+#define XDL_MERGE_ZEALOUS_ALNUM 3
 
 typedef struct s_mmfile {
 	char *ptr;
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 9cd448c..2128eaf 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -248,6 +248,23 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
 	return 0;
 }
 
+static int line_contains_alnum(const char *ptr, long size)
+{
+	while (size--)
+		if (isalnum(*(ptr++)))
+			return 1;
+	return 0;
+}
+
+static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
+{
+	for (; chg; chg--, i++)
+		if (line_contains_alnum(xe->xdf2.recs[i]->ptr,
+				xe->xdf2.recs[i]->size))
+			return 1;
+	return 0;
+}
+
 /*
  * This function merges m and m->next, marking everything between those hunks
  * as conflicting, too.
@@ -266,7 +283,8 @@ static void xdl_merge_two_conflicts(xdmerge_t *m)
  * it appears simpler -- because it takes up less (or as many) lines --
  * if the lines are moved into the conflicts.
  */
-static int xdl_simplify_non_conflicts(xdfenv_t *xe1, xdmerge_t *m)
+static int xdl_simplify_non_conflicts(xdfenv_t *xe1, xdmerge_t *m,
+		int simplify_if_no_alnum)
 {
 	int result = 0;
 
@@ -282,7 +300,11 @@ static int xdl_simplify_non_conflicts(xdfenv_t *xe1, xdmerge_t *m)
 		begin = m->i1 + m->chg1;
 		end = next_m->i1;
 
-		if (m->mode != 0 || next_m->mode != 0 || end - begin > 3)
+		if (m->mode != 0 || next_m->mode != 0 ||
+				(end - begin > 3 &&
+				 (!simplify_if_no_alnum ||
+				  lines_contain_alnum(xe1, begin,
+					  end - begin))))
 			m = next_m;
 		else {
 			result++;
@@ -295,6 +317,8 @@ static int xdl_simplify_non_conflicts(xdfenv_t *xe1, xdmerge_t *m)
  * level == 0: mark all overlapping changes as conflict
  * level == 1: mark overlapping changes as conflict only if not identical
  * level == 2: analyze non-identical changes for minimal conflict set
+ * level == 3: analyze non-identical changes for minimal conflict set, but
+ *             treat hunks not containing any letter or number as conflicting
  *
  * returns < 0 on error, == 0 for no conflicts, else number of conflicts
  */
@@ -400,7 +424,8 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1, const char *name1,
 	/* refine conflicts */
 	if (level > 1 &&
 			(xdl_refine_conflicts(xe1, xe2, changes, xpp) < 0 ||
-			 xdl_simplify_non_conflicts(xe1, changes) < 0)) {
+			 xdl_simplify_non_conflicts(xe1, changes,
+				level > 2) < 0)) {
 		xdl_cleanup_merge(changes);
 		return -1;
 	}
-- 
1.5.4.1.1396.g177d-dirty

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler
  2008-02-17 19:07                 ` [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler Johannes Schindelin
  2008-02-17 19:07                   ` [PATCH(RFC) 2/2] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Johannes Schindelin
@ 2008-02-18  8:35                   ` Junio C Hamano
  2008-02-18 11:33                     ` Johannes Schindelin
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2008-02-18  8:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, Voltage Spike, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> When a merge conflicts, there are often less than three common lines
> between two conflicting regions.
>
> Since a conflict takes up as many lines as are conflicting, plus three
> lines for the commit markers,  the output will be shorter (and thus,
> simpler) in this case, if the common lines will be merged into the
> conflicting regions.
>
> This patch merges up to three common lines into the conflicts.

I can give immediate positive feedback to this.

When I rebuilt "next" last night, I considered rebasing its
constituent branches while I was at it (I ended up not doing
that as I felt it was too much).

The tip of js/reflog-delete used to be at cb97cc9.  Rebasing
this on top of any recent master will give you unreadable
conflicts in t/t1410-reflog.sh, but with these two patches (but
the second one does not have chance to kick in for this
particular case) the result is quite obvious and much cleaner.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler
  2008-02-18  8:35                   ` [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler Junio C Hamano
@ 2008-02-18 11:33                     ` Johannes Schindelin
  0 siblings, 0 replies; 23+ messages in thread
From: Johannes Schindelin @ 2008-02-18 11:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Voltage Spike, git

Hi,

On Mon, 18 Feb 2008, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > When a merge conflicts, there are often less than three common lines 
> > between two conflicting regions.
> >
> > Since a conflict takes up as many lines as are conflicting, plus three 
> > lines for the commit markers, the output will be shorter (and thus, 
> > simpler) in this case, if the common lines will be merged into the 
> > conflicting regions.
> >
> > This patch merges up to three common lines into the conflicts.
> 
> I can give immediate positive feedback to this.
> 
> When I rebuilt "next" last night, I considered rebasing its constituent 
> branches while I was at it (I ended up not doing that as I felt it was 
> too much).
> 
> The tip of js/reflog-delete used to be at cb97cc9.  Rebasing this on top 
> of any recent master will give you unreadable conflicts in 
> t/t1410-reflog.sh, but with these two patches (but the second one does 
> not have chance to kick in for this particular case) the result is quite 
> obvious and much cleaner.

Great!

Note that the _ALNUM stuff was meant more for discussion, as it only is 
activated for merge-file, not for merge-recursive or blame or all the 
other (indirect) users of xdl_merge().

I am a bit hesitant on activating it, since merging is pretty important, 
after all, and if I break things with it...

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2008-02-18 11:34 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-02-12 22:16 Merge-Recursive Improvements Voltage Spike
2008-02-12 23:03 ` Stefan Monnier
2008-02-12 23:11 ` Junio C Hamano
2008-02-12 23:48 ` Linus Torvalds
2008-02-13  0:05   ` Johannes Schindelin
2008-02-13  1:10     ` [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Johannes Schindelin
2008-02-13  1:34       ` Junio C Hamano
2008-02-13 11:16         ` Johannes Schindelin
2008-02-15 17:32           ` Junio C Hamano
2008-02-15 18:17             ` Linus Torvalds
2008-02-15 18:23               ` Johannes Schindelin
2008-02-17 19:06               ` Johannes Schindelin
2008-02-17 19:07                 ` [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler Johannes Schindelin
2008-02-17 19:07                   ` [PATCH(RFC) 2/2] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Johannes Schindelin
2008-02-18  8:35                   ` [PATCH 1/2] xdl_merge(): make XDL_MERGE_ZEALOUS output simpler Junio C Hamano
2008-02-18 11:33                     ` Johannes Schindelin
2008-02-13  2:06       ` [PATCH] xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM Linus Torvalds
2008-02-13 11:22         ` Johannes Schindelin
2008-02-13  7:39 ` Merge-Recursive Improvements Johannes Sixt
2008-02-13  8:17   ` Steffen Prohaska
2008-02-13  8:21   ` Voltage Spike
2008-02-13  8:46     ` Johannes Sixt
2008-02-15 19:21   ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).