git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* auto merge bug
@ 2013-03-04 16:46 David Krmpotic
  2013-03-05  9:03 ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: David Krmpotic @ 2013-03-04 16:46 UTC (permalink / raw)
  To: git

Hi!

We started working on a .NET app and the XML project file (.csproj)
got corrupted (a few closing tag missing).

79	     <Compile Include="SlovaricaForm.Designer.cs">
80	       <DependentUpon>SlovaricaForm.cs</DependentUpon>
81	+    <Compile Include="WebCamForm.cs">
82	+      <SubType>Form</SubType>
83	+    </Compile>
84	+    <Compile Include="WebCamForm.Designer.cs">
85	+      <DependentUpon>WebCamForm.cs</DependentUpon>
86	     </Compile>

between lines 80 and 81 there should be </Compile>

similarly:

121	     </EmbeddedResource>
122	     <EmbeddedResource Include="SlovaricaForm.resx">
123	       <DependentUpon>SlovaricaForm.cs</DependentUpon>
124	+    <EmbeddedResource Include="WebCamForm.resx">
125	+      <DependentUpon>WebCamForm.cs</DependentUpon>
126	     </EmbeddedResource>
127	     <EmbeddedResource Include="WordsSelectForm.resx">
128	       <DependentUpon>WordsSelectForm.cs</DependentUpon>

between 123 and 124 there is  </EmbeddedResource> missing.

The problematic commit is here:

https://github.com/davidhq/logo_x/commit/e3e5fa4b60b7939999b2a8c44330312755b72f93

it has two parents: ae2a364 and bd1a059

on both parents the project compiles in Visual Studio because
Logo.csproj is not corrupted.

How to reproduce and see that really there were no conflicts and the
file became corrupted:

C:\temp> git clone git@github.com:davidhq/logo_x.git
C:\temp\logo_x [master]> git checkout ae2a364
Note: checking out 'ae2a364'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at ae2a364... general handler for letters
C:\temp\logo_x [(ae2a364...)]> git merge bd1a059
Auto-merging Logo/Logo.csproj
Merge made by the 'recursive' strategy.
 Logo/Logo.csproj            |   7 ++
 Logo/WebCamForm.Designer.cs |  88 +++++++++++++++++++
 Logo/WebCamForm.cs          | 209 ++++++++++++++++++++++++++++++++++++++++++++
 Logo/WebCamForm.resx        | 120 +++++++++++++++++++++++++
 Logo/WordsForm.Designer.cs  |   1 +
 Logo/WordsForm.cs           |   7 ++
 6 files changed, 432 insertions(+)
 create mode 100644 Logo/WebCamForm.Designer.cs
 create mode 100644 Logo/WebCamForm.cs
 create mode 100644 Logo/WebCamForm.resx

Now check Logo.csproj and observe line 81 (it should read </Compile>

If I add both missing closing tags the project compiles again.

Please investigate and thank you!

PS: on Windows I have version 1.8.0.msysgit.0 of git and on Mac I'm
not sure now, it's a bit older, but the same problem happens.

David

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: auto merge bug
  2013-03-04 16:46 auto merge bug David Krmpotic
@ 2013-03-05  9:03 ` Jeff King
  2013-03-05  9:12   ` Jeff King
  2013-03-05 15:44   ` Junio C Hamano
  0 siblings, 2 replies; 9+ messages in thread
From: Jeff King @ 2013-03-05  9:03 UTC (permalink / raw)
  To: David Krmpotic; +Cc: git

On Mon, Mar 04, 2013 at 05:46:48PM +0100, David Krmpotic wrote:

> We started working on a .NET app and the XML project file (.csproj)
> got corrupted (a few closing tag missing).
> 
> 79	     <Compile Include="SlovaricaForm.Designer.cs">
> 80	       <DependentUpon>SlovaricaForm.cs</DependentUpon>
> 81	+    <Compile Include="WebCamForm.cs">
> 82	+      <SubType>Form</SubType>
> 83	+    </Compile>
> 84	+    <Compile Include="WebCamForm.Designer.cs">
> 85	+      <DependentUpon>WebCamForm.cs</DependentUpon>
> 86	     </Compile>
> 
> between lines 80 and 81 there should be </Compile>
>
> [...]
>
> The problematic commit is here:
> 
> https://github.com/davidhq/logo_x/commit/e3e5fa4b60b7939999b2a8c44330312755b72f93

Thanks for an easy-to-reproduce report. The problem here is that your
.gitattributes file specifies the "union" merge driver for .csproj
(and other) files. From "git help attributes":

           union
               Run 3-way file level merge for text files, but take lines
               from both versions, instead of leaving conflict markers.
               This tends to leave the added lines in the resulting file
               in random order and the user should verify the result. Do
               not use this if you do not understand the implications.

Your <Compile> stanzas each end on an identical line. So it sees that
one side has:

  A
  B
  Z

and the other side has:

  C
  D
  Z

It realizes that the "Z" is common, so is not part of the conflict. But
in the normal 3-way merge case, the rest of it conflicts, so you get a
chance to inspect it. But with "union", it just silently concatenates
the conflicting bits.

I suspect you can run into other problems with "union" here, too,
because line order _does_ matter for you. It comes close to working if
both sides are just adding elements at the same level of the tree (as
you are here), but what about more complicated edits?

I think what you really want is an XML-aware merge tool that can see you
just added two independent <Compile>...</Compile> stanzas that can
co-exist (i.e., it could do a union, but at the level of XML tags, not
at the level of individual lines).  I do not know offhand of any such
tool (or for that matter, a good general XML-aware 3-way merge tool),
but if you had one, you could plug it in as a custom merge driver.

You might be able to get by with a version of the "union" driver that
asks the 3-way merge driver to be less aggressive about shrinking the
conflict blocks. For example, with this patch to git:

diff --git a/ll-merge.c b/ll-merge.c
index fb61ea6..61b1d4e 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -100,7 +100,6 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 	}
 
 	memset(&xmp, 0, sizeof(xmp));
-	xmp.level = XDL_MERGE_ZEALOUS;
 	xmp.favor = opts->variant;
 	xmp.xpp.flags = opts->xdl_opts;
 	if (git_xmerge_style >= 0)

I think the merge will produce the results you are looking for. This
would have to be configurable, though, as it is a regression for
existing users of "union", which would want the duplicate-line
suppression (or maybe not; it will only catch such duplicates at the
beginning and end of the conflict hunk, so maybe it is sane to always
ask "union" to keep all lines).

I'd still worry about more complicated edits fooling "union", but at
least the simple cases would work.

In the meantime, I think you are better to drop those "merge"
gitattributes, and just let the regular three way merge generate a
conflict which you can inspect. For the merge in question, it yields:

  diff --cc Logo/Logo.csproj
  index 4113434,c681862..0000000
  --- a/Logo/Logo.csproj
  +++ b/Logo/Logo.csproj
  @@@ -67,17 -67,11 +67,25 @@@
        <Reference Include="System.Xml" />
      </ItemGroup>
      <ItemGroup>
  ++<<<<<<< HEAD
   +    <Compile Include="MainForm.cs">
   +      <SubType>Form</SubType>
   +    </Compile>
   +    <Compile Include="MainForm.Designer.cs">
   +      <DependentUpon>MainForm.cs</DependentUpon>
   +    </Compile>
   +    <Compile Include="SlovaricaForm.cs">
   +      <SubType>Form</SubType>
   +    </Compile>
   +    <Compile Include="SlovaricaForm.Designer.cs">
   +      <DependentUpon>SlovaricaForm.cs</DependentUpon>
  ++=======
  +     <Compile Include="WebCamForm.cs">
  +       <SubType>Form</SubType>
  +     </Compile>
  +     <Compile Include="WebCamForm.Designer.cs">
  +       <DependentUpon>WebCamForm.cs</DependentUpon>
  ++>>>>>>> bd1a059
        </Compile>
        <Compile Include="WordsSelectForm.cs">
          <SubType>Form</SubType>

where you can see that it "shrinks" the conflict hunk to not include the
line added by both sides (the file "</Compile>" after the conflict). But
by triggering a conflict, you can actually look at and fix it. That's
more work, of course.

Another alternate is to keep the "union" driver and just do better
testing of merges. Even with the stock 3-way driver, a merge that
auto-resolves is not necessarily correct (e.g., even if there are not
textual conflicts, there may be semantic ones).

-Peff

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: auto merge bug
  2013-03-05  9:03 ` Jeff King
@ 2013-03-05  9:12   ` Jeff King
  2013-03-05 15:44   ` Junio C Hamano
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff King @ 2013-03-05  9:12 UTC (permalink / raw)
  To: David Krmpotic; +Cc: git

On Tue, Mar 05, 2013 at 04:03:26AM -0500, Jeff King wrote:

> You might be able to get by with a version of the "union" driver that
> asks the 3-way merge driver to be less aggressive about shrinking the
> conflict blocks. For example, with this patch to git:
> 
> diff --git a/ll-merge.c b/ll-merge.c
> index fb61ea6..61b1d4e 100644
> --- a/ll-merge.c
> +++ b/ll-merge.c
> @@ -100,7 +100,6 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
>  	}
>  
>  	memset(&xmp, 0, sizeof(xmp));
> -	xmp.level = XDL_MERGE_ZEALOUS;
>  	xmp.favor = opts->variant;
>  	xmp.xpp.flags = opts->xdl_opts;
>  	if (git_xmerge_style >= 0)
> 
> I think the merge will produce the results you are looking for. This
> would have to be configurable, though, as it is a regression for
> existing users of "union", which would want the duplicate-line
> suppression (or maybe not; it will only catch such duplicates at the
> beginning and end of the conflict hunk, so maybe it is sane to always
> ask "union" to keep all lines).

Here's what the patch would look like to make it non-configurable, but
to just trigger for the "union" case:

diff --git a/ll-merge.c b/ll-merge.c
index fb61ea6..fc33a23 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -83,7 +83,8 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			mmfile_t *src1, const char *name1,
 			mmfile_t *src2, const char *name2,
 			const struct ll_merge_options *opts,
-			int marker_size)
+			int marker_size,
+			int level)
 {
 	xmparam_t xmp;
 	assert(opts);
@@ -100,7 +101,7 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 	}
 
 	memset(&xmp, 0, sizeof(xmp));
-	xmp.level = XDL_MERGE_ZEALOUS;
+	xmp.level = level;
 	xmp.favor = opts->variant;
 	xmp.xpp.flags = opts->xdl_opts;
 	if (git_xmerge_style >= 0)
@@ -129,7 +130,23 @@ static int ll_union_merge(const struct ll_merge_driver *drv_unused,
 	o.variant = XDL_MERGE_FAVOR_UNION;
 	return ll_xdl_merge(drv_unused, result, path_unused,
 			    orig, NULL, src1, NULL, src2, NULL,
-			    &o, marker_size);
+			    &o, marker_size, XDL_MERGE_MINIMAL);
+}
+
+static int ll_text_merge(const struct ll_merge_driver *drv,
+			 mmbuffer_t *result,
+			 const char *path,
+			 mmfile_t *orig, const char *orig_name,
+			 mmfile_t *src1, const char *name1,
+			 mmfile_t *src2, const char *name2,
+			 const struct ll_merge_options *opts,
+			 int marker_size)
+{
+	return ll_xdl_merge(drv, result, path,
+			    orig, orig_name,
+			    src1, name1,
+			    src2, name2,
+			    opts, marker_size, XDL_MERGE_ZEALOUS);
 }
 
 #define LL_BINARY_MERGE 0
@@ -137,7 +154,7 @@ static struct ll_merge_driver ll_merge_drv[] = {
 #define LL_UNION_MERGE 2
 static struct ll_merge_driver ll_merge_drv[] = {
 	{ "binary", "built-in binary merge", ll_binary_merge },
-	{ "text", "built-in 3-way text merge", ll_xdl_merge },
+	{ "text", "built-in 3-way text merge", ll_text_merge },
 	{ "union", "built-in union merge", ll_union_merge },
 };
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: auto merge bug
  2013-03-05  9:03 ` Jeff King
  2013-03-05  9:12   ` Jeff King
@ 2013-03-05 15:44   ` Junio C Hamano
  2013-03-05 17:59     ` Jeff King
  1 sibling, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2013-03-05 15:44 UTC (permalink / raw)
  To: Jeff King; +Cc: David Krmpotic, git

Jeff King <peff@peff.net> writes:

> I think the merge will produce the results you are looking for. This
> would have to be configurable, though, as it is a regression for
> existing users of "union", which would want the duplicate-line
> suppression (or maybe not; it will only catch such duplicates at the
> beginning and end of the conflict hunk, so maybe it is sane to always
> ask "union" to keep all lines).

The original use-case example of "union" was to merge two shopping
lists (e.g. I add "bread" and "orange juice" to remind me that we
need to buy these things, while my wife adds "bread" and "butter").

We do not necessarily want to end up with a shopping list to buy two
loaves of bread.  When the user verifies and fixes up the result, we
can keep the current behaviour and those who want to re-dup can add
one back, or we can change the behaviour to leave the duplicates and
those who do not want to see duplicates can remove them manually.

Given that the caveat you quoted already tells the user to verify
the result and not to use it without understanding its implications,
I think it technically is fine either way (read: keeping duplicates
is not a clearly superiour solution). So let's leave it as-is.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: auto merge bug
  2013-03-05 15:44   ` Junio C Hamano
@ 2013-03-05 17:59     ` Jeff King
  2013-03-05 18:47       ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff King @ 2013-03-05 17:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: David Krmpotic, git

On Tue, Mar 05, 2013 at 07:44:13AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I think the merge will produce the results you are looking for. This
> > would have to be configurable, though, as it is a regression for
> > existing users of "union", which would want the duplicate-line
> > suppression (or maybe not; it will only catch such duplicates at the
> > beginning and end of the conflict hunk, so maybe it is sane to always
> > ask "union" to keep all lines).
> 
> The original use-case example of "union" was to merge two shopping
> lists (e.g. I add "bread" and "orange juice" to remind me that we
> need to buy these things, while my wife adds "bread" and "butter").
> 
> We do not necessarily want to end up with a shopping list to buy two
> loaves of bread.  When the user verifies and fixes up the result, we
> can keep the current behaviour and those who want to re-dup can add
> one back, or we can change the behaviour to leave the duplicates and
> those who do not want to see duplicates can remove them manually.
> 
> Given that the caveat you quoted already tells the user to verify
> the result and not to use it without understanding its implications,
> I think it technically is fine either way (read: keeping duplicates
> is not a clearly superiour solution). So let's leave it as-is.

My problem with the current behavior is that it is not predictable
whether it will de-dup or not. If your shopping lists are:

  bread
  orange juice

  bread
  butter

it works; you get only one bread. If they are:

  milk
  bread
  orange juice

  beer
  bread
  butter

you get two. It depends on the exact behavior of the XDL_MERGE_ZEALOUS
flag. What I'd propose is two different drivers:

  1. Find conflicts via 3-way merge, and include both sides of the
     conflict verbatim. Do not use XDL_MERGE_ZEALOUS, as it is more
     important to retain items from both sides (in their original order)
     than it is to remove duplicates.

  2. A true line-based union, which should act like "cat $ours $theirs |
     sort | uniq". That is what you want for the shopping list example,
     I think (you could also preserve existing ordering with a lookup
     table, though I prefer clobbering the ordering; the ordering of
     resolved conflicts will be arbitrary anyway, so it makes it clear
     from the outset that you should not use this driver if your content
     is not really a set (in the mathematical sense) of lines).

     You could also have sets of other objects (e.g., blank-line
     delimited paragraphs, changelog entries, etc). But you would need
     some way to specify the parsing then[1].

I'm not sure which should be called "union". The first one would still
need careful examination of the result. The second one should always be
correct, but only because it is limited to a much more constrained
problem.

I'm also not sure how useful those really are in practice. I have not
used "union" myself ever. And in the example that started this thread, I
find the use of "union" slightly dubious. I do not even know how it
would react to a line _changing_, or other complicated edit. Short of a
specialized XML-aware merge driver, using XDL_MERGE_ZEALOUS and kicking
the result out to the user (i.e., what the default merge driver does)
seems like the only sane thing, even if it is more work at merge time.

-Peff

[1] Some of this is fairly easy to do with perl one-liners (e.g., "perl
   -00 -ne 'print unless $h{$_}++" for paragraph mode), so maybe it is
   just an education/documentation issue. I dunno. I have always been
   happy enough with the stock merge.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: auto merge bug
  2013-03-05 17:59     ` Jeff King
@ 2013-03-05 18:47       ` Junio C Hamano
  2013-03-05 20:56         ` Andreas Ericsson
       [not found]         ` <194F685F-9460-42C6-B5A5-59475F53D038@gmail.com>
  0 siblings, 2 replies; 9+ messages in thread
From: Junio C Hamano @ 2013-03-05 18:47 UTC (permalink / raw)
  To: Jeff King; +Cc: David Krmpotic, git

Jeff King <peff@peff.net> writes:

> I'm also not sure how useful those really are in practice. I have not
> used "union" myself ever. And in the example that started this thread, I
> find the use of "union" slightly dubious.

Yeah, I do not think anybody sane used "union" outside toy examples.
IIRC, it was originally done as a "if you want a GIGO, here it is,
go hang yourself." response to "I am too lazy to resolve conflicts
myself, Git should let me take both sides blindly."

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: auto merge bug
  2013-03-05 18:47       ` Junio C Hamano
@ 2013-03-05 20:56         ` Andreas Ericsson
       [not found]         ` <194F685F-9460-42C6-B5A5-59475F53D038@gmail.com>
  1 sibling, 0 replies; 9+ messages in thread
From: Andreas Ericsson @ 2013-03-05 20:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, David Krmpotic, git

On 03/05/2013 07:47 PM, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
>> I'm also not sure how useful those really are in practice. I have not
>> used "union" myself ever. And in the example that started this thread, I
>> find the use of "union" slightly dubious.
> 
> Yeah, I do not think anybody sane used "union" outside toy examples.

I do, for lists used in tests or to generate perfect hashes from. It's
really quite handy for things like that but totally useless for any
type of multiline format, or even .ini style files unless you're very,
very careful with how you write them.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: auto merge bug
       [not found]         ` <194F685F-9460-42C6-B5A5-59475F53D038@gmail.com>
@ 2013-03-05 22:13           ` David Krmpotic
  2013-03-06  9:15             ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: David Krmpotic @ 2013-03-05 22:13 UTC (permalink / raw)
  To: Junio C Hamano, git

Hi guys! Thank you for responses.. I haven't suspected that repos
created via GitHub windows app would have union set by default :( have
to ask them about it.. it seems wrong to me… Here are the defaults for
a windows repo created with GitHub for windows app:

logo (master)$ cat .gitattributes
# Auto detect text files and perform LF normalization
* text=auto

# Custom for Visual Studio
*.cs     diff=csharp
*.sln    merge=union
*.csproj merge=union
*.vbproj merge=union
*.fsproj merge=union
*.dbproj merge=union

# Standard to msysgit
*.doc	 diff=astextplain
*.DOC	 diff=astextplain
*.docx diff=astextplain
*.DOCX diff=astextplain
*.dot  diff=astextplain
*.DOT  diff=astextplain
*.pdf  diff=astextplain
*.PDF	 diff=astextplain
*.rtf	 diff=astextplain
*.RTF	 diff=astextplain

While investigating my problem I have read about the special union
merge mode, but didn't check if maybe my repo was in that mode..
really didn't expect it.

THANK YOU again… now I'll write to the github guys..


David

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: auto merge bug
  2013-03-05 22:13           ` David Krmpotic
@ 2013-03-06  9:15             ` Jeff King
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff King @ 2013-03-06  9:15 UTC (permalink / raw)
  To: David Krmpotic; +Cc: Junio C Hamano, git

On Tue, Mar 05, 2013 at 11:13:12PM +0100, David Krmpotic wrote:

> Hi guys! Thank you for responses.. I haven't suspected that repos
> created via GitHub windows app would have union set by default :( have
> to ask them about it.. it seems wrong to me… Here are the defaults for
> a windows repo created with GitHub for windows app:
> [...]
> # Custom for Visual Studio
> *.cs     diff=csharp
> *.sln    merge=union
> *.csproj merge=union
> *.vbproj merge=union
> *.fsproj merge=union
> *.dbproj merge=union

Yeah, I think defaulting to merge=union there is questionable. In an
ideal world, the GitHub for Windows folks would ship a specialized merge
helper for handling VS project files. It can be open-source and
distributed separately for people who don't use GitHub, but they can
integrate it seamlessly into the GitHub client. So everybody wins.

I see you've already written to GitHub support; thanks. I'll make sure
your issue gets routed to the right people, and I'll see if I can
convince them to write the specialized tool. :)

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-03-06  9:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-04 16:46 auto merge bug David Krmpotic
2013-03-05  9:03 ` Jeff King
2013-03-05  9:12   ` Jeff King
2013-03-05 15:44   ` Junio C Hamano
2013-03-05 17:59     ` Jeff King
2013-03-05 18:47       ` Junio C Hamano
2013-03-05 20:56         ` Andreas Ericsson
     [not found]         ` <194F685F-9460-42C6-B5A5-59475F53D038@gmail.com>
2013-03-05 22:13           ` David Krmpotic
2013-03-06  9:15             ` Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).