* Bug-ish: CRLF endings and conflict markers
@ 2007-01-11 9:41 Andy Parkins
2007-01-11 9:46 ` Johannes Schindelin
2007-01-11 9:50 ` Shawn O. Pearce
0 siblings, 2 replies; 8+ messages in thread
From: Andy Parkins @ 2007-01-11 9:41 UTC (permalink / raw)
To: git
Hello,
I am sure the response to this will be "tough", however I'll mention it
anyway. It wasn't a problem for me, as I know git is LF endings only, but
others might be a bit confused.
* Track a file that has CR-LF endings. Not a problem, git works fine with
these and treats the CR as an extra character at the end of the line.
* Have two branches make conflicting changes to that file
* Merge, conflict
* Open the file
* Find that (almost) every line now displays with a "^M" at the end. vim has
found some lines that don't have CR-LF ends so dropped back to UNIX mode.
* Be careful that your conflict resolution doesn't introduce any new lines as
they will not have CR-LF endings.
The "(almost)" above refers of course to the conflict markers. These have LF
endings only so force the editor into UNIX mode. Assuming a binary safe
editor, things will be fine if the conflict resolution is simply to remove
the markers, and edit the existing lines. However, if the user adds any
lines during conflict resolution, those new lines will only have LF ends.
This is a problem for compilers that are expecting CR-LF input.
Note: this is not on Windows (spit), this is simply editing a CR-LF file on
UNIX.
The best solution is probably to use the line ending of the conflicted lines.
I've had a look, but I can only fine builtin-rerere.c that generates the
markers - would that be the place to make this change?
Andy
--
Dr Andy Parkins, M Eng (hons), MIEE
andyparkins@gmail.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug-ish: CRLF endings and conflict markers
2007-01-11 9:41 Bug-ish: CRLF endings and conflict markers Andy Parkins
@ 2007-01-11 9:46 ` Johannes Schindelin
2007-01-11 11:36 ` Andy Parkins
2007-01-11 9:50 ` Shawn O. Pearce
1 sibling, 1 reply; 8+ messages in thread
From: Johannes Schindelin @ 2007-01-11 9:46 UTC (permalink / raw)
To: Andy Parkins; +Cc: git
Hi,
On Thu, 11 Jan 2007, Andy Parkins wrote:
> [describes that conflict markers have LF-only line ending, even if file
> has CR-LF line endings]
>
> The best solution is probably to use the line ending of the conflicted
> lines.
Question is: how to find out. Especially if your file already has mixed
line endings...
> I've had a look, but I can only fine builtin-rerere.c that generates the
> markers - would that be the place to make this change?
No. It would be in xdiff/xmerge.c:{139,147,160}. But I think that xdiff
really is LF-only throughout, so you'd have to do much more work anyway.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: Bug-ish: CRLF endings and conflict markers
2007-01-11 9:46 ` Johannes Schindelin
@ 2007-01-11 11:36 ` Andy Parkins
0 siblings, 0 replies; 8+ messages in thread
From: Andy Parkins @ 2007-01-11 11:36 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin
On Thursday 2007 January 11 09:46, Johannes Schindelin wrote:
> > The best solution is probably to use the line ending of the conflicted
> > lines.
>
> Question is: how to find out. Especially if your file already has mixed
> line endings...
That's why I said use the line endings of the conflicted lines. Even if the
file is mixed, at least lines added are the same as the ones near them.
Let's say for example a conflict like this:
<<<<
Whole file has DOS endings^M
====
The whole file has DOS endings^M
>>>>
As both of the conflicted parts end in CRLF, the conflict markers would
default to having CRLF. This would cope with mixed-ending files as well, as
the ending is always determined locally.
If we were being really clever, we could make each marker use the ending of
the line following it; making it impossible to accuse git of doing anything
worse than already existed.
> No. It would be in xdiff/xmerge.c:{139,147,160}. But I think that xdiff
Thanks; I see why I couldn't find it: it's generated as a loop "marker_size"
long, rather than the literal that is in rerere. I've had a quick glance at
xdl_fill_merge_buffer() and can see it's not an easy thing to do (at least
for me). Maybe if it itches me a bit more I'll put some effort into my
scratching :-)
> really is LF-only throughout, so you'd have to do much more work anyway.
That doesn't seem to be a problem. git is performing very nicely on the CR-LF
file I'm tracking. As I mentioned, it's treating the CR as just another
character on the line - perfect: merges, diffs, diffstats all work just fine.
Andy
--
Dr Andy Parkins, M Eng (hons), MIEE
andyparkins@gmail.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug-ish: CRLF endings and conflict markers
2007-01-11 9:41 Bug-ish: CRLF endings and conflict markers Andy Parkins
2007-01-11 9:46 ` Johannes Schindelin
@ 2007-01-11 9:50 ` Shawn O. Pearce
2007-01-11 9:59 ` Johannes Schindelin
1 sibling, 1 reply; 8+ messages in thread
From: Shawn O. Pearce @ 2007-01-11 9:50 UTC (permalink / raw)
To: Andy Parkins; +Cc: git
Andy Parkins <andyparkins@gmail.com> wrote:
> The best solution is probably to use the line ending of the conflicted lines.
> I've had a look, but I can only fine builtin-rerere.c that generates the
> markers - would that be the place to make this change?
builtin-rerere may need to change but the code that's actually
creating the conflict markers isn't there. Its somewhere in
xdiff/xmerge.c. I say somewhere as I haven't dredged down into
that code myself, but I know that's where xdl_merge() is and I
know its xdl_merge() that actually created the content of the
conflict file during the merge.
That said I don't really care about this problem that much.
The problem that I care about is its far too easy to convert the
lineendings in a file (e.g. CRLF->LF, LF->CRLF). This causes the
entire file to differ, making merges very difficult. I really
should just fix it (in the one place where it matters to me) by
modifying the pre-commit hook to look for such a case and abort.
--
Shawn.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug-ish: CRLF endings and conflict markers
2007-01-11 9:50 ` Shawn O. Pearce
@ 2007-01-11 9:59 ` Johannes Schindelin
2007-01-11 10:16 ` Shawn O. Pearce
0 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin @ 2007-01-11 9:59 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Andy Parkins, git
Hi,
On Thu, 11 Jan 2007, Shawn O. Pearce wrote:
> That said I don't really care about this problem that much. The problem
> that I care about is its far too easy to convert the lineendings in a
> file (e.g. CRLF->LF, LF->CRLF). This causes the entire file to differ,
> making merges very difficult. I really should just fix it (in the one
> place where it matters to me) by modifying the pre-commit hook to look
> for such a case and abort.
Why not just introduce a config variable, and do the conversion in-flight?
Or, alternatively, do the merge ignoring white space? (Of course, this is
somewhat pointless when merging whitespace fixes...)
Ciao,
Dscho
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug-ish: CRLF endings and conflict markers
2007-01-11 9:59 ` Johannes Schindelin
@ 2007-01-11 10:16 ` Shawn O. Pearce
2007-01-11 10:26 ` Johannes Schindelin
0 siblings, 1 reply; 8+ messages in thread
From: Shawn O. Pearce @ 2007-01-11 10:16 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Andy Parkins, git
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> On Thu, 11 Jan 2007, Shawn O. Pearce wrote:
>
> > That said I don't really care about this problem that much. The problem
> > that I care about is its far too easy to convert the lineendings in a
> > file (e.g. CRLF->LF, LF->CRLF). This causes the entire file to differ,
> > making merges very difficult. I really should just fix it (in the one
> > place where it matters to me) by modifying the pre-commit hook to look
> > for such a case and abort.
>
> Why not just introduce a config variable, and do the conversion in-flight?
That's a lot of work and goes very much against the Git mindset that
we never alter content, just store it as-is. If Linus sees this
thread I'm suspecting he will jump in and point out that altering
content transparently like this just wrong. I think he's stated
something like that in the past. :-)
All I want is to make the user realize what they have done. "Hey
dummy, you just changed the entire file and the only difference I see
for most lines is simply the addition/removal of a CR. You shouldn't
do that!". The pre-commit hook is the perfect place for that.
It should be pretty easy to do. For every line in the diff stuff
it into a perl hash after removing any trailing CR. If both an
add and a delete for the exact same line of text appear in the diff
(with the only difference being the CR on the end) and the number
of such lines number is at least 50% of the modified lines in the
file, something's amiss. Smack the user and tell them to fix the
file.
> Or, alternatively, do the merge ignoring white space? (Of course, this is
> somewhat pointless when merging whitespace fixes...)
Lets not go down that road. That's just asking for trouble.
And it sounds like a lot of work from what you pointed out in
another message.
--
Shawn.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug-ish: CRLF endings and conflict markers
2007-01-11 10:16 ` Shawn O. Pearce
@ 2007-01-11 10:26 ` Johannes Schindelin
2007-01-11 10:41 ` Shawn O. Pearce
0 siblings, 1 reply; 8+ messages in thread
From: Johannes Schindelin @ 2007-01-11 10:26 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Andy Parkins, git
Hi,
On Thu, 11 Jan 2007, Shawn O. Pearce wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > On Thu, 11 Jan 2007, Shawn O. Pearce wrote:
> >
> > > That said I don't really care about this problem that much. The problem
> > > that I care about is its far too easy to convert the lineendings in a
> > > file (e.g. CRLF->LF, LF->CRLF). This causes the entire file to differ,
> > > making merges very difficult. I really should just fix it (in the one
> > > place where it matters to me) by modifying the pre-commit hook to look
> > > for such a case and abort.
> >
> > Why not just introduce a config variable, and do the conversion in-flight?
>
> That's a lot of work and goes very much against the Git mindset that
> we never alter content, just store it as-is.
While that is correct, we also _use_ the content, and very much alter it
all the time. A diff, for example, is nothing but altered content.
> All I want is to make the user realize what they have done. "Hey dummy,
> you just changed the entire file and the only difference I see for most
> lines is simply the addition/removal of a CR. You shouldn't do that!".
> The pre-commit hook is the perfect place for that.
This sounds sensible. I mistook your task to be integrator for people you
can't smack.
> > Or, alternatively, do the merge ignoring white space? (Of course, this
> > is somewhat pointless when merging whitespace fixes...)
>
> Lets not go down that road. That's just asking for trouble. And it
> sounds like a lot of work from what you pointed out in another message.
But to the contrary, ignoring white space in xdl_merge() is just a
question of passing XDF_IGNORE_WHITESPACE_CHANGE as part of the xpparam's.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Bug-ish: CRLF endings and conflict markers
2007-01-11 10:26 ` Johannes Schindelin
@ 2007-01-11 10:41 ` Shawn O. Pearce
0 siblings, 0 replies; 8+ messages in thread
From: Shawn O. Pearce @ 2007-01-11 10:41 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Andy Parkins, git
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> On Thu, 11 Jan 2007, Shawn O. Pearce wrote:
> > That's a lot of work and goes very much against the Git mindset that
> > we never alter content, just store it as-is.
>
> While that is correct, we also _use_ the content, and very much alter it
> all the time. A diff, for example, is nothing but altered content.
Yes, of course. But we don't take A and store A' in the ODB during
a commit. We always assert that if you give us A, you'll get back A.
You can always ask us to compute some sort of A' later on, but that
is always based on the originally stored A.
You can also ask us to combine an A and B and get some new thing C.
But again, we don't then go off and store C' instead of C.
> > All I want is to make the user realize what they have done. "Hey dummy,
> > you just changed the entire file and the only difference I see for most
> > lines is simply the addition/removal of a CR. You shouldn't do that!".
> > The pre-commit hook is the perfect place for that.
>
> This sounds sensible. I mistook your task to be integrator for people you
> can't smack.
To some extent I am. But I have patched the local Cygwin package
I use to distribute Git to everyone to have a 1 line pre-commit
update hook (and no other hooks) which just invokes a Perl script
in /usr/local/bin, and that Perl script is also in the same Cygwin
package.
So Cygwin's "feature" of making all Git hooks executable by default
in a new repository plays in my favor here. The pre-commit hook is
active in every repository by default, and it just execs a standard
script which can be modified at any time. So I can easily update
everyone's pre-commit hook at once.
Currently the hook script is just the stock one that comes with Git,
except it ignores CRLF line endings. I'll likely tweak it as I'm
suggesting, make the CRLF->LF/LF->CRLF check an optional feature
that can be toggled on/off in the top of the script, and submit the
patch to update the stock hook. Others may find the check useful.
--
Shawn.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-01-11 11:36 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-01-11 9:41 Bug-ish: CRLF endings and conflict markers Andy Parkins
2007-01-11 9:46 ` Johannes Schindelin
2007-01-11 11:36 ` Andy Parkins
2007-01-11 9:50 ` Shawn O. Pearce
2007-01-11 9:59 ` Johannes Schindelin
2007-01-11 10:16 ` Shawn O. Pearce
2007-01-11 10:26 ` Johannes Schindelin
2007-01-11 10:41 ` Shawn O. Pearce
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox