From: Dmitry Potapov <dpotapov@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Mislav Marohnic <mislav@github.com>
Subject: Re: [RFH] eol=lf on existing mixed line-ending files
Date: Sat, 9 Apr 2011 22:58:59 +0400
Message-ID: <BANLkTimBewshVRYBibXJ7nDNoX0S0iDaUQ@mail.gmail.com>
In-Reply-To: <20110407231556.GA10868@sigill.intra.peff.net>
On Fri, Apr 8, 2011 at 3:15 AM, Jeff King <peff@peff.net> wrote:
>
> git init repo &&
> cd repo &&
> {
> printf 'one\n' &&
> printf 'two\r\n'
> } >mixed &&
> git add mixed &&
> git commit -m one &&
> echo '* eol=lf' >.gitattributes
>
> Now if we run "git status" or "git diff", it will let us know that
> "mixed" is modified, insofar as adding and committing it would perform
> the LF conversion.
Well, git _may_ report that the file is modified, but usually when you
change .gitattributes, git does not notice changes to line endings
until you touch those files. You can force git to notice changes in
all files by doing:
$ touch -d 2000-1-1 .git/index
so it will re-read all files, but I guess it should do that
automatically; otherwise many people end up with inconsistent
line endings in their repository as a result of editing .gitattributes
(or of just pulling a new version from upstream).
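
A minimal sketch of the above, assuming a POSIX shell and a reasonably
recent git. The scratch directory and the commit identity are
placeholders, and "touch -t" is used here only as a portable stand-in
for the GNU-style "touch -d" shown above:

```shell
# Reproduce the quoted setup in a scratch repository.
repo=$(mktemp -d) && cd "$repo"
git init -q .
printf 'one\n' >mixed
printf 'two\r\n' >>mixed
git add mixed
git -c user.name=Example -c user.email=example@example.com commit -q -m one
echo '* eol=lf' >.gitattributes

# Make the index look older than every tracked file, so git compares
# file contents instead of trusting the cached stat data.
touch -t 200001010000 .git/index

status_out=$(git status --porcelain)
printf '%s\n' "$status_out"
```

With the index backdated, "mixed" should be listed as modified even
though its bytes on disk never changed.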
>
> Now we come to the first confusing behavior. Generally one would expect
> the working directory to be clean after a "git reset --hard". But not
> here:
>
> git reset --hard &&
> git status
>
> will still show "mixed" as modified.
It is because you discard all changes except those to .gitattributes.
If .gitattributes were tracked, "reset" would discard them too, and you
would get a clean original state.
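
As a rough illustration (a sketch, not part of the original report):
here "tracked" is taken to mean "added to the index", and the scratch
directory and commit identity are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
printf 'one\ntwo\r\n' >mixed
git add mixed
git -c user.name=Example -c user.email=example@example.com commit -q -m one

echo '* eol=lf' >.gitattributes
git add .gitattributes   # now tracked (staged), though not committed
git reset -q --hard      # discards the staged .gitattributes as well

status_after_reset=$(git status --porcelain)
printf 'status after reset: [%s]\n' "$status_after_reset"
```

Because the reset removes .gitattributes along with everything else, no
conversion rule is left to make "mixed" look modified.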
> So that kind of makes sense. But it isn't all that helpful, if I just
> want to reset my working tree to something sane without making a new
> commit (more on this later).
If we do not discard changes to .gitattributes, then the question is
what a sane state is. It is really difficult to define what is sane
when conversion to the work tree and back gives a different result.
> But here's an extra helping of confusion on top. Every once in a while,
> doing the reset _won't_ keep "mixed" as modified. I can trigger it
> reliably by inserting an extra sleep into git:
You can get the same effect by doing:
git reset --hard HEAD && sleep 1 && touch .git/index
Ironically, the race that you observed is a result of fixing another
race in git, where files are changed so fast that they may have the same
timestamp. To prevent that race, git compares the timestamp of .git/index
with that of each tracked file. If the .git/index timestamp is older than
or the same as the file's, the file is treated as possibly dirty, so it
is re-read from disk to check whether there are any changes. This works
well, but only if conversion to the work tree and back produces the same
result.
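
The timestamp rule can be modelled in a few lines of shell. This is a
simplified sketch of the comparison described above, not git's actual
code; the function name and the timestamps are hypothetical.

```shell
# is_possibly_dirty INDEX_MTIME FILE_MTIME
# A file whose timestamp is not strictly older than the index cannot be
# trusted from cached stat data alone, so its contents must be re-read.
is_possibly_dirty() {
    index_mtime=$1
    file_mtime=$2
    [ "$index_mtime" -le "$file_mtime" ]
}

# Hypothetical timestamps, in seconds since the epoch.
if is_possibly_dirty 1302375539 1302375539; then
    echo "same second: re-read file contents"
fi
if ! is_possibly_dirty 1302375540 1302375539; then
    echo "index is newer: trust cached stat data"
fi
```

The "sleep 1 && touch .git/index" above moves a file from the first case
to the second, which is why the result of "git status" changes.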
> So we get two different outcomes, depending on the index raciness. Which
> one is right, or is it right for it to be non-deterministic?
I like everything being deterministic, but in this case I do not see
how it is possible without making the normal case much slower.
> And one final question. Let's say I don't immediately convert this mixed
> file to the correct line-endings.
IMHO, adding a .gitattributes that specifies line endings while not
fixing the actual line endings of existing files is really a bad idea.
As with any other filter, the rule is that conversion from git to
the working tree and back should give the same result for any file
in the repository; otherwise you will have a lot of trouble later.
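
One way to look for files that break this rule is to compare the blob
stored in the index with the blob that re-adding the work tree file
would produce under the current attributes. A sketch, assuming file
names without whitespace and a work tree that matches a fresh checkout;
the file names, scratch directory and commit identity are placeholders:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
printf 'one\ntwo\r\n' >mixed
printf 'clean\n' >unix
git add mixed unix
git -c user.name=Example -c user.email=example@example.com commit -q -m one
echo '* eol=lf' >.gitattributes

# "git hash-object" applies the conversion selected by the attributes,
# so a mismatch means the file would change if it were added again.
unstable=""
for f in $(git ls-files); do
    stored=$(git rev-parse ":$f")
    readded=$(git hash-object -- "$f")
    [ "$stored" = "$readded" ] || unstable="$unstable $f"
done
echo "files that would change on re-add:$unstable"
```

Any file reported here should have its line endings fixed in the same
commit that introduces the attribute.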
> Hopefully my example made sense and was reproducible. The real repo
> which triggered this puzzle was jquery. You can try:
>
> git clone git://github.com/jquery/jquery.git &&
> cd jquery &&
> git checkout 1.4.2 &&
> git checkout master
>
> which will fail (but may succeed racily on a slow enough machine).
> Obviously they need to fix the mixed line-ending files in their repo.
> But that fix would be on HEAD, and "git checkout 1.4.2" will be forever
> broken. Is there a way to fix that?
You cannot change past history. Well, you can override that
setting using .git/info/attributes. It does not make sense to do
that in general, but it may be useful if you do git bisect.
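
A sketch of such a local override. The "* -text" rule is an assumption
chosen for illustration (it switches line-ending conversion off
entirely); the right rule depends on what the old commits specify.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
echo '* eol=lf' >.gitattributes
printf 'one\ntwo\r\n' >mixed

raw=$(git hash-object --no-filters mixed)   # bytes exactly as on disk
before=$(git hash-object mixed)             # with eol=lf conversion

# .git/info/attributes is local to this clone and takes precedence
# over every .gitattributes file in the work tree.
mkdir -p .git/info
echo '* -text' >>.git/info/attributes

after=$(git hash-object mixed)              # conversion now disabled
[ "$after" = "$raw" ] && echo "override active: no conversion"
```

Removing the line from .git/info/attributes afterwards restores the
behaviour defined by the tracked .gitattributes.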
BTW, nowadays we have a much better alternative to using
* crlf=input
Instead, you probably want to use:
* text=auto
which will automatically detect text files, so you won't have problems
with binary files. All text files are put into the repository with LF,
but users may have different endings in their working tree if they like.
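
A small sketch of what text=auto does on the way into the repository.
The file names are made up, and a NUL byte is used to make git classify
the second file as binary:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q .
echo '* text=auto' >.gitattributes
printf 'a\r\nb\r\n' >notes.txt
printf 'a\000b\r\n' >blob.dat

attr=$(git check-attr text -- notes.txt)
echo "$attr"

# Detected as text: CRLF is normalized to LF when hashed for storage.
text_id=$(git hash-object notes.txt)
lf_id=$(printf 'a\nb\n' | git hash-object --stdin)

# Detected as binary: stored byte for byte.
bin_id=$(git hash-object blob.dat)
bin_raw=$(git hash-object --no-filters blob.dat)
```

Here text_id should equal lf_id, and bin_id should equal bin_raw.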
Dmitry