From: Michael J Gruber <git@drmicha.warpmail.net>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Mislav Marohnic <mislav@github.com>
Subject: Re: [RFH] eol=lf on existing mixed line-ending files
Date: Fri, 08 Apr 2011 11:36:20 +0200 [thread overview]
Message-ID: <4D9ED714.80307@drmicha.warpmail.net> (raw)
In-Reply-To: <20110407231556.GA10868@sigill.intra.peff.net>
Jeff King venit, vidit, dixit 08.04.2011 01:15:
> I investigated some odd git behavior with the EOL gitattributes today,
> and I'm curious to hear what others on the list think of what git does.
> In particular, index raciness means git produces non-deterministic
> results in this case.
>
> The repo in question has a gitattributes file with "* crlf=input" (which
> we would spell "eol=lf" these days, but the results are the same), but
> still contains some files with mixed line endings. Which you can
> reproduce with:
>
> git init repo &&
> cd repo &&
> {
> printf 'one\n' &&
> printf 'two\r\n'
> } >mixed &&
> git add mixed &&
> git commit -m one &&
> echo '* eol=lf' >.gitattributes
>
> Now if we run "git status" or "git diff", it will let us know that
> "mixed" is modified, insofar as adding and committing it would perform
> the LF conversion.
>
> Now we come to the first confusing behavior. Generally one would expect
> the working directory to be clean after a "git reset --hard". But not
> here:
>
> git reset --hard &&
> git status
>
> will still show "mixed" as modified. Because of course we are checking
> out the version from HEAD into the index and working tree, which has the
> mixed line endings. So we rewrite the identical file.
>
> So that kind of makes sense. But it isn't all that helpful, if I just
> want to reset my working tree to something sane without making a new
> commit (more on this later).
>
> But here's an extra helping of confusion on top. Every once in a while,
> doing the reset _won't_ keep "mixed" as modified. I can trigger it
> reliably by inserting an extra sleep into git:
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 500ebcf..735b13e 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -223,6 +223,7 @@ static int check_updates(struct unpack_trees_options *o)
> }
> }
> stop_progress(&progress);
> + sleep(1);
> if (o->update)
> git_attr_set_direction(GIT_ATTR_CHECKIN, NULL);
> return errs != 0;
>
> That puts a delay between when reset writes the "mixed" file, and when
> we write out the refreshed index. So next time we look at the index
> (e.g., in "status"), we will see that the "mixed" entry has up-to-date
> stat information and not look at its actual contents.
>
> But in the original case (without the sleep), that doesn't happen.
> There, we usually end up writing the file and the index in the same
> second. So when status looks at the index, the "mixed" entry is racily
> clean, and we actually check it again.
>
> So we get two different outcomes, depending on the index raciness. Which
> one is right, or is it right for it to be non-deterministic?
>
> And one final question. Let's say I don't immediately convert this mixed
> file to the correct line-endings. Instead, it persists over a large
> number of commits, some of them even changing the "mixed" file but not
> fixing the line endings[1]. We can simulate that with:
>
> mv .gitattributes tmp
> echo three >>mixed &&
> git commit -a -m three &&
> mv tmp .gitattributes
>
> Now imagine I am somebody who has cloned this repo; the clone will tend
> to end the race condition in the "clean" state, since it will often take
> more than 1 second to write out all of the files (at least for a
> normal-sized project). We can simulate using our sleep-patched reset:
>
> git reset --hard
>
> to get a "clean" repo. Now let's say I want to explore old history, so I
> go to a detached HEAD, but using normal git, not the sleep-patched one:
>
> git checkout HEAD^
>
> And, of course, now we think "mixed" is modified. After I'm done
> exploring, I want to go back to "master", but I can't:
>
> $ git checkout master
> error: Your local changes to the following files would be overwritten by checkout:
> mixed
>
> What is the best way out of this situation? You can't use "reset --hard"
> to fix the working tree. I guess "git checkout -f" is the best option.
>
> Hopefully my example made sense and was reproducible. The real repo
> which triggered this puzzle was jquery. You can try:
>
> git clone git://github.com/jquery/jquery.git &&
> cd jquery &&
> git checkout 1.4.2 &&
> git checkout master
>
> which will fail (but may succeed racily on a slow enough machine).
> Obviously they need to fix the mixed line-ending files in their repo.
> But that fix would be on HEAD, and "git checkout 1.4.2" will be forever
> broken. Is there a way to fix that?
>
> -Peff
>
> [1] The one thing still puzzling me about the jquery repo is how they
> managed to make so many commits (including ones to mixed line ending
> files) without seeing the dirty working tree state and committing it. Is
> there some combination of config that makes this not happen?
When did they introduce the .gitattributes file?
Also, maybe they're jgit users.
Michael
next prev parent reply other threads:[~2011-04-08 9:36 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-07 23:15 [RFH] eol=lf on existing mixed line-ending files Jeff King
2011-04-08 9:36 ` Michael J Gruber [this message]
2011-04-08 16:06 ` Jeff King
2011-04-09 18:58 ` Dmitry Potapov
2011-04-09 19:32 ` Jeff King
2011-04-09 20:09 ` Dmitry Potapov
2011-04-12 13:57 ` Jay Soffian
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4D9ED714.80307@drmicha.warpmail.net \
--to=git@drmicha.warpmail.net \
--cc=git@vger.kernel.org \
--cc=mislav@github.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.