From: "Torsten Bögershausen" <tboegi@web.de>
To: Junio C Hamano <gitster@pobox.com>, Clemens Buchacher <drizzd@aon.at>
Cc: "Thomas Gummerer" <t.gummerer@gmail.com>,
"Jeff King" <peff@peff.net>,
"Torsten Bögershausen" <tboegi@web.de>,
"brian m. carlson" <sandals@crustytoothpaste.net>,
"Lars Schneider" <larsxschneider@gmail.com>,
"Mike Hommey" <mh@glandium.org>,
git@vger.kernel.org
Subject: Re: eol round trip Was: [PATCH] travis-ci: run previously failed ....
Date: Thu, 28 Jan 2016 07:20:58 +0100 [thread overview]
Message-ID: <56A9B34A.1060205@web.de> (raw)
In-Reply-To: <xmqqd1sm9730.fsf@gitster.mtv.corp.google.com>
On 01/27/2016 08:05 PM, Junio C Hamano wrote:
(Changed the topic, 2 notes inside)
> Clemens Buchacher <drizzd@aon.at> writes:
>
>> Coming back to "[PATCH] optionally disable gitattributes": The topics
>> are related, because they both deal with the situation where the work
>> tree has files which are not normalized according to gitattributes. But
>> my patch is more about saying: ok, I know I may have files which need to
>> be normalized, but I want to ignore this issue for now. Please disable
>> gitattributes for now, because I want to work with the files as they are
>> committed. Conversely, the discussion here is about how to reliably
>> detect and fix files which are not normalized.
git ls-files --eol can detect that (as Junio pointed out)
> I primarily wanted to make sure that you understood the underlying
> issue, so that I do not have to go back to the basics in the other
> thread. And it is clear that you obviously do, which is good.
>
> Here, you seem to think that what t0025 wants to see happen is
> sensible, judging by the fact that you call "rm .git/index && git
> reset" a "fix".
>
> My take on this is quite different. After a "reset --hard HEAD", we
> should be able to trust the cached stat information and have "diff
> HEAD" say "no changes". That is what you essentially want in the
> other thread, if I understand you correctly, and in an ideal world
> where the filesystem timestamp has infinite precision, that is what
> would happen in t0025, always "breaking" its expectation. The real
> world has much coarser timestamp granularity than ideal, and that is
> why the test appear to be "flaky", failing to give "correct" outcome
> some of the time--but I'd say that it is expecting a wrong thing.
>
> An index entry that has data that does not round-trip when it goes
> through convert_to_working_tree() and then convert_to_git() "breaks"
> this arrangement, and I'd view it as the user having an inconsistent
> data. It is like you are in a repository that still has an unmerged
> paths--you cannot proceed before you resolve them.
This is actually bringing some light to me: the round-trip test.
There are this "well known but less well document" situations where we
break that rule:
- files are checked in with CRLF into the repo.
- .gittatributes is set to "text" later.
2 different ways to handle it:
- keep the eol at checkout, normalize at checkin -> roundtrip broken
- keep the eol at checkout and checkin -> roundtrip OK
- files with mixed line endings in the repo:
Same here: 2 different ways to handle it:
- keep the eol at checkout, normalize at checkin -> roundtrip broken
- keep the eol at checkout and checkin -> roundtrip OK
- files with CRCRLF line endings in the repo:
Same here: 2 different ways to handle it:
- keep the eol at checkout, normalize at checkin -> roundtrip broken
- keep the eol at checkout and checkin -> roundtrip OK
My feeling is that we should simply say:
You user set attribute to "text" and by doing that, you promised to have
files
with LF only in the index.
If you break that promise, Git does not know, what you really want.
- It may be a situation where you write a shell script which for some
reasons
needs a '\015' at the end of a line, and Git may treat it wrong by
assuming
that this is a CRLF line ending (end converts it into LF)
- It may be that you want CRLF because you added a Windows .BAT file.
It may be that you use git.git and another implementation of Git,
which doesn't
support attributes at all, so that a save way to do this is to just
commit CRLF.
- It may be that this is a historical issue.
Everybody using the project uses git that understands .gitattributes,
so someone may fix it some day.
Can Git make this decision ?
When core.autocrlf is true (and no attributes are set), then the
conversion of line ending is disabled.
On 01/27/2016 08:05 PM, Junio C Hamano wrote:
(Changed the topic, 2 notes inside)
> Clemens Buchacher <drizzd@aon.at> writes:
>
>> Coming back to "[PATCH] optionally disable gitattributes": The topics
>> are related, because they both deal with the situation where the work
>> tree has files which are not normalized according to gitattributes. But
>> my patch is more about saying: ok, I know I may have files which need to
>> be normalized, but I want to ignore this issue for now. Please disable
>> gitattributes for now, because I want to work with the files as they are
>> committed. Conversely, the discussion here is about how to reliably
>> detect and fix files which are not normalized.
git ls-files --eol can detect that (as Junio pointed out)
> I primarily wanted to make sure that you understood the underlying
> issue, so that I do not have to go back to the basics in the other
> thread. And it is clear that you obviously do, which is good.
>
> Here, you seem to think that what t0025 wants to see happen is
> sensible, judging by the fact that you call "rm .git/index && git
> reset" a "fix".
>
> My take on this is quite different. After a "reset --hard HEAD", we
> should be able to trust the cached stat information and have "diff
> HEAD" say "no changes". That is what you essentially want in the
> other thread, if I understand you correctly, and in an ideal world
> where the filesystem timestamp has infinite precision, that is what
> would happen in t0025, always "breaking" its expectation. The real
> world has much coarser timestamp granularity than ideal, and that is
> why the test appear to be "flaky", failing to give "correct" outcome
> some of the time--but I'd say that it is expecting a wrong thing.
>
> An index entry that has data that does not round-trip when it goes
> through convert_to_working_tree() and then convert_to_git() "breaks"
> this arrangement, and I'd view it as the user having an inconsistent
> data. It is like you are in a repository that still has an unmerged
> paths--you cannot proceed before you resolve them.
This is actually bringing some light to me: the round-trip test.
There are this "well known but less well document" situations where we
break that rule:
- files are checked in with CRLF into the repo.
- .gittatributes is set to "text" later.
2 different ways to handle it:
- keep the eol at checkout, normalize at checkin -> roundtrip broken
- keep the eol at checkout and checkin -> roundtrip OK
- files with mixed line endings in the repo:
Same here: 2 different ways to handle it:
- keep the eol at checkout, normalize at checkin -> roundtrip broken
- keep the eol at checkout and checkin -> roundtrip OK
- files with CRCRLF line endings in the repo:
Same here: 2 different ways to handle it:
- keep the eol at checkout, normalize at checkin -> roundtrip broken
- keep the eol at checkout and checkin -> roundtrip OK
My feeling is that we should simply say:
You user set attribute to "text" and by doing that, you promised to have
files
with LF only in the index.
If you break that promise, Git does not know, what you really want.
- It may be a situation where you write a shell script which for some
reasons
needs a '\015' at the end of a line, and Git may treat it wrong by
assuming
that this is a CRLF line ending (end converts it into LF)
- It may be that you want CRLF because you added a Windows .BAT file.
It may be that you use git.git and another implementation of Git,
which doesn't
support attributes at all, so that a save way to do this is to just
commit CRLF.
- It may be that this is a historical issue.
Everybody using the project uses git that understands .gitattributes,
so someone may fix it some day.
Can Git make this decision ?
When core.autocrlf is true (and no attributes are set), then the
conversion of line ending is disabled.
On 01/27/2016 08:05 PM, Junio C Hamano wrote:
(Changed the topic, 2 notes inside)
> Clemens Buchacher <drizzd@aon.at> writes:
>
>> Coming back to "[PATCH] optionally disable gitattributes": The topics
>> are related, because they both deal with the situation where the work
>> tree has files which are not normalized according to gitattributes. But
>> my patch is more about saying: ok, I know I may have files which need to
>> be normalized, but I want to ignore this issue for now. Please disable
>> gitattributes for now, because I want to work with the files as they are
>> committed. Conversely, the discussion here is about how to reliably
>> detect and fix files which are not normalized.
git ls-files --eol can detect that (as Junio pointed out)
> I primarily wanted to make sure that you understood the underlying
> issue, so that I do not have to go back to the basics in the other
> thread. And it is clear that you obviously do, which is good.
>
> Here, you seem to think that what t0025 wants to see happen is
> sensible, judging by the fact that you call "rm .git/index && git
> reset" a "fix".
>
> My take on this is quite different. After a "reset --hard HEAD", we
> should be able to trust the cached stat information and have "diff
> HEAD" say "no changes". That is what you essentially want in the
> other thread, if I understand you correctly, and in an ideal world
> where the filesystem timestamp has infinite precision, that is what
> would happen in t0025, always "breaking" its expectation. The real
> world has much coarser timestamp granularity than ideal, and that is
> why the test appear to be "flaky", failing to give "correct" outcome
> some of the time--but I'd say that it is expecting a wrong thing.
>
> An index entry that has data that does not round-trip when it goes
> through convert_to_working_tree() and then convert_to_git() "breaks"
> this arrangement, and I'd view it as the user having an inconsistent
> data. It is like you are in a repository that still has an unmerged
> paths--you cannot proceed before you resolve them.
This is actually bringing some light to me: the round-trip test.
There are this "well known but less well document" situations where we
break that rule:
- files are checked in with CRLF into the repo.
- .gittatributes is set to "text" later.
2 different ways to handle it:
- keep the eol at checkout, normalize at checkin -> roundtrip broken
- keep the eol at checkout and checkin -> roundtrip OK
- files with mixed line endings in the repo:
Same here: 2 different ways to handle it:
- keep the eol at checkout, normalize at checkin -> roundtrip broken
- keep the eol at checkout and checkin -> roundtrip OK
- files with CRCRLF line endings in the repo:
Same here: 2 different ways to handle it:
- keep the eol at checkout, normalize at checkin -> roundtrip broken
- keep the eol at checkout and checkin -> roundtrip OK
My feeling is that we should simply say:
You user set attribute to "text" and by doing that, you promised to have
files
with LF only in the index.
If you break that promise, Git does not know, what you really want.
- It may be a situation where you write a shell script which for some
reasons
needs a '\015' at the end of a line, and Git may treat it wrong by
assuming
that this is a CRLF line ending (end converts it into LF)
- It may be that you want CRLF because you added a Windows .BAT file.
It may be that you use git.git and another implementation of Git,
which doesn't
support attributes at all, so that a save way to do this is to just
commit CRLF.
- It may be that this is a historical issue.
Everybody using the project uses git that understands .gitattributes,
so someone may fix it some day.
Can Git make this decision ?
When core.autocrlf is true (and no attributes are set), then the
conversion of line endings is disabled.
See convert.v "This is the new safer autocrlf handling",
commit fd6cce9e
So the round trip is achieved when core.autocrlf=true,
but no longer when attributes are added.
[]
> Anyway.
>
> As to your patch in the other thread, here is what I think:
>
> (1) When you know (or perhaps your CI knows) that the working tree
> has never been modified since you did "reset --hard HEAD" (or
> its equivalent, like "git checkout $branch" from a clean
> state), these paths with inconsistent data would break the
> usual check to ask "is the working tree clean?" That is a
> problem and we need a way to ensure that the working tree is
> always judged to be clean immediately after "reset --hard
> HEAD". IOW, I agree with you that the issue you are trying to
> solve is worth solving.
>
> (2) Regardless of the "inconsistent data breaking the cleanliness
> check" issue, it may be handy to have a way to temporarily
> disable the attributes, i.e. allow us to ask "what happens if
> there is no attributes defined?" IOW, I am saying that the
> change in the patch is not without merit.
>
> In addition to (1), I further think that this sequence should not
> report that the path F is modified:
>
> # Write F from HEAD to the working tree, after passing it
> # through convert_to_working_tree()
> $ git reset --hard HEAD
>
> # Force the re-reading, without changing the contents at all
> $ cp F F.new
> $ mv F.new F
>
> $ git diff HEAD
>
> which is broken by paths with inconsistent data. Your CI would want
> a way to make that happen.
>
> However, I do not think disabling attributes (i.e. (2)) is a
> solution to the issue (i.e. (1)), which we just agreed to be an
> issue that is worth solving, for at least two reasons.
>
> * Even without any attributes, core.autocrlf setting can get the
> data in your index (whose lines can be terminated with CRLF) into
> the same "inconsistent data" situation. Disabling attribute
> handling would not have any effect on that codepath, I think.
>
I don't think so, see above.
next prev parent reply other threads:[~2016-01-28 6:22 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-19 9:24 [PATCH] travis-ci: run previously failed tests first, then slowest to fastest larsxschneider
2016-01-19 19:12 ` Jeff King
2016-01-19 23:00 ` Junio C Hamano
2016-01-20 0:26 ` Mike Hommey
2016-01-20 1:46 ` Junio C Hamano
2016-01-20 1:56 ` Jeff King
2016-01-20 9:22 ` Lars Schneider
2016-01-22 2:33 ` brian m. carlson
2016-01-22 5:52 ` Jeff King
2016-01-22 6:07 ` Jeff King
2016-01-24 14:34 ` Thomas Gummerer
2016-01-24 20:03 ` Junio C Hamano
2016-01-24 22:05 ` Junio C Hamano
2016-01-25 14:42 ` Thomas Gummerer
2016-01-25 17:26 ` Junio C Hamano
2016-01-25 21:52 ` Junio C Hamano
2016-01-27 15:16 ` Clemens Buchacher
2016-01-27 19:05 ` Junio C Hamano
2016-01-27 20:49 ` Junio C Hamano
2016-01-28 7:10 ` Clemens Buchacher
2016-01-28 21:32 ` Junio C Hamano
2016-01-30 8:13 ` Clemens Buchacher
2016-02-01 18:17 ` Junio C Hamano
2016-02-01 19:33 ` Clemens Buchacher
2016-02-02 23:14 ` Junio C Hamano
2016-02-03 8:31 ` Junio C Hamano
2016-02-01 20:26 ` Torsten Bögershausen
2016-01-28 6:20 ` Torsten Bögershausen [this message]
2016-01-25 22:41 ` Thomas Gummerer
2016-01-20 1:53 ` Jeff King
2016-01-20 9:10 ` Lars Schneider
2016-01-19 20:00 ` Junio C Hamano
2016-01-19 22:53 ` Junio C Hamano
2016-01-19 23:06 ` Jeff King
2016-01-19 23:26 ` Junio C Hamano
2016-01-19 23:29 ` Jeff King
2016-01-19 23:30 ` Junio C Hamano
2016-01-19 23:27 ` Jeff King
2016-01-20 7:55 ` Johannes Schindelin
2016-01-20 9:04 ` Lars Schneider
2016-01-20 20:35 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56A9B34A.1060205@web.de \
--to=tboegi@web.de \
--cc=drizzd@aon.at \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=larsxschneider@gmail.com \
--cc=mh@glandium.org \
--cc=peff@peff.net \
--cc=sandals@crustytoothpaste.net \
--cc=t.gummerer@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).