All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Clemens Buchacher <drizzd@aon.at>
Cc: "Thomas Gummerer" <t.gummerer@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Torsten Bögershausen" <tboegi@web.de>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	"Lars Schneider" <larsxschneider@gmail.com>,
	"Mike Hommey" <mh@glandium.org>,
	git@vger.kernel.org
Subject: Re: [PATCH] travis-ci: run previously failed tests first, then slowest to fastest
Date: Wed, 27 Jan 2016 11:05:07 -0800	[thread overview]
Message-ID: <xmqqd1sm9730.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20160127151602.GA1690@ecki.hitronhub.home> (Clemens Buchacher's message of "Wed, 27 Jan 2016 16:16:02 +0100")

Clemens Buchacher <drizzd@aon.at> writes:

> Coming back to "[PATCH] optionally disable gitattributes": The topics
> are related, because they both deal with the situation where the work
> tree has files which are not normalized according to gitattributes. But
> my patch is more about saying: ok, I know I may have files which need to
> be normalized, but I want to ignore this issue for now. Please disable
> gitattributes for now, because I want to work with the files as they are
> committed. Conversely, the discussion here is about how to reliably
> detect and fix files which are not normalized.

I primarily wanted to make sure that you understood the underlying
issue, so that I do not have to go back to the basics in the other
thread.  And it is clear that you obviously do, which is good.

Here, you seem to think that what t0025 wants to see happen is
sensible, judging by the fact that you call "rm .git/index && git
reset" a "fix".

My take on this is quite different.  After a "reset --hard HEAD", we
should be able to trust the cached stat information and have "diff
HEAD" say "no changes".  That is what you essentially want in the
other thread, if I understand you correctly, and in an ideal world
where the filesystem timestamp has infinite precision, that is what
would happen in t0025, always "breaking" its expectation.  The real
world has much coarser timestamp granularity than ideal, and that is
why the test appear to be "flaky", failing to give "correct" outcome
some of the time--but I'd say that it is expecting a wrong thing.

An index entry that has data that does not round-trip when it goes
through convert_to_working_tree() and then convert_to_git() "breaks"
this arrangement, and I'd view it as the user having an inconsistent
data.  It is like you are in a repository that still has an unmerged
paths--you cannot proceed before you resolve them.

Anyway.

As to your patch in the other thread, here is what I think:

 (1) When you know (or perhaps your CI knows) that the working tree
     has never been modified since you did "reset --hard HEAD" (or
     its equivalent, like "git checkout $branch" from a clean
     state), these paths with inconsistent data would break the
     usual check to ask "is the working tree clean?"  That is a
     problem and we need a way to ensure that the working tree is
     always judged to be clean immediately after "reset --hard
     HEAD".  IOW, I agree with you that the issue you are trying to
     solve is worth solving.

 (2) Regardless of the "inconsistent data breaking the cleanliness
     check" issue, it may be handy to have a way to temporarily
     disable the attributes, i.e. allow us to ask "what happens if
     there is no attributes defined?"  IOW, I am saying that the
     change in the patch is not without merit.

In addition to (1), I further think that this sequence should not
report that the path F is modified:

     # Write F from HEAD to the working tree, after passing it
     # through convert_to_working_tree()
     $ git reset --hard HEAD

     # Force the re-reading, without changing the contents at all
     $ cp F F.new
     $ mv F.new F

     $ git diff HEAD

which is broken by paths with inconsistent data.  Your CI would want
a way to make that happen.

However, I do not think disabling attributes (i.e. (2)) is a
solution to the issue (i.e. (1)), which we just agreed to be an
issue that is worth solving, for at least two reasons.

 * Even without any attributes, core.autocrlf setting can get the
   data in your index (whose lines can be terminated with CRLF) into
   the same "inconsistent data" situation.  Disabling attribute
   handling would not have any effect on that codepath, I think.

 * The indexed data and the contents in the working tree file may
   match only because the clean/smudge transformation is done.  If
   you disable attributes, re-checking by passing the working tree
   contents through convert_to_git() and comparing the result with
   what is in the index would tell you that they are different, even
   if the clean/smudge filter pair implements round-trip operations
   correctly.

One way to solve (1) I can think of is to change the definition of
ce_compare_data(), which is called by the code that does not trust
the cached stat data (including but not limited to the Racy Git
codepath).  The current semantics of that function asks this
question:

    We do not know if the working tree file and the indexed data
    match.  Let's see if "git add" of that path would record the
    data that is identical to what is in the index.

This definition was cast in stone by 29e4d363 (Racy GIT, 2005-12-20)
and has been with us since Git v1.0.0.  But that does not have to be
the only sensible definition of this check.  I wonder what would
break if we ask this question instead:

    We do not know if the working tree file and the indexed data
    match.  Let's see if "git checkout" of that path would leave the
    same data as what currently is in the working tree file.

If we did this, "reset --hard HEAD" followed by "diff HEAD" will by
definition always report "is clean" as long as nobody changes files
in the working tree, even with the inconsistent data in the index.

This still requires that convert_to_working_tree(), i.e. your smudge
filter, is deterministic, though, but I think that is a sensible
assumption for sane people, even for those with inconsistent data in
the index.

  reply	other threads:[~2016-01-27 19:05 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-19  9:24 [PATCH] travis-ci: run previously failed tests first, then slowest to fastest larsxschneider
2016-01-19 19:12 ` Jeff King
2016-01-19 23:00   ` Junio C Hamano
2016-01-20  0:26     ` Mike Hommey
2016-01-20  1:46       ` Junio C Hamano
2016-01-20  1:56         ` Jeff King
2016-01-20  9:22         ` Lars Schneider
2016-01-22  2:33           ` brian m. carlson
2016-01-22  5:52             ` Jeff King
2016-01-22  6:07               ` Jeff King
2016-01-24 14:34                 ` Thomas Gummerer
2016-01-24 20:03                   ` Junio C Hamano
2016-01-24 22:05                     ` Junio C Hamano
2016-01-25 14:42                       ` Thomas Gummerer
2016-01-25 17:26                         ` Junio C Hamano
2016-01-25 21:52                           ` Junio C Hamano
2016-01-27 15:16                             ` Clemens Buchacher
2016-01-27 19:05                               ` Junio C Hamano [this message]
2016-01-27 20:49                                 ` Junio C Hamano
2016-01-28  7:10                                   ` Clemens Buchacher
2016-01-28 21:32                                     ` Junio C Hamano
2016-01-30  8:13                                       ` Clemens Buchacher
2016-02-01 18:17                                         ` Junio C Hamano
2016-02-01 19:33                                           ` Clemens Buchacher
2016-02-02 23:14                                             ` Junio C Hamano
2016-02-03  8:31                                               ` Junio C Hamano
2016-02-01 20:26                                           ` Torsten Bögershausen
2016-01-28  6:20                                 ` eol round trip Was: [PATCH] travis-ci: run previously failed Torsten Bögershausen
2016-01-25 22:41                           ` [PATCH] travis-ci: run previously failed tests first, then slowest to fastest Thomas Gummerer
2016-01-20  1:53       ` Jeff King
2016-01-20  9:10       ` Lars Schneider
2016-01-19 20:00 ` Junio C Hamano
2016-01-19 22:53   ` Junio C Hamano
2016-01-19 23:06     ` Jeff King
2016-01-19 23:26       ` Junio C Hamano
2016-01-19 23:29         ` Jeff King
2016-01-19 23:30         ` Junio C Hamano
2016-01-19 23:27       ` Jeff King
2016-01-20  7:55   ` Johannes Schindelin
2016-01-20  9:04   ` Lars Schneider
2016-01-20 20:35     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqd1sm9730.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=drizzd@aon.at \
    --cc=git@vger.kernel.org \
    --cc=larsxschneider@gmail.com \
    --cc=mh@glandium.org \
    --cc=peff@peff.net \
    --cc=sandals@crustytoothpaste.net \
    --cc=t.gummerer@gmail.com \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.