From: David Turner <dturner@twopensource.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git mailing list <git@vger.kernel.org>
Subject: Re: File owner/group and git
Date: Thu, 05 Nov 2015 12:57:00 -0500 [thread overview]
Message-ID: <1446746220.4131.57.camel@twopensource.com> (raw)
In-Reply-To: <xmqqoaf9891r.fsf@gitster.mtv.corp.google.com>
On Wed, 2015-11-04 at 18:38 -0800, Junio C Hamano wrote:
> David Turner <dturner@twopensource.com> writes:
>
> > In unpack-trees.c, in verify_uptodate_1, we check ie_match_stat. This
> > returns OWNER_CHANGED if a file has changed ownership since the index
> > was updated. Do we actually care about that particular case? Or really
> > anything other than DATA_CHANGED?
>
> That's a 10-year old code and there aren't that many people left
> who can answer the original rationale, I am afraid ;-)
>
> In general, "Do we actually care?" is not the question we ask in
> this area of the code. "Does it help us to catch real changes, or
> does it change spuriously to make it too unreliable a signal to be
> useful?" is the question that drives the design of this part of the
> system.
>
> DATA_CHANGED is "we know the contents are different without even
> looking at the data". If the size is different from the last time
> we hashed the data, the contents must have changed. The inverse is
> not true (and that is half of the "racy git" issue).
>
> Other *_CHANGED are finely classified only because originally we
> didn't really know which are useful to treat as notable change
> event, and "changed" variable had sufficient number of bits to hold
> different classification, so that we could pick and choose which
> ones we truly care. We knew MTIME was useful in the sense that even
> if the size is the same, updated mtime is good enough indication
> that the stuff has changed, even to "make" utility.
>
> INODE and CTIME are not so stable on some filesystems (e.g. inum may
> not be stable on a network share across remount) and in some
> environments (e.g. some virus scanners touch ctime to mark scanned
> files, cf. 1ce4790b), and would trigger false positives too often to
> be useful. We always paid attention to them initially, but there
> are configurations to tell Git not raise them these days.
>
> OWNER probably falls into a category that is stable enough to be
> useful, as the most likely way for it to change is not by running
> "chown" on the file in-place (which does not change the contents),
> but by running "mv" to drop another file owned by somebody else to
> the original location (which likely does change the contents). At
> the same time, "mv" a different file into the path would likely
> trigger changes to INODE and MTIME as well, so it cannot be more
> than belt-and-suspenders measure to catch modification. In that
> sense ignoring OWNER would not hurt too much.
>
> If it changes spuriously to make it too unreliable a signal to be
> useful, it certainly is OK to introduce a knob to ignore it. It
> might even make sense to ignore it unconditionally if the false hit
> happens too frequently, but offhand my gut reaction is that there
> may be something wrong in the environment (i.e. system outside Git
> in which Git runs) if owner/group changes spuriously to cause
> issues.
Thanks.
The only case where we saw it was with our watchman code, which lies
about ownership (to save space/time). We're going to try ignoring
OWNER_CHANGED in our watchman branch, and if that fixes the issue for
our users, we'll stop worrying about it on the theory that Duy's
watchman stuff is the long-term path forward, and it doesn't have this
issue.
prev parent reply other threads:[~2015-11-05 17:57 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-05 2:03 File owner/group and git David Turner
2015-11-05 2:38 ` Junio C Hamano
2015-11-05 17:57 ` David Turner [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1446746220.4131.57.camel@twopensource.com \
--to=dturner@twopensource.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).