All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Karsten Blees <karsten.blees@gmail.com>
Cc: Git List <git@vger.kernel.org>
Subject: Re: broken racy detection and performance issues with nanosecond file times
Date: Mon, 28 Sep 2015 11:17:19 -0700	[thread overview]
Message-ID: <xmqqtwqecssw.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <5605D88A.20104@gmail.com> (Karsten Blees's message of "Sat, 26 Sep 2015 01:28:10 +0200")

Karsten Blees <karsten.blees@gmail.com> writes:

> Ideas for potential solutions:
> ==============================
>
> Performance issues:
> -------------------
>
> 1. Compare file times in minimum supported precision
>    When comparing file times, use the minimum precision supported by
>    both the writing and reading git implementations.
> 1a. Simplest variant: Don't compare nanoseconds if the field in the
>    cached index entry is 0. JGit already does this [5], but at the
>    same time it is very unfriendly to USE_NSEC-enabled git by storing
>    only milliseconds in the nanosecond field. This "simple" solution
>    implies that git implementations that cannot provide full
>    nanosecond precision must leave the nanosecond field empty.
> 1b. More involved: Store the precision in the index entry.
>    We only need 30 bits to encode nanoseconds, so the high 2 bits of
>    the nanosecond field could be used as follows:
>    00: second precision (i.e. ignore, for backward compatibility)
>    01: millisecond precision
>    10: microsecond precision
>    11: nanosecond precision
>    When reading the index, USE-NSEC-enabled git implementations would
>    do dirty checks with the minimum precision supported by themselves
>    and the creator of the index entry.

Yeah, my gut feeling is that we should make sure that at least 1a is
done by all implementations.

I agree that 1b. is a bit more involved in that all binary that was
built with USE_NSEC that is not aware of these 2-bits need to be
eradicated for a new version to be deployed --- the transition for
users who use multiple implementations will be a pain (those that
use just one implementation of Git can just say "rm -f .git/index &&
git reset --hard" or something after updating to the new version of
Git).

> 2. Don't use ctime in dirty checks if ctime.sec == 0.

OK.  That is slightly less drastic than !trust_ctime, I guess.

> Racy detection:
> ---------------
>
> 3. Minimal racy solution
>    * Do all racy checks with second-precision only.
>    * When committing an index.lock file, reset mtime to the time
>      before git started reading the old index (i.e. time(null) when
>      calling read_cache()).
>
>    I believe this should fix all three racy problems described above,
>    although restraining ourselves to second-precision somewhat
>    thwarts the ability to track nanoseconds in the first place.
>    
>    The problem with this solution is that files changed by git itself
>    will appear racy to the next git process, thus increasing the
>    performance penalty after e.g. a large checkout. Although I think
>    that re-reading the file after the file's mtime is the only way to
>    be really sure it hasn't been changed.

... the last of which is what is done anyway, so I think the above,
especally the second bullet-point, is all sensible.

      parent reply	other threads:[~2015-09-28 18:17 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-25 23:28 broken racy detection and performance issues with nanosecond file times Karsten Blees
2015-09-28 10:39 ` [PATCH/RFC] read-cache: fix file time comparisons with different precisions Karsten Blees
2015-09-28 12:52   ` Johannes Schindelin
2015-09-29 10:23     ` Karsten Blees
2015-09-29 13:42       ` Johannes Schindelin
2015-09-28 17:38 ` broken racy detection and performance issues with nanosecond file times Junio C Hamano
2015-09-29 11:28   ` Karsten Blees
2015-09-28 18:17 ` Junio C Hamano [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqtwqecssw.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=karsten.blees@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.