linux-fsdevel.vger.kernel.org archive mirror
From: Neil Brown <neilb-l3A5Bk7waGM@public.gmane.org>
To: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
Cc: Alan Cox <alan-qBU/x9rampVanCEyBjwyrvXRex20P6io@public.gmane.org>,
	"Patrick J. LoPresti"
	<lopresti-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Proposal: Use hi-res clock for file timestamps
Date: Wed, 18 Aug 2010 15:53:59 +1000
Message-ID: <20100818155359.66b9ddb6@notabene>
In-Reply-To: <20100817192937.GD26609-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>

On Tue, 17 Aug 2010 15:29:38 -0400
"J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote:

> On Tue, Aug 17, 2010 at 08:39:41PM +0100, Alan Cox wrote:
> > > The problem with "increment mtime by a nanosecond when necessary" is
> > > that timestamps can wind up out of order.  As in:
> > 
> > Surely that depends on your implementation ?
> > 
> > > 1) Do a bunch of operations on file A
> > > 2) Do one operation on file B
> > > 
> > > Imagine each operation on A incrementing its timestamp by a nanosecond
> > > "just because".  If all of these operations happen in less than 4 ms,
> > > you can wind up with the timestamp on B being EARLIER than the
> > > timestamp on A.  That is a big no-no (think "make" or anything else
> > > relying on timestamps for relative times).
> > 
> > 
> > [time resolution bits of data][value incremented value for that time]
> > 
> > 
> > 	if (time_now == time_last)
> > 		return { time_last , ++ct };
> > 	else {
> > 		ct = 0;
> > 		time_last = time_now;
> > 		return { time_last , 0 };
> > 	}
> > 
> > providing it is done with the same 'ct' across the fs and you can't do
> > enough ops/second to wrap the nanosecs - which should be fine for now,
> > your ordering is still safe is it not ?
> 
> Right, so if I understand correctly, you're proposing a time source
> that's global to the filesystem and that guarantees it will always
> return a unique value by incrementing the nanoseconds field if jiffies
> haven't changed since the last time it was called.
> 
> (Does it really need to be global across all filesystems?  Or is it
> unreasonable to expect your unbelievably-fast make's to behave well when
> sources and targets live on different filesystems?)
>

I'm not sure you even want to pay for a per-filesystem atomic access when
updating mtime.  mnt_want_write - called at the same time - seems to go to
some lengths to avoid an atomic operation.

I think that nfsd should be the only place that has to pay the atomic
penalty, as it is where the need is.

I imagine something like this:
 - Create a global struct timespec which is protected by a seqlock.
   Call it current_nfsd_time or similar.
 - file_update_time reads this and uses it if it is newer than
   current_fs_time.
 - nfsd updates it whenever it reads an mtime out of an inode that matches
   current_fs_time to the granularity of 1/HZ.
   If the current value is before current_kernel_time, it is set to
   current_kernel_time; otherwise tv_nsec is incremented, unless that
   would take it more than jiffies_to_usecs(1)*1000 nanoseconds past
   current_kernel_time.
 - The global 'struct timespec' is zeroed whenever system time is set
   backwards.

Then - providing the fs stores nanosecond timestamps - we should have stable,
globally ordered, precise (if not entirely accurate) time stamps, and a
penalty would only be paid when nfsd actually needs the information.
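
A rough user-space sketch of the steps above, read literally.  It is not
kernel code: coarse_now stands in for current_kernel_time(), nfsd_time for
the proposed global, and TICK_NSEC, stamp_for_update(), nfsd_observe_mtime()
and clock_set_backwards() are names made up for illustration.  The seqlock
is left out, so this is single-threaded only, and HZ=250 is an assumption.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L
#define TICK_NSEC    4000000L   /* assumed: 1/HZ in ns, for HZ=250 */

/* Stand-in for current_kernel_time(): only advances once per tick. */
static struct timespec coarse_now;
/* Stand-in for the proposed global; the kernel would guard it with a seqlock. */
static struct timespec nfsd_time;

/* a - b in nanoseconds. */
static int64_t ts_diff_ns(struct timespec a, struct timespec b)
{
	return (int64_t)(a.tv_sec - b.tv_sec) * NSEC_PER_SEC
		+ (a.tv_nsec - b.tv_nsec);
}

/* file_update_time side: use the nfsd value only if it is newer. */
static struct timespec stamp_for_update(void)
{
	return ts_diff_ns(nfsd_time, coarse_now) > 0 ? nfsd_time : coarse_now;
}

/* nfsd side: called with an mtime that nfsd has just read from an inode. */
static void nfsd_observe_mtime(struct timespec mtime)
{
	int64_t age = ts_diff_ns(mtime, coarse_now);
	int64_t ahead;

	if (age < 0 || age > TICK_NSEC)
		return;		/* mtime is not within the current tick */

	ahead = ts_diff_ns(nfsd_time, coarse_now);
	if (ahead < 0) {
		nfsd_time = coarse_now;		/* catch up to the clock */
	} else if (ahead + 1 <= TICK_NSEC) {
		/* Bump by one nanosecond, carrying into tv_sec if needed. */
		if (++nfsd_time.tv_nsec == NSEC_PER_SEC) {
			nfsd_time.tv_nsec = 0;
			nfsd_time.tv_sec++;
		}
	}
	/* Otherwise the bump would pass one tick ahead: leave it alone. */
}

/* System time set backwards: move the clock and zero the global. */
static void clock_set_backwards(struct timespec new_now)
{
	coarse_now = new_now;
	nfsd_time = (struct timespec){ 0 };
}
```

Only nfsd_observe_mtime() writes the shared value; stamp_for_update() is a
read and a compare, which is the point of putting the cost on nfsd.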


[[You could probably make ext3 work reasonably well by adding a mount option
  which:
    - advertises s_time_gran as 1
    - when storing: rounds timestamps up to the next second if tv_nsec != 0
    - when loading: sets the timestamp to the current time if the stored
      number matches current_kernel_time().tv_sec+1
  You would get occasional forward jumps in mtime, but usually when you
  aren't looking, and at least you would not get real changes that are not
  reflected in mtime.
]]
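
The store/load rounding for that ext3 option might look roughly like this
in user-space C.  encode_mtime() and decode_mtime() are hypothetical names,
not existing ext3 functions, and 'now' is passed in where the kernel would
call current_kernel_time().

```c
#include <time.h>

/* Storing: the on-disk field only holds whole seconds, so round up. */
static time_t encode_mtime(struct timespec ts)
{
	return ts.tv_nsec != 0 ? ts.tv_sec + 1 : ts.tv_sec;
}

/*
 * Loading: a stored value one second ahead of 'now' can only have come
 * from rounding up within the current second, so report the current time
 * rather than a timestamp in the future.
 */
static struct timespec decode_mtime(time_t stored, struct timespec now)
{
	if (stored == now.tv_sec + 1)
		return now;
	return (struct timespec){ .tv_sec = stored, .tv_nsec = 0 };
}
```

A file stamped at 10.000000005 is stored as 11; loading it while the clock
still reads second 10 returns the current time, and from second 11 onwards
it returns 11.0, which is the occasional forward jump mentioned above.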

NeilBrown
