All of lore.kernel.org
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Neil Brown <neilb@suse.de>
Cc: Chuck Lever <chuck.lever@oracle.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	"Patrick J. LoPresti" <lopresti@gmail.com>,
	Andi Kleen <andi@firstfloor.org>,
	linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: Proposal: Use hi-res clock for file timestamps
Date: Thu, 19 Aug 2010 18:46:01 -0400	[thread overview]
Message-ID: <20100819224601.GC9275@fieldses.org> (raw)
In-Reply-To: <20100819124413.77ca8baf@notabene>

On Thu, Aug 19, 2010 at 12:44:13PM +1000, Neil Brown wrote:
> On Wed, 18 Aug 2010 22:08:03 -0400
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Thu, Aug 19, 2010 at 10:52:18AM +1000, Neil Brown wrote:
> > > On Thu, 19 Aug 2010 09:41:36 +1000
> > > Neil Brown <neilb@suse.de> wrote:
> > > 
> > > > So I agree that this is probably more of an issue for directories than for
> > > > files, and that implementing it just for directories would be a sensible
> > > > first step with lower expected overhead - just my reasoning seems to be a bit
> > > > different.
> > > 
> > > Just to be sure we are on the same page:
> > >   file_update_time would always refer to current_nfsd_time, but nfsd would
> > >   only update current_nfsd_time when a directory was examined (and the other
> > >   conditions were met).
> > > 
> > > 
> > > So my current thinking on how this would look - names have been changed:
> > > 
> > >  - global timespec 'current_fs_precise_time' is zeroed when
> > >    current_kernel_time moves backwards and is protected by a seqlock
> > > 
> > >  - current_fs_time would be
> > >          now = max(current_kernel_time(), current_fs_precise_time)
> > >          return timespec_trunc(now, sb->s_time_gran)
> > >    (with appropriate seqlock protection)
> > > 
> > >  - new function in fs/inode.c
> > >          get_precise_time(timestamp)
> > 
> > Odd name for something that returns nothing of interest;
> > bump_precise_time() might be closer?
> > 
> > And unique_time might be better than precise_time, since the property
> > we're asking for is that mtime on a changed file by new?  (Or
> > versioned_time?)
> 
> Agreed on both counts, tough I'm not keen on 'bump' myself.
>   got_unique_time()
> because that it what we just did...  I prefer the name to reflect why the
> function is called, rather than what the function is expected to do about it.
>   never_use_this_timestamp_again(timestamp)
> :-?

Maybe "retire" for a pithier version of never_use_again:

/**
 * retire_timestamp - prevent a timestamp from being reused as an mtime.
 * @timestamp
 *
 * Advance the clock used to generate mtimes to guarantee that the
 * given timestamp will not be reused on any future mtime update.
 * This allows the given timestamp to be passed back to users such as
 * nfs clients which need the guarantee that mtimes will always change
 * on file updates.
 *
 * Depending on the filesystem's s_time_gran this may not be an ironclad
 * guarantee.
 */

?

> 
> 
> > 
> > >                 cft = current_fs_time()
> > >                 if (timestamp == cft)
> > 		     /*
> > 		      * Make sure the next mtime stored will be
> > 		      * something different from timestamp:
> > 		      */
> > >                    write_seqlock()
> > >                    if cft == current_fs_precise_time
> > >                         current_fs_precise_time.tv_nsec++
> > >                    else if cft > current_fs_precise_time
> > 
> > What's the cft < current_fs_precise_time case?
> 
> The current_fs_precise_time has been incremented with a resolution higher
> than s_time_gran.  i.e. s_time_gran > 1.
> I'm not really sure what we want to do about that.
> Maybe we should be incrementing tv_nsec by s_time_gran as long as that is
> significantly less than jiffies_to_usec(1)*1000, but I don't know what I mean
> by 'significantly'.

How about just scratching "significantly" and saying "less"?  As long as
we know jiffies is the default time source for mtimes, that should be
safe, shouldn't it?

> The only values I can find for s_time_gran in current code are 1, 100, 1000
> and 1000000000.

I didn't even know there were any other than 1 and a billion.  OK!

> All those are either way bigger than a jiffie or significantly smaller, but
> suppose a filesystem came along that chose 1000000 (i.e. millisecond
> timestamps) - should we increment tv_nsec by 1000000, or not, or cross that
> bridge when we come to it?
> 
> For reference:
>   default is 1000000000  (this would cover ext2, ext3, reiserfs, fat, sysv, ...)
>   cifs, smbfs, ntfs are 100
>   udf, ceph are 1000
>   rest (btrfs, ext4, gfs2, jfs, nilfs, ocfs2, xfs and virtual filesystems) are 1

Interesting list, thanks!

--b.

WARNING: multiple messages have this Message-ID (diff)
From: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
To: Neil Brown <neilb-l3A5Bk7waGM@public.gmane.org>
Cc: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	Alan Cox <alan-qBU/x9rampVanCEyBjwyrvXRex20P6io@public.gmane.org>,
	"Patrick J. LoPresti"
	<lopresti-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org>,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Proposal: Use hi-res clock for file timestamps
Date: Thu, 19 Aug 2010 18:46:01 -0400	[thread overview]
Message-ID: <20100819224601.GC9275@fieldses.org> (raw)
In-Reply-To: <20100819124413.77ca8baf@notabene>

On Thu, Aug 19, 2010 at 12:44:13PM +1000, Neil Brown wrote:
> On Wed, 18 Aug 2010 22:08:03 -0400
> "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote:
> 
> > On Thu, Aug 19, 2010 at 10:52:18AM +1000, Neil Brown wrote:
> > > On Thu, 19 Aug 2010 09:41:36 +1000
> > > Neil Brown <neilb-l3A5Bk7waGM@public.gmane.org> wrote:
> > > 
> > > > So I agree that this is probably more of an issue for directories than for
> > > > files, and that implementing it just for directories would be a sensible
> > > > first step with lower expected overhead - just my reasoning seems to be a bit
> > > > different.
> > > 
> > > Just to be sure we are on the same page:
> > >   file_update_time would always refer to current_nfsd_time, but nfsd would
> > >   only update current_nfsd_time when a directory was examined (and the other
> > >   conditions were met).
> > > 
> > > 
> > > So my current thinking on how this would look - names have been changed:
> > > 
> > >  - global timespec 'current_fs_precise_time' is zeroed when
> > >    current_kernel_time moves backwards and is protected by a seqlock
> > > 
> > >  - current_fs_time would be
> > >          now = max(current_kernel_time(), current_fs_precise_time)
> > >          return timespec_trunc(now, sb->s_time_gran)
> > >    (with appropriate seqlock protection)
> > > 
> > >  - new function in fs/inode.c
> > >          get_precise_time(timestamp)
> > 
> > Odd name for something that returns nothing of interest;
> > bump_precise_time() might be closer?
> > 
> > And unique_time might be better than precise_time, since the property
> > we're asking for is that mtime on a changed file by new?  (Or
> > versioned_time?)
> 
> Agreed on both counts, tough I'm not keen on 'bump' myself.
>   got_unique_time()
> because that it what we just did...  I prefer the name to reflect why the
> function is called, rather than what the function is expected to do about it.
>   never_use_this_timestamp_again(timestamp)
> :-?

Maybe "retire" for a pithier version of never_use_again:

/**
 * retire_timestamp - prevent a timestamp from being reused as an mtime.
 * @timestamp
 *
 * Advance the clock used to generate mtimes to guarantee that the
 * given timestamp will not be reused on any future mtime update.
 * This allows the given timestamp to be passed back to users such as
 * nfs clients which need the guarantee that mtimes will always change
 * on file updates.
 *
 * Depending on the filesystem's s_time_gran this may not be an ironclad
 * guarantee.
 */

?

> 
> 
> > 
> > >                 cft = current_fs_time()
> > >                 if (timestamp == cft)
> > 		     /*
> > 		      * Make sure the next mtime stored will be
> > 		      * something different from timestamp:
> > 		      */
> > >                    write_seqlock()
> > >                    if cft == current_fs_precise_time
> > >                         current_fs_precise_time.tv_nsec++
> > >                    else if cft > current_fs_precise_time
> > 
> > What's the cft < current_fs_precise_time case?
> 
> The current_fs_precise_time has been incremented with a resolution higher
> than s_time_gran.  i.e. s_time_gran > 1.
> I'm not really sure what we want to do about that.
> Maybe we should be incrementing tv_nsec by s_time_gran as long as that is
> significantly less than jiffies_to_usec(1)*1000, but I don't know what I mean
> by 'significantly'.

How about just scratching "significantly" and saying "less"?  As long as
we know jiffies is the default time source for mtimes, that should be
safe, shouldn't it?

> The only values I can find for s_time_gran in current code are 1, 100, 1000
> and 1000000000.

I didn't even know there were any other than 1 and a billion.  OK!

> All those are either way bigger than a jiffie or significantly smaller, but
> suppose a filesystem came along that chose 1000000 (i.e. millisecond
> timestamps) - should we increment tv_nsec by 1000000, or not, or cross that
> bridge when we come to it?
> 
> For reference:
>   default is 1000000000  (this would cover ext2, ext3, reiserfs, fat, sysv, ...)
>   cifs, smbfs, ntfs are 100
>   udf, ceph are 1000
>   rest (btrfs, ext4, gfs2, jfs, nilfs, ocfs2, xfs and virtual filesystems) are 1

Interesting list, thanks!

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2010-08-19 22:48 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-13 18:25 Proposal: Use hi-res clock for file timestamps Patrick J. LoPresti
2010-08-13 18:45 ` john stultz
2010-08-13 18:45   ` john stultz
2010-08-13 18:57   ` Patrick J. LoPresti
2010-08-13 19:09     ` john stultz
2010-08-13 19:09       ` john stultz
2010-08-13 20:53       ` Patrick J. LoPresti
2010-08-13 20:53         ` Patrick J. LoPresti
2010-08-14 16:45         ` Patrick J. LoPresti
2010-08-15  1:50         ` Bret Towe
2010-08-13 19:57     ` Jim Rees
2010-08-13 20:26       ` john stultz
2010-08-13 20:52         ` Jim Rees
2010-08-17 14:54 ` Andi Kleen
2010-08-17 14:54   ` Andi Kleen
2010-08-17 17:41   ` J. Bruce Fields
2010-08-17 17:41     ` J. Bruce Fields
2010-08-17 18:29     ` Andi Kleen
2010-08-17 18:29       ` Andi Kleen
2010-08-17 18:50       ` Patrick J. LoPresti
2010-08-17 18:50         ` Patrick J. LoPresti
2010-08-17 19:04       ` J. Bruce Fields
2010-08-17 19:18         ` Patrick J. LoPresti
2010-08-17 19:18           ` Patrick J. LoPresti
2010-08-17 19:39           ` Alan Cox
2010-08-17 19:39             ` Alan Cox
2010-08-17 19:29             ` J. Bruce Fields
2010-08-17 19:29               ` J. Bruce Fields
2010-08-17 19:52               ` Alan Cox
2010-08-17 19:52                 ` Alan Cox
2010-08-18  5:53               ` Neil Brown
2010-08-18  5:53                 ` Neil Brown
2010-08-18 14:46                 ` Patrick J. LoPresti
2010-08-18 14:46                   ` Patrick J. LoPresti
2010-08-18 17:32                 ` J. Bruce Fields
2010-08-18 17:32                   ` J. Bruce Fields
2010-08-18 18:15                   ` Chuck Lever
2010-08-18 18:15                     ` Chuck Lever
2010-08-18 23:41                     ` Neil Brown
2010-08-18 23:41                       ` Neil Brown
2010-08-19  0:52                       ` Neil Brown
2010-08-19  0:52                         ` Neil Brown
2010-08-19  2:08                         ` J. Bruce Fields
2010-08-19  2:08                           ` J. Bruce Fields
2010-08-19  2:44                           ` Neil Brown
2010-08-19  2:44                             ` Neil Brown
2010-08-19 22:46                             ` J. Bruce Fields [this message]
2010-08-19 22:46                               ` J. Bruce Fields
2010-08-18 23:47                   ` Neil Brown
2010-08-18 23:47                     ` Neil Brown
2010-08-18 17:50                 ` Andi Kleen
2010-08-18 17:50                   ` Andi Kleen
2010-08-18 18:54                   ` J. Bruce Fields
2010-08-18 19:25                     ` Andi Kleen
2010-08-18 19:30                       ` J. Bruce Fields
2010-08-17 19:34             ` Patrick J. LoPresti
2010-08-17 19:34               ` Patrick J. LoPresti
2010-08-17 19:54               ` Alan Cox
2010-08-17 19:43                 ` Patrick J. LoPresti
2010-08-17 19:43                   ` Patrick J. LoPresti
2010-08-17 19:45                   ` J. Bruce Fields
2010-08-17 19:45                     ` J. Bruce Fields
2010-08-18 18:12               ` J. Bruce Fields
2010-08-18 18:12                 ` J. Bruce Fields
2010-08-19  1:41                 ` john stultz
2010-08-19  1:41                   ` john stultz
2010-08-19  2:31                   ` J. Bruce Fields
2010-08-19  2:31                     ` J. Bruce Fields
2010-08-19  3:17                     ` john stultz
2010-08-19  3:17                       ` john stultz
2010-08-19 22:53                       ` J. Bruce Fields
2010-08-18 18:20       ` David Woodhouse
2010-08-18 18:20         ` David Woodhouse
2010-08-18 18:32         ` Patrick J. LoPresti
2010-08-18 18:32           ` Patrick J. LoPresti
2010-08-18 18:53         ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100819224601.GC9275@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=andi@firstfloor.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=lopresti@gmail.com \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.