From: "J. Bruce Fields" <bfields@fieldses.org>
To: Andi Kleen <andi@firstfloor.org>
Cc: "Patrick J. LoPresti" <lopresti@gmail.com>,
linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: Proposal: Use hi-res clock for file timestamps
Date: Tue, 17 Aug 2010 13:41:34 -0400
Message-ID: <20100817174134.GA23176@fieldses.org>
In-Reply-To: <87aaolwar8.fsf@basil.nowhere.org>

On Tue, Aug 17, 2010 at 04:54:03PM +0200, Andi Kleen wrote:
> "Patrick J. LoPresti" <lopresti@gmail.com> writes:
>
> >
> > 1) Anybody who cares about file system performance is already using
> > "noatime" or "relatime", which mitigates the hit greatly.
>
> Consider mtime.
>
> > If the above patch is too slow for some architectures, how about
> > making it a configuration option? Call it "CONFIG_1980S_FILE_TICK",
> > have it default to YES on the architectures that care and NO on
> > anything remotely modern and sane.
> >
> > OK that's my proposal. Bash away.
>
> I suspect it will be a performance disaster on x86 for VFS intensive
> applications on capable file systems. VFS is very performance
> critical. These checks lurk in unexpected places too, e.g. on /dev
> access.
>
> Even TSC is much slower than just reading the variable.
>
> Also you should check if the file system granularity
> even supports it, it's completely wasted on an ext3 for example.

Agreed, ext3's probably a lost cause here.

> Maybe as an optional sysctl, default to off.

OK, so that leaves us with the race, even on newer filesystems:

    1. File is modified, mtime updated
    2. Client fetches mtime to revalidate cache
    3. File is modified again, mtime updated
    4. Client fetches new mtime to revalidate cache

If step 3 doesn't change the mtime, then step 4 (no matter how much
later it is performed) will return the wrong result, and client
applications will see stale data.
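
The four steps above can be replayed in a small simulation. This is only an illustrative sketch: the `Server` and `Client` classes, the 4 ms tick (standing in for a coarse, tick-granularity clock) and the event times are all assumptions made up for the example, not kernel or NFS code.

```python
# Toy model of the mtime race; every name and number here is illustrative.
TICK_NS = 4_000_000  # assumed timestamp granularity: one 4 ms tick


class Server:
    def __init__(self):
        self.data = b""
        self.mtime = 0

    def write(self, data, now_ns):
        self.data = data
        # mtime comes from a coarse clock that only advances once per tick
        self.mtime = (now_ns // TICK_NS) * TICK_NS


class Client:
    def __init__(self):
        self.cached_data = None
        self.cached_mtime = None

    def read(self, server):
        # Revalidate: refetch the data only if mtime differs from the cached one
        if server.mtime != self.cached_mtime:
            self.cached_data = server.data
            self.cached_mtime = server.mtime
        return self.cached_data


server, client = Server(), Client()
server.write(b"v1", now_ns=8_000_100)   # step 1
first = client.read(server)             # step 2
server.write(b"v2", now_ns=8_000_900)   # step 3: same tick, mtime unchanged
second = client.read(server)            # step 4: cache wrongly treated as valid
print(first, second, server.data)
```

In this model the second read still returns the old contents although the server already holds the new ones, and no later revalidation fixes that until some further write lands in a different tick.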

If we want to avoid that race, every modification of file data must
result in the mtime being updated to something different from the last
mtime seen by the client.

(A slight window between data modification and mtime update may be OK,
as long as the update happens eventually, and before the change is
committed to disk--close-to-open semantics mean that NFS clients can
live with not seeing changes until data is written to disk.)

Possible responses:

    - Tell everyone to use NFSv4 (and make sure we have
      changeattr/i_version working correctly).
    - Use a finer-grained time source. (I believe you when you say
      the TSC is too slow, but maybe we should run some tests to
      make sure.)
    - Increment mtime by a nanosecond when necessary.
    - ?
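
As a rough illustration of the third option, the sketch below makes every update produce an mtime strictly greater than the previous one by bumping it one nanosecond whenever the coarse clock has not advanced. The function name `next_mtime` and the 4 ms tick are assumptions for the example. It bumps on every same-tick modification rather than only "when necessary", and it ignores whether the filesystem can actually store nanosecond timestamps.

```python
TICK_NS = 4_000_000  # assumed coarse clock granularity (illustrative)


def next_mtime(prev_mtime_ns, now_ns):
    """Return a new mtime that always differs from prev_mtime_ns."""
    coarse_now = (now_ns // TICK_NS) * TICK_NS
    if coarse_now > prev_mtime_ns:
        return coarse_now          # clock moved on: use it as-is
    return prev_mtime_ns + 1       # same tick: bump by one nanosecond


m1 = next_mtime(0, 8_000_100)
m2 = next_mtime(m1, 8_000_900)     # same tick as m1
m3 = next_mtime(m2, 12_000_050)    # next tick
print(m1, m2, m3)
```

With these inputs the two writes that fall in the same tick get distinct timestamps, so a client that cached the first one would notice the second.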

--b.