From: "Xin Zhao" <uszhaoxin@gmail.com>
To: "Trond Myklebust" <trond.myklebust@fys.uio.no>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
linux-fsdevel@vger.kernel.org
Subject: Re: Why must NFS access metadata in synchronous mode?
Date: Thu, 1 Jun 2006 12:27:06 -0400
Message-ID: <4ae3c140606010927t308e7d6ag5a9fc112c859aa45@mail.gmail.com>
In-Reply-To: <1149141341.13298.21.camel@lade.trondhjem.org>
Question 1: ...and how many NFS implementations have you seen based on
that paper?
I don't know. I have only read the NFS implementation distributed with
the Linux kernel. However, at least one paper mentions that the soft
updates mechanism suggested in that paper has been adopted by FreeBSD.
Question 2: NFS permissions are checked by the _server_, not the client.
That's true, but I was not saying that all metadata access must be
asynchronous. Even for permission checking, the speculative execution
mechanism proposed in Ed Nightingale's "speculative execution ...."
paper, published at SOSP 2005, can be used to avoid waiting. The basic
idea is that an NFS client speculatively assumes that the permission
check returns "OK" and sets a checkpoint, so that it can go ahead and
send further requests. If the actual result turns out to be "OK", the
client discards the checkpoint; otherwise, it rolls back to the
checkpoint. This lets the waiting time overlap with the time spent
sending subsequent requests.
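
To make the checkpoint/rollback idea concrete, here is a minimal
Python sketch. It is a toy model only, not code from the paper or from
any NFS client: the class name SpeculativeClient, its methods, and the
request strings are all hypothetical, and client state is reduced to a
list of issued requests.

```python
import copy


class SpeculativeClient:
    """Toy model (hypothetical names): speculate on a pending permission check."""

    def __init__(self):
        self.state = {"sent": []}  # requests issued so far
        self.checkpoint = None     # saved state while a check is outstanding

    def speculate(self, request):
        # Assume the permission check will return "OK": save the current
        # state as a checkpoint, then proceed as if the check had passed.
        self.checkpoint = copy.deepcopy(self.state)
        self.state["sent"].append(request)

    def send(self, request):
        # Further requests issued while the check is still outstanding.
        self.state["sent"].append(request)

    def resolve(self, permitted):
        # The real answer has arrived: keep the work on "OK",
        # otherwise restore the checkpointed state.
        if not permitted:
            self.state = self.checkpoint
        self.checkpoint = None


client = SpeculativeClient()
client.speculate("open /a")
client.send("read /a")
client.resolve(permitted=False)  # check failed, so both requests are undone
```

The sketch only restores client-side state. It does not address how
requests that have already reached the server would be undone.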
Question 3: Cache consistency requirements are _much_ more stringent
for asynchronous operation.
I agree, but I am not sure how a local file system like Ext3 handles
this problem. I don't think Ext3 has to write metadata synchronously (I
will double-check the ext3 code). If I remember correctly, when
metadata changes, Ext3 just changes it in memory and marks the page
dirty. The page is flushed to disk afterward. If the server exports an
Ext3 file system, it should be able to do the same thing: when a
client requests a metadata change, the server writes to the mmapped
metadata page and then returns to the client, instead of having to sync
the change to disk. With this mechanism, at least the client does not
have to wait for the disk flush. Does that make sense? To prevent
interleaved changes to the metadata before it is flushed to disk, the
server could even mark the metadata page read-only until it has been
flushed to disk.
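
The write-back scheme described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions, not
how Ext3 or the NFS server is implemented: ToyServer, MetadataPage,
set_attr, and the two-step begin_flush/end_flush are hypothetical
names, a dict stands in for the disk, and the read-only marking is
modeled as one possible reading of the idea (the page rejects changes
while a flush is in progress).

```python
class MetadataPage:
    """In-memory copy of some metadata (hypothetical toy structure)."""

    def __init__(self, data):
        self.data = dict(data)
        self.dirty = False      # True when memory differs from the "disk"
        self.read_only = False  # True while a flush is in progress


class ToyServer:
    def __init__(self):
        self.disk = {"mode": 0o644}          # stand-in for on-disk metadata
        self.page = MetadataPage(self.disk)

    def set_attr(self, key, value):
        # Handle a client request: change memory, mark the page dirty,
        # and reply immediately without waiting for the disk.
        if self.page.read_only:
            raise RuntimeError("page is being flushed; retry later")
        self.page.data[key] = value
        self.page.dirty = True
        return "OK"  # note: the reply is sent before the change is on disk

    def begin_flush(self):
        # Freeze the page so no change can interleave with the flush.
        self.page.read_only = True

    def end_flush(self):
        # Deferred write-back: copy the page to "disk", then unfreeze it.
        self.disk = dict(self.page.data)
        self.page.dirty = False
        self.page.read_only = False


server = ToyServer()
reply = server.set_attr("mode", 0o600)  # returns at once; disk still has 0o644
server.begin_flush()
server.end_flush()                      # now the disk holds 0o600
```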
Xin
On 6/1/06, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
> On Thu, 2006-06-01 at 00:04 -0400, Xin Zhao wrote:
> > Until kernel 2.6.16, I think NFS still access metadata synchronously,
> > which may impact performance significantly. Several years ago, paper
> > "metadata update performance in file systems" already suggested using
> > asynchronous mode in metadata access.
>
> ...and how many NFS implementations have you seen based on that paper?
>
> > I am curious why NFS does not adopt this suggestion? Can someone explain this?
>
> a) NFS permissions are checked by the _server_, not the client.
>
> b) Cache consistency requirements are _much_ more stringent for
> asynchronous operation. Think for instance about an asynchronous
> mkdir(): how should the client guarantee exclusive semantics (i.e. that
> mkdir either creates a new directory or returns an EEXIST error)? how
> should it guarantee that the server will have enough disk space to
> satisfy your request? how should it guarantee that nobody will change
> the permissions on the parent directory before the metadata was synced
> to disk?,...
>
> People are considering how to implement this sort of thing using the
> NFSv4 concept of delegations and applying them to directories. It is not
> yet obvious how all the details will be solved.
>
> Trond
>
>