From: "NeilBrown" <neilb@suse.de>
To: "Peter Staubach" <staubach@redhat.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
"Christoph Hellwig" <hch@lst.de>,
jean-noel.cordenner@bull.net, linux-fsdevel@vger.kernel.org
Subject: Re: i_version changes
Date: Thu, 14 Feb 2008 09:06:44 +1100 (EST)
Message-ID: <43087.192.168.1.70.1202940404.squirrel@neil.brown.name>
In-Reply-To: <47B361D8.1070708@redhat.com>
On Thu, February 14, 2008 8:32 am, Peter Staubach wrote:
>
> I don't think that this is quite true. If the file is changed
> when the NFS server is not running, then the value of i_version
> which is used when the NFS server starts up again must be
> different than the value which was previously used when the NFS
> server was previously running.
As I said, the "NFS has seen this i_version" flag needs to be on
stable storage, e.g. the lsb of the i_version. This will ensure that
any change after NFSD saw the i_version will cause the i_version to
be updated.
So I think it can provide correct semantics.
Precise details:
NFSD: when reading i_version
    take lock
    tmp = i_version
    i_version |= 1
    drop lock
    return tmp & ~1;

VFS when making any change:
    changed = 0
    take lock
    if (i_version & 1) {
        i_version++;
        changed = 1
    }
    drop lock
    if changed, sync inode
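
For illustration only, the two steps above can be modelled in userspace C.
This is a rough sketch, not kernel code: struct model_inode,
model_sync_inode() and the syncs counter are hypothetical stand-ins, a
pthread mutex plays the role of "the lock", and writing the flag itself to
stable storage is not modelled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for an inode (not the kernel's struct inode). */
struct model_inode {
    pthread_mutex_t lock;
    uint64_t i_version;  /* lsb set means "NFSD has seen this value" */
    unsigned syncs;      /* counts simulated writes of the inode to disk */
};

/* Stand-in for "sync inode": only counts how often it was requested. */
static void model_sync_inode(struct model_inode *inode)
{
    inode->syncs++;
}

/* NFSD side: report the version with the flag masked off, and set the flag. */
static uint64_t nfsd_read_version(struct model_inode *inode)
{
    uint64_t tmp;

    pthread_mutex_lock(&inode->lock);
    tmp = inode->i_version;
    inode->i_version |= 1;
    pthread_mutex_unlock(&inode->lock);
    return tmp & ~(uint64_t)1;
}

/* VFS side: bump the version only if NFSD has seen the current one.
 * Incrementing an odd value yields an even one, which clears the flag. */
static bool vfs_note_change(struct model_inode *inode)
{
    bool changed = false;

    pthread_mutex_lock(&inode->lock);
    if (inode->i_version & 1) {
        inode->i_version++;
        changed = true;
    }
    pthread_mutex_unlock(&inode->lock);
    if (changed)
        model_sync_inode(inode);
    return changed;
}

/* Walks through one read / change / change / read sequence. */
static void model_demo(void)
{
    struct model_inode ino = { .lock = PTHREAD_MUTEX_INITIALIZER };
    uint64_t v1 = nfsd_read_version(&ino);  /* flag is now set */

    assert(vfs_note_change(&ino));          /* first change bumps and syncs */
    assert(!vfs_note_change(&ino));         /* second change: nothing to do */
    assert(nfsd_read_version(&ino) != v1);  /* NFSD sees a new value */
    assert(ino.syncs == 1);
}
```

In this model only the first change after an NFSD read costs an inode
sync; later changes leave i_version alone until NFSD reads it again.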
>
> Is the perceived performance hit really going to be as large
> as suspected? We already update the time fields fairly often
> and we don't pay a huge penalty for those, or at least not a
> penalty that we aren't willing to pay. Has anyone measured
> the cost?
Correct NFS semantics require that the i_version be written to disk
before (or when) the change is committed. That means lots more inodes
in the journal.
If you are already doing data=journal, the hit probably isn't too
high (?).
You are right: measuring the cost is important. However, as we are
designing a generic filesystem interface, we need to understand the
cost on multiple filesystems in a variety of configurations ... or
give the filesystem complete information and let it decide the optimal
implementation.
Giving the filesystem full information means having an inode_operation
"nfsd_reads_version" which returns the number to be used as change_id.
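
As a rough sketch of what such a hook might look like (assumptions
throughout: the sketch_* types, the fallback to the raw i_version, and the
example filesystem callback are all hypothetical and exist in no kernel
tree; only the name nfsd_reads_version comes from the paragraph above):

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_inode;

/* Hypothetical operations table carrying the suggested hook. */
struct sketch_inode_operations {
    uint64_t (*nfsd_reads_version)(struct sketch_inode *inode);
};

/* Hypothetical inode: just a version counter and its operations. */
struct sketch_inode {
    uint64_t i_version;
    const struct sketch_inode_operations *i_op;
};

/* What NFSD would call to obtain a change_id. Falling back to the raw
 * i_version when no hook is provided is an assumption of this sketch. */
static uint64_t sketch_change_id(struct sketch_inode *inode)
{
    if (inode->i_op && inode->i_op->nfsd_reads_version)
        return inode->i_op->nfsd_reads_version(inode);
    return inode->i_version;
}

/* One possible filesystem-side implementation: the lsb-flag scheme,
 * with locking left out to keep the sketch short. */
static uint64_t flagging_reads_version(struct sketch_inode *inode)
{
    uint64_t tmp = inode->i_version;

    inode->i_version |= 1;          /* remember that NFSD has seen it */
    return tmp & ~(uint64_t)1;      /* hide the flag bit from NFSD */
}

static const struct sketch_inode_operations flagging_ops = {
    .nfsd_reads_version = flagging_reads_version,
};
```

The point of the indirection is that a filesystem which already knows a
cheaper way to produce a change_id could supply its own callback instead
of the flagging one shown here.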
NeilBrown