linux-ext4.vger.kernel.org archive mirror
From: Andreas Dilger <adilger@clusterfs.com>
To: Theodore Tso <tytso@mit.edu>
Cc: Cordenner jean noel <jean-noel.cordenner@bull.net>,
	linux-ext4@vger.kernel.org
Subject: Re: [RFC] [patch 2/3] change attribute for ext4: ext4 specific code
Date: Thu, 14 Dec 2006 15:57:10 -0700
Message-ID: <20061214225710.GM5937@schatzie.adilger.int>
In-Reply-To: <20061214160307.GE9079@thunk.org>

On Dec 14, 2006  11:03 -0500, Theodore Tso wrote:
> There was discussion on yesterday's call about whether or not 32-bit
> was enough for NFSv4, or whether it also required 64 bits of change
> notification in the RFCs.  So one of the questions is whether this is
> something that would justify requiring 64 bits --- and if so, maybe we
> need to require that big inodes be used and store the entire 64-bit
> value beyond 128 bytes.  This would mean that NFSv4 cache management
> couldn't be fully implemented without big inodes, or we'd have to make
> do by using the inode ctime as a partial substitute.

Per Trond and Bruce Fields' reply to my email, it seems that NFSv4 only
needs to compare the version for inequality.  If the change numbers are
sequential for a given inode, the client can OPTIONALLY extract additional
information about the server (i.e. the client still has an up-to-date
cache because it was the only one that did an update on a given file).
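
To make the two uses concrete, here is a minimal sketch in plain userspace
C.  All of the sketch_* names are hypothetical and are not taken from the
NFS code or from the Bull patch; it only illustrates the inequality test,
the optional "sequential" inference, and what happens when a counter only
32 bits wide wraps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the only check NFSv4 strictly needs.  The cached
 * data is stale whenever the change number differs at all. */
static bool sketch_cache_is_stale(uint64_t cached, uint64_t current)
{
	return cached != current;
}

/* Optional check, valid only if the server bumps the number by exactly
 * one per update: if the number advanced by exactly the count of updates
 * this client made, nobody else touched the file. */
static bool sketch_only_writer(uint64_t before, uint64_t after,
			       uint64_t my_updates)
{
	return after - before == my_updates;
}

/* Value of a 32-bit counter after n updates; it wraps modulo 2^32. */
static uint32_t sketch_bump32(uint32_t version, uint64_t n)
{
	return (uint32_t)(version + n);
}

/* Usage: exactly 2^32 updates bring a 32-bit counter back to its old
 * value, so the inequality test misses the change. */
static void sketch_demo(void)
{
	uint32_t old = 3;
	uint32_t now = sketch_bump32(old, UINT64_C(1) << 32);

	assert(!sketch_cache_is_stale(old, now));	/* wrongly "fresh" */
}
```

With a 64-bit counter the same wrap needs 2^64 updates, which is why the
wider field matters mostly for long-lived, frequently updated files.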

So, I think that for basic NFSv4 setups a 32-bit counter (2^32 values) is
sufficient (per Bull's original patch), but a 64-bit counter (2^64 values)
is desirable to avoid collisions and to allow the "sequential updates"
logic to work properly for long-lived files.

So, I think a 32-bit field in the small inode, and an additional 32-bit
field in the large inode would be perfect.  It allows this functionality
to work with existing ext3 filesystems, if not quite optimally.
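
Roughly, the split could behave like the sketch below.  This is an
illustration under assumptions, not the proposed patch: the struct, the
has_large flag (standing in for a check of the on-disk inode size) and the
sketch_* helpers are all hypothetical, and on-disk endianness conversion
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant on-disk inode fields. */
struct sketch_disk_inode {
	uint32_t version_lo;	/* in the first 128 bytes, always present */
	uint32_t version_hi;	/* in the large-inode area only */
	bool	 has_large;	/* stands in for an inode-size check */
};

/* Build the in-memory version: 64 bits if the high word exists on disk,
 * otherwise just the low 32 bits. */
static uint64_t sketch_load_version(const struct sketch_disk_inode *di)
{
	uint64_t v = di->version_lo;

	if (di->has_large)
		v |= (uint64_t)di->version_hi << 32;
	return v;
}

/* Write the version back; a small inode silently drops the high word. */
static void sketch_store_version(struct sketch_disk_inode *di, uint64_t v)
{
	di->version_lo = (uint32_t)v;
	if (di->has_large)
		di->version_hi = (uint32_t)(v >> 32);
}

/* Usage: the same 64-bit value round-trips only through a large inode. */
static void sketch_demo(void)
{
	struct sketch_disk_inode big = { 0, 0, true };
	struct sketch_disk_inode small = { 0, 0, false };

	sketch_store_version(&big, UINT64_C(0x100000002));
	sketch_store_version(&small, UINT64_C(0x100000002));
	assert(sketch_load_version(&big) == UINT64_C(0x100000002));
	assert(sketch_load_version(&small) == 2);
}
```

The small-inode case is the "not quite optimal" one: it still changes on
every update, but it wraps after 2^32 of them.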

In addition, for Lustre, could we get a 64-bit field in the superblock
that contains the fs-wide version number?

I'm proposing that, per the original Bull patch, l_i_reserved1 be changed
to be i_version for Linux, and that we add i_version_hi after cr_time_extra
in the large inode.  The disk i_version would be stored in the vfs_inode
i_version (which is already used for this same purpose).  It would be good
for NFSv4 if the i_version field could be expanded to 64 bits, to avoid
the need for it to have fs-specific operations, but failing that we can
put the high word into ext4_inode_info and NFS can access it via
export_operations, I think.
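
As a rough userspace model of that fallback (the VFS field stays 32 bits
wide, the high word lives in the fs-private inode, and NFS asks the
filesystem for the full value): every type and function below is a
hypothetical stand-in, and in particular get_change is NOT an existing
export_operations method, just a placeholder for whatever hook would be
added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the generic VFS inode; i_version is modelled as 32 bits
 * to represent the case where the VFS field is not widened. */
struct sketch_vfs_inode {
	uint32_t i_version;
};

/* Stand-in for ext4_inode_info: fs-private data wrapping the VFS inode. */
struct sketch_ext4_inode_info {
	uint32_t i_version_hi;		/* high word, kept by the fs */
	struct sketch_vfs_inode vfs_inode;
};

/* Hypothetical per-filesystem hook that NFS would call. */
struct sketch_export_ops {
	uint64_t (*get_change)(const struct sketch_vfs_inode *inode);
};

/* Recover the fs-private inode from the embedded VFS inode (the usual
 * container_of idiom) and join the two halves. */
static uint64_t sketch_ext4_get_change(const struct sketch_vfs_inode *inode)
{
	const struct sketch_ext4_inode_info *ei =
		(const struct sketch_ext4_inode_info *)
		((const char *)inode -
		 offsetof(struct sketch_ext4_inode_info, vfs_inode));

	return ((uint64_t)ei->i_version_hi << 32) | inode->i_version;
}

/* Usage: the caller only sees the generic inode and the ops table. */
static void sketch_demo(void)
{
	struct sketch_ext4_inode_info ei = {
		.i_version_hi = 1,
		.vfs_inode = { .i_version = 2 },
	};
	struct sketch_export_ops ops = {
		.get_change = sketch_ext4_get_change,
	};

	assert(ops.get_change(&ei.vfs_inode) == UINT64_C(0x100000002));
}
```

If the VFS i_version were 64 bits wide instead, the hook and the extra
field in the fs-private inode would both go away, which is the argument
for widening it.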

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
