linux-nfs.vger.kernel.org archive mirror
From: Neil Brown <neilb@suse.de>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	cluster-devel@redhat.com, nfs@lists.sourceforge.net,
	Jeff Layton <jlayton@redhat.com>
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover
Date: Mon, 30 Apr 2007 09:10:38 +1000
Message-ID: <17973.9710.650004.160243@notabene.brown>
In-Reply-To: message from J. Bruce Fields on Sunday April 29

On Sunday April 29, bfields@fieldses.org wrote:
> On Sat, Apr 28, 2007 at 08:22:55AM +1000, Neil Brown wrote:
> > A flag to unexport cannot work because we don't call unexport - we
> > just flush a kernel cache.
> > 
> > A flag to export is just .... weird.  All the other export flags are
> > state flags.  This would be an action flag.  They are quite different
> > things.   Setting a state flag again is a no-op.  Setting an action
> > flag again has a very real effect.
> 
> In this case the second set shouldn't have any effect--whatever flag is
> set should prevent further locks from being accepted, shouldn't it?  (If
> it matters.)

Yes, I guess a "No locks are allowed against this export" flag makes
more sense than a "Remove all locks on this export now" flag.
Though currently the locks are against the filesystem - the export can
disappear from the cache while the locks remain - so it's a long way
from perfect.  Possibly we could insist that the export remains in the
kernel while files are locked .... but we update export flags by
replacing the export, so that would be a little awkward.

Also, I think I was half-thinking about the "reset the grace period"
operation, and that looks a lot like an action.... unless you make it
  grace_period_ends=seconds-since-epoch.

That might work.
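
The difference between an action and a state can be sketched as
follows.  This is only an illustrative model in Python, not kernel
code; the names GraceState, set_grace_end, reset_grace and in_grace are
made up here.  Writing the same absolute end time twice leaves the
state unchanged, whereas a "reset the grace period" action moves the
end time on every call.

```python
class GraceState:
    """Toy model of a grace period stored as an absolute end time."""

    def __init__(self):
        self.grace_end = 0  # seconds since epoch; 0 means no grace period

    def set_grace_end(self, when):
        # State flag: storing the same value again changes nothing.
        self.grace_end = when

    def reset_grace(self, now, length):
        # Action flag: every call restarts the period from 'now'.
        self.grace_end = now + length

    def in_grace(self, now):
        return now < self.grace_end


state = GraceState()
state.set_grace_end(1000)
state.set_grace_end(1000)               # repeated write is a no-op
print(state.grace_end)

state.reset_grace(now=950, length=90)
state.reset_grace(now=980, length=90)   # repeated action moves the end time
print(state.grace_end)
```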

> 
> > Also, each filesystem is potentially exported multiple times for
> > different sets of clients.  If such a flag (whether on 'export' or
> > 'unexport') just said "remove locks from this set of clients" it
> > wouldn't meet the needs, and if it said "remove all locks" it would be
> > a very irregular interface.
> 
> The same could be said of the "fsid=" option on exports.  It doesn't
> make sense to provide different filehandle- or path- name spaces
> depending on the IP address of a client.  If my laptop changes IP
> address, then I can (grudgingly) accept the fact that the server may
> have to deny me access that I had before--maybe it just can't trust the
> network I moved to for whatever reason--but I'd really rather it didn't
> suddenly start giving me paths, or different filehandles, or different
> semantics (like sync vs. async).
> 
> So the export interface is already being used for stuff that's really
> intended to be per-filesystem rather than per-(filesystem, client) pair.

ro/rw is often different based on client address, but yes: a lot of
the flags don't really make sense being different for different
clients on the same filesystem.

My feeling was that the "nolocks" flag is essentially pointless unless
it is the same for all exports on the one filesystem, and that gives
it a very different feel.

To make use of such a flag you could not rely on the normal mechanism
for loading flag information: on-demand loading by mountd.
You would need to look through /proc/fs/nfsd/exports, find all the
current exports for the filesystem, tell the kernel to change each
export to have the "nolocks" flag.  And then when you have done all of
that, you want to immediately remove all those export entries so you
can unmount the filesystem.
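
The bookkeeping described above might look roughly like this from user
space.  It is a sketch only: the sample text imitates the usual
"path, then client(flags)" layout of /proc/fs/nfsd/exports, the
"nolocks" flag does not exist, and matching on the path prefix stands
in for "same filesystem", which a real tool would have to determine
properly.  The function names are hypothetical.

```python
SAMPLE_EXPORTS = """\
# Version 1.1
# Path Client(Flags) # IPs
/srv/data\t192.168.1.0/24(rw,sync,no_subtree_check)
/srv/data\t10.0.0.5(ro,sync,no_subtree_check)
/srv/other\t*(ro,sync)
"""


def exports_for(text, fs_path):
    """Return (path, client, flags) for every export at or under fs_path."""
    found = []
    prefix = fs_path.rstrip("/") + "/"
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        path, _, rest = line.partition("\t")
        client, _, flags = rest.partition("(")
        flags = flags.rstrip(")").split(",")
        if path == fs_path or path.startswith(prefix):
            found.append((path, client, flags))
    return found


def with_nolocks(entries):
    """Every export of the filesystem has to be rewritten, not just one."""
    return [(p, c, f + ["nolocks"]) for p, c, f in entries]


for path, client, flags in with_nolocks(exports_for(SAMPLE_EXPORTS, "/srv/data")):
    print(path, client, ",".join(flags))
```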

So while it could be made to work, it doesn't feel clean at all.

A   grace_period_ends=seconds-since-epoch  flag would not have most of
those problems.  e.g. it could be demand loaded.
But there is the risk that it might be set for some exports on a given
filesystem and not for others.  And the consequence of that is that
some clients might not be able to reclaim their locks (because the
lock has already been given to a client which didn't know about the
new grace period).
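
To illustrate that risk: a check along the lines below would flag a
filesystem whose exports disagree about when the grace period ends.
This is hypothetical user-space Python with made-up export records,
not an existing tool or kernel interface.

```python
from collections import defaultdict


def inconsistent_grace(exports):
    """exports: iterable of (fs_path, client, grace_end or None).

    Return {fs_path: set of distinct grace_end values} for every
    filesystem whose exports do not all carry the same value.
    """
    seen = defaultdict(set)
    for fs_path, _client, grace_end in exports:
        seen[fs_path].add(grace_end)
    return {fs: vals for fs, vals in seen.items() if len(vals) > 1}


exports = [
    ("/srv/data", "192.168.1.0/24", 1177888000),
    ("/srv/data", "10.0.0.5", None),   # this export never got the new setting
    ("/srv/other", "*", 1177888000),
]
print(inconsistent_grace(exports))
```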

Now maybe it would be good to have a bunch of nfsd options that are
explicitly per-filesystem rather than per-export.
Maybe that is the sort of interface we should be designing.
  echo "+nolocks /path/to/filesystem" > /proc/fs/nfsd/filesystem_settings
  echo "grace_end=12345678 /path/to/filesystem" > /proc/....
  echo "-write_gather /path" > .....
  

We would need to be clear on how long those settings remain in the
kernel, how it can be told to completely forget a particular
filesystem, etc.

But we probably don't need to go overboard straight away.
I like the interface:
   echo -n "flag flag .. /path/name" >  /proc/fs/nfsd/filesystem_settings

where, if a flag is written as "?flag", its value is returned by a
subsequent read on the same file descriptor.

At this point we only need "nolocks" and "grace_end".
The grace_end information persists until that point in time.
The "nolocks" information .... doesn't persist(?).
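
How a line written to such a file might be split up can be sketched in
a few lines of Python.  The syntax is inferred from the examples in
this mail only; the function name parse_settings is made up, a bare
flag is assumed to mean the same as "+flag", and paths containing
whitespace are not handled.

```python
def parse_settings(line):
    """Split 'flag flag .. /path/name' into (path, updates, queries).

    Assumed syntax:
      +name or name  -> set a boolean flag
      -name          -> clear a boolean flag
      name=value     -> set a valued flag (value kept as a string)
      ?name          -> ask for the current value
    """
    *flags, path = line.split()
    if not path.startswith("/"):
        raise ValueError("last field must be an absolute path")
    updates, queries = {}, []
    for flag in flags:
        if flag.startswith("?"):
            queries.append(flag[1:])
        elif flag.startswith("-"):
            updates[flag[1:]] = False
        elif "=" in flag:
            name, _, value = flag.partition("=")
            updates[name] = value
        else:
            updates[flag.lstrip("+")] = True
    return path, updates, queries


print(parse_settings("+nolocks grace_end=12345678 /path/to/filesystem"))
print(parse_settings("?grace_end /path/to/filesystem"))
```

Keeping queries separate from updates matches the idea that a "?flag"
write only selects what the next read on the same descriptor returns.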

NeilBrown


