linux-nfs.vger.kernel.org archive mirror
From: Neil Brown <neilb@suse.de>
To: wcheng@redhat.com
Cc: cluster-devel@redhat.com, nfs@lists.sourceforge.net
Subject: Re: [Cluster-devel] [PATCH 0/4 Revised] NLM - lock failover
Date: Thu, 26 Apr 2007 15:43:38 +1000
Message-ID: <17968.15370.88587.653447@notabene.brown>
In-Reply-To: message from Wendy Cheng on Thursday April 26

On Thursday April 26, wcheng@redhat.com wrote:
> 
> A convincing argument... unfortunately, this happens to be a case where 
> we need to protect server from client's misbehaviors. For a local 
> filesystem (ext3), if any file reference count is not zero (i.e. some 
> clients are still holding the locks), the filesystem can't be 
> un-mounted. We would have to fail the failover to avoid data corruption.

I think this is a tangential problem.
"Removing locks held by troublesome clients so that I can unmount my
filesystem" is quite different from "removing locks held by clients
using virtual-NAS-foo so they can be migrated".

I would have no problem with a new file in the nfsd filesystem such
that
      echo /some/path > /proc/fs/nfsd/nlm_drop_locks
would cause lockd to drop all locks on all files with the same 'struct
super_block' as the inode of "/some/path" (its ->i_sb).
But I think that is independent functionality, that might be useful to
people who aren't doing active-active failover, but happens also to be
useful in conjunction with active-active failover.

We could discuss whether it should be "same superblock" or "same
vfsmount".  Both make sense to some extent.  The latter is possibly
more flexible.

If you had this interface, you might not need to send the various RPC
calls to lockd to get it to drop locks.... but then if you had a
cluster filesystem and wanted to only move some clients to a different
host, you would not want to drop *all* the locks on the filesystem, so
maybe both interfaces are still needed.
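
As a rough illustration of why both interfaces might be wanted, here is
a toy user-space model (not kernel code).  The Lock record, its field
names, and both helper functions are hypothetical; "sb" merely stands in
for the superblock that a path resolves to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    client: str     # client host holding the lock
    server_ip: str  # server address the client talks to (hypothetical field)
    sb: str         # stand-in for the file's superblock
    path: str


def drop_by_superblock(locks, sb):
    """Model of the proposed nlm_drop_locks: drop every lock on one filesystem."""
    return [lock for lock in locks if lock.sb != sb]


def drop_by_server_ip(locks, server_ip):
    """Model of a selective drop: only locks taken via one server address."""
    return [lock for lock in locks if lock.server_ip != server_ip]


locks = [
    Lock("clientA", "10.0.0.1", "sb1", "/export/a"),
    Lock("clientB", "10.0.0.2", "sb1", "/export/b"),
    Lock("clientC", "10.0.0.2", "sb2", "/other/c"),
]

# Unmounting sb1: every lock on that filesystem goes, whoever holds it.
after_unmount = drop_by_superblock(locks, "sb1")

# Migrating the clients of 10.0.0.2: clientA keeps its lock on sb1.
after_migrate = drop_by_server_ip(locks, "10.0.0.2")
```

In this model the per-superblock drop removes clientA's lock as well,
which is what you would not want when moving only some clients of a
cluster filesystem.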

> 
> IMHO, having grace period for each client (host) is overkilled.

Yes, it gives you much more flexibility than you would ever want or
use, and in that sense it is overkill.
But it also makes available the specific flexibility that you do want
(grace period per local-address) with an extremely simple change to the
lockd interface, which I think is a big win.
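
To make "grace period per local-address" concrete, here is a minimal
sketch of the bookkeeping only.  It is a toy model and an assumption
throughout: the GraceTable class, its methods, and the use of plain
numbers for time are hypothetical and say nothing about how lockd
would actually implement or expose this.

```python
class GraceTable:
    """Tracks a separate grace-period deadline for each server local address."""

    def __init__(self):
        self._until = {}  # local address -> time at which grace ends

    def start(self, local_addr, now, duration):
        """Begin (or restart) a grace period for one local address."""
        self._until[local_addr] = now + duration

    def in_grace(self, local_addr, now):
        """True while only lock reclaims should be accepted via this address."""
        return now < self._until.get(local_addr, float("-inf"))


grace = GraceTable()
# The address 10.0.0.2 has just been taken over from another node.
grace.start("10.0.0.2", now=100, duration=90)
```

Requests arriving through 10.0.0.2 would be treated as in grace until
time 190, while requests through any other address on the same host are
unaffected.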

> >
> >[snip]
> >I feel it has taken me quite a while to gain a full understanding of
> >what you are trying to achieve.  Maybe it would be useful to have a
> >concise/precise description of what the goal is.
> >I think a lot of the issues have now become clear, but it seems there
> >remains the issue of what system-wide configurations are expected, and
> >what configuration we can rule 'out of scope' and decide we don't have
> >to deal with.
> >  
> >
> I'm trying to do the write-up now. But could the following temporarily 
> serve the purpose ? What is not clear from this thread of discussion?
> 
> http://www.redhat.com/archives/linux-cluster/2006-June/msg00050.html

Lots of things are not clear - mostly things that have since become
clear in the ongoing discussion.
 - The many-IPs to many-filesystems possibility
 - The need to explicitly handle mis-configured clients
 - The details of needs with respect to SM_NOTIFY callbacks
 - The "big picture" stuff.

I confess that I had a much shallower understanding of how statd
interacts with lockd when this discussion first started.
I'm sure that slowed me down in understanding the key issues, and in
suggesting workable possibilities.

I am sorry that this has taken so long.  However, I think we are very
close to a solution that will meet everybody's needs.  And you've
found some bugs along the way!!

NeilBrown

