From: Joshua Watt <jpewhacker@gmail.com>
To: NeilBrown <neilb@suse.com>, Jeff Layton <jlayton@redhat.com>,
Trond Myklebust <trond.myklebust@primarydata.com>,
"J . Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
David Howells <dhowells@redhat.com>
Subject: Re: [RFC v4 0/9] NFS Force Unmounting
Date: Mon, 04 Dec 2017 08:36:34 -0600
Message-ID: <1512398194.7031.56.camel@gmail.com>
In-Reply-To: <20171117174552.18722-1-JPEWhacker@gmail.com>
On Fri, 2017-11-17 at 11:45 -0600, Joshua Watt wrote:
> Latest attempt at unifying the various constraints for force
> umounting NFS servers that have disappeared in a timely manner.
> Namely:
>  * umount2(..., MNT_FORCE) should not be made stronger, unless we
>    know this is the final umount()
>  * The "failed server" state should be reversible
>  * The mechanism should be able to "unstick" a sync(2) that is stuck
>    on an unresponsive server
>
> I believe the proposal satisfies all of these concerns. There are a
> few major components to this proposal:
>  1) The umount_begin superblock operation now has a corresponding
>     umount_end operation. This is invoked by umount() when MNT_FORCE
>     is specified (like umount_begin()), but the actual unmount failed
>     (i.e. the mount is busy).
>  2) Instead of killing all the RPCs queued at a single point in time,
>     the NFS mount now kills all queued RPCs and all RPCs that get
>     queued between nfs_umount_begin() and nfs_umount_end(). I believe
>     this is not a significant change in behavior because there were
>     always races between queuing RPCs and killing them in
>     nfs_umount_begin().
>  3) nfs_umount_end() is *not* called when MNT_DETACH is specified.
>     This is the indication that the user is done with this mount and
>     all RPCs will be killed until the mount finally gets removed.
>  4) The new "transient" mount option prevents sharing nfs_clients
>     between multiple superblocks. The only exception to this is when
>     the kernel does an automatic mount to cross a device boundary
>     ("xdev" NFS mount). In this case, the existing code always shares
>     the existing nfs_client from the parent superblock. The
>     "transient" mount option implies "nosharecache", as it doesn't
>     make sense to share superblocks if clients aren't shared.
>  5) If the "transient" mount option is specified (and hence the
>     nfs_client is not shared), MNT_FORCE kills all RPCs for the
>     entire nfs_client (and all its nfs_servers). This effectively
>     enables the "burn down the forest" option when combined with
>     MNT_DETACH.
>
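To make the begin/end window from items 1) to 3) concrete, here is a toy
userspace model in C. It is only a sketch: struct toy_client and every
toy_* function are made-up names for illustration, not the kernel's RPC
client or the code in this series, and the single flag stands in for
whatever state the patches actually keep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RPC client; NOT the kernel's rpc_clnt. */
struct toy_client {
    bool kill_new_tasks;  /* true between umount_begin and umount_end */
    int queued;           /* RPCs currently waiting */
    int killed;           /* RPCs aborted or rejected so far */
};

/* Queue one RPC; returns false if it was killed on arrival. */
static bool toy_queue_rpc(struct toy_client *c)
{
    assert(c != NULL);
    if (c->kill_new_tasks) {
        c->killed++;
        return false;
    }
    c->queued++;
    return true;
}

/* Models nfs_umount_begin(): abort what is queued, then keep killing. */
static void toy_umount_begin(struct toy_client *c)
{
    c->killed += c->queued;
    c->queued = 0;
    c->kill_new_tasks = true;
}

/* Models nfs_umount_end(): close the kill window again. */
static void toy_umount_end(struct toy_client *c)
{
    c->kill_new_tasks = false;
}

/*
 * Models umount2(MNT_FORCE) with or without MNT_DETACH.
 * umount_end only runs when the unmount failed because the mount was
 * busy and MNT_DETACH was not given. Returns true if RPCs are still
 * being killed afterwards.
 */
static bool toy_force_umount(struct toy_client *c, bool busy, bool detach)
{
    toy_umount_begin(c);
    if (busy && !detach)
        toy_umount_end(c);
    return c->kill_new_tasks;
}
```

In this model a plain forced umount of a busy mount leaves the client
usable again afterwards, while adding the detach flag leaves the kill
flag set for good, which matches the "user is done with this mount"
reading of item 3).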
> The choice to use MNT_FORCE as the mechanism for triggering this
> behavior stems from the desire to unstick sync(2) calls that might be
> blocked on a non-responsive NFS server. While it was previously
> discussed to remount with a new mount option to enable this behavior,
> this cannot release the blocked sync(2) call because both
> sync_filesystem() and do_remount() lock the struct
> superblock->s_umount reader-writer lock. As such, a remount will
> block until the sync(2) finishes, which is undesirable. umount2()
> doesn't have this restriction and can unblock the sync(2) call.
>
> For the most part, all existing behavior is unchanged if the
> "transient" option is not specified. umount -f continues to behave as
> it has, but umount -fl (see note below) now does a more aggressive
> kill to get everything out of there and allow unmounting in a more
> timely manner. Note that there will probably be some communication
> with the server still, as destruction of the NFS client ID and such
> will occur when the last reference is removed. If the server is truly
> gone, this can result in long blocks at that time.
>
> If it is known at mount time that the server might be disappearing,
> it should be mounted with "transient". Doing this will allow
> umount -fl to do a more complete cleanup and prevent all
> communication with the server, which should allow a timely cleanup in
> all cases.
>
> Notes:
>
> Until recently, libmount did not allow a forced and lazy unmount at
> the same time. This was recently fixed (see
> https://marc.info/?l=util-linux-ng&m=151000714401929&w=2). If you
> want to test this, you may need to write a simple test program that
> directly calls umount2() with MNT_FORCE | MNT_DETACH.
>
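For anyone without a fixed libmount, the core of such a test program
could look roughly like the helper below. It is a minimal sketch:
force_lazy_umount() is a name picked for this example, and the only
real interface used is umount2(2) from <sys/mount.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/*
 * Forced + lazy unmount of `target`, bypassing umount(8) and libmount.
 * Returns 0 on success or a negative errno value on failure.
 */
static int force_lazy_umount(const char *target)
{
    assert(target != NULL);

    if (umount2(target, MNT_FORCE | MNT_DETACH) == -1) {
        int err = errno;  /* save before fprintf can clobber it */
        fprintf(stderr, "umount2(%s): %s\n", target, strerror(err));
        return -err;
    }
    return 0;
}
```

Calling this from a small main() with the mount point taken from
argv[1] gives the test program described in the note. It has to run
with enough privilege to unmount (normally root), otherwise umount2()
fails with EPERM.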
> Thank you all again for your time and comments,
> Joshua Watt
>
> Joshua Watt (9):
> SUNRPC: Add flag to kill new tasks
> SUNRPC: Expose kill_new_tasks in debugfs
> SUNRPC: Simplify client shutdown
> namespace: Add umount_end superblock operation
> NFS: Kill RPCs for the duration of umount
> NFS: Add debugfs for nfs_server and nfs_client
> NFS: Add transient mount option
> NFS: Don't shared transient clients
> NFS: Kill all client RPCs if transient
>
>  fs/namespace.c                 |  22 ++++++-
>  fs/nfs/Makefile                |   2 +-
>  fs/nfs/client.c                |  96 +++++++++++++++++++++++++--
>  fs/nfs/debugfs.c               | 143 +++++++++++++++++++++++++++++++++++++++++
>  fs/nfs/inode.c                 |   5 ++
>  fs/nfs/internal.h              |  11 ++++
>  fs/nfs/nfs3client.c            |   2 +
>  fs/nfs/nfs4client.c            |   5 ++
>  fs/nfs/nfs4super.c             |   1 +
>  fs/nfs/super.c                 |  81 ++++++++++++++++++++---
>  include/linux/fs.h             |   1 +
>  include/linux/nfs_fs_sb.h      |   6 ++
>  include/linux/sunrpc/clnt.h    |   1 +
>  include/uapi/linux/nfs_mount.h |   1 +
>  net/sunrpc/clnt.c              |  13 ++--
>  net/sunrpc/debugfs.c           |   5 ++
>  net/sunrpc/sched.c             |   3 +
>  17 files changed, 372 insertions(+), 26 deletions(-)
>  create mode 100644 fs/nfs/debugfs.c
>
Does anyone have any comments on this? Sorry for the churn, it took a
few tries to get this to where it would work. I also realize I should
have sent all my RFC patchsets in reply to each other instead of
starting a new thread for each one.
Thanks for your time,
Joshua Watt
Thread overview: 25+ messages
2017-11-17 17:45 [RFC v4 0/9] NFS Force Unmounting Joshua Watt
2017-11-17 17:45 ` [RFC v4 1/9] SUNRPC: Add flag to kill new tasks Joshua Watt
2017-12-05 22:59 ` NeilBrown
2017-11-17 17:45 ` [RFC v4 2/9] SUNRPC: Expose kill_new_tasks in debugfs Joshua Watt
2017-11-17 17:45 ` [RFC v4 3/9] SUNRPC: Simplify client shutdown Joshua Watt
2017-11-17 17:45 ` [RFC v4 4/9] namespace: Add umount_end superblock operation Joshua Watt
2017-12-06 11:54 ` Jeff Layton
2017-12-06 12:14 ` Al Viro
2017-12-06 12:33 ` Al Viro
2017-12-06 15:41 ` Joshua Watt
2017-11-17 17:45 ` [RFC v4 5/9] NFS: Kill RPCs for the duration of umount Joshua Watt
2017-12-05 23:07 ` NeilBrown
2017-11-17 17:45 ` [RFC v4 6/9] NFS: Add debugfs for nfs_server and nfs_client Joshua Watt
2017-11-17 17:45 ` [RFC v4 7/9] NFS: Add transient mount option Joshua Watt
2017-12-06 12:23 ` Jeff Layton
2017-11-17 17:45 ` [RFC v4 8/9] NFS: Don't shared transient clients Joshua Watt
2017-11-17 17:45 ` [RFC v4 9/9] NFS: Kill all client RPCs if transient Joshua Watt
2017-12-04 14:36 ` Joshua Watt [this message]
2017-12-05 23:34 ` [RFC v4 0/9] NFS Force Unmounting NeilBrown
2017-12-06 13:03 ` Jeff Layton
2017-12-06 16:40 ` Joshua Watt
2017-12-08 2:10 ` NeilBrown
2017-12-14 18:22 ` Joshua Watt
2017-12-14 21:52 ` NeilBrown
2017-12-18 21:48 ` Joshua Watt