allowing for a completely cached umount(2) pathwalk

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jeff Layton <jlayton@kernel.org>
To: Al Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>
Cc: Dave Wysochanski <dwysocha@redhat.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-nfs <linux-nfs@vger.kernel.org>,
	David Howells <dhowells@redhat.com>, NeilBrown <neilb@suse.de>,
	Christoph Hellwig <hch@lst.de>
Subject: allowing for a completely cached umount(2) pathwalk
Date: Thu, 13 Apr 2023 18:00:42 -0400	[thread overview]
Message-ID: <95ee689c76bf034fa2fe9fade0bccdb311f3a04f.camel@kernel.org> (raw)

David Wysochanski posted this a week ago:

    https://lore.kernel.org/linux-nfs/CALF+zOnizN1KSE=V095LV6Mug8dJirqk7eN1joX8L1-EroohPA@mail.gmail.com/

It describes a situation where there are nested NFS mounts on a client,
and one of the intermediate mounts ends up being unexported from the
server. In a situation like this, we end up being unable to pathwalk
down to the child mount of these unreachable dentries and can't unmount
anything, even as root.

A decade ago, we did some work to make the kernel not revalidate the
leaf dentry on a umount [1]. This helped some similar sorts of problems
but it doesn't help if the problem is an intermediate dentry.

The idea at the time was that umount(2) is a special case: We are
specifically looking to stop using the mount, so there's nothing to be
gained by revalidating its root dentry and inode.

Based on the problem Dave describes, I'd submit that umount(2) is
special in another way too: It's intended to manipulate the mount table
of the local host, so contacting the backing store (the NFS server in
this case) during a pathwalk doesn't really help anything. All we care
about is getting to the right spot in the mount tree.

A "modest" proposal:
--------------------
This is still somewhat handwavy, but what if we were to make umount(2)
an even more special case for the pathwalk? For the umount(2) pathwalk,
we could:

1/ walk down the dentry tree without calling ->d_revalidate: We don't
care about changes that might have happened remotely. All we care about
is walking down the cached dentries as they are at that moment.

2/ disallow ->lookup operations: a umount is about removing an existing
mount, so the dentries had better already be there.

3/ skip inode ->permission checks. We don't want to check with the
server about our permission to walk the path when we're looking to
unmount. We're walking down the path on the _local_ machine so we can
unuse it. The server should have no say-so in the matter. (We probably
would want to require CAP_SYS_ADMIN or CAP_DAC_READ_SEARCH for this of
course).

We might need other safety checks too that I haven't considered yet.

Is this a terrible idea? Are there potentially problems with
containerized setups if we were to do something like this? Are there
better ways to solve this problem (and others like it)? Maybe this would
be best done with a new UMOUNT_CACHED flag for umount2()?

-- 
Jeff Layton <jlayton@kernel.org>

[1] 8033426e6bdb vfs: allow umount to handle mountpoints without
revalidating them

next             reply	other threads:[~2023-04-13 22:00 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-13 22:00 Jeff Layton [this message]
2023-04-13 22:25 ` allowing for a completely cached umount(2) pathwalk Andreas Dilger
2023-04-13 22:41 ` NeilBrown
2023-04-14  2:43   ` Al Viro
2023-04-14  3:28     ` Trond Myklebust
2023-04-14  3:51       ` Al Viro
2023-04-14  4:06         ` Trond Myklebust
2023-04-14  4:21           ` Al Viro
2023-04-14  9:41         ` Christian Brauner
2023-04-14 10:09           ` Jeff Layton
2023-04-14 11:16             ` Christian Brauner
2023-04-14 12:33               ` Jeff Layton
2023-04-14 12:51                 ` Christian Brauner
2023-04-15  9:51             ` Amir Goldstein
2023-04-14 10:06     ` Jeff Layton
2023-04-14 13:41       ` Christian Brauner
2023-04-14 14:21         ` Trond Myklebust
2023-04-14 15:13           ` Christian Brauner
2023-04-14 15:30             ` Trond Myklebust
2023-04-14 15:57               ` Trond Myklebust
2023-04-14 16:22                 ` Al Viro
2023-04-14 16:41                   ` Trond Myklebust
2023-04-14 19:01                     ` Benjamin Coddington
2023-04-17  8:22                       ` Christian Brauner
2023-04-14 16:32               ` Christian Brauner
2023-04-14  2:32 ` Al Viro
2023-04-14 10:01   ` Jeff Layton
2023-04-14 12:18     ` Christian Brauner
2023-04-14 14:57     ` Al Viro
2023-04-14 13:16   ` David Wysochanski
2023-04-16 23:13 ` [PATCH/RFC] VFS: LOOKUP_MOUNTPOINT should used cached info whenever possible NeilBrown
2023-04-17 11:55   ` Christian Brauner
2023-04-17 12:25     ` Jeff Layton
2023-04-17 14:24       ` Christian Brauner
2023-04-17 15:21         ` Jeff Layton
2023-04-17 21:34           ` NeilBrown
2023-04-18  8:10             ` Christian Brauner
2023-04-18  3:25           ` Andreas Dilger
2023-04-18  8:04             ` Christian Brauner
2023-04-20 13:05               ` Jeff Layton
2023-04-20 15:41                 ` Christian Brauner
2023-04-17 21:26     ` NeilBrown
2023-04-20 21:35       ` Al Viro
2023-04-20 22:01         ` NeilBrown
2023-04-20 22:27           ` Al Viro
2023-04-17 12:09   ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=95ee689c76bf034fa2fe9fade0bccdb311f3a04f.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=brauner@kernel.org \
    --cc=dhowells@redhat.com \
    --cc=dwysocha@redhat.com \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).