From: Christian Brauner <brauner@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: Ian Kent <ikent@redhat.com>, Matthew Wilcox <willy@infradead.org>,
Lucas Karpinski <lkarpins@redhat.com>,
viro@zeniv.linux.org.uk, raven@themaw.net,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
Alexander Larsson <alexl@redhat.com>,
Eric Chanudet <echanude@redhat.com>
Subject: Re: [RFC v3 1/1] fs/namespace: remove RCU sync for MNT_DETACH umount
Date: Thu, 27 Jun 2024 17:16:29 +0200 [thread overview]
Message-ID: <20240627-abkassieren-grinsen-6ce528fe5526@brauner> (raw)
In-Reply-To: <20240627115418.lcnpctgailhlaffc@quack3>
On Thu, Jun 27, 2024 at 01:54:18PM GMT, Jan Kara wrote:
> On Thu 27-06-24 09:11:14, Ian Kent wrote:
> > On 27/6/24 04:47, Matthew Wilcox wrote:
> > > On Wed, Jun 26, 2024 at 04:07:49PM -0400, Lucas Karpinski wrote:
> > > > +++ b/fs/namespace.c
> > > > @@ -78,6 +78,7 @@ static struct kmem_cache *mnt_cache __ro_after_init;
> > > > static DECLARE_RWSEM(namespace_sem);
> > > > static HLIST_HEAD(unmounted); /* protected by namespace_sem */
> > > > static LIST_HEAD(ex_mountpoints); /* protected by namespace_sem */
> > > > +static bool lazy_unlock = false; /* protected by namespace_sem */
> > > That's a pretty ugly way of doing it. How about this?
> >
> > Ha!
> >
> > That was my original thought but I also didn't much like changing all the
> > callers.
> >
> > I don't really like the proliferation of these small helper functions either
> > but if everyone
> >
> > is happy to do this I think it's a great idea.
>
> So I know you've suggested removing synchronize_rcu_expedited() call in
> your comment to v2. But I wonder why is it safe? I *thought*
> synchronize_rcu_expedited() is there to synchronize the dropping of the
> last mnt reference (and maybe something else) - see the comment at the
> beginning of mntput_no_expire() - and this change would break that?
Yes. During umount mnt->mnt_ns will be set to NULL with namespace_sem
and the mount seqlock held. mntput() doesn't acquire namespace_sem as
that would get rather problematic during path lookup. It also elides
lock_mount_hash() by looking at mnt->mnt_ns because that's set to NULL
when a mount is actually unmounted.
So iirc synchronize_rcu_expedited() will ensure that it is actually the
system call that shuts down all the mounts it put on the umounted list
and not some other task that also called mntput() as that would cause
pretty blatant EBUSY issues.
So callers that come before mnt->mnt_ns = NULL simply return of course
but callers that come after mnt->mnt_ns = NULL will acquire
lock_mount_hash() _under_ rcu_read_lock(). These callers see an elevated
reference count and thus simply return while namespace_lock()'s
synchronize_rcu_expedited() prevents the system call from making
progress.
But I also don't see it working without risk even with MNT_DETACH. It
still has potential to cause issues in userspace. Any program that
always passes MNT_DETACH simply to ensure that even in the very rare
case that a mount might still be busy is unmounted might now end up
seeing increased EBUSY failures for mounts that didn't actually need to
be unmounted with MNT_DETACH. In other words, this is only inocuous if
userspace only uses MNT_DETACH for stuff they actually know is busy when
they're trying to unmount. And I don't think that's the case.
next prev parent reply other threads:[~2024-06-27 15:16 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-26 20:07 [RFC v3 0/1] fs/namespace: defer RCU sync for MNT_DETACH umount Lucas Karpinski
2024-06-26 20:07 ` [RFC v3 1/1] fs/namespace: remove " Lucas Karpinski
2024-06-26 20:47 ` Matthew Wilcox
2024-06-27 1:11 ` Ian Kent
2024-06-27 11:54 ` Jan Kara
2024-06-27 15:16 ` Christian Brauner [this message]
2024-06-28 3:17 ` Ian Kent
2024-06-28 12:54 ` Christian Brauner
2024-06-28 15:13 ` Alexander Larsson
2024-07-01 0:58 ` Ian Kent
2024-07-01 5:50 ` Christian Brauner
2024-07-01 8:03 ` Ian Kent
2024-07-01 8:41 ` Alexander Larsson
2024-07-01 10:15 ` Jan Kara
2024-07-01 12:13 ` Christian Brauner
2024-07-01 12:10 ` Christian Brauner
2024-07-03 9:22 ` Christian Brauner
2024-07-04 1:23 ` Ian Kent
2024-07-02 1:29 ` Ian Kent
2024-07-02 4:50 ` Christian Brauner
2024-06-28 2:58 ` Ian Kent
2024-06-28 11:13 ` Jan Kara
2024-07-01 1:08 ` Ian Kent
2024-07-02 4:58 ` Christian Brauner
2024-07-02 7:01 ` Ian Kent
2024-07-02 10:01 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240627-abkassieren-grinsen-6ce528fe5526@brauner \
--to=brauner@kernel.org \
--cc=alexl@redhat.com \
--cc=echanude@redhat.com \
--cc=ikent@redhat.com \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lkarpins@redhat.com \
--cc=raven@themaw.net \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).