From: Ian Kent <ikent@redhat.com>
To: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>,
Lucas Karpinski <lkarpins@redhat.com>,
viro@zeniv.linux.org.uk, brauner@kernel.org, raven@themaw.net,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
Alexander Larsson <alexl@redhat.com>,
Eric Chanudet <echanude@redhat.com>
Subject: Re: [RFC v3 1/1] fs/namespace: remove RCU sync for MNT_DETACH umount
Date: Mon, 1 Jul 2024 09:08:06 +0800 [thread overview]
Message-ID: <b3bd4181-daf1-457e-807d-b252673d5042@redhat.com> (raw)
In-Reply-To: <20240628111345.3bbcgie4gar6icyj@quack3>
On 28/6/24 19:13, Jan Kara wrote:
> On Fri 28-06-24 10:58:54, Ian Kent wrote:
>> On 27/6/24 19:54, Jan Kara wrote:
>>> On Thu 27-06-24 09:11:14, Ian Kent wrote:
>>>> On 27/6/24 04:47, Matthew Wilcox wrote:
>>>>> On Wed, Jun 26, 2024 at 04:07:49PM -0400, Lucas Karpinski wrote:
>>>>>> +++ b/fs/namespace.c
>>>>>> @@ -78,6 +78,7 @@ static struct kmem_cache *mnt_cache __ro_after_init;
>>>>>> static DECLARE_RWSEM(namespace_sem);
>>>>>> static HLIST_HEAD(unmounted); /* protected by namespace_sem */
>>>>>> static LIST_HEAD(ex_mountpoints); /* protected by namespace_sem */
>>>>>> +static bool lazy_unlock = false; /* protected by namespace_sem */
>>>>> That's a pretty ugly way of doing it. How about this?
>>>> Ha!
>>>>
>>>> That was my original thought but I also didn't much like changing all the
>>>> callers.
>>>>
>>>> I don't really like the proliferation of these small helper functions either
>>>> but if everyone
>>>>
>>>> is happy to do this I think it's a great idea.
>>> So I know you've suggested removing synchronize_rcu_expedited() call in
>>> your comment to v2. But I wonder why is it safe? I *thought*
>>> synchronize_rcu_expedited() is there to synchronize the dropping of the
>>> last mnt reference (and maybe something else) - see the comment at the
>>> beginning of mntput_no_expire() - and this change would break that?
>> Interesting, because of the definition of lazy umount I didn't look closely
>> enough at that.
>>
>> But I wonder, how exactly would that race occur, is holding the rcu read
>> lock sufficient since the rcu'd mount free won't be done until it's
>> released (at least I think that's how rcu works).
> I'm concerned about a race like:
>
> [path lookup] [umount -l]
> ...
> path_put()
> mntput(mnt)
> mntput_no_expire(m)
> rcu_read_lock();
> if (likely(READ_ONCE(mnt->mnt_ns))) {
> do_umount()
> umount_tree()
> ...
> mnt->mnt_ns = NULL;
> ...
> namespace_unlock()
> mntput(&m->mnt)
> mntput_no_expire(mnt)
> smp_mb();
> mnt_add_count(mnt, -1);
> count = mnt_get_count(mnt);
> if (count != 0) {
> ...
> return;
> mnt_add_count(mnt, -1);
> rcu_read_unlock();
> return;
> -> KABOOM, mnt->mnt_count dropped to 0 but nobody cleaned up the mount!
> }
>
> And this scenario is exactly prevented by synchronize_rcu() in
> namespace_unlock().
I just wanted to say that I don't have a reply to this yet, having been
distracted
looking at the concern that Christian raised, in fact this looks like it
will be hard
to grok ...
Ian
next prev parent reply other threads:[~2024-07-01 1:08 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-26 20:07 [RFC v3 0/1] fs/namespace: defer RCU sync for MNT_DETACH umount Lucas Karpinski
2024-06-26 20:07 ` [RFC v3 1/1] fs/namespace: remove " Lucas Karpinski
2024-06-26 20:47 ` Matthew Wilcox
2024-06-27 1:11 ` Ian Kent
2024-06-27 11:54 ` Jan Kara
2024-06-27 15:16 ` Christian Brauner
2024-06-28 3:17 ` Ian Kent
2024-06-28 12:54 ` Christian Brauner
2024-06-28 15:13 ` Alexander Larsson
2024-07-01 0:58 ` Ian Kent
2024-07-01 5:50 ` Christian Brauner
2024-07-01 8:03 ` Ian Kent
2024-07-01 8:41 ` Alexander Larsson
2024-07-01 10:15 ` Jan Kara
2024-07-01 12:13 ` Christian Brauner
2024-07-01 12:10 ` Christian Brauner
2024-07-03 9:22 ` Christian Brauner
2024-07-04 1:23 ` Ian Kent
2024-07-02 1:29 ` Ian Kent
2024-07-02 4:50 ` Christian Brauner
2024-06-28 2:58 ` Ian Kent
2024-06-28 11:13 ` Jan Kara
2024-07-01 1:08 ` Ian Kent [this message]
2024-07-02 4:58 ` Christian Brauner
2024-07-02 7:01 ` Ian Kent
2024-07-02 10:01 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b3bd4181-daf1-457e-807d-b252673d5042@redhat.com \
--to=ikent@redhat.com \
--cc=alexl@redhat.com \
--cc=brauner@kernel.org \
--cc=echanude@redhat.com \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lkarpins@redhat.com \
--cc=raven@themaw.net \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).