From: Al Viro <viro@zeniv.linux.org.uk>
To: Rik van Riel <riel@surriel.com>
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com,
	linux-fsdevel@vger.kernel.org, paulmck@kernel.org,
	gscrivan@redhat.com, Eric Biederman <ebiederm@xmission.com>,
	Chris Mason <clm@fb.com>
Subject: Re: [PATCH 1/2] vfs: free vfsmount through rcu work from kern_unmount
Date: Fri, 18 Feb 2022 20:24:09 +0000
Message-ID: <YhAAaU5wSoFdMsQf@zeniv-ca.linux.org.uk>
In-Reply-To: <Yg/273dWmTKDW5Mu@zeniv-ca.linux.org.uk>

On Fri, Feb 18, 2022 at 07:43:43PM +0000, Al Viro wrote:
> On Fri, Feb 18, 2022 at 02:33:31PM -0500, Rik van Riel wrote:
> > On Fri, 2022-02-18 at 19:26 +0000, Al Viro wrote:
> > > On Fri, Feb 18, 2022 at 01:31:13PM -0500, Rik van Riel wrote:
> > > > After kern_unmount returns, callers can no longer access the
> > > > vfsmount structure. However, the vfsmount structure does need
> > > > to be kept around until the end of the RCU grace period, to
> > > > make sure other accesses have all gone away too.
> > > > 
> > > > This can be accomplished by either gating each kern_unmount
> > > > on synchronize_rcu (the comment in the code says it all), or
> > > > by deferring the freeing until the next grace period, where
> > > > it needs to be handled in a workqueue due to the locking in
> > > > mntput_no_expire().
> > > 
> > > NAK.  There's code that relies upon kern_unmount() being
> > > synchronous.  That's precisely the reason why MNT_INTERNAL
> > > is treated that way in mntput_no_expire().
> > 
> > Fair enough. Should I make a kern_unmount_rcu() version
> > that gets called just from mq_put_mnt()?
> 
> Umm...  I'm not sure you can afford having struct ipc_namespace
> freed and reused before the mqueue superblock gets at least to
> deactivate_locked_super().

BTW, that's a good demonstration of the problems with making those
beasts async.  struct mount is *not* accessed past kern_unmount(),
but the objects used by the superblock might very well be - in
this case they (struct ipc_namespace, pointed to by s->s_fs_info)
are freed by the caller after kern_unmount() returns.  And possibly
reused.  Now note that they are used as search keys by
mqueue_get_tree() and it becomes very fishy.
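
To make the hazard concrete, here is a minimal userspace sketch, not kernel
code.  Every name in it (toy_ns, toy_sb, toy_find_sb and so on) is
hypothetical, and the "free and reuse" of the namespace is modelled by
reusing one static slot rather than by calling a real allocator.  It only
assumes that the lookup matches superblocks by key pointer identity:

```c
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real definitions. */
struct toy_ns {
	int generation;			/* which "incarnation" lives at this address */
};

struct toy_sb {
	const void *s_fs_info;		/* search key: pointer to the owning namespace */
	int live;			/* still on the list of superblocks */
	int owner_generation;		/* incarnation that created this superblock */
};

#define TOY_MAX_SB 4
static struct toy_sb toy_sbs[TOY_MAX_SB];

/* Keyed lookup: match a live superblock purely by key pointer identity. */
static struct toy_sb *toy_find_sb(const void *key)
{
	for (size_t i = 0; i < TOY_MAX_SB; i++)
		if (toy_sbs[i].live && toy_sbs[i].s_fs_info == key)
			return &toy_sbs[i];
	return NULL;
}

/* Register a superblock keyed by the given namespace. */
static struct toy_sb *toy_add_sb(const struct toy_ns *ns)
{
	for (size_t i = 0; i < TOY_MAX_SB; i++) {
		if (!toy_sbs[i].live) {
			toy_sbs[i].s_fs_info = ns;
			toy_sbs[i].live = 1;
			toy_sbs[i].owner_generation = ns->generation;
			return &toy_sbs[i];
		}
	}
	return NULL;
}

/*
 * Returns 1 if a lookup done on behalf of a new namespace, which happens
 * to sit at the address of the old one, finds the old superblock.
 */
static int toy_stale_match(void)
{
	static struct toy_ns slot;	/* one slot, reused to model free + realloc */
	struct toy_sb *old_sb, *found;

	slot.generation = 1;
	old_sb = toy_add_sb(&slot);	/* superblock of the first namespace */

	/*
	 * Asynchronous teardown has not reached the superblock yet, but the
	 * caller has already "freed" the namespace and the same address has
	 * been handed out again.
	 */
	slot.generation = 2;

	found = toy_find_sb(&slot);
	return found == old_sb && found->owner_generation != slot.generation;
}
```

toy_stale_match() returns 1 here: the second namespace is handed the
superblock that belonged to the first, which is the kind of confusion a
synchronous kern_unmount() rules out.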

If you want to go that way, make it something like

void put_ipc_ns(struct ipc_namespace *ns)
{
	if (refcount_dec_and_lock(&ns->ns.count, &mq_lock)) {
		mq_clear_sbinfo(ns);
		spin_unlock(&mq_lock);
		kern_unmount_rcu(ns->mq_mnt);
	}
}

and give mqueue this for ->kill_sb():

static void mqueue_kill_super(struct super_block *sb)
{
	struct ipc_namespace *ns = sb->s_fs_info;

	kill_litter_super(sb);
	/* ... and do the rest of free_ipc_ns() here */
}
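
The ordering this buys can be sketched in userspace.  This is a toy model
under stated assumptions, not the proposed kernel code: all names
(toy_put_ns, toy_unmount_deferred, toy_kill_super, ...) are made up, the
deferred unmount is a plain array drained by hand instead of RCU plus a
workqueue, there is no locking, and the mq_clear_sbinfo() step is left out.
The only point shown is that the final put does not free the namespace; the
superblock teardown does.

```c
#include <stddef.h>
#include <stdlib.h>

struct toy_ns;

struct toy_sb {
	struct toy_ns *s_fs_info;	/* namespace this superblock belongs to */
};

struct toy_ns {
	int refcount;
	struct toy_sb *sb;
	int *freed_flag;		/* set to 1 when the namespace is freed */
};

#define TOY_MAX_PENDING 8
static struct toy_sb *toy_pending[TOY_MAX_PENDING];
static size_t toy_npending;

/* ->kill_sb() analogue: tearing down the superblock frees the namespace. */
static void toy_kill_super(struct toy_sb *sb)
{
	struct toy_ns *ns = sb->s_fs_info;

	*ns->freed_flag = 1;
	free(ns);
	free(sb);
}

/* Stand-in for the asynchronous unmount: just queue the superblock. */
static void toy_unmount_deferred(struct toy_sb *sb)
{
	if (toy_npending < TOY_MAX_PENDING)
		toy_pending[toy_npending++] = sb;
}

/* Stand-in for the deferred work running some time later. */
static void toy_run_deferred(void)
{
	for (size_t i = 0; i < toy_npending; i++)
		toy_kill_super(toy_pending[i]);
	toy_npending = 0;
}

/* Final put only schedules the unmount; it never frees ns itself. */
static void toy_put_ns(struct toy_ns *ns)
{
	if (--ns->refcount == 0)
		toy_unmount_deferred(ns->sb);
}

/* Allocate a namespace with one reference and its superblock. */
static struct toy_ns *toy_new_ns(int *freed_flag)
{
	struct toy_ns *ns = malloc(sizeof(*ns));
	struct toy_sb *sb = malloc(sizeof(*sb));

	if (!ns || !sb) {
		free(ns);
		free(sb);
		return NULL;
	}
	ns->refcount = 1;
	ns->sb = sb;
	ns->freed_flag = freed_flag;
	sb->s_fs_info = ns;
	return ns;
}
```

After toy_put_ns() drops the last reference the namespace is still
allocated; it goes away only when toy_run_deferred() gets around to the
superblock, so the key cannot be reused while the superblock is findable.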

One thing: kern_unmount_rcu() needs a very big warning about
the caution needed from its callers.  It's really not safe
for general use, and it will be a temptation for folks with
scalability problems like this one to just use it instead of
kern_unmount() and declare the problem solved.
