linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: David Howells <dhowells@redhat.com>
Cc: avagin@gmail.com, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 3/5] cgroup: Fix refcounting
Date: Sat, 5 Jan 2019 06:37:03 +0000	[thread overview]
Message-ID: <20190105063703.GP2217@ZenIV.linux.org.uk> (raw)
In-Reply-To: <154655905608.3032.5762393419161551596.stgit@warthog.procyon.org.uk>

On Thu, Jan 03, 2019 at 11:44:16PM +0000, David Howells wrote:
> Fix cgroup refcounting by the following means:
> 
>  (1) Don't use PERCPU_REF_INIT_DEAD and percpu_ref_reinit().  Using this
>      causes a problem should kernfs_get_tree() create a superblock and then
>      fail to create a root inode.  The superblock destructor will be invoked
>      before kernfs_get_tree() returns - and the refcount isn't reinit'd till
>      after that.
> 
>      To this end, cgroup_setup_root() no longer needs a ref_flags argument.
> 
>  (2) Provide a flag, CSS_CREATING, that is used to prevent concurrent access
>      to a cgroup root that is being set up and for which the superblock is
>      still being set up.  This appears to be necessary to hide the fact that
>      the cgroup is accessible before the superblock is fully initialised.
> 
>  (3) Set CSS_CREATING in cgroup1_get_tree() on a new cgroup_root.  This is
>      then cleared in cgroup_do_get_tree().
> 
>  (4) cgroup_get_tree() is made to call cgroup_get() on the root it sets for a
>      v2 cgroup.  Admittedly, this doesn't do anything because CSS_NO_REF is
>      set, but it future proofs it in case this is changed in future.
> 
>  (5) cgroup_fill_super() is created and passed to kernfs_get_tree() in the
>      kernfs_fs_context struct.  This takes an extra ref on the root for the
>      superblock in the event that a superblock is created.
> 
>      struct cgroup_fs_context::root then holds a single ref on the root right
>      through till it is freed.
> 
> Note that new_root is transferred into the cgroup_fs_context as
> is_new_root, though this is probably unnecessary as it's only used to clear
> CSS_CREATING - and no one else can gain access to the root until we've
> cleared the flag.

Umm...  It'll need reordering; could you do CSS_CREATING parts on top of
mainline, with the rest rediffed on top of that?  We'll need to backport
the fix, obviously...

Re (4) - that's over the top for backporting.  Sure, the analogue of the
problem does exist in mainline - 
        if (IS_ERR(dentry) || !new_sb)
                cgroup_put(&root->cgrp);
can't tell kernfs_mount_ns() failing very early from kernfs_fill_super()
failing and triggering ->kill_sb().  The latter case has a reference
dropped by cgroup_kill_sb(); in the former nothing of that sort
happens.  Said that, we have the following cases:
	1) early failure (anything up to sget_userns() in mainline).
ERR_PTR() is returned, reference not consumed.
	2) successful reuse of existing superblock.  Normal dentry
returned, new_sb set to false, reference not consumed.
	3) new superblock, with failing kernfs_fill_super().
ERR_PTR() returned, new_sb set to true.  Reference consumed.
	4) new superblock, successful setup.  Normal dentry returned,
new_sb set to true.  Reference consumed.

Cases (2) and (4) (!IS_ERR(dentry)) are fine - reference is consumed
iff new_sb is true.  Case (1) is also fine - reference is not consumed.
Case (3) is where we go wrong - the test comes up with "do cgroup_put()",
which we don't want.

Note that case (1) leaves new_sb uninitialized.  And *that* is the
root cause of this shit - if we initialized it (in cgroup_do_mount())
to false, the test would've been simply "if (!new_sb)".  In all
four cases.

That's precisely the same kind of bug as the one fixed in
7b745a4e4051 ("unfuck sysfs_mount()") last May...

  parent reply	other threads:[~2019-01-05  6:37 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-03 23:44 [PATCH 1/5] cgroup2: Always apply root flags on successful superblock obtainance David Howells
2019-01-03 23:44 ` [PATCH 2/5] kernfs: Provide a fill_super hook for the subclass filesystem David Howells
2019-01-03 23:44 ` [PATCH 3/5] cgroup: Fix refcounting David Howells
2019-01-04  8:16   ` Andrei Vagin
2019-01-05  6:37   ` Al Viro [this message]
2019-01-05 16:05     ` Al Viro
2019-01-05  7:21   ` David Howells
2019-01-03 23:44 ` [PATCH 4/5] cgroup: refcount fixes David Howells
2019-01-03 23:44 ` [PATCH 5/5] percpu: Kill off PERCPU_REF_INIT_DEAD and percpu_ref_reinit() David Howells
2019-01-03 23:56 ` [PATCH 4/5] cgroup: refcount fixes David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190105063703.GP2217@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=avagin@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).