[heads-up] buggered refcounting logics in cgroup1_mount()

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Al Viro <viro@zeniv.linux.org.uk>
To: Zefan Li <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org
Subject: [heads-up] buggered refcounting logics in cgroup1_mount()
Date: Fri, 11 Jan 2019 07:23:09 +0000	[thread overview]
Message-ID: <20190111072309.GT2217@ZenIV.linux.org.uk> (raw)

	There'd been a long string of kludges in that area, and unfortunately
even the latest variant of solution is broken.  You rely upon the trick with
percpu_ref_reinit() being done only after cgroup_do_mount().  That prevents
parallel mount picking cgroup_root of superblock in the middle of setting
up, then losing CPU, original mount succeeding and umount marking our
cgroup_root killed; then the second mount regains CPU and proceeds to set
a new superblock.  Umount of _that_ would try to kill cgroup_root one more
time, with obvious nastiness.

	Delaying the moment when cgroup_root can be grabbed solves that,
but it does nothing for another similar case:
	* cgroup1 is mounted and populated
	* umount doesn't kill cgroup_root since it's non-empty
	* mount attempt picks cgroup_root, tries to do kernfs_pin_sb()
(which finds no superblocks) and successfully bumps the refcount.  Then
it drops cgroup_mutex and proceeds to lose CPU (preemption, blocking
allocation in kernfs, whatever).
	* another mount attempt picks the same cgroup_root, again
finds no superblock, bumps the refcount and proceeds to mount the
sucker.  Then, before the first one regains CPU, it manages to
empty the tree and umount it, marking cgroup_root killed.
	* now the first attempt regains CPU and we are in the same
situation as with the original bug.

	Another problem with that is that (without any races) failing
allocation in d_make_root() within kernfs_fill_super() will end up
in cgroup_kill_sb(), with cgroup_root _still_ marked killed (your
kludge creates it that way and we hadn't reached the reinit yet).

	AFAICS, a cleaner solution would be this:
* to hell with kernfs_pin_sb(); just try to grab a reference to
cgroup_root on reuse.
* have cgroup_kill_sb() treat "it's already marked killed" as
"just drop the reference, then".
* after cgroup_do_mount() check if cgroup_root got marked killed and
do deactivate_locked_super() in such case (with the same
restart_syscall() failure exit).

	Objections?  I would love to kill kernfs_pin_sb() as
a followup (it's a fundamentally broken API), but that's not
a stable fodder; some fix of refcounting bugs, OTOH, should be.

next             reply	other threads:[~2019-01-11  7:23 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-11  7:23 Al Viro [this message]
2019-01-11 20:54 ` [heads-up] buggered refcounting logics in cgroup1_mount() Tejun Heo
2019-01-12  5:38   ` Al Viro
2019-01-12 21:52     ` Al Viro
2019-01-15 15:26       ` Tejun Heo
2019-01-14  1:31   ` Zefan Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190111072309.GT2217@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).