From: Tejun Heo <tj@kernel.org>
To: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: "Mark Brown" <broonie@kernel.org>,
"Bert Karwatzki" <spasswolf@web.de>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
"Petr Malat" <oss@malat.biz>,
"kernel test robot" <oliver.sang@intel.com>,
"Martin Pitt" <martin@piware.de>,
Aishwarya.TCV@arm.com, "Tejun Heo" <tj@kernel.org>
Subject: [PATCH] cgroup: Migrate tasks to the root css when a controller is rebound
Date: Mon, 1 Jun 2026 09:02:56 -1000
Message-ID: <20260601190256.1815778-1-tj@kernel.org>
In-Reply-To: <a9f6c0bcd262e764453b95eb7397871825e11559.camel@web.de>

cgroup_apply_control_disable() defers kill_css_finish() while a css is
still populated, relying on css_update_populated() to fire the deferred
kill once the populated count reaches zero.

This deadlocks when a controller is rebound out of a hierarchy. Mounting
an implicit_on_dfl controller such as perf_event as a v1 hierarchy steals
it off the default hierarchy, and rebind_subsystems() kills its
per-cgroup csses while they are still populated. The migration run in the
same step keeps the old css for a controller no longer in the hierarchy's
mask, so no task is migrated off the dying csses. Their populated count
never reaches zero, the deferred kill_css_finish() never fires, and the
next cgroup_lock_and_drain_offline() hangs forever under cgroup_mutex.

That migration is already a no-op pass over the rebound subtree. Add
cgroup_rebind_ss_mask so find_existing_css_set() resolves the leaving
controllers to the root css. Their tasks are migrated there, the
per-cgroup csses depopulate, and cgroup_apply_control_disable() kills
them synchronously. The deferral stays correct for the rmdir and
controller-disable paths it was meant for.

Fixes: 1dffd95575eb ("cgroup: Defer kill_css_finish() in cgroup_apply_control_disable()")
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/all/41cd159c-54e5-45e0-81df-eaf36a6c028e@sirena.org.uk/
Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/all/4e986b4ed7e16547805d54b6e67d09120bc4d2f2.camel@web.de/
Signed-off-by: Tejun Heo <tj@kernel.org>
---
Hello, and thanks a lot for all the reproduction information. It made this
much easier to track down.

Bert, Mark, would you mind giving this a try on your setups?
 kernel/cgroup/cgroup.c | 35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index bdc8deedb4f7..7f4861109e48 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -197,6 +197,14 @@ static u32 cgrp_dfl_implicit_ss_mask;
 /* some controllers can be threaded on the default hierarchy */
 static u32 cgrp_dfl_threaded_ss_mask;

+/*
+ * Set across rebind_subsystems() to the controllers leaving a hierarchy.
+ * Guarded by cgroup_mutex. Makes find_existing_css_set() resolve them to the
+ * root css so the affected tasks are migrated there before
+ * cgroup_apply_control_disable() kills the per-cgroup csses.
+ */
+static u32 cgroup_rebind_ss_mask;
+
 /* The list of hierarchy roots */
 LIST_HEAD(cgroup_roots);
 static int cgroup_root_count;
@@ -1083,7 +1091,15 @@ static struct css_set *find_existing_css_set(struct css_set *old_cset,
 	 * won't change, so no need for locking.
 	 */
 	for_each_subsys(ss, i) {
-		if (root->subsys_mask & (1UL << i)) {
+		if (unlikely(cgroup_rebind_ss_mask & (1UL << i))) {
+			/*
+			 * @ss is leaving this hierarchy and its per-cgroup
+			 * csses are about to be killed. Resolve to the
+			 * surviving root css so the tasks are migrated there.
+			 */
+			template[i] = cgroup_css(&root->cgrp, ss);
+			WARN_ON_ONCE(!template[i]);
+		} else if (root->subsys_mask & (1UL << i)) {
 			/*
 			 * @ss is in this hierarchy, so we want the
 			 * effective css from @cgrp.
@@ -1853,11 +1869,17 @@ int rebind_subsystems(struct cgroup_root *dst_root, u32 ss_mask)
 		struct cgroup *scgrp = &cgrp_dfl_root.cgrp;

 		/*
-		 * Controllers from default hierarchy that need to be rebound
-		 * are all disabled together in one go.
+		 * Controllers leaving the default hierarchy are disabled
+		 * together. cgroup_rebind_ss_mask makes cgroup_apply_control()
+		 * migrate their tasks to the root css, so the per-cgroup csses
+		 * are unpopulated when cgroup_finalize_control() kills them.
+		 * Clear it before cgroup_finalize_control(), which does no
+		 * css_set lookup.
 		 */
 		cgrp_dfl_root.subsys_mask &= ~dfl_disable_ss_mask;
+		cgroup_rebind_ss_mask = dfl_disable_ss_mask;
 		WARN_ON(cgroup_apply_control(scgrp));
+		cgroup_rebind_ss_mask = 0;
 		cgroup_finalize_control(scgrp, 0);
 	}

@@ -1871,9 +1893,14 @@ int rebind_subsystems(struct cgroup_root *dst_root, u32 ss_mask)
 		WARN_ON(!css || cgroup_css(dcgrp, ss));

 		if (src_root != &cgrp_dfl_root) {
-			/* disable from the source */
+			/*
+			 * Disable from the source, migrating its tasks to the
+			 * root css first (see cgroup_rebind_ss_mask).
+			 */
 			src_root->subsys_mask &= ~(1 << ssid);
+			cgroup_rebind_ss_mask = 1 << ssid;
 			WARN_ON(cgroup_apply_control(scgrp));
+			cgroup_rebind_ss_mask = 0;
 			cgroup_finalize_control(scgrp, 0);
 		}

--
2.54.0