* [PATCH] cgroup: Increment dying descendants from rmdir context
@ 2026-04-22 13:57 Petr Malat
2026-04-22 17:14 ` Tejun Heo
0 siblings, 1 reply; 3+ messages in thread
From: Petr Malat @ 2026-04-22 13:57 UTC (permalink / raw)
To: cgroups; +Cc: Johannes Weiner, Tejun Heo, Petr Malat
Incrementing dying descendants in offline_css(), which is executed by
the cgroup_offline_wq worker, leads to a race: a user who reads
cgroup.stat after calling rmdir but before the worker runs can see
dying descendants reported as 0. The user then wrongly assumes that the
resources held by the removed cgroup are already available for
reassignment.
Increment dying descendants from kill_css() instead, which is called
from the cgroup_rmdir() context.
Signed-off-by: Petr Malat <oss@malat.biz>
---
kernel/cgroup/cgroup.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 3243c2087ee3..c928dea9dea6 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5724,16 +5724,6 @@ static void offline_css(struct cgroup_subsys_state *css)
 	RCU_INIT_POINTER(css->cgroup->subsys[ss->id], NULL);
 	wake_up_all(&css->cgroup->offline_waitq);
-
-	css->cgroup->nr_dying_subsys[ss->id]++;
-	/*
-	 * Parent css and cgroup cannot be freed until after the freeing
-	 * of child css, see css_free_rwork_fn().
-	 */
-	while ((css = css->parent)) {
-		css->nr_descendants--;
-		css->cgroup->nr_dying_subsys[ss->id]++;
-	}
 }
 /**
@@ -6045,6 +6035,8 @@ static void css_killed_ref_fn(struct percpu_ref *ref)
  */
 static void kill_css(struct cgroup_subsys_state *css)
 {
+	struct cgroup_subsys *ss = css->ss;
+
 	lockdep_assert_held(&cgroup_mutex);
 	if (css->flags & CSS_DYING)
@@ -6081,6 +6073,16 @@ static void kill_css(struct cgroup_subsys_state *css)
 	 * css is confirmed to be seen as killed on all CPUs.
 	 */
 	percpu_ref_kill_and_confirm(&css->refcnt, css_killed_ref_fn);
+
+	css->cgroup->nr_dying_subsys[ss->id]++;
+	/*
+	 * Parent css and cgroup cannot be freed until after the freeing
+	 * of child css, see css_free_rwork_fn().
+	 */
+	while ((css = css->parent)) {
+		css->nr_descendants--;
+		css->cgroup->nr_dying_subsys[ss->id]++;
+	}
 }
 /**
--
2.47.3
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] cgroup: Increment dying descendants from rmdir context
2026-04-22 13:57 [PATCH] cgroup: Increment dying descendants from rmdir context Petr Malat
@ 2026-04-22 17:14 ` Tejun Heo
2026-04-23 9:48 ` [PATCH v2] cgroup: Increment nr_dying_subsys_* " Petr Malat
0 siblings, 1 reply; 3+ messages in thread
From: Tejun Heo @ 2026-04-22 17:14 UTC (permalink / raw)
To: Petr Malat; +Cc: cgroups, Johannes Weiner
Hello Petr,
Thanks for the patch - the fix itself looks good to me. One thing
worth clarifying in the subject and changelog: the counters this
patch actually moves are cgroup->nr_dying_subsys[] and the per-css
nr_descendants walk, which surface as nr_dying_subsys_<name> and
nr_subsys_<name> in cgroup.stat. The top-level nr_dying_descendants
is already incremented synchronously in cgroup_destroy_locked() under
css_set_lock, so it's not the one that was racy.
Could you respin with the subject and description updated to name
the actual counters? That'd make the intent clearer for future
readers.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 3+ messages in thread
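Editor's sketch (not part of the thread): the reply above distinguishes
the per-subsystem keys nr_subsys_<name> and nr_dying_subsys_<name> from
the top-level nr_dying_descendants in cgroup.stat. A minimal user-space
helper for reading one such key could look like the code below. The
function names stat_lookup() and read_stat_file() are hypothetical, and
the path passed to read_stat_file() is whatever cgroup directory the
caller chooses; nothing here comes from the patch itself.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up "key value" in cgroup.stat-style text (one pair per line).
 * Returns 0 and stores the value in *out on success, -1 if the key is
 * absent or its value cannot be parsed. Hypothetical helper.
 */
int stat_lookup(const char *text, const char *key, long *out)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* Require the full key followed by the separating space. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ' ')
			return sscanf(line + klen, "%ld", out) == 1 ? 0 : -1;
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Read a cgroup.stat file into buf (NUL-terminated, truncated to len - 1
 * bytes). The path is supplied by the caller. Returns 0 on success.
 */
int read_stat_file(const char *path, char *buf, size_t len)
{
	FILE *f;
	size_t n;

	if (len == 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, len - 1, f);
	buf[n] = '\0';
	fclose(f);
	return 0;
}
```

Matching on the full key plus the following space keeps a lookup for
nr_dying_subsys_memory from being confused with nr_dying_descendants or
with another subsystem whose name shares a prefix.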
* [PATCH v2] cgroup: Increment nr_dying_subsys_* from rmdir context
2026-04-22 17:14 ` Tejun Heo
@ 2026-04-23 9:48 ` Petr Malat
0 siblings, 0 replies; 3+ messages in thread
From: Petr Malat @ 2026-04-23 9:48 UTC (permalink / raw)
To: cgroups; +Cc: Johannes Weiner, Tejun Heo, Petr Malat
Incrementing nr_dying_subsys_* in offline_css(), which is executed by
the cgroup_offline_wq worker, leads to a race: a user who reads
cgroup.stat after calling rmdir but before the worker runs can see the
value reported as 0. The user then wrongly assumes that the resources
held by the removed cgroup are already available for reassignment.
Increment nr_dying_subsys_* from kill_css() instead, which is called
from the cgroup_rmdir() context.
Signed-off-by: Petr Malat <oss@malat.biz>
---
kernel/cgroup/cgroup.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 3243c2087ee3..c928dea9dea6 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5724,16 +5724,6 @@ static void offline_css(struct cgroup_subsys_state *css)
 	RCU_INIT_POINTER(css->cgroup->subsys[ss->id], NULL);
 	wake_up_all(&css->cgroup->offline_waitq);
-
-	css->cgroup->nr_dying_subsys[ss->id]++;
-	/*
-	 * Parent css and cgroup cannot be freed until after the freeing
-	 * of child css, see css_free_rwork_fn().
-	 */
-	while ((css = css->parent)) {
-		css->nr_descendants--;
-		css->cgroup->nr_dying_subsys[ss->id]++;
-	}
 }
 /**
@@ -6045,6 +6035,8 @@ static void css_killed_ref_fn(struct percpu_ref *ref)
  */
 static void kill_css(struct cgroup_subsys_state *css)
 {
+	struct cgroup_subsys *ss = css->ss;
+
 	lockdep_assert_held(&cgroup_mutex);
 	if (css->flags & CSS_DYING)
@@ -6081,6 +6073,16 @@ static void kill_css(struct cgroup_subsys_state *css)
 	 * css is confirmed to be seen as killed on all CPUs.
 	 */
 	percpu_ref_kill_and_confirm(&css->refcnt, css_killed_ref_fn);
+
+	css->cgroup->nr_dying_subsys[ss->id]++;
+	/*
+	 * Parent css and cgroup cannot be freed until after the freeing
+	 * of child css, see css_free_rwork_fn().
+	 */
+	while ((css = css->parent)) {
+		css->nr_descendants--;
+		css->cgroup->nr_dying_subsys[ss->id]++;
+	}
 }
 /**
--
2.47.3
^ permalink raw reply related [flat|nested] 3+ messages in thread
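Editor's sketch (not part of the thread): the accounting walk that the
patch moves into kill_css() can be modelled in user space to see what
a reader of the counters would observe once it has run. This is a toy
model under stated assumptions: struct toy_css and toy_account_dying()
are hypothetical names, a single subsystem is assumed so that the
per-cgroup nr_dying_subsys[ss->id] array collapses into one nr_dying
field, and no locking or reference counting is modelled.

```c
#include <stddef.h>

/* Toy stand-in for a css; one subsystem only, so no ss->id index. */
struct toy_css {
	struct toy_css *parent;
	int nr_descendants;	/* live descendants below this node */
	int nr_dying;		/* dying nodes at or below this node */
};

/*
 * Mirror of the walk in the patch: count the node itself as dying,
 * then move it from "live descendant" to "dying" in every ancestor.
 */
void toy_account_dying(struct toy_css *css)
{
	css->nr_dying++;
	while ((css = css->parent)) {
		css->nr_descendants--;
		css->nr_dying++;
	}
}
```

In this model, calling toy_account_dying() directly from the removal
path means the counters are already updated when the call returns, which
is the property the changelog asks for. Calling it from a deferred
worker would leave a window in which nr_dying still reads 0.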