* [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init()
@ 2025-07-22 1:40 JP Kobryn
2025-07-22 1:40 ` [PATCH 1/5 cgroup/for-6.16-fixes] cgroup: add exclusive css rstat init/exit api for base stats JP Kobryn
` (5 more replies)
0 siblings, 6 replies; 10+ messages in thread
From: JP Kobryn @ 2025-07-22 1:40 UTC (permalink / raw)
To: tj, shakeel.butt, mkoutny, yosryahmed, hannes, akpm
Cc: linux-kernel, cgroups, kernel-team
Within css_create() there are several error paths that lead to async
cleanup of a given css. An warning was found [0] which appears to be the
result of calling css_rstat_exit() on this cleanup path with a css having
uninitialized rstat objects. This series makes adjustments to provide up a
place in css_create() where css_rstat_init() can both 1) fail gracefully
and 2) guarantee completion before async cleanup leading to
css_rstat_exit() occurs.
The core change is splitting init_and_link_css() into two separate
functions and calling css_rstat_init() between them. The reason for this is
that because of the refcount incrementing that occurs within, this function
puts a constraint on the error handling that follows. See excerpt below:
css_create(...)
{
...
init_and_link_css(css, ...);
/* any subsequent error needs async css cleanup */
err = percpu_ref_init(...);
if (err)
goto err_free_css;
err = cgroup_idr_alloc(...);
if (err)
goto err_free_css;
err = css_rstat_init(css, ...);
if (err)
goto err_free_css;
...
err_free_css:
INIT_RCU_WORK(&css->destroy_rwork, css_free_rwork_fn);
queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
return ERR_PTR(err);
}
If any of the three goto jumps are taken, async cleanup will begin and
css_rstat_exit() will be invoked. But since css_rstat_init() would not have
succeeded, the warning will eventually be reached.
In this series, the code above changes to resemble something like this:
css_create(...)
{
...
init_css(css);
if (css->css_rstat_flush) {
err = css_rstat_init(css)
if (err) {
ss->css_free(css);
return ERR_PTR(err);
}
}
link_css(css, ...);
...
}
There was some refactoring done to eliminate self-imposed constraints on
where css_rstat_init() may be called. For example, one previous constraint
was that css_rstat_init() could only be called after init_and_link_css()
since that is where it gets its subsystem and cgroup fields set, and
css->ss was used to determine how to allocate rstat (base or subsystem).
The changes in this series remove this constraint by eliminating the
dependency of subsystem/cgroup information in rstat init/exit. The
resulting init/exit routines can work exclusively on rstat entities and not
be concerned with other types of entities.
The refactoring may also improve maintainability. Existing code like this:
css_rstat_init(&cgrp->self); /* for base state */
css_rstat_init(css); /* for subsystems */
... will now look like this:
cgroup_rstat_base_init(cgrp); /* for base stats */
css_rstat_init(css); /* for subsystems */
Finally, the number of nested conditionals in two functions above has been
reduced compared to the amount previously in css_rstat_init/exit(). This
could help improve code readability.
[0] https://lore.kernel.org/all/684c5802.a00a0220.279073.0016.GAE@google.com/
JP Kobryn (5):
cgroup: add exclusive css rstat init/exit api for base stats
cgroup: check for rstat flush callback at css rstat init/exit call
sites
cgroup: split init_and_link_css()
cgroup: initialize css rstat before linking to cgroups in css_create()
cgroup: break up the internal rstat init/exit logic by subsys and base
kernel/cgroup/cgroup-internal.h | 2 +
kernel/cgroup/cgroup.c | 57 ++++++++++++++++++----------
kernel/cgroup/rstat.c | 67 +++++++++++++++++----------------
3 files changed, 74 insertions(+), 52 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/5 cgroup/for-6.16-fixes] cgroup: add exclusive css rstat init/exit api for base stats
2025-07-22 1:40 [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() JP Kobryn
@ 2025-07-22 1:40 ` JP Kobryn
2025-07-22 1:40 ` [PATCH 2/5 cgroup/for-6.16-fixes] cgroup: check for rstat flush callback at css rstat init/exit call sites JP Kobryn
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: JP Kobryn @ 2025-07-22 1:40 UTC (permalink / raw)
To: tj, shakeel.butt, mkoutny, yosryahmed, hannes, akpm
Cc: linux-kernel, cgroups, kernel-team
It is known at call sites of css_rstat_init/exit() whether the given css is
associated with the base stats or a formal subsystem. Instead of passing
&cgrp->self or css to a shared API, offer an additional init/exit API pair
for exclusive use with the base stats.
Make use of this new API to make existing code more readable and also to
prepare for refactoring the complex css rstat init/exit logic.
Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
---
kernel/cgroup/cgroup-internal.h | 2 ++
kernel/cgroup/cgroup.c | 11 ++++++-----
kernel/cgroup/rstat.c | 27 +++++++++++++++++++++++----
3 files changed, 31 insertions(+), 9 deletions(-)
diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index b14e61c64a34..24fca840bf1c 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -270,6 +270,8 @@ int cgroup_task_count(const struct cgroup *cgrp);
/*
* rstat.c
*/
+int cgroup_rstat_base_init(struct cgroup *cgrp);
+void cgroup_rstat_base_exit(struct cgroup *cgrp);
int css_rstat_init(struct cgroup_subsys_state *css);
void css_rstat_exit(struct cgroup_subsys_state *css);
int ss_rstat_init(struct cgroup_subsys *ss);
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index a723b7dc6e4e..d684d33236b8 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2149,7 +2149,7 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
if (ret)
goto destroy_root;
- ret = css_rstat_init(&root_cgrp->self);
+ ret = cgroup_rstat_base_init(root_cgrp);
if (ret)
goto destroy_root;
@@ -2190,7 +2190,7 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
goto out;
exit_stats:
- css_rstat_exit(&root_cgrp->self);
+ cgroup_rstat_base_exit(root_cgrp);
destroy_root:
kernfs_destroy_root(root->kf_root);
root->kf_root = NULL;
@@ -5446,13 +5446,13 @@ static void css_free_rwork_fn(struct work_struct *work)
struct cgroup *cgrp = css->cgroup;
percpu_ref_exit(&css->refcnt);
- css_rstat_exit(css);
if (!css_is_self(css)) {
/* css free path */
struct cgroup_subsys_state *parent = css->parent;
int id = css->id;
+ css_rstat_exit(css);
ss->css_free(css);
cgroup_idr_remove(&ss->css_idr, id);
cgroup_put(cgrp);
@@ -5477,6 +5477,7 @@ static void css_free_rwork_fn(struct work_struct *work)
cgroup_put(cgroup_parent(cgrp));
kernfs_put(cgrp->kn);
psi_cgroup_free(cgrp);
+ cgroup_rstat_base_exit(cgrp);
kfree(cgrp);
} else {
/*
@@ -5742,7 +5743,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
* Now that init_cgroup_housekeeping() has been called and cgrp->self
* is setup, it is safe to perform rstat initialization on it.
*/
- ret = css_rstat_init(&cgrp->self);
+ ret = cgroup_rstat_base_init(cgrp);
if (ret)
goto out_kernfs_remove;
@@ -5818,7 +5819,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
out_psi_free:
psi_cgroup_free(cgrp);
out_stat_exit:
- css_rstat_exit(&cgrp->self);
+ cgroup_rstat_base_exit(cgrp);
out_kernfs_remove:
kernfs_remove(cgrp->kn);
out_cancel_ref:
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index cbeaa499a96a..8c3cb4a989ad 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -437,11 +437,10 @@ __bpf_kfunc void css_rstat_flush(struct cgroup_subsys_state *css)
}
}
-int css_rstat_init(struct cgroup_subsys_state *css)
+static int __css_rstat_init(struct cgroup_subsys_state *css, bool is_self)
{
struct cgroup *cgrp = css->cgroup;
int cpu;
- bool is_self = css_is_self(css);
if (is_self) {
/* the root cgrp has rstat_base_cpu preallocated */
@@ -481,7 +480,7 @@ int css_rstat_init(struct cgroup_subsys_state *css)
return 0;
}
-void css_rstat_exit(struct cgroup_subsys_state *css)
+static void __css_rstat_exit(struct cgroup_subsys_state *css, bool is_self)
{
int cpu;
@@ -499,7 +498,7 @@ void css_rstat_exit(struct cgroup_subsys_state *css)
return;
}
- if (css_is_self(css)) {
+ if (is_self) {
struct cgroup *cgrp = css->cgroup;
free_percpu(cgrp->rstat_base_cpu);
@@ -510,6 +509,26 @@ void css_rstat_exit(struct cgroup_subsys_state *css)
css->rstat_cpu = NULL;
}
+int css_rstat_init(struct cgroup_subsys_state *css)
+{
+ return __css_rstat_init(css, false);
+}
+
+void css_rstat_exit(struct cgroup_subsys_state *css)
+{
+ return __css_rstat_exit(css, false);
+}
+
+int cgroup_rstat_base_init(struct cgroup *cgrp)
+{
+ return __css_rstat_init(&cgrp->self, true);
+}
+
+void cgroup_rstat_base_exit(struct cgroup *cgrp)
+{
+ __css_rstat_exit(&cgrp->self, true);
+}
+
/**
* ss_rstat_init - subsystem-specific rstat initialization
* @ss: target subsystem
--
2.47.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/5 cgroup/for-6.16-fixes] cgroup: check for rstat flush callback at css rstat init/exit call sites
2025-07-22 1:40 [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() JP Kobryn
2025-07-22 1:40 ` [PATCH 1/5 cgroup/for-6.16-fixes] cgroup: add exclusive css rstat init/exit api for base stats JP Kobryn
@ 2025-07-22 1:40 ` JP Kobryn
2025-07-22 1:40 ` [PATCH 3/5 cgroup/for-6.16-fixes] cgroup: split init_and_link_css() JP Kobryn
` (3 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: JP Kobryn @ 2025-07-22 1:40 UTC (permalink / raw)
To: tj, shakeel.butt, mkoutny, yosryahmed, hannes, akpm
Cc: linux-kernel, cgroups, kernel-team
The css rstat init/exit functions have a dependency on the associated
subsystem which is inspected to see if the flush callback exists. Move this
logic out of these functions and instead let callers perform this check in
advance. This decoupling allows greater flexibility in where
css_rstat_init() may be called.
Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
---
kernel/cgroup/cgroup.c | 18 ++++++++++++------
kernel/cgroup/rstat.c | 6 +-----
2 files changed, 13 insertions(+), 11 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index d684d33236b8..b034f47580f6 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5452,7 +5452,9 @@ static void css_free_rwork_fn(struct work_struct *work)
struct cgroup_subsys_state *parent = css->parent;
int id = css->id;
- css_rstat_exit(css);
+ if (ss->css_rstat_flush)
+ css_rstat_exit(css);
+
ss->css_free(css);
cgroup_idr_remove(&ss->css_idr, id);
cgroup_put(cgrp);
@@ -5679,9 +5681,11 @@ static struct cgroup_subsys_state *css_create(struct cgroup *cgrp,
goto err_free_css;
css->id = err;
- err = css_rstat_init(css);
- if (err)
- goto err_free_css;
+ if (ss->css_rstat_flush) {
+ err = css_rstat_init(css);
+ if (err)
+ goto err_free_css;
+ }
/* @css is ready to be brought online now, make it visible */
list_add_tail_rcu(&css->sibling, &parent_css->children);
@@ -6141,8 +6145,10 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
css->id = cgroup_idr_alloc(&ss->css_idr, css, 1, 2, GFP_KERNEL);
BUG_ON(css->id < 0);
- BUG_ON(ss_rstat_init(ss));
- BUG_ON(css_rstat_init(css));
+ if (ss->css_rstat_flush) {
+ BUG_ON(ss_rstat_init(ss));
+ BUG_ON(css_rstat_init(css));
+ }
}
/* Update the init_css_set to contain a subsys
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index 8c3cb4a989ad..ba656a53136a 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -449,8 +449,7 @@ static int __css_rstat_init(struct cgroup_subsys_state *css, bool is_self)
if (!cgrp->rstat_base_cpu)
return -ENOMEM;
}
- } else if (css->ss->css_rstat_flush == NULL)
- return 0;
+ }
/* the root cgrp's self css has rstat_cpu preallocated */
if (!css->rstat_cpu) {
@@ -484,9 +483,6 @@ static void __css_rstat_exit(struct cgroup_subsys_state *css, bool is_self)
{
int cpu;
- if (!css_uses_rstat(css))
- return;
-
css_rstat_flush(css);
/* sanity check */
--
2.47.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/5 cgroup/for-6.16-fixes] cgroup: split init_and_link_css()
2025-07-22 1:40 [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() JP Kobryn
2025-07-22 1:40 ` [PATCH 1/5 cgroup/for-6.16-fixes] cgroup: add exclusive css rstat init/exit api for base stats JP Kobryn
2025-07-22 1:40 ` [PATCH 2/5 cgroup/for-6.16-fixes] cgroup: check for rstat flush callback at css rstat init/exit call sites JP Kobryn
@ 2025-07-22 1:40 ` JP Kobryn
2025-07-22 1:40 ` [PATCH 4/5 cgroup/for-6.16-fixes] cgroup: initialize css rstat before linking to cgroups in css_create() JP Kobryn
` (2 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: JP Kobryn @ 2025-07-22 1:40 UTC (permalink / raw)
To: tj, shakeel.butt, mkoutny, yosryahmed, hannes, akpm
Cc: linux-kernel, cgroups, kernel-team
Just as its name implies, init_and_link_css() has two responsibilities.
One is defining default values for some of the given css's fields. The
other is defining the cgroup, parent, and subsystem relationships (linking)
while incrementing the newly associated cgroup refcounts (including
parent). Once the refcounts are changed, cleanup of the css has to be
performed asynchronously in a series of workqueue functions.
The cleanup constraint impacts the error handling of the the css_create()
function. Code that follows init_and_link_css() must jump to the async
cleanup path in the case of an error. This leaves the call to
css_rstat_init() in a bad position. If it fails or if any other function
between it and init_and_link_css() fails, the async cleanup sequence will
ultimately reach a call to css_rstat_exit() on an uninitialized css.
Split init_and_link_css() into separate functions for each of its two
responsibilies. This allows for handling errors without having to resort to
the async cleanup sequence. More specifically, css_rstat_init() could be
called before async cleanup becomes necessary within css_create(). This
patch serves as preparation for the change in where css_rstat_init() will
be called.
Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
---
kernel/cgroup/cgroup.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index b034f47580f6..1990c6113c7f 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5568,21 +5568,25 @@ static void css_release(struct percpu_ref *ref)
queue_work(cgroup_destroy_wq, &css->destroy_work);
}
-static void init_and_link_css(struct cgroup_subsys_state *css,
+static void init_css(struct cgroup_subsys_state *css)
+{
+ memset(css, 0, sizeof(*css));
+ css->id = -1;
+ INIT_LIST_HEAD(&css->sibling);
+ INIT_LIST_HEAD(&css->children);
+ css->serial_nr = css_serial_nr_next++;
+ atomic_set(&css->online_cnt, 0);
+}
+
+static void link_css(struct cgroup_subsys_state *css,
struct cgroup_subsys *ss, struct cgroup *cgrp)
{
lockdep_assert_held(&cgroup_mutex);
cgroup_get_live(cgrp);
- memset(css, 0, sizeof(*css));
css->cgroup = cgrp;
css->ss = ss;
- css->id = -1;
- INIT_LIST_HEAD(&css->sibling);
- INIT_LIST_HEAD(&css->children);
- css->serial_nr = css_serial_nr_next++;
- atomic_set(&css->online_cnt, 0);
if (cgroup_parent(cgrp)) {
css->parent = cgroup_css(cgroup_parent(cgrp), ss);
@@ -5670,7 +5674,8 @@ static struct cgroup_subsys_state *css_create(struct cgroup *cgrp,
if (IS_ERR(css))
return css;
- init_and_link_css(css, ss, cgrp);
+ init_css(css);
+ link_css(css, ss, cgrp);
err = percpu_ref_init(&css->refcnt, css_release, 0, GFP_KERNEL);
if (err)
@@ -6130,7 +6135,8 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
css = ss->css_alloc(NULL);
/* We don't handle early failures gracefully */
BUG_ON(IS_ERR(css));
- init_and_link_css(css, ss, &cgrp_dfl_root.cgrp);
+ init_css(css);
+ link_css(css, ss, &cgrp_dfl_root.cgrp);
/*
* Root csses are never destroyed and we can't initialize
--
2.47.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 4/5 cgroup/for-6.16-fixes] cgroup: initialize css rstat before linking to cgroups in css_create()
2025-07-22 1:40 [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() JP Kobryn
` (2 preceding siblings ...)
2025-07-22 1:40 ` [PATCH 3/5 cgroup/for-6.16-fixes] cgroup: split init_and_link_css() JP Kobryn
@ 2025-07-22 1:40 ` JP Kobryn
2025-07-22 1:40 ` [PATCH 5/5 cgroup/for-6.16-fixes] cgroup: break up the internal rstat init/exit logic by subsys and base JP Kobryn
2025-07-25 17:23 ` [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() Michal Koutný
5 siblings, 0 replies; 10+ messages in thread
From: JP Kobryn @ 2025-07-22 1:40 UTC (permalink / raw)
To: tj, shakeel.butt, mkoutny, yosryahmed, hannes, akpm
Cc: linux-kernel, cgroups, kernel-team
Calling css_rstat_init() after linking the css with its cgroup (and parent
cgroup) leaves the code vulnerable to async cleanup where css_rstat_exit()
is invoked on a fully initialized css. Avoid this by calling
css_rstat_init() before linking. The error can then be handled inline with
no async cleanup requirement.
Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
---
kernel/cgroup/cgroup.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 1990c6113c7f..0f8c6ce21634 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5675,6 +5675,15 @@ static struct cgroup_subsys_state *css_create(struct cgroup *cgrp,
return css;
init_css(css);
+
+ if (ss->css_rstat_flush) {
+ err = css_rstat_init(css);
+ if (err) {
+ ss->css_free(css);
+ goto err_out;
+ }
+ }
+
link_css(css, ss, cgrp);
err = percpu_ref_init(&css->refcnt, css_release, 0, GFP_KERNEL);
@@ -5686,12 +5695,6 @@ static struct cgroup_subsys_state *css_create(struct cgroup *cgrp,
goto err_free_css;
css->id = err;
- if (ss->css_rstat_flush) {
- err = css_rstat_init(css);
- if (err)
- goto err_free_css;
- }
-
/* @css is ready to be brought online now, make it visible */
list_add_tail_rcu(&css->sibling, &parent_css->children);
cgroup_idr_replace(&ss->css_idr, css, css->id);
@@ -5707,6 +5710,7 @@ static struct cgroup_subsys_state *css_create(struct cgroup *cgrp,
err_free_css:
INIT_RCU_WORK(&css->destroy_rwork, css_free_rwork_fn);
queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
+err_out:
return ERR_PTR(err);
}
--
2.47.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 5/5 cgroup/for-6.16-fixes] cgroup: break up the internal rstat init/exit logic by subsys and base
2025-07-22 1:40 [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() JP Kobryn
` (3 preceding siblings ...)
2025-07-22 1:40 ` [PATCH 4/5 cgroup/for-6.16-fixes] cgroup: initialize css rstat before linking to cgroups in css_create() JP Kobryn
@ 2025-07-22 1:40 ` JP Kobryn
2025-07-25 17:23 ` [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() Michal Koutný
5 siblings, 0 replies; 10+ messages in thread
From: JP Kobryn @ 2025-07-22 1:40 UTC (permalink / raw)
To: tj, shakeel.butt, mkoutny, yosryahmed, hannes, akpm
Cc: linux-kernel, cgroups, kernel-team
The __css_rstat_{base_,}init/exit() functions have complexity in how they
distinguish between base stats and formal subsystem stats. Eliminate this
complexity by breaking up these functions and moving the logic for base and
subsystem directly into their respective public API functions.
Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
---
kernel/cgroup/rstat.c | 72 ++++++++++++++++++-------------------------
1 file changed, 30 insertions(+), 42 deletions(-)
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index ba656a53136a..30fdf92a21a4 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -437,29 +437,15 @@ __bpf_kfunc void css_rstat_flush(struct cgroup_subsys_state *css)
}
}
-static int __css_rstat_init(struct cgroup_subsys_state *css, bool is_self)
+int css_rstat_init(struct cgroup_subsys_state *css)
{
- struct cgroup *cgrp = css->cgroup;
int cpu;
- if (is_self) {
- /* the root cgrp has rstat_base_cpu preallocated */
- if (!cgrp->rstat_base_cpu) {
- cgrp->rstat_base_cpu = alloc_percpu(struct cgroup_rstat_base_cpu);
- if (!cgrp->rstat_base_cpu)
- return -ENOMEM;
- }
- }
-
/* the root cgrp's self css has rstat_cpu preallocated */
if (!css->rstat_cpu) {
css->rstat_cpu = alloc_percpu(struct css_rstat_cpu);
- if (!css->rstat_cpu) {
- if (is_self)
- free_percpu(cgrp->rstat_base_cpu);
-
+ if (!css->rstat_cpu)
return -ENOMEM;
- }
}
/* ->updated_children list is self terminated */
@@ -467,19 +453,12 @@ static int __css_rstat_init(struct cgroup_subsys_state *css, bool is_self)
struct css_rstat_cpu *rstatc = css_rstat_cpu(css, cpu);
rstatc->updated_children = css;
-
- if (is_self) {
- struct cgroup_rstat_base_cpu *rstatbc;
-
- rstatbc = cgroup_rstat_base_cpu(cgrp, cpu);
- u64_stats_init(&rstatbc->bsync);
- }
}
return 0;
}
-static void __css_rstat_exit(struct cgroup_subsys_state *css, bool is_self)
+void css_rstat_exit(struct cgroup_subsys_state *css)
{
int cpu;
@@ -494,35 +473,44 @@ static void __css_rstat_exit(struct cgroup_subsys_state *css, bool is_self)
return;
}
- if (is_self) {
- struct cgroup *cgrp = css->cgroup;
-
- free_percpu(cgrp->rstat_base_cpu);
- cgrp->rstat_base_cpu = NULL;
- }
-
free_percpu(css->rstat_cpu);
css->rstat_cpu = NULL;
}
-int css_rstat_init(struct cgroup_subsys_state *css)
+int cgroup_rstat_base_init(struct cgroup *cgrp)
{
- return __css_rstat_init(css, false);
-}
+ int ret, cpu;
-void css_rstat_exit(struct cgroup_subsys_state *css)
-{
- return __css_rstat_exit(css, false);
-}
+ /* the root cgrp has rstat_base_cpu preallocated */
+ if (!cgrp->rstat_base_cpu) {
+ cgrp->rstat_base_cpu = alloc_percpu(struct cgroup_rstat_base_cpu);
+ if (!cgrp->rstat_base_cpu)
+ return -ENOMEM;
+ }
-int cgroup_rstat_base_init(struct cgroup *cgrp)
-{
- return __css_rstat_init(&cgrp->self, true);
+ ret = css_rstat_init(&cgrp->self);
+ if (ret) {
+ free_percpu(cgrp->rstat_base_cpu);
+ return ret;
+ }
+
+ /* ->updated_children list is self terminated */
+ for_each_possible_cpu(cpu) {
+ struct cgroup_rstat_base_cpu *rstatbc;
+
+ rstatbc = cgroup_rstat_base_cpu(cgrp, cpu);
+ u64_stats_init(&rstatbc->bsync);
+ }
+
+ return ret;
}
void cgroup_rstat_base_exit(struct cgroup *cgrp)
{
- __css_rstat_exit(&cgrp->self, true);
+ css_rstat_exit(&cgrp->self);
+
+ free_percpu(cgrp->rstat_base_cpu);
+ cgrp->rstat_base_cpu = NULL;
}
/**
--
2.47.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init()
2025-07-22 1:40 [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() JP Kobryn
` (4 preceding siblings ...)
2025-07-22 1:40 ` [PATCH 5/5 cgroup/for-6.16-fixes] cgroup: break up the internal rstat init/exit logic by subsys and base JP Kobryn
@ 2025-07-25 17:23 ` Michal Koutný
2025-07-28 18:04 ` JP Kobryn
5 siblings, 1 reply; 10+ messages in thread
From: Michal Koutný @ 2025-07-25 17:23 UTC (permalink / raw)
To: JP Kobryn
Cc: tj, shakeel.butt, yosryahmed, hannes, akpm, linux-kernel, cgroups,
kernel-team
[-- Attachment #1: Type: text/plain, Size: 957 bytes --]
Hi.
On Mon, Jul 21, 2025 at 06:40:25PM -0700, JP Kobryn <inwardvessel@gmail.com> wrote:
...
Thanks for the instructive summary!
> If any of the three goto jumps are taken, async cleanup will begin and
> css_rstat_exit() will be invoked. But since css_rstat_init() would not have
> succeeded, the warning will eventually be reached.
First thought is why not simply add a flag that'd guide whether
css_rstat_exit() has work to do.
This is meant as a fix, so it should have some metadata, I'd consider this one:
Fixes: 5da3bfa029d68 ("cgroup: use separate rstat trees for each subsystem")
(that's when css_rstat_init was moved to css_create)
and likely this
Reported-by: syzbot+8d052e8b99e40bc625ed@syzkaller.appspotmail.com
(Sorry for being such a bureaucrat.)
It's most appropriate in your 4/5 but do you think it'd be possible to
reshuffle the series to put the fix in front (to ease it for stable
kernels) and refactorings after?
Regards,
Michal
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init()
2025-07-25 17:23 ` [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() Michal Koutný
@ 2025-07-28 18:04 ` JP Kobryn
2025-07-29 9:42 ` Michal Koutný
0 siblings, 1 reply; 10+ messages in thread
From: JP Kobryn @ 2025-07-28 18:04 UTC (permalink / raw)
To: Michal Koutný
Cc: tj, shakeel.butt, yosryahmed, hannes, akpm, linux-kernel, cgroups,
kernel-team
Thanks for taking a look Michal.
On 7/25/25 10:23 AM, Michal Koutný wrote:
> Hi.
>
> On Mon, Jul 21, 2025 at 06:40:25PM -0700, JP Kobryn <inwardvessel@gmail.com> wrote:
> ...
>
> Thanks for the instructive summary!
>
>> If any of the three goto jumps are taken, async cleanup will begin and
>> css_rstat_exit() will be invoked. But since css_rstat_init() would not have
>> succeeded, the warning will eventually be reached.
> First thought is why not simply add a flag that'd guide whether
> css_rstat_exit() has work to do.
I did consider adding an "initialized" flag to the css but since there
can be multiple css's per
cgroup it felt like it would be adding overhead. So I went the path of
getting the call
sequence right. I'm open to feedback on this, though.
>
> This is meant as a fix, so it should have some metadata, I'd consider this one:
> Fixes: 5da3bfa029d68 ("cgroup: use separate rstat trees for each subsystem")
>
> (that's when css_rstat_init was moved to css_create)
>
> and likely this
> Reported-by: syzbot+8d052e8b99e40bc625ed@syzkaller.appspotmail.com
>
> (Sorry for being such a bureaucrat.)
No problem, I overlooked that.
> It's most appropriate in your 4/5 but do you think it'd be possible to
> reshuffle the series to put the fix in front (to ease it for stable
> kernels) and refactorings after?
Let me give that a try. As it is right now, patches 1-3 are pre-reqs for
4. I can try to get the
actual fix to the front and then add patches to additionally make
nicer/refactor.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init()
2025-07-28 18:04 ` JP Kobryn
@ 2025-07-29 9:42 ` Michal Koutný
2025-07-29 23:53 ` JP Kobryn
0 siblings, 1 reply; 10+ messages in thread
From: Michal Koutný @ 2025-07-29 9:42 UTC (permalink / raw)
To: JP Kobryn
Cc: tj, shakeel.butt, yosryahmed, hannes, akpm, linux-kernel, cgroups,
kernel-team
[-- Attachment #1: Type: text/plain, Size: 871 bytes --]
On Mon, Jul 28, 2025 at 11:04:56AM -0700, JP Kobryn <inwardvessel@gmail.com> wrote:
> I did consider adding an "initialized" flag to the css but since there can
> be multiple css's per
> cgroup it felt like it would be adding overhead. So I went the path of
> getting the call
> sequence right. I'm open to feedback on this, though.
An implicit flag that builds upon the assumption that css_rstat_init()
must only succeed after it allocates ->rstat_cpu (didn't check gotchas
of this approach with !CONFIG_SMP)
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -488,6 +488,10 @@ void css_rstat_exit(struct cgroup_subsys_state *css)
if (!css_uses_rstat(css))
return;
+ /* Incomplete css whose css_rstat_init failed */
+ if (!css->rstat_cpu)
+ return;
+
css_rstat_flush(css);
/* sanity check */
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init()
2025-07-29 9:42 ` Michal Koutný
@ 2025-07-29 23:53 ` JP Kobryn
0 siblings, 0 replies; 10+ messages in thread
From: JP Kobryn @ 2025-07-29 23:53 UTC (permalink / raw)
To: Michal Koutný
Cc: tj, shakeel.butt, yosryahmed, hannes, akpm, linux-kernel, cgroups,
kernel-team
On 7/29/25 2:42 AM, Michal Koutný wrote:
> On Mon, Jul 28, 2025 at 11:04:56AM -0700, JP Kobryn <inwardvessel@gmail.com> wrote:
>> I did consider adding an "initialized" flag to the css but since there can
>> be multiple css's per
>> cgroup it felt like it would be adding overhead. So I went the path of
>> getting the call
>> sequence right. I'm open to feedback on this, though.
>
> An implicit flag that builds upon the assumption that css_rstat_init()
> must only succeed after it allocates ->rstat_cpu (didn't check gotchas
> of this approach with !CONFIG_SMP)
I think this can work. This can probably be the early fix and then the
refactoring can follow.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-07-29 23:53 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-22 1:40 [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() JP Kobryn
2025-07-22 1:40 ` [PATCH 1/5 cgroup/for-6.16-fixes] cgroup: add exclusive css rstat init/exit api for base stats JP Kobryn
2025-07-22 1:40 ` [PATCH 2/5 cgroup/for-6.16-fixes] cgroup: check for rstat flush callback at css rstat init/exit call sites JP Kobryn
2025-07-22 1:40 ` [PATCH 3/5 cgroup/for-6.16-fixes] cgroup: split init_and_link_css() JP Kobryn
2025-07-22 1:40 ` [PATCH 4/5 cgroup/for-6.16-fixes] cgroup: initialize css rstat before linking to cgroups in css_create() JP Kobryn
2025-07-22 1:40 ` [PATCH 5/5 cgroup/for-6.16-fixes] cgroup: break up the internal rstat init/exit logic by subsys and base JP Kobryn
2025-07-25 17:23 ` [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init() Michal Koutný
2025-07-28 18:04 ` JP Kobryn
2025-07-29 9:42 ` Michal Koutný
2025-07-29 23:53 ` JP Kobryn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).