* [PATCH cgroup/for-7.0-fixes] selftests/cgroup: Don't test populated synchrony against task exit
@ 2026-03-23 20:28 Tejun Heo
2026-03-24 7:50 ` Christian Brauner
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Tejun Heo @ 2026-03-23 20:28 UTC
To: cgroups, linux-kernel
Cc: Sebastian Andrzej Siewior, Christian Brauner, Johannes Weiner,
Michal Koutny, Shuah Khan, linux-kselftest
test_cgcore_populated (test_core) and test_cgkill_{simple,tree,forkbomb}
(test_kill) check cgroup.events "populated 0" immediately after reaping
child tasks with waitpid(). This used to work because cgroup_task_exit() in
do_exit() unlinked tasks from css_sets before exit_notify() woke up
waitpid().
d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done
switching out") moved the unlink to cgroup_task_dead() in
finish_task_switch(), which runs after exit_notify(). The populated counter
is now decremented after the parent's waitpid() can return, so there is no
longer a synchronous ordering guarantee. On PREEMPT_RT, where
cgroup_task_dead() is further deferred through lazy irq_work, the race
window is even larger.
The synchronous populated transition was never part of the cgroup interface
contract - it was an implementation artifact. Use cg_read_strcmp_wait() which
retries for up to 1 second, matching what these tests actually need to
verify: that the cgroup eventually becomes unpopulated after all tasks exit.
Fixes: d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: cgroups@vger.kernel.org
---
tools/testing/selftests/cgroup/lib/cgroup_util.c | 15 +++++++++++++++
tools/testing/selftests/cgroup/lib/include/cgroup_util.h | 2 ++
tools/testing/selftests/cgroup/test_core.c | 3 ++-
tools/testing/selftests/cgroup/test_kill.c | 7 ++++---
4 files changed, 23 insertions(+), 4 deletions(-)
--- a/tools/testing/selftests/cgroup/lib/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/lib/cgroup_util.c
@@ -123,6 +123,21 @@ int cg_read_strcmp(const char *cgroup, c
return ret;
}
+int cg_read_strcmp_wait(const char *cgroup, const char *control,
+			const char *expected)
+{
+	int i, ret;
+
+	for (i = 0; i < 100; i++) {
+		ret = cg_read_strcmp(cgroup, control, expected);
+		if (!ret)
+			return ret;
+		usleep(10000);
+	}
+
+	return ret;
+}
+
int cg_read_strstr(const char *cgroup, const char *control, const char *needle)
{
char buf[PAGE_SIZE];
--- a/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
+++ b/tools/testing/selftests/cgroup/lib/include/cgroup_util.h
@@ -61,6 +61,8 @@ extern int cg_read(const char *cgroup, c
char *buf, size_t len);
extern int cg_read_strcmp(const char *cgroup, const char *control,
const char *expected);
+extern int cg_read_strcmp_wait(const char *cgroup, const char *control,
+ const char *expected);
extern int cg_read_strstr(const char *cgroup, const char *control,
const char *needle);
extern long cg_read_long(const char *cgroup, const char *control);
--- a/tools/testing/selftests/cgroup/test_core.c
+++ b/tools/testing/selftests/cgroup/test_core.c
@@ -233,7 +233,8 @@ static int test_cgcore_populated(const c
if (err)
goto cleanup;
- if (cg_read_strcmp(cg_test_d, "cgroup.events", "populated 0\n"))
+ if (cg_read_strcmp_wait(cg_test_d, "cgroup.events",
+ "populated 0\n"))
goto cleanup;
/* Remove cgroup. */
--- a/tools/testing/selftests/cgroup/test_kill.c
+++ b/tools/testing/selftests/cgroup/test_kill.c
@@ -86,7 +86,7 @@ cleanup:
wait_for_pid(pids[i]);
if (ret == KSFT_PASS &&
- cg_read_strcmp(cgroup, "cgroup.events", "populated 0\n"))
+ cg_read_strcmp_wait(cgroup, "cgroup.events", "populated 0\n"))
ret = KSFT_FAIL;
if (cgroup)
@@ -190,7 +190,8 @@ cleanup:
wait_for_pid(pids[i]);
if (ret == KSFT_PASS &&
- cg_read_strcmp(cgroup[0], "cgroup.events", "populated 0\n"))
+ cg_read_strcmp_wait(cgroup[0], "cgroup.events",
+ "populated 0\n"))
ret = KSFT_FAIL;
for (i = 9; i >= 0 && cgroup[i]; i--) {
@@ -251,7 +252,7 @@ cleanup:
wait_for_pid(pid);
if (ret == KSFT_PASS &&
- cg_read_strcmp(cgroup, "cgroup.events", "populated 0\n"))
+ cg_read_strcmp_wait(cgroup, "cgroup.events", "populated 0\n"))
ret = KSFT_FAIL;
if (cgroup)
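
For readers who want to try the retry pattern outside the selftest tree, here is a minimal standalone sketch. It assumes an ordinary file in place of a cgroup control file. file_strcmp() and file_strcmp_wait() are hypothetical names used only for illustration, not helpers from cgroup_util.c. The patch's cg_read_strcmp_wait() hardcodes 100 attempts with a 10000 us sleep (roughly 1 second in total), while this sketch takes both values as parameters.

```c
#define _DEFAULT_SOURCE		/* for usleep() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for cg_read_strcmp(): returns 0 if the file at
 * @path holds exactly @expected, 1 on mismatch, -1 if it cannot be opened.
 */
static int file_strcmp(const char *path, const char *expected)
{
	char buf[4096];
	size_t len;
	FILE *f;

	assert(path && expected);

	f = fopen(path, "r");
	if (!f)
		return -1;
	len = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[len] = '\0';

	return strcmp(buf, expected) ? 1 : 0;
}

/*
 * Poll until the file matches or the retry budget is exhausted.
 * The worst-case wait is roughly retries * interval_us microseconds.
 * Returns 0 on a match, otherwise the result of the last attempt.
 */
static int file_strcmp_wait(const char *path, const char *expected,
			    int retries, unsigned int interval_us)
{
	int i, ret = -1;

	for (i = 0; i < retries; i++) {
		ret = file_strcmp(path, expected);
		if (!ret)
			return 0;
		usleep(interval_us);
	}

	return ret;
}
```

Calling file_strcmp_wait(path, "populated 0\n", 100, 10000) mirrors the budget used in the patch. The check only asserts that the expected state is reached eventually, which is all the tests need now that the populated update is no longer ordered before waitpid() returns.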
* Re: [PATCH cgroup/for-7.0-fixes] selftests/cgroup: Don't test populated synchrony against task exit
2026-03-23 20:28 [PATCH cgroup/for-7.0-fixes] selftests/cgroup: Don't test populated synchrony against task exit Tejun Heo
@ 2026-03-24 7:50 ` Christian Brauner
2026-03-24 9:04 ` Sebastian Andrzej Siewior
2026-03-24 20:24 ` Tejun Heo
2 siblings, 0 replies; 4+ messages in thread
From: Christian Brauner @ 2026-03-24 7:50 UTC
To: Tejun Heo
Cc: cgroups, linux-kernel, Sebastian Andrzej Siewior, Johannes Weiner,
Michal Koutny, Shuah Khan, linux-kselftest
On Mon, Mar 23, 2026 at 10:28:29AM -1000, Tejun Heo wrote:
> test_cgcore_populated (test_core) and test_cgkill_{simple,tree,forkbomb}
> (test_kill) check cgroup.events "populated 0" immediately after reaping
> child tasks with waitpid(). This used to work because cgroup_task_exit() in
> do_exit() unlinked tasks from css_sets before exit_notify() woke up
> waitpid().
>
> d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done
> switching out") moved the unlink to cgroup_task_dead() in
> finish_task_switch(), which runs after exit_notify(). The populated counter
> is now decremented after the parent's waitpid() can return, so there is no
> longer a synchronous ordering guarantee. On PREEMPT_RT, where
> cgroup_task_dead() is further deferred through lazy irq_work, the race
> window is even larger.
>
> The synchronous populated transition was never part of the cgroup interface
> contract - it was an implementation artifact. Use cg_read_strcmp_wait() which
> retries for up to 1 second, matching what these tests actually need to
> verify: that the cgroup eventually becomes unpopulated after all tasks exit.
>
> Fixes: d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: cgroups@vger.kernel.org
> ---
Seems fine to me.
* Re: [PATCH cgroup/for-7.0-fixes] selftests/cgroup: Don't test populated synchrony against task exit
2026-03-23 20:28 [PATCH cgroup/for-7.0-fixes] selftests/cgroup: Don't test populated synchrony against task exit Tejun Heo
2026-03-24 7:50 ` Christian Brauner
@ 2026-03-24 9:04 ` Sebastian Andrzej Siewior
2026-03-24 20:24 ` Tejun Heo
2 siblings, 0 replies; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-03-24 9:04 UTC
To: Tejun Heo
Cc: cgroups, linux-kernel, Christian Brauner, Johannes Weiner,
Michal Koutny, Shuah Khan, linux-kselftest
On 2026-03-23 10:28:29 [-1000], Tejun Heo wrote:
> test_cgcore_populated (test_core) and test_cgkill_{simple,tree,forkbomb}
> (test_kill) check cgroup.events "populated 0" immediately after reaping
> child tasks with waitpid(). This used to work because cgroup_task_exit() in
> do_exit() unlinked tasks from css_sets before exit_notify() woke up
> waitpid().
>
> d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done
> switching out") moved the unlink to cgroup_task_dead() in
> finish_task_switch(), which runs after exit_notify(). The populated counter
> is now decremented after the parent's waitpid() can return, so there is no
> longer a synchronous ordering guarantee. On PREEMPT_RT, where
> cgroup_task_dead() is further deferred through lazy irq_work, the race
> window is even larger.
>
> The synchronous populated transition was never part of the cgroup interface
> contract - it was an implementation artifact. Use cg_read_strcmp_wait() which
> retries for up to 1 second, matching what these tests actually need to
> verify: that the cgroup eventually becomes unpopulated after all tasks exit.
>
> Fixes: d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: cgroups@vger.kernel.org
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Sebastian
* Re: [PATCH cgroup/for-7.0-fixes] selftests/cgroup: Don't test populated synchrony against task exit
2026-03-23 20:28 [PATCH cgroup/for-7.0-fixes] selftests/cgroup: Don't test populated synchrony against task exit Tejun Heo
2026-03-24 7:50 ` Christian Brauner
2026-03-24 9:04 ` Sebastian Andrzej Siewior
@ 2026-03-24 20:24 ` Tejun Heo
2 siblings, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2026-03-24 20:24 UTC
To: cgroups, linux-kernel
Cc: Sebastian Andrzej Siewior, Johannes Weiner, Michal Koutny,
Shuah Khan, linux-kselftest, Christian Brauner
Applied to cgroup/for-7.0-fixes with the subject updated to:
selftests/cgroup: Don't require synchronous populated update on task exit
Thanks.
--
tejun