From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Tejun Heo <tj@kernel.org>
Cc: linux-rt-devel@lists.linux.dev, cgroups@vger.kernel.org,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Koutny <mkoutny@suse.com>,
Clark Williams <clrkwllms@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Bert Karwatzki <spasswolf@web.de>
Subject: [PATCH v2] cgroup: Don't expose dead tasks in cgroup
Date: Fri, 6 Mar 2026 20:22:35 +0100
Message-ID: <20260306192235.DY60tMnM@linutronix.de>

Once a task exits it has its state set to TASK_DEAD and then it is
removed from the cgroup it belonged to. The last step happens once the
task gets out of its last schedule() invocation and is delayed on
PREEMPT_RT due to locking constraints.

As a result it is possible to receive a pid via waitpid() of a task
which is still listed in cgroup.procs for the cgroup it belonged
to. This is something that systemd does not expect and as a result it
waits for the task's exit until a timeout occurs.

This can also be reproduced on a !PREEMPT_RT kernel with a significant
delay in do_exit() after exit_notify().

Hide tasks which have PF_EXITING set from the output. The flag is set
before the parent is notified. Keeping zombies with live threads
visible shouldn't break anything (suggested by Tejun).

Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/all/20260219164648.3014-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf@web.de>
Fixes: 9311e6c29b348 ("cgroup: Fix sleeping from invalid context warning on PREEMPT_RT")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
v1…v2: https://lore.kernel.org/all/20260302120738.6KkDipsR@linutronix.de/
- Close the race window by filtering out PF_EXITING tasks instead.
- Document the possible race window after it has been verified.
kernel/cgroup/cgroup.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c22cda7766d84..eef01b80ec933 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5108,6 +5108,12 @@ static void css_task_iter_advance(struct css_task_iter *it)
 		return;
 	task = list_entry(it->task_pos, struct task_struct, cg_list);
+	/*
+	 * Hide tasks which are exiting but not yet removed. Keep zombie
+	 * leaders with live threads visible.
+	 */
+	if ((task->flags & PF_EXITING) && !atomic_read(&task->signal->live))
+		goto repeat;
 	if (it->flags & CSS_TASK_ITER_PROCS) {
 		/* if PROCS, skip over tasks which aren't group leaders */
--
2.53.0