public inbox for cgroups@vger.kernel.org
* [PATCH v2] cgroup: Don't expose dead tasks in cgroup
From: Sebastian Andrzej Siewior @ 2026-03-06 19:22 UTC
  To: Tejun Heo
  Cc: linux-rt-devel, cgroups, Johannes Weiner, Michal Koutny,
	Clark Williams, Steven Rostedt, Bert Karwatzki

Once a task exits, its state is set to TASK_DEAD and it is then removed
from the cgroup it belonged to. The removal happens once the task gets
out of its last schedule() invocation, and it is delayed on PREEMPT_RT
due to locking constraints.

As a result, waitpid() can return the pid of a task that is still
listed in cgroup.procs for the cgroup it belonged to. systemd does not
expect this and keeps waiting for the task to leave the cgroup until a
timeout occurs.
This can also be reproduced on a !PREEMPT_RT kernel with a significant
delay in do_exit() after exit_notify().
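
For illustration only (not part of the patch): a minimal userspace
sketch of how the symptom above could be observed. The helper names
pid_listed_in_procs() and reaped_child_still_listed() are hypothetical,
and the caller is assumed to pass the cgroup.procs path of its own
cgroup, since the forked child inherits it.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Return 1 if `pid` appears in the cgroup.procs-style file at `path`
 * (one decimal PID per line), 0 if it does not, -1 if the file cannot
 * be opened. Hypothetical helper, for illustration only.
 */
int pid_listed_in_procs(const char *path, pid_t pid)
{
	FILE *f = fopen(path, "r");
	long cur;
	int found = 0;

	if (!f)
		return -1;
	while (fscanf(f, "%ld", &cur) == 1) {
		if (cur == (long)pid) {
			found = 1;
			break;
		}
	}
	fclose(f);
	return found;
}

/*
 * Fork a child that exits immediately, reap it with waitpid(), then
 * check whether its PID is still listed in `procs_path`. A return
 * value of 1 corresponds to the situation described above.
 */
int reaped_child_still_listed(const char *procs_path)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);
	if (waitpid(pid, NULL, 0) != pid)
		return -1;
	return pid_listed_in_procs(procs_path, pid);
}
```

Calling reaped_child_still_listed() in a loop against the caller's own
cgroup.procs would be one way to look for the window; whether it is hit
depends on timing, so a single run proves nothing either way.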

Hide tasks that have PF_EXITING set from the output. The flag is set
before the parent is notified. Keeping zombie leaders with live threads
visible shouldn't break anything (suggested by Tejun).
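
For illustration only (not part of the patch): a small userspace model
of the visibility rule described above. struct model_task,
MODEL_PF_EXITING and model_task_hidden() are hypothetical stand-ins;
live_threads models what the kernel reads via
atomic_read(&task->signal->live).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PF_EXITING flag bit. */
#define MODEL_PF_EXITING 0x4u

/* Hypothetical, simplified view of the fields the check looks at. */
struct model_task {
	unsigned int flags;	/* models task->flags */
	int live_threads;	/* models atomic_read(&task->signal->live) */
};

/*
 * A task is hidden from the listing only if it is exiting AND its
 * thread group has no live threads left. An exiting leader whose
 * threads are still alive stays visible.
 */
bool model_task_hidden(const struct model_task *t)
{
	return (t->flags & MODEL_PF_EXITING) && t->live_threads == 0;
}
```

In this model a zombie leader with two live threads
({ MODEL_PF_EXITING, 2 }) stays visible, while a fully exited process
({ MODEL_PF_EXITING, 0 }) is skipped.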

Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/all/20260219164648.3014-1-spasswolf@web.de/
Tested-by: Bert Karwatzki <spasswolf@web.de>
Fixes: 9311e6c29b34 ("cgroup: Fix sleeping from invalid context warning on PREEMPT_RT")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 v1…v2: https://lore.kernel.org/all/20260302120738.6KkDipsR@linutronix.de/
   - Close the race window by filtering out PF_EXITING tasks instead.
   - Document the possible race window after it has been verified.

 kernel/cgroup/cgroup.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c22cda7766d84..eef01b80ec933 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5108,6 +5108,12 @@ static void css_task_iter_advance(struct css_task_iter *it)
 		return;
 
 	task = list_entry(it->task_pos, struct task_struct, cg_list);
+	/*
+	 * Hide tasks that are exiting but not yet removed. Keep zombie
+	 * leaders with live threads visible.
+	 */
+	if ((task->flags & PF_EXITING) && !atomic_read(&task->signal->live))
+		goto repeat;
 
 	if (it->flags & CSS_TASK_ITER_PROCS) {
 		/* if PROCS, skip over tasks which aren't group leaders */
-- 
2.53.0



* Re: [PATCH v2] cgroup: Don't expose dead tasks in cgroup
From: Tejun Heo @ 2026-03-06 22:48 UTC
  To: Sebastian Andrzej Siewior, linux-rt-devel, cgroups
  Cc: Johannes Weiner, Michal Koutny, Clark Williams, Steven Rostedt,
	Bert Karwatzki

Applied to cgroup/for-7.0-fixes with the following fixes.

- s/removed the cgroup/removed from the cgroup/
- s/constrains/constraints/
- Fixes tag SHA trimmed to 12 chars
- s/exitting/exiting/ and s/task which are/tasks that are/ in comment
- Added Cc: stable

Thanks.

--
tejun
