public inbox for cgroups@vger.kernel.org
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	Bert Karwatzki <spasswolf@web.de>,
	Michal Koutny <mkoutny@suse.com>,
	kernel test robot <oliver.sang@intel.com>
Subject: Re: [PATCH] cgroup: Wait for dying tasks to leave on rmdir
Date: Tue, 24 Mar 2026 09:21:47 +0100	[thread overview]
Message-ID: <20260324082147.9ysLN_6x@linutronix.de> (raw)
In-Reply-To: <acGavAFVTfggKIKy@slm.duckdns.org>

On 2026-03-23 09:55:40 [-1000], Tejun Heo wrote:
> Hello,
Hi,

> > Then I added my RCU patch. This led to a problem already during boot up
> > (didn't manage to get to the test suite).
> 
> Is that the patch to move cgroup_task_dead() to delayed_put_task_struct()? I
> don't think we can delay populated state update till usage count reaches
> zero. e.g. bpf_task_acquire() can be used by arbitrary bpf programs and will
> pin the usage count indefinitely delaying populated state update. Similar to
> delaying the event to free path, you can construct a deadlock scenario too.

Okay, then. I expected it to be a limited window within a bpf program or
sched_ext.

> > systemd-1 places modprobe-1044 in a cgroup, then destroys the cgroup.
> > It hangs in cgroup_drain_dying() because nr_populated_csets is still 1.
> > modprobe-1044 is still there in Z so the cgroup removal didn't get there
> > yet. That irq_work was quicker than RCU in this case. This can be
> > reproduced without RCU by
> 
> Isn't this the exact scenario? systemd is the one who should reap and drop
> the usage count but it's waiting for rmdir() to finish which can't finish
> due to the usage count which hasn't been reaped by systemd? We can't
> interlock these two. They have to make progress independently.

But nobody is holding it back. For some reason systemd-1 did not reap
modprobe-1044 first but went for the rmdir() first. I noticed it with
RCU first, but it was still there after delaying the cleanup by one
second without RCU.

> > -       irq_work_queue(this_cpu_ptr(&cgrp_dead_tasks_iwork));
> > +       schedule_delayed_work(this_cpu_ptr(&cgrp_delayed_tasks_iwork), HZ);
> > 
> > So there is always a one second delay. If I give up waiting after 10secs
> > then it boots eventually and there are no zombies around. The test_core
> > seems to complete…
> > 
> > Having the irq_work as-is, then the "cgroup_dead()" happens on the HZ
> > tick. test_core then complains just with
> > | not ok 7 test_cgcore_populated
> 
> The test is assuming that waitpid() success guarantees cgroup !populated
> event. While before all these changes, that held, it wasn't intentional and
> the test just picked up on arbitrary ordering. I'll just remove that
> particular test.

okay. Thanks.

> Thanks.

Sebastian


Thread overview: 4+ messages
2026-03-23  3:58 [PATCH] cgroup: Wait for dying tasks to leave on rmdir Tejun Heo
2026-03-23 11:32 ` Sebastian Andrzej Siewior
2026-03-23 19:55   ` Tejun Heo
2026-03-24  8:21     ` Sebastian Andrzej Siewior [this message]
