From: Martin Pitt <martin@piware.de>
To: Tejun Heo <tj@kernel.org>
Cc: regressions@lists.linux.dev, cgroups@vger.kernel.org,
hannes@cmpxchg.org,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [REGRESSION] 6.9.11: systemd hangs in cgroup_drain_dying during cleanup after podman operations
Date: Thu, 30 Apr 2026 08:15:33 +0200
Message-ID: <afLzhRPSaD2Atp7G@piware.de>
In-Reply-To: <35e0670adb4abeab13da2c321582af9f@kernel.org> <f19d08689301f9cc0211e6273f833246@kernel.org>
Hello Tejun,
(Dropping lizefan.x@bytedance.com from CC:, it doesn't exist any more)
Tejun Heo [2026-04-29 6:21 -1000]:
> Thanks for the report. The dmesg you attached has only a partial sysrq-t
> - the dying-task stacks I need were pushed out of the ring buffer. Could
> you increase log_buf_len, reproduce, trigger sysrq-t, and send the
> resulting dmesg?
Increased to 4M, which was enough. I added the resulting dmesg to the bottom of
the debug notes comment [1], direct link: [2]. I suppose it's not necessary any
more, but just for the record.
[1] https://github.com/cockpit-project/bots/pull/8970#issuecomment-4342147158
[2] https://github.com/user-attachments/files/27231725/dmesg-task-dump.txt
Tejun Heo [2026-04-29 11:15 -1000]:
> I think I have the mechanism. The deadlock chains three things together.
You are a genius!
> 3. The container's PID 1 (whatever the entrypoint runs) is in
> do_exit() but parked in zap_pid_ns_processes' second wait loop:
FTR, the container is pretty dumb, just
podman run quay.io/prometheus/busybox sh -c 'echo 123; sleep infinity'
We are not actually interested in the container workload for these tests; they
exercise cockpit-podman for managing containers on the host.
However, I just confirmed that busybox's sh, like "proper" bash, does reap
child processes (unlike, for example, running `sleep` directly as pid 1, where
you do get zombies).
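For illustration, the zombie state that results when nobody reaps an exited
child can be observed from userspace. This is a minimal Python sketch, not part
of the reproducer from this thread; the helper names `proc_state` and
`zombie_demo` are hypothetical, and it assumes Linux with /proc mounted.

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the pid is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The comm field is wrapped in parentheses and may itself contain spaces,
    # so split on the last ')' and take the first field after it.
    return data.rsplit(")", 1)[1].split()[0]


def zombie_demo():
    """Fork a child that exits at once; report its state before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    # WNOWAIT blocks until the child has exited but leaves it unreaped,
    # so it stays around as a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = proc_state(pid)
    os.waitpid(pid, 0)  # this is the reaping step a well-behaved pid 1 performs
    after = proc_state(pid)
    return before, after
```

Calling `zombie_demo()` should report the child as a zombie before the
`waitpid` call and as gone afterwards. A pid 1 that never performs that reaping
step leaves such entries behind.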
> ----- min-repro.c -----
On Fedora 44 with 6.9.13, this hangs at
A: rmdir(/sys/fs/cgroup/drain-min/inner) — wedges if bug present (deliberately NOT wait4-ing C)
root 1501 0.0 0.1 2460 1764 pts/0 D+ 06:10 0:00 /tmp/repr
root 1502 0.0 0.0 0 0 pts/0 S+ 06:10 0:00 [repr]
root 1503 0.0 0.0 0 0 pts/0 Z+ 06:10 0:00 [repr] <defunct>
as expected. It does not wedge the system in the same way, though: it does not
break "ls /proc" and such.
On Fedora 44 with the older 6.9.10 kernel, the reproducer finishes (no hang)
with EBUSY:
: B host pid=1444, C host pid=1445
pid=1444 NSpid: 1444 1
pid=1445 NSpid: 1445 2
A: rmdir(/sys/fs/cgroup/drain-min/inner) — wedges if bug present (deliberately NOT wait4-ing C)
A: rmdir returned -1 (errno=16 Device or resource busy)
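For reference, NSpid values like the ones in the output above can be read from
the `NSpid:` line of /proc/<pid>/status, which lists a task's pid in each
nested pid namespace, outermost first. The following Python sketch assumes that
is where such values come from; `parse_nspid` and `read_nspid` are hypothetical
helper names and not part of the reproducer.

```python
def parse_nspid(status_text):
    """Extract the NSpid list from the text of a /proc/<pid>/status file.

    Returns a list of ints (outermost namespace first), or None when the
    text has no NSpid line.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return None


def read_nspid(pid):
    """Read and parse NSpid for a live pid (assumes Linux with /proc mounted)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_nspid(f.read())
```

With the values shown above, a status line of `NSpid: 1445 2` parses to
`[1445, 2]`: host pid 1445, pid 2 inside the nested namespace.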
I suppose you know all that already, but I'm confirming it on my setup in case
that helps in any way.
Thanks!
Martin