From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
Bert Karwatzki <spasswolf@web.de>,
Michal Koutny <mkoutny@suse.com>,
Johannes Weiner <hannes@cmpxchg.org>,
kernel test robot <oliver.sang@intel.com>
Subject: Re: [PATCH v3 cgroup/for-7.0-fixes] cgroup: Fix cgroup_drain_dying() testing the wrong condition
Date: Thu, 26 Mar 2026 08:35:11 +0100
Message-ID: <20260326073511.0rcA5AGb@linutronix.de>
In-Reply-To: <acR3fYVD_blwD93_@slm.duckdns.org>
On 2026-03-25 14:02:05 [-1000], Tejun Heo wrote:
> > The only issue I see is if I delay the irq_work callback by a second.
> > Other than that, I don't see any problems.
>
> What issue do you see when delaying it by a second? Just things being slowed
> down?
This is during boot:
[ OK ] Mounted sys-kernel-debug.mount - Kernel Debug File System.
[ OK ] Mounted sys-kernel-tracing.mount - Kernel Trace File System.
[ OK ] Mounted tmp.mount - Temporary Directory /tmp.
[ 20.845878] INFO: task systemd:1 blocked for more than 10 seconds.
[ 20.845885] Not tainted 7.0.0-rc5+ #178
[ 20.845887] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 20.845888] task:systemd state:D stack:0 pid:1 tgid:1 ppid:0 task_flags:0x400100 flags:0x00080000
[ 20.845906] Call Trace:
[ 20.845911] <TASK>
[ 20.845915] __schedule+0x3db/0xf90
[ 20.845947] schedule+0x27/0xd0
[ 20.845950] cgroup_drain_dying+0x9b/0x190
[ 20.845971] cgroup_rmdir+0x2d/0x100
[ 20.845980] kernfs_iop_rmdir+0x6a/0xd0
[ 20.845993] vfs_rmdir+0x11a/0x280
[ 20.846002] filename_rmdir+0x16f/0x1e0
[ 20.846009] __x64_sys_rmdir+0x28/0x40
[ 20.846015] do_syscall_64+0x119/0x5a0
[ 20.846152] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 20.846158] RIP: 0033:0x7ff495627337
[ 20.846164] RSP: 002b:00007ffd7efa66f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000054
[ 20.846170] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff495627337
[ 20.846172] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00005646ca9583a0
[ 20.846173] RBP: 00005646ca9583a0 R08: 000000000000000c R09: 0000000000000000
[ 20.846174] R10: 0000000000000000 R11: 0000000000000246 R12: 00005646ca957ac0
[ 20.846175] R13: 0000000000000001 R14: 0000000000000004 R15: 0000000000000000
[ 20.846178] </TASK>
It does not recover. Therefore I think there might be another race
lurking. This is the change I am talking about:
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -7112,9 +7112,9 @@ static void do_cgroup_task_dead(struct task_struct *tsk)
  * irq_work to allow batching while ensuring timely completion.
  */
 static DEFINE_PER_CPU(struct llist_head, cgrp_dead_tasks);
-static DEFINE_PER_CPU(struct irq_work, cgrp_dead_tasks_iwork);
+static DEFINE_PER_CPU(struct delayed_work, cgrp_delayed_tasks_iwork);
 
-static void cgrp_dead_tasks_iwork_fn(struct irq_work *iwork)
+static void cgrp_dead_tasks_iwork_fn(struct work_struct *iwork)
 {
 	struct llist_node *lnode;
 	struct task_struct *task, *next;
@@ -7131,9 +7131,11 @@ static void __init cgroup_rt_init(void)
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
+		struct delayed_work *dwork;
+
 		init_llist_head(per_cpu_ptr(&cgrp_dead_tasks, cpu));
-		per_cpu(cgrp_dead_tasks_iwork, cpu) =
-			IRQ_WORK_INIT_LAZY(cgrp_dead_tasks_iwork_fn);
+		dwork = &per_cpu(cgrp_delayed_tasks_iwork, cpu);
+		INIT_DELAYED_WORK(dwork, cgrp_dead_tasks_iwork_fn);
 	}
 }
 
@@ -7141,7 +7143,7 @@ void cgroup_task_dead(struct task_struct *task)
 {
 	get_task_struct(task);
 	llist_add(&task->cg_dead_lnode, this_cpu_ptr(&cgrp_dead_tasks));
-	irq_work_queue(this_cpu_ptr(&cgrp_dead_tasks_iwork));
+	schedule_delayed_work(this_cpu_ptr(&cgrp_delayed_tasks_iwork), HZ);
 }
 #else /* CONFIG_PREEMPT_RT */
 static void __init cgroup_rt_init(void) {}
> Thanks.
>
Sebastian
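
For readers who want to see the batching pattern the diff touches outside the kernel, here is a minimal userspace sketch. It is not kernel code and not the cgroup implementation: struct dead_node, struct dead_list, nr_dying, dead_list_add(), dead_list_drain() and drain_complete() are all hypothetical names, and the counter is only an assumed stand-in for whatever state a waiter such as cgroup_drain_dying() checks. It shows that entries pushed onto a lock-less list stay pending until the deferred callback runs, so delaying that callback also delays anyone waiting on the result.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a task queued for deferred cleanup. */
struct dead_node {
	struct dead_node *next;
};

/* Userspace model of a lock-less list head (not the kernel's llist). */
struct dead_list {
	_Atomic(struct dead_node *) first;
};

/* Assumed bookkeeping: entries queued but not yet processed. */
static atomic_int nr_dying;

/* Push one node; safe against concurrent pushers. */
static void dead_list_add(struct dead_list *list, struct dead_node *node)
{
	struct dead_node *old = atomic_load(&list->first);

	do {
		node->next = old;
	} while (!atomic_compare_exchange_weak(&list->first, &old, node));
	atomic_fetch_add(&nr_dying, 1);
}

/*
 * Deferred callback: detach the whole batch in one step and process
 * it. Returns the number of entries handled.
 */
static int dead_list_drain(struct dead_list *list)
{
	struct dead_node *node;
	int n = 0;

	node = atomic_exchange(&list->first, (struct dead_node *)NULL);
	while (node) {
		struct dead_node *next = node->next;

		atomic_fetch_sub(&nr_dying, 1);
		node = next;
		n++;
	}
	return n;
}

/* What a waiter in this model tests before it may stop sleeping. */
static int drain_complete(void)
{
	return atomic_load(&nr_dying) == 0;
}
```

In this model, drain_complete() stays false between dead_list_add() and the next dead_list_drain(). The model has no wakeup path at all, so it cannot reproduce or explain the hang reported above; it only illustrates why the waiter depends on the deferred callback running.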