Re: [PATCH v3 cgroup/for-7.0-fixes] cgroup: Fix cgroup_drain_dying() testing the wrong condition

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	Bert Karwatzki <spasswolf@web.de>,
	Michal Koutny <mkoutny@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	kernel test robot <oliver.sang@intel.com>
Subject: Re: [PATCH v3 cgroup/for-7.0-fixes] cgroup: Fix cgroup_drain_dying() testing the wrong condition
Date: Thu, 26 Mar 2026 08:35:11 +0100
Message-ID: <20260326073511.0rcA5AGb@linutronix.de>
In-Reply-To: <acR3fYVD_blwD93_@slm.duckdns.org>

On 2026-03-25 14:02:05 [-1000], Tejun Heo wrote:
> > The only issue I see is if I delay the irq_work callback by a second.
> > Other than that, I don't see any problems.
> 
> What issue do you see when delaying it by a second? Just things being slowed
> down?

This is during boot:

[  OK  ] Mounted sys-kernel-debug.mount - Kernel Debug File System.
[  OK  ] Mounted sys-kernel-tracing.mount - Kernel Trace File System.
[  OK  ] Mounted tmp.mount - Temporary Directory /tmp.
[   20.845878] INFO: task systemd:1 blocked for more than 10 seconds.
[   20.845885]       Not tainted 7.0.0-rc5+ #178
[   20.845887] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   20.845888] task:systemd         state:D stack:0     pid:1     tgid:1     ppid:0      task_flags:0x400100 flags:0x00080000
[   20.845906] Call Trace:
[   20.845911]  <TASK>
[   20.845915]  __schedule+0x3db/0xf90
[   20.845947]  schedule+0x27/0xd0
[   20.845950]  cgroup_drain_dying+0x9b/0x190
[   20.845971]  cgroup_rmdir+0x2d/0x100
[   20.845980]  kernfs_iop_rmdir+0x6a/0xd0
[   20.845993]  vfs_rmdir+0x11a/0x280
[   20.846002]  filename_rmdir+0x16f/0x1e0
[   20.846009]  __x64_sys_rmdir+0x28/0x40
[   20.846015]  do_syscall_64+0x119/0x5a0
[   20.846152]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   20.846158] RIP: 0033:0x7ff495627337
[   20.846164] RSP: 002b:00007ffd7efa66f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000054
[   20.846170] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff495627337
[   20.846172] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00005646ca9583a0
[   20.846173] RBP: 00005646ca9583a0 R08: 000000000000000c R09: 0000000000000000
[   20.846174] R10: 0000000000000000 R11: 0000000000000246 R12: 00005646ca957ac0
[   20.846175] R13: 0000000000000001 R14: 0000000000000004 R15: 0000000000000000
[   20.846178]  </TASK>

It does not recover, so I think there might be another race lurking.
This is the change I am talking about:

--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -7112,9 +7112,9 @@ static void do_cgroup_task_dead(struct task_struct *tsk)
  * irq_work to allow batching while ensuring timely completion.
  */
 static DEFINE_PER_CPU(struct llist_head, cgrp_dead_tasks);
-static DEFINE_PER_CPU(struct irq_work, cgrp_dead_tasks_iwork);
+static DEFINE_PER_CPU(struct delayed_work, cgrp_delayed_tasks_iwork);
 
-static void cgrp_dead_tasks_iwork_fn(struct irq_work *iwork)
+static void cgrp_dead_tasks_iwork_fn(struct work_struct *iwork)
 {
 	struct llist_node *lnode;
 	struct task_struct *task, *next;
@@ -7131,9 +7131,11 @@ static void __init cgroup_rt_init(void)
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
+		struct delayed_work *dwork;
+
 		init_llist_head(per_cpu_ptr(&cgrp_dead_tasks, cpu));
-		per_cpu(cgrp_dead_tasks_iwork, cpu) =
-			IRQ_WORK_INIT_LAZY(cgrp_dead_tasks_iwork_fn);
+		dwork = &per_cpu(cgrp_delayed_tasks_iwork, cpu);
+		INIT_DELAYED_WORK(dwork, cgrp_dead_tasks_iwork_fn);
 	}
 }
 
@@ -7141,7 +7143,7 @@ void cgroup_task_dead(struct task_struct *task)
 {
 	get_task_struct(task);
 	llist_add(&task->cg_dead_lnode, this_cpu_ptr(&cgrp_dead_tasks));
-	irq_work_queue(this_cpu_ptr(&cgrp_dead_tasks_iwork));
+	schedule_delayed_work(this_cpu_ptr(&cgrp_delayed_tasks_iwork), HZ);
 }
 #else	/* CONFIG_PREEMPT_RT */
 static void __init cgroup_rt_init(void) {}

> Thanks.
> 

Sebastian
