From: Frederic Weisbecker <frederic@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Tao <wangtao554@huawei.com>,
stable@vger.kernel.org, mingo@redhat.com, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
bristot@redhat.com, tglx@linutronix.de,
linux-kernel@vger.kernel.org, tanghui20@huawei.com,
zhangqiao22@huawei.com
Subject: Re: [PATCH] sched/core: Fix potential deadlock on rq lock
Date: Thu, 11 Sep 2025 17:02:45 +0200
Message-ID: <aMLklWUzm1ZqZgZF@localhost.localdomain>
In-Reply-To: <20250911135358.GY3245006@noisy.programming.kicks-ass.net>

On Thu, Sep 11, 2025 at 03:53:58PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 11, 2025 at 12:42:49PM +0000, Wang Tao wrote:
> > When CPU1 enters the nohz_full state, the kworker on CPU0 executes
> > sched_tick_remote() while holding CPU1's rq lock and can trigger
> > the warning WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3). Printing the
> > warning message takes the console_sem semaphore. Meanwhile, a task
> > on CPU1's rq that wants to print cannot acquire console_sem, so it
> > joins the wait queue in the UNINTERRUPTIBLE state and waits for
> > console_sem to be released. When the task on CPU0 releases
> > console_sem, it wakes up the waiting task. try_to_wake_up() then
> > attempts to acquire CPU1's rq lock again, which CPU0 still holds,
> > resulting in a deadlock.
> >
> > The triggering scenario is as follows:
> >
> > CPU0                                            CPU1
> > sched_tick_remote
> > WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3)
> >
> > report_bug                                      con_write
> > printk
> >
> > console_unlock
> >                                                 do_con_write
> >                                                 console_lock
> >                                                 down(&console_sem)
> >                                                 list_add_tail(&waiter.list, &sem->wait_list);
> > up(&console_sem)
> > wake_up_q(&wake_q)
> > try_to_wake_up
> > __task_rq_lock
> > _raw_spin_lock
> >
> > This patch fixes the issue by deferring all printk console printing
> > while the lock is held.
> >
> > Fixes: d84b31313ef8 ("sched/isolation: Offload residual 1Hz scheduler tick")
> > Signed-off-by: Wang Tao <wangtao554@huawei.com>
>
> I fundamentally hate that deferred thing and consider it a printk bug.
>
> But really, if you trip that WARN, fix it and the problem goes away.

It also probably triggers a lot of false positives. An overloaded housekeeping
CPU can easily be off for 2 seconds. We should make it 30 seconds.

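To make the threshold argument concrete, here is a minimal user-space sketch
of the comparison behind WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3). It is
not kernel code: remote_tick_overdue() and its limit_sec parameter are
hypothetical names invented for illustration, and NSEC_PER_SEC is redefined
locally with the same value the kernel uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as the kernel's NSEC_PER_SEC; redefined here for user space. */
#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper, not a kernel function: returns true when delta_ns
 * is strictly greater than limit_sec seconds. With limit_sec == 3 this is
 * the same comparison as the one inside the WARN_ON_ONCE() quoted above.
 */
static bool remote_tick_overdue(uint64_t delta_ns, unsigned int limit_sec)
{
	return delta_ns > (uint64_t)NSEC_PER_SEC * limit_sec;
}
```

With this helper, a delta of 4 seconds is overdue against a 3-second limit
but not against a 30-second one, which is the kind of false positive a
longer timeout would be expected to filter out.
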
Thanks.
--
Frederic Weisbecker
SUSE Labs
Thread overview: 16+ messages
2025-09-11 12:42 [PATCH] sched/core: Fix potential deadlock on rq lock Wang Tao
2025-09-11 13:53 ` Peter Zijlstra
2025-09-11 15:02 ` Frederic Weisbecker [this message]
2025-09-11 15:14 ` Phil Auld
2025-09-11 15:38 ` Frederic Weisbecker
2025-09-11 16:13 ` [PATCH] sched: Increase sched_tick_remote timeout Phil Auld
2025-09-11 16:29 ` Frederic Weisbecker
2025-09-17 6:26 ` wangtao (EQ)
2025-09-16 8:44 ` wangtao (EQ)
2025-09-16 12:49 ` Phil Auld
2025-09-23 10:47 ` Phil Auld
2025-10-10 12:13 ` Phil Auld
2025-11-03 21:56 ` Phil Auld
2025-11-14 12:19 ` [tip: sched/core] " tip-bot2 for Phil Auld
2025-11-14 13:07 ` Phil Auld
2025-11-17 16:23 ` tip-bot2 for Phil Auld