From: Jiazi Li <jqqlijiazi@gmail.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
Joel Fernandes <joelagnelf@nvidia.com>,
Josh Triplett <josh@joshtriplett.org>,
Boqun Feng <boqun@kernel.org>,
Uladzislau Rezki <urezki@gmail.com>,
rcu@vger.kernel.org, "mingzhu.wang" <mingzhu.wang@transsion.com>
Subject: Re: [PATCH] rcu: use NMI to dump backtrace of blkd-task running on other cpu
Date: Wed, 22 Apr 2026 14:45:36 +0800
Message-ID: <20260422063525.GA3155@Jiazi.Li>
In-Reply-To: <7a46efa3-3d11-45df-8d73-5af8b87a4b84@paulmck-laptop>
On Tue, Apr 21, 2026 at 04:20:54PM -0700, Paul E. McKenney wrote:
> On Fri, Apr 17, 2026 at 09:38:13AM +0800, Jiazi Li wrote:
> > sched_show_task cannot dump backtrace of blkd-task running on other
> > cpu:
> > [117421.286553][ C0] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > [117421.286579][ C0] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P2280
> > [117421.286595][ C0] rcu: (detected by 0, t=5252 jiffies, g=751845, q=66318 ncpus=8)
> > [117421.286604][ C0] task:android.imms2 state:R running task stack:0 ...
> > [117421.286617][ C0] Call trace:
> > [117421.286622][ C0] __switch_to+0x1a0/0x318
> > [117421.286636][ C0] 0x0
> >
> > So use NMI to dump backtrace:
> > [ 390.584143] rcub/0: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > [ 390.585156] rcub/0: rcu: Tasks blocked on level-0 rcu_node (CPUs 0-7): P6816
> > [ 390.586207] rcub/0: rcu: (detected by 5, t=52532 jiffies, g=7405, q=63942 ncpus=8)
> > [ 390.587320] rcub/0: Sending NMI from CPU 5 to CPUs 4:
> > [ 390.588111] rcu_stall_threa: NMI backtrace for cpu 4
> > [ 390.588116] rcu_stall_threa: CPU: 4 UID: 0 PID: 6816 Comm: rcu_stall_threa Tainted: P...
> > [ 390.588120] rcu_stall_threa: Tainted: [P]=PROPRIETARY_MODULE, [W]=WARN, [O]=OOT_MODULE
> > [ 390.588122] rcu_stall_threa: Hardware name: MT6858 (DT)
> > [ 390.588123] rcu_stall_threa: pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 390.588125] rcu_stall_threa: pc : _raw_spin_unlock_irqrestore+0x1c/0x44
> > [ 390.588131] rcu_stall_threa: lr : ___ratelimit+0xd4/0x110
> > [ 390.588134] rcu_stall_threa: sp : ffffffc08464bdf0
> > [ 390.588135] rcu_stall_threa: x29: ffffffc08464bdf0 x28: 0000000000000000 x27: 0000000000000000
> > [ 390.588138] rcu_stall_threa: x26: 0000000000000000 x25: 0000000000000000 x24: 00000000000004e2
> > [ 390.588140] rcu_stall_threa: x23: ffffffd82ae77000 x22: ffffffd82af1fae8 x21: 000000000000000a
> > [ 390.588142] rcu_stall_threa: x20: 0000000000000000 x19: 0000000000000000 x18: ffffffc08456d020
> > [ 390.588144] rcu_stall_threa: x17: 000000008c623181 x16: 000000008c623181 x15: 0000000000000010
> > [ 390.588146] rcu_stall_threa: x14: 0000000000000100 x13: ffffffc084648000 x12: ffffffc08464c000
> > [ 390.588148] rcu_stall_threa: x11: 5e2da9f91a08d800 x10: ffffffd8299b39fc x9 : 0000000100005874
> > [ 390.588150] rcu_stall_threa: x8 : 0000000000000000 x7 : 0000000000000001 x6 : fffffffebea2b0a0
> > [ 390.588152] rcu_stall_threa: x5 : 0000000000000000 x4 : 0000000000000402 x3 : 0000000000000000
> > [ 390.588154] rcu_stall_threa: x2 : ffffff81ca8d9680 x1 : 0000000000000000 x0 : 0000000000000001
> > [ 390.588156] rcu_stall_threa: Call trace:
> > [ 390.588157] rcu_stall_threa: _raw_spin_unlock_irqrestore+0x1c/0x44
> > [ 390.588159] rcu_stall_threa: ___ratelimit+0xd4/0x110
> > [ 390.588161] rcu_stall_threa: rcu_thread_func+0x90/0xa8
> > [ 390.588164] rcu_stall_threa: kthread+0x110/0x1a4
> > [ 390.588167] rcu_stall_threa: ret_from_fork+0x10/0x20
> >
> > Signed-off-by: Jiazi Li <jqqlijiazi@gmail.com>
> > Tested-by: mingzhu.wang <mingzhu.wang@transsion.com>
>
> This looks like an arm64 stack trace. Are there any arm64 systems in
> production that do real NMIs? (Don't get me wrong, it would be nice if
> there are!)
>
Since commit 331a1b3a836c ("arm64: smp: Add arch support for backtrace
using pseudo-NMI"), arm64 supports backtraces via pseudo-NMI, which is
actually an IPI.
> > ---
> > kernel/rcu/tree_stall.h | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > index b67532cb8770..5806f9a43579 100644
> > --- a/kernel/rcu/tree_stall.h
> > +++ b/kernel/rcu/tree_stall.h
> > @@ -289,7 +289,12 @@ static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
> > * Avoid triggering hard lockup.
> > */
> > touch_nmi_watchdog();
> > - sched_show_task(t);
> > + if (unlikely(t->on_cpu && t != current) &&
>
> What if task t blocks or migrates to some other CPU at this point?
>
Yes, that is indeed a concern. We can identify such cases by checking
whether the PID reported by RCU matches the PID captured in the NMI
backtrace.
Do you have any suggestions?
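To make the PID check concrete, here is a minimal userspace sketch, not kernel code. The struct and function names are hypothetical and exist only for illustration; the values in the usage note come from the second log above (RCU reported P6816, and the NMI backtrace header showed "CPU: 4 ... PID: 6816").

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: the blocked task as reported in the RCU stall warning. */
struct stall_target {
	int pid;	/* e.g. 6816 from "P6816" */
	int cpu;	/* CPU the NMI was sent to, e.g. 4 */
};

/* Hypothetical: fields read back from the "NMI backtrace" header lines. */
struct nmi_report {
	int cpu;	/* from "NMI backtrace for cpu N" */
	int pid;	/* from "CPU: N ... PID: M" */
};

/*
 * Return true only if the backtrace was captured on the CPU we targeted
 * and that CPU was still running the task RCU reported. A mismatch
 * suggests the task blocked or migrated before the NMI arrived.
 */
static bool backtrace_matches_target(const struct stall_target *t,
				     const struct nmi_report *r)
{
	return t->cpu == r->cpu && t->pid == r->pid;
}
```

With the log above, target {6816, 4} and report {4, 6816} match. If the report showed a different PID on CPU 4, the backtrace would belong to whatever task replaced the blocked one and should not be attributed to it.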
> > + trigger_single_cpu_backtrace(task_cpu(t))) {
> > + /*Successfully triggered remote backtrace*/
>
> Wouldn't inverting the condition save a couple of lines of code here?
> And make it a bit more straightforward?
>
> Thanx, Paul
>
Do you mean something like the following code?

	if (!unlikely(t->on_cpu && t != current) ||
	    !trigger_single_cpu_backtrace(task_cpu(t)))
		sched_show_task(t);
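As a sanity check that the inverted form behaves the same as the if/else in the patch, here is a small userspace model, not kernel code. The enum, the fake_trigger() helper and the boolean inputs are stand-ins I made up for illustration: on_cpu and is_current model t->on_cpu and t == current, and nmi_ok models the return value of trigger_single_cpu_backtrace().

```c
#include <assert.h>
#include <stdbool.h>

/* Which dump path was taken; names are illustrative only. */
enum dump_path { DUMP_REMOTE_NMI, DUMP_SCHED_SHOW_TASK };

/* Counts how often the (modelled) remote backtrace was requested. */
static int nmi_attempts;

/* Stand-in for trigger_single_cpu_backtrace(): records the call. */
static bool fake_trigger(bool nmi_ok)
{
	nmi_attempts++;
	return nmi_ok;
}

/* Shape of the posted patch: if/else with an empty success branch. */
static enum dump_path original_form(bool on_cpu, bool is_current, bool nmi_ok)
{
	if ((on_cpu && !is_current) && fake_trigger(nmi_ok)) {
		/* Successfully triggered remote backtrace */
		return DUMP_REMOTE_NMI;
	} else {
		return DUMP_SCHED_SHOW_TASK;
	}
}

/* Inverted shape: fall back unless the remote backtrace was triggered. */
static enum dump_path inverted_form(bool on_cpu, bool is_current, bool nmi_ok)
{
	if (!(on_cpu && !is_current) || !fake_trigger(nmi_ok))
		return DUMP_SCHED_SHOW_TASK;
	return DUMP_REMOTE_NMI;
}

/* Compare both shapes over all eight input combinations. */
static void check_equivalence(void)
{
	for (int bits = 0; bits < 8; bits++) {
		bool on_cpu = bits & 1, is_current = bits & 2, nmi_ok = bits & 4;
		int before, orig_calls, inv_calls;
		enum dump_path a, b;

		before = nmi_attempts;
		a = original_form(on_cpu, is_current, nmi_ok);
		orig_calls = nmi_attempts - before;

		before = nmi_attempts;
		b = inverted_form(on_cpu, is_current, nmi_ok);
		inv_calls = nmi_attempts - before;

		/* Same path chosen, and the trigger is attempted equally often. */
		assert(a == b);
		assert(orig_calls == inv_calls);
	}
}
```

Because || short-circuits, the inverted form only attempts the remote backtrace when the task is on another CPU, just like the && chain in the patch, so the two shapes should differ only in line count.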
> > + } else {
> > + sched_show_task(t);
> > + }
> > }
> > raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > }
> > --
> > 2.49.0
> >