Linux RCU subsystem development
From: Jiazi Li <jqqlijiazi@gmail.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Boqun Feng <boqun@kernel.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	rcu@vger.kernel.org, "mingzhu.wang" <mingzhu.wang@transsion.com>
Subject: Re: [PATCH] rcu: use NMI to dump backtrace of blkd-task running on other cpu
Date: Wed, 22 Apr 2026 14:45:36 +0800
Message-ID: <20260422063525.GA3155@Jiazi.Li>
In-Reply-To: <7a46efa3-3d11-45df-8d73-5af8b87a4b84@paulmck-laptop>

On Tue, Apr 21, 2026 at 04:20:54PM -0700, Paul E. McKenney wrote:
> On Fri, Apr 17, 2026 at 09:38:13AM +0800, Jiazi Li wrote:
> > sched_show_task cannot dump backtrace of blkd-task running on other
> > cpu:
> > [117421.286553][    C0] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > [117421.286579][    C0] rcu:    Tasks blocked on level-0 rcu_node (CPUs 0-7): P2280
> > [117421.286595][    C0] rcu:    (detected by 0, t=5252 jiffies, g=751845, q=66318 ncpus=8)
> > [117421.286604][    C0] task:android.imms2   state:R  running task     stack:0     ...
> > [117421.286617][    C0] Call trace:
> > [117421.286622][    C0]  __switch_to+0x1a0/0x318
> > [117421.286636][    C0]  0x0
> > 
> > So use NMI to dump backtrace:
> > [  390.584143] rcub/0: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> > [  390.585156] rcub/0: rcu:     Tasks blocked on level-0 rcu_node (CPUs 0-7): P6816
> > [  390.586207] rcub/0: rcu:     (detected by 5, t=52532 jiffies, g=7405, q=63942 ncpus=8)
> > [  390.587320] rcub/0: Sending NMI from CPU 5 to CPUs 4:
> > [  390.588111] rcu_stall_threa: NMI backtrace for cpu 4
> > [  390.588116] rcu_stall_threa: CPU: 4 UID: 0 PID: 6816 Comm: rcu_stall_threa Tainted: P...
> > [  390.588120] rcu_stall_threa: Tainted: [P]=PROPRIETARY_MODULE, [W]=WARN, [O]=OOT_MODULE
> > [  390.588122] rcu_stall_threa: Hardware name: MT6858 (DT)
> > [  390.588123] rcu_stall_threa: pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [  390.588125] rcu_stall_threa: pc : _raw_spin_unlock_irqrestore+0x1c/0x44
> > [  390.588131] rcu_stall_threa: lr : ___ratelimit+0xd4/0x110
> > [  390.588134] rcu_stall_threa: sp : ffffffc08464bdf0
> > [  390.588135] rcu_stall_threa: x29: ffffffc08464bdf0 x28: 0000000000000000 x27: 0000000000000000
> > [  390.588138] rcu_stall_threa: x26: 0000000000000000 x25: 0000000000000000 x24: 00000000000004e2
> > [  390.588140] rcu_stall_threa: x23: ffffffd82ae77000 x22: ffffffd82af1fae8 x21: 000000000000000a
> > [  390.588142] rcu_stall_threa: x20: 0000000000000000 x19: 0000000000000000 x18: ffffffc08456d020
> > [  390.588144] rcu_stall_threa: x17: 000000008c623181 x16: 000000008c623181 x15: 0000000000000010
> > [  390.588146] rcu_stall_threa: x14: 0000000000000100 x13: ffffffc084648000 x12: ffffffc08464c000
> > [  390.588148] rcu_stall_threa: x11: 5e2da9f91a08d800 x10: ffffffd8299b39fc x9 : 0000000100005874
> > [  390.588150] rcu_stall_threa: x8 : 0000000000000000 x7 : 0000000000000001 x6 : fffffffebea2b0a0
> > [  390.588152] rcu_stall_threa: x5 : 0000000000000000 x4 : 0000000000000402 x3 : 0000000000000000
> > [  390.588154] rcu_stall_threa: x2 : ffffff81ca8d9680 x1 : 0000000000000000 x0 : 0000000000000001
> > [  390.588156] rcu_stall_threa: Call trace:
> > [  390.588157] rcu_stall_threa:  _raw_spin_unlock_irqrestore+0x1c/0x44
> > [  390.588159] rcu_stall_threa:  ___ratelimit+0xd4/0x110
> > [  390.588161] rcu_stall_threa:  rcu_thread_func+0x90/0xa8
> > [  390.588164] rcu_stall_threa:  kthread+0x110/0x1a4
> > [  390.588167] rcu_stall_threa:  ret_from_fork+0x10/0x20
> > 
> > Signed-off-by: Jiazi Li <jqqlijiazi@gmail.com>
> > Tested-by: mingzhu.wang <mingzhu.wang@transsion.com>
> 
> This looks like an arm64 stack trace.  Are there any arm64 systems in
> production that do real NMIs?  (Don't get me wrong, it would be nice if
> there are!)
> 
Since commit 331a1b3a836c ("arm64: smp: Add arch support for backtrace
using pseudo-NMI"), arm64 has used pseudo-NMI for this backtrace. It is
actually an IPI.
> > ---
> >  kernel/rcu/tree_stall.h | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > index b67532cb8770..5806f9a43579 100644
> > --- a/kernel/rcu/tree_stall.h
> > +++ b/kernel/rcu/tree_stall.h
> > @@ -289,7 +289,12 @@ static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
> >  		 * Avoid triggering hard lockup.
> >  		 */
> >  		touch_nmi_watchdog();
> > -		sched_show_task(t);
> > +		if (unlikely(t->on_cpu && t != current) &&
> 
> What if task t blocks or migrates to some other CPU at this point?
> 
Yes, that is indeed a concern. We can identify such cases after the fact
by checking whether the PID reported by RCU matches the PID captured in
the NMI backtrace.
Do you have any suggestions?
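
To make the log-side check concrete, here is a rough userspace sketch
(not kernel code). It assumes the line formats shown in the dumps above
("... rcu_node (CPUs 0-7): P6816" and "... PID: 6816 Comm: ..."). The
helper names are hypothetical. It only detects a mismatch afterwards, it
does not close the race itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pull the PID out of an NMI backtrace header line. */
static bool nmi_line_pid(const char *line, int *pid)
{
	const char *p;

	assert(line != NULL && pid != NULL);
	p = strstr(line, "PID: ");
	return p && sscanf(p, "PID: %d", pid) == 1;
}

/* Hypothetical helper: does the "Tasks blocked" line list P<pid>? */
static bool rcu_line_has_pid(const char *line, int pid)
{
	const char *p;
	int found;

	assert(line != NULL);
	p = strstr(line, "): ");	/* task list follows the CPU range */
	while (p && (p = strchr(p, 'P')) != NULL) {
		if (sscanf(p, "P%d", &found) == 1 && found == pid)
			return true;
		p++;	/* step past this 'P' and keep scanning */
	}
	return false;
}

/* True if the backtrace header belongs to a task RCU reported as blocked. */
static bool backtrace_matches_stall(const char *rcu_line, const char *nmi_line)
{
	int pid;

	return nmi_line_pid(nmi_line, &pid) && rcu_line_has_pid(rcu_line, pid);
}
```

If the task blocked or migrated before the IPI arrived, the backtrace
header would carry a different PID and the check would fail, so the
dump could at least be flagged as unrelated.
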
> > +				trigger_single_cpu_backtrace(task_cpu(t))) {
> > +			/*Successfully triggered remote backtrace*/
> 
> Wouldn't inverting the condition save a couple of lines of code here?
> And make it a bit more straightforward?
> 
> 							Thanx, Paul
> 
Do you mean something like the following code?
		if (!unlikely(t->on_cpu && t != current) ||
			!trigger_single_cpu_backtrace(task_cpu(t)))
			sched_show_task(t);
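
To convince myself that the inverted form behaves the same as the
if/else in the patch, I modeled both in userspace. This is only a
sketch: model_trigger() and the counters are hypothetical stand-ins for
trigger_single_cpu_backtrace() and sched_show_task(), and the unlikely()
hint is left out because it does not change the result.

```c
#include <assert.h>
#include <stdbool.h>

/* Counters standing in for the two dump paths. */
struct dump_stats {
	int remote_backtraces;	/* stand-in for trigger_single_cpu_backtrace() */
	int local_dumps;	/* stand-in for sched_show_task() */
};

/* Records the attempt and returns whether the remote dump was triggered. */
static bool model_trigger(struct dump_stats *s, bool trigger_ok)
{
	s->remote_backtraces++;
	return trigger_ok;
}

/* Shape used in the posted patch: if/else with an empty success branch. */
static void dump_if_else(struct dump_stats *s, bool on_cpu, bool is_current,
			 bool trigger_ok)
{
	if ((on_cpu && !is_current) && model_trigger(s, trigger_ok)) {
		/* Successfully triggered remote backtrace. */
	} else {
		s->local_dumps++;
	}
}

/* Inverted shape: fall back to the local dump in a single branch. */
static void dump_inverted(struct dump_stats *s, bool on_cpu, bool is_current,
			  bool trigger_ok)
{
	if (!(on_cpu && !is_current) || !model_trigger(s, trigger_ok))
		s->local_dumps++;
}

/* True if both shapes make the same calls for all eight input combinations. */
static bool shapes_agree(void)
{
	for (int i = 0; i < 8; i++) {
		bool on_cpu = i & 1, is_current = i & 2, trigger_ok = i & 4;
		struct dump_stats a = { 0, 0 }, b = { 0, 0 };

		dump_if_else(&a, on_cpu, is_current, trigger_ok);
		dump_inverted(&b, on_cpu, is_current, trigger_ok);
		if (a.remote_backtraces != b.remote_backtraces ||
		    a.local_dumps != b.local_dumps)
			return false;
	}
	return true;
}
```

Because of short-circuit evaluation, both shapes only attempt the remote
backtrace when the task is on a CPU and is not current, and both fall
back to the local dump when that attempt reports failure.
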

> > +		} else {
> > +			sched_show_task(t);
> > +		}
> >  	}
> >  	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> >  }
> > -- 
> > 2.49.0
> > 
