From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: frederic@kernel.org, neeraj.iitr10@gmail.com, urezki@gmail.com,
joelagnelf@nvidia.com, boqun.feng@gmail.com, rcu@vger.kernel.org,
Kumar Kartikeya Dwivedi <memxor@gmail.com>
Subject: Re: Next-level bug in SRCU implementation of RCU Tasks Trace + PREEMPT_RT
Date: Wed, 18 Mar 2026 11:50:58 +0100
Message-ID: <20260318105058.j2aKncBU@linutronix.de>
In-Reply-To: <fe28d664-3872-40f6-83c6-818627ad5b7d@paulmck-laptop>
On 2026-03-17 06:34:26 [-0700], Paul E. McKenney wrote:
> Hello!
Hi,
> Kumar Kartikeya Dwivedi (CCed) privately reported a bug in
> my implementation of the RCU Tasks Trace API in terms of SRCU-fast.
> You see, I forgot to ask what contexts call_rcu_tasks_trace() is called
> from, and it turns out that it can in fact be called with the scheduler
> pi/rq locks held. This results in a deadlock when SRCU-fast invokes the
> scheduler in order to start the SRCU-fast grace period. So RCU needs
> a fix to my fix found here:
>
> b540c63cf6e5 ("srcu: Use raw spinlocks so call_srcu() can be used under preempt_disable()")
I can't find it. I looked in next and the rcu tree.
> Sebastian, the PREEMPT_RT aspect is that lockdep does not complain
> about acquisition of non-raw spinlocks from preemption-disabled regions
> of code. This might be intentional, for example, there might be large
> bodies of Linux-kernel code that frequently acquire non-raw spinlocks
> from preemption-disabled regions of code, but which are never part of
> PREEMPT_RT kernels. Otherwise, it might be good for lockdep to diagnose
> this sort of thing.
The point is you don't know where this preempt_disable() is coming from
on !RT. It might be part of spinlock_t, or it might be explicit. We only
have the might_sleep() on PREEMPT_RT.
To catch this we would have to iterate over all held locks, compare the
expected preemption level with the current one, and account for possible
corner cases, such as the level being one higher in IRQ context, and so on…
However, if you hold a raw_spinlock_t (such as rq/pi) and then ask
for a spinlock_t, lockdep should respond with a
| BUG: Invalid wait context
report.
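To illustrate the kind of check behind that report, here is a minimal
userspace sketch. It is a toy model, not lockdep's code: the names
toy_task, toy_current_context() and toy_acquire() are hypothetical, and the
three wait types are only loosely modelled on lockdep's. It assumes the
rule is "a lock may not be acquired if its wait type is less restrictive
than the most restrictive lock already held", and it ignores IRQ context
and any config dependence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified wait types, ordered from most to least restrictive.
 * Hypothetical names for a userspace toy, not kernel definitions. */
enum wait_type {
	WAIT_SPIN = 1,	/* models raw_spinlock_t: always spins */
	WAIT_CONFIG,	/* models spinlock_t: may sleep on PREEMPT_RT */
	WAIT_SLEEP,	/* models mutex-like locks: may always sleep */
};

#define MAX_HELD 8

struct toy_task {
	enum wait_type held[MAX_HELD];	/* wait types of the held locks */
	size_t nr_held;
};

/* Most restrictive wait type among the held locks; WAIT_SLEEP if none. */
static enum wait_type toy_current_context(const struct toy_task *t)
{
	enum wait_type ctx = WAIT_SLEEP;

	for (size_t i = 0; i < t->nr_held; i++)
		if (t->held[i] < ctx)
			ctx = t->held[i];
	return ctx;
}

/* Returns true and records the lock if the acquisition is allowed.
 * Returns false where the toy model would flag an invalid wait context. */
static bool toy_acquire(struct toy_task *t, enum wait_type next)
{
	assert(t->nr_held < MAX_HELD);
	if (next > toy_current_context(t))
		return false;
	t->held[t->nr_held++] = next;
	return true;
}
```

In this model, a task that holds a WAIT_SPIN lock and then calls
toy_acquire() with WAIT_CONFIG gets false back, which corresponds to the
raw_spinlock_t followed by spinlock_t case above. The reverse order
(WAIT_CONFIG held, then WAIT_SPIN) is accepted.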
> Back to the actual bug, that call_srcu() now needs to tolerate being called
> with scheduler rq/pi locks held...
This is because it is called from sched_ext BPF callbacks?
> The straightforward (but perhaps broken) way to resolve this is to make
> srcu_gp_start_if_needed() defer invoking the scheduler, similar to the
Quick question. If srcu_gp_start_if_needed() can be invoked from a
preempt-disabled section (due to rq/pi lock) then
spin_lock_irqsave_sdp_contention(sdp, &flags);
does not work, right?
> way that vanilla RCU's call_rcu_core() function takes an early exit if
> interrupts are disabled. Of course, vanilla RCU can rely on things like
> the scheduling-clock interrupt to start any needed grace periods [1],
> but SRCU will instead need to manually defer this work, perhaps using
> workqueues or IRQ work.
>
> In addition, rcutorture needs to be upgraded to sometimes invoke
> ->call() with the scheduler pi lock held, but this change is not fixing
> a regression, so could be deferred. (There is already code in rcutorture
> that invokes the readers while holding a scheduler pi lock.)
>
> Given that RCU for this week through the end of March belongs to you guys,
> if one of you can get this done by end of day Thursday, London time,
> very good! Otherwise, I can put something together.
>
> Please let me know!
Given that the current locking does allow it and lockdep should have
complained, I am curious if we could rule that out ;)
>
> Thanx, Paul [2]
>
> [1] The exceptions to this rule being handled by the call to
> invoke_rcu_core() when rcu_is_watching() returns false.
>
> [2] Ah, and should vanilla RCU's call_rcu() be invokable from NMI
> handlers? Or should there be a call_rcu_nmi() for this purpose?
> Or should we continue to have its callers check in_nmi() when needed?
Did someone ask for this?
Sebastian