public inbox for rcu@vger.kernel.org
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: Boqun Feng <boqun@kernel.org>
Cc: Joel Fernandes <joelagnelf@nvidia.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	 Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	frederic@kernel.org, neeraj.iitr10@gmail.com,  urezki@gmail.com,
	boqun.feng@gmail.com, rcu@vger.kernel.org,
	 Tejun Heo <tj@kernel.org>,
	bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	John Fastabend <john.fastabend@gmail.com>,
	 Song Liu <song@kernel.org>,
	stable@kernel.org
Subject: Re: [RFC PATCH] rcu-tasks: Avoid using mod_timer() in call_rcu_tasks_generic()
Date: Mon, 23 Mar 2026 22:50:18 +0100
Message-ID: <CAP01T761Y4AMau-4OyuvgSCNMHRD-3YW6+8JDYFxm0Nf9ycwaw@mail.gmail.com>
In-Reply-To: <acFZem9MNHRXin2h@tardis.local>

On Mon, 23 Mar 2026 at 16:17, Boqun Feng <boqun@kernel.org> wrote:
>
> On Sat, Mar 21, 2026 at 10:03:21AM -0700, Boqun Feng wrote:
> > The following deadlock is possible:
> >
> > __mod_timer()
> >   lock_timer_base()
> >     raw_spin_lock_irqsave(&base->lock)   <- base->lock ACQUIRED
> >   trace_timer_start()                    <- tp_btf/timer_start fires here
> >     [probe_timer_start BPF program]
> >       bpf_task_storage_delete()
> >         bpf_selem_unlink(selem, false)   <- reuse_now=false
> >           bpf_selem_free(false)
> >             call_rcu_tasks_trace()
> >               call_rcu_tasks_generic()
> >                 raw_spin_trylock(rtpcp)  <- succeeds (different lock)
> >                 mod_timer(lazy_timer)    <- lazy_timer is on this CPU's base
> >                   lock_timer_base()
> >                     raw_spin_lock_irqsave(&base->lock) <- SAME LOCK -> DEADLOCK
> >
> > because BPF can instrument a place while the timer base lock is held.
> > Fix it by using an intermediate irq_work.
> >
> > Further, because a "timer base->lock" to "rtpcp lock" lock dependency
> > can be established in this way, we cannot call mod_timer() with an
> > rtpcp lock held. Fix that as well.
> >
> > Fixes: d119357d0743 ("rcu-tasks: Treat only synchronous grace periods urgently")
> > Cc: stable@kernel.org
> > Signed-off-by: Boqun Feng <boqun@kernel.org>
> > ---
> > This is a follow-up of [1], and yes, we can easily trigger a
> > whole-system deadlock (freeze) by tracing the timer_start tracepoint,
> > even non-recursively. I have a reproducer at:
> >
> >       https://github.com/fbq/rcu_tasks_deadlock
> >
> > Be very careful, since it'll freeze your system when you run it.
> >
> > I've tested it on 6.17 and 6.19 and can confirm the deadlock could be
> > triggered. So this is an old bug if it's a bug.
> >
> > It's up to BPF whether this is a bug or not, because it has existed for
> > a while and nobody seems to get hurt(?).
> >
>
> Ping BPF ;-) I know this was sent on a Saturday and only 2 days have
> passed, but we are at the decision point about how hard/urgently we
> should fix these "BPF deadlocks": As this patch shows, the deadlocks
> existed before v7.0 (i.e. before SRCU switches). And yes, ideally we
> should fix all of them, but given we are close to the v7.0 release, I
> would like to focus on the new issue that SRCU introduces, because that
> one would likely affect SCHED_EXT. Thoughts?

I tried both of your changes, thanks for working on these. I agree the
one reported by Andrea is more important, so that should be the first
priority and should be sent as a fix for 7.0.

It would be good to land the timer-related fix too, but it doesn't
need to be rushed for 7.0. We're aware of the timer tracepoint issue
that this patch fixes [0]. It's a corner case that no one has hit so
far. We already made similar fixes where we could, e.g. [1], but it's
difficult to make a similar change for local storage.

Given Paul said he plans to look into call_{s,}rcu_nolock() anyway,
the guard can be dropped once call_{s,}rcu_nolock() materializes.

Something like what you did in this patch would be a prerequisite for
call_{s,}rcu_nolock() anyway. The assumption will be that it can be
invoked from NMI, so reentrancy can happen anywhere. It then doesn't
matter much whether the reentrancy happens in the same context while
the lock is held and we end up invoking the same path through
call_srcu(), or when an NMI prog interrupts while the timer lock is
held.
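
To illustrate the reentrancy pattern from the quoted trace and the
deferral idea, here is a minimal userspace sketch in C. It is not
kernel code and not the patch itself: every toy_* name is hypothetical,
the lock is a plain flag that counts a re-acquisition attempt instead
of spinning forever, and the irq_work is modeled as a pending flag that
is drained only after the lock has been dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a timer base guarded by a non-recursive lock. */
struct toy_timer_base {
	bool locked;	/* models base->lock being held */
	int lazy_armed;	/* how many times the lazy timer was armed */
};

/* Hypothetical stand-in for an irq_work item: only a pending flag. */
struct toy_irq_work {
	bool pending;
};

static struct toy_timer_base base;
static struct toy_irq_work lazy_work;
static int deadlocks_detected;

/* A real raw spinlock would spin forever on re-acquisition; we count it. */
static bool toy_lock(struct toy_timer_base *b)
{
	if (b->locked) {
		deadlocks_detected++;
		return false;
	}
	b->locked = true;
	return true;
}

static void toy_unlock(struct toy_timer_base *b)
{
	assert(b->locked);
	b->locked = false;
}

/* Models mod_timer(lazy_timer): it needs the base lock. */
static void toy_mod_timer(struct toy_timer_base *b)
{
	if (!toy_lock(b))
		return;		/* would have been a self-deadlock */
	b->lazy_armed++;
	toy_unlock(b);
}

/* Callback-queueing path that arms the timer directly (the buggy shape). */
static void toy_enqueue_direct(void)
{
	toy_mod_timer(&base);
}

/* Callback-queueing path that only marks deferred work as pending. */
static void toy_enqueue_deferred(void)
{
	lazy_work.pending = true;
}

/* Models the deferred work running later, outside the locked region. */
static void toy_run_irq_work(void)
{
	if (!lazy_work.pending)
		return;
	lazy_work.pending = false;
	toy_mod_timer(&base);
}

/* Models __mod_timer(): a tracepoint probe runs while the lock is held. */
static void toy_timer_start_with_probe(void (*probe)(void))
{
	bool ok = toy_lock(&base);

	assert(ok);
	(void)ok;
	probe();		/* e.g. a BPF program freeing local storage */
	toy_unlock(&base);
	toy_run_irq_work();	/* deferred work runs only after unlock */
}
```

Calling toy_timer_start_with_probe(toy_enqueue_direct) records one
detected self-deadlock and never arms the lazy timer, while calling it
with toy_enqueue_deferred arms the timer once without touching the held
lock. The same shape is what makes the path tolerant of reentry from
any context, at the cost of one extra hop before the timer is armed.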

  [0]: https://lore.kernel.org/bpf/CAP01T76xUCrDH4G2XikNvhPTn6ZbNTgQH59qt2Q_o0c9uudd8w@mail.gmail.com
  [1]: https://lore.kernel.org/bpf/20260204055147.54960-2-alexei.starovoitov@gmail.com

>
> [...]
