From: Frederic Weisbecker <frederic@kernel.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Boqun Feng <boqun.feng@gmail.com>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Neeraj Upadhyay <neeraju@codeaurora.org>,
Josh Triplett <josh@joshtriplett.org>,
Stable <stable@vger.kernel.org>,
Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [PATCH 01/13] rcu/nocb: Fix potential missed nocb_timer rearm
Date: Tue, 2 Mar 2021 13:34:44 +0100
Message-ID: <20210302123444.GA97498@lothringen>
In-Reply-To: <20210302014829.GK2696@paulmck-ThinkPad-P72>

On Mon, Mar 01, 2021 at 05:48:29PM -0800, Paul E. McKenney wrote:
> On Wed, Feb 24, 2021 at 11:06:06PM +0100, Frederic Weisbecker wrote:
> > On Wed, Feb 24, 2021 at 10:37:09AM -0800, Paul E. McKenney wrote:
> > > On Tue, Feb 23, 2021 at 01:09:59AM +0100, Frederic Weisbecker wrote:
> > > > Two situations can cause a missed nocb timer rearm:
> > > >
> > > > 1) rdp(CPU A) queues its nocb timer. The grace period elapses before
> > > > the timer gets a chance to fire. The nocb_gp kthread is awakened by
> > > > rdp(CPU B). The nocb_cb kthread for rdp(CPU A) is awakened and
> > > > processes the callbacks, again before the nocb_timer for CPU A gets a
> > > > chance to fire. rdp(CPU A) queues a callback and wakes up nocb_gp
> > > > kthread, cancelling the pending nocb_timer without resetting the
> > > > corresponding nocb_defer_wakeup.
> > >
> > > As discussed offlist, expanding the above scenario results in this
> > > sequence of steps:
>
> I renumbered the CPUs, since the ->nocb_gp_kthread would normally be
> associated with CPU 0. If the first CPU to enqueue a callback was also
> CPU 0, nocb_gp_wait() might clear that CPU's ->nocb_defer_wakeup, which
> would prevent this scenario from playing out. (But admittedly only if
> some other CPU handled by this same ->nocb_gp_kthread used its bypass.)

Ok good point.
>
> > > 1. There are no callbacks queued for any CPU covered by CPU 0-2's
> > > ->nocb_gp_kthread.
>
> And ->nocb_gp_kthread is associated with CPU 0.
>
> > > 2. CPU 1 enqueues its first callback with interrupts disabled, and
> > > thus must defer awakening its ->nocb_gp_kthread. It therefore
> > > queues its rcu_data structure's ->nocb_timer.
>
> At this point, CPU 1's rdp->nocb_defer_wakeup is RCU_NOCB_WAKE.

Right.

> > > 7. The grace period ends, so rcu_gp_kthread awakens the
> > > ->nocb_gp_kthread, which in turn awakens both CPU 1's and
> > > CPU 2's ->nocb_cb_kthread.
>
> And then ->nocb_cb_kthread sleeps waiting for more callbacks.

Yep

> > I managed to recollect some pieces of my brain. So keep the above but
> > let's change the point 10:
> >
> > 10. CPU 1 enqueues its second callback, this time with interrupts
> > enabled so it can directly wake ->nocb_gp_kthread.
> > It does so by calling __wake_nocb_gp() which also cancels the
>
> wake_nocb_gp() in current -rcu, correct?

Heh, right.

> > > So far so good, but why isn't the timer still queued from back in step 2?
> > > What am I missing here? Either way, could you please update the commit
> > > logs to tell the full story? At some later time, you might be very
> > > happy that you did. ;-)
> > >
> > > > 2) The "nocb_bypass_timer" ends up calling wake_nocb_gp() which deletes
> > > > the pending "nocb_timer" (note they are not the same timers) for the
> > > > given rdp without resetting the matching state stored in
> > > > nocb_defer_wakeup.
>
> Would you like to similarly expand this one, or would you prefer to rest your
> case on Case 1) above?

I was about to say that we can skip that one since the changelog will already
be big enough, but the "Fixes:" tag refers to the second scenario, since it's
the oldest vulnerable commit AFAICS.

> > > > Fixes: d1b222c6be1f (rcu/nocb: Add bypass callback queueing)
Thanks.
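
For readers following the scenario, the stale-state problem discussed above
can be sketched as a small userspace model. This is only an illustration
under stated assumptions, not the kernel code: struct rdp_model, the helper
names, and the "fixed" switch are hypothetical, and the rule that a deferred
wakeup arms the timer only when no deferred wakeup is already recorded is an
assumption taken from the "missed rearm" wording of the patch subject.

```c
#include <stdbool.h>

/* Hypothetical userspace model; names echo the thread but this is not kernel code. */
enum defer_state { WAKE_NOT = 0, WAKE = 1 };

struct rdp_model {
	enum defer_state nocb_defer_wakeup; /* models rdp->nocb_defer_wakeup */
	bool timer_pending;                 /* models a queued ->nocb_timer */
	int gp_wakeups;                     /* wakeups delivered to nocb_gp */
};

/* Deferred wakeup (interrupts disabled): assumed to arm the timer only
 * when no deferred wakeup is currently recorded. */
static void defer_wakeup(struct rdp_model *rdp)
{
	if (rdp->nocb_defer_wakeup == WAKE_NOT)
		rdp->timer_pending = true;
	rdp->nocb_defer_wakeup = WAKE;
}

/* Direct wakeup (interrupts enabled): cancels a pending timer. With
 * fixed == false the recorded state is left stale, as in the report. */
static void direct_wakeup(struct rdp_model *rdp, bool fixed)
{
	if (rdp->timer_pending) {
		rdp->timer_pending = false;
		if (fixed)
			rdp->nocb_defer_wakeup = WAKE_NOT;
	}
	rdp->gp_wakeups++;
}

/* Timer handler: does the deferred wakeup if the timer is still queued. */
static void timer_fire(struct rdp_model *rdp)
{
	if (!rdp->timer_pending)
		return;
	rdp->timer_pending = false;
	rdp->nocb_defer_wakeup = WAKE_NOT;
	rdp->gp_wakeups++;
}

/* Returns true if a later deferred wakeup never reaches nocb_gp. */
bool wakeup_is_lost(bool fixed)
{
	struct rdp_model rdp = { WAKE_NOT, false, 0 };
	int before;

	defer_wakeup(&rdp);         /* like step 2: timer queued */
	direct_wakeup(&rdp, fixed); /* like step 10: timer cancelled */
	before = rdp.gp_wakeups;
	defer_wakeup(&rdp);         /* later enqueue with interrupts disabled */
	timer_fire(&rdp);           /* fires only if the timer was rearmed */
	return rdp.gp_wakeups == before;
}
```

In this model, wakeup_is_lost(false) reports a lost wakeup because the stale
WAKE state suppresses the rearm, while wakeup_is_lost(true) does not, since
the state is reset together with the timer cancellation.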