All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Boqun Feng <boqun.feng@gmail.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Neeraj Upadhyay <neeraju@codeaurora.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Stable <stable@vger.kernel.org>,
	Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [PATCH 01/13] rcu/nocb: Fix potential missed nocb_timer rearm
Date: Wed, 24 Feb 2021 23:06:06 +0100	[thread overview]
Message-ID: <20210224220606.GA3179@lothringen> (raw)
In-Reply-To: <20210224183709.GI2743@paulmck-ThinkPad-P72>

On Wed, Feb 24, 2021 at 10:37:09AM -0800, Paul E. McKenney wrote:
> On Tue, Feb 23, 2021 at 01:09:59AM +0100, Frederic Weisbecker wrote:
> > Two situations can cause a missed nocb timer rearm:
> > 
> > 1) rdp(CPU A) queues its nocb timer. The grace period elapses before
> >    the timer get a chance to fire. The nocb_gp kthread is awaken by
> >    rdp(CPU B). The nocb_cb kthread for rdp(CPU A) is awaken and
> >    process the callbacks, again before the nocb_timer for CPU A get a
> >    chance to fire. rdp(CPU A) queues a callback and wakes up nocb_gp
> >    kthread, cancelling the pending nocb_timer without resetting the
> >    corresponding nocb_defer_wakeup.
> 
> As discussed offlist, expanding the above scenario results in this
> sequence of steps:
> 
> 1.	There are no callbacks queued for any CPU covered by CPU 0-2's
> 	->nocb_gp_kthread.
> 
> 2.	CPU 0 enqueues its first callback with interrupts disabled, and
> 	thus must defer awakening its ->nocb_gp_kthread.  It therefore
> 	queues its rcu_data structure's ->nocb_timer.
> 
> 3.	CPU 1, which shares the same ->nocb_gp_kthread, also enqueues a
> 	callback, but with interrupts enabled, allowing it to directly
> 	awaken the ->nocb_gp_kthread.
> 
> 4.	The newly awakened ->nocb_gp_kthread associates both CPU 0's
> 	and CPU 1's callbacks with a future grace period and arranges
> 	for that grace period to be started.
> 
> 5.	This ->nocb_gp_kthread goes to sleep waiting for the end of this
> 	future grace period.
> 
> 6.	This grace period elapses before the CPU 0's timer fires.
> 	This is normally improbably given that the timer is set for only
> 	one jiffy, but timers can be delayed.  Besides, it is possible
> 	that kernel was built with CONFIG_RCU_STRICT_GRACE_PERIOD=y.
> 
> 7.	The grace period ends, so rcu_gp_kthread awakens the
> 	->nocb_gp_kthread, which in turn awakens both CPU 0's and
> 	CPU 1's ->nocb_cb_kthread.
> 
> 8.	CPU 0's ->nocb_cb_kthread invokes its callback.
> 
> 9.	Note that neither kthread updated any ->nocb_timer state,
> 	so CPU 0's ->nocb_defer_wakeup is still set to either
> 	RCU_NOCB_WAKE or RCU_NOCB_WAKE_FORCE.
> 
> 10.	CPU 0 enqueues its second callback, again with interrupts
> 	disabled, and thus must again defer awakening its
> 	->nocb_gp_kthread.  However, ->nocb_defer_wakeup prevents
> 	CPU 0 from queueing the timer.

I managed to recollect some pieces of my brain. So keep the above but
let's change the point 10:

10.     CPU 0 enqueues its second callback, this time with interrupts
 	enabled so it can wake directly	->nocb_gp_kthread.
	It does so with calling __wake_nocb_gp() which also cancels the
	pending timer that got queued in step 2. But that doesn't reset
	CPU 0's ->nocb_defer_wakeup which is still set to RCU_NOCB_WAKE.
	So CPU 0's ->nocb_defer_wakeup and CPU 0's ->nocb_timer are now
	desynchronized.

11.	->nocb_gp_kthread associates the callback queued in 10 with a new
	grace period, arrange for it to start and sleeps on it.

12.     The grace period ends, ->nocb_gp_kthread awakens and wakes up
	CPU 0's ->nocb_cb_kthread which invokes the callback queued in 10.

13.	CPU 0 enqueues its third callback, this time with interrupts
	disabled so it tries to queue a deferred wakeup. However
	->nocb_defer_wakeup has a stalled RCU_NOCB_WAKE value which prevents
	the CPU 0's ->nocb_timer, that got cancelled in 10, from being armed.

14.     CPU 0 has its pending callback and it may go unnoticed until
        some other CPU ever wakes up ->nocb_gp_kthread or CPU 0 ever calls
	an explicit deferred wake up caller like idle entry.

I hope I'm not missing something this time...

Thanks.
	

> 
> So far so good, but why isn't the timer still queued from back in step 2?
> What am I missing here?  Either way, could you please update the commit
> logs to tell the full story?  At some later time, you might be very
> happy that you did.  ;-)
> 
> 							Thanx, Paul
> 
> > 2) The "nocb_bypass_timer" ends up calling wake_nocb_gp() which deletes
> >    the pending "nocb_timer" (note they are not the same timers) for the
> >    given rdp without resetting the matching state stored in nocb_defer
> >    wakeup.
> > 
> > On both situations, a future call_rcu() on that rdp may be fooled and
> > think the timer is armed when it's not, missing a deferred nocb_gp
> > wakeup.
> > 
> > Case 1) is very unlikely due to timing constraint (the timer fires after
> > 1 jiffy) but still possible in theory. Case 2) is more likely to happen.
> > But in any case such scenario require the CPU to spend a long time
> > within a kernel thread without exiting to idle or user space, which is
> > a pretty exotic behaviour.
> > 
> > Fix this with resetting rdp->nocb_defer_wakeup everytime we disarm the
> > timer.
> > 
> > Fixes: d1b222c6be1f (rcu/nocb: Add bypass callback queueing)
> > Cc: Stable <stable@vger.kernel.org>
> > Cc: Josh Triplett <josh@joshtriplett.org>
> > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
> > Cc: Boqun Feng <boqun.feng@gmail.com>
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >  kernel/rcu/tree_plugin.h | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 2ec9d7f55f99..dd0dc66c282d 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -1720,7 +1720,11 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
> >  		rcu_nocb_unlock_irqrestore(rdp, flags);
> >  		return false;
> >  	}
> > -	del_timer(&rdp->nocb_timer);
> > +
> > +	if (READ_ONCE(rdp->nocb_defer_wakeup) > RCU_NOCB_WAKE_NOT) {
> > +		WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
> > +		del_timer(&rdp->nocb_timer);
> > +	}
> >  	rcu_nocb_unlock_irqrestore(rdp, flags);
> >  	raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
> >  	if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) {
> > @@ -2349,7 +2353,6 @@ static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
> >  		return false;
> >  	}
> >  	ndw = READ_ONCE(rdp->nocb_defer_wakeup);
> > -	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
> >  	ret = wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
> >  	trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake"));
> >  
> > -- 
> > 2.25.1
> > 

  reply	other threads:[~2021-02-24 22:07 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-23  0:09 [PATCH 00/13] rcu/nocb updates v2 Frederic Weisbecker
2021-02-23  0:09 ` [PATCH 01/13] rcu/nocb: Fix potential missed nocb_timer rearm Frederic Weisbecker
2021-02-24 18:37   ` Paul E. McKenney
2021-02-24 22:06     ` Frederic Weisbecker [this message]
2021-02-25  0:14       ` Paul E. McKenney
2021-02-25  0:48         ` Frederic Weisbecker
2021-02-25  1:07           ` Paul E. McKenney
2021-03-02  1:48       ` Paul E. McKenney
2021-03-02 12:34         ` Frederic Weisbecker
2021-03-02 18:17           ` Paul E. McKenney
2021-03-03  1:35             ` Frederic Weisbecker
2021-03-03  2:06               ` Paul E. McKenney
2021-03-03  2:17                 ` Frederic Weisbecker
2021-03-03 11:15             ` Neeraj Upadhyay
2021-02-23  0:10 ` [PATCH 02/13] rcu/nocb: Disable bypass when CPU isn't completely offloaded Frederic Weisbecker
2021-02-23  0:10 ` [PATCH 03/13] rcu/nocb: Remove stale comment above rcu_segcblist_offload() Frederic Weisbecker
2021-02-23  0:10 ` [PATCH 04/13] rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible Frederic Weisbecker
2021-02-23  0:10 ` [PATCH 05/13] rcu/nocb: Merge nocb_timer to the rdp leader Frederic Weisbecker
2021-03-03  1:15   ` [PATCH 05/13] rcu/nocb: Use the rcuog CPU's ->nocb_timer Paul E. McKenney
2021-03-10 22:05     ` Frederic Weisbecker
2021-03-16  0:02       ` Paul E. McKenney
2021-02-23  0:10 ` [PATCH 06/13] timer: Revert "timer: Add timer_curr_running()" Frederic Weisbecker
2021-02-23  0:10 ` [PATCH 07/13] rcu/nocb: Directly call __wake_nocb_gp() from bypass timer Frederic Weisbecker
2021-02-23  0:10 ` [PATCH 08/13] rcu/nocb: Allow de-offloading rdp leader Frederic Weisbecker
2021-02-23  0:10 ` [PATCH 09/13] rcu/nocb: Cancel nocb_timer upon nocb_gp wakeup Frederic Weisbecker
2021-02-23  0:10 ` [PATCH 10/13] rcu/nocb: Delete bypass_timer " Frederic Weisbecker
2021-03-03  1:24   ` Paul E. McKenney
2021-03-10 22:17     ` Frederic Weisbecker
2021-03-15 14:53       ` Boqun Feng
2021-03-15 22:56         ` Frederic Weisbecker
2021-03-16  0:02           ` Paul E. McKenney
2021-02-23  0:10 ` [PATCH 11/13] rcu/nocb: Only cancel nocb timer if not polling Frederic Weisbecker
2021-03-03  1:22   ` Paul E. McKenney
2021-03-10 22:08     ` Frederic Weisbecker
2021-02-23  0:10 ` [PATCH 12/13] rcu/nocb: Prepare for finegrained deferred wakeup Frederic Weisbecker
2021-03-16  3:02   ` Paul E. McKenney
2021-03-16 11:45     ` Frederic Weisbecker
2021-03-16 14:02       ` Paul E. McKenney
2021-02-23  0:10 ` [PATCH 13/13] rcu/nocb: Unify timers Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210224220606.GA3179@lothringen \
    --to=frederic@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neeraju@codeaurora.org \
    --cc=paulmck@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.