Re: [PATCH] rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Frederic Weisbecker <frederic@kernel.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Joel Fernandes <joel@joelfernandes.org>,
	Zqiang <qiang1.zhang@intel.com>,
	quic_neeraju@quicinc.com, rcu@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
Date: Tue, 10 Jan 2023 00:55:51 +0100	[thread overview]
Message-ID: <Y7yph3aMNbxGa7o5@lothringen> (raw)
In-Reply-To: <20230109193226.GX4028633@paulmck-ThinkPad-P17-Gen-1>

On Mon, Jan 09, 2023 at 11:32:26AM -0800, Paul E. McKenney wrote:
> On Mon, Jan 09, 2023 at 12:09:48AM +0100, Frederic Weisbecker wrote:
> > On Sat, Jan 07, 2023 at 09:55:22PM -0500, Joel Fernandes wrote:
> > > 
> > > 
> > > > On Jan 7, 2023, at 9:48 PM, Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > 
> > > > 
> > > >>> On Jan 7, 2023, at 5:11 PM, Frederic Weisbecker <frederic@kernel.org> wrote:
> > > >>> 
> > > >>> On Fri, Jan 06, 2023 at 07:01:28PM -0500, Joel Fernandes wrote:
> > > >>> (lost html content)
> > > > 
> > > > My problem is the iPhone wises up when I put a web link in an email. I want to look into smtp relays but then if I spent time on fixing that, I might not get time to learn from emails like these... 
> > > > 
> > > >> I can't find a place where the exp grace period sends an IPI to
> > > >> CPUs slow to report a QS. But anyway you really need the tick to poll
> > > >> periodically on the CPU to chase a quiescent state.
> > > > 
> > > > Ok.
> > > > 
> > > >> Now arguably it's probably only useful when CONFIG_PREEMPT_COUNT=y
> > > >> and rcu_exp_handler() has interrupted a preempt-disabled or bh-disabled
> > > >> section. Although rcu_exp_handler() sets TIF_RESCHED, which is handled
> > > >> by preempt_enable() and local_bh_enable() when CONFIG_PREEMPT=y.
> > > >> So probably it's only useful when CONFIG_PREEMPT_COUNT=y and CONFIG_PREEMPT=n
> > > >> (and there is also PREEMPT_DYNAMIC to consider).
> > > > 
> > > > Makes sense. I think I was missing this use case and was going by the general design of exp grace periods.  I was incorrectly assuming the IPIs were being sent repeatedly for hold out CPUs, which is not the case I think. But that would another way to fix it?
> > > > 
> > > > But yeah I get your point, the first set of IPIs missed it, so we need the rescue-tick for long non-rcu_read_lock() implicit critical sections.. 
> > > > 
> > > >> If CONFIG_PREEMPT_COUNT=n, the tick can only report idle and user
> > > >> as QS, but those are already reported explicitly on ct_kernel_exit() ->
> > > >> rcu_preempt_deferred_qs().
> > > > 
> > > > Oh hmm, because that function is a NOOP for PREEMPT_COUNT=y and PREEMPT=n and will not report the deferred QS?  Maybe it should then. However I think the tick is still useful if after the preempt disabled section, will still did not exit the kernel.
> > > 
> > > I think meant I here, an atomic section (like bh or Irq disabled). There is no such thing as disabling preemption for CONFIG_PREEMPT=n. Or maybe I am confused again.  This RCU thing…
> > 
> > Right, so when CONFIG_PREEMPT_COUNT=n, there is no way for a tick to tell if the
> > the interrupted code is safely considered as a QS. That's because
> > preempt_disable() <-> preempt_enable() are no-ops so the whole kernel is
> > assumed non-preemptible, and therefore the whole kernel is a READ side critical
> > section, except for the explicit points reporting a QS.
> > 
> > The only exception is when the tick interrupts idle (or user with
> > nohz_full). But we already have an exp QS reported on idle (and user with
> > nohz_full) entry through ct_kernel_exit(), and that happens on all RCU_TREE
> > configs (PREEMPT or not). Therefore the tick doesn't appear to be helpful at
> > all on a nohz_full CPU with CONFIG_PREEMPT_COUNT=n.
> > 
> > I suggest we don't bother optimizing that case though...
> > 
> > To summarize:
> > 
> > 1) nohz_full && !CONFIG_PREEMPT_COUNT && !CONFIG_PREEMPT_RCU:
> >   Tick isn't helpful. It can only report idle/user QS, but that is
> >   already reported explicitly.
> > 
> > 2) nohz_full && CONFIG_PREEMPT_COUNT && !CONFIG_PREEMPT_RCU:
> >   Tick is very helpful because it can tell if the kernel is in
> >   a QS state.
> > 
> > 3) nohz_full && CONFIG_PREEMPT_RCU:
> >    Tick doesn't appear to be helpful because:
> >        - If the rcu_exp_handler() fires in an rcu_read_lock'ed section, then the
> >          exp QS is reported on rcu_read_unlock()
> >        - If the rcu_exp_handler() fires in a preempt/bh disabled section,
> >          TIF_RESCHED is forced which is handled on preempt/bh re-enablement,
> > 	 reporting a QS.
> >    
> >   
> > The case 2) is a niche, only useful for debugging. But anyway I'm not sure it's
> > worth changing/optimizing the current state. Might be worth add a comment
> > though.
> 
> Thank you both for the analysis!  I would welcome a comment.

I'm preparing that.

> One could argue that we should increase the delay before turning the
> tick on, but my experience is that expedited grace periods almost always
> complete in less than a jiffy, so there would almost never be any benefit
> in doing so.  But if some large NO_HZ_FULL system with long RCU readers
> starts having trouble with too-frequent tick enablement, that is one
> possible fix.

And last but not least: wait for anybody to complain before changing anything
;-))

Thanks.

next prev parent reply	other threads:[~2023-01-09 23:56 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-20 14:16 [PATCH] rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check Zqiang
2022-12-21 20:16 ` Paul E. McKenney
2023-01-07 21:39   ` Frederic Weisbecker
2023-01-09 19:28     ` Paul E. McKenney
     [not found]   ` <344A2A3B-628E-467C-BBDF-11C3AB63D432@joelfernandes.org>
2023-01-07 22:11     ` Frederic Weisbecker
2023-01-08  2:48       ` Joel Fernandes
2023-01-08  2:55         ` Joel Fernandes
2023-01-08 23:09           ` Frederic Weisbecker
2023-01-09 19:32             ` Paul E. McKenney
2023-01-09 23:55               ` Frederic Weisbecker [this message]
2023-01-13 19:25                 ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y7yph3aMNbxGa7o5@lothringen \
    --to=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=qiang1.zhang@intel.com \
    --cc=quic_neeraju@quicinc.com \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.