Re: [PATCH] rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Frederic Weisbecker <frederic@kernel.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Joel Fernandes <joel@joelfernandes.org>,
	Zqiang <qiang1.zhang@intel.com>,
	quic_neeraju@quicinc.com, rcu@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
Date: Tue, 10 Jan 2023 00:55:51 +0100	[thread overview]
Message-ID: <Y7yph3aMNbxGa7o5@lothringen> (raw)
In-Reply-To: <20230109193226.GX4028633@paulmck-ThinkPad-P17-Gen-1>

On Mon, Jan 09, 2023 at 11:32:26AM -0800, Paul E. McKenney wrote:
> On Mon, Jan 09, 2023 at 12:09:48AM +0100, Frederic Weisbecker wrote:
> > On Sat, Jan 07, 2023 at 09:55:22PM -0500, Joel Fernandes wrote:
> > > 
> > > 
> > > > On Jan 7, 2023, at 9:48 PM, Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > 
> > > > 
> > > >>> On Jan 7, 2023, at 5:11 PM, Frederic Weisbecker <frederic@kernel.org> wrote:
> > > >>> 
> > > >>> On Fri, Jan 06, 2023 at 07:01:28PM -0500, Joel Fernandes wrote:
> > > >>> (lost html content)
> > > > 
> > > > My problem is the iPhone wises up when I put a web link in an email. I want to look into smtp relays but then if I spent time on fixing that, I might not get time to learn from emails like these... 
> > > > 
> > > >> I can't find a place where the exp grace period sends an IPI to
> > > >> CPUs slow to report a QS. But anyway you really need the tick to poll
> > > >> periodically on the CPU to chase a quiescent state.
> > > > 
> > > > Ok.
> > > > 
> > > >> Now arguably it's probably only useful when CONFIG_PREEMPT_COUNT=y
> > > >> and rcu_exp_handler() has interrupted a preempt-disabled or bh-disabled
> > > >> section. Although rcu_exp_handler() sets TIF_RESCHED, which is handled
> > > >> by preempt_enable() and local_bh_enable() when CONFIG_PREEMPT=y.
> > > >> So probably it's only useful when CONFIG_PREEMPT_COUNT=y and CONFIG_PREEMPT=n
> > > >> (and there is also PREEMPT_DYNAMIC to consider).
> > > > 
> > > > Makes sense. I think I was missing this use case and was going by the general design of exp grace periods.  I was incorrectly assuming the IPIs were being sent repeatedly for hold out CPUs, which is not the case I think. But that would another way to fix it?
> > > > 
> > > > But yeah I get your point, the first set of IPIs missed it, so we need the rescue-tick for long non-rcu_read_lock() implicit critical sections.. 
> > > > 
> > > >> If CONFIG_PREEMPT_COUNT=n, the tick can only report idle and user
> > > >> as QS, but those are already reported explicitly on ct_kernel_exit() ->
> > > >> rcu_preempt_deferred_qs().
> > > > 
> > > > Oh hmm, because that function is a NOOP for PREEMPT_COUNT=y and PREEMPT=n and will not report the deferred QS?  Maybe it should then. However I think the tick is still useful if after the preempt disabled section, will still did not exit the kernel.
> > > 
> > > I think meant I here, an atomic section (like bh or Irq disabled). There is no such thing as disabling preemption for CONFIG_PREEMPT=n. Or maybe I am confused again.  This RCU thing…
> > 
> > Right, so when CONFIG_PREEMPT_COUNT=n, there is no way for a tick to tell if the
> > the interrupted code is safely considered as a QS. That's because
> > preempt_disable() <-> preempt_enable() are no-ops so the whole kernel is
> > assumed non-preemptible, and therefore the whole kernel is a READ side critical
> > section, except for the explicit points reporting a QS.
> > 
> > The only exception is when the tick interrupts idle (or user with
> > nohz_full). But we already have an exp QS reported on idle (and user with
> > nohz_full) entry through ct_kernel_exit(), and that happens on all RCU_TREE
> > configs (PREEMPT or not). Therefore the tick doesn't appear to be helpful at
> > all on a nohz_full CPU with CONFIG_PREEMPT_COUNT=n.
> > 
> > I suggest we don't bother optimizing that case though...
> > 
> > To summarize:
> > 
> > 1) nohz_full && !CONFIG_PREEMPT_COUNT && !CONFIG_PREEMPT_RCU:
> >   Tick isn't helpful. It can only report idle/user QS, but that is
> >   already reported explicitly.
> > 
> > 2) nohz_full && CONFIG_PREEMPT_COUNT && !CONFIG_PREEMPT_RCU:
> >   Tick is very helpful because it can tell if the kernel is in
> >   a QS state.
> > 
> > 3) nohz_full && CONFIG_PREEMPT_RCU:
> >    Tick doesn't appear to be helpful because:
> >        - If the rcu_exp_handler() fires in an rcu_read_lock'ed section, then the
> >          exp QS is reported on rcu_read_unlock()
> >        - If the rcu_exp_handler() fires in a preempt/bh disabled section,
> >          TIF_RESCHED is forced which is handled on preempt/bh re-enablement,
> > 	 reporting a QS.
> >    
> >   
> > The case 2) is a niche, only useful for debugging. But anyway I'm not sure it's
> > worth changing/optimizing the current state. Might be worth add a comment
> > though.
> 
> Thank you both for the analysis!  I would welcome a comment.

I'm preparing that.

> One could argue that we should increase the delay before turning the
> tick on, but my experience is that expedited grace periods almost always
> complete in less than a jiffy, so there would almost never be any benefit
> in doing so.  But if some large NO_HZ_FULL system with long RCU readers
> starts having trouble with too-frequent tick enablement, that is one
> possible fix.

And last but not least: wait for anybody to complain before changing anything
;-))

Thanks.

next prev parent reply	other threads:[~2023-01-09 23:56 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-20 14:16 [PATCH] rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check Zqiang
2022-12-21 20:16 ` Paul E. McKenney
2023-01-07 21:39   ` Frederic Weisbecker
2023-01-09 19:28     ` Paul E. McKenney
     [not found]   ` <344A2A3B-628E-467C-BBDF-11C3AB63D432@joelfernandes.org>
2023-01-07 22:11     ` Frederic Weisbecker
2023-01-08  2:48       ` Joel Fernandes
2023-01-08  2:55         ` Joel Fernandes
2023-01-08 23:09           ` Frederic Weisbecker
2023-01-09 19:32             ` Paul E. McKenney
2023-01-09 23:55               ` Frederic Weisbecker [this message]
2023-01-13 19:25                 ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y7yph3aMNbxGa7o5@lothringen \
    --to=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=qiang1.zhang@intel.com \
    --cc=quic_neeraju@quicinc.com \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox