From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
luto@kernel.org, tglx@linutronix.de, fweisbec@gmail.com,
cmetcalf@mellanox.com, mingo@kernel.org
Subject: Re: [PATCH RFC v2 tip/core/rcu] Maintain special bits at bottom of ->dynticks counter
Date: Mon, 13 Feb 2017 09:01:04 -0800 [thread overview]
Message-ID: <20170213170104.GC30506@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170213122115.GO6515@twins.programming.kicks-ass.net>
On Mon, Feb 13, 2017 at 01:21:15PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 09, 2017 at 03:51:03PM -0800, Paul E. McKenney wrote:
> > Currently, IPIs are used to force other CPUs to invalidate their TLBs
> > in response to a kernel virtual-memory mapping change. This works, but
> > degrades both battery lifetime (for idle CPUs) and real-time response
> > (for nohz_full CPUs), and in addition results in unnecessary IPIs due to
> > the fact that CPUs executing in usermode are unaffected by stale kernel
> > mappings. It would be better to cause a CPU executing in usermode to
> > wait until it is entering kernel mode to do the flush, first to avoid
> > interrupting usemode tasks and second to handle multiple flush requests
> > with a single flush in the case of a long-running user task.
> >
> > This commit therefore reserves a bit at the bottom of the ->dynticks
> > counter, which is checked upon exit from extended quiescent states.
> > If it is set, it is cleared and then a new rcu_eqs_special_exit() macro is
> > invoked, which, if not supplied, is an empty single-pass do-while loop.
> > If this bottom bit is set on -entry- to an extended quiescent state,
> > then a WARN_ON_ONCE() triggers.
> >
> > This bottom bit may be set using a new rcu_eqs_special_set() function,
> > which returns true if the bit was set, or false if the CPU turned
> > out to not be in an extended quiescent state. Please note that this
> > function refuses to set the bit for a non-nohz_full CPU when that CPU
> > is executing in usermode because usermode execution is tracked by RCU
> > as a dyntick-idle extended quiescent state only for nohz_full CPUs.
> >
> > Changes since v1: Fix ordering of atomic_and() and the call to
> > rcu_eqs_special_exit() in rcu_dynticks_eqs_exit().
> >
> > Reported-by: Andy Lutomirski <luto@amacapital.net>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> I think I've asked this before, but why does this live in the guts of
> RCU?
>
> Should we lift this state tracking stuff out and make RCU and
> NOHZ(_FULL) users of it, or doesn't that make sense (reason)?
The dyntick-idle stuff is pretty specific to RCU. And what precisely
would be helped by moving it?
But that was an excellent question, as it reminded me of RCU's
dyntick-idle's NMI handling, and I never did ask Andy if it was OK for
rcu_eqs_special_exit() to be invoked when exiting NMI handler, which would
currently happen. It would be easy for me to pass in a flag indicating
whether or not the call is in NMI context, if that is needed.
It is of course not possible to detect this at rcu_eqs_special_set()
time, because rcu_eqs_special_set() has no way of knowing that the next
event that pulls the remote CPU out of idle will be an NMI.
> In any case, small nit below:
>
>
> > + seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks);
> > + WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> > + !(seq & RCU_DYNTICK_CTRL_CTR));
> > + if (seq & RCU_DYNTICK_CTRL_MASK) {
> > + atomic_and(~RCU_DYNTICK_CTRL_MASK, &rdtp->dynticks);
> > + smp_mb__after_atomic(); /* _exit after clearing mask. */
> > + /* Prefer duplicate flushes to losing a flush. */
> > + rcu_eqs_special_exit();
> > + }
>
> we have atomic_andnot() for just these occasions :-)
I suppose that that could generate more efficient code on some
architectures. I have changed this.
> > +/*
> > + * Set the special (bottom) bit of the specified CPU so that it
> > + * will take special action (such as flushing its TLB) on the
> > + * next exit from an extended quiescent state. Returns true if
> > + * the bit was successfully set, or false if the CPU was not in
> > + * an extended quiescent state.
> > + */
> > +bool rcu_eqs_special_set(int cpu)
> > +{
> > + int old;
> > + int new;
> > + struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
> > +
> > + do {
> > + old = atomic_read(&rdtp->dynticks);
> > + if (old & RCU_DYNTICK_CTRL_CTR)
> > + return false;
> > + new = old | RCU_DYNTICK_CTRL_MASK;
> > + } while (atomic_cmpxchg(&rdtp->dynticks, old, new) != old);
> > + return true;
>
> Is that what we call atomic_fetch_or() ?
I don't think so. The above code takes an early exit if the next bit
up is set, which atomic_fetch_or() does not. If the CPU is not in
an extended quiescent state (old & RCU_DYNTICK_CTRL_CTR), then this
code returns false to indicate that TLB shootdown cannot wait.
So it is more like a very specific form of atomic_fetch_or_unless().
Thanx, Paul
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-02-13 17:01 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-02-09 23:51 [PATCH RFC v2 tip/core/rcu] Maintain special bits at bottom of ->dynticks counter Paul E. McKenney
2017-02-13 12:21 ` Peter Zijlstra
2017-02-13 17:01 ` Paul E. McKenney [this message]
2017-02-13 17:57 ` Peter Zijlstra
2017-02-13 18:19 ` Paul E. McKenney
2017-02-13 19:08 ` Andy Lutomirski
2017-02-13 19:52 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170213170104.GC30506@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=cmetcalf@mellanox.com \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).