From: paulmck@linux.vnet.ibm.com (Paul E. McKenney)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH RFC idle 2/3] arm: Avoid invoking RCU when CPU is idle
Date: Thu, 2 Feb 2012 11:07:08 -0800 [thread overview]
Message-ID: <20120202190708.GE2518@linux.vnet.ibm.com> (raw)
In-Reply-To: <alpine.LFD.2.02.1202021257340.2759@xanadu.home>
On Thu, Feb 02, 2012 at 01:31:28PM -0500, Nicolas Pitre wrote:
> On Thu, 2 Feb 2012, Paul E. McKenney wrote:
>
> > On Thu, Feb 02, 2012 at 12:13:00PM -0500, Nicolas Pitre wrote:
> > > I would have to know more about what the rcu_idle_*() calls imply before
> > > suggesting an alternative.
> >
> > After a call to rcu_idle_enter(), RCU ignores the CPU completely until the
> > next call to rcu_idle_exit(). This means that in the interval between
> > rcu_idle_enter() and rcu_idle_exit(), you can say rcu_read_lock() all
> > you want, but your data structures will get absolutely no RCU protection.
> > This will result in random memory corruption, and for all I know might
> > already be resultin in random memory corruption. So a fix is required.
> >
> > So why does RCU need to ignore idle CPUs? Because otherwise, RCU needs
> > to periodically wake them up to check their status. Waking them up
> > periodically defeats CONFIG_NO_HZ's attempts to conserve power. Avoiding
> > waking them up and avoiding ignoring them extended RCU grace periods,
> > eventually OOMing the systems.
> >
> > Why this new imposition? It is not new. It has always been illegal to
> > use RCU in the idle loop from the beginning. But people happily ignored
> > that restriction, partly because it is not always immediately obvious
> > what is using RCU. Event tracing does, but unless you know that...
>
> Not having the slightest idea about the tracing context, I'd simply
> suggest flaming the people who ignored the restriction harder. Or turn
> that into a BUG() to get their attention.
Done in my tree. But it screams enough that effort to get it fixed
are in order.
> > So we added diagnostics to check for illegal uses of RCU in the idle
> > loop, and also separated rcu_idle_enter() and rcu_idle_exit() from
> > tick_nohz_stop_sched_tick(), so that things like idle notifiers can work.
> > The diagnostics promptly located all sorts of problems, including these.
>
> Good.
>
> > The two options I see are:
> >
> > 1. Rip tracing out of the inner idle loops and everything that
> > they invoke.
>
> What I suggested above. But as I said I know sh*t about that tracing
> implementation so that's an easy suggestion for me to make.
Works for me as well. ;-)
> > 2. Push rcu_idle_enter() and rcu_idle_exit() further down into
> > the idle loops. As you say, -all- of them.
> >
> > My patch set was an attempt at #2, but clearly a very poorly conceived
> > attempt. But I had to start somewhere.
>
> I'm pretty opposed to #2 in any case. Spreading knowledge about generic
> kernel infrastructure requirements across 53 or so different ARM
> subarchitectures is what has put ARM in trouble in the past, as core
> kernel folks called it a maintenance hell. Just imagine that you do put
> your rcu_idle_enter() call in those subarchs, and a year from now you
> need to modify its semantics. At that point you would have to audit all
> those 53 subarchs which might have evolved their idle support in the
> mean time, and the new ones that might have appeared with buggy usage of
> rcu_idle_enter() just because they copied it from somewhere else without
> thinking it through.
Fair point!
> Now we're working very hard to kill that trend and move things in the
> other direction so to keep only minimal and very platform specific
> knowledge in the subarch code. If rcu were to push its requirements
> down there again instead of keeping things abstracted in one place
> higher the stack then that would be a big step backward.
Again, fair point!
> > Other ideas?
>
> Have a special tracing API just for the idle code that queues its events
> in a per-CPU buffer (there should not be that many events to trace in
> the idle code) and have rcu_idle_exit() flush that buffer back in the
> actual tracing infrastructure. Maybe rcu_idle_enter() could set things
> in a way so the regular tracing API could still be used regardless.
> This has the advantage of keeping the knowledge about rcu restrictions
> in a central place that can be much more easily maintained.
Hmmm... Worth some thought. Though the delay in trace messages might
be a bit disconcerting. Probably better than random memory corruption,
though.
Thanx, Paul
next prev parent reply other threads:[~2012-02-02 19:07 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20120202004253.GA10946@linux.vnet.ibm.com>
[not found] ` <1328143404-11038-1-git-send-email-paulmck@linux.vnet.ibm.com>
2012-02-02 0:43 ` [PATCH RFC idle 2/3] arm: Avoid invoking RCU when CPU is idle Paul E. McKenney
2012-02-02 2:48 ` Rob Herring
2012-02-02 4:40 ` Paul E. McKenney
2012-02-02 3:49 ` Nicolas Pitre
2012-02-02 4:44 ` Paul E. McKenney
2012-02-02 17:13 ` Nicolas Pitre
2012-02-02 17:43 ` Paul E. McKenney
2012-02-02 18:31 ` Nicolas Pitre
2012-02-02 19:07 ` Paul E. McKenney [this message]
2012-02-02 22:20 ` Kevin Hilman
2012-02-02 22:49 ` Rob Herring
2012-02-02 23:03 ` Steven Rostedt
2012-02-02 23:27 ` Paul E. McKenney
2012-02-02 23:51 ` Paul E. McKenney
2012-02-03 2:45 ` Steven Rostedt
2012-02-03 6:04 ` Paul E. McKenney
2012-02-03 18:55 ` Steven Rostedt
2012-02-03 19:40 ` Paul E. McKenney
2012-02-03 20:02 ` Steven Rostedt
2012-02-03 20:23 ` Paul E. McKenney
2012-02-06 21:18 ` [PATCH][RFC] tracing/rcu: Add trace_##name##__rcuidle() static tracepoint for inside rcu_idle_exit() sections Steven Rostedt
2012-02-06 23:38 ` Paul E. McKenney
2012-02-07 12:32 ` Steven Rostedt
2012-02-07 14:11 ` Paul E. McKenney
2012-02-08 13:57 ` Frederic Weisbecker
2012-02-07 14:40 ` Josh Triplett
[not found] ` <20120206220502.GA21340@leaf>
2012-02-07 0:36 ` Steven Rostedt
[not found] ` <20120203025350.GF13456@leaf>
2012-02-03 6:06 ` [PATCH RFC idle 2/3] arm: Avoid invoking RCU when CPU is idle Paul E. McKenney
2012-02-02 23:39 ` Rob Herring
2012-02-03 18:41 ` Kevin Hilman
2012-02-03 19:26 ` Paul E. McKenney
2012-02-03 19:36 ` Steven Rostedt
2012-02-04 14:21 ` Paul E. McKenney
2012-02-06 19:32 ` Steven Rostedt
2012-02-02 23:03 ` Paul E. McKenney
2012-02-03 19:12 ` Kevin Hilman
2012-02-03 19:26 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120202190708.GE2518@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).