From: paulmck@linux.vnet.ibm.com (Paul E. McKenney)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH RFC idle 2/3] arm: Avoid invoking RCU when CPU is idle
Date: Sat, 4 Feb 2012 06:21:23 -0800
Message-ID: <20120204142123.GA14901@linux.vnet.ibm.com>
In-Reply-To: <1328297787.5882.203.camel@gandalf.stny.rr.com>

On Fri, Feb 03, 2012 at 02:36:27PM -0500, Steven Rostedt wrote:
> On Fri, 2012-02-03 at 10:41 -0800, Kevin Hilman wrote:
> 
> > > How is it a step backwards if it is already broken?
> > 
> > Well, I didn't know it was broken. ;) And, as Paul mentioned, this has
> > been broken for a long time. Apparently it's been working well enough
> > for nobody to notice until recently.
> > 
> > > Obviously you haven't actually used any tracing here because it
> > > doesn't work right with things as is.
> > 
> > It's been working well enough for me to debug several idle path problems
> > with tracing.  Admittedly, this has been primarily on UP systems, but
> > I've recently started using the tracing on SMP as well.  (however, due
> > to "coupled" low-power states on OMAP, large parts of the idle path are
> > effectively UP since CPU0 has to wait for CPU1 to hit a low-power
> > state before it can.)
> 
> It's used by all users of powertop, and we haven't heard about a bug
> yet. This doesn't mean that the bug doesn't exist. The race is extremely
> hard to hit. It's one of those "good bugs". You know, the kind that you
> don't really have to worry about because you are more likely to win the
> lottery, become President of the United States, and find a cure for
> cancer (all those together, not just one) than the chance of hitting
> this bug. But it's a bug regardless and should, unfortunately, be fixed.
> 
> But here's the explanation of the bug:
> 
> As Paul has stated, when rcu_idle_enter() is in effect, the calls to
> rcu_read_lock_* are ignored. Thus we can pretend they don't exist.
> 
> The code in question is the __DO_TRACE() in include/linux/tracepoint.h:
> 
> 		rcu_read_lock_sched_notrace();				\
> 		it_func_ptr = rcu_dereference_sched((tp)->funcs);	\
> 		if (it_func_ptr) {					\
> 			do {						\
> 				it_func = (it_func_ptr)->func;		\
> 				__data = (it_func_ptr)->data;		\
> 				((void(*)(proto))(it_func))(args);	\
> 			} while ((++it_func_ptr)->func);		\
> 		}							\
> 		rcu_read_unlock_sched_notrace();	
> 
> As stated above, the rcu_read_(un)lock_sched_notrace() are worthless
> when in rcu_idle_enter().
> 
> They protect the referencing of tp->funcs, which is an array of all
> funcs that are attached to this tracepoint.
> 
> Now we need to look at kernel/tracepoint.c:
> 
> The protection is needed against a simultaneous insertion or deletion of
> a tracepoint hook. This happens when a user enables or disables tracing.
> 
> Note, this race is made even harder to hit because the static
> branch that controls whether this gets called will be off if no
> tracepoints are attached. So the race can only happen after at least one
> tracepoint is active.

I agree that this race is hard to hit when running Linux on bare metal.

But consider a Linux kernel running as a guest OS.  Then the host might
preempt the guest in the middle of a tracepoint.  Then from the guest OS's
viewpoint, that VCPU has just stopped, possibly for a very long time --
easily long enough for all the other VCPUs to pass through quiescent
states.  And the guest OS is ignoring that VCPU, so a too-short grace
period could easily happen in this scenario.

							Thanx, Paul
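
The too-short grace period described above can be sketched with a toy
userspace model. This is not kernel code: struct toy_cpu and both toy_*
functions are hypothetical names made up for this illustration, and the
model assumes only what the message states, namely that a CPU in RCU-idle
mode is ignored when deciding whether a grace period has ended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state for a toy model; not a kernel structure. */
struct toy_cpu {
	bool rcu_idle;   /* RCU ignores this CPU (it is in RCU-idle mode) */
	bool in_reader;  /* CPU still holds a pointer it fetched under RCU */
	bool passed_qs;  /* CPU has passed through a quiescent state */
};

/* The grace period ends once every CPU that RCU is watching has passed a QS. */
static bool toy_grace_period_done(const struct toy_cpu *cpus, size_t n)
{
	assert(cpus != NULL);
	for (size_t i = 0; i < n; i++) {
		if (cpus[i].rcu_idle)
			continue;	/* ignored: cannot hold up the grace period */
		if (!cpus[i].passed_qs)
			return false;
	}
	return true;
}

/* Too short: the grace period ended while some CPU is still in a reader. */
static bool toy_grace_period_too_short(const struct toy_cpu *cpus, size_t n)
{
	if (!toy_grace_period_done(cpus, n))
		return false;
	for (size_t i = 0; i < n; i++)
		if (cpus[i].in_reader)
			return true;
	return false;
}
```

Marking one entry as both rcu_idle and in_reader (a VCPU preempted by the
host in the middle of a tracepoint) while every other entry has passed_qs
set makes toy_grace_period_too_short() return true. Clearing rcu_idle on
that entry makes the model wait for it instead.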

> But if two probes are added to this tracepoint, then we can hit the
> race. And it is possible to trigger with only one probe on removal.
> 
> When adding or removing a probe, the array (the one that
> it_func_ptr points to) is updated by allocating a new array, copying the
> old array plus or minus the probe being added or removed, setting
> tp->funcs to the new array, and then calling call_rcu_sched() to
> free the old array.
> 
> Now for the bug to hit, something had to be coming in or out of idle,
> and jumping to this code. Between the time it got the it_func_ptr and the
> time it accessed any of that pointer's data in the loop, tp->funcs
> had to be updated to the new array, and then all CPUs had to have passed
> a schedule point (except the rcu_idle CPUs).
> 
> On uniprocessor, this is not an issue, but on SMP, it is possible that
> with two CPUs the first, being in rcu_idle, may be ignored, and the second
> would have been adding the tracepoint and then going directly to freeing
> the old array. But as tracepoints are very lightweight, it is most likely
> that the tracepoints will finish before the second could even free the memory.
> 
> But the chance does exist. As the chance of me winning the lottery,
> becoming President of the United States, and curing cancer also exists!
> 
> ;-)
> 
> -- Steve
> 
> 
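
The update sequence Steve describes (allocate a new array, copy, publish,
then free the old array later) can be sketched in plain userspace C. This
is a simplified illustration, not the code in kernel/tracepoint.c: every
toy_* name is hypothetical, there is no locking or memory ordering, and the
old array is simply handed back to the caller, standing in for the point
where the kernel would defer the free with call_rcu_sched().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*toy_probe_fn)(void *data);

/* One attached probe; an entry with func == NULL terminates the array. */
struct toy_func {
	toy_probe_fn func;
	void *data;
};

struct toy_tracepoint {
	struct toy_func *funcs;	/* NULL when no probe is attached */
};

/* Example probe: bumps the int counter passed as its data. */
static void toy_count_hit(void *data)
{
	(*(int *)data)++;
}

static size_t toy_count(const struct toy_func *f)
{
	size_t n = 0;

	if (f)
		while (f[n].func)
			n++;
	return n;
}

/*
 * Copy the old array plus one probe into a new array and publish it.
 * The old array is returned through *retired and must not be freed
 * until no reader can still be walking it.
 */
static int toy_add_probe(struct toy_tracepoint *tp, toy_probe_fn fn,
			 void *data, struct toy_func **retired)
{
	size_t n = toy_count(tp->funcs);
	struct toy_func *new_arr = calloc(n + 2, sizeof(*new_arr));

	assert(fn != NULL);
	if (!new_arr)
		return -1;
	if (n)
		memcpy(new_arr, tp->funcs, n * sizeof(*new_arr));
	new_arr[n].func = fn;
	new_arr[n].data = data;	/* new_arr[n + 1] is the zeroed terminator */
	*retired = tp->funcs;
	tp->funcs = new_arr;	/* a real RCU updater needs an ordered store here */
	return 0;
}

/* Reader side, shaped like the quoted loop; returns the number of calls. */
static int toy_call_probes(const struct toy_tracepoint *tp)
{
	const struct toy_func *it = tp->funcs;
	int calls = 0;

	if (it) {
		do {
			it->func(it->data);
			calls++;
		} while ((++it)->func);
	}
	return calls;
}
```

In this sketch the race is a reader that has loaded tp->funcs in
toy_call_probes() and is still walking the array when the caller of
toy_add_probe() frees *retired. A correct grace period is what keeps those
two events apart.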
