public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: paulmck@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org,
	Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
	eric.dumazet@gmail.com, darren@dvhart.com, fweisbec@gmail.com,
	patches@linaro.org, "Paul E. McKenney" <paul.mckenney@linaro.org>,
	venki@google.com, KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
	"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>
Subject: Re: [PATCH RFC tip/core/rcu 1/6] rcu: Stabilize use of num_online_cpus() for GP short circuit
Date: Tue, 24 Apr 2012 23:16:34 +0530	[thread overview]
Message-ID: <4F96E6FA.40900@linux.vnet.ibm.com> (raw)
In-Reply-To: <20120424165056.GB2403@linux.vnet.ibm.com>

On 04/24/2012 10:20 PM, Paul E. McKenney wrote:

> On Tue, Apr 24, 2012 at 09:05:20PM +0530, Srivatsa S. Bhat wrote:
> 
>>
>>> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
>>>
>>> The rcu_blocking_is_gp() function tests to see if there is only one
>>> online CPU, and if so, synchronize_sched() and friends become no-ops.
>>> However, for larger systems, num_online_cpus() scans a large vector,
>>
>>
>> Venki had posted a patch to optimize that by using a variable, so that we
>> don't calculate the value each and every time.
>> http://thread.gmane.org/gmane.linux.kernel/1240569/focus=1246659
>>
>> However, unfortunately there was some confusion around that patch and
>> even though it made it to akpm's tree and stayed there briefly, it didn't
>> go upstream. Venki had attempted to resolve the confusion here:
>> http://thread.gmane.org/gmane.linux.kernel/1240569/focus=1260702
> 
> Having a single variable tracking the online state would be good,
> but as you say it isn't there yet.
> 
>>> and might be preempted while doing so.  While preempted, any number
>>> of CPUs might come online and go offline, potentially resulting in
>>> num_online_cpus() returning 1 when there never had only been one
>>> CPU online.  This would result in a too-short RCU grace period, which
>>> could in turn result in total failure.
>>
>>
[...]

> 
> The problematic case is instead the one where we were SMP throughout,
> but rcu_blocking_is_gp() mistakenly decides that we were UP.  For example,
> consider the following sequence of events, based on the commit log's
> sentence "While preempted, any number of CPUs might come online and go
> offline, potentially resulting in num_online_cpus() returning 1 when
> there never had only been one CPU online":
> 


Oh, I didn't think in the direction illustrated below when reading that
sentence.. :-(

> o	CPUs 100 and 150 are initially online, with a long-running RCU
> 	read-side critical section on CPU 100 and rcu_blocking_is_gp()
> 	initially running on CPU 150.
> 
> o	The rcu_blocking_is_gp() function checks the bits for CPUs
> 	0-63, and counts zero online CPUs.
> 
> o	CPU 1 comes online.
> 
> o	The rcu_blocking_is_gp() function checks the bits for CPUs
> 	64-127, and counts one online CPUs, for a total thus far
> 	of one CPU online..
> 
> o	CPU 150 goes offline.  Ah, but it cannot do this, because
> 	this is non-preemptible RCU, which means that the RCU
> 	read-side critical section has disabled preemption on
> 	CPU 100, which prevents CPU 150 from going offline, which
> 	prevents this scenario from occurring.
> 
> 	So, yes, rcu_blocking_is_gp() can be fooled into incorrectly
> 	stating that the system has only one CPU (or even that it has
> 	only zero CPUs), but not while there is actually a non-preemptible
> 	RCU read-side critical section running.  Yow!
>


Awesome :-)

 
> 	I clearly had not thought this change through far enough,
> 	thank you for calling it to my attention!
> 
> So I could replace this patch with a patch that adds a comment
> explaining why this works.


Yes, that would be great..

>  Though this patch might be simpler and
> easier to understand.  ;-)


Oh well, but I completely missed the intention behind the patch!
So I guess a comment would be better ;-)

>  But not so good for real-time response
> on large systems, I suppose.
> 
> And rcu_blocking_is_gp() is called only from synchronize_sched() and
> synchronize_rcu_bh(), so it is safe for all current callers.  But it is
> defined publicly, so I should move it to somewhere like kernel/rcutree.c
> to keep new users from being fatally confused and disappointed.
> 
> I can also change the comparison from "num_online_cpus() == 1" to
> "num_online_cpus() <= 1".  That should at least serve as a caution to
> people who might attempt to use it where it shouldn't be used.  ;-)
> 


Hehe, yeah!

Regards,
Srivatsa S. Bhat


  reply	other threads:[~2012-04-24 17:47 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-23 16:41 [PATCH RFC 0/6] Miscellaneous RCU fixes for 3.5 Paul E. McKenney
2012-04-23 16:42 ` [PATCH RFC tip/core/rcu 1/6] rcu: Stabilize use of num_online_cpus() for GP short circuit Paul E. McKenney
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 2/6] rcu: List-debug variants of rcu list routines Paul E. McKenney
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 3/6] rcu: Replace list_first_entry_rcu() with list_first_or_null_rcu() Paul E. McKenney
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 4/6] rcu: Clarify help text for RCU_BOOST_PRIO Paul E. McKenney
2012-04-26 12:46     ` Peter Zijlstra
2012-04-26 17:28       ` Paul E. McKenney
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 5/6] rcu: Make __kfree_rcu() less dependent on compiler choices Paul E. McKenney
2012-04-26 12:48     ` Peter Zijlstra
2012-04-26 13:29       ` Jan Engelhardt
2012-04-26 13:50         ` Peter Zijlstra
2012-04-23 16:42   ` [PATCH RFC tip/core/rcu 6/6] rcu: Reduce cache-miss initialization latencies for large systems Paul E. McKenney
2012-04-26 12:51     ` Peter Zijlstra
2012-04-26 14:12       ` Paul E. McKenney
2012-04-26 15:28         ` Peter Zijlstra
2012-04-26 16:15           ` Paul E. McKenney
2012-04-26 19:41             ` Peter Zijlstra
2012-04-26 19:47               ` Peter Zijlstra
2012-04-26 20:29                 ` Paul E. McKenney
2012-04-26 22:04                   ` Peter Zijlstra
2012-04-26 20:28               ` Paul E. McKenney
2012-04-26 22:01                 ` Peter Zijlstra
2012-04-27 14:17                   ` Paul E. McKenney
2012-04-27  4:36     ` Mike Galbraith
2012-04-27 15:15       ` Paul E. McKenney
2012-04-28  4:42         ` Mike Galbraith
2012-04-28 17:21           ` Paul E. McKenney
2012-04-29  3:54             ` Mike Galbraith
2012-04-24 15:35   ` [PATCH RFC tip/core/rcu 1/6] rcu: Stabilize use of num_online_cpus() for GP short circuit Srivatsa S. Bhat
2012-04-24 16:50     ` Paul E. McKenney
2012-04-24 17:46       ` Srivatsa S. Bhat [this message]
2012-05-07  3:47       ` Rusty Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F96E6FA.40900@linux.vnet.ibm.com \
    --to=srivatsa.bhat@linux.vnet.ibm.com \
    --cc=Valdis.Kletnieks@vt.edu \
    --cc=akpm@linux-foundation.org \
    --cc=darren@dvhart.com \
    --cc=dhowells@redhat.com \
    --cc=dipankar@in.ibm.com \
    --cc=eric.dumazet@gmail.com \
    --cc=fweisbec@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=kosaki.motohiro@gmail.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=mingo@elte.hu \
    --cc=niv@us.ibm.com \
    --cc=patches@linaro.org \
    --cc=paul.mckenney@linaro.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=venki@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox