From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org,
	Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
	eric.dumazet@gmail.com, darren@dvhart.com
Subject: Re: [PATCH RFC tip/core/rcu 11/20] rcu: fix race condition in synchronize_sched_expedited()
Date: Sat, 18 Dec 2010 11:58:49 -0800
Message-ID: <20101218195849.GC2143@linux.vnet.ibm.com>
In-Reply-To: <4D0CD8C7.8070604@kernel.org>

On Sat, Dec 18, 2010 at 04:52:39PM +0100, Tejun Heo wrote:
> Hello,
> 
> On 12/17/2010 09:54 PM, Paul E. McKenney wrote:
> > The new (early 2010) implementation of synchronize_sched_expedited() uses
> > try_stop_cpus() to force a context switch on every CPU.  It also permits
> > concurrent calls to synchronize_sched_expedited() to share a single call
> > to try_stop_cpus() through use of an atomically incremented
> > synchronize_sched_expedited_count variable.  Unfortunately, this is
> > subject to failure as follows:
> > 
> > o	Task A invokes synchronize_sched_expedited(), try_stop_cpus()
> > 	succeeds, but Task A is preempted before getting to the atomic
> > 	increment of synchronize_sched_expedited_count.
> > 
> > o	Task B also invokes synchronize_sched_expedited(), with exactly
> > 	the same outcome as Task A.
> > 
> > o	Task C also invokes synchronize_sched_expedited(), again with
> > 	exactly the same outcome as Tasks A and B.
> > 
> > o	Task D also invokes synchronize_sched_expedited(), but only
> > 	gets as far as acquiring the mutex within try_stop_cpus()
> > 	before being preempted, interrupted, or otherwise delayed.
> > 
> > o	Task E also invokes synchronize_sched_expedited(), but only
> > 	gets to the snapshotting of synchronize_sched_expedited_count.
> > 
> > o	Tasks A, B, and C all increment synchronize_sched_expedited_count.
> > 
> > o	Task E fails to get the mutex, so checks the new value
> > 	of synchronize_sched_expedited_count.  It finds that the
> > 	value has increased, so (wrongly) assumes that its work
> > 	has been done, returning despite there having been no
> > 	expedited grace period since it began.
> > 
> > The solution is to have the lowest-numbered CPU atomically increment
> > the synchronize_sched_expedited_count variable within the
> > synchronize_sched_expedited_cpu_stop() function, which is under
> > the protection of the mutex acquired by try_stop_cpus().  However, this
> > also requires that piggybacking tasks wait for three rather than two
> > instances of try_stop_cpus(), because we cannot control the order in
> > which the per-CPU callback functions run.
> > 
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Acked-by: Tejun Heo <tj@kernel.org>

Thank you!

> I suppose this should go -stable?

Given that it is only a theoretical bug, I am targeting 2.6.38 rather
than 2.6.37.  But yes, looks to me like a -stable candidate.

							Thanx, Paul

Thread overview: 34+ messages
2010-12-17 20:54 [PATCH tip/core/rcu 0/20] second preview of RCU patches for 2.6.38 Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 01/20] rcu: add priority-inversion testing to rcutorture Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 02/20] rcu: move TINY_RCU from softirq to kthread Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 03/20] rcu: priority boosting for TINY_PREEMPT_RCU Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 04/20] rcu: add tracing for TINY_RCU and TINY_PREEMPT_RCU Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 05/20] rcu: document TINY_RCU and TINY_PREEMPT_RCU tracing Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 06/20] rcu: Distinguish between boosting and boosted Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 07/20] rcu: get rid of obsolete "classic" names in TREE_RCU tracing Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 08/20] rcu,cleanup: move synchronize_sched_expedited() out of sched.c Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 09/20] rcu,cleanup: simplify the code when cpu is dying Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 10/20] rcu: update documentation/comments for Lai's adoption patch Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 11/20] rcu: fix race condition in synchronize_sched_expedited() Paul E. McKenney
2010-12-18 15:52   ` Tejun Heo
2010-12-18 19:58     ` Paul E. McKenney [this message]
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 12/20] rcu: Make synchronize_srcu_expedited() fast if running readers Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 13/20] rcu: increase synchronize_sched_expedited() batching Paul E. McKenney
2010-12-18 16:13   ` Tejun Heo
2010-12-18 20:14     ` Paul E. McKenney
2010-12-19  9:43       ` Tejun Heo
2010-12-19 16:35         ` Paul E. McKenney
2010-12-20 10:33           ` Peter Zijlstra
2010-12-20 13:40             ` Mathieu Desnoyers
2010-12-20 10:31         ` Peter Zijlstra
2010-12-21  7:58           ` Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 14/20] rcu: Stop chasing QS if another CPU did it for us Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 15/20] rcu: Keep gpnum and completed fields synchronized Paul E. McKenney
2010-12-20  2:13   ` Lai Jiangshan
2010-12-20  2:14     ` Frederic Weisbecker
2010-12-20 16:51     ` Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 16/20] rcu: fine-tune grace-period begin/end checks Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 17/20] rcu: limit rcu_node leaf-level fanout Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 18/20] rcu: reduce __call_rcu()-induced contention on rcu_node structures Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 19/20] rculist: fix borked __list_for_each_rcu() macro Paul E. McKenney
2010-12-17 20:54 ` [PATCH RFC tip/core/rcu 20/20] rcu: remove unused " Paul E. McKenney
