From: Tejun Heo <tj@kernel.org>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org,
	Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
	eric.dumazet@gmail.com, darren@dvhart.com
Subject: Re: [PATCH RFC tip/core/rcu 11/12] rcu: fix race condition in synchronize_sched_expedited()
Date: Tue, 09 Nov 2010 14:26:37 +0100
Message-ID: <4CD94C0D.3030007@kernel.org>
In-Reply-To: <1289095532-5398-11-git-send-email-paulmck@linux.vnet.ibm.com>

Hello, Paul.

On 11/07/2010 03:05 AM, Paul E. McKenney wrote:
> The new (early 2010) implementation of synchronize_sched_expedited() uses
> try_stop_cpus() to force a context switch on every CPU.  It also permits
> concurrent calls to synchronize_sched_expedited() to share a single call
> to try_stop_cpus() through use of an atomically incremented
> synchronize_sched_expedited_count variable.  Unfortunately, this is
> subject to failure as follows:
> 
> o	Task A invokes synchronize_sched_expedited(), try_stop_cpus()
> 	succeeds, but Task A is preempted before getting to the atomic
> 	increment of synchronize_sched_expedited_count.
> 
> o	Task B also invokes synchronize_sched_expedited(), with exactly
> 	the same outcome as Task A.
> 
> o	Task C also invokes synchronize_sched_expedited(), again with
> 	exactly the same outcome as Tasks A and B.
> 
> o	Task D also invokes synchronize_sched_expedited(), but only
> 	gets as far as acquiring the mutex within try_stop_cpus()
> 	before being preempted, interrupted, or otherwise delayed.
> 
> o	Task E also invokes synchronize_sched_expedited(), but only
> 	gets to the snapshotting of synchronize_sched_expedited_count.
> 
> o	Tasks A, B, and C all increment synchronize_sched_expedited_count.
> 
> o	Task E fails to get the mutex, so checks the new value
> 	of synchronize_sched_expedited_count.  It finds that the
> 	value has increased, so (wrongly) assumes that its work
> 	has been done, returning despite there having been no
> 	expedited grace period since it began.
> 
> The solution is to have the lowest-numbered CPU atomically increment
> the synchronize_sched_expedited_count variable within the
> synchronize_sched_expedited_cpu_stop() function, which is under
> the protection of the mutex acquired by try_stop_cpus().  However, this
> also requires that piggybacking tasks wait for three rather than two
> instances of try_stop_cpus(), because we cannot control the order in
> which the per-CPU callback functions run.

How about something like the following?  It's slightly bigger but I
think it's a bit easier to understand.  Thanks.

diff --git a/kernel/sched.c b/kernel/sched.c
index aa14a56..0069be5 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -9342,7 +9342,8 @@ EXPORT_SYMBOL_GPL(synchronize_sched_expedited);

 #else /* #ifndef CONFIG_SMP */

-static atomic_t synchronize_sched_expedited_count = ATOMIC_INIT(0);
+static atomic_t sync_sched_expedited_token = ATOMIC_INIT(0);
+static atomic_t sync_sched_expedited_done = ATOMIC_INIT(0);

 static int synchronize_sched_expedited_cpu_stop(void *data)
 {
@@ -9373,11 +9374,18 @@ static int synchronize_sched_expedited_cpu_stop(void *data)
  */
 void synchronize_sched_expedited(void)
 {
-	int snap, trycount = 0;
+	int my_tok, tok, t, trycount = 0;
+
+	smp_mb();  /* ensure prior mod happens before getting token. */
+
+	/*
+	 * Get a token.  This is used to coordinate with other
+	 * concurrent syncers and consolidate multiple syncs.
+	 */
+	my_tok = tok = atomic_inc_return(&sync_sched_expedited_token);

-	smp_mb();  /* ensure prior mod happens before capturing snap. */
-	snap = atomic_read(&synchronize_sched_expedited_count) + 1;
 	get_online_cpus();
+
 	while (try_stop_cpus(cpu_online_mask,
 			     synchronize_sched_expedited_cpu_stop,
 			     NULL) == -EAGAIN) {
@@ -9388,13 +9396,34 @@ void synchronize_sched_expedited(void)
 			synchronize_sched();
 			return;
 		}
-		if (atomic_read(&synchronize_sched_expedited_count) - snap > 0) {
+
+		/*
+		 * If the done count reached @my_tok, we know at least
+		 * one synchronization happened since we entered this
+		 * function.
+		 */
+		if (atomic_read(&sync_sched_expedited_done) - my_tok >= 0) {
 			smp_mb(); /* ensure test happens before caller kfree */
 			return;
 		}
+
 		get_online_cpus();
+
+		/* about to retry, get the latest token value */
+		tok = atomic_read(&sync_sched_expedited_token);
 	}
-	atomic_inc(&synchronize_sched_expedited_count);
+
+	/*
+	 * We now know that everything up to @tok is synchronized.
+	 * Update the done counter, which should always increase
+	 * monotonically (with wrapping considered).
+	 */
+	do {
+		t = atomic_read(&sync_sched_expedited_done);
+		if (t - tok >= 0)
+			break;
+	} while (atomic_cmpxchg(&sync_sched_expedited_done, t, tok) != t);
+
 	smp_mb__after_atomic_inc(); /* ensure post-GP actions seen after GP. */
 	put_online_cpus();
 }
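
To make the token/done bookkeeping easier to follow outside the kernel,
here is a minimal userspace sketch of the same idea.  It assumes C11
<stdatomic.h> in place of atomic_t, and the names token, done,
seq_reached(), take_token(), advance_done() and can_piggyback() are
made up for illustration; none of them are in the patch.  It leaves out
try_stop_cpus(), CPU hotplug and the memory barriers, so it only shows
the wrap-tolerant comparison and the "only move forward" update.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the two counters in the patch. */
static atomic_uint token;	/* last token handed out */
static atomic_uint done;	/* highest token known to be synchronized */

/* True if a is at or past b, tolerating counter wraparound. */
static bool seq_reached(unsigned int a, unsigned int b)
{
	return (int)(a - b) >= 0;
}

/* Take a fresh token, as a syncer does on entry. */
static unsigned int take_token(void)
{
	return atomic_fetch_add(&token, 1) + 1;
}

/* Advance done to tok unless it is already at or past tok. */
static void advance_done(unsigned int tok)
{
	unsigned int t = atomic_load(&done);

	while (!seq_reached(t, tok)) {
		/* On failure, t is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&done, &t, tok))
			break;
	}
	/* done only ever moves forward, so it must now cover tok. */
	assert(seq_reached(atomic_load(&done), tok));
}

/* A waiter may return early once done has reached its own token. */
static bool can_piggyback(unsigned int my_tok)
{
	return seq_reached(atomic_load(&done), my_tok);
}
```

The comparison is done on unsigned values and then cast to int, so the
sketch does not rely on signed overflow behaving as it does in the
kernel build.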
