public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Waiman Long <waiman.long@hp.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	"James E.J. Bottomley" <jejb@parisc-linux.org>,
	Helge Deller <deller@gmx.de>,
	John David Anglin <dave.anglin@bell.net>,
	Parisc List <linux-parisc@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Vinod, Chegu" <chegu_vinod@hp.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Davidlohr Bueso <davidlohr@hp.com>, Peter Anvin <hpa@zytor.com>,
	Andi Kleen <andi@firstfloor.org>,
	"Chandramouleeswaran, Aswin" <aswin@hp.com>,
	"Norton, Scott J" <scott.norton@hp.com>,
	Jason Low <jason.low2@hp.com>
Subject: Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks
Date: Mon, 2 Jun 2014 15:08:31 -0700	[thread overview]
Message-ID: <20140602220831.GG22231@linux.vnet.ibm.com> (raw)
In-Reply-To: <CA+55aFzX3oDvzqh4AfkJ7-X1Yfo8obFM+jOOmwzeN-GyqOPLJg@mail.gmail.com>

On Mon, Jun 02, 2014 at 02:12:30PM -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 2:02 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> >
> > In the ->qlen case, interrupts are disabled and the current CPU is
> > the only one who can write, so the read need not be volatile.  In the
> > ->n_barrier_done, modifications are done holding ->barrier_mutex, so again
> > the read need not be volatile.  In the sync_rcu_preempt_exp_count case,
> > modifications are done holding sync_rcu_preempt_exp_mutex, so once again,
> > the read need not be volatile.  So I could do something like:
> >
> >         ACCESS_ONCE(rdp->qlen) = rdp->qlen + 1;
> >
> > But that still makes gcc generate bad code
> >
> > The reason I was not all that worried about this is that these are not
> > in fastpaths, and the last two are especially not in fastpaths.
> >
> > Suggestions?
> 
> So I think it probably *works*, but even so splitting it up to use
> ACCESS_ONCE() on just the write is probably a better option, if only
> because it would then make it much easier to change if we do end up
> splitting reads and writes.
> 
> Because from a gcc code generation standpoint, using "volatile" will
> always be horrible, because gcc will never be able to turn it into a
> read-modify-write cycle. Arguable gcc _should_ be able to do that (it
> is certainly allowable within the virtual machine definition), but I
> understand why it doesn't ("volatile? Let's not optimize anything at
> all, because it's special").
> 
> So "ACCESS_ONCE() + R-M-W" operation is actually pretty much
> guaranteed to be "ACCESS_TWICE()", which may well be ok (performance
> may not matter, and even when it does most architectures don't
> actually have r-m-w instructions and when they do they aren't always
> even faster), but I think it's just horribly horribly bad from a
> conceptual and readability standpoint because it's so misleading.
> 
> So I'd actually rather see two explicit ACCESS_ONCE() calls - once to
> read, once to write. Because that at least describes what is
> happening, unlike the current situation.
> 
> Put another way: I can understand why you do it, and I can even agree
> that it is "correct" from a functionality standpoint. But even despite
> that all, I really don't like the construct very much..

OK, I have queued the following commit for 3.17.  Is this what you had
in mind?

							Thanx, Paul

------------------------------------------------------------------------

rcu: Eliminate read-modify-write ACCESS_ONCE() calls

RCU contains code of the following forms:

	ACCESS_ONCE(x)++;
	ACCESS_ONCE(x) += y;
	ACCESS_ONCE(x) -= y;

Now these constructs do operate correctly, but they really result in a
pair of volatile accesses, one to do the load and another to do the store.
This can be confusing, as the casual reader might well assume that (for
example) gcc might generate a memory-to-memory add instruction for each
of these three cases.  In fact, gcc will do no such thing.  Also, there
is a good chance that the kernel will move to separate load and store
variants of ACCESS_ONCE(), and constructs like the above could easily
confuse both people and scripts attempting to make that sort of change.
Finally, most of RCU's read-modify-write uses of ACCESS_ONCE() really
only need the store to be volatile, so that the read-modify-write form
might be misleading.

This commit therefore changes the above forms in RCU so that each instance
of ACCESS_ONCE() either does a load or a store, but not both.  In a few
cases, ACCESS_ONCE() was not critical, for example, for maintaining 
statisitics.  In these cases, ACCESS_ONCE() has been dispensed with
entirely

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index c639556f3fa0..c0120279dead 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -295,12 +295,15 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
 int __srcu_read_lock(struct srcu_struct *sp)
 {
 	int idx;
+	unsigned long *lp;
 
 	idx = ACCESS_ONCE(sp->completed) & 0x1;
 	preempt_disable();
-	ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 1;
+	lp = this_cpu_ptr(&sp->per_cpu_ref->c[idx]);
+	ACCESS_ONCE(*lp) = *lp + 1;
 	smp_mb(); /* B */  /* Avoid leaking the critical section. */
-	ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->seq[idx]) += 1;
+	lp = this_cpu_ptr(&sp->per_cpu_ref->seq[idx]);
+	ACCESS_ONCE(*lp) = *lp + 1;
 	preempt_enable();
 	return idx;
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d1c8e4a85b92..f0ed867070cd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2275,7 +2275,7 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 	}
 	smp_mb(); /* List handling before counting for rcu_barrier(). */
 	rdp->qlen_lazy -= count_lazy;
-	ACCESS_ONCE(rdp->qlen) -= count;
+	ACCESS_ONCE(rdp->qlen) = rdp->qlen - count;
 	rdp->n_cbs_invoked += count;
 
 	/* Reinstate batch limit if we have worked down the excess. */
@@ -2420,7 +2420,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
 		if (rnp_old != NULL)
 			raw_spin_unlock(&rnp_old->fqslock);
 		if (ret) {
-			ACCESS_ONCE(rsp->n_force_qs_lh)++;
+			rsp->n_force_qs_lh++;
 			return;
 		}
 		rnp_old = rnp;
@@ -2432,7 +2432,7 @@ static void force_quiescent_state(struct rcu_state *rsp)
 	smp_mb__after_unlock_lock();
 	raw_spin_unlock(&rnp_old->fqslock);
 	if (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) {
-		ACCESS_ONCE(rsp->n_force_qs_lh)++;
+		rsp->n_force_qs_lh++;
 		raw_spin_unlock_irqrestore(&rnp_old->lock, flags);
 		return;  /* Someone beat us to it. */
 	}
@@ -2621,7 +2621,7 @@ __call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
 		local_irq_restore(flags);
 		return;
 	}
-	ACCESS_ONCE(rdp->qlen)++;
+	ACCESS_ONCE(rdp->qlen) = rdp->qlen + 1;
 	if (lazy)
 		rdp->qlen_lazy++;
 	else
@@ -3185,7 +3185,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	 * ACCESS_ONCE() to prevent the compiler from speculating
 	 * the increment to precede the early-exit check.
 	 */
-	ACCESS_ONCE(rsp->n_barrier_done)++;
+	ACCESS_ONCE(rsp->n_barrier_done) = rsp->n_barrier_done + 1;
 	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 1);
 	_rcu_barrier_trace(rsp, "Inc1", -1, rsp->n_barrier_done);
 	smp_mb(); /* Order ->n_barrier_done increment with below mechanism. */
@@ -3235,7 +3235,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 
 	/* Increment ->n_barrier_done to prevent duplicate work. */
 	smp_mb(); /* Keep increment after above mechanism. */
-	ACCESS_ONCE(rsp->n_barrier_done)++;
+	ACCESS_ONCE(rsp->n_barrier_done) = rsp->n_barrier_done + 1;
 	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 0);
 	_rcu_barrier_trace(rsp, "Inc2", -1, rsp->n_barrier_done);
 	smp_mb(); /* Keep increment before caller's subsequent code. */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index aee1e924b048..7ce734040a5e 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2274,8 +2274,8 @@ static int rcu_nocb_kthread(void *arg)
 		tail = xchg(&rdp->nocb_tail, &rdp->nocb_head);
 		c = atomic_long_xchg(&rdp->nocb_q_count, 0);
 		cl = atomic_long_xchg(&rdp->nocb_q_count_lazy, 0);
-		ACCESS_ONCE(rdp->nocb_p_count) += c;
-		ACCESS_ONCE(rdp->nocb_p_count_lazy) += cl;
+		rdp->nocb_p_count += c;
+		rdp->nocb_p_count_lazy += cl;
 		rcu_nocb_wait_gp(rdp);
 
 		/* Each pass through the following loop invokes a callback. */


  reply	other threads:[~2014-06-02 22:08 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-01 17:53 [PATCH] fix a race condition in cancelable mcs spinlocks Mikulas Patocka
2014-06-01 19:20 ` Peter Zijlstra
2014-06-01 20:46   ` John David Anglin
2014-06-01 21:30     ` Peter Zijlstra
2014-06-01 21:46       ` Paul E. McKenney
2014-06-02  9:19       ` Mikulas Patocka
2014-06-02 13:24         ` Paul E. McKenney
2014-06-02 15:57           ` Mikulas Patocka
2014-06-02 16:39             ` Paul E. McKenney
2014-06-02 19:56       ` James Bottomley
2014-06-03  7:56         ` Peter Zijlstra
2014-06-04 12:53           ` Mikulas Patocka
2014-06-02 13:58     ` Mikulas Patocka
2014-06-02 14:02       ` Mikulas Patocka
2014-06-02 15:39         ` John David Anglin
2014-06-02 10:34   ` Mikulas Patocka
2014-06-02 14:14 ` Waiman Long
2014-06-02 15:27   ` Jason Low
2014-06-02 16:00 ` [PATCH v2] introduce atomic_pointer to " Mikulas Patocka
2014-06-02 16:25   ` Peter Zijlstra
2014-06-02 16:30     ` Peter Zijlstra
2014-06-02 16:46       ` Paul E. McKenney
2014-06-02 17:33       ` Waiman Long
2014-06-02 20:05         ` Peter Zijlstra
2014-06-02 20:22           ` Linus Torvalds
2014-06-02 21:02             ` Paul E. McKenney
2014-06-02 21:12               ` Linus Torvalds
2014-06-02 22:08                 ` Paul E. McKenney [this message]
2014-06-02 22:44                   ` Eric Dumazet
2014-06-02 23:17                     ` Paul E. McKenney
2014-06-02 23:53                       ` Eric Dumazet
2014-06-03  0:28                         ` Paul E. McKenney
2014-06-02 22:55                   ` Linus Torvalds
2014-06-03 16:48                     ` Paul E. McKenney
2014-06-03  7:54             ` Peter Zijlstra
2014-06-02 20:24           ` James Bottomley
2014-06-02 16:43     ` Paul E. McKenney
2014-06-02 17:14       ` James Bottomley
2014-06-02 17:29       ` Waiman Long
2014-06-02 17:09     ` Linus Torvalds
2014-06-02 17:12       ` Davidlohr Bueso
2014-06-02 17:42         ` Waiman Long
2014-06-02 20:46       ` Mikulas Patocka
2014-06-02 20:53         ` Linus Torvalds
2014-06-02 21:12           ` Mikulas Patocka
2014-06-03  7:36             ` Peter Zijlstra
2014-06-03 11:14               ` Mikulas Patocka
2014-06-03 13:24                 ` Peter Zijlstra
2014-06-03 14:18                   ` Mikulas Patocka
2014-06-03 14:07               ` Paul E. McKenney
2014-06-03 15:09                 ` Peter Zijlstra
2014-06-03 15:56                   ` Paul E. McKenney
2014-06-03  7:20         ` Peter Zijlstra
2014-06-02 21:03       ` Paul E. McKenney
2014-06-06 15:06       ` Peter Zijlstra
2014-06-06 15:15         ` Paul E. McKenney
2014-06-06 15:42         ` Davidlohr Bueso
2014-06-02 16:50   ` Jason Low
2014-06-02 17:03     ` Paul E. McKenney
2014-06-02 17:25     ` Waiman Long
2014-06-02 17:38       ` H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140602220831.GG22231@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=aswin@hp.com \
    --cc=chegu_vinod@hp.com \
    --cc=dave.anglin@bell.net \
    --cc=davidlohr@hp.com \
    --cc=deller@gmx.de \
    --cc=hpa@zytor.com \
    --cc=jason.low2@hp.com \
    --cc=jejb@parisc-linux.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=scott.norton@hp.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=waiman.long@hp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox