linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Will Deacon <will.deacon@arm.com>, Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	Waiman Long <waiman.long@hp.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Alex Shi <alex.shi@linaro.org>, Andi Kleen <andi@firstfloor.org>,
	Michel Lespinasse <walken@google.com>,
	Davidlohr Bueso <davidlohr.bueso@hp.com>,
	Matthew R Wilcox <matthew.r.wilcox@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Rik van Riel <riel@redhat.com>,
	Peter Hurley <peter@hurleysoftware.com>,
	Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>,
	George Spelvin <linux@horizon.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Arnd Bergmann <arnd@arndb.de>,
	Aswin Chandramouleeswaran <aswin@hp.com>,
	Scott J Norton <scott.norton@hp.com>,
	"Figo.zhang" <figo1802@gmail.com>
Subject: Re: [PATCH v6 4/5] MCS Lock: Barrier corrections
Date: Fri, 22 Nov 2013 12:06:20 -0800	[thread overview]
Message-ID: <20131122200620.GA4138@linux.vnet.ibm.com> (raw)
In-Reply-To: <CA+55aFyKKpf-i4pQ_dhy9gic74xtCbO+U8GXU6mCtQj1ZHy05A@mail.gmail.com>

On Fri, Nov 22, 2013 at 11:06:41AM -0800, Linus Torvalds wrote:
> On Fri, Nov 22, 2013 at 10:49 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> >
> > You see, my problem is not the "crazy ordering" DEC Alpha, Itanium,
> > PowerPC, or even ARM.  It is really obvious what instructions to use in
> > a stiffened-up smp_store_release() for those guys: "mb" for DEC Alpha,
> > "st.rel" for Itanium, "sync" for PowerPC, and "dmb" for ARM.  Believe it
> > or not, my problem is instead with good old tightly ordered x86.
> >
> > We -could- just put an mfence into x86's smp_store_release() and
> > be done with it
> 
> Why would you bother?  The *acquire* has a memory barrier. End of
> story. On x86, it has to (since otherwise a load inside the locked
> region could be re-ordered wrt the write that takes the lock).

I am sorry, but that is not always correct.  For example, in the contended
case for Tim Chen's MCS queued locks, the x86 acquisition-side handoff
code does -not- contain any stores or memory-barrier instructions.
Here is that portion of the arch_mcs_spin_lock() code, along with the
x86 definition for smp_load_acquire:

+       while (!(smp_load_acquire(&node->locked)))                      \
+               arch_mutex_cpu_relax();                                 \

+#define smp_load_acquire(p)                                            \
+({                                                                     \
+       typeof(*p) ___p1 = ACCESS_ONCE(*p);                             \
+       compiletime_assert_atomic_type(*p);                             \
+       barrier();                                                      \
+       ___p1;                                                          \
+})

No stores, no memory-barrier instructions.

Of course, the fact that there are no stores means that on x86 the
critical section cannot leak out, even with no memory barrier.  That is
the easy part.  The hard part is if we want unlock+lock to be a full
memory barrier for MCS lock from the viewpoint of code not holding
that lock.  We clearly cannot rely on the non-existent memory barrier
in the acquisition handoff code.

And yes, there is a full barrier implied by the xchg() further up in
arch_mcs_spin_lock(), shown in full below, but that barrier is before
the handoff code, so that xchg() cannot have any effect on the handoff.
That xchg() therefore cannot force unlock+lock to act as a full memory
barrier in the contended queue-handoff case.

> Basically, any time you think you need to add a memory barrier on x86,
> you should go "I'm doing something wrong". It's that simple.

It -appears- that the MCS queue handoff code is one of the many cases
where we don't need a memory barrier on x86, even if we do want MCS
unlock+lock to be a full memory barrier.  But I wouldn't call it simple.
I -think- we do have a proof, but I don't yet totally trust it.

							Thanx, Paul


>                   Linus

------------------------------------------------------------------------

+#define arch_mcs_spin_lock(lock, node)                                 \
+{                                                                      \
+       struct mcs_spinlock *prev;                                      \
+                                                                       \
+       /* Init node */                                                 \
+       node->locked = 0;                                               \
+       node->next   = NULL;                                            \
+                                                                       \
+       /* xchg() provides a memory barrier */                          \
+       prev = xchg(lock, node);                                        \
+       if (likely(prev == NULL)) {                                     \
+               /* Lock acquired */                                     \
+               return;                                                 \
+       }                                                               \
+       ACCESS_ONCE(prev->next) = node;                                 \
+       /*                                                              \
+        * Wait until the lock holder passes the lock down.             \
+        * Using smp_load_acquire() provides a memory barrier that      \
+        * ensures subsequent operations happen after the lock is       \
+        * acquired.                                                    \
+        */                                                             \
+       while (!(smp_load_acquire(&node->locked)))                      \
+               arch_mutex_cpu_relax();                                 \

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2013-11-22 20:06 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1384885312.git.tim.c.chen@linux.intel.com>
2013-11-20  1:37 ` [PATCH v6 0/5] MCS Lock: MCS lock code cleanup and optimizations Tim Chen
2013-11-20 10:19   ` Will Deacon
2013-11-20 12:50     ` Paul E. McKenney
2013-11-20 17:00       ` Will Deacon
2013-11-20 17:14         ` Paul E. McKenney
2013-11-20 17:00     ` Tim Chen
2013-11-20 17:16       ` Paul E. McKenney
2013-11-20  1:37 ` [PATCH v6 1/5] MCS Lock: Restructure the MCS lock defines and locking code into its own file Tim Chen
2013-11-20  1:37 ` [PATCH v6 2/5] MCS Lock: optimizations and extra comments Tim Chen
2013-11-20  1:37 ` [PATCH v6 3/5] MCS Lock: Move mcs_lock/unlock function into its own file Tim Chen
2013-11-20  1:37 ` [PATCH v6 4/5] MCS Lock: Barrier corrections Tim Chen
2013-11-20 15:31   ` Paul E. McKenney
2013-11-20 15:46     ` Will Deacon
2013-11-20 17:14       ` Paul E. McKenney
2013-11-20 18:43         ` Tim Chen
2013-11-20 19:06           ` Paul E. McKenney
2013-11-20 20:36             ` Tim Chen
2013-11-20 21:44               ` Paul E. McKenney
2013-11-20 23:51                 ` Tim Chen
2013-11-21  4:53                   ` Paul E. McKenney
2013-11-21 10:17                     ` Will Deacon
2013-11-21 13:16                       ` Paul E. McKenney
2013-11-21 10:45                     ` Peter Zijlstra
2013-11-21 13:18                       ` Paul E. McKenney
2013-11-21 22:27                     ` Linus Torvalds
2013-11-21 22:52                       ` Paul E. McKenney
2013-11-22  0:09                         ` Linus Torvalds
2013-11-22  4:08                           ` Paul E. McKenney
2013-11-22  4:25                             ` Linus Torvalds
2013-11-22  6:23                               ` Paul E. McKenney
2013-11-22 15:16                                 ` Ingo Molnar
2013-11-22 18:49                                   ` Paul E. McKenney
2013-11-22 19:06                                     ` Linus Torvalds
2013-11-22 20:06                                       ` Paul E. McKenney [this message]
2013-11-22 20:09                                         ` Linus Torvalds
2013-11-22 20:37                                           ` Paul E. McKenney
2013-11-22 21:01                                             ` Linus Torvalds
2013-11-22 21:52                                               ` Paul E. McKenney
2013-11-22 22:19                                                 ` Linus Torvalds
2013-11-23  0:25                                                   ` Paul E. McKenney
2013-11-23  0:42                                                     ` Linus Torvalds
2013-11-23  1:36                                                       ` Paul E. McKenney
2013-11-23  2:11                                                         ` Linus Torvalds
2013-11-23  4:05                                                           ` Paul E. McKenney
2013-11-23 11:24                                                             ` Ingo Molnar
2013-11-23 17:06                                                               ` Paul E. McKenney
2013-11-26 12:02                                                                 ` Ingo Molnar
2013-11-26 19:28                                                                   ` Paul E. McKenney
2013-11-23 20:21                                                         ` Linus Torvalds
2013-11-23 20:39                                                           ` Linus Torvalds
2013-11-25 12:09                                                             ` Peter Zijlstra
2013-11-25 17:18                                                               ` Will Deacon
2013-11-25 17:56                                                                 ` Paul E. McKenney
2013-11-25 17:54                                                             ` Paul E. McKenney
2013-11-23 21:29                                                           ` Peter Zijlstra
2013-11-23 22:24                                                             ` Linus Torvalds
2013-11-25 17:53                                                           ` Paul E. McKenney
2013-11-25 18:21                                                             ` Peter Zijlstra
2013-11-21 11:03         ` Peter Zijlstra
2013-11-21 12:56           ` Peter Zijlstra
2013-11-21 13:20             ` Paul E. McKenney
2013-11-21 17:25               ` Paul E. McKenney
2013-11-21 21:52                 ` Peter Zijlstra
2013-11-21 22:18                   ` Paul E. McKenney
2013-11-22 15:58                     ` Peter Zijlstra
2013-11-22 18:26                       ` Paul E. McKenney
2013-11-22 18:51                         ` Peter Zijlstra
2013-11-22 18:59                           ` Paul E. McKenney
2013-11-25 17:35                           ` Peter Zijlstra
2013-11-25 18:02                             ` Paul E. McKenney
2013-11-25 18:24                               ` Peter Zijlstra
2013-11-25 18:34                                 ` Tim Chen
2013-11-25 18:27                               ` Peter Zijlstra
2013-11-25 23:52                                 ` Paul E. McKenney
2013-11-26  9:59                                   ` Peter Zijlstra
2013-11-26 17:11                                     ` Paul E. McKenney
2013-11-26 17:18                                       ` Peter Zijlstra
2013-11-26 19:00                                     ` Linus Torvalds
2013-11-26 19:20                                       ` Paul E. McKenney
2013-11-26 19:32                                         ` Linus Torvalds
2013-11-26 22:51                                           ` Paul E. McKenney
2013-11-26 23:58                                             ` Linus Torvalds
2013-11-27  0:21                                               ` Thomas Gleixner
2013-11-27  0:39                                               ` Paul E. McKenney
2013-11-27  1:05                                                 ` Linus Torvalds
2013-11-27  1:31                                                   ` Paul E. McKenney
2013-11-27 10:16                                             ` Will Deacon
2013-11-27 17:11                                               ` Paul E. McKenney
2013-11-28 11:40                                                 ` Will Deacon
2013-11-28 17:38                                                   ` Paul E. McKenney
2013-11-28 18:03                                                     ` Will Deacon
2013-11-28 18:27                                                       ` Paul E. McKenney
2013-11-28 18:53                                                         ` Will Deacon
2013-11-28 19:50                                                           ` Paul E. McKenney
2013-11-29 16:17                                                             ` Will Deacon
2013-11-29 16:44                                                               ` Linus Torvalds
2013-11-29 18:18                                                                 ` Will Deacon
2013-11-30 17:38                                                                 ` Paul E. McKenney
2013-11-26 19:21                                       ` Peter Zijlstra
2013-11-27 16:58                                         ` Oleg Nesterov
2013-11-26 23:08                                       ` Benjamin Herrenschmidt
2013-11-25 23:55                               ` H. Peter Anvin
2013-11-26  3:16                                 ` Paul E. McKenney
2013-11-27  0:46                                   ` H. Peter Anvin
2013-11-27  1:07                                     ` Linus Torvalds
2013-11-27  1:27                                     ` Paul E. McKenney
2013-11-27  2:59                                       ` H. Peter Anvin
2013-11-25 18:52                             ` H. Peter Anvin
2013-11-25 22:58                               ` Tim Chen
2013-11-25 23:28                                 ` H. Peter Anvin
2013-11-25 23:51                                   ` Paul E. McKenney
2013-11-25 23:36                               ` Paul E. McKenney
2013-12-04 21:26                 ` Andi Kleen
2013-12-04 22:07                   ` Paul E. McKenney
2013-11-21 13:19           ` Paul E. McKenney
2013-11-20  1:37 ` [PATCH v6 5/5] MCS Lock: Allows for architecture specific mcs lock and unlock Tim Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131122200620.GA4138@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@linaro.org \
    --cc=andi@firstfloor.org \
    --cc=arnd@arndb.de \
    --cc=aswin@hp.com \
    --cc=dave.hansen@intel.com \
    --cc=davidlohr.bueso@hp.com \
    --cc=figo1802@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@horizon.com \
    --cc=matthew.r.wilcox@intel.com \
    --cc=mingo@elte.hu \
    --cc=mingo@kernel.org \
    --cc=peter@hurleysoftware.com \
    --cc=raghavendra.kt@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=scott.norton@hp.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=waiman.long@hp.com \
    --cc=walken@google.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).