From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: FW: Commit 81a43adae3b9 (locking/mutex: Use acquire/release semantics) causing failures on arm64 (ThunderX)
Date: Fri, 11 Dec 2015 12:04:19 +0000
Message-ID: <20151211120419.GD18828@arm.com>
In-Reply-To: <20151211084133.GE6356@twins.programming.kicks-ass.net>

Hi all,

On Fri, Dec 11, 2015 at 09:41:33AM +0100, Peter Zijlstra wrote:
> On Thu, Dec 10, 2015 at 08:51:34PM -0800, Andrew Pinski wrote:
>
> > So looking further I think I understand what is going wrong and why
> > c55a6ffa6285e29f874ed403979472631ec70bff is incorrect.
>
> The osq_wait_next() call in osq_lock() is when we fail the lock. This is
> effectively trylock() semantics and like for cmpxchg a failed trylock
> has no implied barrier semantics. So from that POV osq_wait_next() does
> not need to provide ACQUIRE semantics.
>
> In osq_unlock() there's an xchg() in front, which implies full barriers
> and thereby provides RELEASE semantics for that part of osq_unlock(), so
> again, from this POV osq_wait_next() does not need to provide RELEASE
> semantics.
>
> > The compare and swap inside osq_lock needs to be both release and
> > acquire semantics memory barriers because the stores (to node) need to
> > be visible to the other cores before the setting of lock->tail
> > happens.
>
> I'm a wee bit confused on what exactly you mean. Both stores to @node:
>
> 1) osq_wait_next(): next = xchg(&node->next, NULL)
> 2) osq_unlock(): next = xchg(&node->next, NULL)
>
> are xchg() calls which imply full ordering (sequential consistency).
I think Andrew meant the atomic_xchg_acquire at the start of osq_lock,
as opposed to "compare and swap". In which case, it does look like
there's a bug here because there is nothing to order the initialisation
of the node fields with publishing of the node, whether that's
indirectly as a result of setting the tail to the current CPU or
directly as a result of the WRITE_ONCE.
Andrew, David: does making that atomic_xchg_acquire an atomic_xchg
fix things for you?
I don't fully grok what 81a43adae3b9 has to do with any of this, so
maybe there's another bug too.
Will
--->8
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index d092a0c9c2d4..05a37857ab55 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -93,10 +93,12 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	node->cpu = curr;
 	/*
-	 * ACQUIRE semantics, pairs with corresponding RELEASE
-	 * in unlock() uncontended, or fastpath.
+	 * We need both ACQUIRE (pairs with corresponding RELEASE in
+	 * unlock() uncontended, or fastpath) and RELEASE (to publish
+	 * the node fields we just initialised) semantics when updating
+	 * the lock tail.
 	 */
-	old = atomic_xchg_acquire(&lock->tail, curr);
+	old = atomic_xchg(&lock->tail, curr);
 	if (old == OSQ_UNLOCKED_VAL)
 		return true;
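
For anyone wanting to poke at the ordering outside the kernel, here is a rough userspace sketch of the publish pattern using C11 atomics. It is not the kernel code: the demo_* names, DEMO_UNLOCKED and the fixed-size node array are all made up for illustration, memory_order_acq_rel is only an approximation of the fully ordered atomic_xchg(), and the acquire load in the observer is a simplification of how a real successor would find the node. The two functions are meant to be called from different threads.

```c
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_UNLOCKED 0	/* hypothetical stand-in for OSQ_UNLOCKED_VAL */
#define DEMO_NR_CPUS  4	/* assumption: small fixed CPU count for the demo */

struct demo_node {
	struct demo_node *next;
	int locked;
	int cpu;		/* encoded as cpu number + 1, so 0 means "no CPU" */
};

static struct demo_node demo_nodes[DEMO_NR_CPUS];
static atomic_int demo_tail = DEMO_UNLOCKED;

/*
 * Initialise this CPU's node, then publish it by swapping it into the tail.
 * The RELEASE half of acq_rel orders the plain stores to the node fields
 * before the tail update; with memory_order_acquire alone nothing would.
 * Returns the previous tail value.
 */
static int demo_publish(int cpu)
{
	struct demo_node *node = &demo_nodes[cpu];

	node->locked = 0;
	node->next = NULL;
	node->cpu = cpu + 1;

	return atomic_exchange_explicit(&demo_tail, cpu + 1,
					memory_order_acq_rel);
}

/*
 * Look up the node currently published in the tail, or NULL if unlocked.
 * The acquire load pairs with the release in demo_publish(), so the node
 * fields read through the returned pointer are the initialised ones.
 */
static struct demo_node *demo_observe_tail(void)
{
	int tail = atomic_load_explicit(&demo_tail, memory_order_acquire);

	if (tail == DEMO_UNLOCKED)
		return NULL;
	return &demo_nodes[tail - 1];
}
```

Swapping memory_order_acq_rel for memory_order_acquire in demo_publish() gives the analogue of the code before the patch: on a weakly ordered machine an observer could then, in principle, see the new tail before the node initialisation.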