* [RFC PATCH 1/4] asm-generic: cmpxchg: implement dummy cmpxchg64_relaxed operation
From: Will Deacon @ 2013-09-26 15:13 UTC
To: linux-kernel; +Cc: tony.luck, torvalds, linux-arch, Will Deacon, Arnd Bergmann
cmpxchg64_relaxed can be used to provide barrier-less semantics for a
64-bit cmpxchg operation in cases where strong memory ordering is not
required.
This patch adds a dummy implementation to asm-generic, simply falling
back to the usual, fully ordered cmpxchg64 code.
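For readers less familiar with relaxed atomics, a C11 analogue of the
distinction is sketched below. This is illustrative only: the kernel
defines its own primitives and does not use C11 atomics here, and the
*_like names are made up for the example.

#include <stdatomic.h>
#include <stdint.h>

/*
 * Illustrative C11 analogue of cmpxchg64() vs cmpxchg64_relaxed().
 * Both return the value observed at *p, as the kernel primitives do;
 * only the implied memory ordering differs.
 */
static uint64_t cmpxchg64_like(_Atomic uint64_t *p, uint64_t old, uint64_t new)
{
	/* Fully ordered: acts as a barrier on either side of the update. */
	atomic_compare_exchange_strong_explicit(p, &old, new,
						memory_order_seq_cst,
						memory_order_seq_cst);
	return old;	/* on failure, updated to the observed value */
}

static uint64_t cmpxchg64_relaxed_like(_Atomic uint64_t *p, uint64_t old, uint64_t new)
{
	/* Atomic, but imposes no ordering on surrounding accesses. */
	atomic_compare_exchange_strong_explicit(p, &old, new,
						memory_order_relaxed,
						memory_order_relaxed);
	return old;
}

On a strongly ordered machine the two compile to much the same thing;
on a weakly ordered one the relaxed form can drop the barriers.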
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
include/asm-generic/cmpxchg.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/asm-generic/cmpxchg.h b/include/asm-generic/cmpxchg.h
index 811fb1e..298b9d4 100644
--- a/include/asm-generic/cmpxchg.h
+++ b/include/asm-generic/cmpxchg.h
@@ -102,7 +102,8 @@ unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
#define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
#endif
-#define cmpxchg(ptr, o, n) cmpxchg_local((ptr), (o), (n))
-#define cmpxchg64(ptr, o, n) cmpxchg64_local((ptr), (o), (n))
+#define cmpxchg(ptr, o, n) cmpxchg_local((ptr), (o), (n))
+#define cmpxchg64(ptr, o, n) cmpxchg64_local((ptr), (o), (n))
+#define cmpxchg64_relaxed(ptr, o, n) cmpxchg64((ptr), (o), (n))
#endif /* __ASM_GENERIC_CMPXCHG_H */
--
1.8.2.2
* [RFC PATCH 2/4] x86: cmpxchg: implement dummy cmpxchg64_relaxed operation
From: Will Deacon @ 2013-09-26 15:13 UTC
To: linux-kernel; +Cc: tony.luck, torvalds, linux-arch, Will Deacon, x86
cmpxchg64_relaxed can be used to provide barrier-less semantics for a
64-bit cmpxchg operation in cases where strong memory ordering is not
required. One use-case is the recently merged lockless lockref code.
This patch adds a dummy implementation for x86: cmpxchg there compiles
to a LOCK-prefixed instruction, which is already fully ordered, so a
relaxed variant cannot be made any cheaper on this architecture.
Cc: <x86@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/x86/include/asm/cmpxchg.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index d47786a..aacb99a0 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -152,6 +152,9 @@ extern void __add_wrong_size(void)
#define cmpxchg_local(ptr, old, new) \
__cmpxchg_local(ptr, old, new, sizeof(*(ptr)))
+
+#define cmpxchg64_relaxed(ptr, old, new) \
+ cmpxchg64(ptr, old, new)
#endif
/*
--
1.8.2.2
* [RFC PATCH 3/4] ia64: cmpxchg: implement dummy cmpxchg64_relaxed operation
From: Will Deacon @ 2013-09-26 15:13 UTC
To: linux-kernel; +Cc: tony.luck, torvalds, linux-arch, Will Deacon
cmpxchg64_relaxed can be used to provide barrier-less semantics for a
64-bit cmpxchg operation in cases where strong memory ordering is not
required. One use-case is the recently merged lockless lockref code.
This patch adds a dummy implementation for ia64, which could probably
be improved by dropping the half barrier (the acquire semantics)
carried by the default cmpxchg64 macro.
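(If this were later optimised, the closest available sketch --
hypothetical, and assuming the header's existing cmpxchg_rel() macro --
would be to pick the other half barrier, since the ia64 cmpxchg
instruction always carries an .acq or .rel completer and has no fully
unordered form:

#define cmpxchg64_relaxed(ptr, o, n)	cmpxchg_rel((ptr), (o), (n))

Whether .rel is actually cheaper than .acq here is an open question,
hence the dummy mapping for now.)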
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/ia64/include/uapi/asm/cmpxchg.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/ia64/include/uapi/asm/cmpxchg.h b/arch/ia64/include/uapi/asm/cmpxchg.h
index 4f37dbb..8984b6e 100644
--- a/arch/ia64/include/uapi/asm/cmpxchg.h
+++ b/arch/ia64/include/uapi/asm/cmpxchg.h
@@ -124,6 +124,7 @@ extern long ia64_cmpxchg_called_with_bad_pointer(void);
#define cmpxchg_local cmpxchg
#define cmpxchg64_local cmpxchg64
+#define cmpxchg64_relaxed cmpxchg64
#ifdef CONFIG_IA64_DEBUG_CMPXCHG
# define CMPXCHG_BUGCHECK_DECL int _cmpxchg_bugcheck_count = 128;
--
1.8.2.2
* [RFC PATCH 4/4] lib: lockref: use relaxed cmpxchg64 variant for lockless updates
From: Will Deacon @ 2013-09-26 15:13 UTC
To: linux-kernel; +Cc: tony.luck, torvalds, linux-arch, Will Deacon, Waiman Long
The 64-bit cmpxchg operation on the lockref is already ordered by
virtue of the hazard between the cmpxchg and the reference count
manipulation: each retry consumes the value returned by the previous
cmpxchg, so there are no independent accesses that need ordering
against it. On weakly ordered memory architectures (such as ARM), it
can therefore be of great benefit to omit the barrier instructions
where they are not needed.
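In outline, the lockless path has the following shape (a simplified
rendering of the CMPXCHG_LOOP macro changed in the diff below, not new
code; the CODE/SUCCESS hooks are shown as a lockref_get-style count
bump and an early return):

	struct lockref old, new, prev;

	old.lock_count = ACCESS_ONCE(lockref->lock_count);
	while (arch_spin_value_unlocked(old.lock.rlock.raw_lock)) {
		new = old;
		prev = old;
		new.count++;			/* e.g. lockref_get() */
		old.lock_count = cmpxchg64_relaxed(&lockref->lock_count,
						   old.lock_count,
						   new.lock_count);
		if (old.lock_count == prev.lock_count)
			return;			/* updated without the lock */
	}
	/* contended or locked: fall back to taking the spinlock */

The cmpxchg both reads and writes lockref->lock_count, and every retry
depends on the value it returned, which is why atomicity alone is
enough here.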
This patch moves the lockless lockref code over to the new
cmpxchg64_relaxed operation, which doesn't provide barrier semantics.
Cc: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
So here's a quick stab at allowing the memory barrier semantics to be
avoided on weakly ordered architectures. This helps ARM, but it would be
interesting to see if ia64 gets a boost too (although I've not relaxed
their cmpxchg because there is uapi stuff involved that I wasn't
comfortable refactoring).
lib/lockref.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/lib/lockref.c b/lib/lockref.c
index 677d036..6d896ab 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -14,8 +14,9 @@
while (likely(arch_spin_value_unlocked(old.lock.rlock.raw_lock))) { \
struct lockref new = old, prev = old; \
CODE \
- old.lock_count = cmpxchg64(&lockref->lock_count, \
- old.lock_count, new.lock_count); \
+ old.lock_count = cmpxchg64_relaxed(&lockref->lock_count, \
+ old.lock_count, \
+ new.lock_count); \
if (likely(old.lock_count == prev.lock_count)) { \
SUCCESS; \
} \
--
1.8.2.2
* Re: [RFC PATCH 1/4] asm-generic: cmpxchg: implement dummy cmpxchg64_relaxed operation
From: Linus Torvalds @ 2013-09-26 15:34 UTC
To: Will Deacon
Cc: Linux Kernel Mailing List, Tony Luck, linux-arch@vger.kernel.org,
Arnd Bergmann
On Thu, Sep 26, 2013 at 8:13 AM, Will Deacon <will.deacon@arm.com> wrote:
>
> This patch implements a dummy implementation for asm-generic, falling
> back to the usual cmpxchg64 code.
I don't like the "let's add dummy operations for everybody who doesn't
care" when it is this specialized.
I'd much rather just add a single
#ifndef cmpxchg64_relaxed
# define cmpxchg64_relaxed cmpxchg64
#endif
to the LOCKREF code, and then ARM (and others) can define it as they wish.
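Something like the following sketch at the top of lib/lockref.c
(hypothetical placement, spelling out the suggestion above):

#ifndef cmpxchg64_relaxed
# define cmpxchg64_relaxed cmpxchg64
#endif

CMPXCHG_LOOP would then call cmpxchg64_relaxed() unconditionally, as in
patch 4/4, and only architectures that benefit need to provide their
own definition.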
And *if* anybody else ever realizes that they want this outside of the
lockref code, let's look at doing that then. Right now I don't know of
any users, and I'd be leery of people using this willy-nilly, because
very few people really understand memory ordering.
Linus
* Re: [RFC PATCH 1/4] asm-generic: cmpxchg: implement dummy cmpxchg64_relaxed operation
From: Will Deacon @ 2013-09-26 15:53 UTC
To: Linus Torvalds
Cc: Linux Kernel Mailing List, Tony Luck, linux-arch@vger.kernel.org,
Arnd Bergmann
On Thu, Sep 26, 2013 at 04:34:04PM +0100, Linus Torvalds wrote:
> On Thu, Sep 26, 2013 at 8:13 AM, Will Deacon <will.deacon@arm.com> wrote:
> >
> > This patch implements a dummy implementation for asm-generic, falling
> > back to the usual cmpxchg64 code.
>
> I don't like the "let's add dummy operations for everybody who doesn't
> care" when it is this specialized.
>
> I'd much rather just add a single
>
> #ifndef cmpxchg64_relaxed
> # define cmpxchg64_relaxed cmpxchg64
> #endif
>
> to the LOCKREF code, and then ARM (and others) can define it as they wish.
Okey doke.
> And *if* anybody else ever realizes that they want this outside of the
> lockref code, let's look at doing that then. Right now I don't know of
> any users, and I'd be leery of people using this willy-nilly, because
> very few people really understand memory ordering.
Agreed, and I really doubt there are many cases where the cmpxchg hazarding
works out nicely so as to obviate the need for memory barriers.
I'll send a revised patch.
Cheers,
Will