* [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32
@ 2024-04-08 9:13 Uros Bizjak
2024-04-08 9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Uros Bizjak @ 2024-04-08 9:13 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Peter Zijlstra
The following patch series improves the arch_cmpxchg64() family of
functions by rewriting it in a similar way as was recently done for
the arch_cmpxchg128() family.

The improvements build on the recent improvement of the emulated
cmpxchg8b_emu() library function and introduce arch_try_cmpxchg64()
also for !CONFIG_X86_CMPXCHG64 targets.

While the patch series enables impressive assembly code reductions
for somewhat obsolete !CONFIG_X86_CMPXCHG64 targets, the true reason
for the changes will become evident in a follow-up patch series,
where this series enables the unification of several atomic functions
between x86_64 and x86_32 targets.

OTOH, by adopting the same approach and a similar structure for the
arch_cmpxchg64() macros as for the arch_cmpxchg128() macros, these
patches lower the future maintenance burden and technical debt of
the source code.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Uros Bizjak (3):
locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128()
locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}()
locking/atomic/x86: Introduce arch_try_cmpxchg64() for
!CONFIG_X86_CMPXCHG64
arch/x86/include/asm/cmpxchg_32.h | 207 ++++++++++++++++++------------
arch/x86/include/asm/cmpxchg_64.h | 2 +-
2 files changed, 129 insertions(+), 80 deletions(-)
--
2.44.0
^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128()
  2024-04-08  9:13 [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32 Uros Bizjak
@ 2024-04-08  9:13 ` Uros Bizjak
  2024-04-09  8:26   ` [tip: locking/core] " tip-bot2 for Uros Bizjak
  2024-04-08  9:13 ` [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() Uros Bizjak
  2024-04-08  9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak
  2 siblings, 1 reply; 8+ messages in thread
From: Uros Bizjak @ 2024-04-08  9:13 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
      Dave Hansen, H. Peter Anvin, Peter Zijlstra

Correct the definition of __arch_try_cmpxchg128(), introduced by:

  b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()")

Fixes: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()")
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 arch/x86/include/asm/cmpxchg_64.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cmpxchg_64.h b/arch/x86/include/asm/cmpxchg_64.h
index 44b08b53ab32..c1d6cd58f809 100644
--- a/arch/x86/include/asm/cmpxchg_64.h
+++ b/arch/x86/include/asm/cmpxchg_64.h
@@ -62,7 +62,7 @@ static __always_inline u128 arch_cmpxchg128_local(volatile u128 *ptr, u128 old,
 	asm volatile(_lock "cmpxchg16b %[ptr]"		\
 		     CC_SET(e)				\
 		     : CC_OUT(e) (ret),			\
-		       [ptr] "+m" (*ptr),		\
+		       [ptr] "+m" (*(_ptr)),		\
 		       "+a" (o.low), "+d" (o.high)	\
 		     : "b" (n.low), "c" (n.high)	\
 		     : "memory");			\
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* [tip: locking/core] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128()
  2024-04-08  9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak
@ 2024-04-09  8:26   ` tip-bot2 for Uros Bizjak
  0 siblings, 0 replies; 8+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2024-04-09  8:26 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Uros Bizjak, Ingo Molnar, Linus Torvalds, H. Peter Anvin, x86,
      linux-kernel

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     929ad065ba2967be238dfdc0895b79fda62c7f16
Gitweb:        https://git.kernel.org/tip/929ad065ba2967be238dfdc0895b79fda62c7f16
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Mon, 08 Apr 2024 11:13:56 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 09 Apr 2024 09:51:02 +02:00

locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128()

Correct the definition of __arch_try_cmpxchg128(), introduced by:

  b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()")

Fixes: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()")
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20240408091547.90111-2-ubizjak@gmail.com
---
 arch/x86/include/asm/cmpxchg_64.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cmpxchg_64.h b/arch/x86/include/asm/cmpxchg_64.h
index 44b08b5..c1d6cd5 100644
--- a/arch/x86/include/asm/cmpxchg_64.h
+++ b/arch/x86/include/asm/cmpxchg_64.h
@@ -62,7 +62,7 @@ static __always_inline u128 arch_cmpxchg128_local(volatile u128 *ptr, u128 old,
 	asm volatile(_lock "cmpxchg16b %[ptr]"		\
 		     CC_SET(e)				\
 		     : CC_OUT(e) (ret),			\
-		       [ptr] "+m" (*ptr),		\
+		       [ptr] "+m" (*(_ptr)),		\
 		       "+a" (o.low), "+d" (o.high)	\
 		     : "b" (n.low), "c" (n.high)	\
 		     : "memory");			\

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}()
  2024-04-08  9:13 [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32 Uros Bizjak
  2024-04-08  9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak
@ 2024-04-08  9:13 ` Uros Bizjak
  2024-04-09  8:26   ` [tip: locking/core] " tip-bot2 for Uros Bizjak
  2024-04-08  9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak
  2 siblings, 1 reply; 8+ messages in thread
From: Uros Bizjak @ 2024-04-08  9:13 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
      Dave Hansen, H. Peter Anvin, Peter Zijlstra

Commit:

  b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()")

introduced arch_{,try_}_cmpxchg128{,_local}() for x86_64 targets.

Modernize existing x86_32 arch_{,try_}_cmpxchg64{,_local}() definitions
to follow the same structure as the definitions introduced by the
above commit.

No functional changes intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 arch/x86/include/asm/cmpxchg_32.h | 179 +++++++++++++++++------------
 1 file changed, 100 insertions(+), 79 deletions(-)

diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index b5731c51f0f4..fe40d0681ea8 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -3,103 +3,124 @@
 #define _ASM_X86_CMPXCHG_32_H
 
 /*
- * Note: if you use set64_bit(), __cmpxchg64(), or their variants,
+ * Note: if you use __cmpxchg64(), or their variants,
  * you need to test for the feature in boot_cpu_data.
  */
 
-#ifdef CONFIG_X86_CMPXCHG64
-#define arch_cmpxchg64(ptr, o, n)					\
-	((__typeof__(*(ptr)))__cmpxchg64((ptr), (unsigned long long)(o), \
-					 (unsigned long long)(n)))
-#define arch_cmpxchg64_local(ptr, o, n)					\
-	((__typeof__(*(ptr)))__cmpxchg64_local((ptr), (unsigned long long)(o), \
-					       (unsigned long long)(n)))
-#define arch_try_cmpxchg64(ptr, po, n)					\
-	__try_cmpxchg64((ptr), (unsigned long long *)(po),		\
-			(unsigned long long)(n))
-#endif
+union __u64_halves {
+	u64 full;
+	struct {
+		u32 low, high;
+	};
+};
+
+#define __arch_cmpxchg64(_ptr, _old, _new, _lock)	\
+({							\
+	union __u64_halves o = { .full = (_old), },	\
+			   n = { .full = (_new), };	\
+							\
+	asm volatile(_lock "cmpxchg8b %[ptr]"		\
+		     : [ptr] "+m" (*(_ptr)),		\
+		       "+a" (o.low), "+d" (o.high)	\
+		     : "b" (n.low), "c" (n.high)	\
+		     : "memory");			\
+							\
+	o.full;						\
+})
 
-static inline u64 __cmpxchg64(volatile u64 *ptr, u64 old, u64 new)
+
+static __always_inline u64 __cmpxchg64(volatile u64 *ptr, u64 old, u64 new)
 {
-	u64 prev;
-	asm volatile(LOCK_PREFIX "cmpxchg8b %1"
-		     : "=A" (prev),
-		       "+m" (*ptr)
-		     : "b" ((u32)new),
-		       "c" ((u32)(new >> 32)),
-		       "0" (old)
-		     : "memory");
-	return prev;
+	return __arch_cmpxchg64(ptr, old, new, LOCK_PREFIX);
 }
 
-static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
+static __always_inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
 {
-	u64 prev;
-	asm volatile("cmpxchg8b %1"
-		     : "=A" (prev),
-		       "+m" (*ptr)
-		     : "b" ((u32)new),
-		       "c" ((u32)(new >> 32)),
-		       "0" (old)
-		     : "memory");
-	return prev;
+	return __arch_cmpxchg64(ptr, old, new,);
 }
 
-static inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *pold, u64 new)
+#define __arch_try_cmpxchg64(_ptr, _oldp, _new, _lock)	\
+({							\
+	union __u64_halves o = { .full = *(_oldp), },	\
+			   n = { .full = (_new), };	\
+	bool ret;					\
+							\
+	asm volatile(_lock "cmpxchg8b %[ptr]"		\
+		     CC_SET(e)				\
+		     : CC_OUT(e) (ret),			\
+		       [ptr] "+m" (*(_ptr)),		\
+		       "+a" (o.low), "+d" (o.high)	\
+		     : "b" (n.low), "c" (n.high)	\
+		     : "memory");			\
+							\
+	if (unlikely(!ret))				\
+		*(_oldp) = o.full;			\
+							\
+	likely(ret);					\
+})
+
+static __always_inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *oldp, u64 new)
 {
-	bool success;
-	u64 old = *pold;
-	asm volatile(LOCK_PREFIX "cmpxchg8b %[ptr]"
-		     CC_SET(z)
-		     : CC_OUT(z) (success),
-		       [ptr] "+m" (*ptr),
-		       "+A" (old)
-		     : "b" ((u32)new),
-		       "c" ((u32)(new >> 32))
-		     : "memory");
-
-	if (unlikely(!success))
-		*pold = old;
-	return success;
+	return __arch_try_cmpxchg64(ptr, oldp, new, LOCK_PREFIX);
 }
 
-#ifndef CONFIG_X86_CMPXCHG64
+#ifdef CONFIG_X86_CMPXCHG64
+
+#define arch_cmpxchg64 __cmpxchg64
+
+#define arch_cmpxchg64_local __cmpxchg64_local
+
+#define arch_try_cmpxchg64 __try_cmpxchg64
+
+#else
+
 /*
  * Building a kernel capable running on 80386 and 80486. It may be necessary
  * to simulate the cmpxchg8b on the 80386 and 80486 CPU.
  */
-#define arch_cmpxchg64(ptr, o, n)				\
-({								\
-	__typeof__(*(ptr)) __ret;				\
-	__typeof__(*(ptr)) __old = (o);				\
-	__typeof__(*(ptr)) __new = (n);				\
-	alternative_io(LOCK_PREFIX_HERE				\
-			"call cmpxchg8b_emu",			\
-			"lock; cmpxchg8b (%%esi)" ,		\
-		       X86_FEATURE_CX8,				\
-		       "=A" (__ret),				\
-		       "S" ((ptr)), "0" (__old),		\
-		       "b" ((unsigned int)__new),		\
-		       "c" ((unsigned int)(__new>>32))		\
-		       : "memory");				\
-	__ret; })
-
-
-#define arch_cmpxchg64_local(ptr, o, n)				\
-({								\
-	__typeof__(*(ptr)) __ret;				\
-	__typeof__(*(ptr)) __old = (o);				\
-	__typeof__(*(ptr)) __new = (n);				\
-	alternative_io("call cmpxchg8b_emu",			\
-		       "cmpxchg8b (%%esi)" ,			\
-		       X86_FEATURE_CX8,				\
-		       "=A" (__ret),				\
-		       "S" ((ptr)), "0" (__old),		\
-		       "b" ((unsigned int)__new),		\
-		       "c" ((unsigned int)(__new>>32))		\
-		       : "memory");				\
-	__ret; })
+#define __arch_cmpxchg64_emu(_ptr, _old, _new)				\
+({									\
+	union __u64_halves o = { .full = (_old), },			\
+			   n = { .full = (_new), };			\
+									\
+	asm volatile(ALTERNATIVE(LOCK_PREFIX_HERE			\
+				 "call cmpxchg8b_emu",			\
+				 "lock; cmpxchg8b %[ptr]", X86_FEATURE_CX8) \
+		     : [ptr] "+m" (*(_ptr)),				\
+		       "+a" (o.low), "+d" (o.high)			\
+		     : "b" (n.low), "c" (n.high), "S" (_ptr)		\
+		     : "memory");					\
+									\
+	o.full;								\
+})
+
+static __always_inline u64 arch_cmpxchg64(volatile u64 *ptr, u64 old, u64 new)
+{
+	return __arch_cmpxchg64_emu(ptr, old, new);
+}
+#define arch_cmpxchg64 arch_cmpxchg64
+
+#define __arch_cmpxchg64_emu_local(_ptr, _old, _new)			\
+({									\
+	union __u64_halves o = { .full = (_old), },			\
+			   n = { .full = (_new), };			\
+									\
+	asm volatile(ALTERNATIVE("call cmpxchg8b_emu",			\
+				 "cmpxchg8b %[ptr]", X86_FEATURE_CX8)	\
+		     : [ptr] "+m" (*(_ptr)),				\
+		       "+a" (o.low), "+d" (o.high)			\
+		     : "b" (n.low), "c" (n.high), "S" (_ptr)		\
+		     : "memory");					\
+									\
+	o.full;								\
+})
+
+static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
+{
+	return __arch_cmpxchg64_emu_local(ptr, old, new);
+}
+#define arch_cmpxchg64_local arch_cmpxchg64_local
 
 #endif
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* [tip: locking/core] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}()
  2024-04-08  9:13 ` [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() Uros Bizjak
@ 2024-04-09  8:26   ` tip-bot2 for Uros Bizjak
  0 siblings, 0 replies; 8+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2024-04-09  8:26 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Uros Bizjak, Ingo Molnar, Linus Torvalds, H. Peter Anvin, x86,
      linux-kernel

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     7016cc5def44b9dcb28089efae4412fa0d6c78c2
Gitweb:        https://git.kernel.org/tip/7016cc5def44b9dcb28089efae4412fa0d6c78c2
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Mon, 08 Apr 2024 11:13:57 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 09 Apr 2024 09:51:03 +02:00

locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}()

Commit:

  b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()")

introduced arch_{,try_}_cmpxchg128{,_local}() for x86_64 targets.

Modernize existing x86_32 arch_{,try_}_cmpxchg64{,_local}() definitions
to follow the same structure as the definitions introduced by the
above commit.

No functional changes intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20240408091547.90111-3-ubizjak@gmail.com
---
 arch/x86/include/asm/cmpxchg_32.h | 179 ++++++++++++++++-------------
 1 file changed, 100 insertions(+), 79 deletions(-)

diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index b5731c5..fe40d06 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -3,103 +3,124 @@
 #define _ASM_X86_CMPXCHG_32_H
 
 /*
- * Note: if you use set64_bit(), __cmpxchg64(), or their variants,
+ * Note: if you use __cmpxchg64(), or their variants,
  * you need to test for the feature in boot_cpu_data.
  */
 
-#ifdef CONFIG_X86_CMPXCHG64
-#define arch_cmpxchg64(ptr, o, n)					\
-	((__typeof__(*(ptr)))__cmpxchg64((ptr), (unsigned long long)(o), \
-					 (unsigned long long)(n)))
-#define arch_cmpxchg64_local(ptr, o, n)					\
-	((__typeof__(*(ptr)))__cmpxchg64_local((ptr), (unsigned long long)(o), \
-					       (unsigned long long)(n)))
-#define arch_try_cmpxchg64(ptr, po, n)					\
-	__try_cmpxchg64((ptr), (unsigned long long *)(po),		\
-			(unsigned long long)(n))
-#endif
+union __u64_halves {
+	u64 full;
+	struct {
+		u32 low, high;
+	};
+};
+
+#define __arch_cmpxchg64(_ptr, _old, _new, _lock)	\
+({							\
+	union __u64_halves o = { .full = (_old), },	\
+			   n = { .full = (_new), };	\
+							\
+	asm volatile(_lock "cmpxchg8b %[ptr]"		\
+		     : [ptr] "+m" (*(_ptr)),		\
+		       "+a" (o.low), "+d" (o.high)	\
+		     : "b" (n.low), "c" (n.high)	\
+		     : "memory");			\
+							\
+	o.full;						\
+})
 
-static inline u64 __cmpxchg64(volatile u64 *ptr, u64 old, u64 new)
+
+static __always_inline u64 __cmpxchg64(volatile u64 *ptr, u64 old, u64 new)
 {
-	u64 prev;
-	asm volatile(LOCK_PREFIX "cmpxchg8b %1"
-		     : "=A" (prev),
-		       "+m" (*ptr)
-		     : "b" ((u32)new),
-		       "c" ((u32)(new >> 32)),
-		       "0" (old)
-		     : "memory");
-	return prev;
+	return __arch_cmpxchg64(ptr, old, new, LOCK_PREFIX);
 }
 
-static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
+static __always_inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
 {
-	u64 prev;
-	asm volatile("cmpxchg8b %1"
-		     : "=A" (prev),
-		       "+m" (*ptr)
-		     : "b" ((u32)new),
-		       "c" ((u32)(new >> 32)),
-		       "0" (old)
-		     : "memory");
-	return prev;
+	return __arch_cmpxchg64(ptr, old, new,);
 }
 
-static inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *pold, u64 new)
+#define __arch_try_cmpxchg64(_ptr, _oldp, _new, _lock)	\
+({							\
+	union __u64_halves o = { .full = *(_oldp), },	\
+			   n = { .full = (_new), };	\
+	bool ret;					\
+							\
+	asm volatile(_lock "cmpxchg8b %[ptr]"		\
+		     CC_SET(e)				\
+		     : CC_OUT(e) (ret),			\
+		       [ptr] "+m" (*(_ptr)),		\
+		       "+a" (o.low), "+d" (o.high)	\
+		     : "b" (n.low), "c" (n.high)	\
+		     : "memory");			\
+							\
+	if (unlikely(!ret))				\
+		*(_oldp) = o.full;			\
+							\
+	likely(ret);					\
+})
+
+static __always_inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *oldp, u64 new)
 {
-	bool success;
-	u64 old = *pold;
-	asm volatile(LOCK_PREFIX "cmpxchg8b %[ptr]"
-		     CC_SET(z)
-		     : CC_OUT(z) (success),
-		       [ptr] "+m" (*ptr),
-		       "+A" (old)
-		     : "b" ((u32)new),
-		       "c" ((u32)(new >> 32))
-		     : "memory");
-
-	if (unlikely(!success))
-		*pold = old;
-	return success;
+	return __arch_try_cmpxchg64(ptr, oldp, new, LOCK_PREFIX);
 }
 
-#ifndef CONFIG_X86_CMPXCHG64
+#ifdef CONFIG_X86_CMPXCHG64
+
+#define arch_cmpxchg64 __cmpxchg64
+
+#define arch_cmpxchg64_local __cmpxchg64_local
+
+#define arch_try_cmpxchg64 __try_cmpxchg64
+
+#else
+
 /*
  * Building a kernel capable running on 80386 and 80486. It may be necessary
  * to simulate the cmpxchg8b on the 80386 and 80486 CPU.
  */
-#define arch_cmpxchg64(ptr, o, n)				\
-({								\
-	__typeof__(*(ptr)) __ret;				\
-	__typeof__(*(ptr)) __old = (o);				\
-	__typeof__(*(ptr)) __new = (n);				\
-	alternative_io(LOCK_PREFIX_HERE				\
-			"call cmpxchg8b_emu",			\
-			"lock; cmpxchg8b (%%esi)" ,		\
-		       X86_FEATURE_CX8,				\
-		       "=A" (__ret),				\
-		       "S" ((ptr)), "0" (__old),		\
-		       "b" ((unsigned int)__new),		\
-		       "c" ((unsigned int)(__new>>32))		\
-		       : "memory");				\
-	__ret; })
-
-
-#define arch_cmpxchg64_local(ptr, o, n)				\
-({								\
-	__typeof__(*(ptr)) __ret;				\
-	__typeof__(*(ptr)) __old = (o);				\
-	__typeof__(*(ptr)) __new = (n);				\
-	alternative_io("call cmpxchg8b_emu",			\
-		       "cmpxchg8b (%%esi)" ,			\
-		       X86_FEATURE_CX8,				\
-		       "=A" (__ret),				\
-		       "S" ((ptr)), "0" (__old),		\
-		       "b" ((unsigned int)__new),		\
-		       "c" ((unsigned int)(__new>>32))		\
-		       : "memory");				\
-	__ret; })
+#define __arch_cmpxchg64_emu(_ptr, _old, _new)				\
+({									\
+	union __u64_halves o = { .full = (_old), },			\
+			   n = { .full = (_new), };			\
+									\
+	asm volatile(ALTERNATIVE(LOCK_PREFIX_HERE			\
+				 "call cmpxchg8b_emu",			\
+				 "lock; cmpxchg8b %[ptr]", X86_FEATURE_CX8) \
+		     : [ptr] "+m" (*(_ptr)),				\
+		       "+a" (o.low), "+d" (o.high)			\
+		     : "b" (n.low), "c" (n.high), "S" (_ptr)		\
+		     : "memory");					\
+									\
+	o.full;								\
+})
+
+static __always_inline u64 arch_cmpxchg64(volatile u64 *ptr, u64 old, u64 new)
+{
+	return __arch_cmpxchg64_emu(ptr, old, new);
+}
+#define arch_cmpxchg64 arch_cmpxchg64
+
+#define __arch_cmpxchg64_emu_local(_ptr, _old, _new)			\
+({									\
+	union __u64_halves o = { .full = (_old), },			\
+			   n = { .full = (_new), };			\
+									\
+	asm volatile(ALTERNATIVE("call cmpxchg8b_emu",			\
+				 "cmpxchg8b %[ptr]", X86_FEATURE_CX8)	\
+		     : [ptr] "+m" (*(_ptr)),				\
+		       "+a" (o.low), "+d" (o.high)			\
+		     : "b" (n.low), "c" (n.high), "S" (_ptr)		\
+		     : "memory");					\
+									\
+	o.full;								\
+})
+
+static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
+{
+	return __arch_cmpxchg64_emu_local(ptr, old, new);
+}
+#define arch_cmpxchg64_local arch_cmpxchg64_local
 
 #endif

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64
  2024-04-08  9:13 [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32 Uros Bizjak
  2024-04-08  9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak
  2024-04-08  9:13 ` [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() Uros Bizjak
@ 2024-04-08  9:13 ` Uros Bizjak
  2024-04-09  7:50   ` Ingo Molnar
  2024-04-09  8:26   ` [tip: locking/core] " tip-bot2 for Uros Bizjak
  2 siblings, 2 replies; 8+ messages in thread
From: Uros Bizjak @ 2024-04-08  9:13 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
      Dave Hansen, H. Peter Anvin, Peter Zijlstra

Commit:

  6d12c8d308e68 ("percpu: Wire up cmpxchg128")

improved the emulated cmpxchg8b_emu() library function to return
success/failure in the ZF flag.

Define arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 targets
to override the generic arch_try_cmpxchg() with an optimized
target specific implementation that handles the ZF flag.

The assembly code at the call sites improves from:

   bf56d:	e8 fc ff ff ff       	call   cmpxchg8b_emu
   bf572:	8b 74 24 28          	mov    0x28(%esp),%esi
   bf576:	89 c3                	mov    %eax,%ebx
   bf578:	89 d1                	mov    %edx,%ecx
   bf57a:	8b 7c 24 2c          	mov    0x2c(%esp),%edi
   bf57e:	89 f0                	mov    %esi,%eax
   bf580:	89 fa                	mov    %edi,%edx
   bf582:	31 d8                	xor    %ebx,%eax
   bf584:	31 ca                	xor    %ecx,%edx
   bf586:	09 d0                	or     %edx,%eax
   bf588:	0f 84 e3 01 00 00    	je     bf771 <...>

to:

   bf572:	e8 fc ff ff ff       	call   cmpxchg8b_emu
   bf577:	0f 84 b6 01 00 00    	je     bf733 <...>

No functional changes intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 arch/x86/include/asm/cmpxchg_32.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index fe40d0681ea8..9e0d330dd5d0 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -122,6 +122,34 @@ static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64
 }
 #define arch_cmpxchg64_local arch_cmpxchg64_local
 
+#define __arch_try_cmpxchg64_emu(_ptr, _oldp, _new)			\
+({									\
+	union __u64_halves o = { .full = *(_oldp), },			\
+			   n = { .full = (_new), };			\
+	bool ret;							\
+									\
+	asm volatile(ALTERNATIVE(LOCK_PREFIX_HERE			\
+				 "call cmpxchg8b_emu",			\
+				 "lock; cmpxchg8b %[ptr]", X86_FEATURE_CX8) \
+		     CC_SET(e)						\
+		     : CC_OUT(e) (ret),					\
+		       [ptr] "+m" (*(_ptr)),				\
+		       "+a" (o.low), "+d" (o.high)			\
+		     : "b" (n.low), "c" (n.high), "S" (_ptr)		\
+		     : "memory");					\
+									\
+	if (unlikely(!ret))						\
+		*(_oldp) = o.full;					\
+									\
+	likely(ret);							\
+})
+
+static __always_inline bool arch_try_cmpxchg64(volatile u64 *ptr, u64 *oldp, u64 new)
+{
+	return __arch_try_cmpxchg64_emu(ptr, oldp, new);
+}
+#define arch_try_cmpxchg64 arch_try_cmpxchg64
+
 #endif
 
 #define system_has_cmpxchg64()		boot_cpu_has(X86_FEATURE_CX8)
-- 
2.44.0

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* Re: [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64
  2024-04-08  9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak
@ 2024-04-09  7:50   ` Ingo Molnar
  2024-04-09  8:26   ` [tip: locking/core] " tip-bot2 for Uros Bizjak
  1 sibling, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2024-04-09  7:50 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: x86, linux-kernel, Thomas Gleixner, Borislav Petkov, Dave Hansen,
      H. Peter Anvin, Peter Zijlstra


* Uros Bizjak <ubizjak@gmail.com> wrote:

> Commit:
>
>   6d12c8d308e68 ("percpu: Wire up cmpxchg128")
>
> improved the emulated cmpxchg8b_emu() library function to return
> success/failure in the ZF flag.
>
> Define arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 targets
> to override the generic arch_try_cmpxchg() with an optimized
> target specific implementation that handles the ZF flag.
>
> The assembly code at the call sites improves from:
>
>    bf56d:	e8 fc ff ff ff       	call   cmpxchg8b_emu
>    bf572:	8b 74 24 28          	mov    0x28(%esp),%esi
>    bf576:	89 c3                	mov    %eax,%ebx
>    bf578:	89 d1                	mov    %edx,%ecx
>    bf57a:	8b 7c 24 2c          	mov    0x2c(%esp),%edi
>    bf57e:	89 f0                	mov    %esi,%eax
>    bf580:	89 fa                	mov    %edi,%edx
>    bf582:	31 d8                	xor    %ebx,%eax
>    bf584:	31 ca                	xor    %ecx,%edx
>    bf586:	09 d0                	or     %edx,%eax
>    bf588:	0f 84 e3 01 00 00    	je     bf771 <...>
>
> to:
>
>    bf572:	e8 fc ff ff ff       	call   cmpxchg8b_emu
>    bf577:	0f 84 b6 01 00 00    	je     bf733 <...>
>
> No functional changes intended.

Side note: while there's no hard-written rule for it, I tend to use the
'no functional changes intended' line for pure identity transformations -
which this one isn't, as it changes code generation materially.

So I removed that line - the explanation of the patch is clear enough IMO.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [tip: locking/core] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64
  2024-04-08  9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak
  2024-04-09  7:50   ` Ingo Molnar
@ 2024-04-09  8:26   ` tip-bot2 for Uros Bizjak
  1 sibling, 0 replies; 8+ messages in thread
From: tip-bot2 for Uros Bizjak @ 2024-04-09  8:26 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Uros Bizjak, Ingo Molnar, Linus Torvalds, H. Peter Anvin, x86,
      linux-kernel

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     aef95dac9ce4f271cc43195ffc175114ed934cbe
Gitweb:        https://git.kernel.org/tip/aef95dac9ce4f271cc43195ffc175114ed934cbe
Author:        Uros Bizjak <ubizjak@gmail.com>
AuthorDate:    Mon, 08 Apr 2024 11:13:58 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 09 Apr 2024 09:51:03 +02:00

locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64

Commit:

  6d12c8d308e68 ("percpu: Wire up cmpxchg128")

improved the emulated cmpxchg8b_emu() library function to return
success/failure in the ZF flag.

Define arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 targets
to override the generic arch_try_cmpxchg() with an optimized
target specific implementation that handles the ZF flag.

The assembly code at the call sites improves from:

   bf56d:	e8 fc ff ff ff       	call   cmpxchg8b_emu
   bf572:	8b 74 24 28          	mov    0x28(%esp),%esi
   bf576:	89 c3                	mov    %eax,%ebx
   bf578:	89 d1                	mov    %edx,%ecx
   bf57a:	8b 7c 24 2c          	mov    0x2c(%esp),%edi
   bf57e:	89 f0                	mov    %esi,%eax
   bf580:	89 fa                	mov    %edi,%edx
   bf582:	31 d8                	xor    %ebx,%eax
   bf584:	31 ca                	xor    %ecx,%edx
   bf586:	09 d0                	or     %edx,%eax
   bf588:	0f 84 e3 01 00 00    	je     bf771 <...>

to:

   bf572:	e8 fc ff ff ff       	call   cmpxchg8b_emu
   bf577:	0f 84 b6 01 00 00    	je     bf733 <...>

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20240408091547.90111-4-ubizjak@gmail.com
---
 arch/x86/include/asm/cmpxchg_32.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index fe40d06..9e0d330 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -122,6 +122,34 @@ static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64
 }
 #define arch_cmpxchg64_local arch_cmpxchg64_local
 
+#define __arch_try_cmpxchg64_emu(_ptr, _oldp, _new)			\
+({									\
+	union __u64_halves o = { .full = *(_oldp), },			\
+			   n = { .full = (_new), };			\
+	bool ret;							\
+									\
+	asm volatile(ALTERNATIVE(LOCK_PREFIX_HERE			\
+				 "call cmpxchg8b_emu",			\
+				 "lock; cmpxchg8b %[ptr]", X86_FEATURE_CX8) \
+		     CC_SET(e)						\
+		     : CC_OUT(e) (ret),					\
+		       [ptr] "+m" (*(_ptr)),				\
+		       "+a" (o.low), "+d" (o.high)			\
+		     : "b" (n.low), "c" (n.high), "S" (_ptr)		\
+		     : "memory");					\
+									\
+	if (unlikely(!ret))						\
+		*(_oldp) = o.full;					\
+									\
+	likely(ret);							\
+})
+
+static __always_inline bool arch_try_cmpxchg64(volatile u64 *ptr, u64 *oldp, u64 new)
+{
+	return __arch_try_cmpxchg64_emu(ptr, oldp, new);
+}
+#define arch_try_cmpxchg64 arch_try_cmpxchg64
+
 #endif
 
 #define system_has_cmpxchg64()		boot_cpu_has(X86_FEATURE_CX8)

^ permalink raw reply related	[flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-04-09 8:26 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-08 9:13 [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32 Uros Bizjak
2024-04-08 9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak
2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
2024-04-08 9:13 ` [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() Uros Bizjak
2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
2024-04-08 9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak
2024-04-09 7:50 ` Ingo Molnar
2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak