* [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32
@ 2024-04-08 9:13 Uros Bizjak
2024-04-08 9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Uros Bizjak @ 2024-04-08 9:13 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Peter Zijlstra
Following patch series improves arch_cmpxchg64() family
of functions and rewrites arch_cmpxchg64() family in a
similar way as was recently done for arch_cmpxchg128() family.
The improvements builds on recent improvement of emulated
cmpxchg8b_emu() library function and introduce
arch_try_cmpxchg64() also for !CONFIG_X86_CMPXCHG64 targets.
While the patch series enables impressive assembly code
reductions for somehow obsolete !CONFIG_X86_CMPXCHG64 targets,
the true reason for the changes will be evident from the
follow-up patch series, where this series enables unification
of several atomic functions between x86_64 and x86_32 targets.
OTOH, by adopting the same approach and similar structure of
arch_cmpxchg64() macros to arch_cmpxchg128() macros, these
patches lower future maintenace burden and technical debt
of the source code.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Uros Bizjak (3):
locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128()
locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}()
locking/atomic/x86: Introduce arch_try_cmpxchg64() for
!CONFIG_X86_CMPXCHG64
arch/x86/include/asm/cmpxchg_32.h | 207 ++++++++++++++++++------------
arch/x86/include/asm/cmpxchg_64.h | 2 +-
2 files changed, 129 insertions(+), 80 deletions(-)
--
2.44.0
^ permalink raw reply [flat|nested] 8+ messages in thread* [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() 2024-04-08 9:13 [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32 Uros Bizjak @ 2024-04-08 9:13 ` Uros Bizjak 2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak 2024-04-08 9:13 ` [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() Uros Bizjak 2024-04-08 9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak 2 siblings, 1 reply; 8+ messages in thread From: Uros Bizjak @ 2024-04-08 9:13 UTC (permalink / raw) To: x86, linux-kernel Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, Peter Zijlstra Correct the definition of __arch_try_cmpxchg128(), introduced by: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") Fixes: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> --- arch/x86/include/asm/cmpxchg_64.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/include/asm/cmpxchg_64.h b/arch/x86/include/asm/cmpxchg_64.h index 44b08b53ab32..c1d6cd58f809 100644 --- a/arch/x86/include/asm/cmpxchg_64.h +++ b/arch/x86/include/asm/cmpxchg_64.h @@ -62,7 +62,7 @@ static __always_inline u128 arch_cmpxchg128_local(volatile u128 *ptr, u128 old, asm volatile(_lock "cmpxchg16b %[ptr]" \ CC_SET(e) \ : CC_OUT(e) (ret), \ - [ptr] "+m" (*ptr), \ + [ptr] "+m" (*(_ptr)), \ "+a" (o.low), "+d" (o.high) \ : "b" (n.low), "c" (n.high) \ : "memory"); \ -- 2.44.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [tip: locking/core] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() 2024-04-08 9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak @ 2024-04-09 8:26 ` tip-bot2 for Uros Bizjak 0 siblings, 0 replies; 8+ messages in thread From: tip-bot2 for Uros Bizjak @ 2024-04-09 8:26 UTC (permalink / raw) To: linux-tip-commits Cc: Uros Bizjak, Ingo Molnar, Linus Torvalds, H. Peter Anvin, x86, linux-kernel The following commit has been merged into the locking/core branch of tip: Commit-ID: 929ad065ba2967be238dfdc0895b79fda62c7f16 Gitweb: https://git.kernel.org/tip/929ad065ba2967be238dfdc0895b79fda62c7f16 Author: Uros Bizjak <ubizjak@gmail.com> AuthorDate: Mon, 08 Apr 2024 11:13:56 +02:00 Committer: Ingo Molnar <mingo@kernel.org> CommitterDate: Tue, 09 Apr 2024 09:51:02 +02:00 locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Correct the definition of __arch_try_cmpxchg128(), introduced by: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") Fixes: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240408091547.90111-2-ubizjak@gmail.com --- arch/x86/include/asm/cmpxchg_64.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/include/asm/cmpxchg_64.h b/arch/x86/include/asm/cmpxchg_64.h index 44b08b5..c1d6cd5 100644 --- a/arch/x86/include/asm/cmpxchg_64.h +++ b/arch/x86/include/asm/cmpxchg_64.h @@ -62,7 +62,7 @@ static __always_inline u128 arch_cmpxchg128_local(volatile u128 *ptr, u128 old, asm volatile(_lock "cmpxchg16b %[ptr]" \ CC_SET(e) \ : CC_OUT(e) (ret), \ - [ptr] "+m" (*ptr), \ + [ptr] "+m" (*(_ptr)), \ "+a" (o.low), "+d" (o.high) \ : "b" (n.low), "c" (n.high) \ : "memory"); \ ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() 2024-04-08 9:13 [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32 Uros Bizjak 2024-04-08 9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak @ 2024-04-08 9:13 ` Uros Bizjak 2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak 2024-04-08 9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak 2 siblings, 1 reply; 8+ messages in thread From: Uros Bizjak @ 2024-04-08 9:13 UTC (permalink / raw) To: x86, linux-kernel Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, Peter Zijlstra Commit: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") introduced arch_{,try_}_cmpxchg128{,_local}() for x86_64 targets. Modernize existing x86_32 arch_{,try_}_cmpxchg64{,_local}() definitions to follow the same structure as the definitions introduced by the above commit. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> --- arch/x86/include/asm/cmpxchg_32.h | 179 +++++++++++++++++------------- 1 file changed, 100 insertions(+), 79 deletions(-) diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h index b5731c51f0f4..fe40d0681ea8 100644 --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -3,103 +3,124 @@ #define _ASM_X86_CMPXCHG_32_H /* - * Note: if you use set64_bit(), __cmpxchg64(), or their variants, + * Note: if you use __cmpxchg64(), or their variants, * you need to test for the feature in boot_cpu_data. */ -#ifdef CONFIG_X86_CMPXCHG64 -#define arch_cmpxchg64(ptr, o, n) \ - ((__typeof__(*(ptr)))__cmpxchg64((ptr), (unsigned long long)(o), \ - (unsigned long long)(n))) -#define arch_cmpxchg64_local(ptr, o, n) \ - ((__typeof__(*(ptr)))__cmpxchg64_local((ptr), (unsigned long long)(o), \ - (unsigned long long)(n))) -#define arch_try_cmpxchg64(ptr, po, n) \ - __try_cmpxchg64((ptr), (unsigned long long *)(po), \ - (unsigned long long)(n)) -#endif +union __u64_halves { + u64 full; + struct { + u32 low, high; + }; +}; + +#define __arch_cmpxchg64(_ptr, _old, _new, _lock) \ +({ \ + union __u64_halves o = { .full = (_old), }, \ + n = { .full = (_new), }; \ + \ + asm volatile(_lock "cmpxchg8b %[ptr]" \ + : [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high) \ + : "memory"); \ + \ + o.full; \ +}) -static inline u64 __cmpxchg64(volatile u64 *ptr, u64 old, u64 new) + +static __always_inline u64 __cmpxchg64(volatile u64 *ptr, u64 old, u64 new) { - u64 prev; - asm volatile(LOCK_PREFIX "cmpxchg8b %1" - : "=A" (prev), - "+m" (*ptr) - : "b" ((u32)new), - "c" ((u32)(new >> 32)), - "0" (old) - : "memory"); - return prev; + return __arch_cmpxchg64(ptr, old, new, LOCK_PREFIX); } -static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new) +static __always_inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new) { - u64 prev; - asm volatile("cmpxchg8b %1" - : "=A" (prev), - "+m" (*ptr) - : "b" ((u32)new), - "c" ((u32)(new >> 32)), - "0" (old) - : "memory"); - return prev; + return __arch_cmpxchg64(ptr, old, new,); } -static inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *pold, u64 new) +#define __arch_try_cmpxchg64(_ptr, _oldp, _new, _lock) \ +({ \ + union __u64_halves o = { .full = *(_oldp), }, \ + n = { .full = (_new), }; \ + bool ret; \ + \ + asm volatile(_lock "cmpxchg8b %[ptr]" \ + CC_SET(e) \ + : CC_OUT(e) (ret), \ + [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high) \ + : "memory"); \ + \ + if (unlikely(!ret)) \ + *(_oldp) = o.full; \ + \ + likely(ret); \ +}) + +static __always_inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *oldp, u64 new) { - bool success; - u64 old = *pold; - asm volatile(LOCK_PREFIX "cmpxchg8b %[ptr]" - CC_SET(z) - : CC_OUT(z) (success), - [ptr] "+m" (*ptr), - "+A" (old) - : "b" ((u32)new), - "c" ((u32)(new >> 32)) - : "memory"); - - if (unlikely(!success)) - *pold = old; - return success; + return __arch_try_cmpxchg64(ptr, oldp, new, LOCK_PREFIX); } -#ifndef CONFIG_X86_CMPXCHG64 +#ifdef CONFIG_X86_CMPXCHG64 + +#define arch_cmpxchg64 __cmpxchg64 + +#define arch_cmpxchg64_local __cmpxchg64_local + +#define arch_try_cmpxchg64 __try_cmpxchg64 + +#else + /* * Building a kernel capable running on 80386 and 80486. It may be necessary * to simulate the cmpxchg8b on the 80386 and 80486 CPU. */ -#define arch_cmpxchg64(ptr, o, n) \ -({ \ - __typeof__(*(ptr)) __ret; \ - __typeof__(*(ptr)) __old = (o); \ - __typeof__(*(ptr)) __new = (n); \ - alternative_io(LOCK_PREFIX_HERE \ - "call cmpxchg8b_emu", \ - "lock; cmpxchg8b (%%esi)" , \ - X86_FEATURE_CX8, \ - "=A" (__ret), \ - "S" ((ptr)), "0" (__old), \ - "b" ((unsigned int)__new), \ - "c" ((unsigned int)(__new>>32)) \ - : "memory"); \ - __ret; }) - - -#define arch_cmpxchg64_local(ptr, o, n) \ -({ \ - __typeof__(*(ptr)) __ret; \ - __typeof__(*(ptr)) __old = (o); \ - __typeof__(*(ptr)) __new = (n); \ - alternative_io("call cmpxchg8b_emu", \ - "cmpxchg8b (%%esi)" , \ - X86_FEATURE_CX8, \ - "=A" (__ret), \ - "S" ((ptr)), "0" (__old), \ - "b" ((unsigned int)__new), \ - "c" ((unsigned int)(__new>>32)) \ - : "memory"); \ - __ret; }) +#define __arch_cmpxchg64_emu(_ptr, _old, _new) \ +({ \ + union __u64_halves o = { .full = (_old), }, \ + n = { .full = (_new), }; \ + \ + asm volatile(ALTERNATIVE(LOCK_PREFIX_HERE \ + "call cmpxchg8b_emu", \ + "lock; cmpxchg8b %[ptr]", X86_FEATURE_CX8) \ + : [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high), "S" (_ptr) \ + : "memory"); \ + \ + o.full; \ +}) + +static __always_inline u64 arch_cmpxchg64(volatile u64 *ptr, u64 old, u64 new) +{ + return __arch_cmpxchg64_emu(ptr, old, new); +} +#define arch_cmpxchg64 arch_cmpxchg64 + +#define __arch_cmpxchg64_emu_local(_ptr, _old, _new) \ +({ \ + union __u64_halves o = { .full = (_old), }, \ + n = { .full = (_new), }; \ + \ + asm volatile(ALTERNATIVE("call cmpxchg8b_emu", \ + "cmpxchg8b %[ptr]", X86_FEATURE_CX8) \ + : [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high), "S" (_ptr) \ + : "memory"); \ + \ + o.full; \ +}) + +static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new) +{ + return __arch_cmpxchg64_emu_local(ptr, old, new); +} +#define arch_cmpxchg64_local arch_cmpxchg64_local #endif -- 2.44.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [tip: locking/core] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() 2024-04-08 9:13 ` [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() Uros Bizjak @ 2024-04-09 8:26 ` tip-bot2 for Uros Bizjak 0 siblings, 0 replies; 8+ messages in thread From: tip-bot2 for Uros Bizjak @ 2024-04-09 8:26 UTC (permalink / raw) To: linux-tip-commits Cc: Uros Bizjak, Ingo Molnar, Linus Torvalds, H. Peter Anvin, x86, linux-kernel The following commit has been merged into the locking/core branch of tip: Commit-ID: 7016cc5def44b9dcb28089efae4412fa0d6c78c2 Gitweb: https://git.kernel.org/tip/7016cc5def44b9dcb28089efae4412fa0d6c78c2 Author: Uros Bizjak <ubizjak@gmail.com> AuthorDate: Mon, 08 Apr 2024 11:13:57 +02:00 Committer: Ingo Molnar <mingo@kernel.org> CommitterDate: Tue, 09 Apr 2024 09:51:03 +02:00 locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() Commit: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") introduced arch_{,try_}_cmpxchg128{,_local}() for x86_64 targets. Modernize existing x86_32 arch_{,try_}_cmpxchg64{,_local}() definitions to follow the same structure as the definitions introduced by the above commit. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240408091547.90111-3-ubizjak@gmail.com --- arch/x86/include/asm/cmpxchg_32.h | 179 ++++++++++++++++------------- 1 file changed, 100 insertions(+), 79 deletions(-) diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h index b5731c5..fe40d06 100644 --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -3,103 +3,124 @@ #define _ASM_X86_CMPXCHG_32_H /* - * Note: if you use set64_bit(), __cmpxchg64(), or their variants, + * Note: if you use __cmpxchg64(), or their variants, * you need to test for the feature in boot_cpu_data. */ -#ifdef CONFIG_X86_CMPXCHG64 -#define arch_cmpxchg64(ptr, o, n) \ - ((__typeof__(*(ptr)))__cmpxchg64((ptr), (unsigned long long)(o), \ - (unsigned long long)(n))) -#define arch_cmpxchg64_local(ptr, o, n) \ - ((__typeof__(*(ptr)))__cmpxchg64_local((ptr), (unsigned long long)(o), \ - (unsigned long long)(n))) -#define arch_try_cmpxchg64(ptr, po, n) \ - __try_cmpxchg64((ptr), (unsigned long long *)(po), \ - (unsigned long long)(n)) -#endif +union __u64_halves { + u64 full; + struct { + u32 low, high; + }; +}; + +#define __arch_cmpxchg64(_ptr, _old, _new, _lock) \ +({ \ + union __u64_halves o = { .full = (_old), }, \ + n = { .full = (_new), }; \ + \ + asm volatile(_lock "cmpxchg8b %[ptr]" \ + : [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high) \ + : "memory"); \ + \ + o.full; \ +}) -static inline u64 __cmpxchg64(volatile u64 *ptr, u64 old, u64 new) + +static __always_inline u64 __cmpxchg64(volatile u64 *ptr, u64 old, u64 new) { - u64 prev; - asm volatile(LOCK_PREFIX "cmpxchg8b %1" - : "=A" (prev), - "+m" (*ptr) - : "b" ((u32)new), - "c" ((u32)(new >> 32)), - "0" (old) - : "memory"); - return prev; + return __arch_cmpxchg64(ptr, old, new, LOCK_PREFIX); } -static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new) +static __always_inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new) { - u64 prev; - asm volatile("cmpxchg8b %1" - : "=A" (prev), - "+m" (*ptr) - : "b" ((u32)new), - "c" ((u32)(new >> 32)), - "0" (old) - : "memory"); - return prev; + return __arch_cmpxchg64(ptr, old, new,); } -static inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *pold, u64 new) +#define __arch_try_cmpxchg64(_ptr, _oldp, _new, _lock) \ +({ \ + union __u64_halves o = { .full = *(_oldp), }, \ + n = { .full = (_new), }; \ + bool ret; \ + \ + asm volatile(_lock "cmpxchg8b %[ptr]" \ + CC_SET(e) \ + : CC_OUT(e) (ret), \ + [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high) \ + : "memory"); \ + \ + if (unlikely(!ret)) \ + *(_oldp) = o.full; \ + \ + likely(ret); \ +}) + +static __always_inline bool __try_cmpxchg64(volatile u64 *ptr, u64 *oldp, u64 new) { - bool success; - u64 old = *pold; - asm volatile(LOCK_PREFIX "cmpxchg8b %[ptr]" - CC_SET(z) - : CC_OUT(z) (success), - [ptr] "+m" (*ptr), - "+A" (old) - : "b" ((u32)new), - "c" ((u32)(new >> 32)) - : "memory"); - - if (unlikely(!success)) - *pold = old; - return success; + return __arch_try_cmpxchg64(ptr, oldp, new, LOCK_PREFIX); } -#ifndef CONFIG_X86_CMPXCHG64 +#ifdef CONFIG_X86_CMPXCHG64 + +#define arch_cmpxchg64 __cmpxchg64 + +#define arch_cmpxchg64_local __cmpxchg64_local + +#define arch_try_cmpxchg64 __try_cmpxchg64 + +#else + /* * Building a kernel capable running on 80386 and 80486. It may be necessary * to simulate the cmpxchg8b on the 80386 and 80486 CPU. */ -#define arch_cmpxchg64(ptr, o, n) \ -({ \ - __typeof__(*(ptr)) __ret; \ - __typeof__(*(ptr)) __old = (o); \ - __typeof__(*(ptr)) __new = (n); \ - alternative_io(LOCK_PREFIX_HERE \ - "call cmpxchg8b_emu", \ - "lock; cmpxchg8b (%%esi)" , \ - X86_FEATURE_CX8, \ - "=A" (__ret), \ - "S" ((ptr)), "0" (__old), \ - "b" ((unsigned int)__new), \ - "c" ((unsigned int)(__new>>32)) \ - : "memory"); \ - __ret; }) - - -#define arch_cmpxchg64_local(ptr, o, n) \ -({ \ - __typeof__(*(ptr)) __ret; \ - __typeof__(*(ptr)) __old = (o); \ - __typeof__(*(ptr)) __new = (n); \ - alternative_io("call cmpxchg8b_emu", \ - "cmpxchg8b (%%esi)" , \ - X86_FEATURE_CX8, \ - "=A" (__ret), \ - "S" ((ptr)), "0" (__old), \ - "b" ((unsigned int)__new), \ - "c" ((unsigned int)(__new>>32)) \ - : "memory"); \ - __ret; }) +#define __arch_cmpxchg64_emu(_ptr, _old, _new) \ +({ \ + union __u64_halves o = { .full = (_old), }, \ + n = { .full = (_new), }; \ + \ + asm volatile(ALTERNATIVE(LOCK_PREFIX_HERE \ + "call cmpxchg8b_emu", \ + "lock; cmpxchg8b %[ptr]", X86_FEATURE_CX8) \ + : [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high), "S" (_ptr) \ + : "memory"); \ + \ + o.full; \ +}) + +static __always_inline u64 arch_cmpxchg64(volatile u64 *ptr, u64 old, u64 new) +{ + return __arch_cmpxchg64_emu(ptr, old, new); +} +#define arch_cmpxchg64 arch_cmpxchg64 + +#define __arch_cmpxchg64_emu_local(_ptr, _old, _new) \ +({ \ + union __u64_halves o = { .full = (_old), }, \ + n = { .full = (_new), }; \ + \ + asm volatile(ALTERNATIVE("call cmpxchg8b_emu", \ + "cmpxchg8b %[ptr]", X86_FEATURE_CX8) \ + : [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high), "S" (_ptr) \ + : "memory"); \ + \ + o.full; \ +}) + +static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new) +{ + return __arch_cmpxchg64_emu_local(ptr, old, new); +} +#define arch_cmpxchg64_local arch_cmpxchg64_local #endif ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 2024-04-08 9:13 [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32 Uros Bizjak 2024-04-08 9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak 2024-04-08 9:13 ` [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() Uros Bizjak @ 2024-04-08 9:13 ` Uros Bizjak 2024-04-09 7:50 ` Ingo Molnar 2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak 2 siblings, 2 replies; 8+ messages in thread From: Uros Bizjak @ 2024-04-08 9:13 UTC (permalink / raw) To: x86, linux-kernel Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, Peter Zijlstra Commit: 6d12c8d308e68 ("percpu: Wire up cmpxchg128") improved emulated cmpxchg8b_emu() library function to return success/failure in a ZF flag. Define arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 targets to override the generic archy_try_cmpxchg() with an optimized target specific implementation that handles ZF flag. The assembly code at the call sites improves from: bf56d: e8 fc ff ff ff call cmpxchg8b_emu bf572: 8b 74 24 28 mov 0x28(%esp),%esi bf576: 89 c3 mov %eax,%ebx bf578: 89 d1 mov %edx,%ecx bf57a: 8b 7c 24 2c mov 0x2c(%esp),%edi bf57e: 89 f0 mov %esi,%eax bf580: 89 fa mov %edi,%edx bf582: 31 d8 xor %ebx,%eax bf584: 31 ca xor %ecx,%edx bf586: 09 d0 or %edx,%eax bf588: 0f 84 e3 01 00 00 je bf771 <...> to: bf572: e8 fc ff ff ff call cmpxchg8b_emu bf577: 0f 84 b6 01 00 00 je bf733 <...> No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> --- arch/x86/include/asm/cmpxchg_32.h | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h index fe40d0681ea8..9e0d330dd5d0 100644 --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -122,6 +122,34 @@ static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64 } #define arch_cmpxchg64_local arch_cmpxchg64_local +#define __arch_try_cmpxchg64_emu(_ptr, _oldp, _new) \ +({ \ + union __u64_halves o = { .full = *(_oldp), }, \ + n = { .full = (_new), }; \ + bool ret; \ + \ + asm volatile(ALTERNATIVE(LOCK_PREFIX_HERE \ + "call cmpxchg8b_emu", \ + "lock; cmpxchg8b %[ptr]", X86_FEATURE_CX8) \ + CC_SET(e) \ + : CC_OUT(e) (ret), \ + [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high), "S" (_ptr) \ + : "memory"); \ + \ + if (unlikely(!ret)) \ + *(_oldp) = o.full; \ + \ + likely(ret); \ +}) + +static __always_inline bool arch_try_cmpxchg64(volatile u64 *ptr, u64 *oldp, u64 new) +{ + return __arch_try_cmpxchg64_emu(ptr, oldp, new); +} +#define arch_try_cmpxchg64 arch_try_cmpxchg64 + #endif #define system_has_cmpxchg64() boot_cpu_has(X86_FEATURE_CX8) -- 2.44.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 2024-04-08 9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak @ 2024-04-09 7:50 ` Ingo Molnar 2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak 1 sibling, 0 replies; 8+ messages in thread From: Ingo Molnar @ 2024-04-09 7:50 UTC (permalink / raw) To: Uros Bizjak Cc: x86, linux-kernel, Thomas Gleixner, Borislav Petkov, Dave Hansen, H. Peter Anvin, Peter Zijlstra * Uros Bizjak <ubizjak@gmail.com> wrote: > Commit: > > 6d12c8d308e68 ("percpu: Wire up cmpxchg128") > > improved emulated cmpxchg8b_emu() library function to return > success/failure in a ZF flag. > > Define arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 targets > to override the generic archy_try_cmpxchg() with an optimized > target specific implementation that handles ZF flag. > > The assembly code at the call sites improves from: > > bf56d: e8 fc ff ff ff call cmpxchg8b_emu > bf572: 8b 74 24 28 mov 0x28(%esp),%esi > bf576: 89 c3 mov %eax,%ebx > bf578: 89 d1 mov %edx,%ecx > bf57a: 8b 7c 24 2c mov 0x2c(%esp),%edi > bf57e: 89 f0 mov %esi,%eax > bf580: 89 fa mov %edi,%edx > bf582: 31 d8 xor %ebx,%eax > bf584: 31 ca xor %ecx,%edx > bf586: 09 d0 or %edx,%eax > bf588: 0f 84 e3 01 00 00 je bf771 <...> > > to: > > bf572: e8 fc ff ff ff call cmpxchg8b_emu > bf577: 0f 84 b6 01 00 00 je bf733 <...> > > No functional changes intended. Side note: while there's no hard-written rule for it, I tend to use the 'no functional changes intended' line for pure identity transformations - which this one isn't, as it changes code generation materially. So I removed that line - the explanation of the patch is clear enough IMO. Thanks, Ingo ^ permalink raw reply [flat|nested] 8+ messages in thread
* [tip: locking/core] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 2024-04-08 9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak 2024-04-09 7:50 ` Ingo Molnar @ 2024-04-09 8:26 ` tip-bot2 for Uros Bizjak 1 sibling, 0 replies; 8+ messages in thread From: tip-bot2 for Uros Bizjak @ 2024-04-09 8:26 UTC (permalink / raw) To: linux-tip-commits Cc: Uros Bizjak, Ingo Molnar, Linus Torvalds, H. Peter Anvin, x86, linux-kernel The following commit has been merged into the locking/core branch of tip: Commit-ID: aef95dac9ce4f271cc43195ffc175114ed934cbe Gitweb: https://git.kernel.org/tip/aef95dac9ce4f271cc43195ffc175114ed934cbe Author: Uros Bizjak <ubizjak@gmail.com> AuthorDate: Mon, 08 Apr 2024 11:13:58 +02:00 Committer: Ingo Molnar <mingo@kernel.org> CommitterDate: Tue, 09 Apr 2024 09:51:03 +02:00 locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Commit: 6d12c8d308e68 ("percpu: Wire up cmpxchg128") improved emulated cmpxchg8b_emu() library function to return success/failure in a ZF flag. Define arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 targets to override the generic archy_try_cmpxchg() with an optimized target specific implementation that handles ZF flag. The assembly code at the call sites improves from: bf56d: e8 fc ff ff ff call cmpxchg8b_emu bf572: 8b 74 24 28 mov 0x28(%esp),%esi bf576: 89 c3 mov %eax,%ebx bf578: 89 d1 mov %edx,%ecx bf57a: 8b 7c 24 2c mov 0x2c(%esp),%edi bf57e: 89 f0 mov %esi,%eax bf580: 89 fa mov %edi,%edx bf582: 31 d8 xor %ebx,%eax bf584: 31 ca xor %ecx,%edx bf586: 09 d0 or %edx,%eax bf588: 0f 84 e3 01 00 00 je bf771 <...> to: bf572: e8 fc ff ff ff call cmpxchg8b_emu bf577: 0f 84 b6 01 00 00 je bf733 <...> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240408091547.90111-4-ubizjak@gmail.com --- arch/x86/include/asm/cmpxchg_32.h | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h index fe40d06..9e0d330 100644 --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -122,6 +122,34 @@ static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64 } #define arch_cmpxchg64_local arch_cmpxchg64_local +#define __arch_try_cmpxchg64_emu(_ptr, _oldp, _new) \ +({ \ + union __u64_halves o = { .full = *(_oldp), }, \ + n = { .full = (_new), }; \ + bool ret; \ + \ + asm volatile(ALTERNATIVE(LOCK_PREFIX_HERE \ + "call cmpxchg8b_emu", \ + "lock; cmpxchg8b %[ptr]", X86_FEATURE_CX8) \ + CC_SET(e) \ + : CC_OUT(e) (ret), \ + [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high), "S" (_ptr) \ + : "memory"); \ + \ + if (unlikely(!ret)) \ + *(_oldp) = o.full; \ + \ + likely(ret); \ +}) + +static __always_inline bool arch_try_cmpxchg64(volatile u64 *ptr, u64 *oldp, u64 new) +{ + return __arch_try_cmpxchg64_emu(ptr, oldp, new); +} +#define arch_try_cmpxchg64 arch_try_cmpxchg64 + #endif #define system_has_cmpxchg64() boot_cpu_has(X86_FEATURE_CX8) ^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-04-09 8:26 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-08 9:13 [PATCH 0/3] locking/atomic/x86: Improve arch_cmpxchg64() and friends for x86_32 Uros Bizjak
2024-04-08 9:13 ` [PATCH 1/3] locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() Uros Bizjak
2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
2024-04-08 9:13 ` [PATCH 2/3] locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() Uros Bizjak
2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
2024-04-08 9:13 ` [PATCH 3/3] locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 Uros Bizjak
2024-04-09 7:50 ` Ingo Molnar
2024-04-09 8:26 ` [tip: locking/core] " tip-bot2 for Uros Bizjak
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.