From: Namhyung Kim <namhyung@kernel.org>
To: Yuzhuo Jing <yuzhuo@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
Adrian Hunter <adrian.hunter@intel.com>,
Liang Kan <kan.liang@linux.intel.com>,
Yuzhuo Jing <yzj@umich.edu>,
Andrea Parri <parri.andrea@gmail.com>,
Palmer Dabbelt <palmer@rivosinc.com>,
Charlie Jenkins <charlie@rivosinc.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Barret Rhoden <brho@google.com>,
Alexandre Ghiti <alexghiti@rivosinc.com>,
Guo Ren <guoren@kernel.org>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
Subject: Re: [PATCH v1 1/7] tools: Import cmpxchg and xchg functions
Date: Wed, 30 Jul 2025 21:52:34 -0700
Message-ID: <aIr2krdN4UXOniJ7@google.com>
In-Reply-To: <20250729022640.3134066-2-yuzhuo@google.com>
On Mon, Jul 28, 2025 at 07:26:34PM -0700, Yuzhuo Jing wrote:
> Import necessary atomic functions used by qspinlock. Copied x86
> implementation verbatim, and used compiler builtin for generic
> implementation.
Why x86 only? Can we just use the generic version always?
Thanks,
Namhyung
>
> Signed-off-by: Yuzhuo Jing <yuzhuo@google.com>
> ---
> tools/arch/x86/include/asm/atomic.h | 14 +++
> tools/arch/x86/include/asm/cmpxchg.h | 113 +++++++++++++++++++++++++
> tools/include/asm-generic/atomic-gcc.h | 47 ++++++++++
> tools/include/linux/atomic.h | 24 ++++++
> tools/include/linux/compiler_types.h | 24 ++++++
> 5 files changed, 222 insertions(+)
>
> diff --git a/tools/arch/x86/include/asm/atomic.h b/tools/arch/x86/include/asm/atomic.h
> index 365cf182df12..a55ffd4eb5f1 100644
> --- a/tools/arch/x86/include/asm/atomic.h
> +++ b/tools/arch/x86/include/asm/atomic.h
> @@ -71,6 +71,20 @@ static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
> return cmpxchg(&v->counter, old, new);
> }
>
> +static __always_inline bool atomic_try_cmpxchg(atomic_t *v, int *old, int new)
> +{
> + return try_cmpxchg(&v->counter, old, new);
> +}
> +
> +static __always_inline int atomic_fetch_or(int i, atomic_t *v)
> +{
> + int val = atomic_read(v);
> +
> + do { } while (!atomic_try_cmpxchg(v, &val, val | i));
> +
> + return val;
> +}
> +
> static inline int test_and_set_bit(long nr, unsigned long *addr)
> {
> GEN_BINARY_RMWcc(LOCK_PREFIX __ASM_SIZE(bts), *addr, "Ir", nr, "%0", "c");
> diff --git a/tools/arch/x86/include/asm/cmpxchg.h b/tools/arch/x86/include/asm/cmpxchg.h
> index 0ed9ca2766ad..5372da8b27fc 100644
> --- a/tools/arch/x86/include/asm/cmpxchg.h
> +++ b/tools/arch/x86/include/asm/cmpxchg.h
> @@ -8,6 +8,8 @@
> * Non-existant functions to indicate usage errors at link time
> * (or compile-time if the compiler implements __compiletime_error().
> */
> +extern void __xchg_wrong_size(void)
> + __compiletime_error("Bad argument size for xchg");
> extern void __cmpxchg_wrong_size(void)
> __compiletime_error("Bad argument size for cmpxchg");
>
> @@ -27,6 +29,49 @@ extern void __cmpxchg_wrong_size(void)
> #define __X86_CASE_Q -1 /* sizeof will never return -1 */
> #endif
>
> +/*
> + * An exchange-type operation, which takes a value and a pointer, and
> + * returns the old value.
> + */
> +#define __xchg_op(ptr, arg, op, lock) \
> + ({ \
> + __typeof__ (*(ptr)) __ret = (arg); \
> + switch (sizeof(*(ptr))) { \
> + case __X86_CASE_B: \
> + asm_inline volatile (lock #op "b %b0, %1" \
> + : "+q" (__ret), "+m" (*(ptr)) \
> + : : "memory", "cc"); \
> + break; \
> + case __X86_CASE_W: \
> + asm_inline volatile (lock #op "w %w0, %1" \
> + : "+r" (__ret), "+m" (*(ptr)) \
> + : : "memory", "cc"); \
> + break; \
> + case __X86_CASE_L: \
> + asm_inline volatile (lock #op "l %0, %1" \
> + : "+r" (__ret), "+m" (*(ptr)) \
> + : : "memory", "cc"); \
> + break; \
> + case __X86_CASE_Q: \
> + asm_inline volatile (lock #op "q %q0, %1" \
> + : "+r" (__ret), "+m" (*(ptr)) \
> + : : "memory", "cc"); \
> + break; \
> + default: \
> + __ ## op ## _wrong_size(); \
> + __cmpxchg_wrong_size(); \
> + } \
> + __ret; \
> + })
> +
> +/*
> + * Note: no "lock" prefix even on SMP: xchg always implies lock anyway.
> + * Since this is generally used to protect other memory information, we
> + * use "asm volatile" and "memory" clobbers to prevent gcc from moving
> + * information around.
> + */
> +#define xchg(ptr, v) __xchg_op((ptr), (v), xchg, "")
> +
> /*
> * Atomic compare and exchange. Compare OLD with MEM, if identical,
> * store NEW in MEM. Return the initial value in MEM. Success is
> @@ -86,5 +131,73 @@ extern void __cmpxchg_wrong_size(void)
> #define cmpxchg(ptr, old, new) \
> __cmpxchg(ptr, old, new, sizeof(*(ptr)))
>
> +#define __raw_try_cmpxchg(_ptr, _pold, _new, size, lock) \
> +({ \
> + bool success; \
> + __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \
> + __typeof__(*(_ptr)) __old = *_old; \
> + __typeof__(*(_ptr)) __new = (_new); \
> + switch (size) { \
> + case __X86_CASE_B: \
> + { \
> + volatile u8 *__ptr = (volatile u8 *)(_ptr); \
> + asm_inline volatile(lock "cmpxchgb %[new], %[ptr]" \
> + CC_SET(z) \
> + : CC_OUT(z) (success), \
> + [ptr] "+m" (*__ptr), \
> + [old] "+a" (__old) \
> + : [new] "q" (__new) \
> + : "memory"); \
> + break; \
> + } \
> + case __X86_CASE_W: \
> + { \
> + volatile u16 *__ptr = (volatile u16 *)(_ptr); \
> + asm_inline volatile(lock "cmpxchgw %[new], %[ptr]" \
> + CC_SET(z) \
> + : CC_OUT(z) (success), \
> + [ptr] "+m" (*__ptr), \
> + [old] "+a" (__old) \
> + : [new] "r" (__new) \
> + : "memory"); \
> + break; \
> + } \
> + case __X86_CASE_L: \
> + { \
> + volatile u32 *__ptr = (volatile u32 *)(_ptr); \
> + asm_inline volatile(lock "cmpxchgl %[new], %[ptr]" \
> + CC_SET(z) \
> + : CC_OUT(z) (success), \
> + [ptr] "+m" (*__ptr), \
> + [old] "+a" (__old) \
> + : [new] "r" (__new) \
> + : "memory"); \
> + break; \
> + } \
> + case __X86_CASE_Q: \
> + { \
> + volatile u64 *__ptr = (volatile u64 *)(_ptr); \
> + asm_inline volatile(lock "cmpxchgq %[new], %[ptr]" \
> + CC_SET(z) \
> + : CC_OUT(z) (success), \
> + [ptr] "+m" (*__ptr), \
> + [old] "+a" (__old) \
> + : [new] "r" (__new) \
> + : "memory"); \
> + break; \
> + } \
> + default: \
> + __cmpxchg_wrong_size(); \
> + } \
> + if (unlikely(!success)) \
> + *_old = __old; \
> + likely(success); \
> +})
> +
> +#define __try_cmpxchg(ptr, pold, new, size) \
> + __raw_try_cmpxchg((ptr), (pold), (new), (size), LOCK_PREFIX)
> +
> +#define try_cmpxchg(ptr, pold, new) \
> + __try_cmpxchg((ptr), (pold), (new), sizeof(*(ptr)))
>
> #endif /* TOOLS_ASM_X86_CMPXCHG_H */
> diff --git a/tools/include/asm-generic/atomic-gcc.h b/tools/include/asm-generic/atomic-gcc.h
> index 9b3c528bab92..08b7b3b36873 100644
> --- a/tools/include/asm-generic/atomic-gcc.h
> +++ b/tools/include/asm-generic/atomic-gcc.h
> @@ -62,6 +62,12 @@ static inline int atomic_dec_and_test(atomic_t *v)
> return __sync_sub_and_fetch(&v->counter, 1) == 0;
> }
>
> +#define xchg(ptr, v) \
> + __atomic_exchange_n(ptr, v, __ATOMIC_SEQ_CST)
> +
> +#define xchg_relaxed(ptr, v) \
> + __atomic_exchange_n(ptr, v, __ATOMIC_RELAXED)
> +
> #define cmpxchg(ptr, oldval, newval) \
> __sync_val_compare_and_swap(ptr, oldval, newval)
>
> @@ -70,6 +76,47 @@ static inline int atomic_cmpxchg(atomic_t *v, int oldval, int newval)
> return cmpxchg(&(v)->counter, oldval, newval);
> }
>
> +/**
> + * atomic_try_cmpxchg() - atomic compare and exchange with full ordering
> + * @v: pointer to atomic_t
> + * @old: pointer to int value to compare with
> + * @new: int value to assign
> + *
> + * If (@v == @old), atomically updates @v to @new with full ordering.
> + * Otherwise, @v is not modified, @old is updated to the current value of @v,
> + * and relaxed ordering is provided.
> + *
> + * Unsafe to use in noinstr code; use raw_atomic_try_cmpxchg() there.
> + *
> + * Return: @true if the exchange occurred, @false otherwise.
> + */
> +static __always_inline bool
> +atomic_try_cmpxchg(atomic_t *v, int *old, int new)
> +{
> + int r, o = *old;
> + r = atomic_cmpxchg(v, o, new);
> + if (unlikely(r != o))
> + *old = r;
> + return likely(r == o);
> +}
> +
> +/**
> + * atomic_fetch_or() - atomic bitwise OR with full ordering
> + * @i: int value
> + * @v: pointer to atomic_t
> + *
> + * Atomically updates @v to (@v | @i) with full ordering.
> + *
> + * Unsafe to use in noinstr code; use raw_atomic_fetch_or() there.
> + *
> + * Return: The original value of @v.
> + */
> +static __always_inline int
> +atomic_fetch_or(int i, atomic_t *v)
> +{
> + return __sync_fetch_and_or(&v->counter, i);
> +}
> +
> static inline int test_and_set_bit(long nr, unsigned long *addr)
> {
> unsigned long mask = BIT_MASK(nr);
> diff --git a/tools/include/linux/atomic.h b/tools/include/linux/atomic.h
> index 01907b33537e..332a34177995 100644
> --- a/tools/include/linux/atomic.h
> +++ b/tools/include/linux/atomic.h
> @@ -12,4 +12,28 @@ void atomic_long_set(atomic_long_t *v, long i);
> #define atomic_cmpxchg_release atomic_cmpxchg
> #endif /* atomic_cmpxchg_relaxed */
>
> +#ifndef atomic_cmpxchg_acquire
> +#define atomic_cmpxchg_acquire atomic_cmpxchg
> +#endif
> +
> +#ifndef atomic_try_cmpxchg_acquire
> +#define atomic_try_cmpxchg_acquire atomic_try_cmpxchg
> +#endif
> +
> +#ifndef atomic_try_cmpxchg_relaxed
> +#define atomic_try_cmpxchg_relaxed atomic_try_cmpxchg
> +#endif
> +
> +#ifndef atomic_fetch_or_acquire
> +#define atomic_fetch_or_acquire atomic_fetch_or
> +#endif
> +
> +#ifndef xchg_relaxed
> +#define xchg_relaxed xchg
> +#endif
> +
> +#ifndef cmpxchg_release
> +#define cmpxchg_release cmpxchg
> +#endif
> +
> #endif /* __TOOLS_LINUX_ATOMIC_H */
> diff --git a/tools/include/linux/compiler_types.h b/tools/include/linux/compiler_types.h
> index d09f9dc172a4..9a2a2f8d7b6c 100644
> --- a/tools/include/linux/compiler_types.h
> +++ b/tools/include/linux/compiler_types.h
> @@ -31,6 +31,28 @@
> # define __cond_lock(x,c) (c)
> #endif /* __CHECKER__ */
>
> +/*
> + * __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
> + * non-scalar types unchanged.
> + */
> +/*
> + * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char'
> + * is not type-compatible with 'signed char', and we define a separate case.
> + */
> +#define __scalar_type_to_expr_cases(type) \
> + unsigned type: (unsigned type)0, \
> + signed type: (signed type)0
> +
> +#define __unqual_scalar_typeof(x) typeof( \
> + _Generic((x), \
> + char: (char)0, \
> + __scalar_type_to_expr_cases(char), \
> + __scalar_type_to_expr_cases(short), \
> + __scalar_type_to_expr_cases(int), \
> + __scalar_type_to_expr_cases(long), \
> + __scalar_type_to_expr_cases(long long), \
> + default: (x)))
> +
> /* Compiler specific macros. */
> #ifdef __GNUC__
> #include <linux/compiler-gcc.h>
> @@ -40,4 +62,6 @@
> #define asm_goto_output(x...) asm goto(x)
> #endif
>
> +#define asm_inline asm
> +
> #endif /* __LINUX_COMPILER_TYPES_H */
> --
> 2.50.1.487.gc89ff58d15-goog
>
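For reference on the "generic version always" question, here is a minimal sketch of how the three operations this patch imports could be written with compiler builtins alone. It is not code from the patch: the patch's generic header uses __sync_val_compare_and_swap() and __sync_fetch_and_or(), while this sketch swaps in the GCC/Clang __atomic builtins. Every sketch_* name is hypothetical and chosen to avoid clashing with the real tools/ definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the tools/ atomic_t (assumption, not the real type). */
typedef struct { int counter; } sketch_atomic_t;

/*
 * try_cmpxchg semantics: returns true if *v was equal to *old and has been
 * replaced by new_val. On failure the builtin stores the value it observed
 * into *old, so the caller does not need to reload it.
 */
static inline bool sketch_try_cmpxchg(sketch_atomic_t *v, int *old, int new_val)
{
	return __atomic_compare_exchange_n(&v->counter, old, new_val,
					   false /* strong CAS */,
					   __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
}

/*
 * fetch_or built as a try_cmpxchg loop, mirroring the shape of the x86
 * version in the patch. Returns the value seen before the OR.
 */
static inline int sketch_fetch_or(int i, sketch_atomic_t *v)
{
	int val = __atomic_load_n(&v->counter, __ATOMIC_RELAXED);

	/* val is refreshed by each failed attempt, so val | i is recomputed. */
	while (!sketch_try_cmpxchg(v, &val, val | i))
		;

	return val;
}

/* Unconditional exchange with full ordering; returns the previous value. */
static inline int sketch_xchg(sketch_atomic_t *v, int new_val)
{
	return __atomic_exchange_n(&v->counter, new_val, __ATOMIC_SEQ_CST);
}

/* Single-threaded check of the return-value contracts described above. */
static void sketch_selftest(void)
{
	sketch_atomic_t v = { .counter = 0x1 };
	int expected = 0x2;

	/* Mismatch: the CAS fails and expected is updated to the current value. */
	assert(!sketch_try_cmpxchg(&v, &expected, 0x4));
	assert(expected == 0x1);

	/* Retry with the refreshed expected value: the CAS succeeds. */
	assert(sketch_try_cmpxchg(&v, &expected, 0x4));
	assert(v.counter == 0x4);

	assert(sketch_fetch_or(0x8, &v) == 0x4);
	assert(v.counter == 0xc);

	assert(sketch_xchg(&v, 0) == 0xc);
	assert(v.counter == 0);
}
```

Whether this generates code as good as the hand-written x86 asm for the qspinlock benchmark is a separate question that would need measuring; the sketch only shows that the semantics can be expressed portably.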
Thread overview: 21+ messages
2025-07-29 2:26 [PATCH v1 0/7] perf bench: Add qspinlock benchmark Yuzhuo Jing
2025-07-29 2:26 ` [PATCH v1 1/7] tools: Import cmpxchg and xchg functions Yuzhuo Jing
2025-07-31 4:52 ` Namhyung Kim [this message]
2025-08-08 6:11 ` kernel test robot
2025-07-29 2:26 ` [PATCH v1 2/7] tools: Import smp_cond_load and atomic_cond_read Yuzhuo Jing
2025-07-29 2:26 ` [PATCH v1 3/7] tools: Partial import of prefetch.h Yuzhuo Jing
2025-07-31 4:54 ` Namhyung Kim
2025-07-29 2:26 ` [PATCH v1 4/7] tools: Implement userspace per-cpu Yuzhuo Jing
2025-07-31 5:07 ` Namhyung Kim
2025-07-29 2:26 ` [PATCH v1 5/7] perf bench: Import qspinlock from kernel Yuzhuo Jing
2025-07-29 2:26 ` [PATCH v1 6/7] perf bench: Add 'bench sync qspinlock' subcommand Yuzhuo Jing
2025-07-31 5:16 ` Namhyung Kim
2025-07-31 13:19 ` Yuzhuo Jing
2025-07-29 2:26 ` [PATCH v1 7/7] perf bench sync: Add latency histogram functionality Yuzhuo Jing
2025-07-31 5:18 ` Namhyung Kim
2025-07-31 5:24 ` Namhyung Kim
2025-07-31 4:51 ` [PATCH v1 0/7] perf bench: Add qspinlock benchmark Namhyung Kim
2025-08-04 14:28 ` Mark Rutland
2025-09-16 14:18 ` Peter Zijlstra
2025-09-16 17:00 ` Ian Rogers
2025-09-16 20:38 ` Peter Zijlstra