* [PATCH 01/20] x86/cpu: Use named asm operands in prefetch[w]()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-14 21:41 ` [PATCH 02/20] x86/apic: Use named asm operands in native_apic_mem_write() Josh Poimboeuf
` (19 subsequent siblings)
20 siblings, 0 replies; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use named operands in preparation for removing the operand numbering
restrictions in alternative_input().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/processor.h | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 5d2f7e5aff26..2e9566134949 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -613,7 +613,7 @@ extern char ignore_fpu_irq;
# define BASE_PREFETCH ""
# define ARCH_HAS_PREFETCH
#else
-# define BASE_PREFETCH "prefetcht0 %1"
+# define BASE_PREFETCH "prefetcht0 %[val]"
#endif
/*
@@ -624,9 +624,9 @@ extern char ignore_fpu_irq;
*/
static inline void prefetch(const void *x)
{
- alternative_input(BASE_PREFETCH, "prefetchnta %1",
- X86_FEATURE_XMM,
- "m" (*(const char *)x));
+ alternative_input(BASE_PREFETCH,
+ "prefetchnta %[val]", X86_FEATURE_XMM,
+ [val] "m" (*(const char *)x));
}
/*
@@ -636,9 +636,9 @@ static inline void prefetch(const void *x)
*/
static __always_inline void prefetchw(const void *x)
{
- alternative_input(BASE_PREFETCH, "prefetchw %1",
- X86_FEATURE_3DNOWPREFETCH,
- "m" (*(const char *)x));
+ alternative_input(BASE_PREFETCH,
+ "prefetchw %[val]", X86_FEATURE_3DNOWPREFETCH,
+ [val] "m" (*(const char *)x));
}
#define TOP_OF_INIT_STACK ((unsigned long)&init_stack + sizeof(init_stack) - \
--
2.48.1
* [PATCH 02/20] x86/apic: Use named asm operands in native_apic_mem_write()
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use named operands in preparation for removing the operand numbering
restrictions in alternative_io().
Note this includes removing the "0" input constraint and converting its
corresponding output constraint from "=r" to "+r". While at it, do the
same for the memory constraint.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/apic.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index c903d358405d..ecf1b229f09b 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -98,9 +98,9 @@ static inline void native_apic_mem_write(u32 reg, u32 v)
{
volatile u32 *addr = (volatile u32 *)(APIC_BASE + reg);
- alternative_io("movl %0, %1", "xchgl %0, %1", X86_BUG_11AP,
- ASM_OUTPUT("=r" (v), "=m" (*addr)),
- ASM_INPUT("0" (v), "m" (*addr)));
+ alternative_io("movl %[val], %[mem]",
+ "xchgl %[val], %[mem]", X86_BUG_11AP,
+ ASM_OUTPUT([val] "+r" (v), [mem] "+m" (*addr)));
}
static inline u32 native_apic_mem_read(u32 reg)
--
2.48.1
* [PATCH 03/20] x86/mm: Use named asm operands in task_size_max()
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use named operands in preparation for removing the operand numbering
restrictions in alternative_io().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/page_64.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index b5279f5d5601..db3003acd41e 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -85,9 +85,9 @@ static __always_inline unsigned long task_size_max(void)
{
unsigned long ret;
- alternative_io("movq %[small],%0","movq %[large],%0",
- X86_FEATURE_LA57,
- "=r" (ret),
+ alternative_io("movq %[small], %[ret]",
+ "movq %[large], %[ret]", X86_FEATURE_LA57,
+ [ret] "=r" (ret),
[small] "i" ((1ul << 47)-PAGE_SIZE),
[large] "i" ((1ul << 56)-PAGE_SIZE));
--
2.48.1
* [PATCH 04/20] x86/cpu: Use named asm operands in clflushopt()
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use named operands in preparation for removing the operand numbering
restrictions in alternative_io().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/special_insns.h | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 21ce480658b1..b905076cf7f6 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -176,10 +176,9 @@ static __always_inline void clflush(volatile void *__p)
static inline void clflushopt(volatile void *__p)
{
- alternative_io(".byte 0x3e; clflush %0",
- ".byte 0x66; clflush %0",
- X86_FEATURE_CLFLUSHOPT,
- "+m" (*(volatile char __force *)__p));
+ alternative_io(".byte 0x3e; clflush %[val]",
+ ".byte 0x66; clflush %[val]", X86_FEATURE_CLFLUSHOPT,
+ [val] "+m" (*(volatile char __force *)__p));
}
static inline void clwb(volatile void *__p)
--
2.48.1
* Re: [PATCH 04/20] x86/cpu: Use named asm operands in clflushopt()
From: Linus Torvalds @ 2025-03-14 23:46 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
On Fri, 14 Mar 2025 at 11:42, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> + alternative_io(".byte 0x3e; clflush %[val]",
> + ".byte 0x66; clflush %[val]", X86_FEATURE_CLFLUSHOPT,
> + [val] "+m" (*(volatile char __force *)__p));
Hmm. I think we could just use 'clflushopt', it looks like it exists
in binutils-2.25, which is our minimal version requirement.
But maybe that's a separate cleanup.
Linus
* Re: [PATCH 04/20] x86/cpu: Use named asm operands in clflushopt()
From: Josh Poimboeuf @ 2025-03-15 0:07 UTC (permalink / raw)
To: Linus Torvalds
Cc: x86, linux-kernel, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 01:46:00PM -1000, Linus Torvalds wrote:
> On Fri, 14 Mar 2025 at 11:42, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> >
> > + alternative_io(".byte 0x3e; clflush %[val]",
> > + ".byte 0x66; clflush %[val]", X86_FEATURE_CLFLUSHOPT,
> > + [val] "+m" (*(volatile char __force *)__p));
>
> Hmm. I think we could just use 'clflushopt', it looks like it exists
> in binutils-2.25, which is our minimal version requirement.
>
> But maybe that's a separate cleanup.
You appear to be correct, I'll add a patch for that.
--
Josh
* Re: [PATCH 04/20] x86/cpu: Use named asm operands in clflushopt()
From: Ingo Molnar @ 2025-03-15 8:42 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Linus Torvalds, x86, linux-kernel, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Uros Bizjak, Andrew Cooper
* Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> On Fri, Mar 14, 2025 at 01:46:00PM -1000, Linus Torvalds wrote:
> > On Fri, 14 Mar 2025 at 11:42, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> > >
> > > + alternative_io(".byte 0x3e; clflush %[val]",
> > > + ".byte 0x66; clflush %[val]", X86_FEATURE_CLFLUSHOPT,
> > > + [val] "+m" (*(volatile char __force *)__p));
> >
> > Hmm. I think we could just use 'clflushopt', it looks like it exists
> > in binutils-2.25, which is our minimal version requirement.
> >
> > But maybe that's a separate cleanup.
>
> You appear to be correct, I'll add a patch for that.
Please base your series on tip:master or tip:x86/asm, we already
cleaned this up recently:
cc2e9e09d1a3 ("x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h>")
Thanks,
Ingo
* Re: [PATCH 04/20] x86/cpu: Use named asm operands in clflushopt()
From: Ingo Molnar @ 2025-03-15 9:25 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Linus Torvalds, x86, linux-kernel, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Uros Bizjak, Andrew Cooper
* Ingo Molnar <mingo@kernel.org> wrote:
>
> * Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> > On Fri, Mar 14, 2025 at 01:46:00PM -1000, Linus Torvalds wrote:
> > > On Fri, 14 Mar 2025 at 11:42, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> > > >
> > > > + alternative_io(".byte 0x3e; clflush %[val]",
> > > > + ".byte 0x66; clflush %[val]", X86_FEATURE_CLFLUSHOPT,
> > > > + [val] "+m" (*(volatile char __force *)__p));
> > >
> > > Hmm. I think we could just use 'clflushopt', it looks like it exists
> > > in binutils-2.25, which is our minimal version requirement.
> > >
> > > But maybe that's a separate cleanup.
> >
> > You appear to be correct, I'll add a patch for that.
>
> Please base your series on tip:master or tip:x86/asm, we already
> cleaned this up recently:
>
> cc2e9e09d1a3 ("x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h>")
Also, as a general principle I'd prefer to have .byte conversion to
mnemonics and named-operands conversions to be separate patches - not
all named-operands conversions are an improvement, especially very
short and simple asm() templates are just fine using numeric operands.
Thanks,
Ingo
* [PATCH 05/20] x86/asm: Always use flag output operands
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
On x86, __GCC_ASM_FLAG_OUTPUTS__ is supported starting with GCC 6.0 and
Clang 9.
Now that the GCC minimum version on x86 has been bumped to 8.1 with the
following commit:
commit a3e8fe814ad1 ("x86/build: Raise the minimum GCC version to 8.1")
the flag output operand support can be assumed everywhere.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/asm.h | 9 ++-------
arch/x86/include/asm/rmwcc.h | 22 ----------------------
tools/arch/x86/include/asm/asm.h | 5 -----
tools/perf/bench/find-bit-bench.c | 4 ----
4 files changed, 2 insertions(+), 38 deletions(-)
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 975ae7a9397e..fdebd4356860 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -131,13 +131,8 @@ static __always_inline __pure void *rip_rel_ptr(void *p)
* Macros to generate condition code outputs from inline assembly,
* The output operand must be type "bool".
*/
-#ifdef __GCC_ASM_FLAG_OUTPUTS__
-# define CC_SET(c) "\n\t/* output condition code " #c "*/\n"
-# define CC_OUT(c) "=@cc" #c
-#else
-# define CC_SET(c) "\n\tset" #c " %[_cc_" #c "]\n"
-# define CC_OUT(c) [_cc_ ## c] "=qm"
-#endif
+#define CC_SET(c) "\n\t/* output condition code " #c "*/\n"
+#define CC_OUT(c) "=@cc" #c
#ifdef __KERNEL__
diff --git a/arch/x86/include/asm/rmwcc.h b/arch/x86/include/asm/rmwcc.h
index 363266cbcada..a54303e3dfa1 100644
--- a/arch/x86/include/asm/rmwcc.h
+++ b/arch/x86/include/asm/rmwcc.h
@@ -6,26 +6,6 @@
#define __CLOBBERS_MEM(clb...) "memory", ## clb
-#ifndef __GCC_ASM_FLAG_OUTPUTS__
-
-/* Use asm goto */
-
-#define __GEN_RMWcc(fullop, _var, cc, clobbers, ...) \
-({ \
- bool c = false; \
- asm goto (fullop "; j" #cc " %l[cc_label]" \
- : : [var] "m" (_var), ## __VA_ARGS__ \
- : clobbers : cc_label); \
- if (0) { \
-cc_label: c = true; \
- } \
- c; \
-})
-
-#else /* defined(__GCC_ASM_FLAG_OUTPUTS__) */
-
-/* Use flags output or a set instruction */
-
#define __GEN_RMWcc(fullop, _var, cc, clobbers, ...) \
({ \
bool c; \
@@ -35,8 +15,6 @@ cc_label: c = true; \
c; \
})
-#endif /* defined(__GCC_ASM_FLAG_OUTPUTS__) */
-
#define GEN_UNARY_RMWcc_4(op, var, cc, arg0) \
__GEN_RMWcc(op " " arg0, var, cc, __CLOBBERS_MEM())
diff --git a/tools/arch/x86/include/asm/asm.h b/tools/arch/x86/include/asm/asm.h
index 3ad3da9a7d97..f66cf34f6197 100644
--- a/tools/arch/x86/include/asm/asm.h
+++ b/tools/arch/x86/include/asm/asm.h
@@ -112,13 +112,8 @@
* Macros to generate condition code outputs from inline assembly,
* The output operand must be type "bool".
*/
-#ifdef __GCC_ASM_FLAG_OUTPUTS__
# define CC_SET(c) "\n\t/* output condition code " #c "*/\n"
# define CC_OUT(c) "=@cc" #c
-#else
-# define CC_SET(c) "\n\tset" #c " %[_cc_" #c "]\n"
-# define CC_OUT(c) [_cc_ ## c] "=qm"
-#endif
#ifdef __KERNEL__
diff --git a/tools/perf/bench/find-bit-bench.c b/tools/perf/bench/find-bit-bench.c
index 7e25b0e413f6..99d36dff9d86 100644
--- a/tools/perf/bench/find-bit-bench.c
+++ b/tools/perf/bench/find-bit-bench.c
@@ -37,7 +37,6 @@ static noinline void workload(int val)
accumulator++;
}
-#if (defined(__i386__) || defined(__x86_64__)) && defined(__GCC_ASM_FLAG_OUTPUTS__)
static bool asm_test_bit(long nr, const unsigned long *addr)
{
bool oldbit;
@@ -48,9 +47,6 @@ static bool asm_test_bit(long nr, const unsigned long *addr)
return oldbit;
}
-#else
-#define asm_test_bit test_bit
-#endif
static int do_for_each_set_bit(unsigned int num_bits)
{
--
2.48.1
* [PATCH 06/20] x86/asm: Remove CC_SET()
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Now that flag output operands are unconditionally supported, CC_SET() is
just a comment. Remove it.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/boot/bitops.h | 2 +-
arch/x86/boot/boot.h | 4 ++--
arch/x86/boot/string.c | 2 +-
arch/x86/include/asm/archrandom.h | 2 --
arch/x86/include/asm/asm.h | 3 +--
arch/x86/include/asm/bitops.h | 6 ------
arch/x86/include/asm/cmpxchg.h | 4 ----
arch/x86/include/asm/cmpxchg_32.h | 2 --
arch/x86/include/asm/cmpxchg_64.h | 1 -
arch/x86/include/asm/percpu.h | 4 ----
arch/x86/include/asm/rmwcc.h | 2 +-
arch/x86/include/asm/sev.h | 1 -
arch/x86/include/asm/signal.h | 2 +-
arch/x86/include/asm/special_insns.h | 1 -
arch/x86/include/asm/uaccess.h | 1 -
tools/arch/x86/include/asm/asm.h | 5 ++---
16 files changed, 9 insertions(+), 33 deletions(-)
diff --git a/arch/x86/boot/bitops.h b/arch/x86/boot/bitops.h
index 8518ae214c9b..4f773e0957b0 100644
--- a/arch/x86/boot/bitops.h
+++ b/arch/x86/boot/bitops.h
@@ -27,7 +27,7 @@ static inline bool variable_test_bit(int nr, const void *addr)
bool v;
const u32 *p = addr;
- asm("btl %2,%1" CC_SET(c) : CC_OUT(c) (v) : "m" (*p), "Ir" (nr));
+ asm("btl %2,%1" : CC_OUT(c) (v) : "m" (*p), "Ir" (nr));
return v;
}
diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 0f24f7ebec9b..a35823039847 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -155,14 +155,14 @@ static inline void wrgs32(u32 v, addr_t addr)
static inline bool memcmp_fs(const void *s1, addr_t s2, size_t len)
{
bool diff;
- asm volatile("fs; repe; cmpsb" CC_SET(nz)
+ asm volatile("fs; repe; cmpsb"
: CC_OUT(nz) (diff), "+D" (s1), "+S" (s2), "+c" (len));
return diff;
}
static inline bool memcmp_gs(const void *s1, addr_t s2, size_t len)
{
bool diff;
- asm volatile("gs; repe; cmpsb" CC_SET(nz)
+ asm volatile("gs; repe; cmpsb"
: CC_OUT(nz) (diff), "+D" (s1), "+S" (s2), "+c" (len));
return diff;
}
diff --git a/arch/x86/boot/string.c b/arch/x86/boot/string.c
index 84f7a883ce1e..c0cc6d1a7030 100644
--- a/arch/x86/boot/string.c
+++ b/arch/x86/boot/string.c
@@ -32,7 +32,7 @@
int memcmp(const void *s1, const void *s2, size_t len)
{
bool diff;
- asm("repe; cmpsb" CC_SET(nz)
+ asm("repe; cmpsb"
: CC_OUT(nz) (diff), "+D" (s1), "+S" (s2), "+c" (len));
return diff;
}
diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 02bae8e0758b..813a99b6ec7b 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -23,7 +23,6 @@ static inline bool __must_check rdrand_long(unsigned long *v)
unsigned int retry = RDRAND_RETRY_LOOPS;
do {
asm volatile("rdrand %[out]"
- CC_SET(c)
: CC_OUT(c) (ok), [out] "=r" (*v));
if (ok)
return true;
@@ -35,7 +34,6 @@ static inline bool __must_check rdseed_long(unsigned long *v)
{
bool ok;
asm volatile("rdseed %[out]"
- CC_SET(c)
: CC_OUT(c) (ok), [out] "=r" (*v));
return ok;
}
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index fdebd4356860..619817841f4c 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -128,10 +128,9 @@ static __always_inline __pure void *rip_rel_ptr(void *p)
#endif
/*
- * Macros to generate condition code outputs from inline assembly,
+ * Generate condition code outputs from inline assembly.
* The output operand must be type "bool".
*/
-#define CC_SET(c) "\n\t/* output condition code " #c "*/\n"
#define CC_OUT(c) "=@cc" #c
#ifdef __KERNEL__
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b96d45944c59..67b86e7c1ea3 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -99,7 +99,6 @@ static __always_inline bool arch_xor_unlock_is_negative_byte(unsigned long mask,
{
bool negative;
asm volatile(LOCK_PREFIX "xorb %2,%1"
- CC_SET(s)
: CC_OUT(s) (negative), WBYTE_ADDR(addr)
: "iq" ((char)mask) : "memory");
return negative;
@@ -149,7 +148,6 @@ arch___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
bool oldbit;
asm(__ASM_SIZE(bts) " %2,%1"
- CC_SET(c)
: CC_OUT(c) (oldbit)
: ADDR, "Ir" (nr) : "memory");
return oldbit;
@@ -175,7 +173,6 @@ arch___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
bool oldbit;
asm volatile(__ASM_SIZE(btr) " %2,%1"
- CC_SET(c)
: CC_OUT(c) (oldbit)
: ADDR, "Ir" (nr) : "memory");
return oldbit;
@@ -187,7 +184,6 @@ arch___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
bool oldbit;
asm volatile(__ASM_SIZE(btc) " %2,%1"
- CC_SET(c)
: CC_OUT(c) (oldbit)
: ADDR, "Ir" (nr) : "memory");
@@ -211,7 +207,6 @@ static __always_inline bool constant_test_bit_acquire(long nr, const volatile un
bool oldbit;
asm volatile("testb %2,%1"
- CC_SET(nz)
: CC_OUT(nz) (oldbit)
: "m" (((unsigned char *)addr)[nr >> 3]),
"i" (1 << (nr & 7))
@@ -225,7 +220,6 @@ static __always_inline bool variable_test_bit(long nr, volatile const unsigned l
bool oldbit;
asm volatile(__ASM_SIZE(bt) " %2,%1"
- CC_SET(c)
: CC_OUT(c) (oldbit)
: "m" (*(unsigned long *)addr), "Ir" (nr) : "memory");
diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index fd8afc1f5f6b..e801dc982a64 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -166,7 +166,6 @@ extern void __add_wrong_size(void)
{ \
volatile u8 *__ptr = (volatile u8 *)(_ptr); \
asm volatile(lock "cmpxchgb %[new], %[ptr]" \
- CC_SET(z) \
: CC_OUT(z) (success), \
[ptr] "+m" (*__ptr), \
[old] "+a" (__old) \
@@ -178,7 +177,6 @@ extern void __add_wrong_size(void)
{ \
volatile u16 *__ptr = (volatile u16 *)(_ptr); \
asm volatile(lock "cmpxchgw %[new], %[ptr]" \
- CC_SET(z) \
: CC_OUT(z) (success), \
[ptr] "+m" (*__ptr), \
[old] "+a" (__old) \
@@ -190,7 +188,6 @@ extern void __add_wrong_size(void)
{ \
volatile u32 *__ptr = (volatile u32 *)(_ptr); \
asm volatile(lock "cmpxchgl %[new], %[ptr]" \
- CC_SET(z) \
: CC_OUT(z) (success), \
[ptr] "+m" (*__ptr), \
[old] "+a" (__old) \
@@ -202,7 +199,6 @@ extern void __add_wrong_size(void)
{ \
volatile u64 *__ptr = (volatile u64 *)(_ptr); \
asm volatile(lock "cmpxchgq %[new], %[ptr]" \
- CC_SET(z) \
: CC_OUT(z) (success), \
[ptr] "+m" (*__ptr), \
[old] "+a" (__old) \
diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index 6d7afee2fa07..50f973ca635d 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -46,7 +46,6 @@ static __always_inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new
bool ret; \
\
asm volatile(_lock "cmpxchg8b %[ptr]" \
- CC_SET(e) \
: CC_OUT(e) (ret), \
[ptr] "+m" (*(_ptr)), \
"+a" (o.low), "+d" (o.high) \
@@ -125,7 +124,6 @@ static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64
ALTERNATIVE(_lock_loc \
"call cmpxchg8b_emu", \
_lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8) \
- CC_SET(e) \
: ALT_OUTPUT_SP(CC_OUT(e) (ret), \
"+a" (o.low), "+d" (o.high)) \
: "b" (n.low), "c" (n.high), \
diff --git a/arch/x86/include/asm/cmpxchg_64.h b/arch/x86/include/asm/cmpxchg_64.h
index 5e241306db26..03ab7699648c 100644
--- a/arch/x86/include/asm/cmpxchg_64.h
+++ b/arch/x86/include/asm/cmpxchg_64.h
@@ -66,7 +66,6 @@ static __always_inline u128 arch_cmpxchg128_local(volatile u128 *ptr, u128 old,
bool ret; \
\
asm volatile(_lock "cmpxchg16b %[ptr]" \
- CC_SET(e) \
: CC_OUT(e) (ret), \
[ptr] "+m" (*(_ptr)), \
"+a" (o.low), "+d" (o.high) \
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 462d071c87d4..9c95f2576df1 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -294,7 +294,6 @@ do { \
\
asm qual (__pcpu_op_##size("cmpxchg") "%[nval], " \
__percpu_arg([var]) \
- CC_SET(z) \
: CC_OUT(z) (success), \
[oval] "+a" (pco_old__), \
[var] "+m" (__my_cpu_var(_var)) \
@@ -352,7 +351,6 @@ do { \
asm_inline qual ( \
ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
"cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
- CC_SET(z) \
: ALT_OUTPUT_SP(CC_OUT(z) (success), \
[var] "+m" (__my_cpu_var(_var)), \
"+a" (old__.low), "+d" (old__.high)) \
@@ -421,7 +419,6 @@ do { \
asm_inline qual ( \
ALTERNATIVE("call this_cpu_cmpxchg16b_emu", \
"cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
- CC_SET(z) \
: ALT_OUTPUT_SP(CC_OUT(z) (success), \
[var] "+m" (__my_cpu_var(_var)), \
"+a" (old__.low), "+d" (old__.high)) \
@@ -570,7 +567,6 @@ do { \
bool oldbit; \
\
asm volatile("btl %[nr], " __percpu_arg([var]) \
- CC_SET(c) \
: CC_OUT(c) (oldbit) \
: [var] "m" (__my_cpu_var(_var)), \
[nr] "rI" (_nr)); \
diff --git a/arch/x86/include/asm/rmwcc.h b/arch/x86/include/asm/rmwcc.h
index a54303e3dfa1..081311e22438 100644
--- a/arch/x86/include/asm/rmwcc.h
+++ b/arch/x86/include/asm/rmwcc.h
@@ -9,7 +9,7 @@
#define __GEN_RMWcc(fullop, _var, cc, clobbers, ...) \
({ \
bool c; \
- asm volatile (fullop CC_SET(cc) \
+ asm volatile (fullop \
: [var] "+m" (_var), CC_OUT(cc) (c) \
: __VA_ARGS__ : clobbers); \
c; \
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index ba7999f66abe..5bd058d3e133 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -440,7 +440,6 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
/* "pvalidate" mnemonic support in binutils 2.36 and newer */
asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFF\n\t"
- CC_SET(c)
: CC_OUT(c) (no_rmpupdate), "=a"(rc)
: "a"(vaddr), "c"(rmp_psize), "d"(validate)
: "memory", "cc");
diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
index 4a4043ca6493..e0d37bf00f27 100644
--- a/arch/x86/include/asm/signal.h
+++ b/arch/x86/include/asm/signal.h
@@ -83,7 +83,7 @@ static inline int __const_sigismember(sigset_t *set, int _sig)
static inline int __gen_sigismember(sigset_t *set, int _sig)
{
bool ret;
- asm("btl %2,%1" CC_SET(c)
+ asm("btl %2,%1"
: CC_OUT(c) (ret) : "m"(*set), "Ir"(_sig-1));
return ret;
}
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index b905076cf7f6..9c1cc0ef8f3c 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -274,7 +274,6 @@ static inline int enqcmds(void __iomem *dst, const void *src)
* See movdir64b()'s comment on operand specification.
*/
asm volatile(".byte 0xf3, 0x0f, 0x38, 0xf8, 0x02, 0x66, 0x90"
- CC_SET(z)
: CC_OUT(z) (zf), "+m" (*__dst)
: "m" (*__src), "a" (__dst), "d" (__src));
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 3a7755c1a441..c37063121aaa 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -417,7 +417,6 @@ do { \
__typeof__(*(_ptr)) __new = (_new); \
asm volatile("\n" \
"1: " LOCK_PREFIX "cmpxchg"itype" %[new], %[ptr]\n"\
- CC_SET(z) \
"2:\n" \
_ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG, \
%[errout]) \
diff --git a/tools/arch/x86/include/asm/asm.h b/tools/arch/x86/include/asm/asm.h
index f66cf34f6197..b97b3b53045f 100644
--- a/tools/arch/x86/include/asm/asm.h
+++ b/tools/arch/x86/include/asm/asm.h
@@ -109,11 +109,10 @@
#endif
/*
- * Macros to generate condition code outputs from inline assembly,
+ * Generate condition code outputs from inline assembly.
* The output operand must be type "bool".
*/
-# define CC_SET(c) "\n\t/* output condition code " #c "*/\n"
-# define CC_OUT(c) "=@cc" #c
+#define CC_OUT(c) "=@cc" #c
#ifdef __KERNEL__
--
2.48.1
* Re: [PATCH 06/20] x86/asm: Remove CC_SET()
From: Uros Bizjak @ 2025-03-15 9:25 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 10:41 PM Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> Now that flag output operands are unconditionally supported, CC_SET() is
> just a comment. Remove it.
Can you also replace CC_OUT with "=@cc"? CC_SET and CC_OUT were used
together to handle compilers without flag outputs support, and they
should die together. This change should be a simple string
replacement.
Thanks,
Uros.
* [PATCH 07/20] x86/alternative: Remove operand numbering restrictions
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
alternative_input() and alternative_io() arbitrarily require input
constraint operand numbering to start at index 1. No more callers rely
on that. Simplify the interfaces by removing that restriction.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/alternative.h | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 484dfea35aaa..3804b82cb03c 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -214,17 +214,15 @@ static inline int alternatives_text_reserved(void *start, void *end)
*
* Peculiarities:
* No memory clobber here.
- * Argument numbers start with 1.
- * Leaving an unused argument 0 to keep API compatibility.
*/
#define alternative_input(oldinstr, newinstr, ft_flags, input...) \
asm_inline volatile(ALTERNATIVE(oldinstr, newinstr, ft_flags) \
- : : "i" (0), ## input)
+ : : input)
/* Like alternative_input, but with a single output argument */
#define alternative_io(oldinstr, newinstr, ft_flags, output, input...) \
asm_inline volatile(ALTERNATIVE(oldinstr, newinstr, ft_flags) \
- : output : "i" (0), ## input)
+ : output : input)
/*
* Like alternative_io, but for replacing a direct call with another one.
--
2.48.1
* [PATCH 08/20] x86/asm: Replace ASM_{OUTPUT,INPUT}() with ARG()
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Replace ASM_OUTPUT() and ASM_INPUT() with ARG(). It provides more
visual separation and vertical alignment, making it easy to distinguish
the outputs, inputs and clobbers. It will also come in handy for other
inline asm wrappers.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/apic.h | 2 +-
arch/x86/include/asm/asm.h | 11 ++--
arch/x86/include/asm/atomic64_32.h | 93 +++++++++++++++-------------
arch/x86/include/asm/page_64.h | 12 ++--
arch/x86/include/asm/segment.h | 5 +-
arch/x86/include/asm/special_insns.h | 2 +-
6 files changed, 67 insertions(+), 58 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index ecf1b229f09b..6526bad6ec81 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -100,7 +100,7 @@ static inline void native_apic_mem_write(u32 reg, u32 v)
alternative_io("movl %[val], %[mem]",
"xchgl %[val], %[mem]", X86_BUG_11AP,
- ASM_OUTPUT([val] "+r" (v), [mem] "+m" (*addr)));
+ ARG([val] "+r" (v), [mem] "+m" (*addr)));
}
static inline u32 native_apic_mem_read(u32 reg)
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 619817841f4c..9f0f830628f9 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -212,11 +212,14 @@ static __always_inline __pure void *rip_rel_ptr(void *p)
#define __COMMA(...) , ##__VA_ARGS__
/*
- * Combine multiple asm inline constraint args into a single arg for passing to
- * another macro.
+ * ARG() can be used to bundle multiple arguments into a single argument for
+ * passing to a macro.
+ *
+ * For inline asm constraint operands, this is recommended even for single
+ * operands as it provides visual separation and vertical alignment similar to
+ * the ':' characters in an inline asm statement.
*/
-#define ASM_OUTPUT(x...) x
-#define ASM_INPUT(x...) x
+#define ARG(x...) x
/*
* This output constraint should be used for any inline asm which has a "call"
diff --git a/arch/x86/include/asm/atomic64_32.h b/arch/x86/include/asm/atomic64_32.h
index ab838205c1c6..8775f84222e6 100644
--- a/arch/x86/include/asm/atomic64_32.h
+++ b/arch/x86/include/asm/atomic64_32.h
@@ -59,9 +59,11 @@ static __always_inline s64 arch_atomic64_read_nonatomic(const atomic64_t *v)
#define ATOMIC64_DECL(sym) ATOMIC64_DECL_ONE(sym##_cx8)
#else
#define __alternative_atomic64(f, g, out, in, clobbers...) \
- alternative_call(atomic64_##f##_386, atomic64_##g##_cx8, \
- X86_FEATURE_CX8, ASM_OUTPUT(out), \
- ASM_INPUT(in), clobbers)
+ alternative_call(atomic64_##f##_386, \
+ atomic64_##g##_cx8, X86_FEATURE_CX8, \
+ ARG(out), \
+ ARG(in), \
+ ARG(clobbers))
#define ATOMIC64_DECL(sym) ATOMIC64_DECL_ONE(sym##_cx8); \
ATOMIC64_DECL_ONE(sym##_386)
@@ -73,7 +75,7 @@ ATOMIC64_DECL_ONE(dec_386);
#endif
#define alternative_atomic64(f, out, in, clobbers...) \
- __alternative_atomic64(f, f, ASM_OUTPUT(out), ASM_INPUT(in), clobbers)
+ __alternative_atomic64(f, f, ARG(out), ARG(in), ARG(clobbers))
ATOMIC64_DECL(read);
ATOMIC64_DECL(set);
@@ -109,9 +111,9 @@ static __always_inline s64 arch_atomic64_xchg(atomic64_t *v, s64 n)
unsigned high = (unsigned)(n >> 32);
unsigned low = (unsigned)n;
alternative_atomic64(xchg,
- "=&A" (o),
- ASM_INPUT("S" (v), "b" (low), "c" (high)),
- "memory");
+ ARG("=&A" (o)),
+ ARG("S" (v), "b" (low), "c" (high)),
+ ARG("memory"));
return o;
}
#define arch_atomic64_xchg arch_atomic64_xchg
@@ -121,24 +123,27 @@ static __always_inline void arch_atomic64_set(atomic64_t *v, s64 i)
unsigned high = (unsigned)(i >> 32);
unsigned low = (unsigned)i;
alternative_atomic64(set,
- /* no output */,
- ASM_INPUT("S" (v), "b" (low), "c" (high)),
- "eax", "edx", "memory");
+ ARG(),
+ ARG("S" (v), "b" (low), "c" (high)),
+ ARG("eax", "edx", "memory"));
}
static __always_inline s64 arch_atomic64_read(const atomic64_t *v)
{
s64 r;
- alternative_atomic64(read, "=&A" (r), "c" (v), "memory");
+ alternative_atomic64(read,
+ ARG("=&A" (r)),
+ ARG("c" (v)),
+ ARG("memory"));
return r;
}
static __always_inline s64 arch_atomic64_add_return(s64 i, atomic64_t *v)
{
alternative_atomic64(add_return,
- ASM_OUTPUT("+A" (i), "+c" (v)),
- /* no input */,
- "memory");
+ ARG("+A" (i), "+c" (v)),
+ ARG(),
+ ARG("memory"));
return i;
}
#define arch_atomic64_add_return arch_atomic64_add_return
@@ -146,9 +151,9 @@ static __always_inline s64 arch_atomic64_add_return(s64 i, atomic64_t *v)
static __always_inline s64 arch_atomic64_sub_return(s64 i, atomic64_t *v)
{
alternative_atomic64(sub_return,
- ASM_OUTPUT("+A" (i), "+c" (v)),
- /* no input */,
- "memory");
+ ARG("+A" (i), "+c" (v)),
+ ARG(),
+ ARG("memory"));
return i;
}
#define arch_atomic64_sub_return arch_atomic64_sub_return
@@ -157,9 +162,9 @@ static __always_inline s64 arch_atomic64_inc_return(atomic64_t *v)
{
s64 a;
alternative_atomic64(inc_return,
- "=&A" (a),
- "S" (v),
- "memory", "ecx");
+ ARG("=&A" (a)),
+ ARG("S" (v)),
+ ARG("memory", "ecx"));
return a;
}
#define arch_atomic64_inc_return arch_atomic64_inc_return
@@ -168,9 +173,9 @@ static __always_inline s64 arch_atomic64_dec_return(atomic64_t *v)
{
s64 a;
alternative_atomic64(dec_return,
- "=&A" (a),
- "S" (v),
- "memory", "ecx");
+ ARG("=&A" (a)),
+ ARG("S" (v)),
+ ARG("memory", "ecx"));
return a;
}
#define arch_atomic64_dec_return arch_atomic64_dec_return
@@ -178,34 +183,34 @@ static __always_inline s64 arch_atomic64_dec_return(atomic64_t *v)
static __always_inline void arch_atomic64_add(s64 i, atomic64_t *v)
{
__alternative_atomic64(add, add_return,
- ASM_OUTPUT("+A" (i), "+c" (v)),
- /* no input */,
- "memory");
+ ARG("+A" (i), "+c" (v)),
+ ARG(),
+ ARG("memory"));
}
static __always_inline void arch_atomic64_sub(s64 i, atomic64_t *v)
{
__alternative_atomic64(sub, sub_return,
- ASM_OUTPUT("+A" (i), "+c" (v)),
- /* no input */,
- "memory");
+ ARG("+A" (i), "+c" (v)),
+ ARG(),
+ ARG("memory"));
}
static __always_inline void arch_atomic64_inc(atomic64_t *v)
{
__alternative_atomic64(inc, inc_return,
- /* no output */,
- "S" (v),
- "memory", "eax", "ecx", "edx");
+ ARG(),
+ ARG("S" (v)),
+ ARG("memory", "eax", "ecx", "edx"));
}
#define arch_atomic64_inc arch_atomic64_inc
static __always_inline void arch_atomic64_dec(atomic64_t *v)
{
__alternative_atomic64(dec, dec_return,
- /* no output */,
- "S" (v),
- "memory", "eax", "ecx", "edx");
+ ARG(),
+ ARG("S" (v)),
+ ARG("memory", "eax", "ecx", "edx"));
}
#define arch_atomic64_dec arch_atomic64_dec
@@ -214,9 +219,9 @@ static __always_inline int arch_atomic64_add_unless(atomic64_t *v, s64 a, s64 u)
unsigned low = (unsigned)u;
unsigned high = (unsigned)(u >> 32);
alternative_atomic64(add_unless,
- ASM_OUTPUT("+A" (a), "+c" (low), "+D" (high)),
- "S" (v),
- "memory");
+ ARG("+A" (a), "+c" (low), "+D" (high)),
+ ARG("S" (v)),
+ ARG("memory"));
return (int)a;
}
#define arch_atomic64_add_unless arch_atomic64_add_unless
@@ -225,9 +230,9 @@ static __always_inline int arch_atomic64_inc_not_zero(atomic64_t *v)
{
int r;
alternative_atomic64(inc_not_zero,
- "=&a" (r),
- "S" (v),
- "ecx", "edx", "memory");
+ ARG("=&a" (r)),
+ ARG("S" (v)),
+ ARG("ecx", "edx", "memory"));
return r;
}
#define arch_atomic64_inc_not_zero arch_atomic64_inc_not_zero
@@ -236,9 +241,9 @@ static __always_inline s64 arch_atomic64_dec_if_positive(atomic64_t *v)
{
s64 r;
alternative_atomic64(dec_if_positive,
- "=&A" (r),
- "S" (v),
- "ecx", "memory");
+ ARG("=&A" (r)),
+ ARG("S" (v)),
+ ARG("ecx", "memory"));
return r;
}
#define arch_atomic64_dec_if_positive arch_atomic64_dec_if_positive
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index db3003acd41e..0604e9d49221 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -54,9 +54,9 @@ static inline void clear_page(void *page)
alternative_call_2(clear_page_orig,
clear_page_rep, X86_FEATURE_REP_GOOD,
clear_page_erms, X86_FEATURE_ERMS,
- "=D" (page),
- "D" (page),
- "cc", "memory", "rax", "rcx");
+ ARG("=D" (page)),
+ ARG("D" (page)),
+ ARG("cc", "memory", "rax", "rcx"));
}
void copy_page(void *to, void *from);
@@ -87,9 +87,9 @@ static __always_inline unsigned long task_size_max(void)
alternative_io("movq %[small], %[ret]",
"movq %[large], %[ret]", X86_FEATURE_LA57,
- [ret] "=r" (ret),
- [small] "i" ((1ul << 47)-PAGE_SIZE),
- [large] "i" ((1ul << 56)-PAGE_SIZE));
+ ARG([ret] "=r" (ret)),
+ ARG([small] "i" ((1ul << 47)-PAGE_SIZE),
+ [large] "i" ((1ul << 56)-PAGE_SIZE)));
return ret;
}
diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
index 9d6411c65920..32b1aa9f721b 100644
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -254,10 +254,11 @@ static inline void vdso_read_cpunode(unsigned *cpu, unsigned *node)
*
* If RDPID is available, use it.
*/
- alternative_io ("lsl %[seg],%[p]",
+ alternative_io("lsl %[seg],%[p]",
".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
X86_FEATURE_RDPID,
- [p] "=a" (p), [seg] "r" (__CPUNODE_SEG));
+ ARG([p] "=a" (p)),
+ ARG([seg] "r" (__CPUNODE_SEG)));
if (cpu)
*cpu = (p & VDSO_CPUNODE_MASK);
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 9c1cc0ef8f3c..a6a3f4c95f03 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -178,7 +178,7 @@ static inline void clflushopt(volatile void *__p)
{
alternative_io(".byte 0x3e; clflush %[val]",
".byte 0x66; clflush %[val]", X86_FEATURE_CLFLUSHOPT,
- [val] "+m" (*(volatile char __force *)__p));
+ ARG([val] "+m" (*(volatile char __force *)__p)));
}
static inline void clwb(volatile void *__p)
--
2.48.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 09/20] x86/alternative: Simplify alternative_io() interface
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (7 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 08/20] x86/asm: Replace ASM_{OUTPUT,INPUT}() with ARG() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-14 21:41 ` [PATCH 10/20] x86/alternative: Add alternative_2_io() Josh Poimboeuf
` (11 subsequent siblings)
20 siblings, 0 replies; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Similar to alternative_call(), change alternative_io() to allow outputs,
inputs, and clobbers to be specified individually.
Also add in the "memory" clobber for consistent behavior with the other
alternative macros.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/alternative.h | 15 +++++++++++----
arch/x86/include/asm/apic.h | 3 ++-
arch/x86/include/asm/special_insns.h | 3 ++-
3 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 3804b82cb03c..870b1633e1e0 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -219,10 +219,17 @@ static inline int alternatives_text_reserved(void *start, void *end)
asm_inline volatile(ALTERNATIVE(oldinstr, newinstr, ft_flags) \
: : input)
-/* Like alternative_input, but with a single output argument */
-#define alternative_io(oldinstr, newinstr, ft_flags, output, input...) \
- asm_inline volatile(ALTERNATIVE(oldinstr, newinstr, ft_flags) \
- : output : input)
+/*
+ * Alternative inline assembly with input, output and clobbers.
+ *
+ * All @output, @input, and @clobbers should be wrapped with ARG() for both
+ * functionality and readability reasons.
+ */
+#define alternative_io(oldinstr, newinstr, ft_flags, output, input, clobbers...) \
+ asm_inline volatile(ALTERNATIVE(oldinstr, newinstr, ft_flags) \
+ : output \
+ : input \
+ : "memory", ## clobbers)
/*
* Like alternative_io, but for replacing a direct call with another one.
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 6526bad6ec81..8b0c2a392f8b 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -100,7 +100,8 @@ static inline void native_apic_mem_write(u32 reg, u32 v)
alternative_io("movl %[val], %[mem]",
"xchgl %[val], %[mem]", X86_BUG_11AP,
- ARG([val] "+r" (v), [mem] "+m" (*addr)));
+ ARG([val] "+r" (v), [mem] "+m" (*addr)),
+ ARG());
}
static inline u32 native_apic_mem_read(u32 reg)
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index a6a3f4c95f03..16fb2bc09059 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -178,7 +178,8 @@ static inline void clflushopt(volatile void *__p)
{
alternative_io(".byte 0x3e; clflush %[val]",
".byte 0x66; clflush %[val]", X86_FEATURE_CLFLUSHOPT,
- ARG([val] "+m" (*(volatile char __force *)__p)));
+ ARG([val] "+m" (*(volatile char __force *)__p)),
+ ARG());
}
static inline void clwb(volatile void *__p)
--
2.48.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 10/20] x86/alternative: Add alternative_2_io()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (8 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 09/20] x86/alternative: Simplify alternative_io() interface Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-14 21:41 ` [PATCH 11/20] x86/alternative: Make alternative() a wrapper around alternative_io() Josh Poimboeuf
` (10 subsequent siblings)
20 siblings, 0 replies; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Make an ALTERNATIVE_2() version of alternative_io().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/alternative.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 870b1633e1e0..0acbb013e7ae 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -231,6 +231,14 @@ static inline int alternatives_text_reserved(void *start, void *end)
: input \
: "memory", ## clobbers)
+#define alternative_2_io(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2, \
+ output, input, clobbers...) \
+ asm_inline volatile(ALTERNATIVE_2(oldinstr, newinstr1, ft_flags1, \
+ newinstr2, ft_flags2) \
+ : output \
+ : input \
+ : "memory", ## clobbers)
+
/*
* Like alternative_io, but for replacing a direct call with another one.
*
--
2.48.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 11/20] x86/alternative: Make alternative() a wrapper around alternative_io()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (9 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 10/20] x86/alternative: Add alternative_2_io() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-14 21:41 ` [PATCH 12/20] x86/cpu: Use alternative_io() in prefetch[w]() Josh Poimboeuf
` (9 subsequent siblings)
20 siblings, 0 replies; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
To reduce the number of independent implementations, make alternative()
and alternative_2() wrappers around alternative_io() and
alternative_2_io() respectively.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/alternative.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 0acbb013e7ae..92014d56e3aa 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -204,10 +204,10 @@ static inline int alternatives_text_reserved(void *start, void *end)
* without volatile and memory clobber.
*/
#define alternative(oldinstr, newinstr, ft_flags) \
- asm_inline volatile(ALTERNATIVE(oldinstr, newinstr, ft_flags) : : : "memory")
+ alternative_io(oldinstr, newinstr, ft_flags,,)
#define alternative_2(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2) \
- asm_inline volatile(ALTERNATIVE_2(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2) ::: "memory")
+ alternative_2_io(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2,,)
/*
* Alternative inline assembly with input.
--
2.48.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 12/20] x86/cpu: Use alternative_io() in prefetch[w]()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (10 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 11/20] x86/alternative: Make alternative() a wrapper around alternative_io() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-14 21:41 ` [PATCH 13/20] x86/alternative: Remove alternative_input() Josh Poimboeuf
` (8 subsequent siblings)
20 siblings, 0 replies; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use the new alternative_io() interface in preparation for removing
alternative_input().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/processor.h | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 2e9566134949..a1baf2fc5f9b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -624,9 +624,10 @@ extern char ignore_fpu_irq;
*/
static inline void prefetch(const void *x)
{
- alternative_input(BASE_PREFETCH,
- "prefetchnta %[val]", X86_FEATURE_XMM,
- [val] "m" (*(const char *)x));
+ alternative_io(BASE_PREFETCH,
+ "prefetchnta %[val]", X86_FEATURE_XMM,
+ ARG(),
+ ARG([val] "m" (*(const char *)x)));
}
/*
@@ -636,9 +637,10 @@ static inline void prefetch(const void *x)
*/
static __always_inline void prefetchw(const void *x)
{
- alternative_input(BASE_PREFETCH,
- "prefetchw %[val]", X86_FEATURE_3DNOWPREFETCH,
- [val] "m" (*(const char *)x));
+ alternative_io(BASE_PREFETCH,
+ "prefetchw %[val]", X86_FEATURE_3DNOWPREFETCH,
+ ARG(),
+ ARG([val] "m" (*(const char *)x)));
}
#define TOP_OF_INIT_STACK ((unsigned long)&init_stack + sizeof(init_stack) - \
--
2.48.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 13/20] x86/alternative: Remove alternative_input()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (11 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 12/20] x86/cpu: Use alternative_io() in prefetch[w]() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-14 21:41 ` [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions Josh Poimboeuf
` (7 subsequent siblings)
20 siblings, 0 replies; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
alternative_input() is redundant with alternative_io(), and has no more
users. Remove it.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/alternative.h | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 92014d56e3aa..119c24329ef1 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -209,16 +209,6 @@ static inline int alternatives_text_reserved(void *start, void *end)
#define alternative_2(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2) \
alternative_2_io(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2,,)
-/*
- * Alternative inline assembly with input.
- *
- * Peculiarities:
- * No memory clobber here.
- */
-#define alternative_input(oldinstr, newinstr, ft_flags, input...) \
- asm_inline volatile(ALTERNATIVE(oldinstr, newinstr, ft_flags) \
- : : input)
-
/*
* Alternative inline assembly with input, output and clobbers.
*
--
2.48.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (12 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 13/20] x86/alternative: Remove alternative_input() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-14 23:49 ` Linus Torvalds
2025-03-14 21:41 ` [PATCH 15/20] x86/cpu/amd: Use named asm operands in asm_clear_divider() Josh Poimboeuf
` (6 subsequent siblings)
20 siblings, 1 reply; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use the standard alternative_io() interface.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/barrier.h | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index db70832232d4..489a7ea76384 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -12,12 +12,23 @@
*/
#ifdef CONFIG_X86_32
-#define mb() asm volatile(ALTERNATIVE("lock addl $0,-4(%%esp)", "mfence", \
- X86_FEATURE_XMM2) ::: "memory", "cc")
-#define rmb() asm volatile(ALTERNATIVE("lock addl $0,-4(%%esp)", "lfence", \
- X86_FEATURE_XMM2) ::: "memory", "cc")
-#define wmb() asm volatile(ALTERNATIVE("lock addl $0,-4(%%esp)", "sfence", \
- X86_FEATURE_XMM2) ::: "memory", "cc")
+#define mb() alternative_io("lock addl $0,-4(%%esp)", \
+ "mfence", X86_FEATURE_XMM2, \
+ ARG(), \
+ ARG(), \
+ ARG("memory", "cc"))
+
+#define rmb() alternative_io("lock addl $0, -4(%%esp)", \
+ "lfence", X86_FEATURE_XMM2, \
+ ARG(), \
+ ARG(), \
+ ARG("memory", "cc"))
+
+#define wmb() alternative_io("lock addl $0, -4(%%esp)", \
+ "sfence", X86_FEATURE_XMM2, \
+ ARG(), \
+ ARG(), \
+ ARG("memory", "cc"))
#else
#define __mb() asm volatile("mfence":::"memory")
#define __rmb() asm volatile("lfence":::"memory")
--
2.48.1
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-14 21:41 ` [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions Josh Poimboeuf
@ 2025-03-14 23:49 ` Linus Torvalds
2025-03-14 23:54 ` Linus Torvalds
` (2 more replies)
0 siblings, 3 replies; 47+ messages in thread
From: Linus Torvalds @ 2025-03-14 23:49 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
On Fri, 14 Mar 2025 at 11:42, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> +#define mb() alternative_io("lock addl $0,-4(%%esp)", \
> + "mfence", X86_FEATURE_XMM2, \
> + ARG(), \
> + ARG(), \
> + ARG("memory", "cc"))
So all of these patches look like good cleanups to me, but I do wonder
if we should
(a) not use some naming *quite* as generic as 'ARG()'
(b) make the asms use ARG_OUT/ARG_IN/ARG_CLOBBER() to clarify
because that ARG(), ARG(), ARGC() pattern looks odd to me.
Maybe it's just me.
Regardless, I do think the series looks like a nice improvement even
in the current form, even if that particular repeated pattern feels
strange.
Linus
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-14 23:49 ` Linus Torvalds
@ 2025-03-14 23:54 ` Linus Torvalds
2025-03-15 0:09 ` Josh Poimboeuf
2025-03-15 0:05 ` Josh Poimboeuf
2025-03-15 8:52 ` Ingo Molnar
2 siblings, 1 reply; 47+ messages in thread
From: Linus Torvalds @ 2025-03-14 23:54 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
On Fri, 14 Mar 2025 at 13:49, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> because that ARG(), ARG(), ARGC() pattern looks odd to me.
>
> Maybe it's just me.
Oh, and the other thing I reacted to is that I think the
"alternative_io()" thing should be renamed.
The "io" makes me think "actual I/O". As in PCI or disks or whatever.
It always read oddly, but now it's *comletely* pointless, because the
new macro model actually takes pretty much arbitrary asm arguments, to
the "both input and output arguments" no longer makes any real sense.
So I think it would be better to just call this "alternative_asm()",
and make naming simpler. Hmm?
Linus
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-14 23:54 ` Linus Torvalds
@ 2025-03-15 0:09 ` Josh Poimboeuf
2025-03-15 0:16 ` Linus Torvalds
0 siblings, 1 reply; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-15 0:09 UTC (permalink / raw)
To: Linus Torvalds
Cc: x86, linux-kernel, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 01:54:00PM -1000, Linus Torvalds wrote:
> On Fri, 14 Mar 2025 at 13:49, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > because that ARG(), ARG(), ARGC() pattern looks odd to me.
> >
> > Maybe it's just me.
>
> Oh, and the other thing I reacted to is that I think the
> "alternative_io()" thing should be renamed.
>
> The "io" makes me think "actual I/O". As in PCI or disks or whatever.
> It always read oddly, but now it's *completely* pointless, because the
> new macro model actually takes pretty much arbitrary asm arguments, so
> the "both input and output arguments" naming no longer makes any real sense.
>
> So I think it would be better to just call this "alternative_asm()",
> and make naming simpler. Hmm?
Thing is, we still have alternative(), which is also an asm wrapper, but
it's for when the caller doesn't care about adding any constraints.
So the "_io()" distinguishes from that.
--
Josh
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-15 0:09 ` Josh Poimboeuf
@ 2025-03-15 0:16 ` Linus Torvalds
2025-03-15 8:47 ` Ingo Molnar
0 siblings, 1 reply; 47+ messages in thread
From: Linus Torvalds @ 2025-03-15 0:16 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
On Fri, 14 Mar 2025 at 14:09, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> Thing is, we still have alternative(), which is also an asm wrapper, but
> it's for when the caller doesn't care about adding any constraints.
>
> So the "_io()" distinguishes from that.
.. but I think it does so very badly because "io" really means
something else entirely in absolutely all other contexts.
And it really makes no sense as "io", since it doesn't take inputs and
outputs, it takes inputs, outputs AND CLOBBERS.
So it would make more sense to call it "ioc", but that's just obvious
nonsense, and "ioc" is already taken as a globally recognized
shorthand for "corruption in sports".
So "ioc" is bad too, but that should make you go "Oh, 'io' is _doubly_
nonsensical".
Ergo: I think "asm" would be a better distinguishing marker, with the
plain "alternative()" being used for particularly simple asms.
Linus
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-15 0:16 ` Linus Torvalds
@ 2025-03-15 8:47 ` Ingo Molnar
0 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-15 8:47 UTC (permalink / raw)
To: Linus Torvalds
Cc: Josh Poimboeuf, x86, linux-kernel, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Uros Bizjak, Andrew Cooper
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Fri, 14 Mar 2025 at 14:09, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> >
> > Thing is, we still have alternative(), which is also an asm wrapper, but
> > it's for when the caller doesn't care about adding any constraints.
> >
> > So the "_io()" distinguishes from that.
>
> .. but I think it does so very badly because "io" really means
> something else entirely in absolutely all other contexts.
Yeah, alternative_io() is really a misnomer we should fix.
As a minor side note, it's *doubly* a misnomer, because 'io' mixes up
the defined 'o/i' order of the output/input constraints:
arch/x86/include/asm/alternative.h:#define alternative_io(oldinstr, newinstr, ft_flags, output, input...) \
So it should have been alternative_oi().
> And it really makes no sense as "io", since it doesn't take inputs and
> outputs, it takes inputs, outputs AND CLOBBERS.
>
> So it would make more sense to call it "ioc", but that's just obvious
> nonsense, and "ioc" is already taken as a globally recognized
> shorthand for "corruption in sports".
lol ...
> So "ioc" is bad too, but that should make you go "Oh, 'io' is _doubly_
> nonsensical".
>
> Ergo: I think "asm" would be a better distinguishing marker, with the
> plain "alternative()" being used for particularly simple asms.
Yeah, alternative_asm() or alternative_opts(). Anything but '_io()' :-)
Thanks,
Ingo
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-14 23:49 ` Linus Torvalds
2025-03-14 23:54 ` Linus Torvalds
@ 2025-03-15 0:05 ` Josh Poimboeuf
2025-03-15 9:14 ` Ingo Molnar
2025-03-17 20:04 ` David Laight
2025-03-15 8:52 ` Ingo Molnar
2 siblings, 2 replies; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-15 0:05 UTC (permalink / raw)
To: Linus Torvalds
Cc: x86, linux-kernel, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 01:49:48PM -1000, Linus Torvalds wrote:
> So all of these patches look like good cleanups to me, but I do wonder
> if we should
>
> (a) not use some naming *quite* as generic as 'ARG()'
>
> (b) make the asms use ARG_OUT/ARG_IN/ARG_CLOBBER() to clarify
>
> because that ARG(), ARG(), ARGC() pattern looks odd to me.
>
> Maybe it's just me.
>
> Regardless, I do think the series looks like a nice improvement even
> in the current form, even if that particular repeated pattern feels
> strange.
So originally I had ASM_OUTPUT/ASM_INPUT/ASM_CLOBBER, but I ended up
going with ARG() due to its nice vertical alignment and conciseness:
__asm_call(qual, \
ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
"cmpxchg8b " __percpu_arg([var]), \
X86_FEATURE_CX8), \
ARG([var] "+m" (__my_cpu_var(_var)), "+a" (old__.low), \
"+d" (old__.high)), \
ARG("b" (new__.low), "c" (new__.high), "S" (&(_var))), \
ARG("memory")); \
Though ASM_OUTPUT/ASM_INPUT/ASM_CLOBBER isn't so bad either:
__asm_call(qual, \
ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
"cmpxchg8b " __percpu_arg([var]), \
X86_FEATURE_CX8), \
ASM_OUTPUT([var] "+m" (__my_cpu_var(_var)), \
"+a" (old__.low), "+d" (old__.high)), \
ASM_INPUT("b" (new__.low), "c" (new__.high), \
"S" (&(_var))), \
ASM_CLOBBER("memory")); \
That has the nice benefit of being more self-documenting, albeit more
verbose and less vertically aligned.
So I could go either way, really.
--
Josh
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-15 0:05 ` Josh Poimboeuf
@ 2025-03-15 9:14 ` Ingo Molnar
2025-03-17 20:04 ` David Laight
1 sibling, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-15 9:14 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Linus Torvalds, x86, linux-kernel, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Uros Bizjak, Andrew Cooper
* Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> On Fri, Mar 14, 2025 at 01:49:48PM -1000, Linus Torvalds wrote:
> > So all of these patches look like good cleanups to me, but I do wonder
> > if we should
> >
> > (a) not use some naming *quite* as generic as 'ARG()'
> >
> > (b) make the asms use ARG_OUT/ARG_IN/ARG_CLOBBER() to clarify
> >
> > because that ARG(), ARG(), ARGC() pattern looks odd to me.
> >
> > Maybe it's just me.
> >
> > Regardless, I do think the series looks like a nice improvement even
> > in the current form, even if that particular repeated pattern feels
> > strange.
>
> So originally I had ASM_OUTPUT/ASM_INPUT/ASM_CLOBBER, but I ended up
> going with ARG() due to its nice vertical alignment and conciseness:
>
>
> __asm_call(qual, \
> ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
> "cmpxchg8b " __percpu_arg([var]), \
> X86_FEATURE_CX8), \
> ARG([var] "+m" (__my_cpu_var(_var)), "+a" (old__.low), \
> "+d" (old__.high)), \
> ARG("b" (new__.low), "c" (new__.high), "S" (&(_var))), \
> ARG("memory")); \
Two nits:
1)
In justified cases we can align vertically just fine by using spaces:
ASM_OUTPUT ([var] "+m" (__my_cpu_var(_var)), "+a" (old__.low)...
ASM_INPUT ("b" (new__.low), "c" (new__.high), "S" (&(_var))),
ASM_CLOBBER("memory")
But I don't think the vertical alignment of visually disjoint,
comma-separated arguments is an improvement in this specific case.
A *truly* advanced typographically aware syntactic construct would be
something like:
ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
"cmpxchg8b " __percpu_arg([var]), \
X86_FEATURE_CX8), \
\
ASM_OUTPUT( [var] "+m" (__my_cpu_var(_var)), \
"+a" (old__.low), \
"+d" (old__.high)), \
\
ASM_INPUT( "b" (new__.low), \
"c" (new__.high), \
"S" (&(_var))), \
\
ASM_CLOBBER( "memory"));
Note how horizontal and vertical grouping improves readability by an
order of magnitude and properly highlights the only named operand and
makes it very easy to review this code, should it be a new submission
(which it isn't).
And as Knuth said, the intersection of the sets of good coders and good
typographers is necessarily a tiny percentage of humanity
(paraphrased), but I digress ...
2)
If 'ARGS' is included in the naming then I'd like to insist on the
plural 'ARGS', not 'ARG', because the common case for more complicated
asm() statements is multiple asm template constraint arguments
separated by commas.
But I don't think we need the 'ARGS':
> Though ASM_OUTPUT/ASM_INPUT/ASM_CLOBBER isn't so bad either:
>
> __asm_call(qual, \
> ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
> "cmpxchg8b " __percpu_arg([var]), \
> X86_FEATURE_CX8), \
> ASM_OUTPUT([var] "+m" (__my_cpu_var(_var)), \
> "+a" (old__.low), "+d" (old__.high)), \
> ASM_INPUT("b" (new__.low), "c" (new__.high), \
> "S" (&(_var))), \
> ASM_CLOBBER("memory")); \
>
>
> That has the nice benefit of being more self-documenting, albeit more
> verbose and less vertically aligned.
>
> So I could go either way, really.
I'd vote on:
ASM_INPUT(),
ASM_OUTPUT(),
ASM_CLOBBER()
... because the ASM_ prefix is already unique enough visually to reset
any cross-pollination from other well-known namespaces, and because in
coding shorter is better, all other things equal.
Thanks,
Ingo
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-15 0:05 ` Josh Poimboeuf
2025-03-15 9:14 ` Ingo Molnar
@ 2025-03-17 20:04 ` David Laight
2025-03-18 0:11 ` Josh Poimboeuf
1 sibling, 1 reply; 47+ messages in thread
From: David Laight @ 2025-03-17 20:04 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Linus Torvalds, x86, linux-kernel, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Uros Bizjak, Andrew Cooper,
Ingo Molnar
On Fri, 14 Mar 2025 17:05:34 -0700
Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> On Fri, Mar 14, 2025 at 01:49:48PM -1000, Linus Torvalds wrote:
> > So all of these patches look like good cleanups to me, but I do wonder
> > if we should
> >
> > (a) not use some naming *quite* as generic as 'ARG()'
> >
> > (b) make the asms use ARG_OUT/ARG_IN/ARG_CLOBBER() to clarify
> >
> > because that ARG(), ARG(), ARGC() pattern looks odd to me.
> >
> > Maybe it's just me.
> >
> > Regardless, I do think the series looks like a nice improvement even
> > in the current form, even if that particular repeated pattern feels
> > strange.
>
> So originally I had ASM_OUTPUT/ASM_INPUT/ASM_CLOBBER, but I ended up
> going with ARG() due to its nice vertical alignment and conciseness:
But ARG() does look horrid.
Is the ARG() necessary just to handle the comma-separated lists?
If so, is it only actually needed if there is more than one item?
Another option is to just require () and add the ARG in the expansion.
So with:
#define __asm_call(qual, alt, out, in, clobber) \
asm("zzz", ARG out, ARG in, ARG clobber)
__asm_call(qual, ALT(), \
([var] "+m" (__my_cpu_var(_var)), "+a" (old__.low), \
"+d" (old__.high)), \
("b" (new__.low), "c" (new__.high), "S" (&(_var))), \
("memory"));
would get expanded the same as the line below.
David
>
>
> __asm_call(qual, \
> ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
> "cmpxchg8b " __percpu_arg([var]), \
> X86_FEATURE_CX8), \
> ARG([var] "+m" (__my_cpu_var(_var)), "+a" (old__.low), \
> "+d" (old__.high)), \
> ARG("b" (new__.low), "c" (new__.high), "S" (&(_var))), \
> ARG("memory")); \
>
>
> Though ASM_OUTPUT/ASM_INPUT/ASM_CLOBBER isn't so bad either:
>
> __asm_call(qual, \
> ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
> "cmpxchg8b " __percpu_arg([var]), \
> X86_FEATURE_CX8), \
> ASM_OUTPUT([var] "+m" (__my_cpu_var(_var)), \
> "+a" (old__.low), "+d" (old__.high)), \
> ASM_INPUT("b" (new__.low), "c" (new__.high), \
> "S" (&(_var))), \
> ASM_CLOBBER("memory")); \
>
>
> That has the nice benefit of being more self-documenting, albeit more
> verbose and less vertically aligned.
>
> So I could go either way, really.
>
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-17 20:04 ` David Laight
@ 2025-03-18 0:11 ` Josh Poimboeuf
2025-03-18 22:06 ` David Laight
0 siblings, 1 reply; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-18 0:11 UTC (permalink / raw)
To: David Laight
Cc: Linus Torvalds, x86, linux-kernel, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Uros Bizjak, Andrew Cooper,
Ingo Molnar
On Mon, Mar 17, 2025 at 08:04:32PM +0000, David Laight wrote:
> Is the ARG() necessary just to handle the comma separated lists?
> If so is it only actually needed if there is more than one item?
No, but my preference is to require the use of the macro even for single
constraints as it helps visually separate the lists.
> Another option is to just require () and add the ARG in the expansion.
> So with:
> #define __asm_call(qual, alt, out, in, clobber) \
> asm("zzz", ARG out, ARG in, ARG clobber)
>
> __asm_call(qual, ALT(), \
> ([var] "+m" (__my_cpu_var(_var)), "+a" (old__.low), \
> "+d" (old__.high)), \
> ("b" (new__.low), "c" (new__.high), "S" (&(_var))), \
> ("memory"));
>
> would get expanded the same as the line below.
Interesting idea, though I still prefer the ASM_OUTPUT / ASM_INPUT /
ASM_CLOBBER macros, which are self-documenting and make it easier to
read and visually distinguish the constraint lists.
--
Josh
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-18 0:11 ` Josh Poimboeuf
@ 2025-03-18 22:06 ` David Laight
2025-03-18 22:29 ` Josh Poimboeuf
0 siblings, 1 reply; 47+ messages in thread
From: David Laight @ 2025-03-18 22:06 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Linus Torvalds, x86, linux-kernel, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Uros Bizjak, Andrew Cooper,
Ingo Molnar
On Mon, 17 Mar 2025 17:11:58 -0700
Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> On Mon, Mar 17, 2025 at 08:04:32PM +0000, David Laight wrote:
> > Is the ARG() necessary just to handle the comma separated lists?
> > If so is it only actually needed if there is more than one item?
>
> No, but my preference is to require the use of the macro even for single
> constraints as it helps visually separate the lists.
>
> > Another option is to just require () and add the ARG in the expansion.
> > So with:
> > #define __asm_call(qual, alt, out, in, clobber) \
> > asm("zzz", ARG out, ARG in, ARG clobber)
> >
> > __asm_call(qual, ALT(), \
> > ([var] "+m" (__my_cpu_var(_var)), "+a" (old__.low), \
> > "+d" (old__.high)), \
> > ("b" (new__.low), "c" (new__.high), "S" (&(_var))), \
> > ("memory"));
> >
> > would get expanded the same as the line below.
>
> Interesting idea, though I still prefer the self-documenting ASM_OUTPUT
> / ASM_INPUT / ASM_CLOBBER macros which are self-documenting and make it
> easier to read and visually distinguish the constraint lists.
Except that none of this really makes it easier to get out/in in the
correct order or to use the right constraints.
So are you just adding 'syntactic sugar' for no real gain?
Looking back at one of the changes:
-#define mb() asm volatile(ALTERNATIVE("lock addl $0,-4(%%esp)", "mfence", \
- X86_FEATURE_XMM2) ::: "memory", "cc")
+#define mb() alternative_io("lock addl $0,-4(%%esp)", \
+ "mfence", X86_FEATURE_XMM2, \
+ ARG(), \
+ ARG(), \
+ ARG("memory", "cc"))
is it really an improvement?
David
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-18 22:06 ` David Laight
@ 2025-03-18 22:29 ` Josh Poimboeuf
0 siblings, 0 replies; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-18 22:29 UTC (permalink / raw)
To: David Laight
Cc: Linus Torvalds, x86, linux-kernel, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Uros Bizjak, Andrew Cooper,
Ingo Molnar
On Tue, Mar 18, 2025 at 10:06:05PM +0000, David Laight wrote:
> > > So with:
> > > #define __asm_call(qual, alt, out, in, clobber) \
> > > asm("zzz", ARG out, ARG in, ARG clobber)
> > >
> > > __asm_call(qual, ALT(), \
> > > ([var] "+m" (__my_cpu_var(_var)), "+a" (old__.low), \
> > > "+d" (old__.high)), \
> > > ("b" (new__.low), "c" (new__.high), "S" (&(_var))), \
> > > ("memory"));
> > >
> > > would get expanded the same as the line below.
> >
> > Interesting idea, though I still prefer the self-documenting ASM_OUTPUT
> > / ASM_INPUT / ASM_CLOBBER macros which are self-documenting and make it
> > easier to read and visually distinguish the constraint lists.
>
> Except that non of this really makes it easier to get out/in in the correct
> order or to use the right constraints.
At least it's still no worse than asm() itself in that respect.
> So are you just adding 'syntactic sugar' for no real gain?
Some wrappers need to modify their constraint lists, so the sugar does
have a functional purpose. The new alternative_io() (or whatever it
will be called) interface will especially be needed for the followup to
this patch set which introduces asm_call() to try to fix an
ASM_CALL_CONSTRAINT mess.
> Looking back at one of the changes:
> -#define mb() asm volatile(ALTERNATIVE("lock addl $0,-4(%%esp)", "mfence", \
> - X86_FEATURE_XMM2) ::: "memory", "cc")
> +#define mb() alternative_io("lock addl $0,-4(%%esp)", \
> + "mfence", X86_FEATURE_XMM2, \
> + ARG(), \
> + ARG(), \
> + ARG("memory", "cc"))
>
> is it really an improvement?
The motivation here is to use the alternative*() wrappers whenever
possible. It helps achieve consistent behaviors and also removes the
ugly nested ALTERNATIVE() macro.
In fact, the change in your example actually improves code generation:
it changes the asm() to asm_inline() which prevents GCC from doing crazy
things due to the exploded size of the asm string.
--
Josh
* Re: [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions
2025-03-14 23:49 ` Linus Torvalds
2025-03-14 23:54 ` Linus Torvalds
2025-03-15 0:05 ` Josh Poimboeuf
@ 2025-03-15 8:52 ` Ingo Molnar
2 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-15 8:52 UTC (permalink / raw)
To: Linus Torvalds
Cc: Josh Poimboeuf, x86, linux-kernel, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Uros Bizjak, Andrew Cooper
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Fri, 14 Mar 2025 at 11:42, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> >
> > +#define mb() alternative_io("lock addl $0,-4(%%esp)", \
> > + "mfence", X86_FEATURE_XMM2, \
> > + ARG(), \
> > + ARG(), \
> > + ARG("memory", "cc"))
>
> So all of these patches look like good cleanups to me, but I do wonder
> if we should
>
> (a) not use some naming *quite* as generic as 'ARG()'
>
> (b) make the asms use ARG_OUT/ARG_IN/ARG_CLOBBER() to clarify
>
> because that ARG(), ARG(), ARGC() pattern looks odd to me.
>
> Maybe it's just me.
Not just you, and I think the ARG_ prefix still looks a bit too
generic-C to me; it should be something more specific and unambiguously
asm()-related, like:
ASM_ARGS_IN(),
ASM_ARGS_OUT(),
ASM_ARGS_CLOBBER(),
or maybe even:
ASM_CONSTRAINT_IN(),
ASM_CONSTRAINT_OUT(),
ASM_CONSTRAINT_CLOBBER(),
Because these asm()-ish syntactic constructs look better in separate
lines anyway, and it's not like we are at risk of running out of
letters in the kernel anytime soon.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCH 15/20] x86/cpu/amd: Use named asm operands in asm_clear_divider()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (13 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 14/20] x86/barrier: Use alternative_io() in 32-bit barrier functions Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-15 9:01 ` Uros Bizjak
2025-03-14 21:41 ` [PATCH 16/20] x86/cpu: Use alternative_io() in amd_clear_divider() Josh Poimboeuf
` (5 subsequent siblings)
20 siblings, 1 reply; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use named inline asm operands in preparation for using alternative_io().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/processor.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a1baf2fc5f9b..b458ff0e4c79 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -709,8 +709,8 @@ static inline u32 per_cpu_l2c_id(unsigned int cpu)
*/
static __always_inline void amd_clear_divider(void)
{
- asm volatile(ALTERNATIVE("", "div %2\n\t", X86_BUG_DIV0)
- :: "a" (0), "d" (0), "r" (1));
+ asm volatile(ALTERNATIVE("", "div %[one]\n\t", X86_BUG_DIV0)
+ :: "a" (0), "d" (0), [one] "r" (1));
}
extern void amd_check_microcode(void);
--
2.48.1
* Re: [PATCH 15/20] x86/cpu/amd: Use named asm operands in asm_clear_divider()
2025-03-14 21:41 ` [PATCH 15/20] x86/cpu/amd: Use named asm operands in asm_clear_divider() Josh Poimboeuf
@ 2025-03-15 9:01 ` Uros Bizjak
2025-03-15 10:00 ` Uros Bizjak
0 siblings, 1 reply; 47+ messages in thread
From: Uros Bizjak @ 2025-03-15 9:01 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 10:42 PM Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> Use named inline asm operands in preparation for using alternative_io().
>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
> arch/x86/include/asm/processor.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index a1baf2fc5f9b..b458ff0e4c79 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -709,8 +709,8 @@ static inline u32 per_cpu_l2c_id(unsigned int cpu)
> */
> static __always_inline void amd_clear_divider(void)
> {
> - asm volatile(ALTERNATIVE("", "div %2\n\t", X86_BUG_DIV0)
> - :: "a" (0), "d" (0), "r" (1));
> + asm volatile(ALTERNATIVE("", "div %[one]\n\t", X86_BUG_DIV0)
> + :: "a" (0), "d" (0), [one] "r" (1));
Please remove trailing "\n\t" here and elsewhere.
Thanks,
Uros.
^ permalink raw reply [flat|nested] 47+ messages in thread* Re: [PATCH 15/20] x86/cpu/amd: Use named asm operands in asm_clear_divider()
2025-03-15 9:01 ` Uros Bizjak
@ 2025-03-15 10:00 ` Uros Bizjak
0 siblings, 0 replies; 47+ messages in thread
From: Uros Bizjak @ 2025-03-15 10:00 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Andrew Cooper, Ingo Molnar
On Sat, Mar 15, 2025 at 10:01 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Fri, Mar 14, 2025 at 10:42 PM Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> >
> > Use named inline asm operands in preparation for using alternative_io().
> >
> > Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> > ---
> > arch/x86/include/asm/processor.h | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> > index a1baf2fc5f9b..b458ff0e4c79 100644
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -709,8 +709,8 @@ static inline u32 per_cpu_l2c_id(unsigned int cpu)
> > */
> > static __always_inline void amd_clear_divider(void)
> > {
> > - asm volatile(ALTERNATIVE("", "div %2\n\t", X86_BUG_DIV0)
> > - :: "a" (0), "d" (0), "r" (1));
> > + asm volatile(ALTERNATIVE("", "div %[one]\n\t", X86_BUG_DIV0)
> > + :: "a" (0), "d" (0), [one] "r" (1));
>
> Please remove trailing "\n\t" here and elsewhere.
Also, please change $subject to mention "amd_clear_divider", not
"asm_clear_divider".
Thanks,
Uros.
* [PATCH 16/20] x86/cpu: Use alternative_io() in amd_clear_divider()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (14 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 15/20] x86/cpu/amd: Use named asm operands in asm_clear_divider() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-15 9:03 ` Uros Bizjak
2025-03-14 21:41 ` [PATCH 17/20] x86/smap: Use named asm operands in smap_{save,restore}() Josh Poimboeuf
` (4 subsequent siblings)
20 siblings, 1 reply; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use the standard alternative_io() interface.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/processor.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b458ff0e4c79..d7f3f03594a2 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -709,8 +709,10 @@ static inline u32 per_cpu_l2c_id(unsigned int cpu)
*/
static __always_inline void amd_clear_divider(void)
{
- asm volatile(ALTERNATIVE("", "div %[one]\n\t", X86_BUG_DIV0)
- :: "a" (0), "d" (0), [one] "r" (1));
+ alternative_io("",
+ "div %[one]\n\t", X86_BUG_DIV0,
+ ARG(),
+ ARG("a" (0), "d" (0), [one] "r" (1)));
}
extern void amd_check_microcode(void);
--
2.48.1
* Re: [PATCH 16/20] x86/cpu: Use alternative_io() in amd_clear_divider()
2025-03-14 21:41 ` [PATCH 16/20] x86/cpu: Use alternative_io() in amd_clear_divider() Josh Poimboeuf
@ 2025-03-15 9:03 ` Uros Bizjak
0 siblings, 0 replies; 47+ messages in thread
From: Uros Bizjak @ 2025-03-15 9:03 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 10:42 PM Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> Use the standard alternative_io() interface.
>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
> arch/x86/include/asm/processor.h | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index b458ff0e4c79..d7f3f03594a2 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -709,8 +709,10 @@ static inline u32 per_cpu_l2c_id(unsigned int cpu)
> */
> static __always_inline void amd_clear_divider(void)
> {
> - asm volatile(ALTERNATIVE("", "div %[one]\n\t", X86_BUG_DIV0)
> - :: "a" (0), "d" (0), [one] "r" (1));
> + alternative_io("",
> + "div %[one]\n\t", X86_BUG_DIV0,
A nit: can the above be on one line? Also, no need for the trailing \n\t.
Thanks,
Uros.
* [PATCH 17/20] x86/smap: Use named asm operands in smap_{save,restore}()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (15 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 16/20] x86/cpu: Use alternative_io() in amd_clear_divider() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-14 22:51 ` Andrew Cooper
2025-03-14 21:41 ` [PATCH 18/20] x86/smap: Use alternative_io() " Josh Poimboeuf
` (3 subsequent siblings)
20 siblings, 1 reply; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use named operands in preparation for using alternative_io().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/smap.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 2de1e5a75c57..60ea21b4c8b7 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -40,9 +40,9 @@ static __always_inline unsigned long smap_save(void)
unsigned long flags;
asm volatile ("# smap_save\n\t"
- ALTERNATIVE("", "pushf; pop %0; " "clac" "\n\t",
- X86_FEATURE_SMAP)
- : "=rm" (flags) : : "memory", "cc");
+ ALTERNATIVE("",
+ "pushf; pop %[flags]; clac\n\t", X86_FEATURE_SMAP)
+ : [flags] "=rm" (flags) : : "memory", "cc");
return flags;
}
@@ -50,9 +50,9 @@ static __always_inline unsigned long smap_save(void)
static __always_inline void smap_restore(unsigned long flags)
{
asm volatile ("# smap_restore\n\t"
- ALTERNATIVE("", "push %0; popf\n\t",
- X86_FEATURE_SMAP)
- : : "g" (flags) : "memory", "cc");
+ ALTERNATIVE("",
+ "push %[flags]; popf\n\t", X86_FEATURE_SMAP)
+ : : [flags] "g" (flags) : "memory", "cc");
}
/* These macros can be used in asm() statements */
--
2.48.1
* Re: [PATCH 17/20] x86/smap: Use named asm operands in smap_{save,restore}()
2025-03-14 21:41 ` [PATCH 17/20] x86/smap: Use named asm operands in smap_{save,restore}() Josh Poimboeuf
@ 2025-03-14 22:51 ` Andrew Cooper
2025-03-14 23:56 ` Andrew Cooper
0 siblings, 1 reply; 47+ messages in thread
From: Andrew Cooper @ 2025-03-14 22:51 UTC (permalink / raw)
To: Josh Poimboeuf, x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Ingo Molnar
On 14/03/2025 9:41 pm, Josh Poimboeuf wrote:
> @@ -50,9 +50,9 @@ static __always_inline unsigned long smap_save(void)
> static __always_inline void smap_restore(unsigned long flags)
> {
> asm volatile ("# smap_restore\n\t"
> - ALTERNATIVE("", "push %0; popf\n\t",
> - X86_FEATURE_SMAP)
> - : : "g" (flags) : "memory", "cc");
> + ALTERNATIVE("",
> + "push %[flags]; popf\n\t", X86_FEATURE_SMAP)
> + : : [flags] "g" (flags) : "memory", "cc");
> }
>
> /* These macros can be used in asm() statements */
The problem with ASM_CALL_CONSTRAINT and asm_call() is that it's not
just call instructions. It's any transient stack adjustment.
Peeking forwards the other half of the series, you convert IRET to
asm_call(), but leave these alone.
These need converting, and hopefully someone can think of a better name
than "call" to be used for the wrapper.
~Andrew
* Re: [PATCH 17/20] x86/smap: Use named asm operands in smap_{save,restore}()
2025-03-14 22:51 ` Andrew Cooper
@ 2025-03-14 23:56 ` Andrew Cooper
0 siblings, 0 replies; 47+ messages in thread
From: Andrew Cooper @ 2025-03-14 23:56 UTC (permalink / raw)
To: Josh Poimboeuf, x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Ingo Molnar
On 14/03/2025 10:51 pm, Andrew Cooper wrote:
> On 14/03/2025 9:41 pm, Josh Poimboeuf wrote:
>> @@ -50,9 +50,9 @@ static __always_inline unsigned long smap_save(void)
>> static __always_inline void smap_restore(unsigned long flags)
>> {
>> asm volatile ("# smap_restore\n\t"
>> - ALTERNATIVE("", "push %0; popf\n\t",
>> - X86_FEATURE_SMAP)
>> - : : "g" (flags) : "memory", "cc");
>> + ALTERNATIVE("",
>> + "push %[flags]; popf\n\t", X86_FEATURE_SMAP)
>> + : : [flags] "g" (flags) : "memory", "cc");
>> }
>>
>> /* These macros can be used in asm() statements */
> The problem with ASM_CALL_CONSTRAINT and asm_call() is that it's not
> just call instructions. It's any transient stack adjustment.
>
> Peeking forwards the other half of the series, you convert IRET to
> asm_call(), but leave these alone.
>
> These need converting, and hopefully someone can think of a better name
> than "call" to be used for the wrapper.
After chatting with Josh, the CALL aspect really is for unwinding
reasons. This will be fine until the need for a "redzone" arrives.
That said, through both series, there are definitely places which have
ASM_CALL_CONSTRAINT and oughtn't to.
~Andrew
* [PATCH 18/20] x86/smap: Use alternative_io() in smap_{save,restore}()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (16 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 17/20] x86/smap: Use named asm operands in smap_{save,restore}() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-15 9:09 ` Uros Bizjak
2025-03-14 21:41 ` [PATCH 19/20] x86/uaccess: Use alternative_io() in __untagged_addr() Josh Poimboeuf
` (2 subsequent siblings)
20 siblings, 1 reply; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use the standard alternative_io() interface.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/smap.h | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 60ea21b4c8b7..8b77ddcb37e7 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -39,20 +39,22 @@ static __always_inline unsigned long smap_save(void)
{
unsigned long flags;
- asm volatile ("# smap_save\n\t"
- ALTERNATIVE("",
- "pushf; pop %[flags]; clac\n\t", X86_FEATURE_SMAP)
- : [flags] "=rm" (flags) : : "memory", "cc");
+ alternative_io("",
+ "pushf; pop %[flags]; clac\n\t", X86_FEATURE_SMAP,
+ ARG([flags] "=rm" (flags)),
+ ARG(),
+ ARG("memory", "cc"));
return flags;
}
static __always_inline void smap_restore(unsigned long flags)
{
- asm volatile ("# smap_restore\n\t"
- ALTERNATIVE("",
- "push %[flags]; popf\n\t", X86_FEATURE_SMAP)
- : : [flags] "g" (flags) : "memory", "cc");
+ alternative_io("",
+ "push %[flags]; popf\n\t", X86_FEATURE_SMAP,
+ ARG(),
+ ARG([flags] "g" (flags)),
+ ARG("memory", "cc"));
}
/* These macros can be used in asm() statements */
--
2.48.1
* Re: [PATCH 18/20] x86/smap: Use alternative_io() in smap_{save,restore}()
2025-03-14 21:41 ` [PATCH 18/20] x86/smap: Use alternative_io() " Josh Poimboeuf
@ 2025-03-15 9:09 ` Uros Bizjak
0 siblings, 0 replies; 47+ messages in thread
From: Uros Bizjak @ 2025-03-15 9:09 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 10:42 PM Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> Use the standard alternative_io() interface.
>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
> arch/x86/include/asm/smap.h | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
> index 60ea21b4c8b7..8b77ddcb37e7 100644
> --- a/arch/x86/include/asm/smap.h
> +++ b/arch/x86/include/asm/smap.h
> @@ -39,20 +39,22 @@ static __always_inline unsigned long smap_save(void)
> {
> unsigned long flags;
>
> - asm volatile ("# smap_save\n\t"
> - ALTERNATIVE("",
> - "pushf; pop %[flags]; clac\n\t", X86_FEATURE_SMAP)
> - : [flags] "=rm" (flags) : : "memory", "cc");
> + alternative_io("",
> + "pushf; pop %[flags]; clac\n\t", X86_FEATURE_SMAP,
Please merge the two lines above if possible; otherwise put the
feature condition on the next line. Please also remove the trailing \n\t.
> + ARG([flags] "=rm" (flags)),
> + ARG(),
> + ARG("memory", "cc"));
>
> return flags;
> }
>
> static __always_inline void smap_restore(unsigned long flags)
> {
> - asm volatile ("# smap_restore\n\t"
> - ALTERNATIVE("",
> - "push %[flags]; popf\n\t", X86_FEATURE_SMAP)
> - : : [flags] "g" (flags) : "memory", "cc");
> + alternative_io("",
> + "push %[flags]; popf\n\t", X86_FEATURE_SMAP,
As above.
> + ARG(),
> + ARG([flags] "g" (flags)),
Uh, a build failure waiting to happen here. Please use "rme" instead
of "g". PUSH doesn't support 64-bit immediates ("i") on x86_64 ("g"
expands to "rmi").
Thanks,
Uros.
* [PATCH 19/20] x86/uaccess: Use alternative_io() in __untagged_addr()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (17 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 18/20] x86/smap: Use alternative_io() " Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-15 9:12 ` Uros Bizjak
2025-03-14 21:41 ` [PATCH 20/20] x86/msr: Use alternative_2_io() in rdtsc_ordered() Josh Poimboeuf
2025-03-14 22:25 ` [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
20 siblings, 1 reply; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use the standard alternative_io() interface.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/uaccess_64.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index c52f0133425b..b507d5fb5443 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -26,10 +26,10 @@ extern unsigned long USER_PTR_MAX;
*/
static inline unsigned long __untagged_addr(unsigned long addr)
{
- asm (ALTERNATIVE("",
- "and " __percpu_arg([mask]) ", %[addr]", X86_FEATURE_LAM)
- : [addr] "+r" (addr)
- : [mask] "m" (__my_cpu_var(tlbstate_untag_mask)));
+ alternative_io("",
+ "and " __percpu_arg([mask]) ", %[addr]", X86_FEATURE_LAM,
+ ARG([addr] "+r" (addr)),
+ ARG([mask] "m" (__my_cpu_var(tlbstate_untag_mask))));
return addr;
}
--
2.48.1
* Re: [PATCH 19/20] x86/uaccess: Use alternative_io() in __untagged_addr()
2025-03-14 21:41 ` [PATCH 19/20] x86/uaccess: Use alternative_io() in __untagged_addr() Josh Poimboeuf
@ 2025-03-15 9:12 ` Uros Bizjak
0 siblings, 0 replies; 47+ messages in thread
From: Uros Bizjak @ 2025-03-15 9:12 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 10:42 PM Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> Use the standard alternative_io() interface.
>
> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
> ---
> arch/x86/include/asm/uaccess_64.h | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index c52f0133425b..b507d5fb5443 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -26,10 +26,10 @@ extern unsigned long USER_PTR_MAX;
> */
> static inline unsigned long __untagged_addr(unsigned long addr)
> {
> - asm (ALTERNATIVE("",
> - "and " __percpu_arg([mask]) ", %[addr]", X86_FEATURE_LAM)
> - : [addr] "+r" (addr)
> - : [mask] "m" (__my_cpu_var(tlbstate_untag_mask)));
> + alternative_io("",
No, alternative_io() declares asm as volatile. Please define
alternative_io_nv (or something like that) that will not use
volatile. You will penalize the above asm unnecessarily with volatile.
Thanks,
Uros.
^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCH 20/20] x86/msr: Use alternative_2_io() in rdtsc_ordered()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (18 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 19/20] x86/uaccess: Use alternative_io() in __untagged_addr() Josh Poimboeuf
@ 2025-03-14 21:41 ` Josh Poimboeuf
2025-03-14 22:25 ` [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
20 siblings, 0 replies; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 21:41 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
Use the standard alternative_2_io() interface.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
arch/x86/include/asm/msr.h | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 001853541f1e..996e3b5857c6 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -214,12 +214,13 @@ static __always_inline unsigned long long rdtsc_ordered(void)
* Thus, use the preferred barrier on the respective CPU, aiming for
* RDTSCP as the default.
*/
- asm volatile(ALTERNATIVE_2("rdtsc",
- "lfence; rdtsc", X86_FEATURE_LFENCE_RDTSC,
- "rdtscp", X86_FEATURE_RDTSCP)
- : EAX_EDX_RET(val, low, high)
- /* RDTSCP clobbers ECX with MSR_TSC_AUX. */
- :: "ecx");
+ alternative_2_io("rdtsc",
+ "lfence; rdtsc", X86_FEATURE_LFENCE_RDTSC,
+ "rdtscp", X86_FEATURE_RDTSCP,
+ ARG(EAX_EDX_RET(val, low, high)),
+ ARG(),
+ /* RDTSCP clobbers ECX with MSR_TSC_AUX. */
+ ARG("ecx"));
return EAX_EDX_VAL(val, low, high);
}
--
2.48.1
^ permalink raw reply related [flat|nested] 47+ messages in thread

* Re: [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call()
2025-03-14 21:41 [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
` (19 preceding siblings ...)
2025-03-14 21:41 ` [PATCH 20/20] x86/msr: Use alternative_2_io() in rdtsc_ordered() Josh Poimboeuf
@ 2025-03-14 22:25 ` Josh Poimboeuf
2025-03-15 9:52 ` Uros Bizjak
20 siblings, 1 reply; 47+ messages in thread
From: Josh Poimboeuf @ 2025-03-14 22:25 UTC (permalink / raw)
To: x86
Cc: linux-kernel, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
H. Peter Anvin, Uros Bizjak, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 02:41:13PM -0700, Josh Poimboeuf wrote:
> Make the alternative_io() interface more straightforward and flexible,
> and get rid of alternative_input().
>
> These patches are a prereq for another set[1] which will get rid of
> ASM_CALL_CONSTRAINT[2] in favor of a much more flexible asm_call()
> interface similar to the new alternative_io().
>
> [1] Additional 20+ patches not posted yet to avoid flooding inboxes
The rest of the patches are here if anybody wants to see where this is
going:
git://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git asm-call
--
Josh
^ permalink raw reply [flat|nested] 47+ messages in thread* Re: [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call()
2025-03-14 22:25 ` [PATCH 00/20] x86: Cleanup alternative_io() and friends, prep for asm_call() Josh Poimboeuf
@ 2025-03-15 9:52 ` Uros Bizjak
0 siblings, 0 replies; 47+ messages in thread
From: Uros Bizjak @ 2025-03-15 9:52 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: x86, linux-kernel, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, H. Peter Anvin, Andrew Cooper, Ingo Molnar
On Fri, Mar 14, 2025 at 11:25 PM Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>
> On Fri, Mar 14, 2025 at 02:41:13PM -0700, Josh Poimboeuf wrote:
> > Make the alternative_io() interface more straightforward and flexible,
> > and get rid of alternative_input().
> >
> > These patches are a prereq for another set[1] which will get rid of
> > ASM_CALL_CONSTRAINT[2] in favor of a much more flexible asm_call()
> > interface similar to the new alternative_io().
> >
> > [1] Additional 20+ patches not posted yet to avoid flooding inboxes
>
> The rest of the patches are here if anybody wants to see where this is
> going:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git asm-call
FYI, you missed one conversion of asm involving ALTERNATIVE in
arch/x86/include/asm/nospec-branch.h:
void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
{
asm volatile(ALTERNATIVE("", "wrmsr", %c[feature])
: : "c" (msr),
"a" ((u32)val),
"d" ((u32)(val >> 32)),
[feature] "i" (feature)
: "memory");
Uros.
^ permalink raw reply [flat|nested] 47+ messages in thread