All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] x86/altcall: Switch to a simpler scheme
@ 2025-04-23  1:02 Andrew Cooper
  2025-04-23  1:02 ` [PATCH 1/8] x86/altcall: Split alternative-call.h out of alternative.h Andrew Cooper
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  1:02 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini,
	consulting @ bugseng . com, Nicola Vetrini, Ross Lagerwall

See patch 7 for details.  This simplifies _apply_alternatives() specifically
by removing a special and quite complicated case, and encodes the metadata
about altcalls differently and more efficiently.

https://gitlab.com/xen-project/hardware/xen-staging/-/pipelines/1780872355

Andrew Cooper (8):
  x86/altcall: Split alternative-call.h out of alternative.h
  x86/altcall: Rename alternative_branches() to boot_apply_alt_calls()
  x86/alternatives: Rework information passing into
    nmi_apply_alternatives()
  x86/alternatives: Factor seal_endbr64() out of _apply_alternatives()
  x86/altcall: Introduce new simpler scheme
  xen/livepatch: Support new altcall scheme
  x86/altcall: Switch to simpler scheme
  x86/alternatives: Simplify _apply_alternatives() now altcall is
    separate

 .../eclair_analysis/ECLAIR/deviations.ecl     |   4 +-
 xen/arch/x86/alternative.c                    | 267 ++++++++++--------
 .../asm/{alternative.h => alternative-call.h} | 179 +-----------
 xen/arch/x86/include/asm/alternative.h        | 263 -----------------
 xen/arch/x86/include/asm/hvm/hvm.h            |   2 +-
 xen/arch/x86/setup.c                          |   3 +-
 xen/arch/x86/xen.lds.S                        |   4 +
 xen/common/core_parking.c                     |   4 +-
 xen/common/livepatch.c                        |  58 ++++
 xen/include/xen/alternative-call.h            |  24 +-
 10 files changed, 252 insertions(+), 556 deletions(-)
 copy xen/arch/x86/include/asm/{alternative.h => alternative-call.h} (63%)

-- 
2.39.5



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/8] x86/altcall: Split alternative-call.h out of alternative.h
  2025-04-23  1:02 [PATCH 0/8] x86/altcall: Switch to a simpler scheme Andrew Cooper
@ 2025-04-23  1:02 ` Andrew Cooper
  2025-04-23  7:55   ` Jan Beulich
  2025-04-23  1:02 ` [PATCH 2/8] x86/altcall: Rename alternative_branches() to boot_apply_alt_calls() Andrew Cooper
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  1:02 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini,
	consulting @ bugseng . com, Nicola Vetrini

... in preparation for changing how they're implemented.

Update the MISRA deviations with the new path.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Anthony PERARD <anthony.perard@vates.tech>
CC: Michal Orzel <michal.orzel@amd.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: consulting@bugseng.com <consulting@bugseng.com>
CC: Nicola Vetrini <nicola.vetrini@bugseng.com>

In terms of naming, while tailcalls can technically be jumps, they're still
usually reasoned about as being calls.

It appears that everywhere else which needs alternative_{v,}call() gets it
transitively through hvm.h
---
 .../eclair_analysis/ECLAIR/deviations.ecl     |   4 +-
 .../asm/{alternative.h => alternative-call.h} | 171 +-----------
 xen/arch/x86/include/asm/alternative.h        | 262 ------------------
 xen/arch/x86/include/asm/hvm/hvm.h            |   2 +-
 xen/common/core_parking.c                     |   4 +-
 xen/include/xen/alternative-call.h            |  10 +-
 6 files changed, 16 insertions(+), 437 deletions(-)
 copy xen/arch/x86/include/asm/{alternative.h => alternative-call.h} (64%)

diff --git a/automation/eclair_analysis/ECLAIR/deviations.ecl b/automation/eclair_analysis/ECLAIR/deviations.ecl
index 2c8fb9271391..9c67358d4663 100644
--- a/automation/eclair_analysis/ECLAIR/deviations.ecl
+++ b/automation/eclair_analysis/ECLAIR/deviations.ecl
@@ -414,8 +414,8 @@ of the short-circuit evaluation strategy of such logical operators."
 -doc_end
 
 -doc_begin="Macros alternative_v?call[0-9] use sizeof and typeof to check that the argument types match the corresponding parameter ones."
--config=MC3A2.R13.6,reports+={deliberate,"any_area(any_loc(any_exp(macro(^alternative_vcall[0-9]$))&&file(^xen/arch/x86/include/asm/alternative\\.h*$)))"}
--config=B.UNEVALEFF,reports+={deliberate,"any_area(any_loc(any_exp(macro(^alternative_v?call[0-9]$))&&file(^xen/arch/x86/include/asm/alterantive\\.h*$)))"}
+-config=MC3A2.R13.6,reports+={deliberate,"any_area(any_loc(any_exp(macro(^alternative_vcall[0-9]$))&&file(^xen/arch/x86/include/asm/alternative-call\\.h*$)))"}
+-config=B.UNEVALEFF,reports+={deliberate,"any_area(any_loc(any_exp(macro(^alternative_v?call[0-9]$))&&file(^xen/arch/x86/include/asm/alterantive-call\\.h*$)))"}
 -doc_end
 
 -doc_begin="Anything, no matter how complicated, inside the BUILD_BUG_ON macro is subject to a compile-time evaluation without relevant side effects."
diff --git a/xen/arch/x86/include/asm/alternative.h b/xen/arch/x86/include/asm/alternative-call.h
similarity index 64%
copy from xen/arch/x86/include/asm/alternative.h
copy to xen/arch/x86/include/asm/alternative-call.h
index 7326ad942836..828ea32a9625 100644
--- a/xen/arch/x86/include/asm/alternative.h
+++ b/xen/arch/x86/include/asm/alternative-call.h
@@ -1,165 +1,8 @@
-#ifndef __X86_ALTERNATIVE_H__
-#define __X86_ALTERNATIVE_H__
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef X86_ALTERNATIVE_CALL_H
+#define X86_ALTERNATIVE_CALL_H
 
-#ifdef __ASSEMBLY__
-#include <asm/alternative-asm.h>
-#else
-
-#include <xen/macros.h>
-#include <xen/types.h>
-
-#include <asm/asm-macros.h>
-#include <asm/cpufeatureset.h>
-
-struct __packed alt_instr {
-    int32_t  orig_offset;   /* original instruction */
-    int32_t  repl_offset;   /* offset to replacement instruction */
-    uint16_t cpuid;         /* cpuid bit set for replacement */
-    uint8_t  orig_len;      /* length of original instruction */
-    uint8_t  repl_len;      /* length of new instruction */
-    uint8_t  pad_len;       /* length of build-time padding */
-    uint8_t  priv;          /* Private, for use by apply_alternatives() */
-};
-
-#define __ALT_PTR(a,f)      ((uint8_t *)((void *)&(a)->f + (a)->f))
-#define ALT_ORIG_PTR(a)     __ALT_PTR(a, orig_offset)
-#define ALT_REPL_PTR(a)     __ALT_PTR(a, repl_offset)
-
-extern void add_nops(void *insns, unsigned int len);
-/* Similar to alternative_instructions except it can be run with IRQs enabled. */
-extern int apply_alternatives(struct alt_instr *start, struct alt_instr *end);
-extern void alternative_instructions(void);
-extern void alternative_branches(void);
-
-#define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
-#define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
-#define alt_total_len      "(.LXEN%=_orig_p - .LXEN%=_orig_s)"
-#define alt_repl_s(num)    ".LXEN%=_repl_s"#num
-#define alt_repl_e(num)    ".LXEN%=_repl_e"#num
-#define alt_repl_len(num)  "(" alt_repl_e(num) " - " alt_repl_s(num) ")"
-
-/*
- * GAS's idea of true is sometimes 1 and sometimes -1, while Clang's idea
- * was consistently 1 up to 6.x (it matches GAS's now).  Transform it to
- * uniformly 1.
- */
-#define AS_TRUE(x) "((" x ") & 1)"
-
-#define as_max(a, b) "(("a") ^ ((("a") ^ ("b")) & -("AS_TRUE("("a") < ("b")")")))"
-
-#define OLDINSTR(oldinstr, padding)                              \
-    ".LXEN%=_orig_s:\n\t" oldinstr "\n .LXEN%=_orig_e:\n\t"      \
-    ".LXEN%=_diff = " padding "\n\t"                             \
-    "mknops ("AS_TRUE(".LXEN%=_diff > 0")" * .LXEN%=_diff)\n\t"  \
-    ".LXEN%=_orig_p:\n\t"
-
-#define OLDINSTR_1(oldinstr, n1)                                 \
-    OLDINSTR(oldinstr, alt_repl_len(n1) "-" alt_orig_len)
-
-#define OLDINSTR_2(oldinstr, n1, n2)                             \
-    OLDINSTR(oldinstr,                                           \
-             as_max(alt_repl_len(n1),                            \
-                    alt_repl_len(n2)) "-" alt_orig_len)
-
-#define ALTINSTR_ENTRY(feature, num)                                    \
-        " .if " STR(feature) " >= " STR(NCAPINTS * 32) "\n"             \
-        " .error \"alternative feature outside of featureset range\"\n" \
-        " .endif\n"                                                     \
-        " .long .LXEN%=_orig_s - .\n"             /* label           */ \
-        " .long " alt_repl_s(num)" - .\n"         /* new instruction */ \
-        " .word " STR(feature) "\n"               /* feature bit     */ \
-        " .byte " alt_orig_len "\n"               /* source len      */ \
-        " .byte " alt_repl_len(num) "\n"          /* replacement len */ \
-        " .byte " alt_pad_len "\n"                /* padding len     */ \
-        " .byte 0\n"                              /* priv            */
-
-#define DISCARD_ENTRY(num)                        /* repl <= total */   \
-        " .byte 0xff + (" alt_repl_len(num) ") - (" alt_total_len ")\n"
-
-#define ALTINSTR_REPLACEMENT(newinstr, num)       /* replacement */     \
-        alt_repl_s(num)":\n\t" newinstr "\n" alt_repl_e(num) ":\n\t"
-
-/* alternative assembly primitive: */
-#define ALTERNATIVE(oldinstr, newinstr, feature)                        \
-        OLDINSTR_1(oldinstr, 1)                                         \
-        ".pushsection .altinstructions, \"a\", @progbits\n"             \
-        ALTINSTR_ENTRY(feature, 1)                                      \
-        ".section .discard, \"a\", @progbits\n"                         \
-        ".byte " alt_total_len "\n" /* total_len <= 255 */              \
-        DISCARD_ENTRY(1)                                                \
-        ".section .altinstr_replacement, \"ax\", @progbits\n"           \
-        ALTINSTR_REPLACEMENT(newinstr, 1)                               \
-        ".popsection\n"
-
-#define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
-        OLDINSTR_2(oldinstr, 1, 2)                                      \
-        ".pushsection .altinstructions, \"a\", @progbits\n"             \
-        ALTINSTR_ENTRY(feature1, 1)                                     \
-        ALTINSTR_ENTRY(feature2, 2)                                     \
-        ".section .discard, \"a\", @progbits\n"                         \
-        ".byte " alt_total_len "\n" /* total_len <= 255 */              \
-        DISCARD_ENTRY(1)                                                \
-        DISCARD_ENTRY(2)                                                \
-        ".section .altinstr_replacement, \"ax\", @progbits\n"           \
-        ALTINSTR_REPLACEMENT(newinstr1, 1)                              \
-        ALTINSTR_REPLACEMENT(newinstr2, 2)                              \
-        ".popsection\n"
-
-/*
- * Alternative instructions for different CPU types or capabilities.
- *
- * This allows to use optimized instructions even on generic binary
- * kernels.
- *
- * length of oldinstr must be longer or equal the length of newinstr
- * It can be padded with nops as needed.
- *
- * For non barrier like inlines please define new variants
- * without volatile and memory clobber.
- */
-#define alternative(oldinstr, newinstr, feature)                        \
-        asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
-
-#define alternative_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
-	asm volatile (ALTERNATIVE_2(oldinstr, newinstr1, feature1,	\
-				    newinstr2, feature2)		\
-		      : : : "memory")
-
-/*
- * Alternative inline assembly with input.
- *
- * Pecularities:
- * No memory clobber here.
- * Argument numbers start with 1.
- * Best is to use constraints that are fixed size (like (%1) ... "r")
- * If you use variable sized constraints like "m" or "g" in the
- * replacement make sure to pad to the worst case length.
- */
-#define alternative_input(oldinstr, newinstr, feature, input...)	\
-	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
-		      : : input)
-
-/* Like alternative_input, but with a single output argument */
-#define alternative_io(oldinstr, newinstr, feature, output, input...)	\
-	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
-		      : output : input)
-
-/*
- * This is similar to alternative_io. But it has two features and
- * respective instructions.
- *
- * If CPU has feature2, newinstr2 is used.
- * Otherwise, if CPU has feature1, newinstr1 is used.
- * Otherwise, oldinstr is used.
- */
-#define alternative_io_2(oldinstr, newinstr1, feature1, newinstr2,	\
-			 feature2, output, input...)			\
-	asm volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1,	\
-				   newinstr2, feature2)			\
-		     : output : input)
-
-/* Use this macro(s) if you need more than one output parameter. */
-#define ASM_OUTPUT2(a...) a
+#include <asm/alternative.h>
 
 /*
  * Machinery to allow converting indirect to direct calls, when the called
@@ -303,7 +146,7 @@ extern void alternative_branches(void);
 })
 
 #define alternative_call3(func, arg1, arg2, arg3) ({     \
-    typeof(arg1) v1_ = (arg1);                            \
+    typeof(arg1) v1_ = (arg1);                           \
     typeof(arg2) v2_ = (arg2);                           \
     typeof(arg3) v3_ = (arg3);                           \
     ALT_CALL_ARG(v1_, 1);                                \
@@ -423,6 +266,4 @@ extern void alternative_branches(void);
 #define alternative_call(func, args...) \
     alternative_call_(count_args(args))(func, ## args)
 
-#endif /*  !__ASSEMBLY__  */
-
-#endif /* __X86_ALTERNATIVE_H__ */
+#endif /* X86_ALTERNATIVE_CALL_H */
diff --git a/xen/arch/x86/include/asm/alternative.h b/xen/arch/x86/include/asm/alternative.h
index 7326ad942836..2d2ace97f794 100644
--- a/xen/arch/x86/include/asm/alternative.h
+++ b/xen/arch/x86/include/asm/alternative.h
@@ -161,268 +161,6 @@ extern void alternative_branches(void);
 /* Use this macro(s) if you need more than one output parameter. */
 #define ASM_OUTPUT2(a...) a
 
-/*
- * Machinery to allow converting indirect to direct calls, when the called
- * function is determined once at boot and later never changed.
- */
-
-#define ALT_CALL_arg1 "rdi"
-#define ALT_CALL_arg2 "rsi"
-#define ALT_CALL_arg3 "rdx"
-#define ALT_CALL_arg4 "rcx"
-#define ALT_CALL_arg5 "r8"
-#define ALT_CALL_arg6 "r9"
-
-#ifdef CONFIG_CC_IS_CLANG
-/*
- * Clang doesn't follow the psABI and doesn't truncate parameter values at the
- * callee.  This can lead to bad code being generated when using alternative
- * calls.
- *
- * Workaround it by using a temporary intermediate variable that's zeroed
- * before being assigned the parameter value, as that forces clang to zero the
- * register at the caller.
- *
- * This has been reported upstream:
- * https://github.com/llvm/llvm-project/issues/12579
- * https://github.com/llvm/llvm-project/issues/82598
- */
-#define ALT_CALL_ARG(arg, n)                                            \
-    register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) = ({   \
-        unsigned long tmp = 0;                                          \
-        BUILD_BUG_ON(sizeof(arg) > sizeof(unsigned long));              \
-        *(typeof(arg) *)&tmp = (arg);                                   \
-        tmp;                                                            \
-    })
-#else
-#define ALT_CALL_ARG(arg, n) \
-    register typeof(arg) a ## n ## _ asm ( ALT_CALL_arg ## n ) = \
-        ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })
-#endif
-#define ALT_CALL_NO_ARG(n) \
-    register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n )
-
-#define ALT_CALL_NO_ARG6 ALT_CALL_NO_ARG(6)
-#define ALT_CALL_NO_ARG5 ALT_CALL_NO_ARG(5); ALT_CALL_NO_ARG6
-#define ALT_CALL_NO_ARG4 ALT_CALL_NO_ARG(4); ALT_CALL_NO_ARG5
-#define ALT_CALL_NO_ARG3 ALT_CALL_NO_ARG(3); ALT_CALL_NO_ARG4
-#define ALT_CALL_NO_ARG2 ALT_CALL_NO_ARG(2); ALT_CALL_NO_ARG3
-#define ALT_CALL_NO_ARG1 ALT_CALL_NO_ARG(1); ALT_CALL_NO_ARG2
-
-/*
- * Unfortunately ALT_CALL_NO_ARG() above can't use a fake initializer (to
- * suppress "uninitialized variable" warnings), as various versions of gcc
- * older than 8.1 fall on the nose in various ways with that (always because
- * of some other construct elsewhere in the same function needing to use the
- * same hard register). Otherwise the asm() below could uniformly use "+r"
- * output constraints, making unnecessary all these ALT_CALL<n>_OUT macros.
- */
-#define ALT_CALL0_OUT "=r" (a1_), "=r" (a2_), "=r" (a3_), \
-                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
-#define ALT_CALL1_OUT "+r" (a1_), "=r" (a2_), "=r" (a3_), \
-                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
-#define ALT_CALL2_OUT "+r" (a1_), "+r" (a2_), "=r" (a3_), \
-                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
-#define ALT_CALL3_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
-                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
-#define ALT_CALL4_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
-                      "+r" (a4_), "=r" (a5_), "=r" (a6_)
-#define ALT_CALL5_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
-                      "+r" (a4_), "+r" (a5_), "=r" (a6_)
-#define ALT_CALL6_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
-                      "+r" (a4_), "+r" (a5_), "+r" (a6_)
-
-#define alternative_callN(n, rettype, func) ({                     \
-    rettype ret_;                                                  \
-    register unsigned long r10_ asm("r10");                        \
-    register unsigned long r11_ asm("r11");                        \
-    asm volatile (ALTERNATIVE("call *%c[addr](%%rip)", "call .",   \
-                              X86_FEATURE_ALWAYS)                  \
-                  : ALT_CALL ## n ## _OUT, "=a" (ret_),            \
-                    "=r" (r10_), "=r" (r11_) ASM_CALL_CONSTRAINT   \
-                  : [addr] "i" (&(func)), "g" (func)               \
-                  : "memory" );                                    \
-    ret_;                                                          \
-})
-
-#define alternative_vcall0(func) ({             \
-    ALT_CALL_NO_ARG1;                           \
-    (void)sizeof(func());                       \
-    (void)alternative_callN(0, int, func);      \
-})
-
-#define alternative_call0(func) ({              \
-    ALT_CALL_NO_ARG1;                           \
-    alternative_callN(0, typeof(func()), func); \
-})
-
-#define alternative_vcall1(func, arg) ({           \
-    typeof(arg) v1_ = (arg);                       \
-    ALT_CALL_ARG(v1_, 1);                          \
-    ALT_CALL_NO_ARG2;                              \
-    (void)sizeof(func(arg));                       \
-    (void)alternative_callN(1, int, func);         \
-})
-
-#define alternative_call1(func, arg) ({            \
-    typeof(arg) v1_ = (arg);                       \
-    ALT_CALL_ARG(v1_, 1);                          \
-    ALT_CALL_NO_ARG2;                              \
-    alternative_callN(1, typeof(func(arg)), func); \
-})
-
-#define alternative_vcall2(func, arg1, arg2) ({           \
-    typeof(arg1) v1_ = (arg1);                            \
-    typeof(arg2) v2_ = (arg2);                            \
-    ALT_CALL_ARG(v1_, 1);                                 \
-    ALT_CALL_ARG(v2_, 2);                                 \
-    ALT_CALL_NO_ARG3;                                     \
-    (void)sizeof(func(arg1, arg2));                       \
-    (void)alternative_callN(2, int, func);                \
-})
-
-#define alternative_call2(func, arg1, arg2) ({            \
-    typeof(arg1) v1_ = (arg1);                            \
-    typeof(arg2) v2_ = (arg2);                            \
-    ALT_CALL_ARG(v1_, 1);                                 \
-    ALT_CALL_ARG(v2_, 2);                                 \
-    ALT_CALL_NO_ARG3;                                     \
-    alternative_callN(2, typeof(func(arg1, arg2)), func); \
-})
-
-#define alternative_vcall3(func, arg1, arg2, arg3) ({    \
-    typeof(arg1) v1_ = (arg1);                           \
-    typeof(arg2) v2_ = (arg2);                           \
-    typeof(arg3) v3_ = (arg3);                           \
-    ALT_CALL_ARG(v1_, 1);                                \
-    ALT_CALL_ARG(v2_, 2);                                \
-    ALT_CALL_ARG(v3_, 3);                                \
-    ALT_CALL_NO_ARG4;                                    \
-    (void)sizeof(func(arg1, arg2, arg3));                \
-    (void)alternative_callN(3, int, func);               \
-})
-
-#define alternative_call3(func, arg1, arg2, arg3) ({     \
-    typeof(arg1) v1_ = (arg1);                            \
-    typeof(arg2) v2_ = (arg2);                           \
-    typeof(arg3) v3_ = (arg3);                           \
-    ALT_CALL_ARG(v1_, 1);                                \
-    ALT_CALL_ARG(v2_, 2);                                \
-    ALT_CALL_ARG(v3_, 3);                                \
-    ALT_CALL_NO_ARG4;                                    \
-    alternative_callN(3, typeof(func(arg1, arg2, arg3)), \
-                      func);                             \
-})
-
-#define alternative_vcall4(func, arg1, arg2, arg3, arg4) ({ \
-    typeof(arg1) v1_ = (arg1);                              \
-    typeof(arg2) v2_ = (arg2);                              \
-    typeof(arg3) v3_ = (arg3);                              \
-    typeof(arg4) v4_ = (arg4);                              \
-    ALT_CALL_ARG(v1_, 1);                                   \
-    ALT_CALL_ARG(v2_, 2);                                   \
-    ALT_CALL_ARG(v3_, 3);                                   \
-    ALT_CALL_ARG(v4_, 4);                                   \
-    ALT_CALL_NO_ARG5;                                       \
-    (void)sizeof(func(arg1, arg2, arg3, arg4));             \
-    (void)alternative_callN(4, int, func);                  \
-})
-
-#define alternative_call4(func, arg1, arg2, arg3, arg4) ({  \
-    typeof(arg1) v1_ = (arg1);                              \
-    typeof(arg2) v2_ = (arg2);                              \
-    typeof(arg3) v3_ = (arg3);                              \
-    typeof(arg4) v4_ = (arg4);                              \
-    ALT_CALL_ARG(v1_, 1);                                   \
-    ALT_CALL_ARG(v2_, 2);                                   \
-    ALT_CALL_ARG(v3_, 3);                                   \
-    ALT_CALL_ARG(v4_, 4);                                   \
-    ALT_CALL_NO_ARG5;                                       \
-    alternative_callN(4, typeof(func(arg1, arg2,            \
-                                     arg3, arg4)),          \
-                      func);                                \
-})
-
-#define alternative_vcall5(func, arg1, arg2, arg3, arg4, arg5) ({ \
-    typeof(arg1) v1_ = (arg1);                                    \
-    typeof(arg2) v2_ = (arg2);                                    \
-    typeof(arg3) v3_ = (arg3);                                    \
-    typeof(arg4) v4_ = (arg4);                                    \
-    typeof(arg5) v5_ = (arg5);                                    \
-    ALT_CALL_ARG(v1_, 1);                                         \
-    ALT_CALL_ARG(v2_, 2);                                         \
-    ALT_CALL_ARG(v3_, 3);                                         \
-    ALT_CALL_ARG(v4_, 4);                                         \
-    ALT_CALL_ARG(v5_, 5);                                         \
-    ALT_CALL_NO_ARG6;                                             \
-    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5));             \
-    (void)alternative_callN(5, int, func);                        \
-})
-
-#define alternative_call5(func, arg1, arg2, arg3, arg4, arg5) ({  \
-    typeof(arg1) v1_ = (arg1);                                    \
-    typeof(arg2) v2_ = (arg2);                                    \
-    typeof(arg3) v3_ = (arg3);                                    \
-    typeof(arg4) v4_ = (arg4);                                    \
-    typeof(arg5) v5_ = (arg5);                                    \
-    ALT_CALL_ARG(v1_, 1);                                         \
-    ALT_CALL_ARG(v2_, 2);                                         \
-    ALT_CALL_ARG(v3_, 3);                                         \
-    ALT_CALL_ARG(v4_, 4);                                         \
-    ALT_CALL_ARG(v5_, 5);                                         \
-    ALT_CALL_NO_ARG6;                                             \
-    alternative_callN(5, typeof(func(arg1, arg2, arg3,            \
-                                     arg4, arg5)),                \
-                      func);                                      \
-})
-
-#define alternative_vcall6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({ \
-    typeof(arg1) v1_ = (arg1);                                          \
-    typeof(arg2) v2_ = (arg2);                                          \
-    typeof(arg3) v3_ = (arg3);                                          \
-    typeof(arg4) v4_ = (arg4);                                          \
-    typeof(arg5) v5_ = (arg5);                                          \
-    typeof(arg6) v6_ = (arg6);                                          \
-    ALT_CALL_ARG(v1_, 1);                                               \
-    ALT_CALL_ARG(v2_, 2);                                               \
-    ALT_CALL_ARG(v3_, 3);                                               \
-    ALT_CALL_ARG(v4_, 4);                                               \
-    ALT_CALL_ARG(v5_, 5);                                               \
-    ALT_CALL_ARG(v6_, 6);                                               \
-    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5, arg6));             \
-    (void)alternative_callN(6, int, func);                              \
-})
-
-#define alternative_call6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({  \
-    typeof(arg1) v1_ = (arg1);                                          \
-    typeof(arg2) v2_ = (arg2);                                          \
-    typeof(arg3) v3_ = (arg3);                                          \
-    typeof(arg4) v4_ = (arg4);                                          \
-    typeof(arg5) v5_ = (arg5);                                          \
-    typeof(arg6) v6_ = (arg6);                                          \
-    ALT_CALL_ARG(v1_, 1);                                               \
-    ALT_CALL_ARG(v2_, 2);                                               \
-    ALT_CALL_ARG(v3_, 3);                                               \
-    ALT_CALL_ARG(v4_, 4);                                               \
-    ALT_CALL_ARG(v5_, 5);                                               \
-    ALT_CALL_ARG(v6_, 6);                                               \
-    alternative_callN(6, typeof(func(arg1, arg2, arg3,                  \
-                                     arg4, arg5, arg6)),                \
-                      func);                                            \
-})
-
-#define alternative_vcall__(nr) alternative_vcall ## nr
-#define alternative_call__(nr)  alternative_call ## nr
-
-#define alternative_vcall_(nr) alternative_vcall__(nr)
-#define alternative_call_(nr)  alternative_call__(nr)
-
-#define alternative_vcall(func, args...) \
-    alternative_vcall_(count_args(args))(func, ## args)
-
-#define alternative_call(func, args...) \
-    alternative_call_(count_args(args))(func, ## args)
-
 #endif /*  !__ASSEMBLY__  */
 
 #endif /* __X86_ALTERNATIVE_H__ */
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index 963e8201130a..bf8bc2e100bd 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -9,9 +9,9 @@
 #ifndef __ASM_X86_HVM_HVM_H__
 #define __ASM_X86_HVM_HVM_H__
 
+#include <xen/alternative-call.h>
 #include <xen/mm.h>
 
-#include <asm/alternative.h>
 #include <asm/asm_defns.h>
 #include <asm/current.h>
 #include <asm/x86_emulate.h>
diff --git a/xen/common/core_parking.c b/xen/common/core_parking.c
index a970ffeab8c3..7d6a18cdcf4c 100644
--- a/xen/common/core_parking.c
+++ b/xen/common/core_parking.c
@@ -15,10 +15,10 @@
  *  General Public License for more details.
  */
 
-#include <xen/types.h>
+#include <xen/alternative-call.h>
 #include <xen/cpu.h>
-#include <xen/init.h>
 #include <xen/cpumask.h>
+#include <xen/init.h>
 #include <xen/param.h>
 
 #include <asm/smp.h>
diff --git a/xen/include/xen/alternative-call.h b/xen/include/xen/alternative-call.h
index 62672b732431..39339c3f0f76 100644
--- a/xen/include/xen/alternative-call.h
+++ b/xen/include/xen/alternative-call.h
@@ -13,10 +13,10 @@
  *
  * For architectures to support:
  *
- * - Implement alternative_{,v}call() in asm/alternative.h.  Code generation
- *   requirements are to emit a function pointer call at build time, and stash
- *   enough metadata to simplify the call at boot once the implementation has
- *   been resolved.
+ * - Implement alternative_{,v}call() in asm/alternative-call.h.  Code
+ *   generation requirements are to emit a function pointer call at build
+ *   time, and stash enough metadata to simplify the call at boot once the
+ *   implementation has been resolved.
  * - Select ALTERNATIVE_CALL in Kconfig.
  *
  * To use:
@@ -48,7 +48,7 @@
 
 #ifdef CONFIG_ALTERNATIVE_CALL
 
-#include <asm/alternative.h>
+#include <asm/alternative-call.h>
 
 #ifdef CONFIG_LIVEPATCH
 /* Must keep for livepatches to resolve alternative calls. */
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/8] x86/altcall: Rename alternative_branches() to boot_apply_alt_calls()
  2025-04-23  1:02 [PATCH 0/8] x86/altcall: Switch to a simpler scheme Andrew Cooper
  2025-04-23  1:02 ` [PATCH 1/8] x86/altcall: Split alternative-call.h out of alternative.h Andrew Cooper
@ 2025-04-23  1:02 ` Andrew Cooper
  2025-04-23  7:55   ` Jan Beulich
  2025-04-23  1:02 ` [PATCH 3/8] x86/alternatives: Rework information passing into nmi_apply_alternatives() Andrew Cooper
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  1:02 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné

The alternatives APIs are not great; rename alternative_branches() to be more
precise.  Centralise the declaration in xen/alternative-call.h, in the
expectation that x86 won't be the only user in the long term.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/alternative.c             |  2 +-
 xen/arch/x86/include/asm/alternative.h |  1 -
 xen/arch/x86/setup.c                   |  3 ++-
 xen/include/xen/alternative-call.h     | 10 +++++++++-
 4 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 1ba35cb9ede9..d1a3b7ea7ca6 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -493,7 +493,7 @@ void __init alternative_instructions(void)
     _alternative_instructions(false);
 }
 
-void __init alternative_branches(void)
+void __init boot_apply_alt_calls(void)
 {
     local_irq_disable();
     _alternative_instructions(true);
diff --git a/xen/arch/x86/include/asm/alternative.h b/xen/arch/x86/include/asm/alternative.h
index 2d2ace97f794..29c3d724b07f 100644
--- a/xen/arch/x86/include/asm/alternative.h
+++ b/xen/arch/x86/include/asm/alternative.h
@@ -29,7 +29,6 @@ extern void add_nops(void *insns, unsigned int len);
 /* Similar to alternative_instructions except it can be run with IRQs enabled. */
 extern int apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void alternative_instructions(void);
-extern void alternative_branches(void);
 
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
 #define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index cf1ea040dd90..25189541244d 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1,4 +1,5 @@
 #include <xen/acpi.h>
+#include <xen/alternative-call.h>
 #include <xen/bitops.h>
 #include <xen/console.h>
 #include <xen/cpu.h>
@@ -2082,7 +2083,7 @@ void asmlinkage __init noreturn __start_xen(void)
 
     do_presmp_initcalls();
 
-    alternative_branches();
+    boot_apply_alt_calls();
 
     /*
      * NB: when running as a PV shim VCPUOP_up/down is wired to the shim
diff --git a/xen/include/xen/alternative-call.h b/xen/include/xen/alternative-call.h
index 39339c3f0f76..3c855bfa44f5 100644
--- a/xen/include/xen/alternative-call.h
+++ b/xen/include/xen/alternative-call.h
@@ -17,6 +17,8 @@
  *   generation requirements are to emit a function pointer call at build
  *   time, and stash enough metadata to simplify the call at boot once the
  *   implementation has been resolved.
+ * - Implement boot_apply_alt_calls() to convert the function pointer calls
+ *   into direct calls on boot.
  * - Select ALTERNATIVE_CALL in Kconfig.
  *
  * To use:
@@ -57,7 +59,13 @@
 # define __alt_call_maybe_initdata __initdata
 #endif
 
-#else
+/*
+ * Devirtualise the alternative_{,v}call()'s on boot.  Convert still-NULL
+ * function pointers into traps.
+ */
+void boot_apply_alt_calls(void);
+
+#else /* CONFIG_ALTERNATIVE_CALL */
 
 #define alternative_call(func, args...)  (func)(args)
 #define alternative_vcall(func, args...) (func)(args)
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/8] x86/alternatives: Rework information passing into nmi_apply_alternatives()
  2025-04-23  1:02 [PATCH 0/8] x86/altcall: Switch to a simpler scheme Andrew Cooper
  2025-04-23  1:02 ` [PATCH 1/8] x86/altcall: Split alternative-call.h out of alternative.h Andrew Cooper
  2025-04-23  1:02 ` [PATCH 2/8] x86/altcall: Rename alternative_branches() to boot_apply_alt_calls() Andrew Cooper
@ 2025-04-23  1:02 ` Andrew Cooper
  2025-04-23  8:06   ` Jan Beulich
  2025-04-23  1:02 ` [PATCH 4/8] x86/alternatives: Factor seal_endbr64() out of _apply_alternatives() Andrew Cooper
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  1:02 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné

nmi_apply_alternatives() is soon going to need to dispatch to multiple
functions, and a force parameter is not a good way of passing information.

Introduce ALT_INSNS and ALT_CALLS to pass in at the top level to select the
operation(s) desired.  They represent what will happen when we've separated
the altcalls out of the general alternative instructions infrastructure,
although in the short term we still need to synthesise the force parameter for
_apply_alternatives().

Move two externs to reduce their scope a little.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/alternative.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index d1a3b7ea7ca6..9aa591b364a4 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -19,8 +19,6 @@
 
 #define MAX_PATCH_LEN (255-1)
 
-extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
-
 #ifdef K8_NOP1
 static const unsigned char k8nops[] init_or_livepatch_const = {
     K8_NOP1,
@@ -387,9 +385,13 @@ int apply_alternatives(struct alt_instr *start, struct alt_instr *end)
 }
 #endif
 
+#define ALT_INSNS (1U << 0)
+#define ALT_CALLS (1U << 1)
 static unsigned int __initdata alt_todo;
 static unsigned int __initdata alt_done;
 
+extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
+
 /*
  * At boot time, we patch alternatives in NMI context.  This means that the
  * active NMI-shadow will defer any further NMIs, removing the slim race
@@ -419,7 +421,7 @@ static int __init cf_check nmi_apply_alternatives(
         flush_local(FLUSH_TLB_GLOBAL);
 
         rc = _apply_alternatives(__alt_instructions, __alt_instructions_end,
-                                 alt_done);
+                                 alt_todo == ALT_CALLS);
         if ( rc )
             panic("Unable to apply alternatives: %d\n", rc);
 
@@ -442,7 +444,7 @@ static int __init cf_check nmi_apply_alternatives(
  * This routine is called with local interrupt disabled and used during
  * bootup.
  */
-static void __init _alternative_instructions(bool force)
+static void __init _alternative_instructions(unsigned int what)
 {
     unsigned int i;
     nmi_callback_t *saved_nmi_callback;
@@ -460,7 +462,7 @@ static void __init _alternative_instructions(bool force)
     ASSERT(!local_irq_is_enabled());
 
     /* Set what operation to perform /before/ setting the callback. */
-    alt_todo = 1u << force;
+    alt_todo = what;
     barrier();
 
     /*
@@ -490,12 +492,12 @@ static void __init _alternative_instructions(bool force)
 void __init alternative_instructions(void)
 {
     arch_init_ideal_nops();
-    _alternative_instructions(false);
+    _alternative_instructions(ALT_INSNS);
 }
 
 void __init boot_apply_alt_calls(void)
 {
     local_irq_disable();
-    _alternative_instructions(true);
+    _alternative_instructions(ALT_CALLS);
     local_irq_enable();
 }
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/8] x86/alternatives: Factor seal_endbr64() out of _apply_alternatives()
  2025-04-23  1:02 [PATCH 0/8] x86/altcall: Switch to a simpler scheme Andrew Cooper
                   ` (2 preceding siblings ...)
  2025-04-23  1:02 ` [PATCH 3/8] x86/alternatives: Rework information passing into nmi_apply_alternatives() Andrew Cooper
@ 2025-04-23  1:02 ` Andrew Cooper
  2025-04-23  8:07   ` Jan Beulich
  2025-04-23  1:02 ` [PATCH 5/8] x86/altcall: Introduce new simpler scheme Andrew Cooper
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  1:02 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné

We are going to need to reposition the call in a change with several moving
parts.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/alternative.c | 70 ++++++++++++++++++++++----------------
 1 file changed, 40 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 9aa591b364a4..4b9f8d860153 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -162,6 +162,44 @@ text_poke(void *addr, const void *opcode, size_t len)
 extern void *const __initdata_cf_clobber_start[];
 extern void *const __initdata_cf_clobber_end[];
 
+/*
+ * In CET-IBT enabled builds, clobber endbr64 instructions after altcall has
+ * finished optimising all indirect branches to direct ones.
+ */
+static void __init seal_endbr64(void)
+{
+    void *const *val;
+    unsigned int clobbered = 0;
+
+    if ( !cpu_has_xen_ibt )
+        return;
+
+    /*
+     * This is some minor structure (ab)use.  We walk the entire contents
+     * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
+     *
+     * If the pointer points into .text, and at an endbr64 instruction,
+     * nop out the endbr64.  This causes the pointer to no longer be a
+     * legal indirect branch target under CET-IBT.  This is a
+     * defence-in-depth measure, to reduce the options available to an
+     * adversary who has managed to hijack a function pointer.
+     */
+    for ( val = __initdata_cf_clobber_start;
+          val < __initdata_cf_clobber_end;
+          val++ )
+    {
+        void *ptr = *val;
+
+        if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
+            continue;
+
+        place_endbr64_poison(ptr);
+        clobbered++;
+    }
+
+    printk("altcall: Optimised away %u endbr64 instructions\n", clobbered);
+}
+
 /*
  * Replace instructions with better alternatives for this CPU type.
  * This runs before SMP is initialized to avoid SMP problems with
@@ -344,36 +382,8 @@ static int init_or_livepatch _apply_alternatives(struct alt_instr *start,
      * Clobber endbr64 instructions now that altcall has finished optimising
      * all indirect branches to direct ones.
      */
-    if ( force && cpu_has_xen_ibt && system_state < SYS_STATE_active )
-    {
-        void *const *val;
-        unsigned int clobbered = 0;
-
-        /*
-         * This is some minor structure (ab)use.  We walk the entire contents
-         * of .init.{ro,}data.cf_clobber as if it were an array of pointers.
-         *
-         * If the pointer points into .text, and at an endbr64 instruction,
-         * nop out the endbr64.  This causes the pointer to no longer be a
-         * legal indirect branch target under CET-IBT.  This is a
-         * defence-in-depth measure, to reduce the options available to an
-         * adversary who has managed to hijack a function pointer.
-         */
-        for ( val = __initdata_cf_clobber_start;
-              val < __initdata_cf_clobber_end;
-              val++ )
-        {
-            void *ptr = *val;
-
-            if ( !is_kernel_text(ptr) || !is_endbr64(ptr) )
-                continue;
-
-            place_endbr64_poison(ptr);
-            clobbered++;
-        }
-
-        printk("altcall: Optimised away %u endbr64 instructions\n", clobbered);
-    }
+    if ( force && system_state < SYS_STATE_active )
+        seal_endbr64();
 
     return 0;
 }
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/8] x86/altcall: Introduce new simpler scheme
  2025-04-23  1:02 [PATCH 0/8] x86/altcall: Switch to a simpler scheme Andrew Cooper
                   ` (3 preceding siblings ...)
  2025-04-23  1:02 ` [PATCH 4/8] x86/alternatives: Factor seal_endbr64() out of _apply_alternatives() Andrew Cooper
@ 2025-04-23  1:02 ` Andrew Cooper
  2025-04-23  8:16   ` Jan Beulich
  2025-04-23  1:02 ` [PATCH 6/8] xen/livepatch: Support new altcall scheme Andrew Cooper
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  1:02 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné

Encoding altcalls as regular alternatives leads to an unreasonable amount of
complexity in _apply_alternatives().

Introduce apply_alt_calls(), and an .alt_call_sites section which simply
tracks the source address (relative, to save on space).  That's literally all
that is needed in order to devirtualise the function pointers.

apply_alt_calls() is mostly as per _apply_alternatives(), except the size is
known to be 6 bytes.  Drop the logic for JMP *RIPREL, as there's no support
for tailcall optimisations, nor a feasbile plan on how to introduce support.
Pad with a redundant prefix to avoid needing a separate NOP on the end.

Wire it up in nmi_apply_alternatives(), although the section is empty at this
juncture so nothing happens in practice.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>

Finding a 6-byte UD instruction that is distinct from ud2 turns out to be
quite challengning.  The easy way involves a length changing prefix, which is
best avoided.  Suggestions for alternative patterns welcome.

---
 xen/arch/x86/alternative.c                  | 94 +++++++++++++++++++++
 xen/arch/x86/include/asm/alternative-call.h |  7 ++
 xen/arch/x86/xen.lds.S                      |  4 +
 3 files changed, 105 insertions(+)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 4b9f8d860153..f6594e21a14c 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -388,6 +388,92 @@ static int init_or_livepatch _apply_alternatives(struct alt_instr *start,
     return 0;
 }
 
+/*
+ * At build time, alternative calls are emitted as:
+ *   ff 15 xx xx xx xx  =>  call *disp32(%rip)
+ *
+ * During boot, we devirtualise by editing to:
+ *   2e e8 xx xx xx xx  =>  cs call disp32
+ *
+ * or, if the function pointer is still NULL, poison to:
+ *   0f 0b 0f 0b 0f 0b  =>  ud2a (x3)
+ */
+static int init_or_livepatch apply_alt_calls(
+    const struct alt_call *start, const struct alt_call *end)
+{
+    const struct alt_call *a;
+
+    for ( a = start; a < end; a++ )
+    {
+        const uint8_t *dest;
+        uint8_t buf[6], *orig = ALT_CALL_PTR(a);
+        long disp;
+
+        /* It's likely that this won't change, but check just to be safe. */
+        BUILD_BUG_ON(ALT_CALL_LEN(a) != 6);
+
+        if ( orig[0] != 0xff || orig[1] != 0x15 )
+        {
+            printk(XENLOG_ERR
+                   "Altcall for %ps [%6ph] not CALL *RIPREL\n",
+                   orig, orig);
+            return -EINVAL;
+        }
+
+        disp = *(int32_t *)(orig + 2);
+        dest = *(const void **)(orig + 6 + disp);
+
+        if ( dest )
+        {
+            /*
+             * When building for CET-IBT, all function pointer targets
+             * should have an endbr64 instruction.
+             *
+             * If this is not the case, leave a warning because
+             * something is probably wrong with the build.  A CET-IBT
+             * enabled system might have exploded already.
+             *
+             * Otherwise, skip the endbr64 instruction.  This is a
+             * marginal perf improvement which saves on instruction
+             * decode bandwidth.
+             */
+            if ( IS_ENABLED(CONFIG_XEN_IBT) )
+            {
+                if ( is_endbr64(dest) )
+                    dest += ENDBR64_LEN;
+                else
+                    printk(XENLOG_WARNING
+                           "Altcall %ps dest %ps has no endbr64\n",
+                           orig, dest);
+            }
+
+            disp = dest - (orig + 6);
+            ASSERT(disp == (int32_t)disp);
+
+            buf[0] = 0x2e;
+            buf[1] = 0xe8;
+            *(int32_t *)(buf + 2) = disp;
+        }
+        else
+        {
+            /*
+             * The function pointer is still NULL.  Seal the whole call, as
+             * it's not used.
+             */
+            buf[0] = 0x0f;
+            buf[1] = 0x0b;
+            buf[2] = 0x0f;
+            buf[3] = 0x0b;
+            buf[4] = 0x0f;
+            buf[5] = 0x0b;
+        }
+
+        text_poke(orig, buf, sizeof(buf));
+    }
+
+    return 0;
+}
+
 #ifdef CONFIG_LIVEPATCH
 int apply_alternatives(struct alt_instr *start, struct alt_instr *end)
 {
@@ -401,6 +487,7 @@ static unsigned int __initdata alt_todo;
 static unsigned int __initdata alt_done;
 
 extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
+extern struct alt_call __alt_call_sites_start[], __alt_call_sites_end[];
 
 /*
  * At boot time, we patch alternatives in NMI context.  This means that the
@@ -435,6 +522,13 @@ static int __init cf_check nmi_apply_alternatives(
         if ( rc )
             panic("Unable to apply alternatives: %d\n", rc);
 
+        if ( alt_todo & ALT_CALLS )
+        {
+            rc = apply_alt_calls(__alt_call_sites_start, __alt_call_sites_end);
+            if ( rc )
+                panic("Unable to apply alternative calls: %d\n", rc);
+        }
+
         /*
          * Reinstate perms on .text to be RX.  This also cleans out the dirty
          * bits, which matters when CET Shstk is active.
diff --git a/xen/arch/x86/include/asm/alternative-call.h b/xen/arch/x86/include/asm/alternative-call.h
index 828ea32a9625..49a04a7cc45b 100644
--- a/xen/arch/x86/include/asm/alternative-call.h
+++ b/xen/arch/x86/include/asm/alternative-call.h
@@ -4,6 +4,13 @@
 
 #include <asm/alternative.h>
 
+/* Simply the relative position of the source call. */
+struct alt_call {
+    int32_t offset;
+};
+#define ALT_CALL_PTR(a) ((void *)&(a)->offset + (a)->offset)
+#define ALT_CALL_LEN(a) (6)
+
 /*
  * Machinery to allow converting indirect to direct calls, when the called
  * function is determined once at boot and later never changed.
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index d4dd6434c466..53bafc98a536 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -260,6 +260,10 @@ SECTIONS
         __alt_instructions = .;
         *(.altinstructions)
         __alt_instructions_end = .;
+        . = ALIGN(4);
+        __alt_call_sites_start = .;
+        *(.alt_call_sites)
+        __alt_call_sites_end = .;
 
        LOCK_PROFILE_DATA
 
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 6/8] xen/livepatch: Support new altcall scheme
  2025-04-23  1:02 [PATCH 0/8] x86/altcall: Switch to a simpler scheme Andrew Cooper
                   ` (4 preceding siblings ...)
  2025-04-23  1:02 ` [PATCH 5/8] x86/altcall: Introduce new simpler scheme Andrew Cooper
@ 2025-04-23  1:02 ` Andrew Cooper
  2025-04-23  8:20   ` Jan Beulich
  2025-04-23 11:16   ` Roger Pau Monné
  2025-04-23  1:02 ` [PATCH 7/8] x86/altcall: Switch to simpler scheme Andrew Cooper
  2025-04-23  1:02 ` [PATCH 8/8] x86/alternatives: Simplify _apply_alternatives() now altcall is separate Andrew Cooper
  7 siblings, 2 replies; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  1:02 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Ross Lagerwall

The new altcall scheme uses an .alt_call_sites section.  Wire this up in very
much the same way as the .altinstructions section, although there is less
sanity checking necessary.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Ross Lagerwall <ross.lagerwall@citrix.com>
---
 xen/arch/x86/alternative.c         |  6 ++++
 xen/common/livepatch.c             | 58 ++++++++++++++++++++++++++++++
 xen/include/xen/alternative-call.h |  8 +++--
 3 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index f6594e21a14c..22af224f08f7 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -479,6 +479,12 @@ int apply_alternatives(struct alt_instr *start, struct alt_instr *end)
 {
     return _apply_alternatives(start, end, true);
 }
+
+int livepatch_apply_alt_calls(const struct alt_call *start,
+                              const struct alt_call *end)
+{
+    return apply_alt_calls(start, end);
+}
 #endif
 
 #define ALT_INSNS (1U << 0)
diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 6ce77bf021b7..be9b7e367553 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -905,6 +905,64 @@ static int prepare_payload(struct payload *payload,
 #endif
     }
 
+    sec = livepatch_elf_sec_by_name(elf, ".alt_call_sites");
+    if ( sec )
+    {
+#ifdef CONFIG_ALTERNATIVE_CALL
+        const struct alt_call *a, *start, *end;
+
+        if ( !section_ok(elf, sec, sizeof(*a)) )
+            return -EINVAL;
+
+        /* Tolerate an empty .alt_call_sites section... */
+        if ( sec->sec->sh_size == 0 )
+            goto alt_call_done;
+
+        /* ... but otherwise, there needs to be something to alter... */
+        if ( payload->text_size == 0 )
+        {
+            printk(XENLOG_ERR LIVEPATCH "%s Alternative calls provided, but no .text\n",
+                   elf->name);
+            return -EINVAL;
+        }
+
+        start = sec->addr;
+        end = sec->addr + sec->sec->sh_size;
+
+        for ( a = start; a < end; a++ )
+        {
+            const void *orig = ALT_CALL_PTR(a);
+            size_t len = ALT_CALL_LEN(a);
+
+            /* orig must be fully within .text. */
+            if ( orig       < payload->text_addr ||
+                 len        > payload->text_size ||
+                 orig + len > payload->text_addr + payload->text_size )
+            {
+                printk(XENLOG_ERR LIVEPATCH
+                       "%s: Alternative call %p+%#zx outside payload text %p+%#zx\n",
+                       elf->name, orig, len,
+                       payload->text_addr, payload->text_size);
+                return -EINVAL;
+            }
+        }
+
+        rc = livepatch_apply_alt_calls(start, end);
+        if ( rc )
+        {
+            printk(XENLOG_ERR LIVEPATCH "%s: Applying alternative calls failed: %d\n",
+                   elf->name, rc);
+            return rc;
+        }
+
+    alt_call_done:;
+#else /* CONFIG_ALTERNATIVE_CALL */
+        printk(XENLOG_ERR LIVEPATCH "%s: Alternative calls not supported\n",
+               elf->name);
+        return -EOPNOTSUPP;
+#endif /* !CONFIG_ALTERNATIVE_CALL */
+    }
+
     sec = livepatch_elf_sec_by_name(elf, ".ex_table");
     if ( sec )
     {
diff --git a/xen/include/xen/alternative-call.h b/xen/include/xen/alternative-call.h
index 3c855bfa44f5..767c2149bce7 100644
--- a/xen/include/xen/alternative-call.h
+++ b/xen/include/xen/alternative-call.h
@@ -17,8 +17,8 @@
  *   generation requirements are to emit a function pointer call at build
  *   time, and stash enough metadata to simplify the call at boot once the
  *   implementation has been resolved.
- * - Implement boot_apply_alt_calls() to convert the function pointer calls
- *   into direct calls on boot.
+ * - Implement {boot,livepatch}_apply_alt_calls() to convert the function
+ *   pointer calls into direct calls on boot/livepatch.
  * - Select ALTERNATIVE_CALL in Kconfig.
  *
  * To use:
@@ -65,6 +65,10 @@
  */
 void boot_apply_alt_calls(void);
 
+/* As per boot_apply_alt_calls() but for a livepatch. */
+int livepatch_apply_alt_calls(const struct alt_call *start,
+                              const struct alt_call *end);
+
 #else /* CONFIG_ALTERNATIVE_CALL */
 
 #define alternative_call(func, args...)  (func)(args)
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 7/8] x86/altcall: Switch to simpler scheme
  2025-04-23  1:02 [PATCH 0/8] x86/altcall: Switch to a simpler scheme Andrew Cooper
                   ` (5 preceding siblings ...)
  2025-04-23  1:02 ` [PATCH 6/8] xen/livepatch: Support new altcall scheme Andrew Cooper
@ 2025-04-23  1:02 ` Andrew Cooper
  2025-04-23  9:01   ` Jan Beulich
  2025-04-23  1:02 ` [PATCH 8/8] x86/alternatives: Simplify _apply_alternatives() now altcall is separate Andrew Cooper
  7 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  1:02 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné

With all the infrastructure in place, switch from using ALTERNATIVE() to
simply populating .alt_call_sites.

Before, _apply_alternatives() would devirtualise in two passes; the first
being opportunistic, and the second (signified by the force parameter) sealing
any call with a still-NULL function pointer.

Now, all devirtualising is performed together, at the point in time of the
second pass previously.  The call to seal_endbr64() needs delaying until after
apply_alt_calls() is complete, or we have a narrow window with real indirect
branches and no ENDBR64 instructions.

Under the hood, the following changes are happening:

  Section                Old size   New size   Change (%)
  .alt_call_sites               0    0x00730   +0x0730
  .altinstructions        0x1350a    0x11fe0   -0x152a (-7%)
  .altinstr_replacement   0x015f2    0x00e35   -0x07bd (-23%)

The changes aren't quite equal because inlining is affected by the smaller
asm() block.  Nevertheless, the metadata is held in 1/3 of the space, and
there are no CALL instructions held in the replacement section any more.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/alternative.c                  | 9 ++-------
 xen/arch/x86/include/asm/alternative-call.h | 9 ++++++---
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 22af224f08f7..047bfc6e424b 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -378,13 +378,6 @@ static int init_or_livepatch _apply_alternatives(struct alt_instr *start,
         text_poke(orig, buf, total_len);
     }
 
-    /*
-     * Clobber endbr64 instructions now that altcall has finished optimising
-     * all indirect branches to direct ones.
-     */
-    if ( force && system_state < SYS_STATE_active )
-        seal_endbr64();
-
     return 0;
 }
 
@@ -533,6 +526,8 @@ static int __init cf_check nmi_apply_alternatives(
             rc = apply_alt_calls(__alt_call_sites_start, __alt_call_sites_end);
             if ( rc )
                 panic("Unable to apply alternative calls: %d\n", rc);
+
+            seal_endbr64();
         }
 
         /*
diff --git a/xen/arch/x86/include/asm/alternative-call.h b/xen/arch/x86/include/asm/alternative-call.h
index 49a04a7cc45b..bbc49a5274d9 100644
--- a/xen/arch/x86/include/asm/alternative-call.h
+++ b/xen/arch/x86/include/asm/alternative-call.h
@@ -2,7 +2,8 @@
 #ifndef X86_ALTERNATIVE_CALL_H
 #define X86_ALTERNATIVE_CALL_H
 
-#include <asm/alternative.h>
+#include <xen/macros.h>
+#include <xen/stdint.h>
 
 /* Simply the relative position of the source call. */
 struct alt_call {
@@ -86,8 +87,10 @@ struct alt_call {
     rettype ret_;                                                  \
     register unsigned long r10_ asm("r10");                        \
     register unsigned long r11_ asm("r11");                        \
-    asm volatile (ALTERNATIVE("call *%c[addr](%%rip)", "call .",   \
-                              X86_FEATURE_ALWAYS)                  \
+    asm volatile ("1: call *%c[addr](%%rip)\n\t"                   \
+                  ".pushsection .alt_call_sites, \"a\", @progbits\n\t"  \
+                  ".long 1b - .\n\t"                               \
+                  ".popsection"                                    \
                   : ALT_CALL ## n ## _OUT, "=a" (ret_),            \
                     "=r" (r10_), "=r" (r11_) ASM_CALL_CONSTRAINT   \
                   : [addr] "i" (&(func)), "g" (func)               \
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 8/8] x86/alternatives: Simplify _apply_alternatives() now altcall is separate
  2025-04-23  1:02 [PATCH 0/8] x86/altcall: Switch to a simpler scheme Andrew Cooper
                   ` (6 preceding siblings ...)
  2025-04-23  1:02 ` [PATCH 7/8] x86/altcall: Switch to simpler scheme Andrew Cooper
@ 2025-04-23  1:02 ` Andrew Cooper
  2025-04-23  9:03   ` Jan Beulich
  7 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  1:02 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné

With altcall handled separately, the special case in _apply_alternatives() is
unused and can be dropped.  The force parameter (used to signify the seal
pass) can be removed too.

In turn, nmi_apply_alternatives() no longer needs to call
_apply_alternatives() on the second pass.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/alternative.c | 94 ++++----------------------------------
 1 file changed, 10 insertions(+), 84 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 047bfc6e424b..43b009888c02 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -206,14 +206,9 @@ static void __init seal_endbr64(void)
  * self modifying code. This implies that asymmetric systems where
  * APs have less capabilities than the boot processor are not handled.
  * Tough. Make sure you disable such features by hand.
- *
- * The caller will set the "force" argument to true for the final
- * invocation, such that no CALLs/JMPs to NULL pointers will be left
- * around. See also the further comment below.
  */
 static int init_or_livepatch _apply_alternatives(struct alt_instr *start,
-                                                 struct alt_instr *end,
-                                                 bool force)
+                                                 struct alt_instr *end)
 {
     struct alt_instr *a, *base;
 
@@ -274,10 +269,7 @@ static int init_or_livepatch _apply_alternatives(struct alt_instr *start,
 
         /* Skip patch sites already handled during the first pass. */
         if ( a->priv )
-        {
-            ASSERT(force);
             continue;
-        }
 
         /* If there is no replacement to make, see about optimising the nops. */
         if ( !boot_cpu_has(a->cpuid) )
@@ -301,76 +293,7 @@ static int init_or_livepatch _apply_alternatives(struct alt_instr *start,
 
         /* 0xe8/0xe9 are relative branches; fix the offset. */
         if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
-        {
-            /*
-             * Detect the special case of indirect-to-direct branch patching:
-             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
-             *   checked above),
-             * - replacement's displacement is -5 (pointing back at the very
-             *   insn, which makes no sense in a real replacement insn),
-             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
-             *   using RIP-relative addressing.
-             * Some branch destinations may still be NULL when we come here
-             * the first time. Defer patching of those until the post-presmp-
-             * initcalls re-invocation (with force set to true). If at that
-             * point the branch destination is still NULL, insert "UD2; UD0"
-             * (for ease of recognition) instead of CALL/JMP.
-             */
-            if ( a->cpuid == X86_FEATURE_ALWAYS &&
-                 *(int32_t *)(buf + 1) == -5 &&
-                 a->orig_len >= 6 &&
-                 orig[0] == 0xff &&
-                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
-            {
-                long disp = *(int32_t *)(orig + 2);
-                const uint8_t *dest = *(void **)(orig + 6 + disp);
-
-                if ( dest )
-                {
-                    /*
-                     * When building for CET-IBT, all function pointer targets
-                     * should have an endbr64 instruction.
-                     *
-                     * If this is not the case, leave a warning because
-                     * something is probably wrong with the build.  A CET-IBT
-                     * enabled system might have exploded already.
-                     *
-                     * Otherwise, skip the endbr64 instruction.  This is a
-                     * marginal perf improvement which saves on instruction
-                     * decode bandwidth.
-                     */
-                    if ( IS_ENABLED(CONFIG_XEN_IBT) )
-                    {
-                        if ( is_endbr64(dest) )
-                            dest += ENDBR64_LEN;
-                        else
-                            printk(XENLOG_WARNING
-                                   "altcall %ps dest %ps has no endbr64\n",
-                                   orig, dest);
-                    }
-
-                    disp = dest - (orig + 5);
-                    ASSERT(disp == (int32_t)disp);
-                    *(int32_t *)(buf + 1) = disp;
-                }
-                else if ( force )
-                {
-                    buf[0] = 0x0f;
-                    buf[1] = 0x0b;
-                    buf[2] = 0x0f;
-                    buf[3] = 0xff;
-                    buf[4] = 0xff;
-                }
-                else
-                    continue;
-            }
-            else if ( force && system_state < SYS_STATE_active )
-                ASSERT_UNREACHABLE();
-            else
-                *(int32_t *)(buf + 1) += repl - orig;
-        }
-        else if ( force && system_state < SYS_STATE_active  )
-            ASSERT_UNREACHABLE();
+            *(int32_t *)(buf + 1) += repl - orig;
 
         a->priv = 1;
 
@@ -470,7 +393,7 @@ static int init_or_livepatch apply_alt_calls(
 #ifdef CONFIG_LIVEPATCH
 int apply_alternatives(struct alt_instr *start, struct alt_instr *end)
 {
-    return _apply_alternatives(start, end, true);
+    return _apply_alternatives(start, end);
 }
 
 int livepatch_apply_alt_calls(const struct alt_call *start,
@@ -516,10 +439,13 @@ static int __init cf_check nmi_apply_alternatives(
                                  PAGE_HYPERVISOR_RWX);
         flush_local(FLUSH_TLB_GLOBAL);
 
-        rc = _apply_alternatives(__alt_instructions, __alt_instructions_end,
-                                 alt_todo == ALT_CALLS);
-        if ( rc )
-            panic("Unable to apply alternatives: %d\n", rc);
+        if ( alt_todo & ALT_INSNS )
+        {
+            rc = _apply_alternatives(__alt_instructions,
+                                     __alt_instructions_end);
+            if ( rc )
+                panic("Unable to apply alternatives: %d\n", rc);
+        }
 
         if ( alt_todo & ALT_CALLS )
         {
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/8] x86/altcall: Split alternative-call.h out of alternative.h
  2025-04-23  1:02 ` [PATCH 1/8] x86/altcall: Split alternative-call.h out of alternative.h Andrew Cooper
@ 2025-04-23  7:55   ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-04-23  7:55 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, consulting @ bugseng . com, Nicola Vetrini,
	Xen-devel

On 23.04.2025 03:02, Andrew Cooper wrote:
> ... in preparation for changing how they're implemented.
> 
> Update the MISRA deviations with the new path.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/8] x86/altcall: Rename alternative_branches() to boot_apply_alt_calls()
  2025-04-23  1:02 ` [PATCH 2/8] x86/altcall: Rename alternative_branches() to boot_apply_alt_calls() Andrew Cooper
@ 2025-04-23  7:55   ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-04-23  7:55 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel

On 23.04.2025 03:02, Andrew Cooper wrote:
> The alternatives APIs are not great; rename alternative_branches() to be more
> precise.  Centralise the declaration in xen/alternative-call.h, in the
> expectation that x86 won't be the only user in the long term.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/8] x86/alternatives: Rework information passing into nmi_apply_alternatives()
  2025-04-23  1:02 ` [PATCH 3/8] x86/alternatives: Rework information passing into nmi_apply_alternatives() Andrew Cooper
@ 2025-04-23  8:06   ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-04-23  8:06 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel

On 23.04.2025 03:02, Andrew Cooper wrote:
> nmi_apply_alternatives() is soon going to need to dispatch to multiple
> functions, and a force parameter is not a good way of passing information.
> 
> Introduce ALT_INSNS and ALT_CALLS to pass in at the top level to select the
> operation(s) desired.  They represent what will happen when we've separated
> the altcalls out of the general alternative instructions infrastructure,
> although in the short term we still need to synthesise the force parameter for
> _apply_alternatives().
> 
> Move two externs to reduce their scope a little.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/8] x86/alternatives: Factor seal_endbr64() out of _apply_alternatives()
  2025-04-23  1:02 ` [PATCH 4/8] x86/alternatives: Factor seal_endbr64() out of _apply_alternatives() Andrew Cooper
@ 2025-04-23  8:07   ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-04-23  8:07 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel

On 23.04.2025 03:02, Andrew Cooper wrote:
> We are going to need to reposition the call in a change with several moving
> parts.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/8] x86/altcall: Introduce new simpler scheme
  2025-04-23  1:02 ` [PATCH 5/8] x86/altcall: Introduce new simpler scheme Andrew Cooper
@ 2025-04-23  8:16   ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-04-23  8:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel

On 23.04.2025 03:02, Andrew Cooper wrote:
> Encoding altcalls as regular alternatives leads to an unreasonable amount of
> complexity in _apply_alternatives().
> 
> Introduce apply_alt_calls(), and an .alt_call_sites section which simply
> tracks the source address (relative, to save on space).  That's literally all
> that is needed in order to devirtualise the function pointers.
> 
> apply_alt_calls() is mostly as per _apply_alternatives(), except the size is
> known to be 6 bytes.  Drop the logic for JMP *RIPREL, as there's no support
> for tailcall optimisations, nor a feasbile plan on how to introduce support.
> Pad with a redundant prefix to avoid needing a separate NOP on the end.
> 
> Wire it up in nmi_apply_alternatives(), although the section is empty at this
> juncture so nothing happens in practice.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> Finding a 6-byte UD instruction that is distinct from ud2 turns out to be
> quite challengning.  The easy way involves a length changing prefix, which is
> best avoided.  Suggestions for alternative patterns welcome.

(Intel syntax, sorry.)

	ud0	edi, [edi+edi-1]
	ud1	edi, [edi+edi-1]

	ud0	edi, cs:[rdi+rdi-1]
	ud1	edi, cs:[rdi+rdi-1]

Jan


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/8] xen/livepatch: Support new altcall scheme
  2025-04-23  1:02 ` [PATCH 6/8] xen/livepatch: Support new altcall scheme Andrew Cooper
@ 2025-04-23  8:20   ` Jan Beulich
  2025-04-23 11:16   ` Roger Pau Monné
  1 sibling, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-04-23  8:20 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Ross Lagerwall, Xen-devel

On 23.04.2025 03:02, Andrew Cooper wrote:
> The new altcall scheme uses an .alt_call_sites section.  Wire this up in very
> much the same way as the .altinstructions section, although there is less
> sanity checking necessary.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 7/8] x86/altcall: Switch to simpler scheme
  2025-04-23  1:02 ` [PATCH 7/8] x86/altcall: Switch to simpler scheme Andrew Cooper
@ 2025-04-23  9:01   ` Jan Beulich
  2025-04-23  9:12     ` Andrew Cooper
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2025-04-23  9:01 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel

On 23.04.2025 03:02, Andrew Cooper wrote:
> --- a/xen/arch/x86/alternative.c
> +++ b/xen/arch/x86/alternative.c
> @@ -378,13 +378,6 @@ static int init_or_livepatch _apply_alternatives(struct alt_instr *start,
>          text_poke(orig, buf, total_len);
>      }
>  
> -    /*
> -     * Clobber endbr64 instructions now that altcall has finished optimising
> -     * all indirect branches to direct ones.
> -     */
> -    if ( force && system_state < SYS_STATE_active )
> -        seal_endbr64();
> -
>      return 0;
>  }
>  
> @@ -533,6 +526,8 @@ static int __init cf_check nmi_apply_alternatives(
>              rc = apply_alt_calls(__alt_call_sites_start, __alt_call_sites_end);
>              if ( rc )
>                  panic("Unable to apply alternative calls: %d\n", rc);
> +
> +            seal_endbr64();
>          }

Are you deliberately losing the comment? Other than this:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 8/8] x86/alternatives: Simplify _apply_alternatives() now altcall is separate
  2025-04-23  1:02 ` [PATCH 8/8] x86/alternatives: Simplify _apply_alternatives() now altcall is separate Andrew Cooper
@ 2025-04-23  9:03   ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-04-23  9:03 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Xen-devel

On 23.04.2025 03:02, Andrew Cooper wrote:
> With altcall handled separately, the special case in _apply_alternatives() is
> unused and can be dropped.  The force parameter (used to signify the seal
> pass) can be removed too.
> 
> In turn, nmi_apply_alternatives() no longer needs to call
> _apply_alternatives() on the second pass.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 7/8] x86/altcall: Switch to simpler scheme
  2025-04-23  9:01   ` Jan Beulich
@ 2025-04-23  9:12     ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23  9:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monné, Xen-devel

On 23/04/2025 10:01 am, Jan Beulich wrote:
> On 23.04.2025 03:02, Andrew Cooper wrote:
>> --- a/xen/arch/x86/alternative.c
>> +++ b/xen/arch/x86/alternative.c
>> @@ -378,13 +378,6 @@ static int init_or_livepatch _apply_alternatives(struct alt_instr *start,
>>          text_poke(orig, buf, total_len);
>>      }
>>  
>> -    /*
>> -     * Clobber endbr64 instructions now that altcall has finished optimising
>> -     * all indirect branches to direct ones.
>> -     */
>> -    if ( force && system_state < SYS_STATE_active )
>> -        seal_endbr64();
>> -
>>      return 0;
>>  }
>>  
>> @@ -533,6 +526,8 @@ static int __init cf_check nmi_apply_alternatives(
>>              rc = apply_alt_calls(__alt_call_sites_start, __alt_call_sites_end);
>>              if ( rc )
>>                  panic("Unable to apply alternative calls: %d\n", rc);
>> +
>> +            seal_endbr64();
>>          }
> Are you deliberately losing the comment?

Yes, counterbalancing the comment gained in patch 4.

>  Other than this:
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks,

~Andrew


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/8] xen/livepatch: Support new altcall scheme
  2025-04-23  1:02 ` [PATCH 6/8] xen/livepatch: Support new altcall scheme Andrew Cooper
  2025-04-23  8:20   ` Jan Beulich
@ 2025-04-23 11:16   ` Roger Pau Monné
  2025-04-23 12:04     ` Andrew Cooper
  1 sibling, 1 reply; 20+ messages in thread
From: Roger Pau Monné @ 2025-04-23 11:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Jan Beulich, Ross Lagerwall

On Wed, Apr 23, 2025 at 02:02:35AM +0100, Andrew Cooper wrote:
> The new altcall scheme uses an .alt_call_sites section.  Wire this up in very
> much the same way as the .altinstructions section, although there is less
> sanity checking necessary.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

One nit/comment below.

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Ross Lagerwall <ross.lagerwall@citrix.com>
> ---
>  xen/arch/x86/alternative.c         |  6 ++++
>  xen/common/livepatch.c             | 58 ++++++++++++++++++++++++++++++
>  xen/include/xen/alternative-call.h |  8 +++--
>  3 files changed, 70 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
> index f6594e21a14c..22af224f08f7 100644
> --- a/xen/arch/x86/alternative.c
> +++ b/xen/arch/x86/alternative.c
> @@ -479,6 +479,12 @@ int apply_alternatives(struct alt_instr *start, struct alt_instr *end)
>  {
>      return _apply_alternatives(start, end, true);
>  }
> +
> +int livepatch_apply_alt_calls(const struct alt_call *start,
> +                              const struct alt_call *end)
> +{
> +    return apply_alt_calls(start, end);
> +}
>  #endif
>  
>  #define ALT_INSNS (1U << 0)
> diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
> index 6ce77bf021b7..be9b7e367553 100644
> --- a/xen/common/livepatch.c
> +++ b/xen/common/livepatch.c
> @@ -905,6 +905,64 @@ static int prepare_payload(struct payload *payload,
>  #endif
>      }
>  
> +    sec = livepatch_elf_sec_by_name(elf, ".alt_call_sites");
> +    if ( sec )
> +    {
> +#ifdef CONFIG_ALTERNATIVE_CALL
> +        const struct alt_call *a, *start, *end;
> +
> +        if ( !section_ok(elf, sec, sizeof(*a)) )
> +            return -EINVAL;
> +
> +        /* Tolerate an empty .alt_call_sites section... */
> +        if ( sec->sec->sh_size == 0 )

You could possibly move this check to the outer `if` condition, and
avoid the alt_call_done label?

As even in the !CONFIG_ALTERNATIVE_CALL case skipping an empty section
would be OK.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/8] xen/livepatch: Support new altcall scheme
  2025-04-23 11:16   ` Roger Pau Monné
@ 2025-04-23 12:04     ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2025-04-23 12:04 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Xen-devel, Jan Beulich, Ross Lagerwall

On 23/04/2025 12:16 pm, Roger Pau Monné wrote:
> On Wed, Apr 23, 2025 at 02:02:35AM +0100, Andrew Cooper wrote:
>> The new altcall scheme uses an .alt_call_sites section.  Wire this up in very
>> much the same way as the .altinstructions section, although there is less
>> sanity checking necessary.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>
> One nit/comment below.
>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Ross Lagerwall <ross.lagerwall@citrix.com>
>> ---
>>  xen/arch/x86/alternative.c         |  6 ++++
>>  xen/common/livepatch.c             | 58 ++++++++++++++++++++++++++++++
>>  xen/include/xen/alternative-call.h |  8 +++--
>>  3 files changed, 70 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
>> index f6594e21a14c..22af224f08f7 100644
>> --- a/xen/arch/x86/alternative.c
>> +++ b/xen/arch/x86/alternative.c
>> @@ -479,6 +479,12 @@ int apply_alternatives(struct alt_instr *start, struct alt_instr *end)
>>  {
>>      return _apply_alternatives(start, end, true);
>>  }
>> +
>> +int livepatch_apply_alt_calls(const struct alt_call *start,
>> +                              const struct alt_call *end)
>> +{
>> +    return apply_alt_calls(start, end);
>> +}
>>  #endif
>>  
>>  #define ALT_INSNS (1U << 0)
>> diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
>> index 6ce77bf021b7..be9b7e367553 100644
>> --- a/xen/common/livepatch.c
>> +++ b/xen/common/livepatch.c
>> @@ -905,6 +905,64 @@ static int prepare_payload(struct payload *payload,
>>  #endif
>>      }
>>  
>> +    sec = livepatch_elf_sec_by_name(elf, ".alt_call_sites");
>> +    if ( sec )
>> +    {
>> +#ifdef CONFIG_ALTERNATIVE_CALL
>> +        const struct alt_call *a, *start, *end;
>> +
>> +        if ( !section_ok(elf, sec, sizeof(*a)) )
>> +            return -EINVAL;
>> +
>> +        /* Tolerate an empty .alt_call_sites section... */
>> +        if ( sec->sec->sh_size == 0 )
> You could possibly move this check to the outer `if` condition, and
> avoid the alt_call_done label?
>
> As even in the !CONFIG_ALTERNATIVE_CALL case skipping an empty section
> would be OK.

.altinstructions is like this.  It was put in as part of e74360e4ba4a, I
believe because it was decided that an empty section wasn't wanted.

We can revisit the decision, but the logic should be consistent.

~Andrew


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2025-04-23 12:04 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-23  1:02 [PATCH 0/8] x86/altcall: Switch to a simpler scheme Andrew Cooper
2025-04-23  1:02 ` [PATCH 1/8] x86/altcall: Split alternative-call.h out of alternative.h Andrew Cooper
2025-04-23  7:55   ` Jan Beulich
2025-04-23  1:02 ` [PATCH 2/8] x86/altcall: Rename alternative_branches() to boot_apply_alt_calls() Andrew Cooper
2025-04-23  7:55   ` Jan Beulich
2025-04-23  1:02 ` [PATCH 3/8] x86/alternatives: Rework information passing into nmi_apply_alternatives() Andrew Cooper
2025-04-23  8:06   ` Jan Beulich
2025-04-23  1:02 ` [PATCH 4/8] x86/alternatives: Factor seal_endbr64() out of _apply_alternatives() Andrew Cooper
2025-04-23  8:07   ` Jan Beulich
2025-04-23  1:02 ` [PATCH 5/8] x86/altcall: Introduce new simpler scheme Andrew Cooper
2025-04-23  8:16   ` Jan Beulich
2025-04-23  1:02 ` [PATCH 6/8] xen/livepatch: Support new altcall scheme Andrew Cooper
2025-04-23  8:20   ` Jan Beulich
2025-04-23 11:16   ` Roger Pau Monné
2025-04-23 12:04     ` Andrew Cooper
2025-04-23  1:02 ` [PATCH 7/8] x86/altcall: Switch to simpler scheme Andrew Cooper
2025-04-23  9:01   ` Jan Beulich
2025-04-23  9:12     ` Andrew Cooper
2025-04-23  1:02 ` [PATCH 8/8] x86/alternatives: Simplify _apply_alternatives() now altcall is separate Andrew Cooper
2025-04-23  9:03   ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.