* [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions
@ 2025-07-14 10:20 Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 01/16] x86/kvm/emulate: Implement test_cc() in C Peter Zijlstra
                   ` (16 more replies)
  0 siblings, 17 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda


Hi!

On kCFI (CONFIG_CFI_CLANG=y) builds, all indirect calls should have the CFI
check on them (with very few exceptions). Missing CFI checks undermine the
protection CFI provides and make these sites prime candidates for people
wanting to steal your cookies.

Specifically, the ABI changes such that an indirect call without the CFI
magic to a CFI-adorned function is not compatible (it happens to work for
some setups, but very much does not for FineIBT).
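
For reference, a minimal sketch of the difference (illustrative only; the
exact codegen, hash values and preamble offset vary per compiler and
config):

	# plain indirect call -- no integrity check at all
	call	*%r11

	# kCFI call site -- the compiler stores a type hash in the
	# callee's preamble and checks it before the indirect call
	movl	$(-0x12345678), %r10d	# negated type hash (made-up value)
	addl	-4(%r11), %r10d		# hash from the callee preamble
	je	1f			# sums to zero -> types match
	ud2				# mismatch -> trap
1:	call	*%r11

FineIBT rewrites both sides of this handshake at boot, which is why the
caller-side magic is not optional.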

Rust people tripped over this the other day, since their 'core' happened to
have some no_sanitize(kcfi) bits in it, which promptly exploded when run
with FineIBT on.

Since this is very much not a supported model -- on purpose -- have objtool
detect and warn about such constructs.

This effort [1] found all the existing [2] non-CFI indirect calls in the kernel.

Notable is the KVM fastop emulation stuff, which is completely rewritten;
the generated code doesn't look horrific, just slightly more verbose. I'm
running on the assumption that instruction emulation is not super
performance critical in these days of zero VM-exit VMs etc. Paolo noted
that only pre-Westmere (2010) hardware cares about this.

KVM has another; the VMX interrupt injection stuff calls the IDT handler
directly. This is rewritten to use the FRED dispatch table, which moves it
all into C.

Then there is the HyperV hypercall page stuff, for which I've previously
suggested using direct calls, and which I've now converted (after getting
properly annoyed with that code).

Also available at:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/core

Changes since v2:

 - renamed COP to EM_ASM
 - reworked the KVM-IDT stuff (Sean, Josh)

[1] https://lkml.kernel.org/r/20250410154556.GB9003@noisy.programming.kicks-ass.net
[2] https://lkml.kernel.org/r/20250410194334.GA3248459@google.com



* [PATCH v3 01/16] x86/kvm/emulate: Implement test_cc() in C
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 02/16] x86/kvm/emulate: Introduce EM_ASM_1 Peter Zijlstra
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

The current test_cc() uses the fastop infrastructure to test flags using
SETcc instructions. However, int3_emulate_jcc() already fully implements
the flags->CC mapping; use that instead.

Removes a pile of gnarly asm.
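
For reference, a few hedged usage examples of the shared helper (the cc
nibble is the condition encoding from the SETcc/Jcc opcodes; the variable
names are just for illustration):

	bool z  = __emulate_cc(flags, 0x4);	/* sete  / jz : ZF set   */
	bool nz = __emulate_cc(flags, 0x5);	/* setne / jnz: ZF clear */
	bool lt = __emulate_cc(flags, 0xc);	/* setl  / jl : SF != OF */

The low bit of the nibble inverts the test; conditions 0xc-0xf are the
signed compares built from SF^OF (plus ZF for le/g).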

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/text-patching.h |   20 +++++++++++++-------
 arch/x86/kvm/emulate.c               |   34 ++--------------------------------
 2 files changed, 15 insertions(+), 39 deletions(-)

--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -177,9 +177,9 @@ void int3_emulate_ret(struct pt_regs *re
 }
 
 static __always_inline
-void int3_emulate_jcc(struct pt_regs *regs, u8 cc, unsigned long ip, unsigned long disp)
+bool __emulate_cc(unsigned long flags, u8 cc)
 {
-	static const unsigned long jcc_mask[6] = {
+	static const unsigned long cc_mask[6] = {
 		[0] = X86_EFLAGS_OF,
 		[1] = X86_EFLAGS_CF,
 		[2] = X86_EFLAGS_ZF,
@@ -192,15 +192,21 @@ void int3_emulate_jcc(struct pt_regs *re
 	bool match;
 
 	if (cc < 0xc) {
-		match = regs->flags & jcc_mask[cc >> 1];
+		match = flags & cc_mask[cc >> 1];
 	} else {
-		match = ((regs->flags & X86_EFLAGS_SF) >> X86_EFLAGS_SF_BIT) ^
-			((regs->flags & X86_EFLAGS_OF) >> X86_EFLAGS_OF_BIT);
+		match = ((flags & X86_EFLAGS_SF) >> X86_EFLAGS_SF_BIT) ^
+			((flags & X86_EFLAGS_OF) >> X86_EFLAGS_OF_BIT);
 		if (cc >= 0xe)
-			match = match || (regs->flags & X86_EFLAGS_ZF);
+			match = match || (flags & X86_EFLAGS_ZF);
 	}
 
-	if ((match && !invert) || (!match && invert))
+	return (match && !invert) || (!match && invert);
+}
+
+static __always_inline
+void int3_emulate_jcc(struct pt_regs *regs, u8 cc, unsigned long ip, unsigned long disp)
+{
+	if (__emulate_cc(regs->flags, cc))
 		ip += disp;
 
 	int3_emulate_jmp(regs, ip);
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -26,6 +26,7 @@
 #include <asm/debugreg.h>
 #include <asm/nospec-branch.h>
 #include <asm/ibt.h>
+#include <asm/text-patching.h>
 
 #include "x86.h"
 #include "tss.h"
@@ -416,31 +417,6 @@ static int fastop(struct x86_emulate_ctx
 	ON64(FOP3E(op##q, rax, rdx, cl)) \
 	FOP_END
 
-/* Special case for SETcc - 1 instruction per cc */
-#define FOP_SETCC(op) \
-	FOP_FUNC(op) \
-	#op " %al \n\t" \
-	FOP_RET(op)
-
-FOP_START(setcc)
-FOP_SETCC(seto)
-FOP_SETCC(setno)
-FOP_SETCC(setc)
-FOP_SETCC(setnc)
-FOP_SETCC(setz)
-FOP_SETCC(setnz)
-FOP_SETCC(setbe)
-FOP_SETCC(setnbe)
-FOP_SETCC(sets)
-FOP_SETCC(setns)
-FOP_SETCC(setp)
-FOP_SETCC(setnp)
-FOP_SETCC(setl)
-FOP_SETCC(setnl)
-FOP_SETCC(setle)
-FOP_SETCC(setnle)
-FOP_END;
-
 FOP_START(salc)
 FOP_FUNC(salc)
 "pushf; sbb %al, %al; popf \n\t"
@@ -1068,13 +1044,7 @@ static int em_bsr_c(struct x86_emulate_c
 
 static __always_inline u8 test_cc(unsigned int condition, unsigned long flags)
 {
-	u8 rc;
-	void (*fop)(void) = (void *)em_setcc + FASTOP_SIZE * (condition & 0xf);
-
-	flags = (flags & EFLAGS_MASK) | X86_EFLAGS_IF;
-	asm("push %[flags]; popf; " CALL_NOSPEC
-	    : "=a"(rc), ASM_CALL_CONSTRAINT : [thunk_target]"r"(fop), [flags]"r"(flags));
-	return rc;
+	return __emulate_cc(flags, condition & 0xf);
 }
 
 static void fetch_register_operand(struct operand *op)




* [PATCH v3 02/16] x86/kvm/emulate: Introduce EM_ASM_1
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 01/16] x86/kvm/emulate: Implement test_cc() in C Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 03/16] x86/kvm/emulate: Introduce EM_ASM_2 Peter Zijlstra
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

Replace the fastops with C-based stubs. There are a bunch of problems with
the current fastop infrastructure, almost all related to its special
calling convention, which bypasses the normal C ABI.

There are two immediate problems with this at present:

 - it relies on RET preserving EFLAGS, whereas the C ABI does not.

 - it circumvents compiler-based control-flow-integrity checking
   because it's all asm magic.

The first is a problem for some mitigations where the
x86_indirect_return_thunk needs to include non-trivial work that
clobbers EFLAGS (e.g. the Skylake call-depth tracking thing).

The second is a problem because it presents a 'naked' indirect call on
kCFI builds, making it a prime target for control flow hijacking.

Additionally, given that a large chunk of virtual machine performance
relies on absolutely avoiding vmexit these days, this emulation stuff
just isn't that critical for performance anymore.

As such, replace the fastop calls with normal C functions using the
'execute' member.

As noted by Paolo: this code was only performance critical on pre-Westmere
(2010) hardware, and only when running big real mode code.
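
For illustration, EM_ASM_1(not) from this patch expands to roughly the
following (hand-expanded from the macros below, not verbatim preprocessor
output; the 2/4/8 byte cases are elided):

	static int em_not(struct x86_emulate_ctxt *ctxt)
	{
		unsigned long flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
		int bytes = 1, ok = 1;

		if (!(ctxt->d & ByteOp))
			bytes = ctxt->dst.bytes;

		switch (bytes) {
		case 1:
			asm("push %[flags]; popf \n\t"
			    "10: notb %%al \n\t"
			    "pushf; pop %[flags] \n\t"
			    "11: \n\t"
			    : "+a" (ctxt->dst.val), "+d" (ctxt->src.val),
			      [flags] "+D" (flags), "+S" (ok)
			    : "c" (ctxt->src2.val));
			break;
		/* case 2/4 and, on 64-bit, case 8 look the same modulo size */
		}

		ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
		return !ok ? emulate_de(ctxt) : X86EMUL_CONTINUE;
	}

A plain C function with ordinary inline asm; called through the regular
'execute' member it gets the normal kCFI treatment.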

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |   71 ++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 58 insertions(+), 13 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -267,11 +267,56 @@ static void invalidate_registers(struct
 		     X86_EFLAGS_PF|X86_EFLAGS_CF)
 
 #ifdef CONFIG_X86_64
-#define ON64(x) x
+#define ON64(x...) x
 #else
-#define ON64(x)
+#define ON64(x...)
 #endif
 
+#define EM_ASM_START(op) \
+static int em_##op(struct x86_emulate_ctxt *ctxt) \
+{ \
+	unsigned long flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF; \
+	int bytes = 1, ok = 1; \
+	if (!(ctxt->d & ByteOp)) \
+		bytes = ctxt->dst.bytes; \
+	switch (bytes) {
+
+#define __EM_ASM(str) \
+		asm("push %[flags]; popf \n\t" \
+		    "10: " str \
+		    "pushf; pop %[flags] \n\t" \
+		    "11: \n\t" \
+		    : "+a" (ctxt->dst.val), \
+		      "+d" (ctxt->src.val), \
+		      [flags] "+D" (flags), \
+		      "+S" (ok) \
+		    : "c" (ctxt->src2.val))
+
+#define __EM_ASM_1(op, dst) \
+		__EM_ASM(#op " %%" #dst " \n\t")
+
+#define __EM_ASM_1_EX(op, dst) \
+		__EM_ASM(#op " %%" #dst " \n\t" \
+			 _ASM_EXTABLE_TYPE_REG(10b, 11f, EX_TYPE_ZERO_REG, %%esi))
+
+#define __EM_ASM_2(op, dst, src) \
+		__EM_ASM(#op " %%" #src ", %%" #dst " \n\t")
+
+#define EM_ASM_END \
+	} \
+	ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK); \
+	return !ok ? emulate_de(ctxt) : X86EMUL_CONTINUE; \
+}
+
+/* 1-operand, using "a" (dst) */
+#define EM_ASM_1(op) \
+	EM_ASM_START(op) \
+	case 1: __EM_ASM_1(op##b, al); break; \
+	case 2: __EM_ASM_1(op##w, ax); break; \
+	case 4: __EM_ASM_1(op##l, eax); break; \
+	ON64(case 8: __EM_ASM_1(op##q, rax); break;) \
+	EM_ASM_END
+
 /*
  * fastop functions have a special calling convention:
  *
@@ -1002,10 +1047,10 @@ FASTOP3WCL(shrd);
 
 FASTOP2W(imul);
 
-FASTOP1(not);
-FASTOP1(neg);
-FASTOP1(inc);
-FASTOP1(dec);
+EM_ASM_1(not);
+EM_ASM_1(neg);
+EM_ASM_1(inc);
+EM_ASM_1(dec);
 
 FASTOP2CL(rol);
 FASTOP2CL(ror);
@@ -4021,8 +4066,8 @@ static const struct opcode group2[] = {
 static const struct opcode group3[] = {
 	F(DstMem | SrcImm | NoWrite, em_test),
 	F(DstMem | SrcImm | NoWrite, em_test),
-	F(DstMem | SrcNone | Lock, em_not),
-	F(DstMem | SrcNone | Lock, em_neg),
+	I(DstMem | SrcNone | Lock, em_not),
+	I(DstMem | SrcNone | Lock, em_neg),
 	F(DstXacc | Src2Mem, em_mul_ex),
 	F(DstXacc | Src2Mem, em_imul_ex),
 	F(DstXacc | Src2Mem, em_div_ex),
@@ -4030,14 +4075,14 @@ static const struct opcode group3[] = {
 };
 
 static const struct opcode group4[] = {
-	F(ByteOp | DstMem | SrcNone | Lock, em_inc),
-	F(ByteOp | DstMem | SrcNone | Lock, em_dec),
+	I(ByteOp | DstMem | SrcNone | Lock, em_inc),
+	I(ByteOp | DstMem | SrcNone | Lock, em_dec),
 	N, N, N, N, N, N,
 };
 
 static const struct opcode group5[] = {
-	F(DstMem | SrcNone | Lock,		em_inc),
-	F(DstMem | SrcNone | Lock,		em_dec),
+	I(DstMem | SrcNone | Lock,		em_inc),
+	I(DstMem | SrcNone | Lock,		em_dec),
 	I(SrcMem | NearBranch | IsBranch,       em_call_near_abs),
 	I(SrcMemFAddr | ImplicitOps | IsBranch, em_call_far),
 	I(SrcMem | NearBranch | IsBranch,       em_jmp_abs),
@@ -4237,7 +4282,7 @@ static const struct opcode opcode_table[
 	/* 0x38 - 0x3F */
 	F6ALU(NoWrite, em_cmp), N, N,
 	/* 0x40 - 0x4F */
-	X8(F(DstReg, em_inc)), X8(F(DstReg, em_dec)),
+	X8(I(DstReg, em_inc)), X8(I(DstReg, em_dec)),
 	/* 0x50 - 0x57 */
 	X8(I(SrcReg | Stack, em_push)),
 	/* 0x58 - 0x5F */




* [PATCH v3 03/16] x86/kvm/emulate: Introduce EM_ASM_2
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 01/16] x86/kvm/emulate: Implement test_cc() in C Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 02/16] x86/kvm/emulate: Introduce EM_ASM_1 Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 04/16] x86/kvm/emulate: Introduce EM_ASM_2R Peter Zijlstra
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

Replace the FASTOP2 instructions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |   85 +++++++++++++++++++++++++++----------------------
 1 file changed, 47 insertions(+), 38 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -317,6 +317,15 @@ static int em_##op(struct x86_emulate_ct
 	ON64(case 8: __EM_ASM_1(op##q, rax); break;) \
 	EM_ASM_END
 
+/* 2-operand, using "a" (dst), "d" (src) */
+#define EM_ASM_2(op) \
+	EM_ASM_START(op) \
+	case 1: __EM_ASM_2(op##b, al, dl); break; \
+	case 2: __EM_ASM_2(op##w, ax, dx); break; \
+	case 4: __EM_ASM_2(op##l, eax, edx); break; \
+	ON64(case 8: __EM_ASM_2(op##q, rax, rdx); break;) \
+	EM_ASM_END
+
 /*
  * fastop functions have a special calling convention:
  *
@@ -1027,15 +1036,16 @@ static int read_descriptor(struct x86_em
 	return rc;
 }
 
-FASTOP2(add);
-FASTOP2(or);
-FASTOP2(adc);
-FASTOP2(sbb);
-FASTOP2(and);
-FASTOP2(sub);
-FASTOP2(xor);
-FASTOP2(cmp);
-FASTOP2(test);
+EM_ASM_2(add);
+EM_ASM_2(or);
+EM_ASM_2(adc);
+EM_ASM_2(sbb);
+EM_ASM_2(and);
+EM_ASM_2(sub);
+EM_ASM_2(xor);
+EM_ASM_2(cmp);
+EM_ASM_2(test);
+EM_ASM_2(xadd);
 
 FASTOP1SRC2(mul, mul_ex);
 FASTOP1SRC2(imul, imul_ex);
@@ -1067,7 +1077,6 @@ FASTOP2W(bts);
 FASTOP2W(btr);
 FASTOP2W(btc);
 
-FASTOP2(xadd);
 
 FASTOP2R(cmp, cmp_r);
 
@@ -2304,7 +2313,7 @@ static int em_cmpxchg(struct x86_emulate
 	ctxt->dst.val = reg_read(ctxt, VCPU_REGS_RAX);
 	ctxt->src.orig_val = ctxt->src.val;
 	ctxt->src.val = ctxt->dst.orig_val;
-	fastop(ctxt, em_cmp);
+	em_cmp(ctxt);
 
 	if (ctxt->eflags & X86_EFLAGS_ZF) {
 		/* Success: write back to memory; no update of EAX */
@@ -3069,7 +3078,7 @@ static int em_das(struct x86_emulate_ctx
 	ctxt->src.type = OP_IMM;
 	ctxt->src.val = 0;
 	ctxt->src.bytes = 1;
-	fastop(ctxt, em_or);
+	em_or(ctxt);
 	ctxt->eflags &= ~(X86_EFLAGS_AF | X86_EFLAGS_CF);
 	if (cf)
 		ctxt->eflags |= X86_EFLAGS_CF;
@@ -3095,7 +3104,7 @@ static int em_aam(struct x86_emulate_ctx
 	ctxt->src.type = OP_IMM;
 	ctxt->src.val = 0;
 	ctxt->src.bytes = 1;
-	fastop(ctxt, em_or);
+	em_or(ctxt);
 
 	return X86EMUL_CONTINUE;
 }
@@ -3113,7 +3122,7 @@ static int em_aad(struct x86_emulate_ctx
 	ctxt->src.type = OP_IMM;
 	ctxt->src.val = 0;
 	ctxt->src.bytes = 1;
-	fastop(ctxt, em_or);
+	em_or(ctxt);
 
 	return X86EMUL_CONTINUE;
 }
@@ -3998,9 +4007,9 @@ static int check_perm_out(struct x86_emu
 #define I2bvIP(_f, _e, _i, _p) \
 	IIP((_f) | ByteOp, _e, _i, _p), IIP(_f, _e, _i, _p)
 
-#define F6ALU(_f, _e) F2bv((_f) | DstMem | SrcReg | ModRM, _e),		\
-		F2bv(((_f) | DstReg | SrcMem | ModRM) & ~Lock, _e),	\
-		F2bv(((_f) & ~Lock) | DstAcc | SrcImm, _e)
+#define I6ALU(_f, _e) I2bv((_f) | DstMem | SrcReg | ModRM, _e),		\
+		I2bv(((_f) | DstReg | SrcMem | ModRM) & ~Lock, _e),	\
+		I2bv(((_f) & ~Lock) | DstAcc | SrcImm, _e)
 
 static const struct opcode group7_rm0[] = {
 	N,
@@ -4038,14 +4047,14 @@ static const struct opcode group7_rm7[]
 };
 
 static const struct opcode group1[] = {
-	F(Lock, em_add),
-	F(Lock | PageTable, em_or),
-	F(Lock, em_adc),
-	F(Lock, em_sbb),
-	F(Lock | PageTable, em_and),
-	F(Lock, em_sub),
-	F(Lock, em_xor),
-	F(NoWrite, em_cmp),
+	I(Lock, em_add),
+	I(Lock | PageTable, em_or),
+	I(Lock, em_adc),
+	I(Lock, em_sbb),
+	I(Lock | PageTable, em_and),
+	I(Lock, em_sub),
+	I(Lock, em_xor),
+	I(NoWrite, em_cmp),
 };
 
 static const struct opcode group1A[] = {
@@ -4064,8 +4073,8 @@ static const struct opcode group2[] = {
 };
 
 static const struct opcode group3[] = {
-	F(DstMem | SrcImm | NoWrite, em_test),
-	F(DstMem | SrcImm | NoWrite, em_test),
+	I(DstMem | SrcImm | NoWrite, em_test),
+	I(DstMem | SrcImm | NoWrite, em_test),
 	I(DstMem | SrcNone | Lock, em_not),
 	I(DstMem | SrcNone | Lock, em_neg),
 	F(DstXacc | Src2Mem, em_mul_ex),
@@ -4258,29 +4267,29 @@ static const struct instr_dual instr_dua
 
 static const struct opcode opcode_table[256] = {
 	/* 0x00 - 0x07 */
-	F6ALU(Lock, em_add),
+	I6ALU(Lock, em_add),
 	I(ImplicitOps | Stack | No64 | Src2ES, em_push_sreg),
 	I(ImplicitOps | Stack | No64 | Src2ES, em_pop_sreg),
 	/* 0x08 - 0x0F */
-	F6ALU(Lock | PageTable, em_or),
+	I6ALU(Lock | PageTable, em_or),
 	I(ImplicitOps | Stack | No64 | Src2CS, em_push_sreg),
 	N,
 	/* 0x10 - 0x17 */
-	F6ALU(Lock, em_adc),
+	I6ALU(Lock, em_adc),
 	I(ImplicitOps | Stack | No64 | Src2SS, em_push_sreg),
 	I(ImplicitOps | Stack | No64 | Src2SS, em_pop_sreg),
 	/* 0x18 - 0x1F */
-	F6ALU(Lock, em_sbb),
+	I6ALU(Lock, em_sbb),
 	I(ImplicitOps | Stack | No64 | Src2DS, em_push_sreg),
 	I(ImplicitOps | Stack | No64 | Src2DS, em_pop_sreg),
 	/* 0x20 - 0x27 */
-	F6ALU(Lock | PageTable, em_and), N, N,
+	I6ALU(Lock | PageTable, em_and), N, N,
 	/* 0x28 - 0x2F */
-	F6ALU(Lock, em_sub), N, I(ByteOp | DstAcc | No64, em_das),
+	I6ALU(Lock, em_sub), N, I(ByteOp | DstAcc | No64, em_das),
 	/* 0x30 - 0x37 */
-	F6ALU(Lock, em_xor), N, N,
+	I6ALU(Lock, em_xor), N, N,
 	/* 0x38 - 0x3F */
-	F6ALU(NoWrite, em_cmp), N, N,
+	I6ALU(NoWrite, em_cmp), N, N,
 	/* 0x40 - 0x4F */
 	X8(I(DstReg, em_inc)), X8(I(DstReg, em_dec)),
 	/* 0x50 - 0x57 */
@@ -4306,7 +4315,7 @@ static const struct opcode opcode_table[
 	G(DstMem | SrcImm, group1),
 	G(ByteOp | DstMem | SrcImm | No64, group1),
 	G(DstMem | SrcImmByte, group1),
-	F2bv(DstMem | SrcReg | ModRM | NoWrite, em_test),
+	I2bv(DstMem | SrcReg | ModRM | NoWrite, em_test),
 	I2bv(DstMem | SrcReg | ModRM | Lock | PageTable, em_xchg),
 	/* 0x88 - 0x8F */
 	I2bv(DstMem | SrcReg | ModRM | Mov | PageTable, em_mov),
@@ -4329,7 +4338,7 @@ static const struct opcode opcode_table[
 	I2bv(SrcSI | DstDI | Mov | String | TwoMemOp, em_mov),
 	F2bv(SrcSI | DstDI | String | NoWrite | TwoMemOp, em_cmp_r),
 	/* 0xA8 - 0xAF */
-	F2bv(DstAcc | SrcImm | NoWrite, em_test),
+	I2bv(DstAcc | SrcImm | NoWrite, em_test),
 	I2bv(SrcAcc | DstDI | Mov | String, em_mov),
 	I2bv(SrcSI | DstAcc | Mov | String, em_mov),
 	F2bv(SrcAcc | DstDI | String | NoWrite, em_cmp_r),
@@ -4467,7 +4476,7 @@ static const struct opcode twobyte_table
 	I(DstReg | SrcMem | ModRM, em_bsr_c),
 	D(DstReg | SrcMem8 | ModRM | Mov), D(DstReg | SrcMem16 | ModRM | Mov),
 	/* 0xC0 - 0xC7 */
-	F2bv(DstMem | SrcReg | ModRM | SrcWrite | Lock, em_xadd),
+	I2bv(DstMem | SrcReg | ModRM | SrcWrite | Lock, em_xadd),
 	N, ID(0, &instr_dual_0f_c3),
 	N, N, N, GD(0, &group9),
 	/* 0xC8 - 0xCF */




* [PATCH v3 04/16] x86/kvm/emulate: Introduce EM_ASM_2R
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (2 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 03/16] x86/kvm/emulate: Introduce EM_ASM_2 Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 05/16] x86/kvm/emulate: Introduce EM_ASM_2W Peter Zijlstra
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

Replace the FASTOP2R instruction (the reversed-operand compare backing the
CMPS/SCAS string instructions).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -326,6 +326,15 @@ static int em_##op(struct x86_emulate_ct
 	ON64(case 8: __EM_ASM_2(op##q, rax, rdx); break;) \
 	EM_ASM_END
 
+/* 2-operand, reversed */
+#define EM_ASM_2R(op, name) \
+	EM_ASM_START(name) \
+	case 1: __EM_ASM_2(op##b, dl, al); break; \
+	case 2: __EM_ASM_2(op##w, dx, ax); break; \
+	case 4: __EM_ASM_2(op##l, edx, eax); break; \
+	ON64(case 8: __EM_ASM_2(op##q, rdx, rax); break;) \
+	EM_ASM_END
+
 /*
  * fastop functions have a special calling convention:
  *
@@ -1077,8 +1086,7 @@ FASTOP2W(bts);
 FASTOP2W(btr);
 FASTOP2W(btc);
 
-
-FASTOP2R(cmp, cmp_r);
+EM_ASM_2R(cmp, cmp_r);
 
 static int em_bsf_c(struct x86_emulate_ctxt *ctxt)
 {
@@ -4336,12 +4344,12 @@ static const struct opcode opcode_table[
 	I2bv(DstAcc | SrcMem | Mov | MemAbs, em_mov),
 	I2bv(DstMem | SrcAcc | Mov | MemAbs | PageTable, em_mov),
 	I2bv(SrcSI | DstDI | Mov | String | TwoMemOp, em_mov),
-	F2bv(SrcSI | DstDI | String | NoWrite | TwoMemOp, em_cmp_r),
+	I2bv(SrcSI | DstDI | String | NoWrite | TwoMemOp, em_cmp_r),
 	/* 0xA8 - 0xAF */
 	I2bv(DstAcc | SrcImm | NoWrite, em_test),
 	I2bv(SrcAcc | DstDI | Mov | String, em_mov),
 	I2bv(SrcSI | DstAcc | Mov | String, em_mov),
-	F2bv(SrcAcc | DstDI | String | NoWrite, em_cmp_r),
+	I2bv(SrcAcc | DstDI | String | NoWrite, em_cmp_r),
 	/* 0xB0 - 0xB7 */
 	X8(I(ByteOp | DstReg | SrcImm | Mov, em_mov)),
 	/* 0xB8 - 0xBF */




* [PATCH v3 05/16] x86/kvm/emulate: Introduce EM_ASM_2W
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (3 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 04/16] x86/kvm/emulate: Introduce EM_ASM_2R Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 06/16] x86/kvm/emulate: Introduce EM_ASM_2CL Peter Zijlstra
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

Replace the FASTOP2W instructions. These are word-only; the byte-sized
case intentionally stays a no-op, mirroring the old FOPNOP() slot.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |   47 ++++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 19 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -335,6 +335,15 @@ static int em_##op(struct x86_emulate_ct
 	ON64(case 8: __EM_ASM_2(op##q, rdx, rax); break;) \
 	EM_ASM_END
 
+/* 2-operand, word only (no byte op) */
+#define EM_ASM_2W(op) \
+	EM_ASM_START(op) \
+	case 1: break; \
+	case 2: __EM_ASM_2(op##w, ax, dx); break; \
+	case 4: __EM_ASM_2(op##l, eax, edx); break; \
+	ON64(case 8: __EM_ASM_2(op##q, rax, rdx); break;) \
+	EM_ASM_END
+
 /*
  * fastop functions have a special calling convention:
  *
@@ -1064,7 +1073,7 @@ FASTOP1SRC2EX(idiv, idiv_ex);
 FASTOP3WCL(shld);
 FASTOP3WCL(shrd);
 
-FASTOP2W(imul);
+EM_ASM_2W(imul);
 
 EM_ASM_1(not);
 EM_ASM_1(neg);
@@ -1079,12 +1088,12 @@ FASTOP2CL(shl);
 FASTOP2CL(shr);
 FASTOP2CL(sar);
 
-FASTOP2W(bsf);
-FASTOP2W(bsr);
-FASTOP2W(bt);
-FASTOP2W(bts);
-FASTOP2W(btr);
-FASTOP2W(btc);
+EM_ASM_2W(bsf);
+EM_ASM_2W(bsr);
+EM_ASM_2W(bt);
+EM_ASM_2W(bts);
+EM_ASM_2W(btr);
+EM_ASM_2W(btc);
 
 EM_ASM_2R(cmp, cmp_r);
 
@@ -1093,7 +1102,7 @@ static int em_bsf_c(struct x86_emulate_c
 	/* If src is zero, do not writeback, but update flags */
 	if (ctxt->src.val == 0)
 		ctxt->dst.type = OP_NONE;
-	return fastop(ctxt, em_bsf);
+	return em_bsf(ctxt);
 }
 
 static int em_bsr_c(struct x86_emulate_ctxt *ctxt)
@@ -1101,7 +1110,7 @@ static int em_bsr_c(struct x86_emulate_c
 	/* If src is zero, do not writeback, but update flags */
 	if (ctxt->src.val == 0)
 		ctxt->dst.type = OP_NONE;
-	return fastop(ctxt, em_bsr);
+	return em_bsr(ctxt);
 }
 
 static __always_inline u8 test_cc(unsigned int condition, unsigned long flags)
@@ -3221,7 +3230,7 @@ static int em_xchg(struct x86_emulate_ct
 static int em_imul_3op(struct x86_emulate_ctxt *ctxt)
 {
 	ctxt->dst.val = ctxt->src2.val;
-	return fastop(ctxt, em_imul);
+	return em_imul(ctxt);
 }
 
 static int em_cwd(struct x86_emulate_ctxt *ctxt)
@@ -4135,10 +4144,10 @@ static const struct group_dual group7 =
 
 static const struct opcode group8[] = {
 	N, N, N, N,
-	F(DstMem | SrcImmByte | NoWrite,		em_bt),
-	F(DstMem | SrcImmByte | Lock | PageTable,	em_bts),
-	F(DstMem | SrcImmByte | Lock,			em_btr),
-	F(DstMem | SrcImmByte | Lock | PageTable,	em_btc),
+	I(DstMem | SrcImmByte | NoWrite,		em_bt),
+	I(DstMem | SrcImmByte | Lock | PageTable,	em_bts),
+	I(DstMem | SrcImmByte | Lock,			em_btr),
+	I(DstMem | SrcImmByte | Lock | PageTable,	em_btc),
 };
 
 /*
@@ -4459,27 +4468,27 @@ static const struct opcode twobyte_table
 	/* 0xA0 - 0xA7 */
 	I(Stack | Src2FS, em_push_sreg), I(Stack | Src2FS, em_pop_sreg),
 	II(ImplicitOps, em_cpuid, cpuid),
-	F(DstMem | SrcReg | ModRM | BitOp | NoWrite, em_bt),
+	I(DstMem | SrcReg | ModRM | BitOp | NoWrite, em_bt),
 	F(DstMem | SrcReg | Src2ImmByte | ModRM, em_shld),
 	F(DstMem | SrcReg | Src2CL | ModRM, em_shld), N, N,
 	/* 0xA8 - 0xAF */
 	I(Stack | Src2GS, em_push_sreg), I(Stack | Src2GS, em_pop_sreg),
 	II(EmulateOnUD | ImplicitOps, em_rsm, rsm),
-	F(DstMem | SrcReg | ModRM | BitOp | Lock | PageTable, em_bts),
+	I(DstMem | SrcReg | ModRM | BitOp | Lock | PageTable, em_bts),
 	F(DstMem | SrcReg | Src2ImmByte | ModRM, em_shrd),
 	F(DstMem | SrcReg | Src2CL | ModRM, em_shrd),
-	GD(0, &group15), F(DstReg | SrcMem | ModRM, em_imul),
+	GD(0, &group15), I(DstReg | SrcMem | ModRM, em_imul),
 	/* 0xB0 - 0xB7 */
 	I2bv(DstMem | SrcReg | ModRM | Lock | PageTable | SrcWrite, em_cmpxchg),
 	I(DstReg | SrcMemFAddr | ModRM | Src2SS, em_lseg),
-	F(DstMem | SrcReg | ModRM | BitOp | Lock, em_btr),
+	I(DstMem | SrcReg | ModRM | BitOp | Lock, em_btr),
 	I(DstReg | SrcMemFAddr | ModRM | Src2FS, em_lseg),
 	I(DstReg | SrcMemFAddr | ModRM | Src2GS, em_lseg),
 	D(DstReg | SrcMem8 | ModRM | Mov), D(DstReg | SrcMem16 | ModRM | Mov),
 	/* 0xB8 - 0xBF */
 	N, N,
 	G(BitOp, group8),
-	F(DstMem | SrcReg | ModRM | BitOp | Lock | PageTable, em_btc),
+	I(DstMem | SrcReg | ModRM | BitOp | Lock | PageTable, em_btc),
 	I(DstReg | SrcMem | ModRM, em_bsf_c),
 	I(DstReg | SrcMem | ModRM, em_bsr_c),
 	D(DstReg | SrcMem8 | ModRM | Mov), D(DstReg | SrcMem16 | ModRM | Mov),




* [PATCH v3 06/16] x86/kvm/emulate: Introduce EM_ASM_2CL
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (4 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 05/16] x86/kvm/emulate: Introduce EM_ASM_2W Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 07/16] x86/kvm/emulate: Introduce EM_ASM_1SRC2 Peter Zijlstra
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

Replace the FASTOP2CL instructions (the shifts and rotates that take
their count in CL).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |   39 ++++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -344,6 +344,15 @@ static int em_##op(struct x86_emulate_ct
 	ON64(case 8: __EM_ASM_2(op##q, rax, rdx); break;) \
 	EM_ASM_END
 
+/* 2-operand, using "a" (dst) and CL (src2) */
+#define EM_ASM_2CL(op) \
+	EM_ASM_START(op) \
+	case 1: __EM_ASM_2(op##b, al, cl); break; \
+	case 2: __EM_ASM_2(op##w, ax, cl); break; \
+	case 4: __EM_ASM_2(op##l, eax, cl); break; \
+	ON64(case 8: __EM_ASM_2(op##q, rax, cl); break;) \
+	EM_ASM_END
+
 /*
  * fastop functions have a special calling convention:
  *
@@ -1080,13 +1089,13 @@ EM_ASM_1(neg);
 EM_ASM_1(inc);
 EM_ASM_1(dec);
 
-FASTOP2CL(rol);
-FASTOP2CL(ror);
-FASTOP2CL(rcl);
-FASTOP2CL(rcr);
-FASTOP2CL(shl);
-FASTOP2CL(shr);
-FASTOP2CL(sar);
+EM_ASM_2CL(rol);
+EM_ASM_2CL(ror);
+EM_ASM_2CL(rcl);
+EM_ASM_2CL(rcr);
+EM_ASM_2CL(shl);
+EM_ASM_2CL(shr);
+EM_ASM_2CL(sar);
 
 EM_ASM_2W(bsf);
 EM_ASM_2W(bsr);
@@ -4079,14 +4088,14 @@ static const struct opcode group1A[] = {
 };
 
 static const struct opcode group2[] = {
-	F(DstMem | ModRM, em_rol),
-	F(DstMem | ModRM, em_ror),
-	F(DstMem | ModRM, em_rcl),
-	F(DstMem | ModRM, em_rcr),
-	F(DstMem | ModRM, em_shl),
-	F(DstMem | ModRM, em_shr),
-	F(DstMem | ModRM, em_shl),
-	F(DstMem | ModRM, em_sar),
+	I(DstMem | ModRM, em_rol),
+	I(DstMem | ModRM, em_ror),
+	I(DstMem | ModRM, em_rcl),
+	I(DstMem | ModRM, em_rcr),
+	I(DstMem | ModRM, em_shl),
+	I(DstMem | ModRM, em_shr),
+	I(DstMem | ModRM, em_shl),
+	I(DstMem | ModRM, em_sar),
 };
 
 static const struct opcode group3[] = {




* [PATCH v3 07/16] x86/kvm/emulate: Introduce EM_ASM_1SRC2
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (5 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 06/16] x86/kvm/emulate: Introduce EM_ASM_2CL Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-24  0:16   ` Sean Christopherson
  2025-07-14 10:20 ` [PATCH v3 08/16] x86/kvm/emulate: Introduce EM_ASM_3WCL Peter Zijlstra
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

Replace the FASTOP1SRC2*() instructions.
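
The _EX variants exist for DIV/IDIV, where the emulated instruction itself
can fault. A rough sketch of the control flow (annotating the macros
below; EX_TYPE_ZERO_REG is the existing extable fixup type that zeroes the
named register and resumes execution):

	/*
	 * 10:	divb %cl	<- #DE can fault here
	 *	...		   extable fixup: zero %esi (the 'ok'
	 *			   operand) and resume at label 11
	 * 11:
	 *
	 * EM_ASM_END then maps !ok to emulate_de(ctxt), injecting the
	 * divide error into the guest instead of crashing the host.
	 */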

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |   34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -317,6 +317,24 @@ static int em_##op(struct x86_emulate_ct
 	ON64(case 8: __EM_ASM_1(op##q, rax); break;) \
 	EM_ASM_END
 
+/* 1-operand, using "c" (src2) */
+#define EM_ASM_1SRC2(op, name) \
+	EM_ASM_START(name) \
+	case 1: __EM_ASM_1(op##b, cl); break; \
+	case 2: __EM_ASM_1(op##w, cx); break; \
+	case 4: __EM_ASM_1(op##l, ecx); break; \
+	ON64(case 8: __EM_ASM_1(op##q, rcx); break;) \
+	EM_ASM_END
+
+/* 1-operand, using "c" (src2) with exception */
+#define EM_ASM_1SRC2EX(op, name) \
+	EM_ASM_START(name) \
+	case 1: __EM_ASM_1_EX(op##b, cl); break; \
+	case 2: __EM_ASM_1_EX(op##w, cx); break; \
+	case 4: __EM_ASM_1_EX(op##l, ecx); break; \
+	ON64(case 8: __EM_ASM_1_EX(op##q, rcx); break;) \
+	EM_ASM_END
+
 /* 2-operand, using "a" (dst), "d" (src) */
 #define EM_ASM_2(op) \
 	EM_ASM_START(op) \
@@ -1074,10 +1092,10 @@ EM_ASM_2(cmp);
 EM_ASM_2(test);
 EM_ASM_2(xadd);
 
-FASTOP1SRC2(mul, mul_ex);
-FASTOP1SRC2(imul, imul_ex);
-FASTOP1SRC2EX(div, div_ex);
-FASTOP1SRC2EX(idiv, idiv_ex);
+EM_ASM_1SRC2(mul, mul_ex);
+EM_ASM_1SRC2(imul, imul_ex);
+EM_ASM_1SRC2EX(div, div_ex);
+EM_ASM_1SRC2EX(idiv, idiv_ex);
 
 FASTOP3WCL(shld);
 FASTOP3WCL(shrd);
@@ -4103,10 +4121,10 @@ static const struct opcode group3[] = {
 	I(DstMem | SrcImm | NoWrite, em_test),
 	I(DstMem | SrcNone | Lock, em_not),
 	I(DstMem | SrcNone | Lock, em_neg),
-	F(DstXacc | Src2Mem, em_mul_ex),
-	F(DstXacc | Src2Mem, em_imul_ex),
-	F(DstXacc | Src2Mem, em_div_ex),
-	F(DstXacc | Src2Mem, em_idiv_ex),
+	I(DstXacc | Src2Mem, em_mul_ex),
+	I(DstXacc | Src2Mem, em_imul_ex),
+	I(DstXacc | Src2Mem, em_div_ex),
+	I(DstXacc | Src2Mem, em_idiv_ex),
 };
 
 static const struct opcode group4[] = {




* [PATCH v3 08/16] x86/kvm/emulate: Introduce EM_ASM_3WCL
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (6 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 07/16] x86/kvm/emulate: Introduce EM_ASM_1SRC2 Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 09/16] x86/kvm/emulate: Convert em_salc() to C Peter Zijlstra
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

Replace the FASTOP3WCL instructions (the double-precision shifts SHLD and
SHRD, which take a destination, a source and a shift count in CL, and have
no byte-sized form).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |   25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -302,6 +302,9 @@ static int em_##op(struct x86_emulate_ct
 #define __EM_ASM_2(op, dst, src) \
 		__EM_ASM(#op " %%" #src ", %%" #dst " \n\t")
 
+#define __EM_ASM_3(op, dst, src, src2) \
+		__EM_ASM(#op " %%" #src2 ", %%" #src ", %%" #dst " \n\t")
+
 #define EM_ASM_END \
 	} \
 	ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK); \
@@ -371,6 +374,16 @@ static int em_##op(struct x86_emulate_ct
 	ON64(case 8: __EM_ASM_2(op##q, rax, cl); break;) \
 	EM_ASM_END
 
+/* 3-operand, using "a" (dst), "d" (src) and CL (src2) */
+#define EM_ASM_3WCL(op) \
+	EM_ASM_START(op) \
+	case 1: break; \
+	case 2: __EM_ASM_3(op##w, ax, dx, cl); break; \
+	case 4: __EM_ASM_3(op##l, eax, edx, cl); break; \
+	ON64(case 8: __EM_ASM_3(op##q, rax, rdx, cl); break;) \
+	EM_ASM_END
+
+
 /*
  * fastop functions have a special calling convention:
  *
@@ -1097,8 +1110,8 @@ EM_ASM_1SRC2(imul, imul_ex);
 EM_ASM_1SRC2EX(div, div_ex);
 EM_ASM_1SRC2EX(idiv, idiv_ex);
 
-FASTOP3WCL(shld);
-FASTOP3WCL(shrd);
+EM_ASM_3WCL(shld);
+EM_ASM_3WCL(shrd);
 
 EM_ASM_2W(imul);
 
@@ -4496,14 +4509,14 @@ static const struct opcode twobyte_table
 	I(Stack | Src2FS, em_push_sreg), I(Stack | Src2FS, em_pop_sreg),
 	II(ImplicitOps, em_cpuid, cpuid),
 	I(DstMem | SrcReg | ModRM | BitOp | NoWrite, em_bt),
-	F(DstMem | SrcReg | Src2ImmByte | ModRM, em_shld),
-	F(DstMem | SrcReg | Src2CL | ModRM, em_shld), N, N,
+	I(DstMem | SrcReg | Src2ImmByte | ModRM, em_shld),
+	I(DstMem | SrcReg | Src2CL | ModRM, em_shld), N, N,
 	/* 0xA8 - 0xAF */
 	I(Stack | Src2GS, em_push_sreg), I(Stack | Src2GS, em_pop_sreg),
 	II(EmulateOnUD | ImplicitOps, em_rsm, rsm),
 	I(DstMem | SrcReg | ModRM | BitOp | Lock | PageTable, em_bts),
-	F(DstMem | SrcReg | Src2ImmByte | ModRM, em_shrd),
-	F(DstMem | SrcReg | Src2CL | ModRM, em_shrd),
+	I(DstMem | SrcReg | Src2ImmByte | ModRM, em_shrd),
+	I(DstMem | SrcReg | Src2CL | ModRM, em_shrd),
 	GD(0, &group15), I(DstReg | SrcMem | ModRM, em_imul),
 	/* 0xB0 - 0xB7 */
 	I2bv(DstMem | SrcReg | ModRM | Lock | PageTable | SrcWrite, em_cmpxchg),




* [PATCH v3 09/16] x86/kvm/emulate: Convert em_salc() to C
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (7 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 08/16] x86/kvm/emulate: Introduce EM_ASM_3WCL Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 10/16] x86/kvm/emulate: Remove fastops Peter Zijlstra
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

Implement the SALC (Set AL if Carry) instruction in C.
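
For reference: the old asm's `sbb %al, %al' computes AL = AL - AL - CF =
-CF, i.e. 0xFF when CF is set and 0x00 when clear. The C replacement
spells the same thing as a multiply:

	/* -CF as a byte: 0xFF * 1 == 0xFF, 0xFF * 0 == 0x00 */
	ctxt->dst.val = 0xFF * !!(ctxt->eflags & X86_EFLAGS_CF);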

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -529,11 +529,14 @@ static int fastop(struct x86_emulate_ctx
 	ON64(FOP3E(op##q, rax, rdx, cl)) \
 	FOP_END
 
-FOP_START(salc)
-FOP_FUNC(salc)
-"pushf; sbb %al, %al; popf \n\t"
-FOP_RET(salc)
-FOP_END;
+static int em_salc(struct x86_emulate_ctxt *ctxt)
+{
+	/*
+	 * Set AL to 0xFF if CF is set, or to 0x00 when clear.
+	 */
+	ctxt->dst.val = 0xFF * !!(ctxt->eflags & X86_EFLAGS_CF);
+	return X86EMUL_CONTINUE;
+}
 
 /*
  * XXX: inoutclob user must know where the argument is being expanded.
@@ -4423,7 +4426,7 @@ static const struct opcode opcode_table[
 	G(Src2CL | ByteOp, group2), G(Src2CL, group2),
 	I(DstAcc | SrcImmUByte | No64, em_aam),
 	I(DstAcc | SrcImmUByte | No64, em_aad),
-	F(DstAcc | ByteOp | No64, em_salc),
+	I(DstAcc | ByteOp | No64, em_salc),
 	I(DstAcc | SrcXLat | ByteOp, em_mov),
 	/* 0xD8 - 0xDF */
 	N, E(0, &escape_d9), N, E(0, &escape_db), N, E(0, &escape_dd), N, N,




* [PATCH v3 10/16] x86/kvm/emulate: Remove fastops
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (8 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 09/16] x86/kvm/emulate: Convert em_salc() to C Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 11/16] x86,hyperv: Clean up hv_do_hypercall() Peter Zijlstra
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

No more FASTOPs, remove the remains.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |  172 -------------------------------------------------
 1 file changed, 1 insertion(+), 171 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -167,7 +167,6 @@
 #define Unaligned   ((u64)2 << 41)  /* Explicitly unaligned (e.g. MOVDQU) */
 #define Avx         ((u64)3 << 41)  /* Advanced Vector Extensions */
 #define Aligned16   ((u64)4 << 41)  /* Aligned to 16 byte boundary (e.g. FXSAVE) */
-#define Fastop      ((u64)1 << 44)  /* Use opcode::u.fastop */
 #define NoWrite     ((u64)1 << 45)  /* No writeback */
 #define SrcWrite    ((u64)1 << 46)  /* Write back src operand */
 #define NoMod	    ((u64)1 << 47)  /* Mod field is ignored */
@@ -203,7 +202,6 @@ struct opcode {
 		const struct escape *esc;
 		const struct instr_dual *idual;
 		const struct mode_dual *mdual;
-		void (*fastop)(struct fastop *fake);
 	} u;
 	int (*check_perm)(struct x86_emulate_ctxt *ctxt);
 };
@@ -383,152 +381,6 @@ static int em_##op(struct x86_emulate_ct
 	ON64(case 8: __EM_ASM_3(op##q, rax, rdx, cl); break;) \
 	EM_ASM_END
 
-
-/*
- * fastop functions have a special calling convention:
- *
- * dst:    rax        (in/out)
- * src:    rdx        (in/out)
- * src2:   rcx        (in)
- * flags:  rflags     (in/out)
- * ex:     rsi        (in:fastop pointer, out:zero if exception)
- *
- * Moreover, they are all exactly FASTOP_SIZE bytes long, so functions for
- * different operand sizes can be reached by calculation, rather than a jump
- * table (which would be bigger than the code).
- *
- * The 16 byte alignment, considering 5 bytes for the RET thunk, 3 for ENDBR
- * and 1 for the straight line speculation INT3, leaves 7 bytes for the
- * body of the function.  Currently none is larger than 4.
- */
-static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop);
-
-#define FASTOP_SIZE	16
-
-#define __FOP_FUNC(name) \
-	".align " __stringify(FASTOP_SIZE) " \n\t" \
-	".type " name ", @function \n\t" \
-	name ":\n\t" \
-	ASM_ENDBR \
-	IBT_NOSEAL(name)
-
-#define FOP_FUNC(name) \
-	__FOP_FUNC(#name)
-
-#define __FOP_RET(name) \
-	"11: " ASM_RET \
-	".size " name ", .-" name "\n\t"
-
-#define FOP_RET(name) \
-	__FOP_RET(#name)
-
-#define __FOP_START(op, align) \
-	extern void em_##op(struct fastop *fake); \
-	asm(".pushsection .text, \"ax\" \n\t" \
-	    ".global em_" #op " \n\t" \
-	    ".align " __stringify(align) " \n\t" \
-	    "em_" #op ":\n\t"
-
-#define FOP_START(op) __FOP_START(op, FASTOP_SIZE)
-
-#define FOP_END \
-	    ".popsection")
-
-#define __FOPNOP(name) \
-	__FOP_FUNC(name) \
-	__FOP_RET(name)
-
-#define FOPNOP() \
-	__FOPNOP(__stringify(__UNIQUE_ID(nop)))
-
-#define FOP1E(op,  dst) \
-	__FOP_FUNC(#op "_" #dst) \
-	"10: " #op " %" #dst " \n\t" \
-	__FOP_RET(#op "_" #dst)
-
-#define FOP1EEX(op,  dst) \
-	FOP1E(op, dst) _ASM_EXTABLE_TYPE_REG(10b, 11b, EX_TYPE_ZERO_REG, %%esi)
-
-#define FASTOP1(op) \
-	FOP_START(op) \
-	FOP1E(op##b, al) \
-	FOP1E(op##w, ax) \
-	FOP1E(op##l, eax) \
-	ON64(FOP1E(op##q, rax))	\
-	FOP_END
-
-/* 1-operand, using src2 (for MUL/DIV r/m) */
-#define FASTOP1SRC2(op, name) \
-	FOP_START(name) \
-	FOP1E(op, cl) \
-	FOP1E(op, cx) \
-	FOP1E(op, ecx) \
-	ON64(FOP1E(op, rcx)) \
-	FOP_END
-
-/* 1-operand, using src2 (for MUL/DIV r/m), with exceptions */
-#define FASTOP1SRC2EX(op, name) \
-	FOP_START(name) \
-	FOP1EEX(op, cl) \
-	FOP1EEX(op, cx) \
-	FOP1EEX(op, ecx) \
-	ON64(FOP1EEX(op, rcx)) \
-	FOP_END
-
-#define FOP2E(op,  dst, src)	   \
-	__FOP_FUNC(#op "_" #dst "_" #src) \
-	#op " %" #src ", %" #dst " \n\t" \
-	__FOP_RET(#op "_" #dst "_" #src)
-
-#define FASTOP2(op) \
-	FOP_START(op) \
-	FOP2E(op##b, al, dl) \
-	FOP2E(op##w, ax, dx) \
-	FOP2E(op##l, eax, edx) \
-	ON64(FOP2E(op##q, rax, rdx)) \
-	FOP_END
-
-/* 2 operand, word only */
-#define FASTOP2W(op) \
-	FOP_START(op) \
-	FOPNOP() \
-	FOP2E(op##w, ax, dx) \
-	FOP2E(op##l, eax, edx) \
-	ON64(FOP2E(op##q, rax, rdx)) \
-	FOP_END
-
-/* 2 operand, src is CL */
-#define FASTOP2CL(op) \
-	FOP_START(op) \
-	FOP2E(op##b, al, cl) \
-	FOP2E(op##w, ax, cl) \
-	FOP2E(op##l, eax, cl) \
-	ON64(FOP2E(op##q, rax, cl)) \
-	FOP_END
-
-/* 2 operand, src and dest are reversed */
-#define FASTOP2R(op, name) \
-	FOP_START(name) \
-	FOP2E(op##b, dl, al) \
-	FOP2E(op##w, dx, ax) \
-	FOP2E(op##l, edx, eax) \
-	ON64(FOP2E(op##q, rdx, rax)) \
-	FOP_END
-
-#define FOP3E(op,  dst, src, src2) \
-	__FOP_FUNC(#op "_" #dst "_" #src "_" #src2) \
-	#op " %" #src2 ", %" #src ", %" #dst " \n\t"\
-	__FOP_RET(#op "_" #dst "_" #src "_" #src2)
-
-/* 3-operand, word-only, src2=cl */
-#define FASTOP3WCL(op) \
-	FOP_START(op) \
-	FOPNOP() \
-	FOP3E(op##w, ax, dx, cl) \
-	FOP3E(op##l, eax, edx, cl) \
-	ON64(FOP3E(op##q, rax, rdx, cl)) \
-	FOP_END
-
 static int em_salc(struct x86_emulate_ctxt *ctxt)
 {
 	/*
@@ -4052,7 +3904,6 @@ static int check_perm_out(struct x86_emu
 #define MD(_f, _m) { .flags = ((_f) | ModeDual), .u.mdual = (_m) }
 #define E(_f, _e) { .flags = ((_f) | Escape | ModRM), .u.esc = (_e) }
 #define I(_f, _e) { .flags = (_f), .u.execute = (_e) }
-#define F(_f, _e) { .flags = (_f) | Fastop, .u.fastop = (_e) }
 #define II(_f, _e, _i) \
 	{ .flags = (_f)|Intercept, .u.execute = (_e), .intercept = x86_intercept_##_i }
 #define IIP(_f, _e, _i, _p) \
@@ -5158,24 +5009,6 @@ static void fetch_possible_mmx_operand(s
 		kvm_read_mmx_reg(op->addr.mm, &op->mm_val);
 }
 
-static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop)
-{
-	ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
-
-	if (!(ctxt->d & ByteOp))
-		fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
-
-	asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
-	    : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
-	      [thunk_target]"+S"(fop), ASM_CALL_CONSTRAINT
-	    : "c"(ctxt->src2.val));
-
-	ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
-	if (!fop) /* exception is returned in fop variable */
-		return emulate_de(ctxt);
-	return X86EMUL_CONTINUE;
-}
-
 void init_decode_cache(struct x86_emulate_ctxt *ctxt)
 {
 	/* Clear fields that are set conditionally but read without a guard. */
@@ -5340,10 +5173,7 @@ int x86_emulate_insn(struct x86_emulate_
 		ctxt->eflags &= ~X86_EFLAGS_RF;
 
 	if (ctxt->execute) {
-		if (ctxt->d & Fastop)
-			rc = fastop(ctxt, ctxt->fop);
-		else
-			rc = ctxt->execute(ctxt);
+		rc = ctxt->execute(ctxt);
 		if (rc != X86EMUL_CONTINUE)
 			goto done;
 		goto writeback;




* [PATCH v3 11/16] x86,hyperv: Clean up hv_do_hypercall()
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (9 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 10/16] x86/kvm/emulate: Remove fastops Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-15  4:54   ` Wei Liu
  2025-07-15 14:51   ` Michael Kelley
  2025-07-14 10:20 ` [PATCH v3 12/16] x86_64,hyperv: Use direct call to hypercall-page Peter Zijlstra
                   ` (5 subsequent siblings)
  16 siblings, 2 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda, Michael Kelley

What used to be a few simple instructions has turned into a giant mess
(for x86_64). Not only does it use static_branch wrong, it mixes it with
dynamic branches for no apparent reason.

Notably, it uses static_branch through an out-of-line function call, which
completely defeats the purpose: instead of a simple JMP/NOP site you get a
CALL+RET+TEST+Jcc sequence in return, which is absolutely idiotic.

Add to that a dynamic test of hyperv_paravisor_present, a variable that is
set once and never changed.

Replace all this idiocy with a single direct function call to the
right hypercall variant.
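
A minimal sketch of the resulting pattern (the names match the hunks
below; hv_std_hypercall is the default and the SNP/TDX variants get
patched in once at boot):

	DEFINE_STATIC_CALL(hv_hypercall, hv_std_hypercall);

	/* boot, after the isolation type is known: */
	static_call_update(hv_hypercall, hv_snp_hypercall);

	/* hot path -- compiles to a single direct, patchable CALL: */
	status = static_call_mod(hv_hypercall)(control, input_address,
					       output_address);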

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
---
 arch/x86/hyperv/hv_init.c       |   20 +++++
 arch/x86/hyperv/ivm.c           |   15 ++++
 arch/x86/include/asm/mshyperv.h |  137 +++++++++++-----------------------------
 arch/x86/kernel/cpu/mshyperv.c  |   19 +++--
 4 files changed, 89 insertions(+), 102 deletions(-)

--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -36,7 +36,27 @@
 #include <linux/highmem.h>
 
 void *hv_hypercall_pg;
+
+#ifdef CONFIG_X86_64
+u64 hv_std_hypercall(u64 control, u64 param1, u64 param2)
+{
+	u64 hv_status;
+
+	if (!hv_hypercall_pg)
+		return U64_MAX;
+
+	register u64 __r8 asm("r8") = param2;
+	asm volatile (CALL_NOSPEC
+		      : "=a" (hv_status), ASM_CALL_CONSTRAINT,
+		        "+c" (control), "+d" (param1), "+r" (__r8)
+		      : THUNK_TARGET(hv_hypercall_pg)
+		      : "cc", "memory", "r9", "r10", "r11");
+
+	return hv_status;
+}
+#else
 EXPORT_SYMBOL_GPL(hv_hypercall_pg);
+#endif
 
 union hv_ghcb * __percpu *hv_ghcb_pg;
 
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -377,9 +377,23 @@ int hv_snp_boot_ap(u32 cpu, unsigned lon
 	return ret;
 }
 
+u64 hv_snp_hypercall(u64 control, u64 param1, u64 param2)
+{
+	u64 hv_status;
+
+	register u64 __r8 asm("r8") = param2;
+	asm volatile("vmmcall"
+		     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
+		       "+c" (control), "+d" (param1), "+r" (__r8)
+		     : : "cc", "memory", "r9", "r10", "r11");
+
+	return hv_status;
+}
+
 #else
 static inline void hv_ghcb_msr_write(u64 msr, u64 value) {}
 static inline void hv_ghcb_msr_read(u64 msr, u64 *value) {}
+u64 hv_snp_hypercall(u64 control, u64 param1, u64 param2) { return U64_MAX; }
 #endif /* CONFIG_AMD_MEM_ENCRYPT */
 
 #ifdef CONFIG_INTEL_TDX_GUEST
@@ -429,6 +443,7 @@ u64 hv_tdx_hypercall(u64 control, u64 pa
 #else
 static inline void hv_tdx_msr_write(u64 msr, u64 value) {}
 static inline void hv_tdx_msr_read(u64 msr, u64 *value) {}
+u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2) { return U64_MAX; }
 #endif /* CONFIG_INTEL_TDX_GUEST */
 
 #if defined(CONFIG_AMD_MEM_ENCRYPT) || defined(CONFIG_INTEL_TDX_GUEST)
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -6,6 +6,7 @@
 #include <linux/nmi.h>
 #include <linux/msi.h>
 #include <linux/io.h>
+#include <linux/static_call.h>
 #include <asm/nospec-branch.h>
 #include <asm/paravirt.h>
 #include <asm/msr.h>
@@ -39,16 +40,21 @@ static inline unsigned char hv_get_nmi_r
 	return 0;
 }
 
-#if IS_ENABLED(CONFIG_HYPERV)
-extern bool hyperv_paravisor_present;
+extern u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2);
+extern u64 hv_snp_hypercall(u64 control, u64 param1, u64 param2);
+extern u64 hv_std_hypercall(u64 control, u64 param1, u64 param2);
 
+#if IS_ENABLED(CONFIG_HYPERV)
 extern void *hv_hypercall_pg;
 
 extern union hv_ghcb * __percpu *hv_ghcb_pg;
 
 bool hv_isolation_type_snp(void);
 bool hv_isolation_type_tdx(void);
-u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2);
+
+#ifdef CONFIG_X86_64
+DECLARE_STATIC_CALL(hv_hypercall, hv_std_hypercall);
+#endif
 
 /*
  * DEFAULT INIT GPAT and SEGMENT LIMIT value in struct VMSA
@@ -65,37 +71,15 @@ static inline u64 hv_do_hypercall(u64 co
 {
 	u64 input_address = input ? virt_to_phys(input) : 0;
 	u64 output_address = output ? virt_to_phys(output) : 0;
-	u64 hv_status;
 
 #ifdef CONFIG_X86_64
-	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
-		return hv_tdx_hypercall(control, input_address, output_address);
-
-	if (hv_isolation_type_snp() && !hyperv_paravisor_present) {
-		__asm__ __volatile__("mov %[output_address], %%r8\n"
-				     "vmmcall"
-				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
-				       "+c" (control), "+d" (input_address)
-				     : [output_address] "r" (output_address)
-				     : "cc", "memory", "r8", "r9", "r10", "r11");
-		return hv_status;
-	}
-
-	if (!hv_hypercall_pg)
-		return U64_MAX;
-
-	__asm__ __volatile__("mov %[output_address], %%r8\n"
-			     CALL_NOSPEC
-			     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
-			       "+c" (control), "+d" (input_address)
-			     : [output_address] "r" (output_address),
-			       THUNK_TARGET(hv_hypercall_pg)
-			     : "cc", "memory", "r8", "r9", "r10", "r11");
+	return static_call_mod(hv_hypercall)(control, input_address, output_address);
 #else
 	u32 input_address_hi = upper_32_bits(input_address);
 	u32 input_address_lo = lower_32_bits(input_address);
 	u32 output_address_hi = upper_32_bits(output_address);
 	u32 output_address_lo = lower_32_bits(output_address);
+	u64 hv_status;
 
 	if (!hv_hypercall_pg)
 		return U64_MAX;
@@ -108,8 +92,8 @@ static inline u64 hv_do_hypercall(u64 co
 			       "D"(output_address_hi), "S"(output_address_lo),
 			       THUNK_TARGET(hv_hypercall_pg)
 			     : "cc", "memory");
-#endif /* !x86_64 */
 	return hv_status;
+#endif /* !x86_64 */
 }
 
 /* Hypercall to the L0 hypervisor */
@@ -121,41 +105,23 @@ static inline u64 hv_do_nested_hypercall
 /* Fast hypercall with 8 bytes of input and no output */
 static inline u64 _hv_do_fast_hypercall8(u64 control, u64 input1)
 {
-	u64 hv_status;
-
 #ifdef CONFIG_X86_64
-	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
-		return hv_tdx_hypercall(control, input1, 0);
-
-	if (hv_isolation_type_snp() && !hyperv_paravisor_present) {
-		__asm__ __volatile__(
-				"vmmcall"
-				: "=a" (hv_status), ASM_CALL_CONSTRAINT,
-				"+c" (control), "+d" (input1)
-				:: "cc", "r8", "r9", "r10", "r11");
-	} else {
-		__asm__ __volatile__(CALL_NOSPEC
-				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
-				       "+c" (control), "+d" (input1)
-				     : THUNK_TARGET(hv_hypercall_pg)
-				     : "cc", "r8", "r9", "r10", "r11");
-	}
+	return static_call_mod(hv_hypercall)(control, input1, 0);
 #else
-	{
-		u32 input1_hi = upper_32_bits(input1);
-		u32 input1_lo = lower_32_bits(input1);
-
-		__asm__ __volatile__ (CALL_NOSPEC
-				      : "=A"(hv_status),
-					"+c"(input1_lo),
-					ASM_CALL_CONSTRAINT
-				      :	"A" (control),
-					"b" (input1_hi),
-					THUNK_TARGET(hv_hypercall_pg)
-				      : "cc", "edi", "esi");
-	}
-#endif
+	u32 input1_hi = upper_32_bits(input1);
+	u32 input1_lo = lower_32_bits(input1);
+	u64 hv_status;
+
+	__asm__ __volatile__ (CALL_NOSPEC
+			      : "=A"(hv_status),
+			      "+c"(input1_lo),
+			      ASM_CALL_CONSTRAINT
+			      :	"A" (control),
+			      "b" (input1_hi),
+			      THUNK_TARGET(hv_hypercall_pg)
+			      : "cc", "edi", "esi");
 	return hv_status;
+#endif
 }
 
 static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
@@ -175,45 +141,24 @@ static inline u64 hv_do_fast_nested_hype
 /* Fast hypercall with 16 bytes of input */
 static inline u64 _hv_do_fast_hypercall16(u64 control, u64 input1, u64 input2)
 {
-	u64 hv_status;
-
 #ifdef CONFIG_X86_64
-	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
-		return hv_tdx_hypercall(control, input1, input2);
-
-	if (hv_isolation_type_snp() && !hyperv_paravisor_present) {
-		__asm__ __volatile__("mov %[input2], %%r8\n"
-				     "vmmcall"
-				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
-				       "+c" (control), "+d" (input1)
-				     : [input2] "r" (input2)
-				     : "cc", "r8", "r9", "r10", "r11");
-	} else {
-		__asm__ __volatile__("mov %[input2], %%r8\n"
-				     CALL_NOSPEC
-				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
-				       "+c" (control), "+d" (input1)
-				     : [input2] "r" (input2),
-				       THUNK_TARGET(hv_hypercall_pg)
-				     : "cc", "r8", "r9", "r10", "r11");
-	}
+	return static_call_mod(hv_hypercall)(control, input1, input2);
 #else
-	{
-		u32 input1_hi = upper_32_bits(input1);
-		u32 input1_lo = lower_32_bits(input1);
-		u32 input2_hi = upper_32_bits(input2);
-		u32 input2_lo = lower_32_bits(input2);
-
-		__asm__ __volatile__ (CALL_NOSPEC
-				      : "=A"(hv_status),
-					"+c"(input1_lo), ASM_CALL_CONSTRAINT
-				      :	"A" (control), "b" (input1_hi),
-					"D"(input2_hi), "S"(input2_lo),
-					THUNK_TARGET(hv_hypercall_pg)
-				      : "cc");
-	}
-#endif
+	u32 input1_hi = upper_32_bits(input1);
+	u32 input1_lo = lower_32_bits(input1);
+	u32 input2_hi = upper_32_bits(input2);
+	u32 input2_lo = lower_32_bits(input2);
+	u64 hv_status;
+
+	__asm__ __volatile__ (CALL_NOSPEC
+			      : "=A"(hv_status),
+			      "+c"(input1_lo), ASM_CALL_CONSTRAINT
+			      :	"A" (control), "b" (input1_hi),
+			      "D"(input2_hi), "S"(input2_lo),
+			      THUNK_TARGET(hv_hypercall_pg)
+			      : "cc");
 	return hv_status;
+#endif
 }
 
 static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2)
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -38,10 +38,6 @@
 bool hv_nested;
 struct ms_hyperv_info ms_hyperv;
 
-/* Used in modules via hv_do_hypercall(): see arch/x86/include/asm/mshyperv.h */
-bool hyperv_paravisor_present __ro_after_init;
-EXPORT_SYMBOL_GPL(hyperv_paravisor_present);
-
 #if IS_ENABLED(CONFIG_HYPERV)
 static inline unsigned int hv_get_nested_msr(unsigned int reg)
 {
@@ -288,8 +284,18 @@ static void __init x86_setup_ops_for_tsc
 	old_restore_sched_clock_state = x86_platform.restore_sched_clock_state;
 	x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state;
 }
+
+#ifdef CONFIG_X86_64
+DEFINE_STATIC_CALL(hv_hypercall, hv_std_hypercall);
+EXPORT_STATIC_CALL_TRAMP_GPL(hv_hypercall);
+#define hypercall_update(hc) static_call_update(hv_hypercall, hc)
+#endif
 #endif /* CONFIG_HYPERV */
 
+#ifndef hypercall_update
+#define hypercall_update(hc) (void)hc
+#endif
+
 static uint32_t  __init ms_hyperv_platform(void)
 {
 	u32 eax;
@@ -484,14 +490,14 @@ static void __init ms_hyperv_init_platfo
 			ms_hyperv.shared_gpa_boundary =
 				BIT_ULL(ms_hyperv.shared_gpa_boundary_bits);
 
-		hyperv_paravisor_present = !!ms_hyperv.paravisor_present;
-
 		pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n",
 			ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
 
 
 		if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) {
 			static_branch_enable(&isolation_type_snp);
+			if (!ms_hyperv.paravisor_present)
+				hypercall_update(hv_snp_hypercall);
 		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_TDX) {
 			static_branch_enable(&isolation_type_tdx);
 
@@ -499,6 +505,7 @@ static void __init ms_hyperv_init_platfo
 			ms_hyperv.hints &= ~HV_X64_APIC_ACCESS_RECOMMENDED;
 
 			if (!ms_hyperv.paravisor_present) {
+				hypercall_update(hv_tdx_hypercall);
 				/*
 				 * Mark the Hyper-V TSC page feature as disabled
 				 * in a TDX VM without paravisor so that the



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v3 12/16] x86_64,hyperv: Use direct call to hypercall-page
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (10 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 11/16] x86,hyperv: Clean up hv_do_hypercall() Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-15  4:58   ` Wei Liu
  2025-07-15 14:52   ` Michael Kelley
  2025-07-14 10:20 ` [PATCH v3 13/16] x86/fred: Install system vector handlers even if FRED isn't fully enabled Peter Zijlstra
                   ` (4 subsequent siblings)
  16 siblings, 2 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda, Michael Kelley

Instead of using an indirect call to the hypercall page, use a direct
call instead. This avoids all CFI problems, including the one where
the hypercall page doesn't have IBT on.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
---
 arch/x86/hyperv/hv_init.c |   61 ++++++++++++++++++++++------------------------
 1 file changed, 30 insertions(+), 31 deletions(-)

--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -17,7 +17,6 @@
 #include <asm/desc.h>
 #include <asm/e820/api.h>
 #include <asm/sev.h>
-#include <asm/ibt.h>
 #include <asm/hypervisor.h>
 #include <hyperv/hvhdk.h>
 #include <asm/mshyperv.h>
@@ -38,23 +37,41 @@
 void *hv_hypercall_pg;
 
 #ifdef CONFIG_X86_64
+static u64 __hv_hyperfail(u64 control, u64 param1, u64 param2)
+{
+	return U64_MAX;
+}
+
+DEFINE_STATIC_CALL(__hv_hypercall, __hv_hyperfail);
+
 u64 hv_std_hypercall(u64 control, u64 param1, u64 param2)
 {
 	u64 hv_status;
 
-	if (!hv_hypercall_pg)
-		return U64_MAX;
-
 	register u64 __r8 asm("r8") = param2;
-	asm volatile (CALL_NOSPEC
+	asm volatile ("call " STATIC_CALL_TRAMP_STR(__hv_hypercall)
 		      : "=a" (hv_status), ASM_CALL_CONSTRAINT,
 		        "+c" (control), "+d" (param1), "+r" (__r8)
-		      : THUNK_TARGET(hv_hypercall_pg)
-		      : "cc", "memory", "r9", "r10", "r11");
+		      : : "cc", "memory", "r9", "r10", "r11");
 
 	return hv_status;
 }
+
+typedef u64 (*hv_hypercall_f)(u64 control, u64 param1, u64 param2);
+
+static inline void hv_set_hypercall_pg(void *ptr)
+{
+	hv_hypercall_pg = ptr;
+
+	if (!ptr)
+		ptr = &__hv_hyperfail;
+	static_call_update(__hv_hypercall, (hv_hypercall_f)ptr);
+}
 #else
+static inline void hv_set_hypercall_pg(void *ptr)
+{
+	hv_hypercall_pg = ptr;
+}
 EXPORT_SYMBOL_GPL(hv_hypercall_pg);
 #endif
 
@@ -349,7 +366,7 @@ static int hv_suspend(void)
 	 * pointer is restored on resume.
 	 */
 	hv_hypercall_pg_saved = hv_hypercall_pg;
-	hv_hypercall_pg = NULL;
+	hv_set_hypercall_pg(NULL);
 
 	/* Disable the hypercall page in the hypervisor */
 	rdmsrq(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
@@ -375,7 +392,7 @@ static void hv_resume(void)
 		vmalloc_to_pfn(hv_hypercall_pg_saved);
 	wrmsrq(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
 
-	hv_hypercall_pg = hv_hypercall_pg_saved;
+	hv_set_hypercall_pg(hv_hypercall_pg_saved);
 	hv_hypercall_pg_saved = NULL;
 
 	/*
@@ -529,8 +546,8 @@ void __init hyperv_init(void)
 	if (hv_isolation_type_tdx() && !ms_hyperv.paravisor_present)
 		goto skip_hypercall_pg_init;
 
-	hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START,
-			VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_ROX,
+	hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, MODULES_VADDR,
+			MODULES_END, GFP_KERNEL, PAGE_KERNEL_ROX,
 			VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
 			__builtin_return_address(0));
 	if (hv_hypercall_pg == NULL)
@@ -568,27 +585,9 @@ void __init hyperv_init(void)
 		wrmsrq(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
 	}
 
-skip_hypercall_pg_init:
-	/*
-	 * Some versions of Hyper-V that provide IBT in guest VMs have a bug
-	 * in that there's no ENDBR64 instruction at the entry to the
-	 * hypercall page. Because hypercalls are invoked via an indirect call
-	 * to the hypercall page, all hypercall attempts fail when IBT is
-	 * enabled, and Linux panics. For such buggy versions, disable IBT.
-	 *
-	 * Fixed versions of Hyper-V always provide ENDBR64 on the hypercall
-	 * page, so if future Linux kernel versions enable IBT for 32-bit
-	 * builds, additional hypercall page hackery will be required here
-	 * to provide an ENDBR32.
-	 */
-#ifdef CONFIG_X86_KERNEL_IBT
-	if (cpu_feature_enabled(X86_FEATURE_IBT) &&
-	    *(u32 *)hv_hypercall_pg != gen_endbr()) {
-		setup_clear_cpu_cap(X86_FEATURE_IBT);
-		pr_warn("Disabling IBT because of Hyper-V bug\n");
-	}
-#endif
+	hv_set_hypercall_pg(hv_hypercall_pg);
 
+skip_hypercall_pg_init:
 	/*
 	 * hyperv_init() is called before LAPIC is initialized: see
 	 * apic_intr_mode_init() -> x86_platform.apic_post_init() and



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v3 13/16] x86/fred: Install system vector handlers even if FRED isn't fully enabled
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (11 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 12/16] x86_64,hyperv: Use direct call to hypercall-page Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 14/16] x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware Peter Zijlstra
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

From: Sean Christopherson <seanjc@google.com>

Install the system vector IRQ handlers for FRED even if FRED isn't fully
enabled in hardware.  This will allow KVM to use the FRED IRQ path even
on non-FRED hardware, which in turn will eliminate a non-CFI indirect CALL
(KVM currently invokes the IRQ handler via an IDT lookup on the vector).

[sean: extract from diff, drop stub, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/idtentry.h |    9 ++-------
 arch/x86/kernel/irqinit.c       |    6 ++++--
 2 files changed, 6 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -460,17 +460,12 @@ __visible noinstr void func(struct pt_re
 #endif
 
 void idt_install_sysvec(unsigned int n, const void *function);
-
-#ifdef CONFIG_X86_FRED
 void fred_install_sysvec(unsigned int vector, const idtentry_t function);
-#else
-static inline void fred_install_sysvec(unsigned int vector, const idtentry_t function) { }
-#endif
 
 #define sysvec_install(vector, function) {				\
-	if (cpu_feature_enabled(X86_FEATURE_FRED))			\
+	if (IS_ENABLED(CONFIG_X86_FRED))				\
 		fred_install_sysvec(vector, function);			\
-	else								\
+	if (!cpu_feature_enabled(X86_FEATURE_FRED))			\
 		idt_install_sysvec(vector, asm_##function);		\
 }
 
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -97,9 +97,11 @@ void __init native_init_IRQ(void)
 	/* Execute any quirks before the call gates are initialised: */
 	x86_init.irqs.pre_vector_init();
 
-	if (cpu_feature_enabled(X86_FEATURE_FRED))
+	/* FRED's IRQ path may be used even if FRED isn't fully enabled. */
+	if (IS_ENABLED(CONFIG_X86_FRED))
 		fred_complete_exception_setup();
-	else
+
+	if (!cpu_feature_enabled(X86_FEATURE_FRED))
 		idt_setup_apic_and_irq_gates();
 
 	lapic_assign_system_vectors();



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v3 14/16] x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (12 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 13/16] x86/fred: Install system vector handlers even if FRED isn't fully enabled Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-26  4:54   ` Xin Li
  2025-07-14 10:20 ` [PATCH v3 15/16] x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y Peter Zijlstra
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

From: Josh Poimboeuf <jpoimboe@kernel.org>

Modify asm_fred_entry_from_kvm() to allow it to be invoked by KVM even
when FRED isn't fully enabled, e.g. when running with
CONFIG_X86_FRED=y on non-FRED hardware.  This will allow forcing KVM
to always use the FRED entry points for 64-bit kernels, which in turn
will eliminate a rather gross non-CFI indirect call that KVM uses to
trampoline IRQs by doing IDT lookups.

The point of asm_fred_entry_from_kvm() is to bridge between C
(vmx:handle_external_interrupt_irqoff()) and more C
(__fred_entry_from_kvm()) while changing the calling context to appear
like an interrupt (pt_regs), making the whole thing bound by the C ABI.

All that remains for non-FRED hardware is to restore RSP (to undo the
redzone and alignment). However, the trivial change would result in
code like:

  push %rbp
  mov %rsp, %rbp

  sub $REDZONE, %rsp
  and $MASK, %rsp

  PUSH_AND_CLEAR_REGS
   push %rbp

  POP_REGS
   pop %rbp <-- *objtool fail*

  mov %rbp, %rsp
  pop %rbp
  ret

And this will confuse objtool something wicked -- it trips over the
extra pop %rbp, not realizing the push and pop preserve the value.

Rather than trying to teach objtool about this, recognise that since
the code is bound by the C ABI on both ends, and interrupts are not
allowed to change pt_regs (only exceptions are), it is sufficient to
PUSH_REGS in order to create pt_regs; there is no reason to POP_REGS --
provided the callee-saved registers are preserved.

So avoid clearing callee-saved regs and skip POP_REGS.

[Original patch by Sean; much of this version by Josh; Changelog,
comments and final form by Peterz]

Originally-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/calling.h       |   11 +++++------
 arch/x86/entry/entry_64_fred.S |   33 ++++++++++++++++++++++++++-------
 arch/x86/kernel/asm-offsets.c  |    1 +
 3 files changed, 32 insertions(+), 13 deletions(-)

--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -99,7 +99,7 @@ For 32-bit we have the following convent
 	.endif
 .endm
 
-.macro CLEAR_REGS clear_bp=1
+.macro CLEAR_REGS clear_callee=1
 	/*
 	 * Sanitize registers of values that a speculation attack might
 	 * otherwise want to exploit. The lower registers are likely clobbered
@@ -113,20 +113,19 @@ For 32-bit we have the following convent
 	xorl	%r9d,  %r9d	/* nospec r9  */
 	xorl	%r10d, %r10d	/* nospec r10 */
 	xorl	%r11d, %r11d	/* nospec r11 */
+	.if \clear_callee
 	xorl	%ebx,  %ebx	/* nospec rbx */
-	.if \clear_bp
 	xorl	%ebp,  %ebp	/* nospec rbp */
-	.endif
 	xorl	%r12d, %r12d	/* nospec r12 */
 	xorl	%r13d, %r13d	/* nospec r13 */
 	xorl	%r14d, %r14d	/* nospec r14 */
 	xorl	%r15d, %r15d	/* nospec r15 */
-
+	.endif
 .endm
 
-.macro PUSH_AND_CLEAR_REGS rdx=%rdx rcx=%rcx rax=%rax save_ret=0 clear_bp=1 unwind_hint=1
+.macro PUSH_AND_CLEAR_REGS rdx=%rdx rcx=%rcx rax=%rax save_ret=0 clear_callee=1 unwind_hint=1
 	PUSH_REGS rdx=\rdx, rcx=\rcx, rax=\rax, save_ret=\save_ret unwind_hint=\unwind_hint
-	CLEAR_REGS clear_bp=\clear_bp
+	CLEAR_REGS clear_callee=\clear_callee
 .endm
 
 .macro POP_REGS pop_rdi=1
--- a/arch/x86/entry/entry_64_fred.S
+++ b/arch/x86/entry/entry_64_fred.S
@@ -112,18 +112,37 @@ SYM_FUNC_START(asm_fred_entry_from_kvm)
 	push %rax				/* Return RIP */
 	push $0					/* Error code, 0 for IRQ/NMI */
 
-	PUSH_AND_CLEAR_REGS clear_bp=0 unwind_hint=0
+	PUSH_AND_CLEAR_REGS clear_callee=0 unwind_hint=0
+
 	movq %rsp, %rdi				/* %rdi -> pt_regs */
+	/*
+	 * At this point: {rdi, rsi, rdx, rcx, r8, r9}, {r10, r11}, {rax, rdx}
+	 * are clobbered, which corresponds to: arguments, extra caller-saved
+	 * and return. All registers a C function is allowed to clobber.
+	 *
+	 * Notably, the callee-saved registers: {rbx, r12, r13, r14, r15}
+	 * are untouched, with the exception of rbp, which carries the stack
+	 * frame and will be restored before exit.
+	 *
+	 * Further calling another C function will not alter this state.
+	 */
 	call __fred_entry_from_kvm		/* Call the C entry point */
-	POP_REGS
-	ERETS
-1:
+
+1:	/*
+	 * When FRED, use ERETS to potentially clear NMIs, otherwise simply
+	 * restore the stack pointer.
+	 */
+	ALTERNATIVE "nop; nop; mov %rbp, %rsp", \
+	            __stringify(add $C_PTREGS_SIZE, %rsp; ERETS), \
+		    X86_FEATURE_FRED
+
 	/*
-	 * Objtool doesn't understand what ERETS does, this hint tells it that
-	 * yes, we'll reach here and with what stack state. A save/restore pair
-	 * isn't strictly needed, but it's the simplest form.
+	 * Objtool doesn't understand ERETS, and the cfi register state is
+	 * different from initial_func_cfi due to PUSH_REGS. Tell it the state
+	 * is similar to where UNWIND_HINT_SAVE is.
 	 */
 	UNWIND_HINT_RESTORE
+
 	pop %rbp
 	RET
 
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -102,6 +102,7 @@ static void __used common(void)
 
 	BLANK();
 	DEFINE(PTREGS_SIZE, sizeof(struct pt_regs));
+	OFFSET(C_PTREGS_SIZE, pt_regs, orig_ax);
 
 	/* TLB state for the entry code */
 	OFFSET(TLB_STATE_user_pcid_flush_mask, tlb_state, user_pcid_flush_mask);



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v3 15/16] x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (13 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 14/16] x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:20 ` [PATCH v3 16/16] objtool: Validate kCFI calls Peter Zijlstra
  2025-07-24 20:31 ` [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Sean Christopherson
  16 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

From: Sean Christopherson <seanjc@google.com>

Now that FRED provides C-code entry points for handling IRQs, use the
FRED infrastructure for forwarding IRQs even if FRED is fully
disabled, e.g. isn't supported in hardware. Avoiding the non-FRED
assembly trampolines into the IDT handlers for IRQs eliminates the
associated non-CFI indirect call (KVM performs a CALL by doing a
lookup on the IDT using the IRQ vector).

Keep NMIs on the legacy IDT path, as the FRED NMI entry code relies on
FRED's architectural behavior with respect to NMI blocking, i.e. doesn't
jump through the myriad hoops needed to deal with IRET "unexpectedly"
unmasking NMIs.  KVM's NMI path already makes a direct CALL to C-code,
i.e. isn't problematic for CFI.  KVM does make a short detour through
assembly code to build the stack frame, but the "FRED entry from KVM"
path does the same.

Force FRED for 64-bit kernels if KVM_INTEL is enabled, as the benefits of
eliminating the IRQ trampoline usage far outweigh the code overhead of
FRED.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/Kconfig   |    1 +
 arch/x86/kvm/vmx/vmx.c |    8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -97,6 +97,7 @@ config KVM_INTEL
 	depends on KVM && IA32_FEAT_CTL
 	select KVM_GENERIC_PRIVATE_MEM if INTEL_TDX_HOST
 	select KVM_GENERIC_MEMORY_ATTRIBUTES if INTEL_TDX_HOST
+	select X86_FRED if X86_64
 	help
 	  Provides support for KVM on processors equipped with Intel's VT
 	  extensions, a.k.a. Virtual Machine Extensions (VMX).
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6989,8 +6989,14 @@ static void handle_external_interrupt_ir
 	    "unexpected VM-Exit interrupt info: 0x%x", intr_info))
 		return;
 
+	/*
+	 * Invoke the kernel's IRQ handler for the vector.  Use the FRED path
+	 * when it's available even if FRED isn't fully enabled, e.g. even if
+	 * FRED isn't supported in hardware, in order to avoid the indirect
+	 * CALL in the non-FRED path.
+	 */
 	kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ);
-	if (cpu_feature_enabled(X86_FEATURE_FRED))
+	if (IS_ENABLED(CONFIG_X86_FRED))
 		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector);
 	else
 		vmx_do_interrupt_irqoff(gate_offset((gate_desc *)host_idt_base + vector));



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (14 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 15/16] x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y Peter Zijlstra
@ 2025-07-14 10:20 ` Peter Zijlstra
  2025-07-14 10:49   ` Peter Zijlstra
                     ` (3 more replies)
  2025-07-24 20:31 ` [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Sean Christopherson
  16 siblings, 4 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:20 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	peterz, linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen,
	ojeda

Validate that all indirect calls adhere to kCFI rules. Notably, doing a
nocfi indirect call to a cfi function is broken.

Apparently some Rust 'core' code violates this and explodes when run
with FineIBT.

All the ANNOTATE_NOCFI_SYM sites are prime targets for attackers.

 - runtime EFI is especially henous because it also needs to disable
   IBT. Basically calling unknown code without CFI protection at
   runtime is a massice security issue.

 - Kexec image handover; if you can exploit this, you get to keep it :-)

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/machine_kexec_64.c  |    4 +++
 arch/x86/kvm/vmx/vmenter.S          |    4 +++
 arch/x86/platform/efi/efi_stub_64.S |    4 +++
 drivers/misc/lkdtm/perms.c          |    5 ++++
 include/linux/objtool.h             |   10 ++++++++
 include/linux/objtool_types.h       |    1 
 tools/include/linux/objtool_types.h |    1 
 tools/objtool/check.c               |   42 ++++++++++++++++++++++++++++++++++++
 tools/objtool/include/objtool/elf.h |    1 
 9 files changed, 72 insertions(+)

--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -453,6 +453,10 @@ void __nocfi machine_kexec(struct kimage
 
 	__ftrace_enabled_restore(save_ftrace_enabled);
 }
+/*
+ * Handover to the next kernel, no CFI concern.
+ */
+ANNOTATE_NOCFI_SYM(machine_kexec);
 
 /* arch-dependent functionality related to kexec file-based syscall */
 
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -361,6 +361,10 @@ SYM_FUNC_END(vmread_error_trampoline)
 
 .section .text, "ax"
 
+#ifndef CONFIG_X86_FRED
+
 SYM_FUNC_START(vmx_do_interrupt_irqoff)
 	VMX_DO_EVENT_IRQOFF CALL_NOSPEC _ASM_ARG1
 SYM_FUNC_END(vmx_do_interrupt_irqoff)
+
+#endif
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -11,6 +11,10 @@
 #include <asm/nospec-branch.h>
 
 SYM_FUNC_START(__efi_call)
+	/*
+	 * The EFI code doesn't have any CFI, annotate away the CFI violation.
+	 */
+	ANNOTATE_NOCFI_SYM
 	pushq %rbp
 	movq %rsp, %rbp
 	and $~0xf, %rsp
--- a/drivers/misc/lkdtm/perms.c
+++ b/drivers/misc/lkdtm/perms.c
@@ -9,6 +9,7 @@
 #include <linux/vmalloc.h>
 #include <linux/mman.h>
 #include <linux/uaccess.h>
+#include <linux/objtool.h>
 #include <asm/cacheflush.h>
 #include <asm/sections.h>
 
@@ -86,6 +87,10 @@ static noinline __nocfi void execute_loc
 	func();
 	pr_err("FAIL: func returned\n");
 }
+/*
+ * Explicitly doing the wrong thing for testing.
+ */
+ANNOTATE_NOCFI_SYM(execute_location);
 
 static void execute_user_location(void *dst)
 {
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -184,6 +184,15 @@
  * WARN using UD2.
  */
 #define ANNOTATE_REACHABLE(label)	__ASM_ANNOTATE(label, ANNOTYPE_REACHABLE)
+/*
+ * This should not be used; it annotates away CFI violations. There are a few
+ * valid use cases like kexec handover to the next kernel image, and there is
+ * no security concern there.
+ *
+ * There are also a few real issues annotated away, like EFI because we can't
+ * control the EFI code.
+ */
+#define ANNOTATE_NOCFI_SYM(sym)		asm(__ASM_ANNOTATE(sym, ANNOTYPE_NOCFI))
 
 #else
 #define ANNOTATE_NOENDBR		ANNOTATE type=ANNOTYPE_NOENDBR
@@ -194,6 +203,7 @@
 #define ANNOTATE_INTRA_FUNCTION_CALL	ANNOTATE type=ANNOTYPE_INTRA_FUNCTION_CALL
 #define ANNOTATE_UNRET_BEGIN		ANNOTATE type=ANNOTYPE_UNRET_BEGIN
 #define ANNOTATE_REACHABLE		ANNOTATE type=ANNOTYPE_REACHABLE
+#define ANNOTATE_NOCFI_SYM		ANNOTATE type=ANNOTYPE_NOCFI
 #endif
 
 #if defined(CONFIG_NOINSTR_VALIDATION) && \
--- a/include/linux/objtool_types.h
+++ b/include/linux/objtool_types.h
@@ -65,5 +65,6 @@ struct unwind_hint {
 #define ANNOTYPE_IGNORE_ALTS		6
 #define ANNOTYPE_INTRA_FUNCTION_CALL	7
 #define ANNOTYPE_REACHABLE		8
+#define ANNOTYPE_NOCFI			9
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/include/linux/objtool_types.h
+++ b/tools/include/linux/objtool_types.h
@@ -65,5 +65,6 @@ struct unwind_hint {
 #define ANNOTYPE_IGNORE_ALTS		6
 #define ANNOTYPE_INTRA_FUNCTION_CALL	7
 #define ANNOTYPE_REACHABLE		8
+#define ANNOTYPE_NOCFI			9
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2390,6 +2390,8 @@ static int __annotate_ifc(struct objtool
 
 static int __annotate_late(struct objtool_file *file, int type, struct instruction *insn)
 {
+	struct symbol *sym;
+
 	switch (type) {
 	case ANNOTYPE_NOENDBR:
 		/* early */
@@ -2431,6 +2433,15 @@ static int __annotate_late(struct objtoo
 		insn->dead_end = false;
 		break;
 
+	case ANNOTYPE_NOCFI:
+		sym = insn->sym;
+		if (!sym) {
+			ERROR_INSN(insn, "dodgy NOCFI annotation");
+			return -1;
+		}
+		insn->sym->nocfi = 1;
+		break;
+
 	default:
 		ERROR_INSN(insn, "Unknown annotation type: %d", type);
 		return -1;
@@ -4000,6 +4011,37 @@ static int validate_retpoline(struct obj
 		warnings++;
 	}
 
+	if (!opts.cfi)
+		return warnings;
+
+	/*
+	 * kCFI call sites look like:
+	 *
+	 *     movl $(-0x12345678), %r10d
+	 *     addl -4(%r11), %r10d
+	 *     jz 1f
+	 *     ud2
+	 *  1: cs call __x86_indirect_thunk_r11
+	 *
+	 * Verify all indirect calls are kCFI adorned by checking for the
+	 * UD2. Notably, doing __nocfi calls to regular (cfi) functions is
+	 * broken.
+	 */
+	list_for_each_entry(insn, &file->retpoline_call_list, call_node) {
+		struct symbol *sym = insn->sym;
+
+		if (sym && (sym->type == STT_NOTYPE ||
+			    sym->type == STT_FUNC) && !sym->nocfi) {
+			struct instruction *prev =
+				prev_insn_same_sym(file, insn);
+
+			if (!prev || prev->type != INSN_BUG) {
+				WARN_INSN(insn, "no-cfi indirect call!");
+				warnings++;
+			}
+		}
+	}
+
 	return warnings;
 }
 
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -70,6 +70,7 @@ struct symbol {
 	u8 local_label       : 1;
 	u8 frame_pointer     : 1;
 	u8 ignore	     : 1;
+	u8 nocfi             : 1;
 	struct list_head pv_target;
 	struct reloc *relocs;
 	struct section *group_sec;



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-14 10:20 ` [PATCH v3 16/16] objtool: Validate kCFI calls Peter Zijlstra
@ 2025-07-14 10:49   ` Peter Zijlstra
  2025-07-14 11:21     ` Peter Zijlstra
  2025-07-14 16:30   ` Miguel Ojeda
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 10:49 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen, ojeda

On Mon, Jul 14, 2025 at 12:20:27PM +0200, Peter Zijlstra wrote:

> --- a/arch/x86/platform/efi/efi_stub_64.S
> +++ b/arch/x86/platform/efi/efi_stub_64.S
> @@ -11,6 +11,10 @@
>  #include <asm/nospec-branch.h>
>  
>  SYM_FUNC_START(__efi_call)
> +	/*
> +	 * The EFI code doesn't have any CFI, annotate away the CFI violation.
> +	 */
> +	ANNOTATE_NOCFI_SYM
>  	pushq %rbp
>  	movq %rsp, %rbp
>  	and $~0xf, %rsp

FWIW, we should probably do something like this as well.

---

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -562,6 +562,13 @@ __noendbr u64 ibt_save(bool disable)
 {
 	u64 msr = 0;
 
+	/*
+	 * Firmware code will not provide the same level of
+	 * control-flow-integrity. Taint the kernel to let the user know.
+	 */
+	if (disable || (IS_ENABLED(CONFIG_CFI_CLANG) && cfi_mode != CFI_OFF))
+		add_taint(TAINT_CFI, LOCKDEP_STILL_OK);
+
 	if (cpu_feature_enabled(X86_FEATURE_IBT)) {
 		rdmsrq(MSR_IA32_S_CET, msr);
 		if (disable)
--- a/include/linux/panic.h
+++ b/include/linux/panic.h
@@ -73,7 +73,8 @@ static inline void set_arch_panic_timeou
 #define TAINT_RANDSTRUCT		17
 #define TAINT_TEST			18
 #define TAINT_FWCTL			19
-#define TAINT_FLAGS_COUNT		20
+#define TAINT_CFI			20
+#define TAINT_FLAGS_COUNT		21
 #define TAINT_FLAGS_MAX			((1UL << TAINT_FLAGS_COUNT) - 1)
 
 struct taint_flag {

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-14 10:49   ` Peter Zijlstra
@ 2025-07-14 11:21     ` Peter Zijlstra
  0 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-14 11:21 UTC (permalink / raw)
  To: x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen, ojeda

On Mon, Jul 14, 2025 at 12:49:19PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 14, 2025 at 12:20:27PM +0200, Peter Zijlstra wrote:
> 
> > --- a/arch/x86/platform/efi/efi_stub_64.S
> > +++ b/arch/x86/platform/efi/efi_stub_64.S
> > @@ -11,6 +11,10 @@
> >  #include <asm/nospec-branch.h>
> >  
> >  SYM_FUNC_START(__efi_call)
> > +	/*
> > +	 * The EFI code doesn't have any CFI, annotate away the CFI violation.
> > +	 */
> > +	ANNOTATE_NOCFI_SYM
> >  	pushq %rbp
> >  	movq %rsp, %rbp
> >  	and $~0xf, %rsp
> 
> FWIW, we should probably do something like this as well.
> 
> ---
> 
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -562,6 +562,13 @@ __noendbr u64 ibt_save(bool disable)
>  {
>  	u64 msr = 0;
>  
> +	/*
> +	 * Firmware code will not provide the same level of
> +	 * control-flow-integrity. Taint the kernel to let the user know.
> +	 */
> +	if (disable || (IS_ENABLED(CONFIG_CFI_CLANG) && cfi_mode != CFI_OFF))
> +		add_taint(TAINT_CFI, LOCKDEP_STILL_OK);

Or perhaps:

	WARN_TAINT_ONCE(disable || (IS_ENABLED(CONFIG_CFI_CLANG) && cfi_mode != CFI_OFF),
			TAINT_CFI, "Firmware has weaker CFI");

> +
>  	if (cpu_feature_enabled(X86_FEATURE_IBT)) {
>  		rdmsrq(MSR_IA32_S_CET, msr);
>  		if (disable)
> --- a/include/linux/panic.h
> +++ b/include/linux/panic.h
> @@ -73,7 +73,8 @@ static inline void set_arch_panic_timeou
>  #define TAINT_RANDSTRUCT		17
>  #define TAINT_TEST			18
>  #define TAINT_FWCTL			19
> -#define TAINT_FLAGS_COUNT		20
> +#define TAINT_CFI			20
> +#define TAINT_FLAGS_COUNT		21
>  #define TAINT_FLAGS_MAX			((1UL << TAINT_FLAGS_COUNT) - 1)
>  
>  struct taint_flag {

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-14 10:20 ` [PATCH v3 16/16] objtool: Validate kCFI calls Peter Zijlstra
  2025-07-14 10:49   ` Peter Zijlstra
@ 2025-07-14 16:30   ` Miguel Ojeda
  2025-07-15  8:38     ` Peter Zijlstra
  2025-07-16 21:03   ` Josh Poimboeuf
  2025-07-24 20:37   ` Sean Christopherson
  3 siblings, 1 reply; 36+ messages in thread
From: Miguel Ojeda @ 2025-07-14 16:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh,
	jpoimboe, linux-hyperv, linux-kernel, kvm, linux-efi,
	samitolvanen, ojeda

On Mon, Jul 14, 2025 at 12:45 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Apparently some Rust 'core' code violates this and explodes when run
> with FineIBT.

I think this was fixed in Rust 1.88 (latest version), right? Or is
there an issue still?

    5595c31c3709 ("x86/Kconfig: make CFI_AUTO_DEFAULT depend on !RUST
or Rust >= 1.88")
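
(For reference -- sketching from memory, so the exact Kconfig text may
differ -- that commit gates the FineIBT-by-default choice on a fixed
toolchain, roughly:

config CFI_AUTO_DEFAULT
	bool "Attempt to use FineIBT by default at boot time"
	depends on FINEIBT
	depends on !RUST || RUSTC_VERSION >= 108800
	default y

where 108800 encodes rustc 1.88.0.)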

>  - runtime EFI is especially henous because it also needs to disable
>    IBT. Basically calling unknown code without CFI protection at
>    runtime is a massice security issue.

heinous
massive

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 11/16] x86,hyperv: Clean up hv_do_hypercall()
  2025-07-14 10:20 ` [PATCH v3 11/16] x86,hyperv: Clean up hv_do_hypercall() Peter Zijlstra
@ 2025-07-15  4:54   ` Wei Liu
  2025-07-15 14:51   ` Michael Kelley
  1 sibling, 0 replies; 36+ messages in thread
From: Wei Liu @ 2025-07-15  4:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh,
	jpoimboe, linux-hyperv, linux-kernel, kvm, linux-efi,
	samitolvanen, ojeda, Michael Kelley

On Mon, Jul 14, 2025 at 12:20:22PM +0200, Peter Zijlstra wrote:
> What used to be a simple few instructions has turned into a giant mess
> (for x86_64). Not only does it use static_branch wrong, it mixes it
> with dynamic branches for no apparent reason.
> 
> Notably it uses static_branch through an out-of-line function call,
> which completely defeats the purpose, since instead of a simple
> JMP/NOP site, you get a CALL+RET+TEST+Jcc sequence in return, which is
> absolutely idiotic.
> 
> Add to that a dynamic test of hyperv_paravisor_present, something
> which is set once and never changed.
> 
> Replace all this idiocy with a single direct function call to the
> right hypercall variant.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Tested-by: Michael Kelley <mhklinux@outlook.com>
> Reviewed-by: Michael Kelley <mhklinux@outlook.com>

Acked-by: Wei Liu <wei.liu@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 12/16] x86_64,hyperv: Use direct call to hypercall-page
  2025-07-14 10:20 ` [PATCH v3 12/16] x86_64,hyperv: Use direct call to hypercall-page Peter Zijlstra
@ 2025-07-15  4:58   ` Wei Liu
  2025-07-15 14:52   ` Michael Kelley
  1 sibling, 0 replies; 36+ messages in thread
From: Wei Liu @ 2025-07-15  4:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh,
	jpoimboe, linux-hyperv, linux-kernel, kvm, linux-efi,
	samitolvanen, ojeda, Michael Kelley

On Mon, Jul 14, 2025 at 12:20:23PM +0200, Peter Zijlstra wrote:
> Instead of using an indirect call to the hypercall page, use a direct
> call instead. This avoids all CFI problems, including the one where
> the hypercall page doesn't have IBT on.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Michael Kelley <mhklinux@outlook.com>

Acked-by: Wei Liu <wei.liu@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-14 16:30   ` Miguel Ojeda
@ 2025-07-15  8:38     ` Peter Zijlstra
  0 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-07-15  8:38 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh,
	jpoimboe, linux-hyperv, linux-kernel, kvm, linux-efi,
	samitolvanen, ojeda

On Mon, Jul 14, 2025 at 06:30:09PM +0200, Miguel Ojeda wrote:
> On Mon, Jul 14, 2025 at 12:45 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Apparently some Rust 'core' code violates this and explodes when run
> > with FineIBT.
> 
> I think this was fixed in Rust 1.88 (latest version), right? Or is
> there an issue still?
> 
>     5595c31c3709 ("x86/Kconfig: make CFI_AUTO_DEFAULT depend on !RUST
> or Rust >= 1.88")

Oh yeah, it got fixed. Clearly I failed to update the Changelog.

> >  - runtime EFI is especially henous because it also needs to disable
> >    IBT. Basically calling unknown code without CFI protection at
> >    runtime is a massice security issue.
> 
> heinous
> massive

Typing hard; Thanks!

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH v3 11/16] x86,hyperv: Clean up hv_do_hypercall()
  2025-07-14 10:20 ` [PATCH v3 11/16] x86,hyperv: Clean up hv_do_hypercall() Peter Zijlstra
  2025-07-15  4:54   ` Wei Liu
@ 2025-07-15 14:51   ` Michael Kelley
  1 sibling, 0 replies; 36+ messages in thread
From: Michael Kelley @ 2025-07-15 14:51 UTC (permalink / raw)
  To: Peter Zijlstra, x86@kernel.org
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	seanjc@google.com, pbonzini@redhat.com, ardb@kernel.org,
	kees@kernel.org, Arnd Bergmann, gregkh@linuxfoundation.org,
	jpoimboe@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	linux-efi@vger.kernel.org, samitolvanen@google.com,
	ojeda@kernel.org

From: Peter Zijlstra <peterz@infradead.org> Sent: Monday, July 14, 2025 3:20 AM
> 

Nit: For the patch "Subject", the prefix should be "x86/hyperv:" for
consistency with past practice. There's definitely been some variation
over time, but "x86/hyperv:" has been the most consistently used, and
that's what I try to remind people to use.

Michael

> What used to be a simple few instructions has turned into a giant mess
> (for x86_64). Not only does it use static_branch wrong, it mixes it
> with dynamic branches for no apparent reason.
> 
> Notably it uses static_branch through an out-of-line function call,
> which completely defeats the purpose, since instead of a simple
> JMP/NOP site, you get a CALL+RET+TEST+Jcc sequence in return, which is
> absolutely idiotic.
> 
> Add to that a dynamic test of hyperv_paravisor_present, something
> which is set once and never changed.
> 
> Replace all this idiocy with a single direct function call to the
> right hypercall variant.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Tested-by: Michael Kelley <mhklinux@outlook.com>
> Reviewed-by: Michael Kelley <mhklinux@outlook.com>
> ---
>  arch/x86/hyperv/hv_init.c       |   20 +++++
>  arch/x86/hyperv/ivm.c           |   15 ++++
>  arch/x86/include/asm/mshyperv.h |  137 +++++++++++-----------------------------
>  arch/x86/kernel/cpu/mshyperv.c  |   19 +++--
>  4 files changed, 89 insertions(+), 102 deletions(-)
> 
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -36,7 +36,27 @@
>  #include <linux/highmem.h>
> 
>  void *hv_hypercall_pg;
> +
> +#ifdef CONFIG_X86_64
> +u64 hv_std_hypercall(u64 control, u64 param1, u64 param2)
> +{
> +	u64 hv_status;
> +
> +	if (!hv_hypercall_pg)
> +		return U64_MAX;
> +
> +	register u64 __r8 asm("r8") = param2;
> +	asm volatile (CALL_NOSPEC
> +		      : "=a" (hv_status), ASM_CALL_CONSTRAINT,
> +		        "+c" (control), "+d" (param1), "+r" (__r8)
> +		      : THUNK_TARGET(hv_hypercall_pg)
> +		      : "cc", "memory", "r9", "r10", "r11");
> +
> +	return hv_status;
> +}
> +#else
>  EXPORT_SYMBOL_GPL(hv_hypercall_pg);
> +#endif
> 
>  union hv_ghcb * __percpu *hv_ghcb_pg;
> 
> --- a/arch/x86/hyperv/ivm.c
> +++ b/arch/x86/hyperv/ivm.c
> @@ -377,9 +377,23 @@ int hv_snp_boot_ap(u32 cpu, unsigned lon
>  	return ret;
>  }
> 
> +u64 hv_snp_hypercall(u64 control, u64 param1, u64 param2)
> +{
> +	u64 hv_status;
> +
> +	register u64 __r8 asm("r8") = param2;
> +	asm volatile("vmmcall"
> +		     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
> +		       "+c" (control), "+d" (param1), "+r" (__r8)
> +		     : : "cc", "memory", "r9", "r10", "r11");
> +
> +	return hv_status;
> +}
> +
>  #else
>  static inline void hv_ghcb_msr_write(u64 msr, u64 value) {}
>  static inline void hv_ghcb_msr_read(u64 msr, u64 *value) {}
> +u64 hv_snp_hypercall(u64 control, u64 param1, u64 param2) { return U64_MAX; }
>  #endif /* CONFIG_AMD_MEM_ENCRYPT */
> 
>  #ifdef CONFIG_INTEL_TDX_GUEST
> @@ -429,6 +443,7 @@ u64 hv_tdx_hypercall(u64 control, u64 pa
>  #else
>  static inline void hv_tdx_msr_write(u64 msr, u64 value) {}
>  static inline void hv_tdx_msr_read(u64 msr, u64 *value) {}
> +u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2) { return U64_MAX; }
>  #endif /* CONFIG_INTEL_TDX_GUEST */
> 
>  #if defined(CONFIG_AMD_MEM_ENCRYPT) || defined(CONFIG_INTEL_TDX_GUEST)
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -6,6 +6,7 @@
>  #include <linux/nmi.h>
>  #include <linux/msi.h>
>  #include <linux/io.h>
> +#include <linux/static_call.h>
>  #include <asm/nospec-branch.h>
>  #include <asm/paravirt.h>
>  #include <asm/msr.h>
> @@ -39,16 +40,21 @@ static inline unsigned char hv_get_nmi_r
>  	return 0;
>  }
> 
> -#if IS_ENABLED(CONFIG_HYPERV)
> -extern bool hyperv_paravisor_present;
> +extern u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2);
> +extern u64 hv_snp_hypercall(u64 control, u64 param1, u64 param2);
> +extern u64 hv_std_hypercall(u64 control, u64 param1, u64 param2);
> 
> +#if IS_ENABLED(CONFIG_HYPERV)
>  extern void *hv_hypercall_pg;
> 
>  extern union hv_ghcb * __percpu *hv_ghcb_pg;
> 
>  bool hv_isolation_type_snp(void);
>  bool hv_isolation_type_tdx(void);
> -u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2);
> +
> +#ifdef CONFIG_X86_64
> +DECLARE_STATIC_CALL(hv_hypercall, hv_std_hypercall);
> +#endif
> 
>  /*
>   * DEFAULT INIT GPAT and SEGMENT LIMIT value in struct VMSA
> @@ -65,37 +71,15 @@ static inline u64 hv_do_hypercall(u64 co
>  {
>  	u64 input_address = input ? virt_to_phys(input) : 0;
>  	u64 output_address = output ? virt_to_phys(output) : 0;
> -	u64 hv_status;
> 
>  #ifdef CONFIG_X86_64
> -	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
> -		return hv_tdx_hypercall(control, input_address, output_address);
> -
> -	if (hv_isolation_type_snp() && !hyperv_paravisor_present) {
> -		__asm__ __volatile__("mov %[output_address], %%r8\n"
> -				     "vmmcall"
> -				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
> -				       "+c" (control), "+d" (input_address)
> -				     : [output_address] "r" (output_address)
> -				     : "cc", "memory", "r8", "r9", "r10", "r11");
> -		return hv_status;
> -	}
> -
> -	if (!hv_hypercall_pg)
> -		return U64_MAX;
> -
> -	__asm__ __volatile__("mov %[output_address], %%r8\n"
> -			     CALL_NOSPEC
> -			     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
> -			       "+c" (control), "+d" (input_address)
> -			     : [output_address] "r" (output_address),
> -			       THUNK_TARGET(hv_hypercall_pg)
> -			     : "cc", "memory", "r8", "r9", "r10", "r11");
> +	return static_call_mod(hv_hypercall)(control, input_address, output_address);
>  #else
>  	u32 input_address_hi = upper_32_bits(input_address);
>  	u32 input_address_lo = lower_32_bits(input_address);
>  	u32 output_address_hi = upper_32_bits(output_address);
>  	u32 output_address_lo = lower_32_bits(output_address);
> +	u64 hv_status;
> 
>  	if (!hv_hypercall_pg)
>  		return U64_MAX;
> @@ -108,8 +92,8 @@ static inline u64 hv_do_hypercall(u64 co
>  			       "D"(output_address_hi), "S"(output_address_lo),
>  			       THUNK_TARGET(hv_hypercall_pg)
>  			     : "cc", "memory");
> -#endif /* !x86_64 */
>  	return hv_status;
> +#endif /* !x86_64 */
>  }
> 
>  /* Hypercall to the L0 hypervisor */
> @@ -121,41 +105,23 @@ static inline u64 hv_do_nested_hypercall
>  /* Fast hypercall with 8 bytes of input and no output */
>  static inline u64 _hv_do_fast_hypercall8(u64 control, u64 input1)
>  {
> -	u64 hv_status;
> -
>  #ifdef CONFIG_X86_64
> -	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
> -		return hv_tdx_hypercall(control, input1, 0);
> -
> -	if (hv_isolation_type_snp() && !hyperv_paravisor_present) {
> -		__asm__ __volatile__(
> -				"vmmcall"
> -				: "=a" (hv_status), ASM_CALL_CONSTRAINT,
> -				"+c" (control), "+d" (input1)
> -				:: "cc", "r8", "r9", "r10", "r11");
> -	} else {
> -		__asm__ __volatile__(CALL_NOSPEC
> -				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
> -				       "+c" (control), "+d" (input1)
> -				     : THUNK_TARGET(hv_hypercall_pg)
> -				     : "cc", "r8", "r9", "r10", "r11");
> -	}
> +	return static_call_mod(hv_hypercall)(control, input1, 0);
>  #else
> -	{
> -		u32 input1_hi = upper_32_bits(input1);
> -		u32 input1_lo = lower_32_bits(input1);
> -
> -		__asm__ __volatile__ (CALL_NOSPEC
> -				      : "=A"(hv_status),
> -					"+c"(input1_lo),
> -					ASM_CALL_CONSTRAINT
> -				      :	"A" (control),
> -					"b" (input1_hi),
> -					THUNK_TARGET(hv_hypercall_pg)
> -				      : "cc", "edi", "esi");
> -	}
> -#endif
> +	u32 input1_hi = upper_32_bits(input1);
> +	u32 input1_lo = lower_32_bits(input1);
> +	u64 hv_status;
> +
> +	__asm__ __volatile__ (CALL_NOSPEC
> +			      : "=A"(hv_status),
> +			      "+c"(input1_lo),
> +			      ASM_CALL_CONSTRAINT
> +			      :	"A" (control),
> +			      "b" (input1_hi),
> +			      THUNK_TARGET(hv_hypercall_pg)
> +			      : "cc", "edi", "esi");
>  	return hv_status;
> +#endif
>  }
> 
>  static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
> @@ -175,45 +141,24 @@ static inline u64 hv_do_fast_nested_hype
>  /* Fast hypercall with 16 bytes of input */
>  static inline u64 _hv_do_fast_hypercall16(u64 control, u64 input1, u64 input2)
>  {
> -	u64 hv_status;
> -
>  #ifdef CONFIG_X86_64
> -	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
> -		return hv_tdx_hypercall(control, input1, input2);
> -
> -	if (hv_isolation_type_snp() && !hyperv_paravisor_present) {
> -		__asm__ __volatile__("mov %[input2], %%r8\n"
> -				     "vmmcall"
> -				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
> -				       "+c" (control), "+d" (input1)
> -				     : [input2] "r" (input2)
> -				     : "cc", "r8", "r9", "r10", "r11");
> -	} else {
> -		__asm__ __volatile__("mov %[input2], %%r8\n"
> -				     CALL_NOSPEC
> -				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
> -				       "+c" (control), "+d" (input1)
> -				     : [input2] "r" (input2),
> -				       THUNK_TARGET(hv_hypercall_pg)
> -				     : "cc", "r8", "r9", "r10", "r11");
> -	}
> +	return static_call_mod(hv_hypercall)(control, input1, input2);
>  #else
> -	{
> -		u32 input1_hi = upper_32_bits(input1);
> -		u32 input1_lo = lower_32_bits(input1);
> -		u32 input2_hi = upper_32_bits(input2);
> -		u32 input2_lo = lower_32_bits(input2);
> -
> -		__asm__ __volatile__ (CALL_NOSPEC
> -				      : "=A"(hv_status),
> -					"+c"(input1_lo), ASM_CALL_CONSTRAINT
> -				      :	"A" (control), "b" (input1_hi),
> -					"D"(input2_hi), "S"(input2_lo),
> -					THUNK_TARGET(hv_hypercall_pg)
> -				      : "cc");
> -	}
> -#endif
> +	u32 input1_hi = upper_32_bits(input1);
> +	u32 input1_lo = lower_32_bits(input1);
> +	u32 input2_hi = upper_32_bits(input2);
> +	u32 input2_lo = lower_32_bits(input2);
> +	u64 hv_status;
> +
> +	__asm__ __volatile__ (CALL_NOSPEC
> +			      : "=A"(hv_status),
> +			      "+c"(input1_lo), ASM_CALL_CONSTRAINT
> +			      :	"A" (control), "b" (input1_hi),
> +			      "D"(input2_hi), "S"(input2_lo),
> +			      THUNK_TARGET(hv_hypercall_pg)
> +			      : "cc");
>  	return hv_status;
> +#endif
>  }
> 
>  static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2)
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -38,10 +38,6 @@
>  bool hv_nested;
>  struct ms_hyperv_info ms_hyperv;
> 
> -/* Used in modules via hv_do_hypercall(): see arch/x86/include/asm/mshyperv.h */
> -bool hyperv_paravisor_present __ro_after_init;
> -EXPORT_SYMBOL_GPL(hyperv_paravisor_present);
> -
>  #if IS_ENABLED(CONFIG_HYPERV)
>  static inline unsigned int hv_get_nested_msr(unsigned int reg)
>  {
> @@ -288,8 +284,18 @@ static void __init x86_setup_ops_for_tsc
>  	old_restore_sched_clock_state = x86_platform.restore_sched_clock_state;
>  	x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state;
>  }
> +
> +#ifdef CONFIG_X86_64
> +DEFINE_STATIC_CALL(hv_hypercall, hv_std_hypercall);
> +EXPORT_STATIC_CALL_TRAMP_GPL(hv_hypercall);
> +#define hypercall_update(hc) static_call_update(hv_hypercall, hc)
> +#endif
>  #endif /* CONFIG_HYPERV */
> 
> +#ifndef hypercall_update
> +#define hypercall_update(hc) (void)hc
> +#endif
> +
>  static uint32_t  __init ms_hyperv_platform(void)
>  {
>  	u32 eax;
> @@ -484,14 +490,14 @@ static void __init ms_hyperv_init_platfo
>  			ms_hyperv.shared_gpa_boundary =
>  				BIT_ULL(ms_hyperv.shared_gpa_boundary_bits);
> 
> -		hyperv_paravisor_present = !!ms_hyperv.paravisor_present;
> -
>  		pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n",
>  			ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
> 
> 
>  		if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) {
>  			static_branch_enable(&isolation_type_snp);
> +			if (!ms_hyperv.paravisor_present)
> +				hypercall_update(hv_snp_hypercall);
>  		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_TDX) {
>  			static_branch_enable(&isolation_type_tdx);
> 
> @@ -499,6 +505,7 @@ static void __init ms_hyperv_init_platfo
>  			ms_hyperv.hints &= ~HV_X64_APIC_ACCESS_RECOMMENDED;
> 
>  			if (!ms_hyperv.paravisor_present) {
> +				hypercall_update(hv_tdx_hypercall);
>  				/*
>  				 * Mark the Hyper-V TSC page feature as disabled
>  				 * in a TDX VM without paravisor so that the
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH v3 12/16] x86_64,hyperv: Use direct call to hypercall-page
  2025-07-14 10:20 ` [PATCH v3 12/16] x86_64,hyperv: Use direct call to hypercall-page Peter Zijlstra
  2025-07-15  4:58   ` Wei Liu
@ 2025-07-15 14:52   ` Michael Kelley
  2025-08-18 10:46     ` Peter Zijlstra
  1 sibling, 1 reply; 36+ messages in thread
From: Michael Kelley @ 2025-07-15 14:52 UTC (permalink / raw)
  To: Peter Zijlstra, x86@kernel.org
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
	seanjc@google.com, pbonzini@redhat.com, ardb@kernel.org,
	kees@kernel.org, Arnd Bergmann, gregkh@linuxfoundation.org,
	jpoimboe@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	linux-efi@vger.kernel.org, samitolvanen@google.com,
	ojeda@kernel.org

From: Peter Zijlstra <peterz@infradead.org> Sent: Monday, July 14, 2025 3:20 AM
> 

Same nit here:  Use "x86/hyperv:" as the Subject prefix, for consistency with
past practice.

Michael

> Instead of using an indirect call to the hypercall page, use a direct
> call instead. This avoids all CFI problems, including the one where
> the hypercall page doesn't have IBT on.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reviewed-by: Michael Kelley <mhklinux@outlook.com>
> ---
>  arch/x86/hyperv/hv_init.c |   61 ++++++++++++++++++++++------------------------
>  1 file changed, 30 insertions(+), 31 deletions(-)
> 
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -17,7 +17,6 @@
>  #include <asm/desc.h>
>  #include <asm/e820/api.h>
>  #include <asm/sev.h>
> -#include <asm/ibt.h>
>  #include <asm/hypervisor.h>
>  #include <hyperv/hvhdk.h>
>  #include <asm/mshyperv.h>
> @@ -38,23 +37,41 @@
>  void *hv_hypercall_pg;
> 
>  #ifdef CONFIG_X86_64
> +static u64 __hv_hyperfail(u64 control, u64 param1, u64 param2)
> +{
> +	return U64_MAX;
> +}
> +
> +DEFINE_STATIC_CALL(__hv_hypercall, __hv_hyperfail);
> +
>  u64 hv_std_hypercall(u64 control, u64 param1, u64 param2)
>  {
>  	u64 hv_status;
> 
> -	if (!hv_hypercall_pg)
> -		return U64_MAX;
> -
>  	register u64 __r8 asm("r8") = param2;
> -	asm volatile (CALL_NOSPEC
> +	asm volatile ("call " STATIC_CALL_TRAMP_STR(__hv_hypercall)
>  		      : "=a" (hv_status), ASM_CALL_CONSTRAINT,
>  		        "+c" (control), "+d" (param1), "+r" (__r8)
> -		      : THUNK_TARGET(hv_hypercall_pg)
> -		      : "cc", "memory", "r9", "r10", "r11");
> +		      : : "cc", "memory", "r9", "r10", "r11");
> 
>  	return hv_status;
>  }
> +
> +typedef u64 (*hv_hypercall_f)(u64 control, u64 param1, u64 param2);
> +
> +static inline void hv_set_hypercall_pg(void *ptr)
> +{
> +	hv_hypercall_pg = ptr;
> +
> +	if (!ptr)
> +		ptr = &__hv_hyperfail;
> +	static_call_update(__hv_hypercall, (hv_hypercall_f)ptr);
> +}
>  #else
> +static inline void hv_set_hypercall_pg(void *ptr)
> +{
> +	hv_hypercall_pg = ptr;
> +}
>  EXPORT_SYMBOL_GPL(hv_hypercall_pg);
>  #endif
> 
> @@ -349,7 +366,7 @@ static int hv_suspend(void)
>  	 * pointer is restored on resume.
>  	 */
>  	hv_hypercall_pg_saved = hv_hypercall_pg;
> -	hv_hypercall_pg = NULL;
> +	hv_set_hypercall_pg(NULL);
> 
>  	/* Disable the hypercall page in the hypervisor */
>  	rdmsrq(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> @@ -375,7 +392,7 @@ static void hv_resume(void)
>  		vmalloc_to_pfn(hv_hypercall_pg_saved);
>  	wrmsrq(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> 
> -	hv_hypercall_pg = hv_hypercall_pg_saved;
> +	hv_set_hypercall_pg(hv_hypercall_pg_saved);
>  	hv_hypercall_pg_saved = NULL;
> 
>  	/*
> @@ -529,8 +546,8 @@ void __init hyperv_init(void)
>  	if (hv_isolation_type_tdx() && !ms_hyperv.paravisor_present)
>  		goto skip_hypercall_pg_init;
> 
> -	hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START,
> -			VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_ROX,
> +	hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, MODULES_VADDR,
> +			MODULES_END, GFP_KERNEL, PAGE_KERNEL_ROX,
>  			VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
>  			__builtin_return_address(0));
>  	if (hv_hypercall_pg == NULL)
> @@ -568,27 +585,9 @@ void __init hyperv_init(void)
>  		wrmsrq(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
>  	}
> 
> -skip_hypercall_pg_init:
> -	/*
> -	 * Some versions of Hyper-V that provide IBT in guest VMs have a bug
> -	 * in that there's no ENDBR64 instruction at the entry to the
> -	 * hypercall page. Because hypercalls are invoked via an indirect call
> -	 * to the hypercall page, all hypercall attempts fail when IBT is
> -	 * enabled, and Linux panics. For such buggy versions, disable IBT.
> -	 *
> -	 * Fixed versions of Hyper-V always provide ENDBR64 on the hypercall
> -	 * page, so if future Linux kernel versions enable IBT for 32-bit
> -	 * builds, additional hypercall page hackery will be required here
> -	 * to provide an ENDBR32.
> -	 */
> -#ifdef CONFIG_X86_KERNEL_IBT
> -	if (cpu_feature_enabled(X86_FEATURE_IBT) &&
> -	    *(u32 *)hv_hypercall_pg != gen_endbr()) {
> -		setup_clear_cpu_cap(X86_FEATURE_IBT);
> -		pr_warn("Disabling IBT because of Hyper-V bug\n");
> -	}
> -#endif
> +	hv_set_hypercall_pg(hv_hypercall_pg);
> 
> +skip_hypercall_pg_init:
>  	/*
>  	 * hyperv_init() is called before LAPIC is initialized: see
>  	 * apic_intr_mode_init() -> x86_platform.apic_post_init() and
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-14 10:20 ` [PATCH v3 16/16] objtool: Validate kCFI calls Peter Zijlstra
  2025-07-14 10:49   ` Peter Zijlstra
  2025-07-14 16:30   ` Miguel Ojeda
@ 2025-07-16 21:03   ` Josh Poimboeuf
  2025-07-24 20:37   ` Sean Christopherson
  3 siblings, 0 replies; 36+ messages in thread
From: Josh Poimboeuf @ 2025-07-16 21:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh,
	linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen, ojeda

On Mon, Jul 14, 2025 at 12:20:27PM +0200, Peter Zijlstra wrote:
> Validate that all indirect calls adhere to kCFI rules. Notably, doing a
> nocfi indirect call to a cfi function is broken.
> 
> Apparently some Rust 'core' code violates this and explodes when run
> with FineIBT.
> 
> All the ANNOTATE_NOCFI_SYM sites are prime targets for attackers.
> 
>  - runtime EFI is especially henous because it also needs to disable
>    IBT. Basically calling unknown code without CFI protection at
>    runtime is a massice security issue.
> 
>  - Kexec image handover; if you can exploit this, you get to keep it :-)
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>

-- 
Josh

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/16] x86/kvm/emulate: Introduce EM_ASM_1SRC2
  2025-07-14 10:20 ` [PATCH v3 07/16] x86/kvm/emulate: Introduce EM_ASM_1SRC2 Peter Zijlstra
@ 2025-07-24  0:16   ` Sean Christopherson
  2025-08-18 10:37     ` Peter Zijlstra
  0 siblings, 1 reply; 36+ messages in thread
From: Sean Christopherson @ 2025-07-24  0:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen, ojeda

For all of the KVM patches, please use

  KVM: x86:

"x86/kvm" is used for guest-side code, and while I hope no one will conflate this
with guest code, the consistency is helpful.

On Mon, Jul 14, 2025, Peter Zijlstra wrote:
> Replace the FASTOP1SRC2*() instructions.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/kvm/emulate.c |   34 ++++++++++++++++++++++++++--------
>  1 file changed, 26 insertions(+), 8 deletions(-)
> 
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -317,6 +317,24 @@ static int em_##op(struct x86_emulate_ct
>  	ON64(case 8: __EM_ASM_1(op##q, rax); break;) \
>  	EM_ASM_END
>  
> +/* 1-operand, using "c" (src2) */
> +#define EM_ASM_1SRC2(op, name) \
> +	EM_ASM_START(name) \
> +	case 1: __EM_ASM_1(op##b, cl); break; \
> +	case 2: __EM_ASM_1(op##w, cx); break; \
> +	case 4: __EM_ASM_1(op##l, ecx); break; \
> +	ON64(case 8: __EM_ASM_1(op##q, rcx); break;) \
> +	EM_ASM_END
> +
> +/* 1-operand, using "c" (src2) with exception */
> +#define EM_ASM_1SRC2EX(op, name) \
> +	EM_ASM_START(name) \
> +	case 1: __EM_ASM_1_EX(op##b, cl); break; \
> +	case 2: __EM_ASM_1_EX(op##w, cx); break; \
> +	case 4: __EM_ASM_1_EX(op##l, ecx); break; \
> +	ON64(case 8: __EM_ASM_1(op##q, rcx); break;) \

This needs to be __EM_ASM_1_EX().  Luckily, KVM-Unit-Tests actually has a testcase
for divq (somewhere in the morass of testcases).  I also now have an extension to
the fastops selftest to explicitly test all four flavors of div-by-zero; I'll get
it posted tomorrow.
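
I.e., with the fix applied, the hunk reads:

/* 1-operand, using "c" (src2) with exception */
#define EM_ASM_1SRC2EX(op, name) \
	EM_ASM_START(name) \
	case 1: __EM_ASM_1_EX(op##b, cl); break; \
	case 2: __EM_ASM_1_EX(op##w, cx); break; \
	case 4: __EM_ASM_1_EX(op##l, ecx); break; \
	ON64(case 8: __EM_ASM_1_EX(op##q, rcx); break;) \
	EM_ASM_END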

(also, don't ask me how long it took me to spot the copy+paste typo; I was
 full-on debugging the exception fixup code before I realized my local diff
 looked "odd", *sigh*)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions
  2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
                   ` (15 preceding siblings ...)
  2025-07-14 10:20 ` [PATCH v3 16/16] objtool: Validate kCFI calls Peter Zijlstra
@ 2025-07-24 20:31 ` Sean Christopherson
  16 siblings, 0 replies; 36+ messages in thread
From: Sean Christopherson @ 2025-07-24 20:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen, ojeda

On Mon, Jul 14, 2025, Peter Zijlstra wrote:
> 
> Hi!
> 
> On kCFI (CONFIG_CFI_CLANG=y) builds all indirect calls should have the CFI
> check on (with very few exceptions). Not having the CFI checks undermines the
> protection provided by CFI and will make these sites candidates for people
> wanting to steal your cookies.
> 
> Specifically the ABI changes are so that doing indirect calls without the CFI
> magic, to a CFI adorned function is not compatible (although it happens to work
> for some setups, it very much does not for FineIBT).
> 
> Rust people tripped over this the other day, since their 'core' happened to
> have some no_sanitize(kcfi) bits in, which promptly exploded when ran with
> FineIBT on.
> 
> Since this is very much not a supported model -- on purpose, have objtool
> detect and warn about such constructs.
> 
> This effort [1] found all existing [2] non-cfi indirect calls in the kernel.
> 
> Notably the KVM fastop emulation stuff -- which is completely rewritten -- the
> generated code doesn't look horrific, but is slightly more verbose. I'm running
> on the assumption that instruction emulation is not super performance critical
> these days of zero VM-exit VMs etc. Paolo noted that pre-Westmere (2010) cares
> about this.

Yeah, I'm confident the fastop stuff isn't performance critical.  I'm skeptical
that fastops were _ever_ about raw performance.  

Running with EPT disabled to force emulation of Big RM, with OVMF and a 64-bit
Linux guest, I get literally zero hits on fastop().  With SeaBIOS and a 32-bit
Linux guest, booting a 24 vCPU VM hits <40 fastops.

Maybe there are some super legacy workloads that still heavily utilize Big RM,
but if they exist, I've no idea what they are, and AFAICT that was never the
motivation.

As highlighted in the original cover letter[*], fastops reduced the code footprint
of kvm/emulate.o by ~2500 bytes.  And as called out by commit e28bbd44dad1 ("KVM:
x86 emulator: framework for streamlining arithmetic opcodes"), executing a proxy
for the to-be-emulated instruction is all about functional correctness, e.g. to
ensure arithmetic RFLAGS match exactly.  Nothing suggests that performance was ever
a motivating factor.

I strongly suspect that the "fastop" name was a fairly arbitrary choice, and the
framework needed to be called _something_.  And then everyone since has assumed
that the motivation for fastops was to go fast, when in fact that was just a happy
side effect of the implementation.

[*] https://lore.kernel.org/all/1356179217-5526-1-git-send-email-avi.kivity@gmail.com


So, with the _EX goof fixed, and "KVM: x86:" for all the relevant KVM patches:

Acked-by: Sean Christopherson <seanjc@google.com>


P.S. Thanks a ton for cleaning this up!

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-14 10:20 ` [PATCH v3 16/16] objtool: Validate kCFI calls Peter Zijlstra
                     ` (2 preceding siblings ...)
  2025-07-16 21:03   ` Josh Poimboeuf
@ 2025-07-24 20:37   ` Sean Christopherson
  2025-07-25 17:57     ` Xin Li
  3 siblings, 1 reply; 36+ messages in thread
From: Sean Christopherson @ 2025-07-24 20:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen, ojeda

On Mon, Jul 14, 2025, Peter Zijlstra wrote:
> --- a/arch/x86/kvm/vmx/vmenter.S
> +++ b/arch/x86/kvm/vmx/vmenter.S
> @@ -361,6 +361,10 @@ SYM_FUNC_END(vmread_error_trampoline)
>  
>  .section .text, "ax"
>  
> +#ifndef CONFIG_X86_FRED
> +
>  SYM_FUNC_START(vmx_do_interrupt_irqoff)
>  	VMX_DO_EVENT_IRQOFF CALL_NOSPEC _ASM_ARG1
>  SYM_FUNC_END(vmx_do_interrupt_irqoff)
> +
> +#endif

This can go in the previous patch, "x86/fred: KVM: VMX: Always use FRED for IRQs
when CONFIG_X86_FRED=y".

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-24 20:37   ` Sean Christopherson
@ 2025-07-25 17:57     ` Xin Li
  2025-07-25 19:56       ` Sean Christopherson
  0 siblings, 1 reply; 36+ messages in thread
From: Xin Li @ 2025-07-25 17:57 UTC (permalink / raw)
  To: Sean Christopherson, Peter Zijlstra
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen, ojeda

On 7/24/2025 1:37 PM, Sean Christopherson wrote:
> On Mon, Jul 14, 2025, Peter Zijlstra wrote:
>> --- a/arch/x86/kvm/vmx/vmenter.S
>> +++ b/arch/x86/kvm/vmx/vmenter.S
>> @@ -361,6 +361,10 @@ SYM_FUNC_END(vmread_error_trampoline)
>>   
>>   .section .text, "ax"
>>   
>> +#ifndef CONFIG_X86_FRED
>> +
>>   SYM_FUNC_START(vmx_do_interrupt_irqoff)
>>   	VMX_DO_EVENT_IRQOFF CALL_NOSPEC _ASM_ARG1
>>   SYM_FUNC_END(vmx_do_interrupt_irqoff)
>> +
>> +#endif
> 
> This can go in the previous patch, "x86/fred: KVM: VMX: Always use FRED for IRQs
> when CONFIG_X86_FRED=y".
> 

I'm going to test patches 13~15, plus this change in patch 16.

BTW, there is a declaration for vmx_do_interrupt_irqoff() in
arch/x86/kvm/vmx/vmx.c, so we'd better also do:

--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6945,7 +6945,9 @@ void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
         vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]);
  }

+#ifndef CONFIG_X86_FRED
  void vmx_do_interrupt_irqoff(unsigned long entry);
+#endif
  void vmx_do_nmi_irqoff(void);

  static void handle_nm_fault_irqoff(struct kvm_vcpu *vcpu)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-25 17:57     ` Xin Li
@ 2025-07-25 19:56       ` Sean Christopherson
  2025-07-26  0:33         ` Xin Li
  0 siblings, 1 reply; 36+ messages in thread
From: Sean Christopherson @ 2025-07-25 19:56 UTC (permalink / raw)
  To: Xin Li
  Cc: Peter Zijlstra, x86, kys, haiyangz, wei.liu, decui, tglx, mingo,
	bp, dave.hansen, hpa, pbonzini, ardb, kees, Arnd Bergmann, gregkh,
	jpoimboe, linux-hyperv, linux-kernel, kvm, linux-efi,
	samitolvanen, ojeda

On Fri, Jul 25, 2025, Xin Li wrote:
> On 7/24/2025 1:37 PM, Sean Christopherson wrote:
> > On Mon, Jul 14, 2025, Peter Zijlstra wrote:
> > > --- a/arch/x86/kvm/vmx/vmenter.S
> > > +++ b/arch/x86/kvm/vmx/vmenter.S
> > > @@ -361,6 +361,10 @@ SYM_FUNC_END(vmread_error_trampoline)
> > >   .section .text, "ax"
> > > +#ifndef CONFIG_X86_FRED
> > > +
> > >   SYM_FUNC_START(vmx_do_interrupt_irqoff)
> > >   	VMX_DO_EVENT_IRQOFF CALL_NOSPEC _ASM_ARG1
> > >   SYM_FUNC_END(vmx_do_interrupt_irqoff)
> > > +
> > > +#endif
> > 
> > This can go in the previous patch, "x86/fred: KVM: VMX: Always use FRED for IRQs
> > when CONFIG_X86_FRED=y".
> > 
> 
> I'm going to test patch 13~15, plus this change in patch 16.
> 
> BTW, there is a declaration for vmx_do_interrupt_irqoff() in
> arch/x86/kvm/vmx/vmx.c, so we'd better also do:
> 
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6945,7 +6945,9 @@ void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
>         vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]);
>  }
> 
> +#ifndef CONFIG_X86_FRED
>  void vmx_do_interrupt_irqoff(unsigned long entry);
> +#endif

No, we want to keep the declaration.  Unconditionally declaring the symbol allows
KVM to use IS_ENABLED():

	if (IS_ENABLED(CONFIG_X86_FRED))
 		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector);

Hiding the declaration would require that to be a "proper" #ifdef, which would
be a net negative for readability.  The extra declaration won't hurt anything for
CONFIG_X86_FRED=n, as "bad" usage will still fail at link time.
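
To illustrate (a minimal sketch of the pattern, not the actual KVM call site;
the wrapper function and its arguments are made up for this example):

/* Always declared, even though the asm body is only built for FRED=n: */
void vmx_do_interrupt_irqoff(unsigned long entry);

static void dispatch_external_interrupt(unsigned long entry, unsigned int vector)
{
	if (IS_ENABLED(CONFIG_X86_FRED)) {
		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector);
	} else {
		/*
		 * Dead code when CONFIG_X86_FRED=y: the compiler eliminates
		 * this branch, so no reference to the (then undefined)
		 * symbol survives to link time.
		 */
		vmx_do_interrupt_irqoff(entry);
	}
}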

>  void vmx_do_nmi_irqoff(void);
> 
>  static void handle_nm_fault_irqoff(struct kvm_vcpu *vcpu)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 16/16] objtool: Validate kCFI calls
  2025-07-25 19:56       ` Sean Christopherson
@ 2025-07-26  0:33         ` Xin Li
  0 siblings, 0 replies; 36+ messages in thread
From: Xin Li @ 2025-07-26  0:33 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Peter Zijlstra, x86, kys, haiyangz, wei.liu, decui, tglx, mingo,
	bp, dave.hansen, hpa, pbonzini, ardb, kees, Arnd Bergmann, gregkh,
	jpoimboe, linux-hyperv, linux-kernel, kvm, linux-efi,
	samitolvanen, ojeda

On 7/25/2025 12:56 PM, Sean Christopherson wrote:
>>
>> BTW, there is a declaration for vmx_do_interrupt_irqoff() in
>> arch/x86/kvm/vmx/vmx.c, so we'd better also do:
>>
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -6945,7 +6945,9 @@ void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
>>          vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]);
>>   }
>>
>> +#ifndef CONFIG_X86_FRED
>>   void vmx_do_interrupt_irqoff(unsigned long entry);
>> +#endif
> No, we want to keep the declaration.  Unconditionally decaring the symbol allows
> KVM to use IS_ENABLED():
> 
> 	if (IS_ENABLED(CONFIG_X86_FRED))
>   		fred_entry_from_kvm(EVENT_TYPE_EXTINT, vector);
> 
> Hiding the declaration would require that to be a "proper" #ifdef, which would
> be a net negative for readability.  The extra declaration won't hurt anything for
> CONFIG_X86_FRED=n, as "bad" usage will still fail at link time.

I did hit a compilation error, so yes, we have to keep it.



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 14/16] x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware
  2025-07-14 10:20 ` [PATCH v3 14/16] x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware Peter Zijlstra
@ 2025-07-26  4:54   ` Xin Li
  2025-08-18 12:09     ` Peter Zijlstra
  0 siblings, 1 reply; 36+ messages in thread
From: Xin Li @ 2025-07-26  4:54 UTC (permalink / raw)
  To: Peter Zijlstra, x86
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen, ojeda

On 7/14/2025 3:20 AM, Peter Zijlstra wrote:
>   	call __fred_entry_from_kvm		/* Call the C entry point */
> -	POP_REGS
> -	ERETS
> -1:
> +
> +1:	/*

The symbol "1" is misplaced; it needs to be put after the ERETS
instruction.
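
In other words, roughly this (a sketch of the suggested placement, built from
the ALTERNATIVE hunk quoted further below, with the surrounding context
elided):

	/*
	 * When FRED, use ERETS to potentially clear NMIs, otherwise simply
	 * restore the stack pointer.
	 */
	ALTERNATIVE "nop; nop; mov %rbp, %rsp", \
		    __stringify(add $C_PTREGS_SIZE, %rsp; ERETS), \
		    X86_FEATURE_FRED
1:
	UNWIND_HINT_RESTORE

	pop %rbp
	RET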

> +	 * When FRED, use ERETS to potentially clear NMIs, otherwise simply
> +	 * restore the stack pointer.
> +	 */
> +	ALTERNATIVE "nop; nop; mov %rbp, %rsp", \

Why explicitly add two nops here?

ALTERNATIVE will still pad a three-byte nop after the MOV instruction.

> +	            __stringify(add $C_PTREGS_SIZE, %rsp; ERETS), \
> +		    X86_FEATURE_FRED
> +
>   	/*
> -	 * Objtool doesn't understand what ERETS does, this hint tells it that
> -	 * yes, we'll reach here and with what stack state. A save/restore pair
> -	 * isn't strictly needed, but it's the simplest form.
> +	 * Objtool doesn't understand ERETS, and the cfi register state is
> +	 * different from initial_func_cfi due to PUSH_REGS. Tell it the state
> +	 * is similar to where UNWIND_HINT_SAVE is.
>   	 */
>   	UNWIND_HINT_RESTORE
> +
>   	pop %rbp
>   	RET
>   


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 07/16] x86/kvm/emulate: Introduce EM_ASM_1SRC2
  2025-07-24  0:16   ` Sean Christopherson
@ 2025-08-18 10:37     ` Peter Zijlstra
  0 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-08-18 10:37 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, pbonzini, ardb, kees, Arnd Bergmann, gregkh, jpoimboe,
	linux-hyperv, linux-kernel, kvm, linux-efi, samitolvanen, ojeda

On Wed, Jul 23, 2025 at 05:16:38PM -0700, Sean Christopherson wrote:
> For all of the KVM patches, please use
> 
>   KVM: x86:
> 
> "x86/kvm" is used for guest-side code, and while I hope no one will conflate this
> with guest code, the consistency is helpful.

Sure.

> On Mon, Jul 14, 2025, Peter Zijlstra wrote:
> > Replace the FASTOP1SRC2*() instructions.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  arch/x86/kvm/emulate.c |   34 ++++++++++++++++++++++++++--------
> >  1 file changed, 26 insertions(+), 8 deletions(-)
> > 
> > --- a/arch/x86/kvm/emulate.c
> > +++ b/arch/x86/kvm/emulate.c
> > @@ -317,6 +317,24 @@ static int em_##op(struct x86_emulate_ct
> >  	ON64(case 8: __EM_ASM_1(op##q, rax); break;) \
> >  	EM_ASM_END
> >  
> > +/* 1-operand, using "c" (src2) */
> > +#define EM_ASM_1SRC2(op, name) \
> > +	EM_ASM_START(name) \
> > +	case 1: __EM_ASM_1(op##b, cl); break; \
> > +	case 2: __EM_ASM_1(op##w, cx); break; \
> > +	case 4: __EM_ASM_1(op##l, ecx); break; \
> > +	ON64(case 8: __EM_ASM_1(op##q, rcx); break;) \
> > +	EM_ASM_END
> > +
> > +/* 1-operand, using "c" (src2) with exception */
> > +#define EM_ASM_1SRC2EX(op, name) \
> > +	EM_ASM_START(name) \
> > +	case 1: __EM_ASM_1_EX(op##b, cl); break; \
> > +	case 2: __EM_ASM_1_EX(op##w, cx); break; \
> > +	case 4: __EM_ASM_1_EX(op##l, ecx); break; \
> > +	ON64(case 8: __EM_ASM_1(op##q, rcx); break;) \
> 
> This needs to be __EM_ASM_1_EX().  Luckily, KVM-Unit-Tests actually has a
> testcase for divq (somewhere in the morass of testcases).  I also now have an
> extension to
> the fastops selftest to explicitly test all four flavors of div-by-zero; I'll get
> it posted tomorrow.
> 
> (also, don't ask me how long it took me to spot the copy+paste typo; I was
>  full-on debugging the exception fixup code before I realized my local diff
>  looked "odd", *sigh*)

Urgh, sorry about that. Typically I use regex for these things, clearly
I messed up here.

Thanks and fixed!

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 12/16] x86_64,hyperv: Use direct call to hypercall-page
  2025-07-15 14:52   ` Michael Kelley
@ 2025-08-18 10:46     ` Peter Zijlstra
  0 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-08-18 10:46 UTC (permalink / raw)
  To: Michael Kelley
  Cc: x86@kernel.org, kys@microsoft.com, haiyangz@microsoft.com,
	wei.liu@kernel.org, decui@microsoft.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, seanjc@google.com, pbonzini@redhat.com,
	ardb@kernel.org, kees@kernel.org, Arnd Bergmann,
	gregkh@linuxfoundation.org, jpoimboe@kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, linux-efi@vger.kernel.org,
	samitolvanen@google.com, ojeda@kernel.org

On Tue, Jul 15, 2025 at 02:52:04PM +0000, Michael Kelley wrote:
> From: Peter Zijlstra <peterz@infradead.org> Sent: Monday, July 14, 2025 3:20 AM
> > 
> 
> Same nit here:  Use "x86/hyperv:" as the Subject prefix, for consistency with
> past practice.

Sure, done. Thanks!

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 14/16] x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware
  2025-07-26  4:54   ` Xin Li
@ 2025-08-18 12:09     ` Peter Zijlstra
  0 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2025-08-18 12:09 UTC (permalink / raw)
  To: Xin Li
  Cc: x86, kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen,
	hpa, seanjc, pbonzini, ardb, kees, Arnd Bergmann, gregkh,
	jpoimboe, linux-hyperv, linux-kernel, kvm, linux-efi,
	samitolvanen, ojeda

On Fri, Jul 25, 2025 at 09:54:32PM -0700, Xin Li wrote:
> On 7/14/2025 3:20 AM, Peter Zijlstra wrote:
> >   	call __fred_entry_from_kvm		/* Call the C entry point */
> > -	POP_REGS
> > -	ERETS
> > -1:
> > +
> > +1:	/*
> 
> The local label "1:" is misplaced; it needs to go after the ERETS
> instruction.

Doh, fixed.

> > +	 * When FRED, use ERETS to potentially clear NMIs, otherwise simply
> > +	 * restore the stack pointer.
> > +	 */
> > +	ALTERNATIVE "nop; nop; mov %rbp, %rsp", \
> 
> Why explicitly add two nops here?

Because the CFI information for all alternative code flows must be the
same. If the instruction offsets aren't lined up across the two flows, you
can end up with conflicting CFI at the same offset inside the alternative.

Specifically, we have:

0:	90        nop
1:	90        nop
2:	48 89 ec  mov %rbp, %rsp


0:	48 83 c4 0c  add $12, %rsp
4:      f2 0f 01 ca  erets

This gets us CFI updates on 0, 2 and 4, without conflicts.


> ALTERNATIVE will still pad a three-byte nop after the MOV instruction.
> 
> > +	            __stringify(add $C_PTREGS_SIZE, %rsp; ERETS), \
> > +		    X86_FEATURE_FRED
> > +
> >   	/*
> > -	 * Objtool doesn't understand what ERETS does, this hint tells it that
> > -	 * yes, we'll reach here and with what stack state. A save/restore pair
> > -	 * isn't strictly needed, but it's the simplest form.
> > +	 * Objtool doesn't understand ERETS, and the cfi register state is
> > +	 * different from initial_func_cfi due to PUSH_REGS. Tell it the state
> > +	 * is similar to where UNWIND_HINT_SAVE is.
> >   	 */
> >   	UNWIND_HINT_RESTORE
> > +
> >   	pop %rbp
> >   	RET
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2025-08-18 12:09 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-07-14 10:20 [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 01/16] x86/kvm/emulate: Implement test_cc() in C Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 02/16] x86/kvm/emulate: Introduce EM_ASM_1 Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 03/16] x86/kvm/emulate: Introduce EM_ASM_2 Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 04/16] x86/kvm/emulate: Introduce EM_ASM_2R Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 05/16] x86/kvm/emulate: Introduce EM_ASM_2W Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 06/16] x86/kvm/emulate: Introduce EM_ASM_2CL Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 07/16] x86/kvm/emulate: Introduce EM_ASM_1SRC2 Peter Zijlstra
2025-07-24  0:16   ` Sean Christopherson
2025-08-18 10:37     ` Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 08/16] x86/kvm/emulate: Introduce EM_ASM_3WCL Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 09/16] x86/kvm/emulate: Convert em_salc() to C Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 10/16] x86/kvm/emulate: Remove fastops Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 11/16] x86,hyperv: Clean up hv_do_hypercall() Peter Zijlstra
2025-07-15  4:54   ` Wei Liu
2025-07-15 14:51   ` Michael Kelley
2025-07-14 10:20 ` [PATCH v3 12/16] x86_64,hyperv: Use direct call to hypercall-page Peter Zijlstra
2025-07-15  4:58   ` Wei Liu
2025-07-15 14:52   ` Michael Kelley
2025-08-18 10:46     ` Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 13/16] x86/fred: Install system vector handlers even if FRED isnt fully enabled Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 14/16] x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware Peter Zijlstra
2025-07-26  4:54   ` Xin Li
2025-08-18 12:09     ` Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 15/16] x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y Peter Zijlstra
2025-07-14 10:20 ` [PATCH v3 16/16] objtool: Validate kCFI calls Peter Zijlstra
2025-07-14 10:49   ` Peter Zijlstra
2025-07-14 11:21     ` Peter Zijlstra
2025-07-14 16:30   ` Miguel Ojeda
2025-07-15  8:38     ` Peter Zijlstra
2025-07-16 21:03   ` Josh Poimboeuf
2025-07-24 20:37   ` Sean Christopherson
2025-07-25 17:57     ` Xin Li
2025-07-25 19:56       ` Sean Christopherson
2025-07-26  0:33         ` Xin Li
2025-07-24 20:31 ` [PATCH v3 00/16] objtool: Detect and warn about indirect calls in __nocfi functions Sean Christopherson
