public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs
@ 2024-11-11 11:59 Peter Zijlstra
  2024-11-11 11:59 ` [PATCH v2 01/12] objtool: Generic annotation infrastructure Peter Zijlstra
                   ` (12 more replies)
  0 siblings, 13 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx; +Cc: linux-kernel, x86, kvm, jthoughton

Hi!

At long last, a respin of these patches.

The FASTOPs are special because they rely on RET to preserve CFLAGS, which is a
problem with all the mitigation stuff. Also see things like: ba5ca5e5e6a1
("x86/retpoline: Don't clobber RFLAGS during srso_safe_ret()").

Rework FASTOPs to no longer use RET and side-step the problem of trying to make
the various return thunks preserve CFLAGS for just this one case.

There are two separate instances, test_cc() and fastop(). The first is
basically a SETCC wrapper, which seems like a very complicated (and somewhat
expensive) way to read FLAGS. Instead use the code we already have to emulate
JCC to fully emulate the instruction.

That then leaves fastop(), which when marked noinline is guaranteed to exist
only once. As such, CALL+RET isn't needed, because we'll always be RETurning to
the same location, as such replace with JMP+JMP.

My plan is to take the objtool patches through tip/objtool/core, the nospec
patches through tip/x86/core and either stick the fastop patches in that latter
tree if the KVM folks agree, or they can merge the aforementioned two branches
and then stick the patches on top, whatever works for people.




^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 01/12] objtool: Generic annotation infrastructure
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-15 18:38   ` Josh Poimboeuf
  2024-11-11 11:59 ` [PATCH v2 02/12] objtool: Convert ANNOTATE_NOENDBR to ANNOTATE Peter Zijlstra
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)

Avoid endless .discard.foo sections for each annotation, create a
single .discard.annotate section that takes an annotation type along
with the instruction.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/objtool.h |   18 ++++++++++++++++++
 tools/objtool/check.c   |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+)

--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -57,6 +57,13 @@
 	".long 998b\n\t"						\
 	".popsection\n\t"
 
+#define ASM_ANNOTATE(x)						\
+	"911:\n\t"						\
+	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
+	".long 911b - .\n\t"					\
+	".long " __stringify(x) "\n\t"				\
+	".popsection\n\t"
+
 #else /* __ASSEMBLY__ */
 
 /*
@@ -146,6 +153,14 @@
 	.popsection
 .endm
 
+.macro ANNOTATE type:req
+.Lhere_\@:
+	.pushsection .discard.annotate,"M",@progbits,8
+	.long	.Lhere_\@ - .
+	.long	\type
+	.popsection
+.endm
+
 #endif /* __ASSEMBLY__ */
 
 #else /* !CONFIG_OBJTOOL */
@@ -155,6 +170,7 @@
 #define UNWIND_HINT(type, sp_reg, sp_offset, signal) "\n\t"
 #define STACK_FRAME_NON_STANDARD(func)
 #define STACK_FRAME_NON_STANDARD_FP(func)
+#define ASM_ANNOTATE(x)
 #define ANNOTATE_NOENDBR
 #define ASM_REACHABLE
 #else
@@ -167,6 +183,8 @@
 .endm
 .macro REACHABLE
 .endm
+.macro ANNOTATE type:req
+.endm
 #endif
 
 #endif /* CONFIG_OBJTOOL */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2373,6 +2373,50 @@ static int read_unwind_hints(struct objt
 	return 0;
 }
 
+static int read_annotate(struct objtool_file *file, void (*func)(int type, struct instruction *insn))
+{
+	struct section *rsec, *sec;
+	struct instruction *insn;
+	struct reloc *reloc;
+	int type;
+
+	rsec = find_section_by_name(file->elf, ".rela.discard.annotate");
+	if (!rsec)
+		return 0;
+
+	sec = find_section_by_name(file->elf, ".discard.annotate");
+	if (!sec)
+		return 0;
+
+	if (sec->sh.sh_entsize != 8) {
+		static bool warn = false;
+		if (!warn) {
+			WARN("%s: dodgy linker, sh_entsize != 8", sec->name);
+			warn = true;
+		}
+		sec->sh.sh_entsize = 8;
+	}
+
+	for_each_reloc(rsec, reloc) {
+		insn = find_insn(file, reloc->sym->sec,
+				 reloc->sym->offset + reloc_addend(reloc));
+		if (!insn) {
+			WARN("bad .discard.annotate entry: %d", reloc_idx(reloc));
+			return -1;
+		}
+
+		type = *(u32 *)(sec->data->d_buf + (reloc_idx(reloc) * sec->sh.sh_entsize) + 4);
+
+		func(type, insn);
+	}
+
+	return 0;
+}
+
+static void __annotate_nop(int type, struct instruction *insn)
+{
+}
+
 static int read_noendbr_hints(struct objtool_file *file)
 {
 	struct instruction *insn;
@@ -2670,6 +2714,8 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
+	ret = read_annotate(file, __annotate_nop);
+
 	/*
 	 * Must be before read_unwind_hints() since that needs insn->noendbr.
 	 */



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 02/12] objtool: Convert ANNOTATE_NOENDBR to ANNOTATE
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
  2024-11-11 11:59 ` [PATCH v2 01/12] objtool: Generic annotation infrastructure Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-11 11:59 ` [PATCH v2 03/12] objtool: Convert ANNOTATE_RETPOLINE_SAFE " Peter Zijlstra
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/objtool.h             |   17 ++++-------------
 include/linux/objtool_types.h       |    5 +++++
 tools/include/linux/objtool_types.h |    5 +++++
 tools/objtool/check.c               |   32 +++++---------------------------
 4 files changed, 19 insertions(+), 40 deletions(-)

--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -45,12 +45,6 @@
 #define STACK_FRAME_NON_STANDARD_FP(func)
 #endif
 
-#define ANNOTATE_NOENDBR					\
-	"986: \n\t"						\
-	".pushsection .discard.noendbr\n\t"			\
-	".long 986b\n\t"					\
-	".popsection\n\t"
-
 #define ASM_REACHABLE							\
 	"998:\n\t"							\
 	".pushsection .discard.reachable\n\t"				\
@@ -64,6 +58,8 @@
 	".long " __stringify(x) "\n\t"				\
 	".popsection\n\t"
 
+#define ANNOTATE_NOENDBR	ASM_ANNOTATE(ANNOTYPE_NOENDBR)
+
 #else /* __ASSEMBLY__ */
 
 /*
@@ -122,13 +118,6 @@
 #endif
 .endm
 
-.macro ANNOTATE_NOENDBR
-.Lhere_\@:
-	.pushsection .discard.noendbr
-	.long	.Lhere_\@
-	.popsection
-.endm
-
 /*
  * Use objtool to validate the entry requirement that all code paths do
  * VALIDATE_UNRET_END before RET.
@@ -161,6 +150,8 @@
 	.popsection
 .endm
 
+#define ANNOTATE_NOENDBR	ANNOTATE type=ANNOTYPE_NOENDBR
+
 #endif /* __ASSEMBLY__ */
 
 #else /* !CONFIG_OBJTOOL */
--- a/include/linux/objtool_types.h
+++ b/include/linux/objtool_types.h
@@ -54,4 +54,9 @@ struct unwind_hint {
 #define UNWIND_HINT_TYPE_SAVE		6
 #define UNWIND_HINT_TYPE_RESTORE	7
 
+/*
+ * Annotate types
+ */
+#define ANNOTYPE_NOENDBR		1
+
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/include/linux/objtool_types.h
+++ b/tools/include/linux/objtool_types.h
@@ -54,4 +54,9 @@ struct unwind_hint {
 #define UNWIND_HINT_TYPE_SAVE		6
 #define UNWIND_HINT_TYPE_RESTORE	7
 
+/*
+ * Annotate types
+ */
+#define ANNOTYPE_NOENDBR		1
+
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2339,32 +2339,12 @@ static int read_annotate(struct objtool_
 	return 0;
 }
 
-static void __annotate_nop(int type, struct instruction *insn)
+static void __annotate_noendbr(int type, struct instruction *insn)
 {
-}
-
-static int read_noendbr_hints(struct objtool_file *file)
-{
-	struct instruction *insn;
-	struct section *rsec;
-	struct reloc *reloc;
-
-	rsec = find_section_by_name(file->elf, ".rela.discard.noendbr");
-	if (!rsec)
-		return 0;
+	if (type != ANNOTYPE_NOENDBR)
+		return;
 
-	for_each_reloc(rsec, reloc) {
-		insn = find_insn(file, reloc->sym->sec,
-				 reloc->sym->offset + reloc_addend(reloc));
-		if (!insn) {
-			WARN("bad .discard.noendbr entry");
-			return -1;
-		}
-
-		insn->noendbr = 1;
-	}
-
-	return 0;
+	insn->noendbr = 1;
 }
 
 static int read_retpoline_hints(struct objtool_file *file)
@@ -2637,12 +2617,10 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
-	ret = read_annotate(file, __annotate_nop);
-
 	/*
 	 * Must be before read_unwind_hints() since that needs insn->noendbr.
 	 */
-	ret = read_noendbr_hints(file);
+	ret = read_annotate(file, __annotate_noendbr);
 	if (ret)
 		return ret;
 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 03/12] objtool: Convert ANNOTATE_RETPOLINE_SAFE to ANNOTATE
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
  2024-11-11 11:59 ` [PATCH v2 01/12] objtool: Generic annotation infrastructure Peter Zijlstra
  2024-11-11 11:59 ` [PATCH v2 02/12] objtool: Convert ANNOTATE_NOENDBR to ANNOTATE Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-15 18:39   ` Josh Poimboeuf
  2024-11-11 11:59 ` [PATCH v2 04/12] objtool: Convert instrumentation_{begin,end}() " Peter Zijlstra
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |   13 +-------
 include/linux/objtool_types.h        |    1 
 tools/include/linux/objtool_types.h  |    1 
 tools/objtool/check.c                |   52 ++++++++++++-----------------------
 4 files changed, 22 insertions(+), 45 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -193,12 +193,7 @@
  * objtool the subsequent indirect jump/call is vouched safe for retpoline
  * builds.
  */
-.macro ANNOTATE_RETPOLINE_SAFE
-.Lhere_\@:
-	.pushsection .discard.retpoline_safe
-	.long .Lhere_\@
-	.popsection
-.endm
+#define ANNOTATE_RETPOLINE_SAFE	ANNOTATE type=ANNOTYPE_RETPOLINE_SAFE
 
 /*
  * (ab)use RETPOLINE_SAFE on RET to annotate away 'bare' RET instructions
@@ -317,11 +312,7 @@
 
 #else /* __ASSEMBLY__ */
 
-#define ANNOTATE_RETPOLINE_SAFE					\
-	"999:\n\t"						\
-	".pushsection .discard.retpoline_safe\n\t"		\
-	".long 999b\n\t"					\
-	".popsection\n\t"
+#define ANNOTATE_RETPOLINE_SAFE ASM_ANNOTATE(ANNOTYPE_RETPOLINE_SAFE)
 
 typedef u8 retpoline_thunk_t[RETPOLINE_THUNK_SIZE];
 extern retpoline_thunk_t __x86_indirect_thunk_array[];
--- a/include/linux/objtool_types.h
+++ b/include/linux/objtool_types.h
@@ -58,5 +58,6 @@ struct unwind_hint {
  * Annotate types
  */
 #define ANNOTYPE_NOENDBR		1
+#define ANNOTYPE_RETPOLINE_SAFE		2
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/include/linux/objtool_types.h
+++ b/tools/include/linux/objtool_types.h
@@ -58,5 +58,6 @@ struct unwind_hint {
  * Annotate types
  */
 #define ANNOTYPE_NOENDBR		1
+#define ANNOTYPE_RETPOLINE_SAFE		2
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2308,12 +2308,12 @@ static int read_unwind_hints(struct objt
 	return 0;
 }
 
-static int read_annotate(struct objtool_file *file, void (*func)(int type, struct instruction *insn))
+static int read_annotate(struct objtool_file *file, int (*func)(int type, struct instruction *insn))
 {
 	struct section *rsec, *sec;
 	struct instruction *insn;
 	struct reloc *reloc;
-	int type;
+	int type, ret;
 
 	rsec = find_section_by_name(file->elf, ".rela.discard.annotate");
 	if (!rsec)
@@ -2333,53 +2333,37 @@ static int read_annotate(struct objtool_
 
 		type = *(u32 *)(sec->data->d_buf + (reloc_idx(reloc) * sec->sh.sh_entsize) + 4);
 
-		func(type, insn);
+		ret = func(type, insn);
+		if (ret < 0)
+			return ret;
 	}
 
 	return 0;
 }
 
-static void __annotate_noendbr(int type, struct instruction *insn)
+static int __annotate_noendbr(int type, struct instruction *insn)
 {
 	if (type != ANNOTYPE_NOENDBR)
-		return;
+		return 0;
 
 	insn->noendbr = 1;
+	return 0;
 }
 
-static int read_retpoline_hints(struct objtool_file *file)
+static int __annotate_retpoline_safe(int type, struct instruction *insn)
 {
-	struct section *rsec;
-	struct instruction *insn;
-	struct reloc *reloc;
-
-	rsec = find_section_by_name(file->elf, ".rela.discard.retpoline_safe");
-	if (!rsec)
+	if (type != ANNOTYPE_RETPOLINE_SAFE)
 		return 0;
 
-	for_each_reloc(rsec, reloc) {
-		if (reloc->sym->type != STT_SECTION) {
-			WARN("unexpected relocation symbol type in %s", rsec->name);
-			return -1;
-		}
-
-		insn = find_insn(file, reloc->sym->sec, reloc_addend(reloc));
-		if (!insn) {
-			WARN("bad .discard.retpoline_safe entry");
-			return -1;
-		}
-
-		if (insn->type != INSN_JUMP_DYNAMIC &&
-		    insn->type != INSN_CALL_DYNAMIC &&
-		    insn->type != INSN_RETURN &&
-		    insn->type != INSN_NOP) {
-			WARN_INSN(insn, "retpoline_safe hint not an indirect jump/call/ret/nop");
-			return -1;
-		}
-
-		insn->retpoline_safe = true;
+	if (insn->type != INSN_JUMP_DYNAMIC &&
+	    insn->type != INSN_CALL_DYNAMIC &&
+	    insn->type != INSN_RETURN &&
+	    insn->type != INSN_NOP) {
+		WARN_INSN(insn, "retpoline_safe hint not an indirect jump/call/ret/nop");
+		return -1;
 	}
 
+	insn->retpoline_safe = true;
 	return 0;
 }
 
@@ -2666,7 +2650,7 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
-	ret = read_retpoline_hints(file);
+	ret = read_annotate(file, __annotate_retpoline_safe);
 	if (ret)
 		return ret;
 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 04/12] objtool: Convert instrumentation_{begin,end}() to ANNOTATE
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (2 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 03/12] objtool: Convert ANNOTATE_RETPOLINE_SAFE " Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-15 18:40   ` Josh Poimboeuf
  2024-11-11 11:59 ` [PATCH v2 05/12] objtool: Convert VALIDATE_UNRET_BEGIN " Peter Zijlstra
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/instrumentation.h     |   11 +++-----
 include/linux/objtool.h             |    9 ++++--
 include/linux/objtool_types.h       |    2 +
 tools/include/linux/objtool_types.h |    2 +
 tools/objtool/check.c               |   49 +++++++-----------------------------
 5 files changed, 25 insertions(+), 48 deletions(-)

--- a/include/linux/instrumentation.h
+++ b/include/linux/instrumentation.h
@@ -4,14 +4,14 @@
 
 #ifdef CONFIG_NOINSTR_VALIDATION
 
+#include <linux/objtool.h>
 #include <linux/stringify.h>
 
 /* Begin/end of an instrumentation safe region */
 #define __instrumentation_begin(c) ({					\
 	asm volatile(__stringify(c) ": nop\n\t"				\
-		     ".pushsection .discard.instr_begin\n\t"		\
-		     ".long " __stringify(c) "b - .\n\t"		\
-		     ".popsection\n\t" : : "i" (c));			\
+		     __ASM_ANNOTATE(c, ANNOTYPE_INSTR_BEGIN)		\
+		     : : "i" (c));					\
 })
 #define instrumentation_begin() __instrumentation_begin(__COUNTER__)
 
@@ -48,9 +48,8 @@
  */
 #define __instrumentation_end(c) ({					\
 	asm volatile(__stringify(c) ": nop\n\t"				\
-		     ".pushsection .discard.instr_end\n\t"		\
-		     ".long " __stringify(c) "b - .\n\t"		\
-		     ".popsection\n\t" : : "i" (c));			\
+		     __ASM_ANNOTATE(c, ANNOTYPE_INSTR_END)		\
+		     : : "i" (c));					\
 })
 #define instrumentation_end() __instrumentation_end(__COUNTER__)
 #else /* !CONFIG_NOINSTR_VALIDATION */
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -51,13 +51,16 @@
 	".long 998b\n\t"						\
 	".popsection\n\t"
 
-#define ASM_ANNOTATE(x)						\
-	"911:\n\t"						\
+#define __ASM_ANNOTATE(s, x)					\
 	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
-	".long 911b - .\n\t"					\
+	".long " __stringify(s) "b - .\n\t"			\
 	".long " __stringify(x) "\n\t"				\
 	".popsection\n\t"
 
+#define ASM_ANNOTATE(x)						\
+	"911:\n\t"						\
+	__ASM_ANNOTATE(911, x)
+
 #define ANNOTATE_NOENDBR	ASM_ANNOTATE(ANNOTYPE_NOENDBR)
 
 #else /* __ASSEMBLY__ */
--- a/include/linux/objtool_types.h
+++ b/include/linux/objtool_types.h
@@ -59,5 +59,7 @@ struct unwind_hint {
  */
 #define ANNOTYPE_NOENDBR		1
 #define ANNOTYPE_RETPOLINE_SAFE		2
+#define ANNOTYPE_INSTR_BEGIN		3
+#define ANNOTYPE_INSTR_END		4
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/include/linux/objtool_types.h
+++ b/tools/include/linux/objtool_types.h
@@ -59,5 +59,7 @@ struct unwind_hint {
  */
 #define ANNOTYPE_NOENDBR		1
 #define ANNOTYPE_RETPOLINE_SAFE		2
+#define ANNOTYPE_INSTR_BEGIN		3
+#define ANNOTYPE_INSTR_END		4
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2367,48 +2367,19 @@ static int __annotate_retpoline_safe(int
 	return 0;
 }
 
-static int read_instr_hints(struct objtool_file *file)
+static int __annotate_instr(int type, struct instruction *insn)
 {
-	struct section *rsec;
-	struct instruction *insn;
-	struct reloc *reloc;
-
-	rsec = find_section_by_name(file->elf, ".rela.discard.instr_end");
-	if (!rsec)
-		return 0;
-
-	for_each_reloc(rsec, reloc) {
-		if (reloc->sym->type != STT_SECTION) {
-			WARN("unexpected relocation symbol type in %s", rsec->name);
-			return -1;
-		}
-
-		insn = find_insn(file, reloc->sym->sec, reloc_addend(reloc));
-		if (!insn) {
-			WARN("bad .discard.instr_end entry");
-			return -1;
-		}
+	switch (type) {
+	case ANNOTYPE_INSTR_BEGIN:
+		insn->instr++;
+		break;
 
+	case ANNOTYPE_INSTR_END:
 		insn->instr--;
-	}
-
-	rsec = find_section_by_name(file->elf, ".rela.discard.instr_begin");
-	if (!rsec)
-		return 0;
+		break;
 
-	for_each_reloc(rsec, reloc) {
-		if (reloc->sym->type != STT_SECTION) {
-			WARN("unexpected relocation symbol type in %s", rsec->name);
-			return -1;
-		}
-
-		insn = find_insn(file, reloc->sym->sec, reloc_addend(reloc));
-		if (!insn) {
-			WARN("bad .discard.instr_begin entry");
-			return -1;
-		}
-
-		insn->instr++;
+	default:
+		break;
 	}
 
 	return 0;
@@ -2654,7 +2625,7 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
-	ret = read_instr_hints(file);
+	ret = read_annotate(file, __annotate_instr);
 	if (ret)
 		return ret;
 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 05/12] objtool: Convert VALIDATE_UNRET_BEGIN to ANNOTATE
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (3 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 04/12] objtool: Convert instrumentation_{begin,end}() " Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-11 11:59 ` [PATCH v2 06/12] objtool: Convert ANNOTATE_IGNORE_ALTERNATIVE " Peter Zijlstra
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/objtool.h             |    9 +++------
 include/linux/objtool_types.h       |    1 +
 tools/include/linux/objtool_types.h |    1 +
 tools/objtool/check.c               |   28 +++++-----------------------
 4 files changed, 10 insertions(+), 29 deletions(-)

--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -128,15 +128,12 @@
  * NOTE: The macro must be used at the beginning of a global symbol, otherwise
  * it will be ignored.
  */
-.macro VALIDATE_UNRET_BEGIN
 #if defined(CONFIG_NOINSTR_VALIDATION) && \
 	(defined(CONFIG_MITIGATION_UNRET_ENTRY) || defined(CONFIG_MITIGATION_SRSO))
-.Lhere_\@:
-	.pushsection .discard.validate_unret
-	.long	.Lhere_\@ - .
-	.popsection
+#define VALIDATE_UNRET_BEGIN	ANNOTATE type=ANNOTYPE_UNRET_BEGIN
+#else
+#define VALIDATE_UNRET_BEGIN
 #endif
-.endm
 
 .macro REACHABLE
 .Lhere_\@:
--- a/include/linux/objtool_types.h
+++ b/include/linux/objtool_types.h
@@ -61,5 +61,6 @@ struct unwind_hint {
 #define ANNOTYPE_RETPOLINE_SAFE		2
 #define ANNOTYPE_INSTR_BEGIN		3
 #define ANNOTYPE_INSTR_END		4
+#define ANNOTYPE_UNRET_BEGIN		5
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/include/linux/objtool_types.h
+++ b/tools/include/linux/objtool_types.h
@@ -61,5 +61,6 @@ struct unwind_hint {
 #define ANNOTYPE_RETPOLINE_SAFE		2
 #define ANNOTYPE_INSTR_BEGIN		3
 #define ANNOTYPE_INSTR_END		4
+#define ANNOTYPE_UNRET_BEGIN		5
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2450,33 +2450,15 @@ static int __annotate_instr(int type, st
 	return 0;
 }
 
-static int read_validate_unret_hints(struct objtool_file *file)
+static int __annotate_unret(int type, struct instruction *insn)
 {
-	struct section *rsec;
-	struct instruction *insn;
-	struct reloc *reloc;
-
-	rsec = find_section_by_name(file->elf, ".rela.discard.validate_unret");
-	if (!rsec)
+	if (type != ANNOTYPE_UNRET_BEGIN)
 		return 0;
 
-	for_each_reloc(rsec, reloc) {
-		if (reloc->sym->type != STT_SECTION) {
-			WARN("unexpected relocation symbol type in %s", rsec->name);
-			return -1;
-		}
-
-		insn = find_insn(file, reloc->sym->sec, reloc_addend(reloc));
-		if (!insn) {
-			WARN("bad .discard.instr_end entry");
-			return -1;
-		}
-		insn->unret = 1;
-	}
-
+	insn->unret = 1;
 	return 0;
-}
 
+}
 
 static int read_intra_function_calls(struct objtool_file *file)
 {
@@ -2697,7 +2679,7 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
-	ret = read_validate_unret_hints(file);
+	ret = read_annotate(file, __annotate_unret);
 	if (ret)
 		return ret;
 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 06/12] objtool: Convert ANNOTATE_IGNORE_ALTERNATIVE to ANNOTATE
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (4 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 05/12] objtool: Convert VALIDATE_UNRET_BEGIN " Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-11 11:59 ` [PATCH v2 07/12] objtool: Convert ANNOTATE_INTRA_FUNCTION_CALLS " Peter Zijlstra
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/alternative.h  |   14 ++---------
 include/linux/objtool_types.h       |    1 
 tools/include/linux/objtool_types.h |    1 
 tools/objtool/check.c               |   45 ++++++++----------------------------
 4 files changed, 15 insertions(+), 46 deletions(-)

--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -4,6 +4,7 @@
 
 #include <linux/types.h>
 #include <linux/stringify.h>
+#include <linux/objtool.h>
 #include <asm/asm.h>
 
 #define ALT_FLAGS_SHIFT		16
@@ -55,11 +56,7 @@
  * objtool annotation to ignore the alternatives and only consider the original
  * instruction(s).
  */
-#define ANNOTATE_IGNORE_ALTERNATIVE				\
-	"999:\n\t"						\
-	".pushsection .discard.ignore_alts\n\t"			\
-	".long 999b\n\t"					\
-	".popsection\n\t"
+#define ANNOTATE_IGNORE_ALTERNATIVE	ASM_ANNOTATE(ANNOTYPE_IGNORE_ALTS)
 
 /*
  * The patching flags are part of the upper bits of the @ft_flags parameter when
@@ -349,12 +346,7 @@ static inline int alternatives_text_rese
  * objtool annotation to ignore the alternatives and only consider the original
  * instruction(s).
  */
-.macro ANNOTATE_IGNORE_ALTERNATIVE
-	.Lannotate_\@:
-	.pushsection .discard.ignore_alts
-	.long .Lannotate_\@
-	.popsection
-.endm
+#define ANNOTATE_IGNORE_ALTERNATIVE ANNOTATE type=ANNOTYPE_IGNORE_ALTS
 
 /*
  * Issue one struct alt_instr descriptor entry (need to put it into
--- a/include/linux/objtool_types.h
+++ b/include/linux/objtool_types.h
@@ -62,5 +62,6 @@ struct unwind_hint {
 #define ANNOTYPE_INSTR_BEGIN		3
 #define ANNOTYPE_INSTR_END		4
 #define ANNOTYPE_UNRET_BEGIN		5
+#define ANNOTYPE_IGNORE_ALTS		6
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/include/linux/objtool_types.h
+++ b/tools/include/linux/objtool_types.h
@@ -62,5 +62,6 @@ struct unwind_hint {
 #define ANNOTYPE_INSTR_BEGIN		3
 #define ANNOTYPE_INSTR_END		4
 #define ANNOTYPE_UNRET_BEGIN		5
+#define ANNOTYPE_IGNORE_ALTS		6
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1255,40 +1255,6 @@ static void add_uaccess_safe(struct objt
 }
 
 /*
- * FIXME: For now, just ignore any alternatives which add retpolines.  This is
- * a temporary hack, as it doesn't allow ORC to unwind from inside a retpoline.
- * But it at least allows objtool to understand the control flow *around* the
- * retpoline.
- */
-static int add_ignore_alternatives(struct objtool_file *file)
-{
-	struct section *rsec;
-	struct reloc *reloc;
-	struct instruction *insn;
-
-	rsec = find_section_by_name(file->elf, ".rela.discard.ignore_alts");
-	if (!rsec)
-		return 0;
-
-	for_each_reloc(rsec, reloc) {
-		if (reloc->sym->type != STT_SECTION) {
-			WARN("unexpected relocation symbol type in %s", rsec->name);
-			return -1;
-		}
-
-		insn = find_insn(file, reloc->sym->sec, reloc_addend(reloc));
-		if (!insn) {
-			WARN("bad .discard.ignore_alts entry");
-			return -1;
-		}
-
-		insn->ignore_alts = true;
-	}
-
-	return 0;
-}
-
-/*
  * Symbols that replace INSN_CALL_DYNAMIC, every (tail) call to such a symbol
  * will be added to the .retpoline_sites section.
  */
@@ -2341,6 +2307,15 @@ static int read_annotate(struct objtool_
 	return 0;
 }
 
+static int __annotate_ignore_alts(int type, struct instruction *insn)
+{
+	if (type != ANNOTYPE_IGNORE_ALTS)
+		return 0;
+
+	insn->ignore_alts = true;
+	return 0;
+}
+
 static int __annotate_noendbr(int type, struct instruction *insn)
 {
 	if (type != ANNOTYPE_NOENDBR)
@@ -2550,7 +2525,7 @@ static int decode_sections(struct objtoo
 	add_ignores(file);
 	add_uaccess_safe(file);
 
-	ret = add_ignore_alternatives(file);
+	ret = read_annotate(file, __annotate_ignore_alts);
 	if (ret)
 		return ret;
 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 07/12] objtool: Convert ANNOTATE_INTRA_FUNCTION_CALLS to ANNOTATE
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (5 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 06/12] objtool: Convert ANNOTATE_IGNORE_ALTERNATIVE " Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-15 18:40   ` Josh Poimboeuf
  2024-11-11 11:59 ` [PATCH v2 08/12] objtool: Collapse annotate sequences Peter Zijlstra
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/objtool.h             |   16 ++----
 include/linux/objtool_types.h       |    1 
 tools/include/linux/objtool_types.h |    1 
 tools/objtool/check.c               |   96 ++++++++++++++----------------------
 4 files changed, 47 insertions(+), 67 deletions(-)

--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -66,16 +66,6 @@
 #else /* __ASSEMBLY__ */
 
 /*
- * This macro indicates that the following intra-function call is valid.
- * Any non-annotated intra-function call will cause objtool to issue a warning.
- */
-#define ANNOTATE_INTRA_FUNCTION_CALL				\
-	999:							\
-	.pushsection .discard.intra_function_calls;		\
-	.long 999b;						\
-	.popsection;
-
-/*
  * In asm, there are two kinds of code: normal C-type callable functions and
  * the rest.  The normal callable functions can be called by other code, and
  * don't do anything unusual with the stack.  Such normal callable functions
@@ -152,6 +142,12 @@
 
 #define ANNOTATE_NOENDBR	ANNOTATE type=ANNOTYPE_NOENDBR
 
+/*
+ * This macro indicates that the following intra-function call is valid.
+ * Any non-annotated intra-function call will cause objtool to issue a warning.
+ */
+#define ANNOTATE_INTRA_FUNCTION_CALL ANNOTATE type=ANNOTYPE_INTRA_FUNCTION_CALLS
+
 #endif /* __ASSEMBLY__ */
 
 #else /* !CONFIG_OBJTOOL */
--- a/include/linux/objtool_types.h
+++ b/include/linux/objtool_types.h
@@ -63,5 +63,6 @@ struct unwind_hint {
 #define ANNOTYPE_INSTR_END		4
 #define ANNOTYPE_UNRET_BEGIN		5
 #define ANNOTYPE_IGNORE_ALTS		6
+#define ANNOTYPE_INTRA_FUNCTION_CALLS	7
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/include/linux/objtool_types.h
+++ b/tools/include/linux/objtool_types.h
@@ -63,5 +63,6 @@ struct unwind_hint {
 #define ANNOTYPE_INSTR_END		4
 #define ANNOTYPE_UNRET_BEGIN		5
 #define ANNOTYPE_IGNORE_ALTS		6
+#define ANNOTYPE_INTRA_FUNCTION_CALLS	7
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2274,7 +2274,8 @@ static int read_unwind_hints(struct objt
 	return 0;
 }
 
-static int read_annotate(struct objtool_file *file, int (*func)(int type, struct instruction *insn))
+static int read_annotate(struct objtool_file *file,
+			 int (*func)(struct objtool_file *file, int type, struct instruction *insn))
 {
 	struct section *rsec, *sec;
 	struct instruction *insn;
@@ -2299,7 +2300,7 @@ static int read_annotate(struct objtool_
 
 		type = *(u32 *)(sec->data->d_buf + (reloc_idx(reloc) * sec->sh.sh_entsize) + 4);
 
-		ret = func(type, insn);
+		ret = func(file, type, insn);
 		if (ret < 0)
 			return ret;
 	}
@@ -2307,7 +2308,7 @@ static int read_annotate(struct objtool_
 	return 0;
 }
 
-static int __annotate_ignore_alts(int type, struct instruction *insn)
+static int __annotate_ignore_alts(struct objtool_file *file, int type, struct instruction *insn)
 {
 	if (type != ANNOTYPE_IGNORE_ALTS)
 		return 0;
@@ -2316,7 +2317,7 @@ static int __annotate_ignore_alts(int ty
 	return 0;
 }
 
-static int __annotate_noendbr(int type, struct instruction *insn)
+static int __annotate_noendbr(struct objtool_file *file, int type, struct instruction *insn)
 {
 	if (type != ANNOTYPE_NOENDBR)
 		return 0;
@@ -2325,7 +2326,37 @@ static int __annotate_noendbr(int type,
 	return 0;
 }
 
-static int __annotate_retpoline_safe(int type, struct instruction *insn)
+static int __annotate_ifc(struct objtool_file *file, int type, struct instruction *insn)
+{
+	unsigned long dest_off;
+
+	if (type != ANNOTYPE_INTRA_FUNCTION_CALLS)
+		return 0;
+
+	if (insn->type != INSN_CALL) {
+		WARN_INSN(insn, "intra_function_call not a direct call");
+		return -1;
+	}
+
+	/*
+	 * Treat intra-function CALLs as JMPs, but with a stack_op.
+	 * See add_call_destinations(), which strips stack_ops from
+	 * normal CALLs.
+	 */
+	insn->type = INSN_JUMP_UNCONDITIONAL;
+
+	dest_off = arch_jump_destination(insn);
+	insn->jump_dest = find_insn(file, insn->sec, dest_off);
+	if (!insn->jump_dest) {
+		WARN_INSN(insn, "can't find call dest at %s+0x%lx",
+			  insn->sec->name, dest_off);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int __annotate_retpoline_safe(struct objtool_file *file, int type, struct instruction *insn)
 {
 	if (type != ANNOTYPE_RETPOLINE_SAFE)
 		return 0;
@@ -2342,7 +2373,7 @@ static int __annotate_retpoline_safe(int
 	return 0;
 }
 
-static int __annotate_instr(int type, struct instruction *insn)
+static int __annotate_instr(struct objtool_file *file, int type, struct instruction *insn)
 {
 	switch (type) {
 	case ANNOTYPE_INSTR_BEGIN:
@@ -2360,7 +2391,7 @@ static int __annotate_instr(int type, st
 	return 0;
 }
 
-static int __annotate_unret(int type, struct instruction *insn)
+static int __annotate_unret(struct objtool_file *file, int type, struct instruction *insn)
 {
 	if (type != ANNOTYPE_UNRET_BEGIN)
 		return 0;
@@ -2370,55 +2401,6 @@ static int __annotate_unret(int type, st
 
 }
 
-static int read_intra_function_calls(struct objtool_file *file)
-{
-	struct instruction *insn;
-	struct section *rsec;
-	struct reloc *reloc;
-
-	rsec = find_section_by_name(file->elf, ".rela.discard.intra_function_calls");
-	if (!rsec)
-		return 0;
-
-	for_each_reloc(rsec, reloc) {
-		unsigned long dest_off;
-
-		if (reloc->sym->type != STT_SECTION) {
-			WARN("unexpected relocation symbol type in %s",
-			     rsec->name);
-			return -1;
-		}
-
-		insn = find_insn(file, reloc->sym->sec, reloc_addend(reloc));
-		if (!insn) {
-			WARN("bad .discard.intra_function_call entry");
-			return -1;
-		}
-
-		if (insn->type != INSN_CALL) {
-			WARN_INSN(insn, "intra_function_call not a direct call");
-			return -1;
-		}
-
-		/*
-		 * Treat intra-function CALLs as JMPs, but with a stack_op.
-		 * See add_call_destinations(), which strips stack_ops from
-		 * normal CALLs.
-		 */
-		insn->type = INSN_JUMP_UNCONDITIONAL;
-
-		dest_off = arch_jump_destination(insn);
-		insn->jump_dest = find_insn(file, insn->sec, dest_off);
-		if (!insn->jump_dest) {
-			WARN_INSN(insn, "can't find call dest at %s+0x%lx",
-				  insn->sec->name, dest_off);
-			return -1;
-		}
-	}
-
-	return 0;
-}
-
 /*
  * Return true if name matches an instrumentation function, where calls to that
  * function from noinstr code can safely be removed, but compilers won't do so.
@@ -2554,7 +2536,7 @@ static int decode_sections(struct objtoo
 	 * Must be before add_call_destination(); it changes INSN_CALL to
 	 * INSN_JUMP.
 	 */
-	ret = read_intra_function_calls(file);
+	ret = read_annotate(file, __annotate_ifc);
 	if (ret)
 		return ret;
 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 08/12] objtool: Collapse annotate sequences
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (6 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 07/12] objtool: Convert ANNOTATE_INTRA_FUNCTION_CALLS " Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-11 11:59 ` [PATCH v2 09/12] x86/nospec: JMP_NOSPEC Peter Zijlstra
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)

Reduce read_annotate() runs by collapsing subsequent runs into a
single call.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c |   87 ++++++++++++++++++--------------------------------
 1 file changed, 32 insertions(+), 55 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2308,21 +2308,24 @@ static int read_annotate(struct objtool_
 	return 0;
 }
 
-static int __annotate_ignore_alts(struct objtool_file *file, int type, struct instruction *insn)
+static int __annotate_early(struct objtool_file *file, int type, struct instruction *insn)
 {
-	if (type != ANNOTYPE_IGNORE_ALTS)
-		return 0;
+	switch (type) {
+	case ANNOTYPE_IGNORE_ALTS:
+		insn->ignore_alts = true;
+		break;
 
-	insn->ignore_alts = true;
-	return 0;
-}
+	/*
+	 * Must be before read_unwind_hints() since that needs insn->noendbr.
+	 */
+	case ANNOTYPE_NOENDBR:
+		insn->noendbr = 1;
+		break;
 
-static int __annotate_noendbr(struct objtool_file *file, int type, struct instruction *insn)
-{
-	if (type != ANNOTYPE_NOENDBR)
-		return 0;
+	default:
+		break;
+	}
 
-	insn->noendbr = 1;
 	return 0;
 }
 
@@ -2356,26 +2359,21 @@ static int __annotate_ifc(struct objtool
 	return 0;
 }
 
-static int __annotate_retpoline_safe(struct objtool_file *file, int type, struct instruction *insn)
+static int __annotate_late(struct objtool_file *file, int type, struct instruction *insn)
 {
-	if (type != ANNOTYPE_RETPOLINE_SAFE)
-		return 0;
-
-	if (insn->type != INSN_JUMP_DYNAMIC &&
-	    insn->type != INSN_CALL_DYNAMIC &&
-	    insn->type != INSN_RETURN &&
-	    insn->type != INSN_NOP) {
-		WARN_INSN(insn, "retpoline_safe hint not an indirect jump/call/ret/nop");
-		return -1;
-	}
+	switch (type) {
+	case ANNOTYPE_RETPOLINE_SAFE:
+		if (insn->type != INSN_JUMP_DYNAMIC &&
+		    insn->type != INSN_CALL_DYNAMIC &&
+		    insn->type != INSN_RETURN &&
+		    insn->type != INSN_NOP) {
+			WARN_INSN(insn, "retpoline_safe hint not an indirect jump/call/ret/nop");
+			return -1;
+		}
 
-	insn->retpoline_safe = true;
-	return 0;
-}
+		insn->retpoline_safe = true;
+		break;
 
-static int __annotate_instr(struct objtool_file *file, int type, struct instruction *insn)
-{
-	switch (type) {
 	case ANNOTYPE_INSTR_BEGIN:
 		insn->instr++;
 		break;
@@ -2384,6 +2382,10 @@ static int __annotate_instr(struct objto
 		insn->instr--;
 		break;
 
+	case ANNOTYPE_UNRET_BEGIN:
+		insn->unret = 1;
+		break;
+
 	default:
 		break;
 	}
@@ -2391,16 +2393,6 @@ static int __annotate_instr(struct objto
 	return 0;
 }
 
-static int __annotate_unret(struct objtool_file *file, int type, struct instruction *insn)
-{
-	if (type != ANNOTYPE_UNRET_BEGIN)
-		return 0;
-
-	insn->unret = 1;
-	return 0;
-
-}
-
 /*
  * Return true if name matches an instrumentation function, where calls to that
  * function from noinstr code can safely be removed, but compilers won't do so.
@@ -2507,14 +2499,7 @@ static int decode_sections(struct objtoo
 	add_ignores(file);
 	add_uaccess_safe(file);
 
-	ret = read_annotate(file, __annotate_ignore_alts);
-	if (ret)
-		return ret;
-
-	/*
-	 * Must be before read_unwind_hints() since that needs insn->noendbr.
-	 */
-	ret = read_annotate(file, __annotate_noendbr);
+	ret = read_annotate(file, __annotate_early);
 	if (ret)
 		return ret;
 
@@ -2560,15 +2545,7 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
-	ret = read_annotate(file, __annotate_retpoline_safe);
-	if (ret)
-		return ret;
-
-	ret = read_annotate(file, __annotate_instr);
-	if (ret)
-		return ret;
-
-	ret = read_annotate(file, __annotate_unret);
+	ret = read_annotate(file, __annotate_late);
 	if (ret)
 		return ret;
 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 09/12] x86/nospec: JMP_NOSPEC
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (7 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 08/12] objtool: Collapse annotate sequences Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-11 11:59 ` [PATCH v2 10/12] x86,nospec: Simplify {JMP,CALL}_NOSPEC (part 2) Peter Zijlstra
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |   32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -403,6 +403,17 @@ static inline void call_depth_return_thu
 	"call *%[thunk_target]\n",				\
 	X86_FEATURE_RETPOLINE_LFENCE)
 
+# define JMP_NOSPEC						\
+	ALTERNATIVE_2(						\
+	ANNOTATE_RETPOLINE_SAFE					\
+	"jmp *%[thunk_target]\n",				\
+	"jmp __x86_indirect_thunk_%V[thunk_target]\n",		\
+	X86_FEATURE_RETPOLINE,					\
+	"lfence;\n"						\
+	ANNOTATE_RETPOLINE_SAFE					\
+	"jmp *%[thunk_target]\n",				\
+	X86_FEATURE_RETPOLINE_LFENCE)
+
 # define THUNK_TARGET(addr) [thunk_target] "r" (addr)
 
 #else /* CONFIG_X86_32 */
@@ -433,10 +444,31 @@ static inline void call_depth_return_thu
 	"call *%[thunk_target]\n",				\
 	X86_FEATURE_RETPOLINE_LFENCE)
 
+# define JMP_NOSPEC						\
+	ALTERNATIVE_2(						\
+	ANNOTATE_RETPOLINE_SAFE					\
+	"jmp *%[thunk_target]\n",				\
+	"       jmp    901f;\n"					\
+	"       .align 16\n"					\
+	"901:	call   903f;\n"					\
+	"902:	pause;\n"					\
+	"       lfence;\n"					\
+	"       jmp    902b;\n"					\
+	"       .align 16\n"					\
+	"903:	lea    4(%%esp), %%esp;\n"			\
+	"       pushl  %[thunk_target];\n"			\
+	"       ret;\n",					\
+	X86_FEATURE_RETPOLINE,					\
+	"lfence;\n"						\
+	ANNOTATE_RETPOLINE_SAFE					\
+	"jmp *%[thunk_target]\n",				\
+	X86_FEATURE_RETPOLINE_LFENCE)
+
 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
 #endif
 #else /* No retpoline for C / inline asm */
 # define CALL_NOSPEC "call *%[thunk_target]\n"
+# define JMP_NOSPEC "jmp *%[thunk_target]\n"
 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
 #endif
 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 10/12] x86,nospec: Simplify {JMP,CALL}_NOSPEC (part 2)
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (8 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 09/12] x86/nospec: JMP_NOSPEC Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-15 18:40   ` Josh Poimboeuf
  2024-11-11 11:59 ` [PATCH v2 11/12] x86/kvm/emulate: Implement test_cc() in C Peter Zijlstra
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)

Counterpart to 09d09531a51a ("x86,nospec: Simplify
{JMP,CALL}_NOSPEC"), x86_64 will rewrite all this anyway, see
apply_retpoline.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |   29 +++++++++++------------------
 1 file changed, 11 insertions(+), 18 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -429,31 +429,24 @@ static inline void call_depth_return_thu
 
 #ifdef CONFIG_X86_64
 
+#define __CS_PREFIX						\
+	".irp rs,r8,r9,r10,r11,r12,r13,r14,r15\n"		\
+	".ifc %V[thunk_target],\\rs\n"				\
+	".byte 0x2e\n"						\
+	".endif\n"						\
+	".endr\n"
+
 /*
  * Inline asm uses the %V modifier which is only in newer GCC
  * which is ensured when CONFIG_MITIGATION_RETPOLINE is defined.
  */
 # define CALL_NOSPEC						\
-	ALTERNATIVE_2(						\
-	ANNOTATE_RETPOLINE_SAFE					\
-	"call *%[thunk_target]\n",				\
-	"call __x86_indirect_thunk_%V[thunk_target]\n",		\
-	X86_FEATURE_RETPOLINE,					\
-	"lfence;\n"						\
-	ANNOTATE_RETPOLINE_SAFE					\
-	"call *%[thunk_target]\n",				\
-	X86_FEATURE_RETPOLINE_LFENCE)
+	__CS_PREFIX						\
+	"call __x86_indirect_thunk_%V[thunk_target]\n"
 
 # define JMP_NOSPEC						\
-	ALTERNATIVE_2(						\
-	ANNOTATE_RETPOLINE_SAFE					\
-	"jmp *%[thunk_target]\n",				\
-	"jmp __x86_indirect_thunk_%V[thunk_target]\n",		\
-	X86_FEATURE_RETPOLINE,					\
-	"lfence;\n"						\
-	ANNOTATE_RETPOLINE_SAFE					\
-	"jmp *%[thunk_target]\n",				\
-	X86_FEATURE_RETPOLINE_LFENCE)
+	__CS_PREFIX						\
+	"jmp __x86_indirect_thunk_%V[thunk_target]\n"
 
 # define THUNK_TARGET(addr) [thunk_target] "r" (addr)
 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 11/12] x86/kvm/emulate: Implement test_cc() in C
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (9 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 10/12] x86,nospec: Simplify {JMP,CALL}_NOSPEC (part 2) Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-11 17:13   ` Sean Christopherson
  2024-11-11 11:59 ` [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops Peter Zijlstra
  2024-11-11 17:27 ` [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Sean Christopherson
  12 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)

Current test_cc() uses the fastop infrastructure to test flags using
SETcc instructions. However, int3_emulate_jcc() already fully
implements the flags->CC mapping, use that.

Removes a pile of gnarly asm.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/text-patching.h |   20 +++++++++++++-------
 arch/x86/kvm/emulate.c               |   34 ++--------------------------------
 2 files changed, 15 insertions(+), 39 deletions(-)

--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -176,9 +176,9 @@ void int3_emulate_ret(struct pt_regs *re
 }
 
 static __always_inline
-void int3_emulate_jcc(struct pt_regs *regs, u8 cc, unsigned long ip, unsigned long disp)
+bool __emulate_cc(unsigned long flags, u8 cc)
 {
-	static const unsigned long jcc_mask[6] = {
+	static const unsigned long cc_mask[6] = {
 		[0] = X86_EFLAGS_OF,
 		[1] = X86_EFLAGS_CF,
 		[2] = X86_EFLAGS_ZF,
@@ -191,15 +191,21 @@ void int3_emulate_jcc(struct pt_regs *re
 	bool match;
 
 	if (cc < 0xc) {
-		match = regs->flags & jcc_mask[cc >> 1];
+		match = flags & cc_mask[cc >> 1];
 	} else {
-		match = ((regs->flags & X86_EFLAGS_SF) >> X86_EFLAGS_SF_BIT) ^
-			((regs->flags & X86_EFLAGS_OF) >> X86_EFLAGS_OF_BIT);
+		match = ((flags & X86_EFLAGS_SF) >> X86_EFLAGS_SF_BIT) ^
+			((flags & X86_EFLAGS_OF) >> X86_EFLAGS_OF_BIT);
 		if (cc >= 0xe)
-			match = match || (regs->flags & X86_EFLAGS_ZF);
+			match = match || (flags & X86_EFLAGS_ZF);
 	}
 
-	if ((match && !invert) || (!match && invert))
+	return (match && !invert) || (!match && invert);
+}
+
+static __always_inline
+void int3_emulate_jcc(struct pt_regs *regs, u8 cc, unsigned long ip, unsigned long disp)
+{
+	if (__emulate_cc(regs->flags, cc))
 		ip += disp;
 
 	int3_emulate_jmp(regs, ip);
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -26,6 +26,7 @@
 #include <asm/debugreg.h>
 #include <asm/nospec-branch.h>
 #include <asm/ibt.h>
+#include <asm/text-patching.h>
 
 #include "x86.h"
 #include "tss.h"
@@ -416,31 +417,6 @@ static int fastop(struct x86_emulate_ctx
 	ON64(FOP3E(op##q, rax, rdx, cl)) \
 	FOP_END
 
-/* Special case for SETcc - 1 instruction per cc */
-#define FOP_SETCC(op) \
-	FOP_FUNC(op) \
-	#op " %al \n\t" \
-	FOP_RET(op)
-
-FOP_START(setcc)
-FOP_SETCC(seto)
-FOP_SETCC(setno)
-FOP_SETCC(setc)
-FOP_SETCC(setnc)
-FOP_SETCC(setz)
-FOP_SETCC(setnz)
-FOP_SETCC(setbe)
-FOP_SETCC(setnbe)
-FOP_SETCC(sets)
-FOP_SETCC(setns)
-FOP_SETCC(setp)
-FOP_SETCC(setnp)
-FOP_SETCC(setl)
-FOP_SETCC(setnl)
-FOP_SETCC(setle)
-FOP_SETCC(setnle)
-FOP_END;
-
 FOP_START(salc)
 FOP_FUNC(salc)
 "pushf; sbb %al, %al; popf \n\t"
@@ -1064,13 +1040,7 @@ static int em_bsr_c(struct x86_emulate_c
 
 static __always_inline u8 test_cc(unsigned int condition, unsigned long flags)
 {
-	u8 rc;
-	void (*fop)(void) = (void *)em_setcc + FASTOP_SIZE * (condition & 0xf);
-
-	flags = (flags & EFLAGS_MASK) | X86_EFLAGS_IF;
-	asm("push %[flags]; popf; " CALL_NOSPEC
-	    : "=a"(rc), ASM_CALL_CONSTRAINT : [thunk_target]"r"(fop), [flags]"r"(flags));
-	return rc;
+	return __emulate_cc(flags, condition & 0xf);
 }
 
 static void fetch_register_operand(struct operand *op)



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (10 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 11/12] x86/kvm/emulate: Implement test_cc() in C Peter Zijlstra
@ 2024-11-11 11:59 ` Peter Zijlstra
  2024-11-11 16:27   ` Peter Zijlstra
                     ` (2 more replies)
  2024-11-11 17:27 ` [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Sean Christopherson
  12 siblings, 3 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 11:59 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, Peter Zijlstra (Intel)

Since there is only a single fastop() function, convert the FASTOP
stuff from CALL_NOSPEC+RET to JMP_NOSPEC+JMP, avoiding the return
thunks and all that jazz.

Specifically FASTOPs rely on the return thunk to preserve EFLAGS,
which not all of them can trivially do (call depth tracing suffers
here).

Objtool strenuously complains about this:

 - indirect call without a .rodata, fails to determine JUMP_TABLE,
   annotate
 - fastop functions fall through, exception
 - unreachable instruction after fastop_return, save/restore

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c              |   20 +++++++++++++++-----
 include/linux/objtool_types.h       |    1 +
 tools/include/linux/objtool_types.h |    1 +
 tools/objtool/check.c               |   11 ++++++++++-
 4 files changed, 27 insertions(+), 6 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -285,8 +285,8 @@ static void invalidate_registers(struct
  * different operand sizes can be reached by calculation, rather than a jump
  * table (which would be bigger than the code).
  *
- * The 16 byte alignment, considering 5 bytes for the RET thunk, 3 for ENDBR
- * and 1 for the straight line speculation INT3, leaves 7 bytes for the
+ * The 16 byte alignment, considering 5 bytes for the JMP, 4 for ENDBR
+ * and 1 for the straight line speculation INT3, leaves 6 bytes for the
  * body of the function.  Currently none is larger than 4.
  */
 static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop);
@@ -304,7 +304,7 @@ static int fastop(struct x86_emulate_ctx
 	__FOP_FUNC(#name)
 
 #define __FOP_RET(name) \
-	"11: " ASM_RET \
+	"11: jmp fastop_return; int3 \n\t" \
 	".size " name ", .-" name "\n\t"
 
 #define FOP_RET(name) \
@@ -5071,14 +5071,24 @@ static void fetch_possible_mmx_operand(s
 		kvm_read_mmx_reg(op->addr.mm, &op->mm_val);
 }
 
-static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop)
+/*
+ * All the FASTOP magic above relies on there being *one* instance of this
+ * so it can JMP back, avoiding RET and it's various thunks.
+ */
+static noinline int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop)
 {
 	ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
 
 	if (!(ctxt->d & ByteOp))
 		fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
 
-	asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
+	asm("push %[flags]; popf \n\t"
+	    UNWIND_HINT(UNWIND_HINT_TYPE_SAVE, 0, 0, 0)
+	    ASM_ANNOTATE(ANNOTYPE_JUMP_TABLE)
+	    JMP_NOSPEC
+	    "fastop_return: \n\t"
+	    UNWIND_HINT(UNWIND_HINT_TYPE_RESTORE, 0, 0, 0)
+	    "pushf; pop %[flags]\n"
 	    : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
 	      [thunk_target]"+S"(fop), ASM_CALL_CONSTRAINT
 	    : "c"(ctxt->src2.val));
--- a/include/linux/objtool_types.h
+++ b/include/linux/objtool_types.h
@@ -64,5 +64,6 @@ struct unwind_hint {
 #define ANNOTYPE_UNRET_BEGIN		5
 #define ANNOTYPE_IGNORE_ALTS		6
 #define ANNOTYPE_INTRA_FUNCTION_CALLS	7
+#define ANNOTYPE_JUMP_TABLE		8
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/include/linux/objtool_types.h
+++ b/tools/include/linux/objtool_types.h
@@ -64,5 +64,6 @@ struct unwind_hint {
 #define ANNOTYPE_UNRET_BEGIN		5
 #define ANNOTYPE_IGNORE_ALTS		6
 #define ANNOTYPE_INTRA_FUNCTION_CALLS	7
+#define ANNOTYPE_JUMP_TABLE		8
 
 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2386,6 +2386,14 @@ static int __annotate_late(struct objtoo
 		insn->unret = 1;
 		break;
 
+	/*
+	 * Must be after add_jump_table(); for it doesn't set a sane
+	 * _jump_table value.
+	 */
+	case ANNOTYPE_JUMP_TABLE:
+		insn->_jump_table = (void *)1;
+		break;
+
 	default:
 		break;
 	}
@@ -3459,7 +3467,8 @@ static int validate_branch(struct objtoo
 		if (func && insn_func(insn) && func != insn_func(insn)->pfunc) {
 			/* Ignore KCFI type preambles, which always fall through */
 			if (!strncmp(func->name, "__cfi_", 6) ||
-			    !strncmp(func->name, "__pfx_", 6))
+			    !strncmp(func->name, "__pfx_", 6) ||
+			    !strcmp(insn_func(insn)->name, "fastop"))
 				return 0;
 
 			WARN("%s() falls through to next function %s()",



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops
  2024-11-11 11:59 ` [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops Peter Zijlstra
@ 2024-11-11 16:27   ` Peter Zijlstra
  2024-11-11 17:26   ` Sean Christopherson
  2024-11-15 18:41   ` Josh Poimboeuf
  2 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 16:27 UTC (permalink / raw)
  To: seanjc, pbonzini, jpoimboe, tglx
  Cc: linux-kernel, x86, kvm, jthoughton, andrew.cooper3

On Mon, Nov 11, 2024 at 12:59:47PM +0100, Peter Zijlstra wrote:

> +/*
> + * All the FASTOP magic above relies on there being *one* instance of this
> + * so it can JMP back, avoiding RET and it's various thunks.
> + */
> +static noinline int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop)
>  {
>  	ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
>  
>  	if (!(ctxt->d & ByteOp))
>  		fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
>  
> -	asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
> +	asm("push %[flags]; popf \n\t"
> +	    UNWIND_HINT(UNWIND_HINT_TYPE_SAVE, 0, 0, 0)
> +	    ASM_ANNOTATE(ANNOTYPE_JUMP_TABLE)
> +	    JMP_NOSPEC
> +	    "fastop_return: \n\t"
> +	    UNWIND_HINT(UNWIND_HINT_TYPE_RESTORE, 0, 0, 0)
> +	    "pushf; pop %[flags]\n"
>  	    : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
>  	      [thunk_target]"+S"(fop), ASM_CALL_CONSTRAINT
>  	    : "c"(ctxt->src2.val));

Do Andrew is telling me the compiler is free to mess this up... Notably:

  https://github.com/llvm/llvm-project/issues/92161

In lieu of that, I wrote the below hack. It makes objtool sad (it don't
like STT_FUNC calling STT_NOTYPE), but it should work if we ever run
into the compiler being daft like that (it should fail to compile
because of the duplicate fastop_return label, so it's not silent
failure).

Wear protective eye gear before continuing...

---
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -429,9 +429,9 @@ static inline void call_depth_return_thu
 
 #ifdef CONFIG_X86_64
 
-#define __CS_PREFIX						\
+#define __CS_PREFIX(reg)					\
 	".irp rs,r8,r9,r10,r11,r12,r13,r14,r15\n"		\
-	".ifc %V[thunk_target],\\rs\n"				\
+	".ifc " reg ",\\rs\n"					\
 	".byte 0x2e\n"						\
 	".endif\n"						\
 	".endr\n"
@@ -441,12 +441,12 @@ static inline void call_depth_return_thu
  * which is ensured when CONFIG_MITIGATION_RETPOLINE is defined.
  */
 # define CALL_NOSPEC						\
-	__CS_PREFIX						\
+	__CS_PREFIX("%V[thunk_target]")				\
 	"call __x86_indirect_thunk_%V[thunk_target]\n"
 
-# define JMP_NOSPEC						\
-	__CS_PREFIX						\
-	"jmp __x86_indirect_thunk_%V[thunk_target]\n"
+# define __JMP_NOSPEC(reg)					\
+	__CS_PREFIX(reg)					\
+	"jmp __x86_indirect_thunk_" reg "\n"
 
 # define THUNK_TARGET(addr) [thunk_target] "r" (addr)
 
@@ -478,10 +478,10 @@ static inline void call_depth_return_thu
 	"call *%[thunk_target]\n",				\
 	X86_FEATURE_RETPOLINE_LFENCE)
 
-# define JMP_NOSPEC						\
+# define __JMP_NOSPEC(reg)					\
 	ALTERNATIVE_2(						\
 	ANNOTATE_RETPOLINE_SAFE					\
-	"jmp *%[thunk_target]\n",				\
+	"jmp *%%" reg "\n",					\
 	"       jmp    901f;\n"					\
 	"       .align 16\n"					\
 	"901:	call   903f;\n"					\
@@ -490,22 +490,25 @@ static inline void call_depth_return_thu
 	"       jmp    902b;\n"					\
 	"       .align 16\n"					\
 	"903:	lea    4(%%esp), %%esp;\n"			\
-	"       pushl  %[thunk_target];\n"			\
+	"       pushl  %%" reg "\n"				\
 	"       ret;\n",					\
 	X86_FEATURE_RETPOLINE,					\
 	"lfence;\n"						\
 	ANNOTATE_RETPOLINE_SAFE					\
-	"jmp *%[thunk_target]\n",				\
+	"jmp *%%" reg "\n",					\
 	X86_FEATURE_RETPOLINE_LFENCE)
 
 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
 #endif
+
 #else /* No retpoline for C / inline asm */
 # define CALL_NOSPEC "call *%[thunk_target]\n"
-# define JMP_NOSPEC "jmp *%[thunk_target]\n"
+# define __JMP_NOSPEC(reg) "jmp *%%" reg "\n"
 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
 #endif
 
+# define JMP_NOSPEC __JMP_NOSPEC("%V[thunk_target]")
+
 /* The Spectre V2 mitigation variants */
 enum spectre_v2_mitigation {
 	SPECTRE_V2_NONE,
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5039,23 +5039,45 @@ static void fetch_possible_mmx_operand(s
 }
 
 /*
+ * Stub written in asm in order to ensure GCC doesn't duplicate the
+ * fastop_return: label.
+ *
+ * Custom calling convention.
+ *
+ * __fastop:
+ * ax = ctxt->dst.val
+ * dx = ctxt->src.val
+ * cx = ctxt->src.val2
+ * di = flags
+ * si = fop
+ */
+asm (ASM_FUNC_ALIGN
+     "__fastop: \n\t"
+     "push %" _ASM_DI "\n\t"
+     "popf \n\t"
+     UNWIND_HINT(UNWIND_HINT_TYPE_SAVE, 0, 0, 0)
+     ASM_ANNOTATE(ANNOTYPE_JUMP_TABLE)
+     __JMP_NOSPEC(_ASM_SI)
+     "fastop_return: \n\t"
+     UNWIND_HINT(UNWIND_HINT_TYPE_RESTORE, 0, 0, 0)
+     "pushf \n\t"
+     "pop %" _ASM_DI "\n\t"
+     ASM_RET
+     ".type __fastop, @notype \n\t"
+     ".size __fastop, . - __fastop \n\t");
+
+/*
  * All the FASTOP magic above relies on there being *one* instance of this
  * so it can JMP back, avoiding RET and it's various thunks.
  */
-static noinline int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop)
+static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop)
 {
 	ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
 
 	if (!(ctxt->d & ByteOp))
 		fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
 
-	asm("push %[flags]; popf \n\t"
-	    UNWIND_HINT(UNWIND_HINT_TYPE_SAVE, 0, 0, 0)
-	    ASM_ANNOTATE(ANNOTYPE_JUMP_TABLE)
-	    JMP_NOSPEC
-	    "fastop_return: \n\t"
-	    UNWIND_HINT(UNWIND_HINT_TYPE_RESTORE, 0, 0, 0)
-	    "pushf; pop %[flags]\n"
+	asm("call __fastop"
 	    : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
 	      [thunk_target]"+S"(fop), ASM_CALL_CONSTRAINT
 	    : "c"(ctxt->src2.val));

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 11/12] x86/kvm/emulate: Implement test_cc() in C
  2024-11-11 11:59 ` [PATCH v2 11/12] x86/kvm/emulate: Implement test_cc() in C Peter Zijlstra
@ 2024-11-11 17:13   ` Sean Christopherson
  0 siblings, 0 replies; 38+ messages in thread
From: Sean Christopherson @ 2024-11-11 17:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: pbonzini, jpoimboe, tglx, linux-kernel, x86, kvm, jthoughton

Can you use "KVM: x86" for the scope?  "x86/kvm" is used for guest changes, i.e.
for paravirt code when running as a KVM guest.

On Mon, Nov 11, 2024, Peter Zijlstra wrote:
> Current test_cc() uses the fastop infrastructure to test flags using
> SETcc instructions. However, int3_emulate_jcc() already fully
> implements the flags->CC mapping, use that.

I think it's worth presenting this as a revert of sorts, even though it's not a
strict revert.  KVM also emulated jcc-like operations in software prior to commit
9ae9febae950 ("KVM: x86 emulator: covert SETCC to fastop"), i.e. the fastop
madness was introduced for performance reasons, not because writing the code was
hard.

Looking at the output of the fastop versus __emulate_cc(), the worst case cost is
two extra conditional branches.  Assuming that eliminating the test_cc() fastop
code avoids having to add another special case in objtool, I am a-ok with the
tradeoff.  Especially since emulating instructions that use test_cc() is largely
limited to older hardware running guest firmware that uses (emulated) SMM, and
maybe a few bespoke use cases.  E.g. I get literally zero hits on test_cc() when
booting a 24 vCPU VM (64-bit or 32-bit kernel) with EPT and unrestricted guest
disabled, as the OVMF build I use doesn't rely on SMM.

And FWIW, a straight revert appears to generate worse code.  I see no reason to
bring it back.

With a massaged shortlog+changelog,

Acked-by: Sean Christopherson <seanjc@google.com>


fastop:
   0x0000000000042c41 <+1537>:  movzbl 0x61(%rbp),%eax
   0x0000000000042c45 <+1541>:  mov    0x10(%rbp),%rdx
   0x0000000000042c49 <+1545>:  shl    $0x4,%rax
   0x0000000000042c4d <+1549>:  and    $0xf0,%eax
   0x0000000000042c52 <+1554>:  and    $0x8d5,%edx
   0x0000000000042c58 <+1560>:  or     $0x2,%dh
   0x0000000000042c5b <+1563>:  add    $0x0,%rax
   0x0000000000042c61 <+1569>:  push   %rdx
   0x0000000000042c62 <+1570>:  popf   
   0x0000000000042c63 <+1571>:  call   0x42c68 <x86_emulate_insn+1576>
   0x0000000000042c68 <+1576>:  mov    %eax,%edx
   0x0000000000042c6a <+1578>:  xor    %eax,%eax
   0x0000000000042c6c <+1580>:  test   %dl,%dl
   0x0000000000042c6e <+1582>:  jne    0x42f24 <x86_emulate_insn+2276>

__emulate_cc:
   0x0000000000042b95 <+1541>:	movzbl 0x61(%rbp),%eax
   0x0000000000042b99 <+1545>:	mov    0x10(%rbp),%rcx
   0x0000000000042b9d <+1549>:	and    $0xf,%eax
   0x0000000000042ba0 <+1552>:	cmp    $0xb,%al
   0x0000000000042ba2 <+1554>:	ja     0x42e90 <x86_emulate_insn+2304>
        0x0000000000042e90 <+2304>:  mov    %rcx,%rdx
        0x0000000000042e93 <+2307>:  mov    %rcx,%rsi
        0x0000000000042e96 <+2310>:  shr    $0x7,%rdx
        0x0000000000042e9a <+2314>:  shr    $0xb,%rsi
        0x0000000000042e9e <+2318>:  xor    %rsi,%rdx
        0x0000000000042ea1 <+2321>:  cmp    $0xd,%al
        0x0000000000042ea3 <+2323>:  ja     0x4339a <x86_emulate_insn+3594>
                0x000000000004339a <+3594>:  and    $0x1,%edx
                0x000000000004339d <+3597>:  and    $0x40,%ecx
                0x00000000000433a0 <+3600>:  or     %rcx,%rdx
                0x00000000000433a3 <+3603>:  setne  %dl
                0x00000000000433a6 <+3606>:  jmp    0x42bba <x86_emulate_insn+1578>
        0x0000000000042ea9 <+2329>:  and    $0x1,%edx
        0x0000000000042eac <+2332>:  jmp    0x42bba <x86_emulate_insn+1578>
   0x0000000000042ba8 <+1560>:	mov    %eax,%edx
   0x0000000000042baa <+1562>:	shr    %dl
   0x0000000000042bac <+1564>:	and    $0x7,%edx
   0x0000000000042baf <+1567>:	and    0x0(,%rdx,8),%rcx
   0x0000000000042bb7 <+1575>:	setne  %dl
   0x0000000000042bba <+1578>:	test   $0x1,%al
   0x0000000000042bbc <+1580>:	jne    0x43323 <x86_emulate_insn+3475>
        0x0000000000043323 <+3475>:  test   %dl,%dl
        0x0000000000043325 <+3477>:  jne    0x4332f <x86_emulate_insn+3487>
        0x0000000000043327 <+3479>:  test   $0x1,%al
        0x0000000000043329 <+3481>:  jne    0x42bca <x86_emulate_insn+1594>
        0x000000000004332f <+3487>:  xor    %eax,%eax
        0x0000000000043331 <+3489>:  jmp    0x42be0 <x86_emulate_insn+1616>
   0x0000000000042bc2 <+1586>:	test   %dl,%dl
   0x0000000000042bc4 <+1588>:	je     0x43327 <x86_emulate_insn+3479>
        0x0000000000043327 <+3479>:  test   $0x1,%al
        0x0000000000043329 <+3481>:  jne    0x42bca <x86_emulate_insn+1594>
        0x000000000004332f <+3487>:  xor    %eax,%eax
        0x0000000000043331 <+3489>:  jmp    0x42be0 <x86_emulate_insn+1616>
   0x0000000000042bca <+1594>:	movslq 0xd0(%rbp),%rsi
   0x0000000000042bd1 <+1601>:	mov    %rbp,%rdi
   0x0000000000042bd4 <+1604>:	add    0x90(%rbp),%rsi
   0x0000000000042bdb <+1611>:	call   0x3a7c0 <assign_eip>

revert:
   0x0000000000042bc9 <+1545>:  movzbl 0x61(%rbp),%esi
   0x0000000000042bcd <+1549>:  mov    0x10(%rbp),%rdx
   0x0000000000042bd1 <+1553>:  mov    %esi,%eax
   0x0000000000042bd3 <+1555>:  shr    %eax
   0x0000000000042bd5 <+1557>:  and    $0x7,%eax
   0x0000000000042bd8 <+1560>:  cmp    $0x4,%eax
   0x0000000000042bdb <+1563>:  je     0x42ed4 <x86_emulate_insn+2324>
        0x0000000000042ed4 <+2324>:  mov    %edx,%ecx
        0x0000000000042ed6 <+2326>:  and    $0x80,%ecx
        0x0000000000042edc <+2332>:  jmp    0x42bff <x86_emulate_insn+1599>
   0x0000000000042be1 <+1569>:  ja     0x43225 <x86_emulate_insn+3173>
        0x0000000000043225 <+3173>:  cmp    $0x6,%eax
        0x0000000000043228 <+3176>:  je     0x434ec <x86_emulate_insn+3884>
                0x00000000000434ec <+3884>:  xor    %edi,%edi
                0x00000000000434ee <+3886>:  jmp    0x43238 <x86_emulate_insn+3192>
        0x000000000004322e <+3182>:  mov    %edx,%edi
        0x0000000000043230 <+3184>:  and    $0x40,%edi
        0x0000000000043233 <+3187>:  cmp    $0x7,%eax
        0x0000000000043236 <+3190>:  jne    0x4325c <x86_emulate_insn+3228>
                0x000000000004325c <+3228>:  mov    %edx,%ecx
                0x000000000004325e <+3230>:  and    $0x4,%ecx
                0x0000000000043261 <+3233>:  cmp    $0x5,%eax
                0x0000000000043264 <+3236>:  je     0x42bff <x86_emulate_insn+1599>
                0x000000000004326a <+3242>:  jmp    0x43218 <x86_emulate_insn+3160>
        0x0000000000043238 <+3192>:  mov    %rdx,%rcx
        0x000000000004323b <+3195>:  shr    $0xb,%rdx
        0x000000000004323f <+3199>:  shr    $0x7,%rcx
        0x0000000000043243 <+3203>:  xor    %edx,%ecx
        0x0000000000043245 <+3205>:  and    $0x1,%ecx
        0x0000000000043248 <+3208>:  or     %edi,%ecx
        0x000000000004324a <+3210>:  jmp    0x42bff <x86_emulate_insn+1599>
   0x0000000000042be7 <+1575>:  mov    %edx,%ecx
   0x0000000000042be9 <+1577>:  and    $0x40,%ecx
   0x0000000000042bec <+1580>:  cmp    $0x2,%eax
   0x0000000000042bef <+1583>:  je     0x42bff <x86_emulate_insn+1599>
   0x0000000000042bf1 <+1585>:  mov    %edx,%ecx
   0x0000000000042bf3 <+1587>:  and    $0x41,%ecx
   0x0000000000042bf6 <+1590>:  cmp    $0x3,%eax
   0x0000000000042bf9 <+1593>:  jne    0x4320a <x86_emulate_insn+3146>
        0x000000000004320a <+3146>:  mov    %edx,%ecx
        0x000000000004320c <+3148>:  and    $0x1,%ecx
        0x000000000004320f <+3151>:  cmp    $0x1,%eax
        0x0000000000043212 <+3154>:  je     0x42bff <x86_emulate_insn+1599>
        0x0000000000043218 <+3160>:  mov    %edx,%ecx
        0x000000000004321a <+3162>:  and    $0x800,%ecx
        0x0000000000043220 <+3168>:  jmp    0x42bff <x86_emulate_insn+1599>
   0x0000000000042bff <+1599>:  test   %ecx,%ecx
   0x0000000000042c01 <+1601>:  setne  %dl
   0x0000000000042c04 <+1604>:  and    $0x1,%esi
   0x0000000000042c07 <+1607>:  xor    %eax,%eax
   0x0000000000042c09 <+1609>:  cmp    %sil,%dl
   0x0000000000042c0c <+1612>:  je     0x42c24 <x86_emulate_insn+1636>
   0x0000000000042c0e <+1614>:  movslq 0xd0(%rbp),%rsi
   0x0000000000042c15 <+1621>:  mov    %rbp,%rdi
   0x0000000000042c18 <+1624>:  add    0x90(%rbp),%rsi
   0x0000000000042c1f <+1631>:  call   0x3a7c0 <assign_eip>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops
  2024-11-11 11:59 ` [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops Peter Zijlstra
  2024-11-11 16:27   ` Peter Zijlstra
@ 2024-11-11 17:26   ` Sean Christopherson
  2024-11-11 18:28     ` Peter Zijlstra
  2024-11-15 18:41   ` Josh Poimboeuf
  2 siblings, 1 reply; 38+ messages in thread
From: Sean Christopherson @ 2024-11-11 17:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: pbonzini, jpoimboe, tglx, linux-kernel, x86, kvm, jthoughton

KVM: x86:

On Mon, Nov 11, 2024, Peter Zijlstra wrote:
> Since there is only a single fastop() function, convert the FASTOP
> stuff from CALL_NOSPEC+RET to JMP_NOSPEC+JMP, avoiding the return
> thunks and all that jazz.
> 
> Specifically FASTOPs rely on the return thunk to preserve EFLAGS,
> which not all of them can trivially do (call depth tracing suffers
> here).

Maybe add an example?  Mostly as a reminder of how to reproduce the call depth
issues.

  E.g. booting with "retbleed=force,stuff spectre_v2=retpoline,generic" causes
  KVM-Unit-Test's "emulator" test to fail due to flags being clobbered.

> Objtool strenuously complains about this:
> 
>  - indirect call without a .rodata, fails to determine JUMP_TABLE,
>    annotate
>  - fastop functions fall through, exception
>  - unreachable instruction after fastop_return, save/restore
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

The original patch works, but with the fixup KVM fails emulation of an ADC and
generates:

  ------------[ cut here ]------------
  Unpatched return thunk in use. This should not happen!
  WARNING: CPU: 4 PID: 1452 at arch/x86/kernel/cpu/bugs.c:3063 __warn_thunk+0x26/0x30
  Modules linked in: vhost_net vhost vhost_iotlb tap kvm_intel kvm
  CPU: 4 UID: 1000 PID: 1452 Comm: qemu Not tainted 6.12.0-rc5-22582d7d68a6-rev/fastops-miti #11
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:__warn_thunk+0x26/0x30
  Code: 5e ff 7e 05 0f 1f 44 00 00 80 3d 5d 06 5c 01 00 74 05 e9 bd 7d a0 00 48 c7 c7 10 4a 13 82 c6 05 48 06 5c 01 01 e8 31 48 04 00 <0f> 0b e9 a3 7d a0 00 cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90
  RSP: 0018:ffffc90001743c78 EFLAGS: 00010287
  RAX: 0000000000000000 RBX: ffff88811877a040 RCX: 0000000000000027
  RDX: ffff88846f91b208 RSI: 0000000000000001 RDI: ffff88846f91b200
  RBP: ffffc90001743cc8 R08: ffffffff825195c8 R09: 0000000000000003
  R10: ffffffff824395e0 R11: ffffffff824e95e0 R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  FS:  00007f3027400700(0000) GS:ffff88846f900000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f3029ba3001 CR3: 000000010cdc0000 CR4: 0000000000352eb0
  Call Trace:
   <TASK>
   ? __warn+0x85/0x120
   ? __warn_thunk+0x26/0x30
   ? report_bug+0x17d/0x190
   ? handle_bug+0x53/0x90
   ? exc_invalid_op+0x14/0x70
   ? asm_exc_invalid_op+0x16/0x20
   ? __warn_thunk+0x26/0x30
   ? __warn_thunk+0x26/0x30
   warn_thunk_thunk+0x16/0x30
   ? __kvm_mmu_topup_memory_cache+0x57/0x150 [kvm]
   init_emulate_ctxt+0xae/0x110 [kvm]
   x86_decode_emulated_instruction+0x25/0x80 [kvm]
   x86_emulate_instruction+0x2cb/0x6f0 [kvm]
   vmx_handle_exit+0x394/0x6e0 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0xf40/0x1db0 [kvm]
   kvm_vcpu_ioctl+0x2e9/0x870 [kvm]
   ? futex_wake+0x81/0x180
   ? call_depth_return_thunk+0x6c/0x90
   ? call_depth_return_thunk+0x66/0x90
   ? call_depth_return_thunk+0x60/0x90
   ? call_depth_return_thunk+0x5a/0x90
   __x64_sys_ioctl+0x8a/0xc0
   do_syscall_64+0x5b/0x170
   entry_SYSCALL_64_after_hwframe+0x71/0x79
  RIP: 0033:0x7f30290cedeb
  Code: 0f 92 c0 84 c0 75 b0 49 8d 3c 1c e8 ff 47 04 00 85 c0 78 b1 48 83 c4 08 4c 89 e0 5b 41 5c c3 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a0 ff ff ff f7 d8 64 89 01 48
  RSP: 002b:00007f30273ff748 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f30290cedeb
  RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000e
  RBP: 0000555587e2f1e0 R08: 00007f302923fc10 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  R13: 00007f3029b90780 R14: 00007ffea5ab9640 R15: 00007f30273ffa00
   </TASK>
  ---[ end trace 0000000000000000 ]---

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs
  2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
                   ` (11 preceding siblings ...)
  2024-11-11 11:59 ` [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops Peter Zijlstra
@ 2024-11-11 17:27 ` Sean Christopherson
  12 siblings, 0 replies; 38+ messages in thread
From: Sean Christopherson @ 2024-11-11 17:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: pbonzini, jpoimboe, tglx, linux-kernel, x86, kvm, jthoughton

On Mon, Nov 11, 2024, Peter Zijlstra wrote:
> Hi!
> 
> At long last, a respin of these patches.
> 
> The FASTOPs are special because they rely on RET to preserve CFLAGS, which is a
> problem with all the mitigation stuff. Also see things like: ba5ca5e5e6a1
> ("x86/retpoline: Don't clobber RFLAGS during srso_safe_ret()").
> 
> Rework FASTOPs to no longer use RET and side-step the problem of trying to make
> the various return thunks preserve CFLAGS for just this one case.
> 
> There are two separate instances, test_cc() and fastop(). The first is
> basically a SETCC wrapper, which seems like a very complicated (and somewhat
> expensive) way to read FLAGS. Instead use the code we already have to emulate
> JCC to fully emulate the instruction.
> 
> That then leaves fastop(), which when marked noinline is guaranteed to exist
> only once. As such, CALL+RET isn't needed, because we'll always be RETurning to
> the same location, as such replace with JMP+JMP.
> 
> My plan is to take the objtool patches through tip/objtool/core, the nospec
> patches through tip/x86/core and either stick the fastop patches in that latter
> tree if the KVM folks agree, or they can merge the aforementioned two branches
> and then stick the patches on top, whatever works for people.

Unless Paolo objects, I think it makes sense to take the fastop patches through
tip/x86/core.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops
  2024-11-11 17:26   ` Sean Christopherson
@ 2024-11-11 18:28     ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-11 18:28 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: pbonzini, jpoimboe, tglx, linux-kernel, x86, kvm, jthoughton

On Mon, Nov 11, 2024 at 09:26:44AM -0800, Sean Christopherson wrote:
> KVM: x86:
> 
> On Mon, Nov 11, 2024, Peter Zijlstra wrote:
> > Since there is only a single fastop() function, convert the FASTOP
> > stuff from CALL_NOSPEC+RET to JMP_NOSPEC+JMP, avoiding the return
> > thunks and all that jazz.
> > 
> > Specifically FASTOPs rely on the return thunk to preserve EFLAGS,
> > which not all of them can trivially do (call depth tracing suffers
> > here).
> 
> Maybe add an example?  Mostly as a reminder of how to reproduce the call depth
> issues.
> 
>   E.g. booting with "retbleed=force,stuff spectre_v2=retpoline,generic" causes
>   KVM-Unit-Test's "emulator" test to fail due to flags being clobbered.
> 
> > Objtool strenuously complains about this:
> > 
> >  - indirect call without a .rodata, fails to determine JUMP_TABLE,
> >    annotate
> >  - fastop functions fall through, exception
> >  - unreachable instruction after fastop_return, save/restore
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> 
> The original patch works, but with the fixup KVM fails emulation of an ADC and
> generates:

Bah, I'll go chase it down. Thanks!

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/12] objtool: Generic annotation infrastructure
  2024-11-11 11:59 ` [PATCH v2 01/12] objtool: Generic annotation infrastructure Peter Zijlstra
@ 2024-11-15 18:38   ` Josh Poimboeuf
  2024-11-16  9:33     ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-15 18:38 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Mon, Nov 11, 2024 at 12:59:36PM +0100, Peter Zijlstra wrote:
> +#define ASM_ANNOTATE(x)						\
> +	"911:\n\t"						\
> +	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> +	".long 911b - .\n\t"					\
> +	".long " __stringify(x) "\n\t"				\
> +	".popsection\n\t"

Why mergeable and progbits?

> +static int read_annotate(struct objtool_file *file, void (*func)(int type, struct instruction *insn))
> +{
> +	struct section *rsec, *sec;
> +	struct instruction *insn;
> +	struct reloc *reloc;
> +	int type;
> +
> +	rsec = find_section_by_name(file->elf, ".rela.discard.annotate");
> +	if (!rsec)
> +		return 0;
> +
> +	sec = find_section_by_name(file->elf, ".discard.annotate");
> +	if (!sec)
> +		return 0;

Instead of looking for .rela.discard.annotate you can just get it from
sec->rsec.


> +
> +	if (sec->sh.sh_entsize != 8) {
> +		static bool warn = false;

"warned" ?

> +		if (!warn) {
> +			WARN("%s: dodgy linker, sh_entsize != 8", sec->name);
> +			warn = true;
> +		}

Any reason not to make this a fatal error?

> +		sec->sh.sh_entsize = 8;
> +	}
> +
> +	for_each_reloc(rsec, reloc) {
> +		insn = find_insn(file, reloc->sym->sec,
> +				 reloc->sym->offset + reloc_addend(reloc));
> +		if (!insn) {
> +			WARN("bad .discard.annotate entry: %d", reloc_idx(reloc));
> +			return -1;
> +		}

Would be nice to print the type here as well.

> @@ -2670,6 +2714,8 @@ static int decode_sections(struct objtoo
>  	if (ret)
>  		return ret;
>  
> +	ret = read_annotate(file, __annotate_nop);
> +

'ret' is ignored here (not that it matters much as this goes away in the
next patch)

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 03/12] objtool: Convert ANNOTATE_RETPOLINE_SAFE to ANNOTATE
  2024-11-11 11:59 ` [PATCH v2 03/12] objtool: Convert ANNOTATE_RETPOLINE_SAFE " Peter Zijlstra
@ 2024-11-15 18:39   ` Josh Poimboeuf
  2024-11-16  9:34     ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-15 18:39 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Mon, Nov 11, 2024 at 12:59:38PM +0100, Peter Zijlstra wrote:
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/include/asm/nospec-branch.h |   13 +-------
>  include/linux/objtool_types.h        |    1 
>  tools/include/linux/objtool_types.h  |    1 
>  tools/objtool/check.c                |   52 ++++++++++++-----------------------
>  4 files changed, 22 insertions(+), 45 deletions(-)
> 
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -193,12 +193,7 @@
>   * objtool the subsequent indirect jump/call is vouched safe for retpoline
>   * builds.
>   */
> -.macro ANNOTATE_RETPOLINE_SAFE
> -.Lhere_\@:
> -	.pushsection .discard.retpoline_safe
> -	.long .Lhere_\@
> -	.popsection
> -.endm
> +#define ANNOTATE_RETPOLINE_SAFE	ANNOTATE type=ANNOTYPE_RETPOLINE_SAFE

I'm thinking it would be nice to put all the ANNOTATE_* definitions
in objtool.h so we can have all the annotations and their descriptions
in one place.

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 04/12] objtool: Convert instrumentation_{begin,end}() to ANNOTATE
  2024-11-11 11:59 ` [PATCH v2 04/12] objtool: Convert instrumentation_{begin,end}() " Peter Zijlstra
@ 2024-11-15 18:40   ` Josh Poimboeuf
  2024-11-16  9:36     ` Peter Zijlstra
  2024-11-16 10:06     ` Peter Zijlstra
  0 siblings, 2 replies; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-15 18:40 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Mon, Nov 11, 2024 at 12:59:39PM +0100, Peter Zijlstra wrote:
> +++ b/include/linux/objtool.h
> @@ -51,13 +51,16 @@
>  	".long 998b\n\t"						\
>  	".popsection\n\t"
>  
> -#define ASM_ANNOTATE(x)						\
> -	"911:\n\t"						\
> +#define __ASM_ANNOTATE(s, x)					\
>  	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> -	".long 911b - .\n\t"					\
> +	".long " __stringify(s) "b - .\n\t"			\

It would probably be better for __ASM_ANNOTATE's callers to pass in the
full label name (e.g. '911b') since they know where the label is?  It
could even be a named label.

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 07/12] objtool: Convert ANNOTATE_INTRA_FUNCTION_CALLS to ANNOTATE
  2024-11-11 11:59 ` [PATCH v2 07/12] objtool: Convert ANNOTATE_INTRA_FUNCTION_CALLS " Peter Zijlstra
@ 2024-11-15 18:40   ` Josh Poimboeuf
  2024-11-16  9:37     ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-15 18:40 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Mon, Nov 11, 2024 at 12:59:42PM +0100, Peter Zijlstra wrote:
> +++ b/include/linux/objtool_types.h
> @@ -63,5 +63,6 @@ struct unwind_hint {
>  #define ANNOTYPE_INSTR_END		4
>  #define ANNOTYPE_UNRET_BEGIN		5
>  #define ANNOTYPE_IGNORE_ALTS		6
> +#define ANNOTYPE_INTRA_FUNCTION_CALLS	7

Should be a singular "CALL"?

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 10/12] x86,nospec: Simplify {JMP,CALL}_NOSPEC (part 2)
  2024-11-11 11:59 ` [PATCH v2 10/12] x86,nospec: Simplify {JMP,CALL}_NOSPEC (part 2) Peter Zijlstra
@ 2024-11-15 18:40   ` Josh Poimboeuf
  2024-11-16  9:39     ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-15 18:40 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Mon, Nov 11, 2024 at 12:59:45PM +0100, Peter Zijlstra wrote:
> Counterpart to 09d09531a51a ("x86,nospec: Simplify
> {JMP,CALL}_NOSPEC"), x86_64 will rewrite all this anyway, see
> apply_retpoline.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/include/asm/nospec-branch.h |   29 +++++++++++------------------
>  1 file changed, 11 insertions(+), 18 deletions(-)
> 
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -429,31 +429,24 @@ static inline void call_depth_return_thu
>  
>  #ifdef CONFIG_X86_64
>  
> +#define __CS_PREFIX						\
> +	".irp rs,r8,r9,r10,r11,r12,r13,r14,r15\n"		\
> +	".ifc %V[thunk_target],\\rs\n"				\
> +	".byte 0x2e\n"						\
> +	".endif\n"						\
> +	".endr\n"

After staring at this for some minutes I'm thinking it would be helpful
to add a comment saying this is equivalent to
-mindirect-branch-cs-prefix.

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops
  2024-11-11 11:59 ` [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops Peter Zijlstra
  2024-11-11 16:27   ` Peter Zijlstra
  2024-11-11 17:26   ` Sean Christopherson
@ 2024-11-15 18:41   ` Josh Poimboeuf
  2024-11-16  9:39     ` Peter Zijlstra
  2 siblings, 1 reply; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-15 18:41 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Mon, Nov 11, 2024 at 12:59:47PM +0100, Peter Zijlstra wrote:
> Since there is only a single fastop() function, convert the FASTOP
> stuff from CALL_NOSPEC+RET to JMP_NOSPEC+JMP, avoiding the return
> thunks and all that jazz.
> 
> Specifically FASTOPs rely on the return thunk to preserve EFLAGS,
> which not all of them can trivially do (call depth tracing suffers
> here).
> 
> Objtool strenuously complains about this:
> 
>  - indirect call without a .rodata, fails to determine JUMP_TABLE,
>    annotate
>  - fastop functions fall through, exception
>  - unreachable instruction after fastop_return, save/restore

This wording makes it sound like this patch triggers objtool warnings,
which I'm guessing is not true?

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/12] objtool: Generic annotation infrastructure
  2024-11-15 18:38   ` Josh Poimboeuf
@ 2024-11-16  9:33     ` Peter Zijlstra
  2024-11-20  0:31       ` Josh Poimboeuf
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-16  9:33 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Fri, Nov 15, 2024 at 10:38:28AM -0800, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 12:59:36PM +0100, Peter Zijlstra wrote:
> > +#define ASM_ANNOTATE(x)						\
> > +	"911:\n\t"						\
> > +	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> > +	".long 911b - .\n\t"					\
> > +	".long " __stringify(x) "\n\t"				\
> > +	".popsection\n\t"
> 
> Why mergeable and progbits?

In order to get sh_entsize ?

> > +static int read_annotate(struct objtool_file *file, void (*func)(int type, struct instruction *insn))
> > +{
> > +	struct section *rsec, *sec;
> > +	struct instruction *insn;
> > +	struct reloc *reloc;
> > +	int type;
> > +
> > +	rsec = find_section_by_name(file->elf, ".rela.discard.annotate");
> > +	if (!rsec)
> > +		return 0;
> > +
> > +	sec = find_section_by_name(file->elf, ".discard.annotate");
> > +	if (!sec)
> > +		return 0;
> 
> Instead of looking for .rela.discard.annotate you can just get it from
> sec->rsec.

Oh, indeed.

> > +
> > +	if (sec->sh.sh_entsize != 8) {
> > +		static bool warn = false;
> 
> "warned" ?

Sure.

> > +		if (!warn) {
> > +			WARN("%s: dodgy linker, sh_entsize != 8", sec->name);
> > +			warn = true;
> > +		}
> 
> Any reason not to make this a fatal error?

lld is currently suffering from this, it would get us build failures on
llvm builds. Once that's fixed, then yes, this should become fatal.

  https://github.com/ClangBuiltLinux/linux/issues/2057

> > +		sec->sh.sh_entsize = 8;
> > +	}
> > +
> > +	for_each_reloc(rsec, reloc) {
> > +		insn = find_insn(file, reloc->sym->sec,
> > +				 reloc->sym->offset + reloc_addend(reloc));
> > +		if (!insn) {
> > +			WARN("bad .discard.annotate entry: %d", reloc_idx(reloc));
> > +			return -1;
> > +		}
> 
> Would be nice to print the type here as well.

Sure.

> > @@ -2670,6 +2714,8 @@ static int decode_sections(struct objtoo
> >  	if (ret)
> >  		return ret;
> >  
> > +	ret = read_annotate(file, __annotate_nop);
> > +
> 
> 'ret' is ignored here (not that it matters much as this goes away in the
> next patch)

Right..

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 03/12] objtool: Convert ANNOTATE_RETPOLINE_SAFE to ANNOTATE
  2024-11-15 18:39   ` Josh Poimboeuf
@ 2024-11-16  9:34     ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-16  9:34 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Fri, Nov 15, 2024 at 10:39:36AM -0800, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 12:59:38PM +0100, Peter Zijlstra wrote:
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  arch/x86/include/asm/nospec-branch.h |   13 +-------
> >  include/linux/objtool_types.h        |    1 
> >  tools/include/linux/objtool_types.h  |    1 
> >  tools/objtool/check.c                |   52 ++++++++++++-----------------------
> >  4 files changed, 22 insertions(+), 45 deletions(-)
> > 
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -193,12 +193,7 @@
> >   * objtool the subsequent indirect jump/call is vouched safe for retpoline
> >   * builds.
> >   */
> > -.macro ANNOTATE_RETPOLINE_SAFE
> > -.Lhere_\@:
> > -	.pushsection .discard.retpoline_safe
> > -	.long .Lhere_\@
> > -	.popsection
> > -.endm
> > +#define ANNOTATE_RETPOLINE_SAFE	ANNOTATE type=ANNOTYPE_RETPOLINE_SAFE
> 
> I'm thinking it would be nice to put all the ANNOTATE_* definitions
> in objtool.h so we can have all the annotations and their descriptions
> in one place.

Probably, but that's going to be somewhat of a pain. Let me do that at
the end and throw it at an allyesconfig or something.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 04/12] objtool: Convert instrumentation_{begin,end}() to ANNOTATE
  2024-11-15 18:40   ` Josh Poimboeuf
@ 2024-11-16  9:36     ` Peter Zijlstra
  2024-11-16  9:51       ` Peter Zijlstra
  2024-11-16 10:06     ` Peter Zijlstra
  1 sibling, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-16  9:36 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Fri, Nov 15, 2024 at 10:40:08AM -0800, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 12:59:39PM +0100, Peter Zijlstra wrote:
> > +++ b/include/linux/objtool.h
> > @@ -51,13 +51,16 @@
> >  	".long 998b\n\t"						\
> >  	".popsection\n\t"
> >  
> > -#define ASM_ANNOTATE(x)						\
> > -	"911:\n\t"						\
> > +#define __ASM_ANNOTATE(s, x)					\
> >  	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> > -	".long 911b - .\n\t"					\
> > +	".long " __stringify(s) "b - .\n\t"			\
> 
> It would probably be better for __ASM_ANNOTATE's callers to pass in the
> full label name (e.g. '911b') since they know where the label is?  It
> could even be a named label.

I have this somewhere later, changing it here would be a pain because
the existing annotations dont do it like that.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 07/12] objtool: Convert ANNOTATE_INTRA_FUNCTION_CALLS to ANNOTATE
  2024-11-15 18:40   ` Josh Poimboeuf
@ 2024-11-16  9:37     ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-16  9:37 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Fri, Nov 15, 2024 at 10:40:27AM -0800, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 12:59:42PM +0100, Peter Zijlstra wrote:
> > +++ b/include/linux/objtool_types.h
> > @@ -63,5 +63,6 @@ struct unwind_hint {
> >  #define ANNOTYPE_INSTR_END		4
> >  #define ANNOTYPE_UNRET_BEGIN		5
> >  #define ANNOTYPE_IGNORE_ALTS		6
> > +#define ANNOTYPE_INTRA_FUNCTION_CALLS	7
> 
> Should be a singular "CALL"?

Sure, one regex on the patch sorted that :-)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 10/12] x86,nospec: Simplify {JMP,CALL}_NOSPEC (part 2)
  2024-11-15 18:40   ` Josh Poimboeuf
@ 2024-11-16  9:39     ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-16  9:39 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Fri, Nov 15, 2024 at 10:40:56AM -0800, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 12:59:45PM +0100, Peter Zijlstra wrote:
> > Counterpart to 09d09531a51a ("x86,nospec: Simplify
> > {JMP,CALL}_NOSPEC"), x86_64 will rewrite all this anyway, see
> > apply_retpoline.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  arch/x86/include/asm/nospec-branch.h |   29 +++++++++++------------------
> >  1 file changed, 11 insertions(+), 18 deletions(-)
> > 
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -429,31 +429,24 @@ static inline void call_depth_return_thu
> >  
> >  #ifdef CONFIG_X86_64
> >  
> > +#define __CS_PREFIX						\
> > +	".irp rs,r8,r9,r10,r11,r12,r13,r14,r15\n"		\
> > +	".ifc %V[thunk_target],\\rs\n"				\
> > +	".byte 0x2e\n"						\
> > +	".endif\n"						\
> > +	".endr\n"
> 
> After staring at this for some minutes I'm thinking it would be helpful
> to add a comment saying this is equivalent to
> -mindirect-branch-cs-prefix.

I'll just copy-paste the comment from the other __CS_PREFIX elsewhere in
this file :-)


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops
  2024-11-15 18:41   ` Josh Poimboeuf
@ 2024-11-16  9:39     ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-16  9:39 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Fri, Nov 15, 2024 at 10:41:04AM -0800, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 12:59:47PM +0100, Peter Zijlstra wrote:
> > Since there is only a single fastop() function, convert the FASTOP
> > stuff from CALL_NOSPEC+RET to JMP_NOSPEC+JMP, avoiding the return
> > thunks and all that jazz.
> > 
> > Specifically FASTOPs rely on the return thunk to preserve EFLAGS,
> > which not all of them can trivially do (call depth tracing suffers
> > here).
> > 
> > Objtool strenuously complains about this:
> > 
> >  - indirect call without a .rodata, fails to determine JUMP_TABLE,
> >    annotate
> >  - fastop functions fall through, exception
> >  - unreachable instruction after fastop_return, save/restore
> 
> This wording makes it sound like this patch triggers objtool warnings,
> which I'm guessing is not true?

Right, no, it did without the fixups. This was a (poorly worder) attempt
at explaining the reasons for the various annotations in the patch.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 04/12] objtool: Convert instrumentation_{begin,end}() to ANNOTATE
  2024-11-16  9:36     ` Peter Zijlstra
@ 2024-11-16  9:51       ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-16  9:51 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Sat, Nov 16, 2024 at 10:36:26AM +0100, Peter Zijlstra wrote:
> On Fri, Nov 15, 2024 at 10:40:08AM -0800, Josh Poimboeuf wrote:
> > On Mon, Nov 11, 2024 at 12:59:39PM +0100, Peter Zijlstra wrote:
> > > +++ b/include/linux/objtool.h
> > > @@ -51,13 +51,16 @@
> > >  	".long 998b\n\t"						\
> > >  	".popsection\n\t"
> > >  
> > > -#define ASM_ANNOTATE(x)						\
> > > -	"911:\n\t"						\
> > > +#define __ASM_ANNOTATE(s, x)					\
> > >  	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> > > -	".long 911b - .\n\t"					\
> > > +	".long " __stringify(s) "b - .\n\t"			\
> > 
> > It would probably be better for __ASM_ANNOTATE's callers to pass in the
> > full label name (e.g. '911b') since they know where the label is?  It
> > could even be a named label.
> 
> I have this somewhere later, changing it here would be a pain because
> the existing annotations dont do it like that.

Reading is hard, yes let me do what you said.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 04/12] objtool: Convert instrumentation_{begin,end}() to ANNOTATE
  2024-11-15 18:40   ` Josh Poimboeuf
  2024-11-16  9:36     ` Peter Zijlstra
@ 2024-11-16 10:06     ` Peter Zijlstra
  1 sibling, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-16 10:06 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Fri, Nov 15, 2024 at 10:40:08AM -0800, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 12:59:39PM +0100, Peter Zijlstra wrote:
> > +++ b/include/linux/objtool.h
> > @@ -51,13 +51,16 @@
> >  	".long 998b\n\t"						\
> >  	".popsection\n\t"
> >  
> > -#define ASM_ANNOTATE(x)						\
> > -	"911:\n\t"						\
> > +#define __ASM_ANNOTATE(s, x)					\
> >  	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> > -	".long 911b - .\n\t"					\
> > +	".long " __stringify(s) "b - .\n\t"			\
> 
> It would probably be better for __ASM_ANNOTATE's callers to pass in the
> full label name (e.g. '911b') since they know where the label is?  It
> could even be a named label.

This seems to work.

--- a/include/linux/instrumentation.h
+++ b/include/linux/instrumentation.h
@@ -6,11 +6,12 @@
 
 #include <linux/objtool.h>
 #include <linux/stringify.h>
+#include <linux/args.h>
 
 /* Begin/end of an instrumentation safe region */
 #define __instrumentation_begin(c) ({					\
 	asm volatile(__stringify(c) ": nop\n\t"				\
-		     __ASM_ANNOTATE(c, ANNOTYPE_INSTR_BEGIN)		\
+		     __ASM_ANNOTATE(CONCATENATE(c, b), ANNOTYPE_INSTR_BEGIN)	\
 		     : : "i" (c));					\
 })
 #define instrumentation_begin() __instrumentation_begin(__COUNTER__)
@@ -48,7 +49,7 @@
  */
 #define __instrumentation_end(c) ({					\
 	asm volatile(__stringify(c) ": nop\n\t"				\
-		     __ASM_ANNOTATE(c, ANNOTYPE_INSTR_END)		\
+		     __ASM_ANNOTATE(CONCATENATE(c, b), ANNOTYPE_INSTR_END)		\
 		     : : "i" (c));					\
 })
 #define instrumentation_end() __instrumentation_end(__COUNTER__)
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -53,13 +53,13 @@
 
 #define __ASM_ANNOTATE(s, x)					\
 	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
-	".long " __stringify(s) "b - .\n\t"			\
+	".long " __stringify(s) " - .\n\t"			\
 	".long " __stringify(x) "\n\t"				\
 	".popsection\n\t"
 
 #define ASM_ANNOTATE(x)						\
 	"911:\n\t"						\
-	__ASM_ANNOTATE(911, x)
+	__ASM_ANNOTATE(911b, x)
 
 #define ANNOTATE_NOENDBR	ASM_ANNOTATE(ANNOTYPE_NOENDBR)
 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/12] objtool: Generic annotation infrastructure
  2024-11-16  9:33     ` Peter Zijlstra
@ 2024-11-20  0:31       ` Josh Poimboeuf
  2024-11-20  1:04         ` Josh Poimboeuf
  0 siblings, 1 reply; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-20  0:31 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Sat, Nov 16, 2024 at 10:33:31AM +0100, Peter Zijlstra wrote:
> On Fri, Nov 15, 2024 at 10:38:28AM -0800, Josh Poimboeuf wrote:
> > On Mon, Nov 11, 2024 at 12:59:36PM +0100, Peter Zijlstra wrote:
> > > +#define ASM_ANNOTATE(x)						\
> > > +	"911:\n\t"						\
> > > +	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> > > +	".long 911b - .\n\t"					\
> > > +	".long " __stringify(x) "\n\t"				\
> > > +	".popsection\n\t"
> > 
> > Why mergeable and progbits?
> 
> In order to get sh_entsize ?

Is that a guess?  If so, it's not very convincing as I don't see what
entsize would have to do with it.

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/12] objtool: Generic annotation infrastructure
  2024-11-20  0:31       ` Josh Poimboeuf
@ 2024-11-20  1:04         ` Josh Poimboeuf
  2024-11-20  8:52           ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-20  1:04 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Tue, Nov 19, 2024 at 04:31:25PM -0800, Josh Poimboeuf wrote:
> On Sat, Nov 16, 2024 at 10:33:31AM +0100, Peter Zijlstra wrote:
> > On Fri, Nov 15, 2024 at 10:38:28AM -0800, Josh Poimboeuf wrote:
> > > On Mon, Nov 11, 2024 at 12:59:36PM +0100, Peter Zijlstra wrote:
> > > > +#define ASM_ANNOTATE(x)						\
> > > > +	"911:\n\t"						\
> > > > +	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> > > > +	".long 911b - .\n\t"					\
> > > > +	".long " __stringify(x) "\n\t"				\
> > > > +	".popsection\n\t"
> > > 
> > > Why mergeable and progbits?
> > 
> > In order to get sh_entsize ?
> 
> Is that a guess?  If so, it's not very convincing as I don't see what
> entsize would have to do with it.

Oh, nevermind... I see it's a gas syntax issue.

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/12] objtool: Generic annotation infrastructure
  2024-11-20  1:04         ` Josh Poimboeuf
@ 2024-11-20  8:52           ` Peter Zijlstra
  2024-11-20 16:03             ` Josh Poimboeuf
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-20  8:52 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Tue, Nov 19, 2024 at 05:04:24PM -0800, Josh Poimboeuf wrote:
> On Tue, Nov 19, 2024 at 04:31:25PM -0800, Josh Poimboeuf wrote:
> > On Sat, Nov 16, 2024 at 10:33:31AM +0100, Peter Zijlstra wrote:
> > > On Fri, Nov 15, 2024 at 10:38:28AM -0800, Josh Poimboeuf wrote:
> > > > On Mon, Nov 11, 2024 at 12:59:36PM +0100, Peter Zijlstra wrote:
> > > > > +#define ASM_ANNOTATE(x)						\
> > > > > +	"911:\n\t"						\
> > > > > +	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> > > > > +	".long 911b - .\n\t"					\
> > > > > +	".long " __stringify(x) "\n\t"				\
> > > > > +	".popsection\n\t"
> > > > 
> > > > Why mergeable and progbits?
> > > 
> > > In order to get sh_entsize ?
> > 
> > Is that a guess?  If so, it's not very convincing as I don't see what
> > entsize would have to do with it.
> 
> Oh, nevermind... I see it's a gas syntax issue.

Not a guess, only mergable gets entsize, and progbits is a required
argument per the syntax in order to specify entsize.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/12] objtool: Generic annotation infrastructure
  2024-11-20  8:52           ` Peter Zijlstra
@ 2024-11-20 16:03             ` Josh Poimboeuf
  2024-11-20 16:03               ` Josh Poimboeuf
  0 siblings, 1 reply; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-20 16:03 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Wed, Nov 20, 2024 at 09:52:54AM +0100, Peter Zijlstra wrote:
> On Tue, Nov 19, 2024 at 05:04:24PM -0800, Josh Poimboeuf wrote:
> > On Tue, Nov 19, 2024 at 04:31:25PM -0800, Josh Poimboeuf wrote:
> > > On Sat, Nov 16, 2024 at 10:33:31AM +0100, Peter Zijlstra wrote:
> > > > On Fri, Nov 15, 2024 at 10:38:28AM -0800, Josh Poimboeuf wrote:
> > > > > On Mon, Nov 11, 2024 at 12:59:36PM +0100, Peter Zijlstra wrote:
> > > > > > +#define ASM_ANNOTATE(x)						\
> > > > > > +	"911:\n\t"						\
> > > > > > +	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> > > > > > +	".long 911b - .\n\t"					\
> > > > > > +	".long " __stringify(x) "\n\t"				\
> > > > > > +	".popsection\n\t"
> > > > > 
> > > > > Why mergeable and progbits?
> > > > 
> > > > In order to get sh_entsize ?
> > > 
> > > Is that a guess?  If so, it's not very convincing as I don't see what
> > > entsize would have to do with it.
> > 
> > Oh, nevermind... I see it's a gas syntax issue.
> 
> Not a guess, only mergable gets entsize, and progbits is a required
> argument per the syntax in order to specify entsize.

If you look at "readelf -WS vmlinux" there are plenty of non-mergeable
sections with entsize.

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/12] objtool: Generic annotation infrastructure
  2024-11-20 16:03             ` Josh Poimboeuf
@ 2024-11-20 16:03               ` Josh Poimboeuf
  2024-11-21 11:46                 ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Josh Poimboeuf @ 2024-11-20 16:03 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Wed, Nov 20, 2024 at 08:03:10AM -0800, Josh Poimboeuf wrote:
> On Wed, Nov 20, 2024 at 09:52:54AM +0100, Peter Zijlstra wrote:
> > On Tue, Nov 19, 2024 at 05:04:24PM -0800, Josh Poimboeuf wrote:
> > > On Tue, Nov 19, 2024 at 04:31:25PM -0800, Josh Poimboeuf wrote:
> > > > On Sat, Nov 16, 2024 at 10:33:31AM +0100, Peter Zijlstra wrote:
> > > > > On Fri, Nov 15, 2024 at 10:38:28AM -0800, Josh Poimboeuf wrote:
> > > > > > On Mon, Nov 11, 2024 at 12:59:36PM +0100, Peter Zijlstra wrote:
> > > > > > > +#define ASM_ANNOTATE(x)						\
> > > > > > > +	"911:\n\t"						\
> > > > > > > +	".pushsection .discard.annotate,\"M\",@progbits,8\n\t"	\
> > > > > > > +	".long 911b - .\n\t"					\
> > > > > > > +	".long " __stringify(x) "\n\t"				\
> > > > > > > +	".popsection\n\t"
> > > > > > 
> > > > > > Why mergeable and progbits?
> > > > > 
> > > > > In order to get sh_entsize ?
> > > > 
> > > > Is that a guess?  If so, it's not very convincing as I don't see what
> > > > entsize would have to do with it.
> > > 
> > > Oh, nevermind... I see it's a gas syntax issue.
> > 
> > Not a guess, only mergable gets entsize, and progbits is a required
> > argument per the syntax in order to specify entsize.
> 
> If you look at "readelf -WS vmlinux" there are plenty of non-mergeable
> sections with entsize.

Er, vmlinux.o

-- 
Josh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/12] objtool: Generic annotation infrastructure
  2024-11-20 16:03               ` Josh Poimboeuf
@ 2024-11-21 11:46                 ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2024-11-21 11:46 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: seanjc, pbonzini, tglx, linux-kernel, x86, kvm, jthoughton

On Wed, Nov 20, 2024 at 08:03:24AM -0800, Josh Poimboeuf wrote:

> > If you look at "readelf -WS vmlinux" there are plenty of non-mergeable
> > sections with entsize.
> 
> Er, vmlinux.o

Well yes, but how do you set entsize from as? The manual only mentions
entsize in relation to M(ergable) with the .section command.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2024-11-21 11:46 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-11 11:59 [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 01/12] objtool: Generic annotation infrastructure Peter Zijlstra
2024-11-15 18:38   ` Josh Poimboeuf
2024-11-16  9:33     ` Peter Zijlstra
2024-11-20  0:31       ` Josh Poimboeuf
2024-11-20  1:04         ` Josh Poimboeuf
2024-11-20  8:52           ` Peter Zijlstra
2024-11-20 16:03             ` Josh Poimboeuf
2024-11-20 16:03               ` Josh Poimboeuf
2024-11-21 11:46                 ` Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 02/12] objtool: Convert ANNOTATE_NOENDBR to ANNOTATE Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 03/12] objtool: Convert ANNOTATE_RETPOLINE_SAFE " Peter Zijlstra
2024-11-15 18:39   ` Josh Poimboeuf
2024-11-16  9:34     ` Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 04/12] objtool: Convert instrumentation_{begin,end}() " Peter Zijlstra
2024-11-15 18:40   ` Josh Poimboeuf
2024-11-16  9:36     ` Peter Zijlstra
2024-11-16  9:51       ` Peter Zijlstra
2024-11-16 10:06     ` Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 05/12] objtool: Convert VALIDATE_UNRET_BEGIN " Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 06/12] objtool: Convert ANNOTATE_IGNORE_ALTERNATIVE " Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 07/12] objtool: Convert ANNOTATE_INTRA_FUNCTION_CALLS " Peter Zijlstra
2024-11-15 18:40   ` Josh Poimboeuf
2024-11-16  9:37     ` Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 08/12] objtool: Collapse annotate sequences Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 09/12] x86/nospec: JMP_NOSPEC Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 10/12] x86,nospec: Simplify {JMP,CALL}_NOSPEC (part 2) Peter Zijlstra
2024-11-15 18:40   ` Josh Poimboeuf
2024-11-16  9:39     ` Peter Zijlstra
2024-11-11 11:59 ` [PATCH v2 11/12] x86/kvm/emulate: Implement test_cc() in C Peter Zijlstra
2024-11-11 17:13   ` Sean Christopherson
2024-11-11 11:59 ` [PATCH v2 12/12] x86/kvm/emulate: Avoid RET for fastops Peter Zijlstra
2024-11-11 16:27   ` Peter Zijlstra
2024-11-11 17:26   ` Sean Christopherson
2024-11-11 18:28     ` Peter Zijlstra
2024-11-15 18:41   ` Josh Poimboeuf
2024-11-16  9:39     ` Peter Zijlstra
2024-11-11 17:27 ` [PATCH v2 00/12] x86/kvm/emulate: Avoid RET for FASTOPs Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox