* [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF
@ 2023-12-15 9:12 Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 1/7] cfi: Flip headers Peter Zijlstra
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: Peter Zijlstra @ 2023-12-15 9:12 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: paul.walmsley, palmer, aou, tglx, mingo, bp, dave.hansen, x86,
hpa, davem, dsahern, daniel, andrii, martin.lau, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
Arnd Bergmann, samitolvanen, keescook, nathan, ndesaulniers,
linux-riscv, linux-kernel, netdev, bpf, linux-arch, llvm,
jpoimboe, joao, mark.rutland, peterz
Hi!
What started with the simple observation that bpf_dispatcher_*_func() was
broken for calling CFI functions with a __nocfi calling context under FineIBT
ended up with a complete BPF-wide CFI fixup.
With these changes the BPF selftest suite passes without crashing -- there are
still a few failures, but Alexei has graciously offered to look into those.
(Alexei, I have presumed your SoB on the very last patch, please update
as you see fit)
Changes since v2 are numerous but include:
- cfi_get_offset() -- as a means to communicate the offset (ast)
- 5 new patches fixing various BPF internals to be CFI clean
Note: it *might* be possible to merge the
bpf_tcp_ca.c:unsupported_ops[] thing into the CFI stubs; as it is,
get_info will have a NULL stub, unlike the others.
---
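A rough, purely hypothetical sketch of that idea (not part of this series):
a NULL entry in the per-ops cfi_stubs structure could double as the
"unsupported" marker, making a separate unsupported_ops[] list redundant.
The helper name below is invented for illustration.

  /* hypothetical helper; relies on the cfi_stubs pointer added in patch 4/7 */
  static bool bpf_struct_ops_op_unsupported(const struct bpf_struct_ops *st_ops, u32 moff)
  {
          return *(void **)(st_ops->cfi_stubs + moff) == NULL;
  }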
arch/riscv/include/asm/cfi.h | 3 +-
arch/riscv/kernel/cfi.c | 2 +-
arch/x86/include/asm/cfi.h | 126 +++++++++++++++++++++++++++++++++++++-
arch/x86/kernel/alternative.c | 87 +++++++++++++++++++++++---
arch/x86/kernel/cfi.c | 4 +-
arch/x86/net/bpf_jit_comp.c | 134 +++++++++++++++++++++++++++++++++++------
include/asm-generic/Kbuild | 1 +
include/linux/bpf.h | 27 ++++++++-
include/linux/cfi.h | 12 ++++
kernel/bpf/bpf_struct_ops.c | 16 ++---
kernel/bpf/core.c | 25 ++++++++
kernel/bpf/cpumask.c | 8 ++-
kernel/bpf/helpers.c | 18 +++++-
net/bpf/bpf_dummy_struct_ops.c | 31 +++++++++-
net/bpf/test_run.c | 15 ++++-
net/ipv4/bpf_tcp_ca.c | 69 +++++++++++++++++++++
16 files changed, 528 insertions(+), 50 deletions(-)
* [PATCH v3 1/7] cfi: Flip headers
2023-12-15 9:12 [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF Peter Zijlstra
@ 2023-12-15 9:12 ` Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 2/7] x86/cfi,bpf: Fix BPF JIT call Peter Zijlstra
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2023-12-15 9:12 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: paul.walmsley, palmer, aou, tglx, mingo, bp, dave.hansen, x86,
hpa, davem, dsahern, daniel, andrii, martin.lau, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
Arnd Bergmann, samitolvanen, keescook, nathan, ndesaulniers,
linux-riscv, linux-kernel, netdev, bpf, linux-arch, llvm,
jpoimboe, joao, mark.rutland, peterz
Normal include order is that linux/foo.h should include asm/foo.h; CFI has it
the wrong way around.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
---
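For illustration, the convention in sketch form: the generic header pulls in
the arch header, and the arch header must not include the generic one -- it
forward-declares what it needs instead.

  /* include/linux/foo.h */
  #include <asm/foo.h>

  /* arch/x86/include/asm/foo.h -- must not include <linux/foo.h> */
  struct pt_regs;         /* forward-declare instead */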
arch/riscv/include/asm/cfi.h | 3 ++-
arch/riscv/kernel/cfi.c | 2 +-
arch/x86/include/asm/cfi.h | 3 ++-
arch/x86/kernel/cfi.c | 4 ++--
include/asm-generic/Kbuild | 1 +
include/asm-generic/cfi.h | 5 +++++
include/linux/cfi.h | 1 +
7 files changed, 14 insertions(+), 5 deletions(-)
--- a/arch/riscv/include/asm/cfi.h
+++ b/arch/riscv/include/asm/cfi.h
@@ -7,8 +7,9 @@
*
* Copyright (C) 2023 Google LLC
*/
+#include <linux/bug.h>
-#include <linux/cfi.h>
+struct pt_regs;
#ifdef CONFIG_CFI_CLANG
enum bug_trap_type handle_cfi_failure(struct pt_regs *regs);
--- a/arch/riscv/kernel/cfi.c
+++ b/arch/riscv/kernel/cfi.c
@@ -4,7 +4,7 @@
*
* Copyright (C) 2023 Google LLC
*/
-#include <asm/cfi.h>
+#include <linux/cfi.h>
#include <asm/insn.h>
/*
--- a/arch/x86/include/asm/cfi.h
+++ b/arch/x86/include/asm/cfi.h
@@ -7,8 +7,9 @@
*
* Copyright (C) 2022 Google LLC
*/
+#include <linux/bug.h>
-#include <linux/cfi.h>
+struct pt_regs;
#ifdef CONFIG_CFI_CLANG
enum bug_trap_type handle_cfi_failure(struct pt_regs *regs);
--- a/arch/x86/kernel/cfi.c
+++ b/arch/x86/kernel/cfi.c
@@ -4,10 +4,10 @@
*
* Copyright (C) 2022 Google LLC
*/
-#include <asm/cfi.h>
+#include <linux/string.h>
+#include <linux/cfi.h>
#include <asm/insn.h>
#include <asm/insn-eval.h>
-#include <linux/string.h>
/*
* Returns the target address and the expected type when regs->ip points
--- a/include/asm-generic/Kbuild
+++ b/include/asm-generic/Kbuild
@@ -11,6 +11,7 @@ mandatory-y += bitops.h
mandatory-y += bug.h
mandatory-y += bugs.h
mandatory-y += cacheflush.h
+mandatory-y += cfi.h
mandatory-y += checksum.h
mandatory-y += compat.h
mandatory-y += current.h
--- /dev/null
+++ b/include/asm-generic/cfi.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_GENERIC_CFI_H
+#define __ASM_GENERIC_CFI_H
+
+#endif /* __ASM_GENERIC_CFI_H */
--- a/include/linux/cfi.h
+++ b/include/linux/cfi.h
@@ -9,6 +9,7 @@
#include <linux/bug.h>
#include <linux/module.h>
+#include <asm/cfi.h>
#ifdef CONFIG_CFI_CLANG
enum bug_trap_type report_cfi_failure(struct pt_regs *regs, unsigned long addr,
* [PATCH v3 2/7] x86/cfi,bpf: Fix BPF JIT call
2023-12-15 9:12 [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 1/7] cfi: Flip headers Peter Zijlstra
@ 2023-12-15 9:12 ` Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 3/7] x86/cfi,bpf: Fix bpf_callback_t CFI Peter Zijlstra
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2023-12-15 9:12 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: paul.walmsley, palmer, aou, tglx, mingo, bp, dave.hansen, x86,
hpa, davem, dsahern, daniel, andrii, martin.lau, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
Arnd Bergmann, samitolvanen, keescook, nathan, ndesaulniers,
linux-riscv, linux-kernel, netdev, bpf, linux-arch, llvm,
jpoimboe, joao, mark.rutland, peterz
The current BPF call convention is __nocfi, except when it calls !JIT things,
in which case it calls regular C functions.
It so happens that with FineIBT the __nocfi and C calling conventions are
incompatible. Specifically, __nocfi will call at func+0, while FineIBT will have
endbr-poison there, which is not a valid indirect target, causing #CP.
Notably this only triggers on IBT-enabled hardware, which is probably why this
hasn't been reported (also, most people will have the JIT on anyway).
Implement proper CFI prologues for the BPF JIT codegen and drop __nocfi for
x86.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
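In sketch form, the consequence for anything holding a pointer to the JITed
image (the actual code is in the bpf_int_jit_compile() hunk below): the CFI
preamble is emitted in front of the program text, so the C-callable entry
point is no longer image+0.

  emit_cfi(&prog);                    /* kCFI or FineIBT preamble */
  /* ... JIT the program itself ... */
  prog->bpf_func  = (void *)image + cfi_get_offset();
  prog->jited_len = proglen - cfi_get_offset();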
arch/x86/include/asm/cfi.h | 110 ++++++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/alternative.c | 47 ++++++++++++++---
arch/x86/net/bpf_jit_comp.c | 82 +++++++++++++++++++++++++++++--
include/linux/bpf.h | 12 +++-
include/linux/cfi.h | 7 ++
kernel/bpf/core.c | 25 +++++++++
6 files changed, 269 insertions(+), 14 deletions(-)
--- a/arch/x86/include/asm/cfi.h
+++ b/arch/x86/include/asm/cfi.h
@@ -9,15 +9,125 @@
*/
#include <linux/bug.h>
+/*
+ * An overview of the various calling conventions...
+ *
+ * Traditional:
+ *
+ * foo:
+ * ... code here ...
+ * ret
+ *
+ * direct caller:
+ * call foo
+ *
+ * indirect caller:
+ * lea foo(%rip), %r11
+ * ...
+ * call *%r11
+ *
+ *
+ * IBT:
+ *
+ * foo:
+ * endbr64
+ * ... code here ...
+ * ret
+ *
+ * direct caller:
+ * call foo / call foo+4
+ *
+ * indirect caller:
+ * lea foo(%rip), %r11
+ * ...
+ * call *%r11
+ *
+ *
+ * kCFI:
+ *
+ * __cfi_foo:
+ * movl $0x12345678, %eax
+ * # 11 nops when CONFIG_CALL_PADDING
+ * foo:
+ * endbr64 # when IBT
+ * ... code here ...
+ * ret
+ *
+ * direct call:
+ * call foo # / call foo+4 when IBT
+ *
+ * indirect call:
+ * lea foo(%rip), %r11
+ * ...
+ * movl $(-0x12345678), %r10d
+ * addl -4(%r11), %r10d # -15 when CONFIG_CALL_PADDING
+ * jz 1f
+ * ud2
+ * 1:call *%r11
+ *
+ *
+ * FineIBT (builds as kCFI + CALL_PADDING + IBT + RETPOLINE and runtime patches into):
+ *
+ * __cfi_foo:
+ * endbr64
+ * subl 0x12345678, %r10d
+ * jz foo
+ * ud2
+ * nop
+ * foo:
+ * osp nop3 # was endbr64
+ * ... code here ...
+ * ret
+ *
+ * direct caller:
+ * call foo / call foo+4
+ *
+ * indirect caller:
+ * lea foo(%rip), %r11
+ * ...
+ * movl $0x12345678, %r10d
+ * subl $16, %r11
+ * nop4
+ * call *%r11
+ *
+ */
+enum cfi_mode {
+ CFI_DEFAULT, /* FineIBT if hardware has IBT, otherwise kCFI */
+ CFI_OFF, /* Traditional / IBT depending on .config */
+ CFI_KCFI, /* Optionally CALL_PADDING, IBT, RETPOLINE */
+ CFI_FINEIBT, /* see arch/x86/kernel/alternative.c */
+};
+
+extern enum cfi_mode cfi_mode;
+
struct pt_regs;
#ifdef CONFIG_CFI_CLANG
enum bug_trap_type handle_cfi_failure(struct pt_regs *regs);
+#define __bpfcall
+extern u32 cfi_bpf_hash;
+
+static inline int cfi_get_offset(void)
+{
+ switch (cfi_mode) {
+ case CFI_FINEIBT:
+ return 16;
+ case CFI_KCFI:
+ if (IS_ENABLED(CONFIG_CALL_PADDING))
+ return 16;
+ return 5;
+ default:
+ return 0;
+ }
+}
+#define cfi_get_offset cfi_get_offset
+
#else
static inline enum bug_trap_type handle_cfi_failure(struct pt_regs *regs)
{
return BUG_TRAP_TYPE_NONE;
}
+#define cfi_bpf_hash 0U
#endif /* CONFIG_CFI_CLANG */
#endif /* _ASM_X86_CFI_H */
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -30,6 +30,7 @@
#include <asm/fixmap.h>
#include <asm/paravirt.h>
#include <asm/asm-prototypes.h>
+#include <asm/cfi.h>
int __read_mostly alternatives_patched;
@@ -832,15 +833,43 @@ void __init_or_module apply_seal_endbr(s
#endif /* CONFIG_X86_KERNEL_IBT */
#ifdef CONFIG_FINEIBT
+#define __CFI_DEFAULT CFI_DEFAULT
+#elif defined(CONFIG_CFI_CLANG)
+#define __CFI_DEFAULT CFI_KCFI
+#else
+#define __CFI_DEFAULT CFI_OFF
+#endif
-enum cfi_mode {
- CFI_DEFAULT,
- CFI_OFF,
- CFI_KCFI,
- CFI_FINEIBT,
-};
+enum cfi_mode cfi_mode __ro_after_init = __CFI_DEFAULT;
+
+#ifdef CONFIG_CFI_CLANG
+struct bpf_insn;
+
+/* Must match bpf_func_t / DEFINE_BPF_PROG_RUN() */
+extern unsigned int __bpf_prog_runX(const void *ctx,
+ const struct bpf_insn *insn);
+
+/*
+ * Force a reference to the external symbol so the compiler generates
+ * __kcfi_typeid.
+ */
+__ADDRESSABLE(__bpf_prog_runX);
+
+/* u32 __ro_after_init cfi_bpf_hash = __kcfi_typeid___bpf_prog_runX; */
+asm (
+" .pushsection .data..ro_after_init,\"aw\",@progbits \n"
+" .type cfi_bpf_hash,@object \n"
+" .globl cfi_bpf_hash \n"
+" .p2align 2, 0x0 \n"
+"cfi_bpf_hash: \n"
+" .long __kcfi_typeid___bpf_prog_runX \n"
+" .size cfi_bpf_hash, 4 \n"
+" .popsection \n"
+);
+#endif
+
+#ifdef CONFIG_FINEIBT
-static enum cfi_mode cfi_mode __ro_after_init = CFI_DEFAULT;
static bool cfi_rand __ro_after_init = true;
static u32 cfi_seed __ro_after_init;
@@ -1149,8 +1178,10 @@ static void __apply_fineibt(s32 *start_r
goto err;
if (cfi_rand) {
- if (builtin)
+ if (builtin) {
cfi_seed = get_random_u32();
+ cfi_bpf_hash = cfi_rehash(cfi_bpf_hash);
+ }
ret = cfi_rand_preamble(start_cfi, end_cfi);
if (ret)
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -17,6 +17,7 @@
#include <asm/nospec-branch.h>
#include <asm/text-patching.h>
#include <asm/unwind.h>
+#include <asm/cfi.h>
static bool all_callee_regs_used[4] = {true, true, true, true};
@@ -51,9 +52,11 @@ static u8 *emit_code(u8 *ptr, u32 bytes,
do { EMIT4(b1, b2, b3, b4); EMIT(off, 4); } while (0)
#ifdef CONFIG_X86_KERNEL_IBT
-#define EMIT_ENDBR() EMIT(gen_endbr(), 4)
+#define EMIT_ENDBR() EMIT(gen_endbr(), 4)
+#define EMIT_ENDBR_POISON() EMIT(gen_endbr_poison(), 4)
#else
#define EMIT_ENDBR()
+#define EMIT_ENDBR_POISON()
#endif
static bool is_imm8(int value)
@@ -305,6 +308,69 @@ static void pop_callee_regs(u8 **pprog,
}
/*
+ * Emit the various CFI preambles, see asm/cfi.h and the comments about FineIBT
+ * in arch/x86/kernel/alternative.c
+ */
+
+static void emit_fineibt(u8 **pprog)
+{
+ u8 *prog = *pprog;
+
+ EMIT_ENDBR();
+ EMIT3_off32(0x41, 0x81, 0xea, cfi_bpf_hash); /* subl $hash, %r10d */
+ EMIT2(0x74, 0x07); /* jz.d8 +7 */
+ EMIT2(0x0f, 0x0b); /* ud2 */
+ EMIT1(0x90); /* nop */
+ EMIT_ENDBR_POISON();
+
+ *pprog = prog;
+}
+
+static void emit_kcfi(u8 **pprog)
+{
+ u8 *prog = *pprog;
+
+ EMIT1_off32(0xb8, cfi_bpf_hash); /* movl $hash, %eax */
+#ifdef CONFIG_CALL_PADDING
+ EMIT1(0x90);
+ EMIT1(0x90);
+ EMIT1(0x90);
+ EMIT1(0x90);
+ EMIT1(0x90);
+ EMIT1(0x90);
+ EMIT1(0x90);
+ EMIT1(0x90);
+ EMIT1(0x90);
+ EMIT1(0x90);
+ EMIT1(0x90);
+#endif
+ EMIT_ENDBR();
+
+ *pprog = prog;
+}
+
+static void emit_cfi(u8 **pprog)
+{
+ u8 *prog = *pprog;
+
+ switch (cfi_mode) {
+ case CFI_FINEIBT:
+ emit_fineibt(&prog);
+ break;
+
+ case CFI_KCFI:
+ emit_kcfi(&prog);
+ break;
+
+ default:
+ EMIT_ENDBR();
+ break;
+ }
+
+ *pprog = prog;
+}
+
+/*
* Emit x86-64 prologue code for BPF program.
* bpf_tail_call helper will skip the first X86_TAIL_CALL_OFFSET bytes
* while jumping to another program
@@ -315,10 +381,10 @@ static void emit_prologue(u8 **pprog, u3
{
u8 *prog = *pprog;
+ emit_cfi(&prog);
/* BPF trampoline can be made to work without these nops,
* but let's waste 5 bytes for now and optimize later
*/
- EMIT_ENDBR();
memcpy(prog, x86_nops[5], X86_PATCH_SIZE);
prog += X86_PATCH_SIZE;
if (!ebpf_from_cbpf) {
@@ -3013,9 +3079,16 @@ struct bpf_prog *bpf_int_jit_compile(str
jit_data->header = header;
jit_data->rw_header = rw_header;
}
- prog->bpf_func = (void *)image;
+ /*
+ * ctx.prog_offset is used when CFI preambles put code *before*
+ * the function. See emit_cfi(). For FineIBT specifically this code
+ * can also be executed and bpf_prog_kallsyms_add() will
+ * generate an additional symbol to cover this, hence also
+ * decrement proglen.
+ */
+ prog->bpf_func = (void *)image + cfi_get_offset();
prog->jited = 1;
- prog->jited_len = proglen;
+ prog->jited_len = proglen - cfi_get_offset();
} else {
prog = orig_prog;
}
@@ -3070,6 +3143,7 @@ void bpf_jit_free(struct bpf_prog *prog)
kvfree(jit_data->addrs);
kfree(jit_data);
}
+ prog->bpf_func = (void *)prog->bpf_func - cfi_get_offset();
hdr = bpf_jit_binary_pack_hdr(prog);
bpf_jit_binary_pack_free(hdr, NULL);
WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(prog));
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -29,6 +29,7 @@
#include <linux/rcupdate_trace.h>
#include <linux/static_call.h>
#include <linux/memcontrol.h>
+#include <linux/cfi.h>
struct bpf_verifier_env;
struct bpf_verifier_log;
@@ -1211,7 +1212,11 @@ struct bpf_dispatcher {
#endif
};
-static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func(
+#ifndef __bpfcall
+#define __bpfcall __nocfi
+#endif
+
+static __always_inline __bpfcall unsigned int bpf_dispatcher_nop_func(
const void *ctx,
const struct bpf_insn *insnsi,
bpf_func_t bpf_func)
@@ -1303,7 +1308,7 @@ int arch_prepare_bpf_dispatcher(void *im
#define DEFINE_BPF_DISPATCHER(name) \
__BPF_DISPATCHER_SC(name); \
- noinline __nocfi unsigned int bpf_dispatcher_##name##_func( \
+ noinline __bpfcall unsigned int bpf_dispatcher_##name##_func( \
const void *ctx, \
const struct bpf_insn *insnsi, \
bpf_func_t bpf_func) \
@@ -1453,6 +1458,9 @@ struct bpf_prog_aux {
struct bpf_kfunc_desc_tab *kfunc_tab;
struct bpf_kfunc_btf_tab *kfunc_btf_tab;
u32 size_poke_tab;
+#ifdef CONFIG_FINEIBT
+ struct bpf_ksym ksym_prefix;
+#endif
struct bpf_ksym ksym;
const struct bpf_prog_ops *ops;
struct bpf_map **used_maps;
--- a/include/linux/cfi.h
+++ b/include/linux/cfi.h
@@ -11,6 +11,13 @@
#include <linux/module.h>
#include <asm/cfi.h>
+#ifndef cfi_get_offset
+static inline int cfi_get_offset(void)
+{
+ return 0;
+}
+#endif
+
#ifdef CONFIG_CFI_CLANG
enum bug_trap_type report_cfi_failure(struct pt_regs *regs, unsigned long addr,
unsigned long *target, u32 type);
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -121,6 +121,9 @@ struct bpf_prog *bpf_prog_alloc_no_stats
#endif
INIT_LIST_HEAD_RCU(&fp->aux->ksym.lnode);
+#ifdef CONFIG_FINEIBT
+ INIT_LIST_HEAD_RCU(&fp->aux->ksym_prefix.lnode);
+#endif
mutex_init(&fp->aux->used_maps_mutex);
mutex_init(&fp->aux->dst_mutex);
@@ -683,6 +686,23 @@ void bpf_prog_kallsyms_add(struct bpf_pr
fp->aux->ksym.prog = true;
bpf_ksym_add(&fp->aux->ksym);
+
+#ifdef CONFIG_FINEIBT
+ /*
+ * When FineIBT, code in the __cfi_foo() symbols can get executed
+ * and hence unwinder needs help.
+ */
+ if (cfi_mode != CFI_FINEIBT)
+ return;
+
+ snprintf(fp->aux->ksym_prefix.name, KSYM_NAME_LEN,
+ "__cfi_%s", fp->aux->ksym.name);
+
+ fp->aux->ksym_prefix.start = (unsigned long) fp->bpf_func - 16;
+ fp->aux->ksym_prefix.end = (unsigned long) fp->bpf_func;
+
+ bpf_ksym_add(&fp->aux->ksym_prefix);
+#endif
}
void bpf_prog_kallsyms_del(struct bpf_prog *fp)
@@ -691,6 +711,11 @@ void bpf_prog_kallsyms_del(struct bpf_pr
return;
bpf_ksym_del(&fp->aux->ksym);
+#ifdef CONFIG_FINEIBT
+ if (cfi_mode != CFI_FINEIBT)
+ return;
+ bpf_ksym_del(&fp->aux->ksym_prefix);
+#endif
}
static struct bpf_ksym *bpf_ksym_find(unsigned long addr)
* [PATCH v3 3/7] x86/cfi,bpf: Fix bpf_callback_t CFI
2023-12-15 9:12 [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 1/7] cfi: Flip headers Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 2/7] x86/cfi,bpf: Fix BPF JIT call Peter Zijlstra
@ 2023-12-15 9:12 ` Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 4/7] x86/cfi,bpf: Fix bpf_struct_ops CFI Peter Zijlstra
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2023-12-15 9:12 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: paul.walmsley, palmer, aou, tglx, mingo, bp, dave.hansen, x86,
hpa, davem, dsahern, daniel, andrii, martin.lau, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
Arnd Bergmann, samitolvanen, keescook, nathan, ndesaulniers,
linux-riscv, linux-kernel, netdev, bpf, linux-arch, llvm,
jpoimboe, joao, mark.rutland, peterz
Where the main BPF program is expected to match bpf_func_t,
sub-programs are expected to match bpf_callback_t.
This fixes things like:
tools/testing/selftests/bpf/progs/bloom_filter_bench.c:
bpf_for_each_map_elem(&array_map, bloom_callback, &data, 0);
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
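For reference, the two prototypes involved (function names here are
illustrative; they correspond to the kernel's bpf_func_t and bpf_callback_t
typedefs, and the hash-carrying declarations live in the alternative.c hunk
below):

  /* main program entry, matches DEFINE_BPF_PROG_RUN() */
  unsigned int bpf_func(const void *ctx, const struct bpf_insn *insnsi);

  /* sub-program / callback entry, e.g. bloom_callback above */
  u64 bpf_callback(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);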
arch/x86/include/asm/cfi.h | 2 ++
arch/x86/kernel/alternative.c | 18 ++++++++++++++++++
arch/x86/net/bpf_jit_comp.c | 18 ++++++++++--------
3 files changed, 30 insertions(+), 8 deletions(-)
--- a/arch/x86/include/asm/cfi.h
+++ b/arch/x86/include/asm/cfi.h
@@ -106,6 +106,7 @@ struct pt_regs;
enum bug_trap_type handle_cfi_failure(struct pt_regs *regs);
#define __bpfcall
extern u32 cfi_bpf_hash;
+extern u32 cfi_bpf_subprog_hash;
static inline int cfi_get_offset(void)
{
@@ -128,6 +129,7 @@ static inline enum bug_trap_type handle_
return BUG_TRAP_TYPE_NONE;
}
#define cfi_bpf_hash 0U
+#define cfi_bpf_subprog_hash 0U
#endif /* CONFIG_CFI_CLANG */
#endif /* _ASM_X86_CFI_H */
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -866,6 +866,23 @@ asm (
" .size cfi_bpf_hash, 4 \n"
" .popsection \n"
);
+
+/* Must match bpf_callback_t */
+extern u64 __bpf_callback_fn(u64, u64, u64, u64, u64);
+
+__ADDRESSABLE(__bpf_callback_fn);
+
+/* u32 __ro_after_init cfi_bpf_subprog_hash = __kcfi_typeid___bpf_callback_fn; */
+asm (
+" .pushsection .data..ro_after_init,\"aw\",@progbits \n"
+" .type cfi_bpf_subprog_hash,@object \n"
+" .globl cfi_bpf_subprog_hash \n"
+" .p2align 2, 0x0 \n"
+"cfi_bpf_subprog_hash: \n"
+" .long __kcfi_typeid___bpf_callback_fn \n"
+" .size cfi_bpf_subprog_hash, 4 \n"
+" .popsection \n"
+);
#endif
#ifdef CONFIG_FINEIBT
@@ -1181,6 +1198,7 @@ static void __apply_fineibt(s32 *start_r
if (builtin) {
cfi_seed = get_random_u32();
cfi_bpf_hash = cfi_rehash(cfi_bpf_hash);
+ cfi_bpf_subprog_hash = cfi_rehash(cfi_bpf_subprog_hash);
}
ret = cfi_rand_preamble(start_cfi, end_cfi);
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -312,12 +312,13 @@ static void pop_callee_regs(u8 **pprog,
* in arch/x86/kernel/alternative.c
*/
-static void emit_fineibt(u8 **pprog)
+static void emit_fineibt(u8 **pprog, bool is_subprog)
{
+ u32 hash = is_subprog ? cfi_bpf_subprog_hash : cfi_bpf_hash;
u8 *prog = *pprog;
EMIT_ENDBR();
- EMIT3_off32(0x41, 0x81, 0xea, cfi_bpf_hash); /* subl $hash, %r10d */
+ EMIT3_off32(0x41, 0x81, 0xea, hash); /* subl $hash, %r10d */
EMIT2(0x74, 0x07); /* jz.d8 +7 */
EMIT2(0x0f, 0x0b); /* ud2 */
EMIT1(0x90); /* nop */
@@ -326,11 +327,12 @@ static void emit_fineibt(u8 **pprog)
*pprog = prog;
}
-static void emit_kcfi(u8 **pprog)
+static void emit_kcfi(u8 **pprog, bool is_subprog)
{
+ u32 hash = is_subprog ? cfi_bpf_subprog_hash : cfi_bpf_hash;
u8 *prog = *pprog;
- EMIT1_off32(0xb8, cfi_bpf_hash); /* movl $hash, %eax */
+ EMIT1_off32(0xb8, hash); /* movl $hash, %eax */
#ifdef CONFIG_CALL_PADDING
EMIT1(0x90);
EMIT1(0x90);
@@ -349,17 +351,17 @@ static void emit_kcfi(u8 **pprog)
*pprog = prog;
}
-static void emit_cfi(u8 **pprog)
+static void emit_cfi(u8 **pprog, bool is_subprog)
{
u8 *prog = *pprog;
switch (cfi_mode) {
case CFI_FINEIBT:
- emit_fineibt(&prog);
+ emit_fineibt(&prog, is_subprog);
break;
case CFI_KCFI:
- emit_kcfi(&prog);
+ emit_kcfi(&prog, is_subprog);
break;
default:
@@ -381,7 +383,7 @@ static void emit_prologue(u8 **pprog, u3
{
u8 *prog = *pprog;
- emit_cfi(&prog);
+ emit_cfi(&prog, is_subprog);
/* BPF trampoline can be made to work without these nops,
* but let's waste 5 bytes for now and optimize later
*/
* [PATCH v3 4/7] x86/cfi,bpf: Fix bpf_struct_ops CFI
2023-12-15 9:12 [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF Peter Zijlstra
` (2 preceding siblings ...)
2023-12-15 9:12 ` [PATCH v3 3/7] x86/cfi,bpf: Fix bpf_callback_t CFI Peter Zijlstra
@ 2023-12-15 9:12 ` Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 5/7] cfi: Add CFI_NOSEAL() Peter Zijlstra
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2023-12-15 9:12 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: paul.walmsley, palmer, aou, tglx, mingo, bp, dave.hansen, x86,
hpa, davem, dsahern, daniel, andrii, martin.lau, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
Arnd Bergmann, samitolvanen, keescook, nathan, ndesaulniers,
linux-riscv, linux-kernel, netdev, bpf, linux-arch, llvm,
jpoimboe, joao, mark.rutland, peterz
BPF struct_ops uses __arch_prepare_bpf_trampoline() to write
trampolines for indirect function calls. These trampolines must have
matching CFI.
In order to obtain the correct CFI hash for the various methods, add a
matching structure that contains stub functions; the compiler will
generate correct CFI for these, which we can pilfer for the trampolines.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
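The mechanism in sketch form (the real stubs and cfi_get_func_hash() are in
the hunks below):

  /* one stub per method; the compiler emits a kCFI preamble for each */
  static u32 bpf_tcp_ca_ssthresh(struct sock *sk) { return 0; }

  static struct tcp_congestion_ops __bpf_ops_tcp_congestion_ops = {
          .ssthresh = bpf_tcp_ca_ssthresh,
          /* ... */
  };

  /* when writing the struct_ops trampoline, read the hash back from the
   * stub's preamble and emit a matching preamble on the trampoline itself
   */
  hash = cfi_get_func_hash(*(void **)(st_ops->cfi_stubs + moff));
  emit_cfi(&prog, hash);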
arch/x86/include/asm/cfi.h | 6 +++
arch/x86/kernel/alternative.c | 22 +++++++++++++
arch/x86/net/bpf_jit_comp.c | 66 +++++++++++++++++++++++++--------------
include/linux/bpf.h | 13 +++++++
kernel/bpf/bpf_struct_ops.c | 16 ++++-----
net/bpf/bpf_dummy_struct_ops.c | 31 +++++++++++++++++-
net/ipv4/bpf_tcp_ca.c | 69 +++++++++++++++++++++++++++++++++++++++++
7 files changed, 191 insertions(+), 32 deletions(-)
--- a/arch/x86/include/asm/cfi.h
+++ b/arch/x86/include/asm/cfi.h
@@ -123,6 +123,8 @@ static inline int cfi_get_offset(void)
}
#define cfi_get_offset cfi_get_offset
+extern u32 cfi_get_func_hash(void *func);
+
#else
static inline enum bug_trap_type handle_cfi_failure(struct pt_regs *regs)
{
@@ -130,6 +132,10 @@ static inline enum bug_trap_type handle_
}
#define cfi_bpf_hash 0U
#define cfi_bpf_subprog_hash 0U
+static inline u32 cfi_get_func_hash(void *func)
+{
+ return 0;
+}
#endif /* CONFIG_CFI_CLANG */
#endif /* _ASM_X86_CFI_H */
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -883,6 +883,28 @@ asm (
" .size cfi_bpf_subprog_hash, 4 \n"
" .popsection \n"
);
+
+u32 cfi_get_func_hash(void *func)
+{
+ u32 hash;
+
+ func -= cfi_get_offset();
+ switch (cfi_mode) {
+ case CFI_FINEIBT:
+ func += 7;
+ break;
+ case CFI_KCFI:
+ func += 1;
+ break;
+ default:
+ return 0;
+ }
+
+ if (get_kernel_nofault(hash, func))
+ return 0;
+
+ return hash;
+}
#endif
#ifdef CONFIG_FINEIBT
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -312,9 +312,8 @@ static void pop_callee_regs(u8 **pprog,
* in arch/x86/kernel/alternative.c
*/
-static void emit_fineibt(u8 **pprog, bool is_subprog)
+static void emit_fineibt(u8 **pprog, u32 hash)
{
- u32 hash = is_subprog ? cfi_bpf_subprog_hash : cfi_bpf_hash;
u8 *prog = *pprog;
EMIT_ENDBR();
@@ -327,9 +326,8 @@ static void emit_fineibt(u8 **pprog, boo
*pprog = prog;
}
-static void emit_kcfi(u8 **pprog, bool is_subprog)
+static void emit_kcfi(u8 **pprog, u32 hash)
{
- u32 hash = is_subprog ? cfi_bpf_subprog_hash : cfi_bpf_hash;
u8 *prog = *pprog;
EMIT1_off32(0xb8, hash); /* movl $hash, %eax */
@@ -351,17 +349,17 @@ static void emit_kcfi(u8 **pprog, bool i
*pprog = prog;
}
-static void emit_cfi(u8 **pprog, bool is_subprog)
+static void emit_cfi(u8 **pprog, u32 hash)
{
u8 *prog = *pprog;
switch (cfi_mode) {
case CFI_FINEIBT:
- emit_fineibt(&prog, is_subprog);
+ emit_fineibt(&prog, hash);
break;
case CFI_KCFI:
- emit_kcfi(&prog, is_subprog);
+ emit_kcfi(&prog, hash);
break;
default:
@@ -383,7 +381,7 @@ static void emit_prologue(u8 **pprog, u3
{
u8 *prog = *pprog;
- emit_cfi(&prog, is_subprog);
+ emit_cfi(&prog, is_subprog ? cfi_bpf_subprog_hash : cfi_bpf_hash);
/* BPF trampoline can be made to work without these nops,
* but let's waste 5 bytes for now and optimize later
*/
@@ -2510,10 +2508,19 @@ static int __arch_prepare_bpf_trampoline
u8 *prog;
bool save_ret;
+ /*
+ * F_INDIRECT is only compatible with F_RET_FENTRY_RET, it is
+ * explicitly incompatible with F_CALL_ORIG | F_SKIP_FRAME | F_IP_ARG
+ * because @func_addr.
+ */
+ WARN_ON_ONCE((flags & BPF_TRAMP_F_INDIRECT) &&
+ (flags & ~(BPF_TRAMP_F_INDIRECT | BPF_TRAMP_F_RET_FENTRY_RET)));
+
/* extra registers for struct arguments */
- for (i = 0; i < m->nr_args; i++)
+ for (i = 0; i < m->nr_args; i++) {
if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
nr_regs += (m->arg_size[i] + 7) / 8 - 1;
+ }
/* x86-64 supports up to MAX_BPF_FUNC_ARGS arguments. 1-6
* are passed through regs, the remains are through stack.
@@ -2596,20 +2603,27 @@ static int __arch_prepare_bpf_trampoline
prog = rw_image;
- EMIT_ENDBR();
- /*
- * This is the direct-call trampoline, as such it needs accounting
- * for the __fentry__ call.
- */
- x86_call_depth_emit_accounting(&prog, NULL);
+ if (flags & BPF_TRAMP_F_INDIRECT) {
+ /*
+ * Indirect call for bpf_struct_ops
+ */
+ emit_cfi(&prog, cfi_get_func_hash(func_addr));
+ } else {
+ /*
+ * Direct-call fentry stub, as such it needs accounting for the
+ * __fentry__ call.
+ */
+ x86_call_depth_emit_accounting(&prog, NULL);
+ }
EMIT1(0x55); /* push rbp */
EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */
- if (!is_imm8(stack_size))
+ if (!is_imm8(stack_size)) {
/* sub rsp, stack_size */
EMIT3_off32(0x48, 0x81, 0xEC, stack_size);
- else
+ } else {
/* sub rsp, stack_size */
EMIT4(0x48, 0x83, 0xEC, stack_size);
+ }
if (flags & BPF_TRAMP_F_TAIL_CALL_CTX)
EMIT1(0x50); /* push rax */
/* mov QWORD PTR [rbp - rbx_off], rbx */
@@ -2643,10 +2657,11 @@ static int __arch_prepare_bpf_trampoline
}
}
- if (fentry->nr_links)
+ if (fentry->nr_links) {
if (invoke_bpf(m, &prog, fentry, regs_off, run_ctx_off,
flags & BPF_TRAMP_F_RET_FENTRY_RET, image, rw_image))
return -EINVAL;
+ }
if (fmod_ret->nr_links) {
branches = kcalloc(fmod_ret->nr_links, sizeof(u8 *),
@@ -2665,11 +2680,12 @@ static int __arch_prepare_bpf_trampoline
restore_regs(m, &prog, regs_off);
save_args(m, &prog, arg_stack_off, true);
- if (flags & BPF_TRAMP_F_TAIL_CALL_CTX)
+ if (flags & BPF_TRAMP_F_TAIL_CALL_CTX) {
/* Before calling the original function, restore the
* tail_call_cnt from stack to rax.
*/
RESTORE_TAIL_CALL_CNT(stack_size);
+ }
if (flags & BPF_TRAMP_F_ORIG_STACK) {
emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, 8);
@@ -2698,17 +2714,19 @@ static int __arch_prepare_bpf_trampoline
/* Update the branches saved in invoke_bpf_mod_ret with the
* aligned address of do_fexit.
*/
- for (i = 0; i < fmod_ret->nr_links; i++)
+ for (i = 0; i < fmod_ret->nr_links; i++) {
emit_cond_near_jump(&branches[i], image + (prog - (u8 *)rw_image),
image + (branches[i] - (u8 *)rw_image), X86_JNE);
+ }
}
- if (fexit->nr_links)
+ if (fexit->nr_links) {
if (invoke_bpf(m, &prog, fexit, regs_off, run_ctx_off,
false, image, rw_image)) {
ret = -EINVAL;
goto cleanup;
}
+ }
if (flags & BPF_TRAMP_F_RESTORE_REGS)
restore_regs(m, &prog, regs_off);
@@ -2725,11 +2743,12 @@ static int __arch_prepare_bpf_trampoline
ret = -EINVAL;
goto cleanup;
}
- } else if (flags & BPF_TRAMP_F_TAIL_CALL_CTX)
+ } else if (flags & BPF_TRAMP_F_TAIL_CALL_CTX) {
/* Before running the original function, restore the
* tail_call_cnt from stack to rax.
*/
RESTORE_TAIL_CALL_CNT(stack_size);
+ }
/* restore return value of orig_call or fentry prog back into RAX */
if (save_ret)
@@ -2737,9 +2756,10 @@ static int __arch_prepare_bpf_trampoline
emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, -rbx_off);
EMIT1(0xC9); /* leave */
- if (flags & BPF_TRAMP_F_SKIP_FRAME)
+ if (flags & BPF_TRAMP_F_SKIP_FRAME) {
/* skip our return address and return to parent */
EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
+ }
emit_return(&prog, image + (prog - (u8 *)rw_image));
/* Make sure the trampoline generation logic doesn't overflow */
if (WARN_ON_ONCE(prog > (u8 *)rw_image_end - BPF_INSN_SAFETY)) {
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1060,6 +1060,17 @@ struct btf_func_model {
*/
#define BPF_TRAMP_F_TAIL_CALL_CTX BIT(7)
+/*
+ * Indicate the trampoline should be suitable to receive indirect calls;
+ * without this indirectly calling the generated code can result in #UD/#CP,
+ * depending on the CFI options.
+ *
+ * Used by bpf_struct_ops.
+ *
+ * Incompatible with FENTRY usage, overloads @func_addr argument.
+ */
+#define BPF_TRAMP_F_INDIRECT BIT(8)
+
/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
* bytes on x86.
*/
@@ -1695,6 +1706,7 @@ struct bpf_struct_ops {
struct btf_func_model func_models[BPF_STRUCT_OPS_MAX_NR_MEMBERS];
u32 type_id;
u32 value_id;
+ void *cfi_stubs;
};
#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
@@ -1708,6 +1720,7 @@ int bpf_struct_ops_map_sys_lookup_elem(s
int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
struct bpf_tramp_link *link,
const struct btf_func_model *model,
+ void *stub_func,
void *image, void *image_end);
static inline bool bpf_try_module_get(const void *data, struct module *owner)
{
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -352,17 +352,16 @@ const struct bpf_link_ops bpf_struct_ops
int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
struct bpf_tramp_link *link,
const struct btf_func_model *model,
- void *image, void *image_end)
+ void *stub_func, void *image, void *image_end)
{
- u32 flags;
+ u32 flags = BPF_TRAMP_F_INDIRECT;
int size;
tlinks[BPF_TRAMP_FENTRY].links[0] = link;
tlinks[BPF_TRAMP_FENTRY].nr_links = 1;
- /* BPF_TRAMP_F_RET_FENTRY_RET is only used by bpf_struct_ops,
- * and it must be used alone.
- */
- flags = model->ret_size > 0 ? BPF_TRAMP_F_RET_FENTRY_RET : 0;
+
+ if (model->ret_size > 0)
+ flags |= BPF_TRAMP_F_RET_FENTRY_RET;
size = arch_bpf_trampoline_size(model, flags, tlinks, NULL);
if (size < 0)
@@ -370,7 +369,7 @@ int bpf_struct_ops_prepare_trampoline(st
if (size > (unsigned long)image_end - (unsigned long)image)
return -E2BIG;
return arch_prepare_bpf_trampoline(NULL, image, image_end,
- model, flags, tlinks, NULL);
+ model, flags, tlinks, stub_func);
}
static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
@@ -504,11 +503,12 @@ static long bpf_struct_ops_map_update_el
err = bpf_struct_ops_prepare_trampoline(tlinks, link,
&st_ops->func_models[i],
+ *(void **)(st_ops->cfi_stubs + moff),
image, image_end);
if (err < 0)
goto reset_unlock;
- *(void **)(kdata + moff) = image;
+ *(void **)(kdata + moff) = image + cfi_get_offset();
image += err;
/* put prog_id to udata */
--- a/net/bpf/bpf_dummy_struct_ops.c
+++ b/net/bpf/bpf_dummy_struct_ops.c
@@ -12,6 +12,11 @@ extern struct bpf_struct_ops bpf_bpf_dum
/* A common type for test_N with return value in bpf_dummy_ops */
typedef int (*dummy_ops_test_ret_fn)(struct bpf_dummy_ops_state *state, ...);
+static int dummy_ops_test_ret_function(struct bpf_dummy_ops_state *state, ...)
+{
+ return 0;
+}
+
struct bpf_dummy_ops_test_args {
u64 args[MAX_BPF_FUNC_ARGS];
struct bpf_dummy_ops_state state;
@@ -62,7 +67,7 @@ static int dummy_ops_copy_args(struct bp
static int dummy_ops_call_op(void *image, struct bpf_dummy_ops_test_args *args)
{
- dummy_ops_test_ret_fn test = (void *)image;
+ dummy_ops_test_ret_fn test = (void *)image + cfi_get_offset();
struct bpf_dummy_ops_state *state = NULL;
/* state needs to be NULL if args[0] is 0 */
@@ -119,6 +124,7 @@ int bpf_struct_ops_test_run(struct bpf_p
op_idx = prog->expected_attach_type;
err = bpf_struct_ops_prepare_trampoline(tlinks, link,
&st_ops->func_models[op_idx],
+ &dummy_ops_test_ret_function,
image, image + PAGE_SIZE);
if (err < 0)
goto out;
@@ -219,6 +225,28 @@ static void bpf_dummy_unreg(void *kdata)
{
}
+static int bpf_dummy_test_1(struct bpf_dummy_ops_state *cb)
+{
+ return 0;
+}
+
+static int bpf_dummy_test_2(struct bpf_dummy_ops_state *cb, int a1, unsigned short a2,
+ char a3, unsigned long a4)
+{
+ return 0;
+}
+
+static int bpf_dummy_test_sleepable(struct bpf_dummy_ops_state *cb)
+{
+ return 0;
+}
+
+static struct bpf_dummy_ops __bpf_bpf_dummy_ops = {
+ .test_1 = bpf_dummy_test_1,
+ .test_2 = bpf_dummy_test_2,
+ .test_sleepable = bpf_dummy_test_sleepable,
+};
+
struct bpf_struct_ops bpf_bpf_dummy_ops = {
.verifier_ops = &bpf_dummy_verifier_ops,
.init = bpf_dummy_init,
@@ -227,4 +255,5 @@ struct bpf_struct_ops bpf_bpf_dummy_ops
.reg = bpf_dummy_reg,
.unreg = bpf_dummy_unreg,
.name = "bpf_dummy_ops",
+ .cfi_stubs = &__bpf_bpf_dummy_ops,
};
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -271,6 +271,74 @@ static int bpf_tcp_ca_validate(void *kda
return tcp_validate_congestion_control(kdata);
}
+static u32 bpf_tcp_ca_ssthresh(struct sock *sk)
+{
+ return 0;
+}
+
+static void bpf_tcp_ca_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+{
+}
+
+static void bpf_tcp_ca_set_state(struct sock *sk, u8 new_state)
+{
+}
+
+static void bpf_tcp_ca_cwnd_event(struct sock *sk, enum tcp_ca_event ev)
+{
+}
+
+static void bpf_tcp_ca_in_ack_event(struct sock *sk, u32 flags)
+{
+}
+
+static void bpf_tcp_ca_pkts_acked(struct sock *sk, const struct ack_sample *sample)
+{
+}
+
+static u32 bpf_tcp_ca_min_tso_segs(struct sock *sk)
+{
+ return 0;
+}
+
+static void bpf_tcp_ca_cong_control(struct sock *sk, const struct rate_sample *rs)
+{
+}
+
+static u32 bpf_tcp_ca_undo_cwnd(struct sock *sk)
+{
+ return 0;
+}
+
+static u32 bpf_tcp_ca_sndbuf_expand(struct sock *sk)
+{
+ return 0;
+}
+
+static void __bpf_tcp_ca_init(struct sock *sk)
+{
+}
+
+static void __bpf_tcp_ca_release(struct sock *sk)
+{
+}
+
+static struct tcp_congestion_ops __bpf_ops_tcp_congestion_ops = {
+ .ssthresh = bpf_tcp_ca_ssthresh,
+ .cong_avoid = bpf_tcp_ca_cong_avoid,
+ .set_state = bpf_tcp_ca_set_state,
+ .cwnd_event = bpf_tcp_ca_cwnd_event,
+ .in_ack_event = bpf_tcp_ca_in_ack_event,
+ .pkts_acked = bpf_tcp_ca_pkts_acked,
+ .min_tso_segs = bpf_tcp_ca_min_tso_segs,
+ .cong_control = bpf_tcp_ca_cong_control,
+ .undo_cwnd = bpf_tcp_ca_undo_cwnd,
+ .sndbuf_expand = bpf_tcp_ca_sndbuf_expand,
+
+ .init = __bpf_tcp_ca_init,
+ .release = __bpf_tcp_ca_release,
+};
+
struct bpf_struct_ops bpf_tcp_congestion_ops = {
.verifier_ops = &bpf_tcp_ca_verifier_ops,
.reg = bpf_tcp_ca_reg,
@@ -281,6 +349,7 @@ struct bpf_struct_ops bpf_tcp_congestion
.init = bpf_tcp_ca_init,
.validate = bpf_tcp_ca_validate,
.name = "tcp_congestion_ops",
+ .cfi_stubs = &__bpf_ops_tcp_congestion_ops,
};
static int __init bpf_tcp_ca_kfunc_init(void)
* [PATCH v3 5/7] cfi: Add CFI_NOSEAL()
2023-12-15 9:12 [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF Peter Zijlstra
` (3 preceding siblings ...)
2023-12-15 9:12 ` [PATCH v3 4/7] x86/cfi,bpf: Fix bpf_struct_ops CFI Peter Zijlstra
@ 2023-12-15 9:12 ` Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 6/7] bpf: Fix dtor CFI Peter Zijlstra
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2023-12-15 9:12 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: paul.walmsley, palmer, aou, tglx, mingo, bp, dave.hansen, x86,
hpa, davem, dsahern, daniel, andrii, martin.lau, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
Arnd Bergmann, samitolvanen, keescook, nathan, ndesaulniers,
linux-riscv, linux-kernel, netdev, bpf, linux-arch, llvm,
jpoimboe, joao, mark.rutland, peterz
Add a CFI_NOSEAL() helper to mark functions that need to retain their
CFI information, despite not otherwise leaking their address.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
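Typical usage, as patch 6/7 then does for the dtor wrappers:

  __bpf_kfunc void bpf_task_release_dtor(void *p)
  {
          put_task_struct_rcu_user(p);
  }
  CFI_NOSEAL(bpf_task_release_dtor);  /* retain CFI info despite no address-of use */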
arch/x86/include/asm/cfi.h | 5 +++++
include/linux/cfi.h | 4 ++++
2 files changed, 9 insertions(+)
--- a/arch/x86/include/asm/cfi.h
+++ b/arch/x86/include/asm/cfi.h
@@ -8,6 +8,7 @@
* Copyright (C) 2022 Google LLC
*/
#include <linux/bug.h>
+#include <asm/ibt.h>
/*
* An overview of the various calling conventions...
@@ -138,4 +139,8 @@ static inline u32 cfi_get_func_hash(void
}
#endif /* CONFIG_CFI_CLANG */
+#if HAS_KERNEL_IBT == 1
+#define CFI_NOSEAL(x) asm(IBT_NOSEAL(__stringify(x)))
+#endif
+
#endif /* _ASM_X86_CFI_H */
--- a/include/linux/cfi.h
+++ b/include/linux/cfi.h
@@ -46,4 +46,8 @@ static inline void module_cfi_finalize(c
#endif /* CONFIG_ARCH_USES_CFI_TRAPS */
#endif /* CONFIG_MODULES */
+#ifndef CFI_NOSEAL
+#define CFI_NOSEAL(x)
+#endif
+
#endif /* _LINUX_CFI_H */
* [PATCH v3 6/7] bpf: Fix dtor CFI
2023-12-15 9:12 [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF Peter Zijlstra
` (4 preceding siblings ...)
2023-12-15 9:12 ` [PATCH v3 5/7] cfi: Add CFI_NOSEAL() Peter Zijlstra
@ 2023-12-15 9:12 ` Peter Zijlstra
2023-12-15 9:12 ` [PATCH v3 7/7] x86/cfi,bpf: Fix bpf_exception_cb() signature Peter Zijlstra
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2023-12-15 9:12 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: paul.walmsley, palmer, aou, tglx, mingo, bp, dave.hansen, x86,
hpa, davem, dsahern, daniel, andrii, martin.lau, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
Arnd Bergmann, samitolvanen, keescook, nathan, ndesaulniers,
linux-riscv, linux-kernel, netdev, bpf, linux-arch, llvm,
jpoimboe, joao, mark.rutland, peterz
Ensure the various dtor functions match their prototype and retain
their CFI signatures; since they don't have their address taken, they
are prone to not getting CFI, making them impossible to call
indirectly.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/bpf/cpumask.c | 8 +++++++-
kernel/bpf/helpers.c | 16 ++++++++++++++--
net/bpf/test_run.c | 15 +++++++++++++--
3 files changed, 34 insertions(+), 5 deletions(-)
--- a/kernel/bpf/cpumask.c
+++ b/kernel/bpf/cpumask.c
@@ -96,6 +96,12 @@ __bpf_kfunc void bpf_cpumask_release(str
migrate_enable();
}
+__bpf_kfunc void bpf_cpumask_release_dtor(void *cpumask)
+{
+ bpf_cpumask_release(cpumask);
+}
+CFI_NOSEAL(bpf_cpumask_release_dtor);
+
/**
* bpf_cpumask_first() - Get the index of the first nonzero bit in the cpumask.
* @cpumask: The cpumask being queried.
@@ -441,7 +447,7 @@ static const struct btf_kfunc_id_set cpu
BTF_ID_LIST(cpumask_dtor_ids)
BTF_ID(struct, bpf_cpumask)
-BTF_ID(func, bpf_cpumask_release)
+BTF_ID(func, bpf_cpumask_release_dtor)
static int __init cpumask_kfunc_init(void)
{
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2150,6 +2150,12 @@ __bpf_kfunc void bpf_task_release(struct
put_task_struct_rcu_user(p);
}
+__bpf_kfunc void bpf_task_release_dtor(void *p)
+{
+ put_task_struct_rcu_user(p);
+}
+CFI_NOSEAL(bpf_task_release_dtor);
+
#ifdef CONFIG_CGROUPS
/**
* bpf_cgroup_acquire - Acquire a reference to a cgroup. A cgroup acquired by
@@ -2174,6 +2180,12 @@ __bpf_kfunc void bpf_cgroup_release(stru
cgroup_put(cgrp);
}
+__bpf_kfunc void bpf_cgroup_release_dtor(void *cgrp)
+{
+ cgroup_put(cgrp);
+}
+CFI_NOSEAL(bpf_cgroup_release_dtor);
+
/**
* bpf_cgroup_ancestor - Perform a lookup on an entry in a cgroup's ancestor
* array. A cgroup returned by this kfunc which is not subsequently stored in a
@@ -2570,10 +2582,10 @@ static const struct btf_kfunc_id_set gen
BTF_ID_LIST(generic_dtor_ids)
BTF_ID(struct, task_struct)
-BTF_ID(func, bpf_task_release)
+BTF_ID(func, bpf_task_release_dtor)
#ifdef CONFIG_CGROUPS
BTF_ID(struct, cgroup)
-BTF_ID(func, bpf_cgroup_release)
+BTF_ID(func, bpf_cgroup_release_dtor)
#endif
BTF_SET8_START(common_btf_ids)
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -600,10 +600,21 @@ __bpf_kfunc void bpf_kfunc_call_test_rel
refcount_dec(&p->cnt);
}
+__bpf_kfunc void bpf_kfunc_call_test_release_dtor(void *p)
+{
+ bpf_kfunc_call_test_release(p);
+}
+CFI_NOSEAL(bpf_kfunc_call_test_release_dtor);
+
__bpf_kfunc void bpf_kfunc_call_memb_release(struct prog_test_member *p)
{
}
+__bpf_kfunc void bpf_kfunc_call_memb_release_dtor(void *p)
+{
+}
+CFI_NOSEAL(bpf_kfunc_call_memb_release_dtor);
+
__bpf_kfunc_end_defs();
BTF_SET8_START(bpf_test_modify_return_ids)
@@ -1671,9 +1682,9 @@ static const struct btf_kfunc_id_set bpf
BTF_ID_LIST(bpf_prog_test_dtor_kfunc_ids)
BTF_ID(struct, prog_test_ref_kfunc)
-BTF_ID(func, bpf_kfunc_call_test_release)
+BTF_ID(func, bpf_kfunc_call_test_release_dtor)
BTF_ID(struct, prog_test_member)
-BTF_ID(func, bpf_kfunc_call_memb_release)
+BTF_ID(func, bpf_kfunc_call_memb_release_dtor)
static int __init bpf_prog_test_run_init(void)
{
* [PATCH v3 7/7] x86/cfi,bpf: Fix bpf_exception_cb() signature
2023-12-15 9:12 [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF Peter Zijlstra
` (5 preceding siblings ...)
2023-12-15 9:12 ` [PATCH v3 6/7] bpf: Fix dtor CFI Peter Zijlstra
@ 2023-12-15 9:12 ` Peter Zijlstra
2023-12-16 0:50 ` [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF patchwork-bot+netdevbpf
2023-12-16 0:53 ` Alexei Starovoitov
8 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2023-12-15 9:12 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: paul.walmsley, palmer, aou, tglx, mingo, bp, dave.hansen, x86,
hpa, davem, dsahern, daniel, andrii, martin.lau, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
Arnd Bergmann, samitolvanen, keescook, nathan, ndesaulniers,
linux-riscv, linux-kernel, netdev, bpf, linux-arch, llvm,
jpoimboe, joao, mark.rutland, peterz
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
As per the earlier patches, BPF sub-programs have the bpf_callback_t
signature and CFI expects callers to use a matching signature. This is
violated by bpf_prog_aux::bpf_exception_cb().
[peterz: Changelog]
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/CAADnVQ+Z7UcXXBBhMubhcMM=R-dExk-uHtfOLtoLxQ1XxEpqEA@mail.gmail.com
---
include/linux/bpf.h | 2 +-
kernel/bpf/helpers.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1484,7 +1484,7 @@ struct bpf_prog_aux {
int cgroup_atype; /* enum cgroup_bpf_attach_type */
struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
char name[BPF_OBJ_NAME_LEN];
- unsigned int (*bpf_exception_cb)(u64 cookie, u64 sp, u64 bp);
+ u64 (*bpf_exception_cb)(u64 cookie, u64 sp, u64 bp, u64, u64);
#ifdef CONFIG_SECURITY
void *security;
#endif
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2537,7 +2537,7 @@ __bpf_kfunc void bpf_throw(u64 cookie)
* which skips compiler generated instrumentation to do the same.
*/
kasan_unpoison_task_stack_below((void *)(long)ctx.sp);
- ctx.aux->bpf_exception_cb(cookie, ctx.sp, ctx.bp);
+ ctx.aux->bpf_exception_cb(cookie, ctx.sp, ctx.bp, 0, 0);
WARN(1, "A call to BPF exception callback should never return\n");
}
* Re: [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF
2023-12-15 9:12 [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF Peter Zijlstra
` (6 preceding siblings ...)
2023-12-15 9:12 ` [PATCH v3 7/7] x86/cfi,bpf: Fix bpf_exception_cb() signature Peter Zijlstra
@ 2023-12-16 0:50 ` patchwork-bot+netdevbpf
2023-12-16 0:53 ` Alexei Starovoitov
8 siblings, 0 replies; 10+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-12-16 0:50 UTC (permalink / raw)
To: Peter Zijlstra
Cc: alexei.starovoitov, paul.walmsley, palmer, aou, tglx, mingo, bp,
dave.hansen, x86, hpa, davem, dsahern, daniel, andrii, martin.lau,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
arnd, samitolvanen, keescook, nathan, ndesaulniers, linux-riscv,
linux-kernel, netdev, bpf, linux-arch, llvm, jpoimboe, joao,
mark.rutland
Hello:
This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Fri, 15 Dec 2023 10:12:16 +0100 you wrote:
> Hi!
>
> What started with the simple observation that bpf_dispatcher_*_func() was
> broken for calling CFI functions with a __nocfi calling context under FineIBT
> ended up with a complete BPF-wide CFI fixup.
>
> With these changes the BPF selftest suite passes without crashing -- there are
> still a few failures, but Alexei has graciously offered to look into those.
>
> [...]
Here is the summary with links:
- [v3,1/7] cfi: Flip headers
https://git.kernel.org/bpf/bpf-next/c/4382159696c9
- [v3,2/7] x86/cfi,bpf: Fix BPF JIT call
https://git.kernel.org/bpf/bpf-next/c/4f9087f16651
- [v3,3/7] x86/cfi,bpf: Fix bpf_callback_t CFI
https://git.kernel.org/bpf/bpf-next/c/e72d88d18df4
- [v3,4/7] x86/cfi,bpf: Fix bpf_struct_ops CFI
https://git.kernel.org/bpf/bpf-next/c/2cd3e3772e41
- [v3,5/7] cfi: Add CFI_NOSEAL()
https://git.kernel.org/bpf/bpf-next/c/e9d13b9d2f99
- [v3,6/7] bpf: Fix dtor CFI
https://git.kernel.org/bpf/bpf-next/c/e4c00339891c
- [v3,7/7] x86/cfi,bpf: Fix bpf_exception_cb() signature
https://git.kernel.org/bpf/bpf-next/c/852486b35f34
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
* Re: [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF
2023-12-15 9:12 [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF Peter Zijlstra
` (7 preceding siblings ...)
2023-12-16 0:50 ` [PATCH v3 0/7] x86/cfi,bpf: Fix CFI vs eBPF patchwork-bot+netdevbpf
@ 2023-12-16 0:53 ` Alexei Starovoitov
8 siblings, 0 replies; 10+ messages in thread
From: Alexei Starovoitov @ 2023-12-16 0:53 UTC (permalink / raw)
To: Peter Zijlstra, Ilya Leoshkevich, Martin KaFai Lau
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
David S. Miller, David Ahern, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Arnd Bergmann,
Sami Tolvanen, Kees Cook, Nathan Chancellor, Nick Desaulniers,
linux-riscv, LKML, Network Development, bpf, linux-arch,
clang-built-linux, Josh Poimboeuf, Joao Moreira, Mark Rutland
On Fri, Dec 15, 2023 at 1:33 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Hi!
>
> What started with the simple observation that bpf_dispatcher_*_func() was
> broken for calling CFI functions with a __nocfi calling context under FineIBT
> ended up with a complete BPF-wide CFI fixup.
>
> With these changes the BPF selftest suite passes without crashing -- there are
> still a few failures, but Alexei has graciously offered to look into those.
>
> (Alexei, I have presumed your SoB on the very last patch, please update
> as you see fit)
>
> Changes since v2 are numerous but include:
> - cfi_get_offset() -- as a means to communicate the offset (ast)
> - 5 new patches fixing various BPF internals to be CFI clean
Looks great to me. Pushed to bpf-next.
There is a failure on s390 that I temporarily denylisted
with an extra patch.
And sent a proposed fix:
https://lore.kernel.org/bpf/20231216004549.78355-1-alexei.starovoitov@gmail.com/
Ilya,
please take a look.
> Note: it *might* be possible to merge the
> bpf_tcp_ca.c:unsupported_ops[] thing into the CFI stubs; as it is,
> get_info will have a NULL stub, unlike the others.
That's a good idea. Will clean up unsupported_ops.
Either myself or Martin will follow up.