BPF List
* [PATCH bpf v3 0/2] bpf: Account for early exit of bpf_tail_call() and LD_ABS
@ 2025-01-06 17:15 Arthur Fabre
  2025-01-06 17:15 ` [PATCH bpf v3 1/2] " Arthur Fabre
  2025-01-06 17:15 ` [PATCH bpf v3 2/2] selftests/bpf: Test r0 and ref lifetime after BPF-BPF call with abnormal return Arthur Fabre
  0 siblings, 2 replies; 7+ messages in thread
From: Arthur Fabre @ 2025-01-06 17:15 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	kernel-team, Arthur Fabre

A BPF function can return before its exit instruction: LD_ABS, LD_IND,
and tail_call() can all cause it to return abnormally.

When such a function is called by another BPF function, the verifier
doesn't take this into account when calculating the bounds of the
return value or of pointers to the caller's stack.

---
Changes in v2:
- Handle LD_ABS and LD_IND, not just tail_call()
- Split tests out
- Use inline asm for tests

Changes in v3:
- Don't just handle r0; model abnormal exits as a branch that either
  exits or falls through.
- Try to use C as much as possible for the tests.

Arthur Fabre (2):
  bpf: Account for early exit of bpf_tail_call() and LD_ABS
  selftests/bpf: Test r0 and ref lifetime after BPF-BPF call with
    abnormal return

 kernel/bpf/verifier.c                         |  84 +++++++++----
 .../selftests/bpf/prog_tests/verifier.c       |   2 +
 .../bpf/progs/verifier_abnormal_ret.c         | 115 ++++++++++++++++++
 3 files changed, 178 insertions(+), 23 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_abnormal_ret.c

-- 
2.43.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH bpf v3 1/2] bpf: Account for early exit of bpf_tail_call() and LD_ABS
  2025-01-06 17:15 [PATCH bpf v3 0/2] bpf: Account for early exit of bpf_tail_call() and LD_ABS Arthur Fabre
@ 2025-01-06 17:15 ` Arthur Fabre
  2025-01-06 20:31   ` Eduard Zingerman
  2025-01-06 17:15 ` [PATCH bpf v3 2/2] selftests/bpf: Test r0 and ref lifetime after BPF-BPF call with abnormal return Arthur Fabre
  1 sibling, 1 reply; 7+ messages in thread
From: Arthur Fabre @ 2025-01-06 17:15 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	kernel-team, Arthur Fabre

bpf_tail_call(), LD_ABS, and LD_IND can cause the current function to
return abnormally:
- On success, bpf_tail_call() will jump to the tail_called program, and
  that program will return directly to the outer caller.
- On failure, LD_ABS or LD_IND return directly to the outer caller.

But the verifier doesn't account for these abnormal exits, so it assumes
the instructions following a bpf_tail_call() or LD_ABS are always
executed, and updates bounds info accordingly.

Before BPF-to-BPF calls, this was fine: the whole BPF program would
terminate anyway, so it didn't matter that the verifier state didn't
match reality.

But if these instructions are used in a function call, the verifier will
propagate some of this incorrect bounds info to the caller. There are at
least two kinds of affected state:
- The callee's return value in the caller.
- References to the caller's stack passed into the callee.

For example, loading:

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct {
            __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
            __uint(max_entries, 1);
            __uint(key_size, sizeof(__u32));
            __uint(value_size, sizeof(__u32));
    } tail_call_map SEC(".maps");

    static __attribute__((noinline)) int callee(struct xdp_md *ctx)
    {
            bpf_tail_call(ctx, &tail_call_map, 0);

            int ret;
            asm volatile("%0 = 23" : "=r"(ret));
            return ret;
    }

    static SEC("xdp") int caller(struct xdp_md *ctx)
    {
            int res = callee(ctx);
            if (res == 23) {
                    return XDP_PASS;
            }
            return XDP_DROP;
    }

The verifier logs:

    func#0 @0
    func#1 @6
    0: R1=ctx() R10=fp0
    ; int res = callee(ctx); @ test.c:24
    0: (85) call pc+5
    caller:
     R10=fp0
    callee:
     frame1: R1=ctx() R10=fp0
    6: frame1: R1=ctx() R10=fp0
    ; bpf_tail_call(ctx, &tail_call_map, 0); @ test.c:15
    6: (18) r2 = 0xffff8a9c82a75800       ; frame1: R2_w=map_ptr(map=tail_call_map,ks=4,vs=4)
    8: (b4) w3 = 0                        ; frame1: R3_w=0
    9: (85) call bpf_tail_call#12
    10: frame1:
    ; asm volatile("%0 = 23" : "=r"(ret)); @ test.c:18
    10: (b7) r0 = 23                      ; frame1: R0_w=23
    ; return ret; @ test.c:19
    11: (95) exit
    returning from callee:
     frame1: R0_w=23 R10=fp0
    to caller at 1:
     R0_w=23 R10=fp0

    from 11 to 1: R0_w=23 R10=fp0
    ; int res = callee(ctx); @ test.c:24
    1: (bc) w1 = w0                       ; R0_w=23 R1_w=23
    2: (b4) w0 = 2                        ; R0=2
    ;  @ test.c:0
    3: (16) if w1 == 0x17 goto pc+1
    3: R1=23
    ; } @ test.c:29
    5: (95) exit
    processed 10 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1

And tracks R0_w=23 from the callee through to the caller.
This lets it completely prune the res != 23 branch, skipping over
instruction 4.

But this isn't sound: the bpf_tail_call() could make the callee return
before r0 = 23.

Aside from pruning incorrect branches, this can also be used to read and
write arbitrary memory by using r0 as an index.

Make the verifier track instructions that can return abnormally as a
branch that either exits, or falls through to the remaining
instructions.

This naturally checks for resource leaks, so we can remove the explicit
checks for tail_calls and LD_ABS.

Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
Signed-off-by: Arthur Fabre <afabre@cloudflare.com>
Cc: stable@vger.kernel.org
---
 kernel/bpf/verifier.c | 84 +++++++++++++++++++++++++++++++------------
 1 file changed, 61 insertions(+), 23 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 77f56674aaa9..a0853e9866d8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10488,13 +10488,20 @@ record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
 	return 0;
 }
 
-static int check_reference_leak(struct bpf_verifier_env *env, bool exception_exit)
+enum bpf_exit {
+	BPF_EXIT_INSN,
+	BPF_EXIT_EXCEPTION,
+	BPF_EXIT_TAIL_CALL,
+	BPF_EXIT_LD_ABS,
+};
+
+static int check_reference_leak(struct bpf_verifier_env *env, enum bpf_exit exit)
 {
 	struct bpf_func_state *state = cur_func(env);
 	bool refs_lingering = false;
 	int i;
 
-	if (!exception_exit && state->frameno)
+	if (exit != BPF_EXIT_EXCEPTION && state->frameno)
 		return 0;
 
 	for (i = 0; i < state->acquired_refs; i++) {
@@ -10507,16 +10514,32 @@ static int check_reference_leak(struct bpf_verifier_env *env, bool exception_exi
 	return refs_lingering ? -EINVAL : 0;
 }
 
-static int check_resource_leak(struct bpf_verifier_env *env, bool exception_exit, bool check_lock, const char *prefix)
+static int check_resource_leak(struct bpf_verifier_env *env, enum bpf_exit exit, bool check_lock)
 {
 	int err;
+	const char *prefix;
+
+	switch (exit) {
+	case BPF_EXIT_INSN:
+		prefix = "BPF_EXIT instruction";
+		break;
+	case BPF_EXIT_EXCEPTION:
+		prefix = "bpf_throw";
+		break;
+	case BPF_EXIT_TAIL_CALL:
+		prefix = "tail_call";
+		break;
+	case BPF_EXIT_LD_ABS:
+		prefix = "BPF_LD_[ABS|IND]";
+		break;
+	}
 
 	if (check_lock && cur_func(env)->active_locks) {
 		verbose(env, "%s cannot be used inside bpf_spin_lock-ed region\n", prefix);
 		return -EINVAL;
 	}
 
-	err = check_reference_leak(env, exception_exit);
+	err = check_reference_leak(env, exit);
 	if (err) {
 		verbose(env, "%s would lead to reference leak\n", prefix);
 		return err;
@@ -10802,11 +10825,6 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 	}
 
 	switch (func_id) {
-	case BPF_FUNC_tail_call:
-		err = check_resource_leak(env, false, true, "tail_call");
-		if (err)
-			return err;
-		break;
 	case BPF_FUNC_get_local_storage:
 		/* check that flags argument in get_local_storage(map, flags) is 0,
 		 * this is required because get_local_storage() can't return an error.
@@ -15963,14 +15981,6 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	if (err)
 		return err;
 
-	/* Disallow usage of BPF_LD_[ABS|IND] with reference tracking, as
-	 * gen_ld_abs() may terminate the program at runtime, leading to
-	 * reference leak.
-	 */
-	err = check_resource_leak(env, false, true, "BPF_LD_[ABS|IND]");
-	if (err)
-		return err;
-
 	if (regs[ctx_reg].type != PTR_TO_CTX) {
 		verbose(env,
 			"at the time of BPF_LD_ABS|IND R6 != pointer to skb\n");
@@ -18540,7 +18550,7 @@ static int do_check(struct bpf_verifier_env *env)
 	int prev_insn_idx = -1;
 
 	for (;;) {
-		bool exception_exit = false;
+		enum bpf_exit exit;
 		struct bpf_insn *insn;
 		u8 class;
 		int err;
@@ -18760,7 +18770,7 @@ static int do_check(struct bpf_verifier_env *env)
 				} else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
 					err = check_kfunc_call(env, insn, &env->insn_idx);
 					if (!err && is_bpf_throw_kfunc(insn)) {
-						exception_exit = true;
+						exit = BPF_EXIT_EXCEPTION;
 						goto process_bpf_exit_full;
 					}
 				} else {
@@ -18770,6 +18780,21 @@ static int do_check(struct bpf_verifier_env *env)
 					return err;
 
 				mark_reg_scratched(env, BPF_REG_0);
+
+				if (insn->src_reg == 0 && insn->imm == BPF_FUNC_tail_call) {
+					/* Explore both cases: tail_call fails and we fallthrough,
+					 * or it succeeds and we exit the current function.
+					 */
+					if (!push_stack(env, env->insn_idx + 1, env->insn_idx, false))
+						return -ENOMEM;
+					/* bpf_tail_call() doesn't set r0 on failure / in the fallthrough case.
+					 * But it does on success, so we have to mark it after queueing the
+					 * fallthrough case, but before prepare_func_exit().
+					 */
+					__mark_reg_unknown(env, &state->frame[state->curframe]->regs[BPF_REG_0]);
+					exit = BPF_EXIT_TAIL_CALL;
+					goto process_bpf_exit_full;
+				}
 			} else if (opcode == BPF_JA) {
 				if (BPF_SRC(insn->code) != BPF_K ||
 				    insn->src_reg != BPF_REG_0 ||
@@ -18795,6 +18820,8 @@ static int do_check(struct bpf_verifier_env *env)
 					verbose(env, "BPF_EXIT uses reserved fields\n");
 					return -EINVAL;
 				}
+				exit = BPF_EXIT_INSN;
+
 process_bpf_exit_full:
 				/* We must do check_reference_leak here before
 				 * prepare_func_exit to handle the case when
@@ -18802,8 +18829,7 @@ static int do_check(struct bpf_verifier_env *env)
 				 * function, for which reference_state must
 				 * match caller reference state when it exits.
 				 */
-				err = check_resource_leak(env, exception_exit, !env->cur_state->curframe,
-							  "BPF_EXIT instruction");
+				err = check_resource_leak(env, exit, !env->cur_state->curframe);
 				if (err)
 					return err;
 
@@ -18817,7 +18843,7 @@ static int do_check(struct bpf_verifier_env *env)
 				 * exits. We also skip return code checks as
 				 * they are not needed for exceptional exits.
 				 */
-				if (exception_exit)
+				if (exit == BPF_EXIT_EXCEPTION)
 					goto process_bpf_exit;
 
 				if (state->curframe) {
@@ -18829,6 +18855,12 @@ static int do_check(struct bpf_verifier_env *env)
 					continue;
 				}
 
+				/* BPF_EXIT instruction is the only one that doesn't intrinsically
+				 * set R0.
+				 */
+				if (exit != BPF_EXIT_INSN)
+					goto process_bpf_exit;
+
 				err = check_return_code(env, BPF_REG_0, "R0");
 				if (err)
 					return err;
@@ -18857,7 +18889,13 @@ static int do_check(struct bpf_verifier_env *env)
 				err = check_ld_abs(env, insn);
 				if (err)
 					return err;
-
+				/* Explore both cases: LD_ABS|IND succeeds and we fallthrough,
+				 * or it fails and we exit the current function.
+				 */
+				if (!push_stack(env, env->insn_idx + 1, env->insn_idx, false))
+					return -ENOMEM;
+				exit = BPF_EXIT_LD_ABS;
+				goto process_bpf_exit_full;
 			} else if (mode == BPF_IMM) {
 				err = check_ld_imm(env, insn);
 				if (err)
-- 
2.43.0



* [PATCH bpf v3 2/2] selftests/bpf: Test r0 and ref lifetime after BPF-BPF call with abnormal return
  2025-01-06 17:15 [PATCH bpf v3 0/2] bpf: Account for early exit of bpf_tail_call() and LD_ABS Arthur Fabre
  2025-01-06 17:15 ` [PATCH bpf v3 1/2] " Arthur Fabre
@ 2025-01-06 17:15 ` Arthur Fabre
  2025-01-06 20:34   ` Eduard Zingerman
  1 sibling, 1 reply; 7+ messages in thread
From: Arthur Fabre @ 2025-01-06 17:15 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	kernel-team, Arthur Fabre

In all three cases where a callee can return abnormally (tail_call(),
LD_ABS, and LD_IND), test that the verifier doesn't know the bounds of:

- r0 / what the callee returned.
- References to the caller's stack passed to the callee.

Additionally, ensure the tail_call fallthrough case can't access r0, as
bpf_tail_call() returns nothing on failure.

Signed-off-by: Arthur Fabre <afabre@cloudflare.com>
---
 .../selftests/bpf/prog_tests/verifier.c       |   2 +
 .../bpf/progs/verifier_abnormal_ret.c         | 115 ++++++++++++++++++
 2 files changed, 117 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_abnormal_ret.c

diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index 3ee40ee9413a..6bed606544e3 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -3,6 +3,7 @@
 #include <test_progs.h>
 
 #include "cap_helpers.h"
+#include "verifier_abnormal_ret.skel.h"
 #include "verifier_and.skel.h"
 #include "verifier_arena.skel.h"
 #include "verifier_arena_large.skel.h"
@@ -133,6 +134,7 @@ static void run_tests_aux(const char *skel_name,
 
 #define RUN(skel) run_tests_aux(#skel, skel##__elf_bytes, NULL)
 
+void test_verifier_abnormal_ret(void)         { RUN(verifier_abnormal_ret); }
 void test_verifier_and(void)                  { RUN(verifier_and); }
 void test_verifier_arena(void)                { RUN(verifier_arena); }
 void test_verifier_arena_large(void)          { RUN(verifier_arena_large); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_abnormal_ret.c b/tools/testing/selftests/bpf/progs/verifier_abnormal_ret.c
new file mode 100644
index 000000000000..185c19ba3329
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_abnormal_ret.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "../../../include/linux/filter.h"
+#include "bpf_misc.h"
+
+#define TEST(NAME, CALLEE) \
+	SEC("socket")					\
+	__description("r0: " #NAME)	\
+	__failure __msg("math between ctx pointer and register with unbounded min value") \
+	__naked int check_abnormal_ret_r0_##NAME(void)	\
+	{						\
+		asm volatile("				\
+		r6 = r1;				\
+		r2 = r10;				\
+		r2 += -8;				\
+		call " #CALLEE ";			\
+		r6 += r0;				\
+		r0 = 0;					\
+		exit;					\
+	"	:					\
+		:					\
+		: __clobber_all);			\
+	}						\
+							\
+	SEC("socket")					\
+	__description("ref: " #NAME)	\
+	__failure __msg("math between ctx pointer and register with unbounded min value") \
+	__naked int check_abnormal_ret_ref_##NAME(void)	\
+	{						\
+		asm volatile("				\
+		r6 = r1;				\
+		r7 = r10;				\
+		r7 += -8;				\
+		r2 = r7;				\
+		call " #CALLEE ";			\
+		r0 = *(u64*)(r7 + 0);			\
+		r6 += r0;				\
+		exit;					\
+	"	:					\
+		:					\
+		: __clobber_all);			\
+	}
+
+TEST(ld_abs, callee_ld_abs);
+TEST(ld_ind, callee_ld_ind);
+TEST(tail_call, callee_tail_call);
+
+static __naked __noinline __used
+int callee_ld_abs(void)
+{
+	asm volatile("					\
+	r6 = r1;					\
+	r9 = r2;					\
+	.8byte %[ld_abs];				\
+	*(u64*)(r9 + 0) = 1;				\
+	r0 = 0;						\
+	exit;						\
+"	:
+	: __imm_insn(ld_abs, BPF_LD_ABS(BPF_W, 0))
+	: __clobber_all);
+}
+
+static __naked __noinline __used
+int callee_ld_ind(void)
+{
+	asm volatile("					\
+	r6 = r1;					\
+	r7 = 1;						\
+	r9 = r2;					\
+	.8byte %[ld_ind];				\
+	*(u64*)(r9 + 0) = 1;				\
+	r0 = 0;						\
+	exit;						\
+"	:
+	: __imm_insn(ld_ind, BPF_LD_IND(BPF_W, BPF_REG_7, 0))
+	: __clobber_all);
+}
+
+SEC("socket")
+__auxiliary __naked
+int dummy_prog(void)
+{
+	asm volatile("r0 = 1; exit;");
+}
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(max_entries, 1);
+	__uint(key_size, sizeof(int));
+	__array(values, void(void));
+} map_prog SEC(".maps") = {
+	.values = {
+		[0] = (void *)&dummy_prog,
+	},
+};
+
+static __noinline __used
+int callee_tail_call(struct __sk_buff *skb, __u64 *foo)
+{
+	bpf_tail_call(skb, &map_prog, 0);
+	*foo = 1;
+	return 0;
+}
+
+SEC("socket")
+__description("r0 not set by tail_call")
+__failure __msg("R0 !read_ok")
+int check_abnormal_ret_tail_call_fail(struct __sk_buff *skb)
+{
+	return bpf_tail_call(skb, &map_prog, 0);
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.43.0



* Re: [PATCH bpf v3 1/2] bpf: Account for early exit of bpf_tail_call() and LD_ABS
  2025-01-06 17:15 ` [PATCH bpf v3 1/2] " Arthur Fabre
@ 2025-01-06 20:31   ` Eduard Zingerman
  2025-01-08 20:44     ` Arthur Fabre
  0 siblings, 1 reply; 7+ messages in thread
From: Eduard Zingerman @ 2025-01-06 20:31 UTC (permalink / raw)
  To: Arthur Fabre, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, kernel-team

On Mon, 2025-01-06 at 18:15 +0100, Arthur Fabre wrote:
[...]

This patch is correct as far as I can tell.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

> @@ -18770,6 +18780,21 @@ static int do_check(struct bpf_verifier_env *env)
>  					return err;
>  
>  				mark_reg_scratched(env, BPF_REG_0);
> +
> +				if (insn->src_reg == 0 && insn->imm == BPF_FUNC_tail_call) {
> +					/* Explore both cases: tail_call fails and we fallthrough,
> +					 * or it succeeds and we exit the current function.
> +					 */
> +					if (!push_stack(env, env->insn_idx + 1, env->insn_idx, false))
> +						return -ENOMEM;
> +					/* bpf_tail_call() doesn't set r0 on failure / in the fallthrough case.
> +					 * But it does on success, so we have to mark it after queueing the
> +					 * fallthrough case, but before prepare_func_exit().
> +					 */
> +					__mark_reg_unknown(env, &state->frame[state->curframe]->regs[BPF_REG_0]);
> +					exit = BPF_EXIT_TAIL_CALL;
> +					goto process_bpf_exit_full;
> +				}

Nit: it's a bit unfortunate that this logic is inside do_check(),
     instead of check_helper_call() and check_ld_abs().
     But it makes BPF_EXIT_* propagation simpler.

>  			} else if (opcode == BPF_JA) {
>  				if (BPF_SRC(insn->code) != BPF_K ||
>  				    insn->src_reg != BPF_REG_0 ||

[...]



* Re: [PATCH bpf v3 2/2] selftests/bpf: Test r0 and ref lifetime after BPF-BPF call with abnormal return
  2025-01-06 17:15 ` [PATCH bpf v3 2/2] selftests/bpf: Test r0 and ref lifetime after BPF-BPF call with abnormal return Arthur Fabre
@ 2025-01-06 20:34   ` Eduard Zingerman
  2025-01-08 20:46     ` Arthur Fabre
  0 siblings, 1 reply; 7+ messages in thread
From: Eduard Zingerman @ 2025-01-06 20:34 UTC (permalink / raw)
  To: Arthur Fabre, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, kernel-team

On Mon, 2025-01-06 at 18:15 +0100, Arthur Fabre wrote:
> In all three cases where a callee can abnormally return (tail_call(),
> LD_ABS, and LD_IND), test the verifier doesn't know the bounds of:
> 
> - r0 / what the callee returned.
> - References to the caller's stack passed to the callee.
> 
> Additionally, ensure the tail_call fallthrough case can't access r0, as
> bpf_tail_call() returns nothing on failure.
> 
> Signed-off-by: Arthur Fabre <afabre@cloudflare.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

> +#define TEST(NAME, CALLEE) \
> +	SEC("socket")					\
> +	__description("r0: " #NAME)	\
> +	__failure __msg("math between ctx pointer and register with unbounded min value") \
> +	__naked int check_abnormal_ret_r0_##NAME(void)	\
> +	{						\
> +		asm volatile("				\
> +		r6 = r1;				\
> +		r2 = r10;				\
> +		r2 += -8;				\
> +		call " #CALLEE ";			\
> +		r6 += r0;				\
> +		r0 = 0;					\
> +		exit;					\
> +	"	:					\
> +		:					\
> +		: __clobber_all);			\
> +	}						\
> +							\
> +	SEC("socket")					\
> +	__description("ref: " #NAME)	\
> +	__failure __msg("math between ctx pointer and register with unbounded min value") \
> +	__naked int check_abnormal_ret_ref_##NAME(void)	\
> +	{						\
> +		asm volatile("				\
> +		r6 = r1;				\
> +		r7 = r10;				\
> +		r7 += -8;				\
> +		r2 = r7;				\
> +		call " #CALLEE ";			\
> +		r0 = *(u64*)(r7 + 0);			\
> +		r6 += r0;				\
> +		exit;					\
> +	"	:					\
> +		:					\
> +		: __clobber_all);			\
> +	}

Nit: I think having both cases is overkill, as both effectively
     test whether branching occurs.

[...]

> +struct {
> +	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
> +	__uint(max_entries, 1);
> +	__uint(key_size, sizeof(int));
> +	__array(values, void(void));
> +} map_prog SEC(".maps") = {
> +	.values = {
> +		[0] = (void *)&dummy_prog,
> +	},
> +};
> +
> +static __noinline __used
> +int callee_tail_call(struct __sk_buff *skb, __u64 *foo)
> +{
> +	bpf_tail_call(skb, &map_prog, 0);
> +	*foo = 1;
> +	return 0;
> +}

Nit: I'd also add a test where an invalid action is taken
     after bpf_tail_call inside the callee,
     just to make sure that both branches are explored.

> +
> +SEC("socket")
> +__description("r0 not set by tail_call")
> +__failure __msg("R0 !read_ok")
> +int check_abnormal_ret_tail_call_fail(struct __sk_buff *skb)
> +{
> +	return bpf_tail_call(skb, &map_prog, 0);
> +}
> +
> +char _license[] SEC("license") = "GPL";




* Re: [PATCH bpf v3 1/2] bpf: Account for early exit of bpf_tail_call() and LD_ABS
  2025-01-06 20:31   ` Eduard Zingerman
@ 2025-01-08 20:44     ` Arthur Fabre
  0 siblings, 0 replies; 7+ messages in thread
From: Arthur Fabre @ 2025-01-08 20:44 UTC (permalink / raw)
  To: Eduard Zingerman, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, kernel-team

On Mon Jan 6, 2025 at 9:31 PM CET, Eduard Zingerman wrote:
> On Mon, 2025-01-06 at 18:15 +0100, Arthur Fabre wrote:
[...]

> This patch is correct as far as I can tell.
>
> Acked-by: Eduard Zingerman <eddyz87@gmail.com>

Thanks for the review!

> [...]
>
> > @@ -18770,6 +18780,21 @@ static int do_check(struct bpf_verifier_env *env)
> >  					return err;
> >  
> >  				mark_reg_scratched(env, BPF_REG_0);
> > +
> > +				if (insn->src_reg == 0 && insn->imm == BPF_FUNC_tail_call) {
> > +					/* Explore both cases: tail_call fails and we fallthrough,
> > +					 * or it succeeds and we exit the current function.
> > +					 */
> > +					if (!push_stack(env, env->insn_idx + 1, env->insn_idx, false))
> > +						return -ENOMEM;
> > +					/* bpf_tail_call() doesn't set r0 on failure / in the fallthrough case.
> > +					 * But it does on success, so we have to mark it after queueing the
> > +					 * fallthrough case, but before prepare_func_exit().
> > +					 */
> > +					__mark_reg_unknown(env, &state->frame[state->curframe]->regs[BPF_REG_0]);
> > +					exit = BPF_EXIT_TAIL_CALL;
> > +					goto process_bpf_exit_full;
> > +				}
>
> Nit: it's a bit unfortunate, that this logic is inside do_check,
>      instead of check_helper_call() and check_ld_abs().
>      But it makes BPF_EXIT_* propagation simpler.

Agreed, it's unfortunate to add more to do_check().

I tried passing exit / BPF_EXIT_* by pointer to check_helper_call() and
check_ld_abs(), but that still means we need a conditional in do_check()
to see if it's set:

if (exit != NULL)
	goto process_bpf_exit_full;

Happy to switch to that if you think it's cleaner.

[...]


* Re: [PATCH bpf v3 2/2] selftests/bpf: Test r0 and ref lifetime after BPF-BPF call with abnormal return
  2025-01-06 20:34   ` Eduard Zingerman
@ 2025-01-08 20:46     ` Arthur Fabre
  0 siblings, 0 replies; 7+ messages in thread
From: Arthur Fabre @ 2025-01-08 20:46 UTC (permalink / raw)
  To: Eduard Zingerman, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, kernel-team

On Mon Jan 6, 2025 at 9:34 PM CET, Eduard Zingerman wrote:
> On Mon, 2025-01-06 at 18:15 +0100, Arthur Fabre wrote:
[...]
> > +#define TEST(NAME, CALLEE) \
> > +	SEC("socket")					\
> > +	__description("r0: " #NAME)	\
> > +	__failure __msg("math between ctx pointer and register with unbounded min value") \
> > +	__naked int check_abnormal_ret_r0_##NAME(void)	\
> > +	{						\
> > +		asm volatile("				\
> > +		r6 = r1;				\
> > +		r2 = r10;				\
> > +		r2 += -8;				\
> > +		call " #CALLEE ";			\
> > +		r6 += r0;				\
> > +		r0 = 0;					\
> > +		exit;					\
> > +	"	:					\
> > +		:					\
> > +		: __clobber_all);			\
> > +	}						\
> > +							\
> > +	SEC("socket")					\
> > +	__description("ref: " #NAME)	\
> > +	__failure __msg("math between ctx pointer and register with unbounded min value") \
> > +	__naked int check_abnormal_ret_ref_##NAME(void)	\
> > +	{						\
> > +		asm volatile("				\
> > +		r6 = r1;				\
> > +		r7 = r10;				\
> > +		r7 += -8;				\
> > +		r2 = r7;				\
> > +		call " #CALLEE ";			\
> > +		r0 = *(u64*)(r7 + 0);			\
> > +		r6 += r0;				\
> > +		exit;					\
> > +	"	:					\
> > +		:					\
> > +		: __clobber_all);			\
> > +	}
>
> Nit: I think having both cases is an overkill, as both effectively
>      test if branching occur.

Fair enough, I can drop the reference tests.

> [...]
>
> > +struct {
> > +	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
> > +	__uint(max_entries, 1);
> > +	__uint(key_size, sizeof(int));
> > +	__array(values, void(void));
> > +} map_prog SEC(".maps") = {
> > +	.values = {
> > +		[0] = (void *)&dummy_prog,
> > +	},
> > +};
> > +
> > +static __noinline __used
> > +int callee_tail_call(struct __sk_buff *skb, __u64 *foo)
> > +{
> > +	bpf_tail_call(skb, &map_prog, 0);
> > +	*foo = 1;
> > +	return 0;
> > +}
>
> Nit: I'd also add a test where invalid action is taken
>      after bpf_tail_call inside the callee,
>      just to make sure that both branches are explored.

Good idea, I'll add that in and resend. Thanks for the feedback!

>
> > +
> > +SEC("socket")
> > +__description("r0 not set by tail_call")
> > +__failure __msg("R0 !read_ok")
> > +int check_abnormal_ret_tail_call_fail(struct __sk_buff *skb)
> > +{
> > +	return bpf_tail_call(skb, &map_prog, 0);
> > +}
> > +
> > +char _license[] SEC("license") = "GPL";


