From: Eduard Zingerman <eddyz87@gmail.com>
To: Arthur Fabre <afabre@cloudflare.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
John Fastabend <john.fastabend@gmail.com>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>,
Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>,
kernel-team@cloudflare.com
Subject: Re: [PATCH bpf v3 1/2] bpf: Account for early exit of bpf_tail_call() and LD_ABS
Date: Mon, 06 Jan 2025 12:31:27 -0800 [thread overview]
Message-ID: <3f08fa54c29d5716982194801bfdae93c15a8c27.camel@gmail.com> (raw)
In-Reply-To: <20250106171709.2832649-2-afabre@cloudflare.com>
On Mon, 2025-01-06 at 18:15 +0100, Arthur Fabre wrote:
> bpf_tail_call(), LD_ABS, and LD_IND can cause the current function to
> return abnormally:
> - On success, bpf_tail_call() will jump to the tail_called program, and
> that program will return directly to the outer caller.
> - On failure, LD_ABS or LD_IND return directly to the outer caller.
>
> But the verifier doesn't account for these abnormal exits, so it assumes
> the instructions following a bpf_tail_call() or LD_ABS are always
> executed, and updates bounds info accordingly.
>
> Before BPF to BPF calls that was ok: the whole BPF program would
> terminate anyways, so it didn't matter that the verifier state didn't
> match reality.
>
> But if these instructions are used in a function call, the verifier will
> propagate some of this incorrect bounds info to the caller. There are at
> least two kinds of this:
> - The callee's return value in the caller.
> - References to the caller's stack passed into the caller.
>
> For example, loading:
>
> #include <linux/bpf.h>
> #include <bpf/bpf_helpers.h>
>
> struct {
> __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
> __uint(max_entries, 1);
> __uint(key_size, sizeof(__u32));
> __uint(value_size, sizeof(__u32));
> } tail_call_map SEC(".maps");
>
> static __attribute__((noinline)) int callee(struct xdp_md *ctx)
> {
> bpf_tail_call(ctx, &tail_call_map, 0);
>
> int ret;
> asm volatile("%0 = 23" : "=r"(ret));
> return ret;
> }
>
> static SEC("xdp") int caller(struct xdp_md *ctx)
> {
> int res = callee(ctx);
> if (res == 23) {
> return XDP_PASS;
> }
> return XDP_DROP;
> }
>
> The verifier logs:
>
> func#0 @0
> func#1 @6
> 0: R1=ctx() R10=fp0
> ; int res = callee(ctx); @ test.c:24
> 0: (85) call pc+5
> caller:
> R10=fp0
> callee:
> frame1: R1=ctx() R10=fp0
> 6: frame1: R1=ctx() R10=fp0
> ; bpf_tail_call(ctx, &tail_call_map, 0); @ test.c:15
> 6: (18) r2 = 0xffff8a9c82a75800 ; frame1: R2_w=map_ptr(map=tail_call_map,ks=4,vs=4)
> 8: (b4) w3 = 0 ; frame1: R3_w=0
> 9: (85) call bpf_tail_call#12
> 10: frame1:
> ; asm volatile("%0 = 23" : "=r"(ret)); @ test.c:18
> 10: (b7) r0 = 23 ; frame1: R0_w=23
> ; return ret; @ test.c:19
> 11: (95) exit
> returning from callee:
> frame1: R0_w=23 R10=fp0
> to caller at 1:
> R0_w=23 R10=fp0
>
> from 11 to 1: R0_w=23 R10=fp0
> ; int res = callee(ctx); @ test.c:24
> 1: (bc) w1 = w0 ; R0_w=23 R1_w=23
> 2: (b4) w0 = 2 ; R0=2
> ; @ test.c:0
> 3: (16) if w1 == 0x17 goto pc+1
> 3: R1=23
> ; } @ test.c:29
> 5: (95) exit
> processed 10 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
>
> And tracks R0_w=23 from the callee through to the caller.
> This lets it completely prune the res != 23 branch, skipping over
> instruction 4.
>
> But this isn't sound: the bpf_tail_call() could make the callee return
> before r0 = 23.
>
> Aside from pruning incorrect branches, this can also be used to read and
> write arbitrary memory by using r0 as a index.
>
> Make the verifier track instructions that can return abnormally as a
> branch that either exits, or falls through to the remaining
> instructions.
>
> This naturally checks for resource leaks, so we can remove the explicit
> checks for tail_calls and LD_ABS.
>
> Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
> Signed-off-by: Arthur Fabre <afabre@cloudflare.com>
> Cc: stable@vger.kernel.org
> ---
This patch is correct as far as I can tell.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
[...]
> @@ -18770,6 +18780,21 @@ static int do_check(struct bpf_verifier_env *env)
> return err;
>
> mark_reg_scratched(env, BPF_REG_0);
> +
> + if (insn->src_reg == 0 && insn->imm == BPF_FUNC_tail_call) {
> + /* Explore both cases: tail_call fails and we fallthrough,
> + * or it succeeds and we exit the current function.
> + */
> + if (!push_stack(env, env->insn_idx + 1, env->insn_idx, false))
> + return -ENOMEM;
> + /* bpf_tail_call() doesn't set r0 on failure / in the fallthrough case.
> + * But it does on success, so we have to mark it after queueing the
> + * fallthrough case, but before prepare_func_exit().
> + */
> + __mark_reg_unknown(env, &state->frame[state->curframe]->regs[BPF_REG_0]);
> + exit = BPF_EXIT_TAIL_CALL;
> + goto process_bpf_exit_full;
> + }
Nit: it's a bit unfortunate, that this logic is inside do_check,
instead of check_helper_call() and check_ld_abs().
But it makes BPF_EXIT_* propagation simpler.
> } else if (opcode == BPF_JA) {
> if (BPF_SRC(insn->code) != BPF_K ||
> insn->src_reg != BPF_REG_0 ||
[...]
next prev parent reply other threads:[~2025-01-06 20:31 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-06 17:15 [PATCH bpf v3 0/2] bpf: Account for early exit of bpf_tail_call() and LD_ABS Arthur Fabre
2025-01-06 17:15 ` [PATCH bpf v3 1/2] " Arthur Fabre
2025-01-06 20:31 ` Eduard Zingerman [this message]
2025-01-08 20:44 ` Arthur Fabre
2025-01-06 17:15 ` [PATCH bpf v3 2/2] selftests/bpf: Test r0 and ref lifetime after BPF-BPF call with abnormal return Arthur Fabre
2025-01-06 20:34 ` Eduard Zingerman
2025-01-08 20:46 ` Arthur Fabre
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3f08fa54c29d5716982194801bfdae93c15a8c27.camel@gmail.com \
--to=eddyz87@gmail.com \
--cc=afabre@cloudflare.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kernel-team@cloudflare.com \
--cc=kpsingh@kernel.org \
--cc=martin.lau@linux.dev \
--cc=sdf@fomichev.me \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox