From: Eduard Zingerman <eddyz87@gmail.com>
To: Leon Hwang <hffilwlqm@gmail.com>, bpf@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
maciej.fijalkowski@intel.com, puranjay@kernel.org,
jakub@cloudflare.com, pulehui@huawei.com,
kernel-patches-bot@fb.com
Subject: Re: [PATCH v5 bpf-next 1/3] bpf, x64: Fix tailcall hierarchy
Date: Wed, 10 Jul 2024 17:15:50 -0700 [thread overview]
Message-ID: <a00ad2e4df00bab1e5ea12cf22fe32d4933a7835.camel@gmail.com> (raw)
In-Reply-To: <20240623161528.68946-2-hffilwlqm@gmail.com>
On Mon, 2024-06-24 at 00:15 +0800, Leon Hwang wrote:
> This patch fixes a tailcall issue caused by abusing the tailcall in
> bpf2bpf feature.
>
> As we know, tail_call_cnt propagates by rax from caller to callee when
> to call subprog in tailcall context. But, like the following example,
> MAX_TAIL_CALL_CNT won't work because of missing tail_call_cnt
> back-propagation from callee to caller.
>
> \#include <linux/bpf.h>
> \#include <bpf/bpf_helpers.h>
> \#include "bpf_legacy.h"
>
> struct {
> __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
> __uint(max_entries, 1);
> __uint(key_size, sizeof(__u32));
> __uint(value_size, sizeof(__u32));
> } jmp_table SEC(".maps");
>
> int count = 0;
>
> static __noinline
> int subprog_tail1(struct __sk_buff *skb)
> {
> bpf_tail_call_static(skb, &jmp_table, 0);
> return 0;
> }
>
> static __noinline
> int subprog_tail2(struct __sk_buff *skb)
> {
> bpf_tail_call_static(skb, &jmp_table, 0);
> return 0;
> }
>
> SEC("tc")
> int entry(struct __sk_buff *skb)
> {
> volatile int ret = 1;
>
> count++;
> subprog_tail1(skb);
> subprog_tail2(skb);
>
> return ret;
> }
>
> char __license[] SEC("license") = "GPL";
>
> At run time, the tail_call_cnt in entry() will be propagated to
> subprog_tail1() and subprog_tail2(). But, when the tail_call_cnt in
> subprog_tail1() updates when bpf_tail_call_static(), the tail_call_cnt
> in entry() won't be updated at the same time. As a result, in entry(),
> when tail_call_cnt in entry() is less than MAX_TAIL_CALL_CNT and
> subprog_tail1() returns because of MAX_TAIL_CALL_CNT limit,
> bpf_tail_call_static() in suprog_tail2() is able to run because the
> tail_call_cnt in subprog_tail2() propagated from entry() is less than
> MAX_TAIL_CALL_CNT.
>
> So, how many tailcalls are there for this case if no error happens?
>
> From top-down view, does it look like hierarchy layer and layer?
>
> With this view, there will be 2+4+8+...+2^33 = 2^34 - 2 = 17,179,869,182
> tailcalls for this case.
>
> How about there are N subprog_tail() in entry()? There will be almost
> N^34 tailcalls.
>
> Then, in this patch, it resolves this case on x86_64.
>
> In stead of propagating tail_call_cnt from caller to callee, it
> propagates its pointer, tail_call_cnt_ptr, tcc_ptr for short.
>
> However, where does it store tail_call_cnt?
>
> It stores tail_call_cnt on the stack of main prog. When tail call
> happens in subprog, it increments tail_call_cnt by tcc_ptr.
>
> Meanwhile, it stores tail_call_cnt_ptr on the stack of main prog, too.
>
> And, before jump to tail callee, it has to pop tail_call_cnt and
> tail_call_cnt_ptr.
>
> Then, at the prologue of subprog, it must not make rax as
> tail_call_cnt_ptr again. It has to reuse tail_call_cnt_ptr from caller.
>
> As a result, at run time, it has to recognize rax is tail_call_cnt or
> tail_call_cnt_ptr at prologue by:
>
> 1. rax is tail_call_cnt if rax is <= MAX_TAIL_CALL_CNT.
> 2. rax is tail_call_cnt_ptr if rax is > MAX_TAIL_CALL_CNT, because a
> pointer won't be <= MAX_TAIL_CALL_CNT.
>
> Furthermore, when trampoline is the caller of bpf prog, which is
> tail_call_reachable, it is required to propagate rax through trampoline.
>
> Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> Fixes: e411901c0b77 ("bpf: allow for tailcalls in BPF subprograms for x64 JIT")
> Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
> ---
Hi Leon,
Sorry for delayed response.
I've looked through this patch and the changes make sense to me.
One thing that helped to understand the gist of the changes,
was dumping jited program using bpftool and annotating it with comments:
https://gist.github.com/eddyz87/0d48da052e9d174b2bb84174295c4215
Maybe consider adding something along these lines to the patch
description?
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
[...]
next prev parent reply other threads:[~2024-07-11 0:15 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-23 16:15 [PATCH v5 bpf-next 0/3] bpf: Fix tailcall hierarchy Leon Hwang
2024-06-23 16:15 ` [PATCH v5 bpf-next 1/3] bpf, x64: " Leon Hwang
2024-07-11 0:15 ` Eduard Zingerman [this message]
2024-07-11 2:02 ` Leon Hwang
2024-06-23 16:15 ` [PATCH v5 bpf-next 2/3] bpf, arm64: " Leon Hwang
2024-06-26 12:05 ` Puranjay Mohan
2024-06-23 16:15 ` [PATCH v5 bpf-next 3/3] selftests/bpf: Add testcases for tailcall hierarchy fixing Leon Hwang
2024-07-11 2:01 ` Eduard Zingerman
2024-07-11 14:36 ` Leon Hwang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a00ad2e4df00bab1e5ea12cf22fe32d4933a7835.camel@gmail.com \
--to=eddyz87@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=hffilwlqm@gmail.com \
--cc=jakub@cloudflare.com \
--cc=kernel-patches-bot@fb.com \
--cc=maciej.fijalkowski@intel.com \
--cc=pulehui@huawei.com \
--cc=puranjay@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox