From: Martin KaFai Lau <martin.lau@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Amery Hung <ameryhung@gmail.com>,
Jakub Sitnicki <jakub@cloudflare.com>,
Martin KaFai Lau <martin.lau@kernel.org>,
bpf <bpf@vger.kernel.org>,
Network Development <netdev@vger.kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Jesper Dangaard Brouer <hawk@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
Stanislav Fomichev <sdf@fomichev.me>,
Simon Horman <horms@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
KP Singh <kpsingh@kernel.org>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>,
kernel-team <kernel-team@cloudflare.com>
Subject: Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
Date: Mon, 5 Jan 2026 14:25:40 -0800 [thread overview]
Message-ID: <d267c646-1acc-4e5b-aa96-56759fca57d0@linux.dev> (raw)
In-Reply-To: <CAADnVQKB5vRJM4kJC5515snR6KHweE-Ld_W1wWgPSWATgiUCwg@mail.gmail.com>
On 1/5/26 1:47 PM, Alexei Starovoitov wrote:
> On Mon, Jan 5, 2026 at 12:55 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>>
>>
>> On 1/5/26 11:42 AM, Amery Hung wrote:
>>> On Mon, Jan 5, 2026 at 11:14 AM Alexei Starovoitov
>>> <alexei.starovoitov@gmail.com> wrote:
>>>>
>>>> On Mon, Jan 5, 2026 at 4:15 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>>>>
>>>>>
>>>>> +__bpf_kfunc_start_defs();
>>>>> +
>>>>> +__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
>>>>> +{
>>>>> + struct sk_buff *skb = (typeof(skb))skb_;
>>>>> + u8 *meta_end = skb_metadata_end(skb);
>>>>> + u8 meta_len = skb_metadata_len(skb);
>>>>> + u8 *meta;
>>>>> + int gap;
>>>>> +
>>>>> + gap = skb_mac_header(skb) - meta_end;
>>>>> + if (!meta_len || !gap)
>>>>> + return;
>>>>> +
>>>>> + if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
>>>>> + skb_metadata_clear(skb);
>>>>> + return;
>>>>> + }
>>>>> +
>>>>> + meta = meta_end - meta_len;
>>>>> + memmove(meta + gap, meta, meta_len);
>>>>> + skb_shinfo(skb)->meta_end += gap;
>>>>> +
>>>>> + bpf_compute_data_pointers(skb);
>>>>> +}
>>>>> +
>>>>> +__bpf_kfunc_end_defs();
>>>>> +
>>>>> +BTF_KFUNCS_START(tc_cls_act_hidden_ids)
>>>>> +BTF_ID_FLAGS(func, bpf_skb_meta_realign)
>>>>> +BTF_KFUNCS_END(tc_cls_act_hidden_ids)
>>>>> +
>>>>> +BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
>>>>> +
>>>>> static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
>>>>> const struct bpf_prog *prog)
>>>>> {
>>>>> - return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
>>>>> - TC_ACT_SHOT);
>>>>> + struct bpf_insn *insn = insn_buf;
>>>>> + int cnt;
>>>>> +
>>>>> + if (pkt_access_flags & PA_F_DATA_META_LOAD) {
>>>>> + /* Realign skb metadata for access through data_meta pointer.
>>>>> + *
>>>>> + * r6 = r1; // r6 will be "u64 *ctx"
>>>>> + * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
>>>>> + * r1 = r6;
>>>>> + */
>>>>> + *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
>>>>> + *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
>>>>> + *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
>>>>> + }
>>>>
>>>> I see that we already did this hack with bpf_qdisc_init_prologue()
>>>> and bpf_qdisc_reset_destroy_epilogue().
>>>> Not sure why we went that route back then.
>>>>
>>>> imo much cleaner to do BPF_EMIT_CALL() and wrap
>>>> BPF_CALL_1(bpf_skb_meta_realign, struct sk_buff *, skb)
>>>>
>>>> BPF_CALL_x doesn't make it an uapi helper.
>>>> It's still a hidden kernel function,
>>>> while this kfunc stuff looks wrong, since kfunc isn't really hidden.
>>>>
>>>> I suspect progs can call this bpf_skb_meta_realign() explicitly,
>>>> just like they can call bpf_qdisc_init_prologue() ?
>>>>
>>>
>>> qdisc prologue and epilogue qdisc kfuncs should be hidden from users.
>>> The kfunc filter, bpf_qdisc_kfunc_filter(), determines what kfunc are
>>> actually exposed.
>>
>> Similar to Amery's comment, I recalled I tried the BPF_CALL_1 in the
>> qdisc but stopped at the "fn = env->ops->get_func_proto(insn->imm,
>> env->prog);" in do_misc_fixups(). Potentially it could add new enum ( >
>> __BPF_FUNC_MAX_ID) outside of the uapi and the user space tool should be
>> able to handle unknown helper also but we went with the kfunc+filter
>> approach without thinking too much about it.
>
> hmm. BPF_EMIT_CALL() does:
> #define BPF_CALL_IMM(x) ((void *)(x) - (void *)__bpf_call_base)
> .imm = BPF_CALL_IMM(FUNC)
>
> the imm shouldn't be going through validation anymore.
> none of the if (insn->imm == BPF_FUNC_...) in do_misc_fixups()
> will match, so I think I see the path where get_func_proto() is
> called.
> But how does it work then for all cases of BPF_EMIT_CALL?
> All of them happen after do_misc_fixups() ?
yeah, I think most (all?) of them (e.g. map_gen_lookup) happens in
do_misc_fixups which then does "goto next_insn;" to skip the
get_func_proto().
>
> I guess we can mark such emitted call in insn_aux_data as finalized
> and get_func_proto() isn't needed.
It is a good idea.
next prev parent reply other threads:[~2026-01-05 22:26 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 01/16] bnxt_en: Call skb_metadata_set when skb->data points at metadata end Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 02/16] i40e: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 03/16] igb: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 04/16] igc: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 05/16] ixgbe: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 06/16] net/mlx5e: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 07/16] veth: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 08/16] xsk: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 09/16] xdp: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 10/16] net: Track skb metadata end separately from MAC offset Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 11/16] bpf, verifier: Remove side effects from may_access_direct_pkt_data Jakub Sitnicki
2026-01-05 18:23 ` Eduard Zingerman
2026-01-05 12:14 ` [PATCH bpf-next v2 12/16] bpf, verifier: Turn seen_direct_write flag into a bitmap Jakub Sitnicki
2026-01-05 18:48 ` Eduard Zingerman
2026-01-05 12:14 ` [PATCH bpf-next v2 13/16] bpf, verifier: Propagate packet access flags to gen_prologue Jakub Sitnicki
2026-01-05 18:49 ` Eduard Zingerman
2026-01-05 12:14 ` [PATCH bpf-next v2 14/16] bpf, verifier: Track when data_meta pointer is loaded Jakub Sitnicki
2026-01-05 18:48 ` Eduard Zingerman
2026-01-05 12:14 ` [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta Jakub Sitnicki
2026-01-05 19:14 ` Alexei Starovoitov
2026-01-05 19:20 ` Jakub Sitnicki
2026-01-05 19:42 ` Amery Hung
2026-01-05 20:02 ` Alexei Starovoitov
2026-01-05 20:54 ` Martin KaFai Lau
2026-01-05 21:47 ` Alexei Starovoitov
2026-01-05 22:25 ` Martin KaFai Lau [this message]
2026-01-05 23:19 ` Amery Hung
2026-01-06 2:04 ` Alexei Starovoitov
2026-01-06 17:36 ` Jakub Sitnicki
2026-01-06 17:46 ` Amery Hung
2026-01-06 18:40 ` Alexei Starovoitov
2026-01-06 19:12 ` Jakub Sitnicki
2026-01-06 19:42 ` Amery Hung
2026-01-05 12:14 ` [PATCH bpf-next v2 16/16] selftests/bpf: Test skb metadata access after L2 decapsulation Jakub Sitnicki
2026-01-10 21:07 ` [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
2026-01-14 11:52 ` Jakub Sitnicki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d267c646-1acc-4e5b-aa96-56759fca57d0@linux.dev \
--to=martin.lau@linux.dev \
--cc=alexei.starovoitov@gmail.com \
--cc=ameryhung@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=eddyz87@gmail.com \
--cc=edumazet@google.com \
--cc=haoluo@google.com \
--cc=hawk@kernel.org \
--cc=horms@kernel.org \
--cc=jakub@cloudflare.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kernel-team@cloudflare.com \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=martin.lau@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.