Re: [PATCH net v2] net: Clear skb metadata in LWT BPF xmit

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jakub Sitnicki <jakub@cloudflare.com>
To: sashiko-bot@kernel.org
Cc: sashiko-reviews@lists.linux.dev,  bpf@vger.kernel.org
Subject: Re: [PATCH net v2] net: Clear skb metadata in LWT BPF xmit
Date: Wed, 17 Jun 2026 17:20:46 +0200	[thread overview]
Message-ID: <877bnxb3ch.fsf@cloudflare.com> (raw)
In-Reply-To: <20260515104741.DF6A9C2BCB0@smtp.kernel.org> (sashiko-bot@kernel.org's message of "Fri, 15 May 2026 10:47:40 +0000")

Circling back to this.

On Fri, May 15, 2026 at 10:47 AM GMT, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> - [High] Unconditionally calling `skb_metadata_clear()` in `bpf_xmit` causes a data race and corrupts metadata for cloned SKBs.
> - [High] Uninitialized memory read and metadata corruption in non-BPF encapsulations and `bpf_input`.
> --
>
> commit 2265e07ff5cfd9680de6b565a2b5679d5adafe20
> Author: Jakub Sitnicki <jakub@cloudflare.com>
>
> net: Clear skb metadata in LWT BPF xmit
>
> This commit clears the SKB metadata in the LWT BPF xmit path before
> invoking the BPF program. It aims to prevent a warning in skb_data_move()
> when bpf_skb_change_head() is called on packets with existing metadata and a
> non-matching MAC header.
>
>> Drop skb metadata in the LWT BPF xmit path, just before invoking the
>> program. Doing it late, rather than in ip[6]_finish_output2(), preserves
>> metadata for tc egress on the outgoing device when no LWT BPF program is
>> attached - the only scenario where carrying tc-layer metadata through the
>> IP output path could be meaningful. Non-BPF LWT encap types do not run
>> helpers that touch metadata, so the BPF xmit op is the only site that needs
>> the clear.
>
> Is this assumption accurate for non-BPF encapsulations and LWT_IN programs?
>
> When non-BPF encapsulations (like IPIP or SEG6) or LWT_IN programs use
> skb_push() to prepend outer headers, they shift skb->data downwards but do
> not migrate the metadata block.
>
> Because the metadata pointer is implicitly calculated as skb->data - meta_len,
> the new encapsulation header silently overwrites the original metadata memory.
> The implicit metadata pointer shifts further into the uninitialized SKB
> headroom.
>
> When the packet subsequently reaches TC egress, won't BPF programs accessing
> data_meta read uninitialized headroom memory instead of the preserved
> metadata?

I initially focused on the case where a warning is generated by
skb_data_move(). This can be triggered only from BPF LWT because the BPF
helpers have been converted to call skb_postpush_data_move() after
skb_push(), while non-BPF LWT encaps have not.

However, non-BPF LWT enaps mangle the metadata as well, as it was
pointed out. So this needs to be addressed.

>
>> diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
>> index f71ef82a5f3d3..d26cb9bb81893 100644
>> --- a/net/core/lwt_bpf.c
>> +++ b/net/core/lwt_bpf.c
>> @@ -284,6 +284,11 @@ static int bpf_xmit(struct sk_buff *skb)
>>  		__be16 proto = skb->protocol;
>>  		int ret;
>>  
>> +		/* LWT BPF programs cannot access skb metadata but can corrupt
>> +		 * it with helpers that operate on skb headroom. Drop it.
>> +		 */
>> +		skb_metadata_clear(skb);
>
> Since bpf_xmit() operates in the LWT IP output path which frequently processes
> cloned SKBs (e.g., from multicast forwarding or TCP retransmits), is it safe
> to call skb_metadata_clear() here without unsharing the SKB?
>
> skb_metadata_clear() directly modifies skb_shinfo(skb)->meta_len. For cloned
> SKBs, the skb_shinfo structure is shared across all clones. Modifying it
> without gaining exclusive ownership first (such as via skb_unclone()) will
> silently zero out the metadata length for all other clones of the same
> packet, causing them to unexpectedly lose their metadata when processed on
> concurrent, unrelated network paths.

This is not an issue because:

1. Only RX-originated skbs can carry metadata.
2. RX-originated skbs must be forwarded (routed) to hit LWT.
3. ip_forward() will skb_cow() the packet to unclone it before LWT..

>
>> +
>>  		ret = run_lwt_bpf(skb, &bpf->xmit, dst, CAN_REDIRECT);
>>  		switch (ret) {
>>  		case BPF_OK:

next prev parent reply	other threads:[~2026-06-17 15:20 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-14 10:47 [PATCH net v2] net: Clear skb metadata in LWT BPF xmit Jakub Sitnicki
2026-05-15  0:39 ` Chris Arges
2026-05-15 10:47 ` sashiko-bot
2026-06-17 15:20   ` Jakub Sitnicki [this message]
2026-05-18 10:00 ` Daniel Borkmann
2026-05-18 23:18 ` Jakub Kicinski
2026-05-19  9:47   ` Jakub Sitnicki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877bnxb3ch.fsf@cloudflare.com \
    --to=jakub@cloudflare.com \
    --cc=bpf@vger.kernel.org \
    --cc=sashiko-bot@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.