BPF List
 help / color / mirror / Atom feed
* [PATCH net] net: Clear skb metadata before LWT xmit
@ 2026-04-28 19:44 Jakub Sitnicki
  2026-04-29 19:52 ` sashiko-bot
  2026-05-01  0:31 ` Jakub Kicinski
  0 siblings, 2 replies; 3+ messages in thread
From: Jakub Sitnicki @ 2026-04-28 19:44 UTC (permalink / raw)
  To: David S. Miller, David Ahern, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Martin KaFai Lau
  Cc: netdev, bpf, kernel-team

skb metadata is meant for passing information between XDP and TC.
LWT programs cannot access the __sk_buff->data_meta pseudo-pointer.

However, LWT_XMIT programs can call helpers that move packet data.
One such helper is bpf_skb_change_head(), which uses skb_data_move() to
move both packet data and metadata after skb_push(). skb_data_move()
expects metadata to sit immediately before skb->data. This is the layout
that commit 8989d328dfe7 ("net: Helper to move packet data and metadata
after skb_push/pull") preserves for TC(X) programs when skb_push() or
skb_pull() moves packet data.

The IP output path can run LWT xmit before neighbour output has built the
outgoing L2 header. For forwarded, or otherwise RX-originated, packets
skb->data points at the network header while skb_mac_header() can still
point at the old L2 header. If such an skb still carries XDP metadata,
skb_data_move() sees metadata ending at skb_mac_header(), not immediately
before skb->data, and warns.

Drop skb metadata before calling into LWT xmit from the IPv4 and IPv6
output paths. Metadata is not part of the LWT BPF interface, and keeping it
attached lets hidden RX metadata affect LWT skb mutation helpers.

Fixes: 8989d328dfe7 ("net: Helper to move packet data and metadata after skb_push/pull")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/ipv4/ip_output.c  | 2 ++
 net/ipv6/ip6_output.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index e4790cc7b5c2..3295e043310f 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -220,6 +220,8 @@ static int ip_finish_output2(struct net *net, struct sock *sk, struct sk_buff *s
 			return -ENOMEM;
 	}
 
+	skb_metadata_clear(skb);
+
 	if (lwtunnel_xmit_redirect(dst->lwtstate)) {
 		int res = lwtunnel_xmit(skb);
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 7e92909ab5be..734818463e07 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -111,6 +111,8 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
 		}
 	}
 
+	skb_metadata_clear(skb);
+
 	if (lwtunnel_xmit_redirect(dst->lwtstate)) {
 		int res = lwtunnel_xmit(skb);
 




^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net] net: Clear skb metadata before LWT xmit
  2026-04-28 19:44 [PATCH net] net: Clear skb metadata before LWT xmit Jakub Sitnicki
@ 2026-04-29 19:52 ` sashiko-bot
  2026-05-01  0:31 ` Jakub Kicinski
  1 sibling, 0 replies; 3+ messages in thread
From: sashiko-bot @ 2026-04-29 19:52 UTC (permalink / raw)
  To: Jakub Sitnicki; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Modifying `skb_shinfo(skb)->meta_len` via `skb_metadata_clear()` on a potentially cloned SKB without unsharing causes data races and shared memory corruption.
- [High] Unconditional clearing of SKB metadata breaks XDP-to-TC metadata passing for all forwarded packets.
--

commit 9bd10448be11f5fe388ec5590ce62a619d12c1d2
Author: Jakub Sitnicki <jakub@cloudflare.com>

net: Clear skb metadata before LWT xmit

This commit attempts to clear SKB metadata before calling into LWT xmit from
the IPv4 and IPv6 output paths. This is intended to prevent hidden RX
metadata from affecting LWT SKB mutation helpers.

> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index e4790cc7b5c2..3295e043310f 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -220,6 +220,8 @@ static int ip_finish_output2(struct net *net, struct sock *sk, struct sk_buff *s
>  			return -ENOMEM;
>  	}
>  
> +	skb_metadata_clear(skb);

Does this clear metadata for all forwarded packets, rather than just those
destined for LWT xmit?

By placing skb_metadata_clear() outside of the lwtunnel_xmit_redirect()
block, it appears XDP metadata is permanently stripped from all packets on
the output path. Could this break XDP-to-TC egress metadata passing
system-wide when these packets later reach dev_queue_xmit()?

> +
>  	if (lwtunnel_xmit_redirect(dst->lwtstate)) {
>  		int res = lwtunnel_xmit(skb);
>  

[ ... ]

> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 7e92909ab5be..734818463e07 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -111,6 +111,8 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
>  		}
>  	}
>  
> +	skb_metadata_clear(skb);

Can this modify the shared info on a cloned SKB without unsharing it first?

skb_metadata_clear() directly sets skb_shinfo(skb)->meta_len = 0. Earlier
in this function, multicast packets may be cloned:

net/ipv6/ip6_output.c:ip6_finish_output2() {
    ...
		if (!(dev->flags & IFF_LOOPBACK) && sk_mc_loop(sk) &&
		    ...
			struct sk_buff *newskb = skb_clone(skb, GFP_ATOMIC);

			if (newskb)
				NF_HOOK(NFPROTO_IPV6, NF_INET_POST_ROUTING,
    ...
}

If the SKB is cloned and being processed concurrently by NF_HOOK, would
writing to skb_shinfo(skb)->meta_len cause a data race and corrupt the
shared memory for the clone?

> +
>  	if (lwtunnel_xmit_redirect(dst->lwtstate)) {
>  		int res = lwtunnel_xmit(skb);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260428-wip-skb-local-storage-from-scratch-v1-1-8f7ca9b378ce@cloudflare.com?part=1

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net] net: Clear skb metadata before LWT xmit
  2026-04-28 19:44 [PATCH net] net: Clear skb metadata before LWT xmit Jakub Sitnicki
  2026-04-29 19:52 ` sashiko-bot
@ 2026-05-01  0:31 ` Jakub Kicinski
  1 sibling, 0 replies; 3+ messages in thread
From: Jakub Kicinski @ 2026-05-01  0:31 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: David S. Miller, David Ahern, Eric Dumazet, Paolo Abeni,
	Simon Horman, Martin KaFai Lau, netdev, bpf, kernel-team

On Tue, 28 Apr 2026 21:44:37 +0200 Jakub Sitnicki wrote:
> skb metadata is meant for passing information between XDP and TC.
> LWT programs cannot access the __sk_buff->data_meta pseudo-pointer.

netdev wasn't CCed on the sashiko reply but I think it now auto-marks
the patches as Changes Requested :| The Sashiko comments didn't seem
actionable to me. WDYT?

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-05-01  0:31 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-28 19:44 [PATCH net] net: Clear skb metadata before LWT xmit Jakub Sitnicki
2026-04-29 19:52 ` sashiko-bot
2026-05-01  0:31 ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox