public inbox for netdev@vger.kernel.org
* [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
@ 2026-01-10 21:05 Jakub Sitnicki
  2026-01-10 21:05 ` [PATCH net-next 01/10] net: Document skb_metadata_set contract with the drivers Jakub Sitnicki
                   ` (10 more replies)
  0 siblings, 11 replies; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

This series is split out of [1] following discussion with Jakub.

To copy XDP metadata into an skb extension when skb_metadata_set() is
called, we need to locate the metadata contents.

These patches establish a contract with the drivers: skb_metadata_set()
must be called only after skb->data has been advanced past the metadata
area.

[1] https://lore.kernel.org/r/20260107-skb-meta-safeproof-netdevs-rx-only-v3-0-0d461c5e4764@cloudflare.com

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
Jakub Sitnicki (10):
      net: Document skb_metadata_set contract with the drivers
      bnxt_en: Call skb_metadata_set when skb->data points past metadata
      i40e: Call skb_metadata_set when skb->data points past metadata
      igb: Call skb_metadata_set when skb->data points past metadata
      igc: Call skb_metadata_set when skb->data points past metadata
      ixgbe: Call skb_metadata_set when skb->data points past metadata
      mlx5e: Call skb_metadata_set when skb->data points past metadata
      veth: Call skb_metadata_set when skb->data points past metadata
      xsk: Call skb_metadata_set when skb->data points past metadata
      xdp: Call skb_metadata_set when skb->data points past metadata

 drivers/net/ethernet/broadcom/bnxt/bnxt.c           | 2 +-
 drivers/net/ethernet/intel/i40e/i40e_xsk.c          | 2 +-
 drivers/net/ethernet/intel/igb/igb_xsk.c            | 2 +-
 drivers/net/ethernet/intel/igc/igc_main.c           | 4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c        | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 2 +-
 drivers/net/veth.c                                  | 4 ++--
 include/linux/skbuff.h                              | 7 +++++++
 net/core/dev.c                                      | 5 ++++-
 net/core/xdp.c                                      | 2 +-
 10 files changed, 21 insertions(+), 11 deletions(-)



* [PATCH net-next 01/10] net: Document skb_metadata_set contract with the drivers
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:28   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-10 21:05 ` [PATCH net-next 02/10] bnxt_en: Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy XDP metadata into an skb extension chunk. To access the
metadata contents, we need to know where it is located. Document the
expectation - skb->data must point right past the metadata when
skb_metadata_set gets called.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 include/linux/skbuff.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 86737076101d..df001283076f 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4554,6 +4554,13 @@ static inline bool skb_metadata_differs(const struct sk_buff *skb_a,
 	       true : __skb_metadata_differs(skb_a, skb_b, len_a);
 }
 
+/**
+ * skb_metadata_set - Record packet metadata length.
+ * @skb: packet carrying the metadata
+ * @meta_len: number of bytes of metadata preceding skb->data
+ *
+ * Must be called when skb->data already points past the metadata area.
+ */
 static inline void skb_metadata_set(struct sk_buff *skb, u8 meta_len)
 {
 	skb_shinfo(skb)->meta_len = meta_len;

-- 
2.43.0
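
The contract documented in this patch can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: struct
toy_skb, toy_metadata_set(), toy_metadata_start() and toy_contract_demo()
are hypothetical names invented here, not kernel APIs, and the buffer
layout is made up. It shows that when skb->data already points past the
metadata at set time, the metadata can be located as data - meta_len.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for struct sk_buff; NOT the kernel structure. */
struct toy_skb {
	uint8_t buf[64];	/* backing storage: [metadata][packet bytes] */
	uint8_t *data;		/* models skb->data */
	uint8_t meta_len;	/* models skb_shinfo(skb)->meta_len */
};

/* Models skb_metadata_set() as documented: it only records the length. */
static void toy_metadata_set(struct toy_skb *skb, uint8_t meta_len)
{
	/* Hypothetical contract check: the metadata must fit before data. */
	assert(skb->data - skb->buf >= meta_len);
	skb->meta_len = meta_len;
}

/* Hypothetical helper: under the contract, metadata ends where data begins. */
static const uint8_t *toy_metadata_start(const struct toy_skb *skb)
{
	return skb->data - skb->meta_len;
}

/* Returns 1 if the metadata is found at the start of the buffer. */
int toy_contract_demo(void)
{
	struct toy_skb skb = { .meta_len = 0 };

	memcpy(skb.buf, "METApacket", 10);
	skb.data = skb.buf + 4;		/* data already points past "META" */
	toy_metadata_set(&skb, 4);

	return toy_metadata_start(&skb) == skb.buf &&
	       memcmp(toy_metadata_start(&skb), "META", 4) == 0;
}
```

Calling toy_contract_demo() from a test program exercises the model; the
real helper remains a plain store to skb_shinfo(skb)->meta_len, as shown
in the diff above.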



* [PATCH net-next 02/10] bnxt_en: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
  2026-01-10 21:05 ` [PATCH net-next 01/10] net: Document skb_metadata_set contract with the drivers Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:29   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-10 21:05 ` [PATCH net-next 03/10] i40e: " Jakub Sitnicki
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.

Adjust the driver to pull from skb->data before calling skb_metadata_set.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 8419d1eb4035..7d0d81d29167 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1440,8 +1440,8 @@ static struct sk_buff *bnxt_copy_xdp(struct bnxt_napi *bnapi,
 		return skb;
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	return skb;

-- 
2.43.0
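
Why the order of the two calls matters can be sketched in userspace C.
The model below assumes a future skb_metadata_set() that copies the
meta_len bytes preceding skb->data into a side buffer; that copying
behaviour, the ext[] array standing in for an skb extension, and every
toy_* name are assumptions made for illustration, not code from this
series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_META_MAX 32

/* Toy stand-in for an skb with a side buffer; NOT the kernel structure. */
struct toy_skb_ext {
	uint8_t buf[64];		/* backing storage */
	uint8_t *data;			/* models skb->data */
	unsigned int len;		/* models skb->len */
	uint8_t meta_len;		/* models skb_shinfo(skb)->meta_len */
	uint8_t ext[TOY_META_MAX];	/* stand-in for an skb extension chunk */
};

/* Models __skb_pull(): advance data and shrink len. */
static void toy_pull(struct toy_skb_ext *skb, unsigned int n)
{
	assert(n <= skb->len);
	skb->data += n;
	skb->len -= n;
}

/* Hypothetical copying variant: reads the bytes right before data. */
static void toy_metadata_set_copy(struct toy_skb_ext *skb, uint8_t meta_len)
{
	assert(meta_len <= TOY_META_MAX);
	assert(skb->data - skb->buf >= meta_len);
	skb->meta_len = meta_len;
	memcpy(skb->ext, skb->data - meta_len, meta_len);
}

/* Returns 1 if pull-then-set copies the metadata and leaves data on the packet. */
int toy_pull_then_set_demo(void)
{
	static const uint8_t frame[] = { 'M', 'E', 'T', 'A', 'p', 'k', 't' };
	struct toy_skb_ext skb = { .len = sizeof(frame) };

	memcpy(skb.buf, frame, sizeof(frame));
	skb.data = skb.buf;	/* data starts at the metadata, as after the copy */

	toy_pull(&skb, 4);		/* first move data past the metadata */
	toy_metadata_set_copy(&skb, 4);	/* then record (and here, copy) it */

	return memcmp(skb.ext, "META", 4) == 0 &&
	       skb.data[0] == 'p' && skb.len == 3;
}
```

In this model, calling toy_metadata_set_copy() before toy_pull() would
trip the bounds assertion, because no bytes precede data yet. That is
the situation the reordering in the hunk above avoids.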



* [PATCH net-next 03/10] i40e: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
  2026-01-10 21:05 ` [PATCH net-next 01/10] net: Document skb_metadata_set contract with the drivers Jakub Sitnicki
  2026-01-10 21:05 ` [PATCH net-next 02/10] bnxt_en: Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:30   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-10 21:05 ` [PATCH net-next 04/10] igb: " Jakub Sitnicki
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.

Adjust the driver to pull from skb->data before calling skb_metadata_set.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 9f47388eaba5..11eff5bd840b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -310,8 +310,8 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
 	       ALIGN(totalsize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	if (likely(!xdp_buff_has_frags(xdp)))

-- 
2.43.0



* [PATCH net-next 04/10] igb: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
                   ` (2 preceding siblings ...)
  2026-01-10 21:05 ` [PATCH net-next 03/10] i40e: " Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:31   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-10 21:05 ` [PATCH net-next 05/10] igc: " Jakub Sitnicki
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.

Adjust the driver to pull from skb->data before calling skb_metadata_set.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/intel/igb/igb_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
index 30ce5fbb5b77..9202da66e32c 100644
--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
@@ -284,8 +284,8 @@ static struct sk_buff *igb_construct_skb_zc(struct igb_ring *rx_ring,
 	       ALIGN(totalsize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	return skb;

-- 
2.43.0



* [PATCH net-next 05/10] igc: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
                   ` (3 preceding siblings ...)
  2026-01-10 21:05 ` [PATCH net-next 04/10] igb: " Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:31   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-10 21:05 ` [PATCH net-next 06/10] ixgbe: " Jakub Sitnicki
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.

Adjust the driver to pull from skb->data before calling skb_metadata_set.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 7aafa60ba0c8..ba758399615b 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2024,8 +2024,8 @@ static struct sk_buff *igc_construct_skb(struct igc_ring *rx_ring,
 	       ALIGN(headlen + metasize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	/* update all of the pointers */
@@ -2752,8 +2752,8 @@ static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
 	       ALIGN(totalsize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	if (ctx->rx_ts) {

-- 
2.43.0



* [PATCH net-next 06/10] ixgbe: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
                   ` (4 preceding siblings ...)
  2026-01-10 21:05 ` [PATCH net-next 05/10] igc: " Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:32   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-10 21:05 ` [PATCH net-next 07/10] mlx5e: " Jakub Sitnicki
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.

Adjust the driver to pull from skb->data before calling skb_metadata_set.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index 7b941505a9d0..69104f432f8d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -228,8 +228,8 @@ static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
 	       ALIGN(totalsize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	return skb;

-- 
2.43.0



* [PATCH net-next 07/10] mlx5e: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
                   ` (5 preceding siblings ...)
  2026-01-10 21:05 ` [PATCH net-next 06/10] ixgbe: " Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:32   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-13  6:08   ` Tariq Toukan
  2026-01-10 21:05 ` [PATCH net-next 08/10] veth: " Jakub Sitnicki
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.

Adjust the driver to pull from skb->data before calling skb_metadata_set.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index 2b05536d564a..20c983c3ce62 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -237,8 +237,8 @@ static struct sk_buff *mlx5e_xsk_construct_skb(struct mlx5e_rq *rq, struct xdp_b
 	skb_put_data(skb, xdp->data_meta, totallen);
 
 	if (metalen) {
-		skb_metadata_set(skb, metalen);
 		__skb_pull(skb, metalen);
+		skb_metadata_set(skb, metalen);
 	}
 
 	return skb;

-- 
2.43.0



* [PATCH net-next 08/10] veth: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
                   ` (6 preceding siblings ...)
  2026-01-10 21:05 ` [PATCH net-next 07/10] mlx5e: " Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:33   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-10 21:05 ` [PATCH net-next 09/10] xsk: " Jakub Sitnicki
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.

Unlike other drivers, veth calls skb_metadata_set after eth_type_trans,
which pulls the Ethernet header and moves skb->data past it. This
violates the new skb_metadata_set contract.

Adjust the driver to call skb_metadata_set before eth_type_trans pulls
the MAC header.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/veth.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 14e6f2a2fb77..1d1dbfa2e5ef 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -874,11 +874,11 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
 	else
 		skb->data_len = 0;
 
-	skb->protocol = eth_type_trans(skb, rq->dev);
-
 	metalen = xdp->data - xdp->data_meta;
 	if (metalen)
 		skb_metadata_set(skb, metalen);
+
+	skb->protocol = eth_type_trans(skb, rq->dev);
 out:
 	return skb;
 drop:

-- 
2.43.0
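
The effect of the call order in veth can be sketched with a userspace
model. The sketch assumes a 14-byte Ethernet header and models only the
pointer movement of eth_type_trans; struct toy_rx_skb and all toy_*
functions are hypothetical names, not kernel code. It reports how far
the located metadata start is from the true start for each ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_HLEN 14	/* Ethernet header length assumed by this model */

/* Toy stand-in for an skb on the receive path; NOT the kernel structure. */
struct toy_rx_skb {
	uint8_t buf[64];		/* [metadata][Ethernet header][payload] */
	uint8_t *data;			/* models skb->data */
	uint8_t meta_len;		/* models skb_shinfo(skb)->meta_len */
	const uint8_t *meta_start;	/* where the setter located the metadata */
};

/* Hypothetical setter that locates metadata right before data. */
static void toy_rx_metadata_set(struct toy_rx_skb *skb, uint8_t meta_len)
{
	assert(skb->data - skb->buf >= meta_len);
	skb->meta_len = meta_len;
	skb->meta_start = skb->data - meta_len;
}

/* Models only the pointer side effect of eth_type_trans(). */
static void toy_eth_pull(struct toy_rx_skb *skb)
{
	skb->data += TOY_ETH_HLEN;
}

/* Returns the offset of the located metadata from its true start (0 = correct). */
ptrdiff_t toy_veth_order_demo(int set_before_pull)
{
	const uint8_t meta_len = 4;
	struct toy_rx_skb skb = { .meta_len = 0 };

	skb.data = skb.buf + meta_len;	/* data points at the Ethernet header */

	if (set_before_pull) {
		toy_rx_metadata_set(&skb, meta_len);
		toy_eth_pull(&skb);
	} else {
		toy_eth_pull(&skb);
		toy_rx_metadata_set(&skb, meta_len);
	}

	return skb.meta_start - skb.buf;
}
```

With set_before_pull nonzero the model finds the metadata at offset 0.
With the old ordering it lands TOY_ETH_HLEN bytes too far, inside the
Ethernet header.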



* [PATCH net-next 09/10] xsk: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
                   ` (7 preceding siblings ...)
  2026-01-10 21:05 ` [PATCH net-next 08/10] veth: " Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:33   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-10 21:05 ` [PATCH net-next 10/10] xdp: " Jakub Sitnicki
  2026-01-13  3:08 ` [PATCH net-next 00/10] " Jakub Kicinski
  10 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.

Adjust AF_XDP to pull from skb->data before calling skb_metadata_set.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/xdp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/xdp.c b/net/core/xdp.c
index 9100e160113a..e86ac1d6ad6d 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -768,8 +768,8 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
 
 	metalen = xdp->data - xdp->data_meta;
 	if (metalen > 0) {
-		skb_metadata_set(skb, metalen);
 		__skb_pull(skb, metalen);
+		skb_metadata_set(skb, metalen);
 	}
 
 	skb_record_rx_queue(skb, rxq->queue_index);

-- 
2.43.0



* [PATCH net-next 10/10] xdp: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
                   ` (8 preceding siblings ...)
  2026-01-10 21:05 ` [PATCH net-next 09/10] xsk: " Jakub Sitnicki
@ 2026-01-10 21:05 ` Jakub Sitnicki
  2026-01-12 11:33   ` [Intel-wired-lan] " Loktionov, Aleksandr
  2026-01-13  3:08 ` [PATCH net-next 00/10] " Jakub Kicinski
  10 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:05 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.

XDP generic mode runs after the MAC header has already been pulled.
Temporarily push skb->data back to the MAC header before calling
skb_metadata_set to adhere to the new contract, then pull it again.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/dev.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c711da335510..f8e5672e835f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5468,8 +5468,11 @@ u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp,
 		break;
 	case XDP_PASS:
 		metalen = xdp->data - xdp->data_meta;
-		if (metalen)
+		if (metalen) {
+			__skb_push(skb, mac_len);
 			skb_metadata_set(skb, metalen);
+			__skb_pull(skb, mac_len);
+		}
 		break;
 	}
 

-- 
2.43.0
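
The push/set/pull sequence from the hunk above can be modelled in
userspace. This is a sketch only: the layout [metadata][MAC
header][payload], the 14-byte mac_len, the 8-byte metalen and all toy_*
names are assumptions for illustration; how mac_len is computed in
bpf_prog_run_generic_xdp lies outside the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for an skb in generic XDP; NOT the kernel structure. */
struct toy_gen_skb {
	uint8_t buf[64];		/* [metadata][MAC header][payload] */
	uint8_t *data;			/* models skb->data */
	unsigned int len;		/* models skb->len */
	uint8_t meta_len;		/* models skb_shinfo(skb)->meta_len */
	const uint8_t *meta_start;	/* where the setter located the metadata */
};

/* Models __skb_push(): move data back and grow len. */
static void toy_gen_push(struct toy_gen_skb *skb, unsigned int n)
{
	assert((unsigned int)(skb->data - skb->buf) >= n);
	skb->data -= n;
	skb->len += n;
}

/* Models __skb_pull(): move data forward and shrink len. */
static void toy_gen_pull(struct toy_gen_skb *skb, unsigned int n)
{
	assert(n <= skb->len);
	skb->data += n;
	skb->len -= n;
}

/* Hypothetical setter that locates metadata right before data. */
static void toy_gen_metadata_set(struct toy_gen_skb *skb, uint8_t meta_len)
{
	assert(skb->data - skb->buf >= meta_len);
	skb->meta_len = meta_len;
	skb->meta_start = skb->data - meta_len;
}

/* Returns 1 if metadata is located correctly and data/len are restored. */
int toy_generic_xdp_demo(void)
{
	const unsigned int mac_len = 14, payload_len = 20;
	const uint8_t metalen = 8;
	struct toy_gen_skb skb = { .len = payload_len };
	uint8_t *before;

	skb.data = skb.buf + metalen + mac_len;	/* MAC header already pulled */
	before = skb.data;

	toy_gen_push(&skb, mac_len);		/* data back at the MAC header */
	toy_gen_metadata_set(&skb, metalen);	/* contract holds here */
	toy_gen_pull(&skb, mac_len);		/* restore the caller's view */

	return skb.meta_start == skb.buf &&
	       skb.data == before && skb.len == payload_len;
}
```

The push and pull cancel out, so the caller sees the same data pointer
and length as before, while the setter runs with data pointing right
past the metadata.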



* RE: [Intel-wired-lan] [PATCH net-next 01/10] net: Document skb_metadata_set contract with the drivers
  2026-01-10 21:05 ` [PATCH net-next 01/10] net: Document skb_metadata_set contract with the drivers Jakub Sitnicki
@ 2026-01-12 11:28   ` Loktionov, Aleksandr
  0 siblings, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:28 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>; intel-wired-lan@lists.osuosl.org;
> bpf@vger.kernel.org; kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 01/10] net: Document
> skb_metadata_set contract with the drivers
> 
> Prepare to copy XDP metadata into an skb extension chunk. To access
> the metadata contents, we need to know where it is located. Document
> the expectation - skb->data must point right past the metadata when
> skb_metadata_set gets called.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  include/linux/skbuff.h | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 86737076101d..df001283076f 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -4554,6 +4554,13 @@ static inline bool skb_metadata_differs(const struct sk_buff *skb_a,
>  	       true : __skb_metadata_differs(skb_a, skb_b, len_a);
>  }
> 
> +/**
> + * skb_metadata_set - Record packet metadata length.
> + * @skb: packet carrying the metadata
> + * @meta_len: number of bytes of metadata preceding skb->data
> + *
> + * Must be called when skb->data already points past the metadata area.
> + */
>  static inline void skb_metadata_set(struct sk_buff *skb, u8 meta_len)
>  {
>  	skb_shinfo(skb)->meta_len = meta_len;
> 
> --
> 2.43.0

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


* RE: [Intel-wired-lan] [PATCH net-next 02/10] bnxt_en: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 02/10] bnxt_en: Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
@ 2026-01-12 11:29   ` Loktionov, Aleksandr
  0 siblings, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:29 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>; intel-wired-lan@lists.osuosl.org;
> bpf@vger.kernel.org; kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 02/10] bnxt_en: Call
> skb_metadata_set when skb->data points past metadata
> 
> Prepare to copy the XDP metadata into an skb extension in
> skb_metadata_set.
> 
> Adjust the driver to pull from skb->data before calling
> skb_metadata_set.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 8419d1eb4035..7d0d81d29167 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -1440,8 +1440,8 @@ static struct sk_buff *bnxt_copy_xdp(struct bnxt_napi *bnapi,
>  		return skb;
> 
>  	if (metasize) {
> -		skb_metadata_set(skb, metasize);
>  		__skb_pull(skb, metasize);
> +		skb_metadata_set(skb, metasize);
>  	}
> 
>  	return skb;
> 
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


* RE: [Intel-wired-lan] [PATCH net-next 03/10] i40e: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 03/10] i40e: " Jakub Sitnicki
@ 2026-01-12 11:30   ` Loktionov, Aleksandr
  0 siblings, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:30 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>; intel-wired-lan@lists.osuosl.org;
> bpf@vger.kernel.org; kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 03/10] i40e: Call
> skb_metadata_set when skb->data points past metadata
> 
> Prepare to copy the XDP metadata into an skb extension in
> skb_metadata_set.
> 
> Adjust the driver to pull from skb->data before calling
> skb_metadata_set.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index 9f47388eaba5..11eff5bd840b 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -310,8 +310,8 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
>  	       ALIGN(totalsize, sizeof(long)));
> 
>  	if (metasize) {
> -		skb_metadata_set(skb, metasize);
>  		__skb_pull(skb, metasize);
> +		skb_metadata_set(skb, metasize);
>  	}
> 
>  	if (likely(!xdp_buff_has_frags(xdp)))
> 
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


* RE: [Intel-wired-lan] [PATCH net-next 04/10] igb: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 04/10] igb: " Jakub Sitnicki
@ 2026-01-12 11:31   ` Loktionov, Aleksandr
  0 siblings, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:31 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>;
> intel-wired-lan@lists.osuosl.org; bpf@vger.kernel.org;
> kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 04/10] igb: Call
> skb_metadata_set when skb->data points past metadata
> 
> Prepare to copy the XDP metadata into an skb extension in
> skb_metadata_set.
> 
> Adjust the driver to pull from skb->data before calling
> skb_metadata_set.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  drivers/net/ethernet/intel/igb/igb_xsk.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..9202da66e32c 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -284,8 +284,8 @@ static struct sk_buff *igb_construct_skb_zc(struct igb_ring *rx_ring,
>  	       ALIGN(totalsize, sizeof(long)));
> 
>  	if (metasize) {
> -		skb_metadata_set(skb, metasize);
>  		__skb_pull(skb, metasize);
> +		skb_metadata_set(skb, metasize);
>  	}
> 
>  	return skb;
> 
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


* RE: [Intel-wired-lan] [PATCH net-next 05/10] igc: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 05/10] igc: " Jakub Sitnicki
@ 2026-01-12 11:31   ` Loktionov, Aleksandr
  0 siblings, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:31 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>;
> intel-wired-lan@lists.osuosl.org; bpf@vger.kernel.org;
> kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 05/10] igc: Call
> skb_metadata_set when skb->data points past metadata
> 
> Prepare to copy the XDP metadata into an skb extension in
> skb_metadata_set.
> 
> Adjust the driver to pull from skb->data before calling
> skb_metadata_set.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  drivers/net/ethernet/intel/igc/igc_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index 7aafa60ba0c8..ba758399615b 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -2024,8 +2024,8 @@ static struct sk_buff *igc_construct_skb(struct igc_ring *rx_ring,
>  	       ALIGN(headlen + metasize, sizeof(long)));
> 
>  	if (metasize) {
> -		skb_metadata_set(skb, metasize);
>  		__skb_pull(skb, metasize);
> +		skb_metadata_set(skb, metasize);
>  	}
> 
>  	/* update all of the pointers */
> @@ -2752,8 +2752,8 @@ static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
>  	       ALIGN(totalsize, sizeof(long)));
> 
>  	if (metasize) {
> -		skb_metadata_set(skb, metasize);
>  		__skb_pull(skb, metasize);
> +		skb_metadata_set(skb, metasize);
>  	}
> 
>  	if (ctx->rx_ts) {
> 
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


* RE: [Intel-wired-lan] [PATCH net-next 06/10] ixgbe: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 06/10] ixgbe: " Jakub Sitnicki
@ 2026-01-12 11:32   ` Loktionov, Aleksandr
  0 siblings, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:32 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>;
> intel-wired-lan@lists.osuosl.org; bpf@vger.kernel.org;
> kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 06/10] ixgbe: Call
> skb_metadata_set when skb->data points past metadata
> 
> Prepare to copy the XDP metadata into an skb extension in
> skb_metadata_set.
> 
> Adjust the driver to pull from skb->data before calling
> skb_metadata_set.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
> index 7b941505a9d0..69104f432f8d 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
> @@ -228,8 +228,8 @@ static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
>  	       ALIGN(totalsize, sizeof(long)));
> 
>  	if (metasize) {
> -		skb_metadata_set(skb, metasize);
>  		__skb_pull(skb, metasize);
> +		skb_metadata_set(skb, metasize);
>  	}
> 
>  	return skb;
> 
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


* RE: [Intel-wired-lan] [PATCH net-next 07/10] mlx5e: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 07/10] mlx5e: " Jakub Sitnicki
@ 2026-01-12 11:32   ` Loktionov, Aleksandr
  2026-01-13  6:08   ` Tariq Toukan
  1 sibling, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:32 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>;
> intel-wired-lan@lists.osuosl.org; bpf@vger.kernel.org;
> kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 07/10] mlx5e: Call
> skb_metadata_set when skb->data points past metadata
> 
> Prepare to copy the XDP metadata into an skb extension in
> skb_metadata_set.
> 
> Adjust the driver to pull from skb->data before calling
> skb_metadata_set.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> index 2b05536d564a..20c983c3ce62 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> @@ -237,8 +237,8 @@ static struct sk_buff *mlx5e_xsk_construct_skb(struct mlx5e_rq *rq, struct xdp_b
>  	skb_put_data(skb, xdp->data_meta, totallen);
> 
>  	if (metalen) {
> -		skb_metadata_set(skb, metalen);
>  		__skb_pull(skb, metalen);
> +		skb_metadata_set(skb, metalen);
>  	}
> 
>  	return skb;
> 
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


* RE: [Intel-wired-lan] [PATCH net-next 08/10] veth: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 08/10] veth: " Jakub Sitnicki
@ 2026-01-12 11:33   ` Loktionov, Aleksandr
  0 siblings, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:33 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>;
> intel-wired-lan@lists.osuosl.org; bpf@vger.kernel.org;
> kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 08/10] veth: Call
> skb_metadata_set when skb->data points past metadata
> 
> Prepare to copy the XDP metadata into an skb extension in
> skb_metadata_set.
> 
> Unlike other drivers, veth calls skb_metadata_set after
> eth_type_trans, which pulls the Ethernet header and moves skb->data.
> This violates the new contract with skb_metadata.
> 
> Adjust the driver to pull the MAC header after calling
> skb_metadata_set.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  drivers/net/veth.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 14e6f2a2fb77..1d1dbfa2e5ef 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -874,11 +874,11 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
>  	else
>  		skb->data_len = 0;
> 
> -	skb->protocol = eth_type_trans(skb, rq->dev);
> -
>  	metalen = xdp->data - xdp->data_meta;
>  	if (metalen)
>  		skb_metadata_set(skb, metalen);
> +
> +	skb->protocol = eth_type_trans(skb, rq->dev);
>  out:
>  	return skb;
>  drop:
> 
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
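
Why the order matters for veth can be shown with plain offset arithmetic.
The sketch below is a user-space model, not kernel code: struct
toy_skb_pos and the toy_* helpers are hypothetical, and the only fact
borrowed from the thread is that eth_type_trans() pulls the Ethernet
header (14 bytes, modelled as TOY_ETH_HLEN) and so moves skb->data.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ETH_HLEN 14 /* Ethernet header length */

/* Tracks offsets from the start of the buffer instead of pointers. */
struct toy_skb_pos {
	size_t data_off;     /* models skb->data */
	size_t meta_end_off; /* where the setter assumed metadata ends */
	size_t meta_len;
};

/* Models the side effect of eth_type_trans(): pull the MAC header. */
static void toy_eth_type_trans(struct toy_skb_pos *skb)
{
	skb->data_off += TOY_ETH_HLEN;
}

/* Models a setter that assumes metadata ends at the current data. */
static void toy_metadata_set(struct toy_skb_pos *skb, size_t meta_len)
{
	skb->meta_len = meta_len;
	skb->meta_end_off = skb->data_off;
}

/* Old veth order: MAC header pulled first, then metadata set. */
static size_t toy_veth_old_order(size_t data_off, size_t metalen)
{
	struct toy_skb_pos skb = { .data_off = data_off };

	toy_eth_type_trans(&skb);
	toy_metadata_set(&skb, metalen);
	return skb.meta_end_off - skb.meta_len; /* computed metadata start */
}

/* New veth order: metadata set first, then MAC header pulled. */
static size_t toy_veth_new_order(size_t data_off, size_t metalen)
{
	struct toy_skb_pos skb = { .data_off = data_off };

	toy_metadata_set(&skb, metalen);
	toy_eth_type_trans(&skb);
	return skb.meta_end_off - skb.meta_len; /* computed metadata start */
}
```

With 8 bytes of metadata ending at offset 32, the new order computes a
metadata start of 24, while the old order is off by exactly the Ethernet
header length.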


* RE: [Intel-wired-lan] [PATCH net-next 09/10] xsk: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 09/10] xsk: " Jakub Sitnicki
@ 2026-01-12 11:33   ` Loktionov, Aleksandr
  0 siblings, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:33 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>;
> intel-wired-lan@lists.osuosl.org; bpf@vger.kernel.org;
> kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 09/10] xsk: Call
> skb_metadata_set when skb->data points past metadata
> 
> Prepare to copy the XDP metadata into an skb extension in
> skb_metadata_set.
> 
> Adjust AF_XDP to pull from skb->data before calling skb_metadata_set.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  net/core/xdp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 9100e160113a..e86ac1d6ad6d 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -768,8 +768,8 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
> 
>  	metalen = xdp->data - xdp->data_meta;
>  	if (metalen > 0) {
> -		skb_metadata_set(skb, metalen);
>  		__skb_pull(skb, metalen);
> +		skb_metadata_set(skb, metalen);
>  	}
> 
>  	skb_record_rx_queue(skb, rxq->queue_index);
> 
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


* RE: [Intel-wired-lan] [PATCH net-next 10/10] xdp: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 10/10] xdp: " Jakub Sitnicki
@ 2026-01-12 11:33   ` Loktionov, Aleksandr
  0 siblings, 0 replies; 34+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-12 11:33 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev@vger.kernel.org
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Nguyen, Anthony L, Kitszel, Przemyslaw, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan@lists.osuosl.org,
	bpf@vger.kernel.org, kernel-team@cloudflare.com



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Jakub Sitnicki via Intel-wired-lan
> Sent: Saturday, January 10, 2026 10:05 PM
> To: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Michael Chan
> <michael.chan@broadcom.com>; Pavan Chebbi <pavan.chebbi@broadcom.com>;
> Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Saeed Mahameed <saeedm@nvidia.com>;
> Leon Romanovsky <leon@kernel.org>; Tariq Toukan <tariqt@nvidia.com>;
> Mark Bloch <mbloch@nvidia.com>; Alexei Starovoitov <ast@kernel.org>;
> Daniel Borkmann <daniel@iogearbox.net>; Jesper Dangaard Brouer
> <hawk@kernel.org>; John Fastabend <john.fastabend@gmail.com>;
> Stanislav Fomichev <sdf@fomichev.me>;
> intel-wired-lan@lists.osuosl.org; bpf@vger.kernel.org;
> kernel-team@cloudflare.com
> Subject: [Intel-wired-lan] [PATCH net-next 10/10] xdp: Call
> skb_metadata_set when skb->data points past metadata
> 
> Prepare to copy the XDP metadata into an skb extension in
> skb_metadata_set.
> 
> XDP generic mode runs after the MAC header has already been pulled. Adjust
> skb->data before calling skb_metadata_set to adhere to the new contract.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  net/core/dev.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index c711da335510..f8e5672e835f 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5468,8 +5468,11 @@ u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp,
>  		break;
>  	case XDP_PASS:
>  		metalen = xdp->data - xdp->data_meta;
> -		if (metalen)
> +		if (metalen) {
> +			__skb_push(skb, mac_len);
>  			skb_metadata_set(skb, metalen);
> +			__skb_pull(skb, mac_len);
> +		}
>  		break;
>  	}
> 
> 
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
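
The push/set/pull bracket in the XDP_PASS branch can be modelled in user
space as follows. This is a sketch only: struct toy_skb and the toy_*
helpers are hypothetical and track offsets rather than real pointers. It
shows that temporarily pushing mac_len makes data point at the MAC header
(that is, just past the metadata) while the setter runs, and that data
and len are restored afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Offset-based toy model; NOT the kernel's struct sk_buff. */
struct toy_skb {
	size_t data_off;     /* models skb->data as an offset */
	size_t len;          /* models skb->len */
	size_t meta_end_off; /* where the setter saw the end of metadata */
	size_t meta_len;
};

/* Models __skb_push(): move data back, grow len. */
static void toy_skb_push(struct toy_skb *skb, size_t n)
{
	skb->data_off -= n;
	skb->len += n;
}

/* Models __skb_pull(): move data forward, shrink len. */
static void toy_skb_pull(struct toy_skb *skb, size_t n)
{
	skb->data_off += n;
	skb->len -= n;
}

/* Models a setter that assumes metadata ends at the current data. */
static void toy_metadata_set(struct toy_skb *skb, size_t meta_len)
{
	skb->meta_len = meta_len;
	skb->meta_end_off = skb->data_off;
}

/* Mirrors the XDP_PASS branch: rewind to the MAC header, set, restore. */
static void toy_generic_xdp_pass(struct toy_skb *skb, size_t mac_len,
				 size_t metalen)
{
	if (metalen) {
		toy_skb_push(skb, mac_len);
		toy_metadata_set(skb, metalen);
		toy_skb_pull(skb, mac_len);
	}
}
```

For example, with the MAC header at offset 32, mac_len of 14 and data at
offset 46, the setter sees the metadata end at 32 and the caller still
finds data at 46 afterwards.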


* Re: [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
                   ` (9 preceding siblings ...)
  2026-01-10 21:05 ` [PATCH net-next 10/10] xdp: " Jakub Sitnicki
@ 2026-01-13  3:08 ` Jakub Kicinski
  2026-01-13 12:09   ` Paolo Abeni
  2026-01-13 12:33   ` Jakub Sitnicki
  10 siblings, 2 replies; 34+ messages in thread
From: Jakub Kicinski @ 2026-01-13  3:08 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Michael Chan, Pavan Chebbi, Andrew Lunn, Tony Nguyen,
	Przemek Kitszel, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
	Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

On Sat, 10 Jan 2026 22:05:14 +0100 Jakub Sitnicki wrote:
> This series is split out of [1] following discussion with Jakub.
> 
> To copy XDP metadata into an skb extension when skb_metadata_set() is
> called, we need to locate the metadata contents.

"When skb_metadata_set() is called"? I think that may cause perf
regressions unless we merge major optimizations at the same time?
Should we defer touching the drivers until we have a PoC and some
idea whether allocating the extension right away is manageable or 
we are better off doing it via a kfunc in TC (after GRO)?
To be clear putting the metadata in an extension right away would
indeed be much cleaner, just not sure how much of the perf hit we 
can optimize away..


* Re: [PATCH net-next 07/10] mlx5e: Call skb_metadata_set when skb->data points past metadata
  2026-01-10 21:05 ` [PATCH net-next 07/10] mlx5e: " Jakub Sitnicki
  2026-01-12 11:32   ` [Intel-wired-lan] " Loktionov, Aleksandr
@ 2026-01-13  6:08   ` Tariq Toukan
  2026-01-13 12:52     ` Jakub Sitnicki
  1 sibling, 1 reply; 34+ messages in thread
From: Tariq Toukan @ 2026-01-13  6:08 UTC (permalink / raw)
  To: Jakub Sitnicki, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team



On 10/01/2026 23:05, Jakub Sitnicki wrote:
> Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.
> 
> Adjust the driver to pull from skb->data before calling skb_metadata_set.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>   drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> index 2b05536d564a..20c983c3ce62 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> @@ -237,8 +237,8 @@ static struct sk_buff *mlx5e_xsk_construct_skb(struct mlx5e_rq *rq, struct xdp_b
>   	skb_put_data(skb, xdp->data_meta, totallen);
>   
>   	if (metalen) {
> -		skb_metadata_set(skb, metalen);
>   		__skb_pull(skb, metalen);
> +		skb_metadata_set(skb, metalen);
>   	}
>   
>   	return skb;
> 

The patch itself is simple.

I share the concerns about the perf impact of the idea behind the series.
Do you have a working PoC? Please share some perf numbers.



* Re: [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-13  3:08 ` [PATCH net-next 00/10] " Jakub Kicinski
@ 2026-01-13 12:09   ` Paolo Abeni
  2026-01-13 12:40     ` [Intel-wired-lan] " Jakub Sitnicki
  2026-01-13 18:52     ` Jesper Dangaard Brouer
  2026-01-13 12:33   ` Jakub Sitnicki
  1 sibling, 2 replies; 34+ messages in thread
From: Paolo Abeni @ 2026-01-13 12:09 UTC (permalink / raw)
  To: Jakub Kicinski, Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Simon Horman, Michael Chan,
	Pavan Chebbi, Andrew Lunn, Tony Nguyen, Przemek Kitszel,
	Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, intel-wired-lan, bpf,
	kernel-team

On 1/13/26 4:08 AM, Jakub Kicinski wrote:
> On Sat, 10 Jan 2026 22:05:14 +0100 Jakub Sitnicki wrote:
>> This series is split out of [1] following discussion with Jakub.
>>
>> To copy XDP metadata into an skb extension when skb_metadata_set() is
>> called, we need to locate the metadata contents.
> 
> "When skb_metadata_set() is called"? I think that may cause perf
> regressions unless we merge major optimizations at the same time?
> Should we defer touching the drivers until we have a PoC and some
> idea whether allocating the extension right away is manageable or 
> we are better off doing it via a kfunc in TC (after GRO)?
> To be clear putting the metadata in an extension right away would
> indeed be much cleaner, just not sure how much of the perf hit we 
> can optimize away..

I agree it would be better to defer touching the drivers until we have
proof that there will not be significant regressions.

IIRC, at early MPTCP impl time, Eric suggested increasing struct sk_buff
size as an alternative to the mptcp skb extension, leaving the added
trailing part uninitialized when the sk_buff is allocated.

If skb extension usage becomes so ubiquitous that extensions are basically
allocated for each packet, and the total skb extension size is kept under
strict control and remains reasonable (assuming it is :), perhaps we could
consider revisiting the above-mentioned approach?
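
As a rough illustration of the "uninitialized trailing part" idea, here
is a user-space sketch. Everything in it is an assumption made for
illustration: struct toy_skb, its field names and the 32-byte area size
are hypothetical and do not reflect any proposed struct sk_buff layout.
The allocator zeroes only the fields up to the trailing area, and a
length field tells readers whether the trailing bytes are valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy buffer descriptor with an inline, lazily initialized tail. */
struct toy_skb {
	unsigned int len;
	unsigned int meta_len;  /* 0 means the trailing area is not valid */
	/* Everything from here on is left uninitialized at allocation. */
	unsigned char meta[32];
};

/* Allocate and zero only the head of the struct, not the tail. */
static struct toy_skb *toy_skb_alloc(void)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	memset(skb, 0, offsetof(struct toy_skb, meta));
	return skb;
}

/* Fill the trailing area on demand; returns 0 on success, -1 if too big. */
static int toy_skb_meta_store(struct toy_skb *skb, const void *src, size_t n)
{
	if (n > sizeof(skb->meta))
		return -1;
	memcpy(skb->meta, src, n);
	skb->meta_len = (unsigned int)n;
	return 0;
}
```

The trade-off in this sketch is a larger descriptor for every packet in
exchange for no separate allocation, and an allocation cost that does
not grow with the size of the trailing area.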

/P



* Re: [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-13  3:08 ` [PATCH net-next 00/10] " Jakub Kicinski
  2026-01-13 12:09   ` Paolo Abeni
@ 2026-01-13 12:33   ` Jakub Sitnicki
  2026-01-22 20:21     ` Martin KaFai Lau
  1 sibling, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-13 12:33 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Michael Chan, Pavan Chebbi, Andrew Lunn, Tony Nguyen,
	Przemek Kitszel, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
	Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

On Mon, Jan 12, 2026 at 07:08 PM -08, Jakub Kicinski wrote:
> On Sat, 10 Jan 2026 22:05:14 +0100 Jakub Sitnicki wrote:
>> This series is split out of [1] following discussion with Jakub.
>> 
>> To copy XDP metadata into an skb extension when skb_metadata_set() is
>> called, we need to locate the metadata contents.
>
> "When skb_metadata_set() is called"? I think that may cause perf
> regressions unless we merge major optimizations at the same time?
> Should we defer touching the drivers until we have a PoC and some
> idea whether allocating the extension right away is manageable or 
> we are better off doing it via a kfunc in TC (after GRO)?
> To be clear putting the metadata in an extension right away would
> indeed be much cleaner, just not sure how much of the perf hit we 
> can optimize away..

Good point. I'm hoping we don't have to allocate from
skb_metadata_set(), which does sound prohibitively expensive. Instead
we'd allocate the extension together with the skb if we know upfront
that metadata will be used.

Things took an unexpected turn and I'm figuring this out as I go.
Please bear with me :-)

Here are my thoughts:
 
1) The driver changes do clean up the interface, but you're right that
   it's premature churn if the approach changes. If the skb extension
   approach doesn't pan out, we're ready to fall back to headroom-based
   storage.
 
2) How do we handle CONFIG_SKB_EXTENSIONS=n? Without extensions,
   reliable metadata access after L2 encap/decap would require patching
   skb_push/pull call sites—or we declare the feature unsupported
   without CONFIG_SKB_EXTENSIONS=y.

3) When skb extensions are enabled, asking users to attach TC BPF progs
   that call a kfunc to every device the skb goes through before L2
   encap/decap is impractical. The extension alloc/move needs to be
   baked into the stack.
 
I'll focus on getting a PoC together next. Stay tuned.
 
Thanks,
-jkbs


* Re: [Intel-wired-lan] [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-13 12:09   ` Paolo Abeni
@ 2026-01-13 12:40     ` Jakub Sitnicki
  2026-01-13 18:52     ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-13 12:40 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Jakub Kicinski, netdev, David S. Miller, Eric Dumazet,
	Simon Horman, Michael Chan, Pavan Chebbi, Andrew Lunn,
	Tony Nguyen, Przemek Kitszel, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team

On Tue, Jan 13, 2026 at 01:09 PM +01, Paolo Abeni wrote:
> IIRC, at early MPTCP impl time, Eric suggested increasing struct sk_buff
> size as an alternative to the mptcp skb extension, leaving the added
> trailing part uninitialized when the sk_buff is allocated.
>
> If skb extension usage becomes so ubiquitous that extensions are basically
> allocated for each packet, and the total skb extension size is kept under
> strict control and remains reasonable (assuming it is :), perhaps we could
> consider revisiting the above-mentioned approach?

I've been thinking the same thing. Great to hear that this idea is not
new.

FWIW, in our use cases we'd want to attach metadata to the first packet
of a new TCP/QUIC flow, and occasionally to sampled skbs for tracing.



* Re: [PATCH net-next 07/10] mlx5e: Call skb_metadata_set when skb->data points past metadata
  2026-01-13  6:08   ` Tariq Toukan
@ 2026-01-13 12:52     ` Jakub Sitnicki
  0 siblings, 0 replies; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-13 12:52 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Michael Chan, Pavan Chebbi,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, intel-wired-lan, bpf, kernel-team

On Tue, Jan 13, 2026 at 08:08 AM +02, Tariq Toukan wrote:
> On 10/01/2026 23:05, Jakub Sitnicki wrote:
>> Prepare to copy the XDP metadata into an skb extension in skb_metadata_set.
>>
>> Adjust the driver to pull from skb->data before calling skb_metadata_set.
>>
>> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>> ---
>>   drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
>> index 2b05536d564a..20c983c3ce62 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
>> @@ -237,8 +237,8 @@ static struct sk_buff *mlx5e_xsk_construct_skb(struct mlx5e_rq *rq, struct xdp_b
>>   	skb_put_data(skb, xdp->data_meta, totallen);
>>
>>   	if (metalen) {
>> -		skb_metadata_set(skb, metalen);
>>   		__skb_pull(skb, metalen);
>> +		skb_metadata_set(skb, metalen);
>>   	}
>>
>>   	return skb;
>>
>
> Patch itself is simple..
>
> I share my concerns about the perf impact of the series idea.
> Do you have some working PoC? Please share some perf numbers..

Sorry, nothing to show yet. I've shared more context in my reply to
Jakub [1].

The series itself is an interface cleanup, whether we end up needing it
for the metadata effort or not. Hence I wanted to salvage it from [2].
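
To make the ordering concrete, here is a toy userspace model, not kernel
code. The toy_* names and the struct layout are made up for illustration;
only the sequence (pull first, then set the metadata length) mirrors the
diff quoted above. Once data points past the metadata, the metadata can
be located at data - meta_len.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct sk_buff; NOT the kernel definition. */
struct toy_skb {
	unsigned char buf[64];
	unsigned char *data;	/* start of packet payload */
	size_t len;		/* bytes from data to tail */
	size_t meta_len;	/* recorded by toy_metadata_set() */
};

static void toy_init(struct toy_skb *skb)
{
	memset(skb, 0, sizeof(*skb));
	skb->data = skb->buf;
}

/* Append n bytes at the tail, loosely like skb_put_data(). */
static void toy_put_data(struct toy_skb *skb, const void *src, size_t n)
{
	assert(skb->data + skb->len + n <= skb->buf + sizeof(skb->buf));
	memcpy(skb->data + skb->len, src, n);
	skb->len += n;
}

/* Advance data past n bytes, loosely like __skb_pull(). */
static void toy_pull(struct toy_skb *skb, size_t n)
{
	assert(n <= skb->len);
	skb->data += n;
	skb->len -= n;
}

/* Assumed contract: data already points past the metadata area. */
static void toy_metadata_set(struct toy_skb *skb, size_t meta_len)
{
	assert((size_t)(skb->data - skb->buf) >= meta_len);
	skb->meta_len = meta_len;
}

/* With the contract in place, metadata sits right before data. */
static const unsigned char *toy_metadata(const struct toy_skb *skb)
{
	return skb->data - skb->meta_len;
}

/* Same order as the patched driver: put, pull, then set. */
static void toy_construct(struct toy_skb *skb, const void *data_meta,
			  size_t metalen, size_t totallen)
{
	toy_init(skb);
	toy_put_data(skb, data_meta, totallen);
	if (metalen) {
		toy_pull(skb, metalen);
		toy_metadata_set(skb, metalen);
	}
}
```

With the old order (set before pull), toy_metadata_set() would trip its
assertion because data would still point at the metadata itself.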

[1] https://lore.kernel.org/all/87bjixwv41.fsf@cloudflare.com/
[2] https://lore.kernel.org/r/20260107-skb-meta-safeproof-netdevs-rx-only-v3-0-0d461c5e4764@cloudflare.com



* Re: [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-13 12:09   ` Paolo Abeni
  2026-01-13 12:40     ` [Intel-wired-lan] " Jakub Sitnicki
@ 2026-01-13 18:52     ` Jesper Dangaard Brouer
  2026-01-13 20:22       ` [Intel-wired-lan] " Jakub Sitnicki
  1 sibling, 1 reply; 34+ messages in thread
From: Jesper Dangaard Brouer @ 2026-01-13 18:52 UTC (permalink / raw)
  To: Paolo Abeni, Jakub Kicinski, Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Simon Horman, Michael Chan,
	Pavan Chebbi, Andrew Lunn, Tony Nguyen, Przemek Kitszel,
	Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Stanislav Fomichev, intel-wired-lan, bpf, kernel-team,
	Jesse Brandeburg, Willem Ferguson, Arthur Fabre



On 13/01/2026 13.09, Paolo Abeni wrote:
> On 1/13/26 4:08 AM, Jakub Kicinski wrote:
>> On Sat, 10 Jan 2026 22:05:14 +0100 Jakub Sitnicki wrote:
>>> This series is split out of [1] following discussion with Jakub.
>>>
>>> To copy XDP metadata into an skb extension when skb_metadata_set() is
>>> called, we need to locate the metadata contents.
>>
>> "When skb_metadata_set() is called"? I think that may cause perf
>> regressions unless we merge major optimizations at the same time?
>> Should we defer touching the drivers until we have a PoC and some
>> idea whether allocating the extension right away is manageable or
>> we are better off doing it via a kfunc in TC (after GRO)?
>> To be clear putting the metadata in an extension right away would
>> indeed be much cleaner, just not sure how much of the perf hit we
>> can optimize away..
> 
> I agree it would be better deferring touching the driver before we have
> proof there will not be significant regressions.

It will be a performance regression to (as the cover letter says):
  "To copy XDP metadata into an skb extension when skb_metadata_set() is
  called".
The XDP to TC-ingress code path is a fast-path IMHO.

*BUT* this patchset isn't doing that. To me it looks like a cleanup
patchset that simply makes it consistent when skb_metadata_set() is called.
Selling it as a prerequisite for doing the copy later seems fishy.


> IIRC, at early MPTCP impl time, Eric suggested increasing struct sk_buff
> size as an alternative to the mptcp skb extension, leaving the added
> trailing part uninitialized when the sk_buff is allocated.
> 
> If skb extensions usage becomes so ubiquitous they are basically allocated
> for each packet, the total skb extension is kept under strict control
> and remains reasonable (assuming it is :), perhaps we could consider
> revisiting the above mentioned approach?


I really like this idea, as using the uninitialized tail room in the
SKB (memory area) will make SKB extensions fast.  Today SKBs are
allocated via the SLUB allocator cache-aligned, so the real size is 256
bytes.  On my system the actual SKB (sk_buff) size is 232 bytes (already
leaving us 24 bytes). The area that gets zero-initialized is only 192
bytes (3 cache-lines).  My experience with the SLUB allocator is that
increasing the object size doesn't increase the allocation cost (below
PAGE_SIZE).  So, the suggestion of simply allocating a larger sk_buff is
valid as it doesn't cost more (if we don't touch those cache-lines).  We
could even make it a CONFIG compile-time option how big this area should be.
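
The arithmetic behind those numbers, as a small sketch. It assumes
64-byte cache lines (which is what "192 bytes (3 cache-lines)" implies);
the helper names are made up and the 232-byte sk_buff size is just the
value from my system, it varies with the kernel config.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_LINE 64u	/* assumed cache-line size */

/* Round n up to the next multiple of align. */
static size_t toy_round_up(size_t n, size_t align)
{
	assert(align != 0);
	return (n + align - 1) / align * align;
}

/* Bytes left unused when an object is padded to a cache-line multiple. */
static size_t toy_spare_bytes(size_t obj_size)
{
	return toy_round_up(obj_size, TOY_CACHE_LINE) - obj_size;
}

/* Number of cache lines needed to cover the given byte count. */
static size_t toy_cache_lines(size_t bytes)
{
	return toy_round_up(bytes, TOY_CACHE_LINE) / TOY_CACHE_LINE;
}
```

For a 232-byte object this gives a 256-byte slot with 24 spare bytes,
and a 192-byte zeroed area covers 3 cache lines.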

For Jakub this unfortunately challenges/breaks the design of keeping the
data_meta area valid deeper into the netstack.  With all the challenges
around encapsulation/decap it seems hard/infeasible to maintain this
area in front of the packet data pointer deeper into the netstack.

Instead of blindly copying the XDP data_meta area into a single SKB
extension, what if we make it the responsibility of the TC-ingress BPF
hook to understand the data_meta format and, via (kfunc) helpers,
transfer/create the SKB extension that it deems relevant?
Would this be an acceptable approach that makes it easier to propagate
metadata deeper in the netstack?

--Jesper

p.s. For compact storage of SKB extensions in the SKB tail-area, we
could revisit Arthur's "traits" (compact-KV storage).



* Re: [Intel-wired-lan] [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-13 18:52     ` Jesper Dangaard Brouer
@ 2026-01-13 20:22       ` Jakub Sitnicki
  2026-01-14 11:49         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-13 20:22 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Alexei Starovoitov, Jakub Kicinski
  Cc: Paolo Abeni, Jakub Kicinski, netdev, David S. Miller,
	Eric Dumazet, Simon Horman, Michael Chan, Pavan Chebbi,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team, Jesse Brandeburg,
	Willem Ferguson, Arthur Fabre

On Tue, Jan 13, 2026 at 07:52 PM +01, Jesper Dangaard Brouer wrote:
> *BUT* this patchset isn't doing that. To me it looks like a cleanup
> patchset that simply makes it consistent when skb_metadata_set() is called.
> Selling it as a prerequisite for doing the copy later seems fishy.
 
Fair point on the framing. The interface cleanup is useful on its own -
I should have presented it that way rather than tying it to future work.

> Instead of blindly copying XDP data_meta area into a single SKB
> extension.  What if we make it the responsibility of the TC-ingress BPF-
> hook to understand the data_meta format and via (kfunc) helpers
> transfer/create the SKB extension that it deems relevant.
> Would this be an acceptable approach that makes it easier to propagate
> metadata deeper in netstack?

I think you and Jakub are actually proposing the same thing.
 
If we can access a buffer tied to an skb extension from BPF, this could
act as skb-local storage and solve the problem (with some operational
overhead to set up TC on ingress).
 
I'd also like to get Alexei's take on this. We had a discussion before
about not wanting to maintain two different storage areas for skb
metadata.
 
That was one of two reasons why we abandoned Arthur's patches and why I
tried to make the existing headroom-backed metadata area work.
 
But perhaps I misunderstood the earlier discussion. Alexei's point may
have been that we don't want another *headroom-backed* metadata area
accessible from XDP, because we already have that.
 
Looks like we have two options on the table:
 
Option A) Headroom-backed metadata
  - Use existing skb metadata area
  - Patch skb_push/pull call sites to preserve it
 
Option B) Extension-backed metadata
  - Store metadata in skb extension from BPF
  - TC BPF copies/extracts what it needs from headroom-metadata
 
Or is there an Option C I'm missing?

Thanks,
-jkbs


* Re: [Intel-wired-lan] [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-13 20:22       ` [Intel-wired-lan] " Jakub Sitnicki
@ 2026-01-14 11:49         ` Toke Høiland-Jørgensen
  2026-01-14 12:33           ` Jakub Sitnicki
  0 siblings, 1 reply; 34+ messages in thread
From: Toke Høiland-Jørgensen @ 2026-01-14 11:49 UTC (permalink / raw)
  To: Jakub Sitnicki, Jesper Dangaard Brouer, Alexei Starovoitov,
	Jakub Kicinski
  Cc: Paolo Abeni, Jakub Kicinski, netdev, David S. Miller,
	Eric Dumazet, Simon Horman, Michael Chan, Pavan Chebbi,
	Andrew Lunn, Tony Nguyen, Przemek Kitszel, Saeed Mahameed,
	Leon Romanovsky, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team, Jesse Brandeburg,
	Willem Ferguson, Arthur Fabre

Jakub Sitnicki via Intel-wired-lan <intel-wired-lan@osuosl.org> writes:

> On Tue, Jan 13, 2026 at 07:52 PM +01, Jesper Dangaard Brouer wrote:
>> *BUT* this patchset isn't doing that. To me it looks like a cleanup
>> patchset that simply makes it consistent when skb_metadata_set() is called.
>> Selling it as a prerequisite for doing the copy later seems fishy.
>  
> Fair point on the framing. The interface cleanup is useful on its own -
> I should have presented it that way rather than tying it to future work.
>
>> Instead of blindly copying XDP data_meta area into a single SKB
>> extension.  What if we make it the responsibility of the TC-ingress BPF-
>> hook to understand the data_meta format and via (kfunc) helpers
>> transfer/create the SKB extension that it deems relevant.
>> Would this be an acceptable approach that makes it easier to propagate
>> metadata deeper in netstack?
>
> I think you and Jakub are actually proposing the same thing.
>  
> If we can access a buffer tied to an skb extension from BPF, this could
> act as skb-local storage and solve the problem (with some operational
> overhead to set up TC on ingress).
>  
> I'd also like to get Alexei's take on this. We had a discussion before
> about not wanting to maintain two different storage areas for skb
> metadata.
>  
> That was one of two reasons why we abandoned Arthur's patches and why I
> tried to make the existing headroom-backed metadata area work.
>  
> But perhaps I misunderstood the earlier discussion. Alexei's point may
> have been that we don't want another *headroom-backed* metadata area
> accessible from XDP, because we already have that.
>  
> Looks like we have two options on the table:
>  
> Option A) Headroom-backed metadata
>   - Use existing skb metadata area
>   - Patch skb_push/pull call sites to preserve it
>  
> Option B) Extension-backed metadata
>   - Store metadata in skb extension from BPF
>   - TC BPF copies/extracts what it needs from headroom-metadata
>  
> Or is there an Option C I'm missing?

Not sure if it's really an option C, but would it be possible to
consolidate them using verifier tricks? I.e., the data_meta field in the
__sk_buff struct is really a virtual pointer that the verifier rewrites
to loading an actual pointer from struct bpf_skb_data_end in skb->cb. So
in principle this could be loaded from an skb extension instead with the
BPF programs being none the wiser.

There's the additional wrinkle that the end of the data_meta pointer is
compared to the 'data' start pointer to check for overflow, which
wouldn't work anymore. Not sure if there's a way to make the verifier
rewrite those checks in a compatible way, or if this is even a path we
want to go down. But it would be a pretty neat way to make the whole
thing transparent and backwards compatible, I think :)

Other than that, I like the extension-backed metadata idea!

-Toke



* Re: [Intel-wired-lan] [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-14 11:49         ` Toke Høiland-Jørgensen
@ 2026-01-14 12:33           ` Jakub Sitnicki
  0 siblings, 0 replies; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-14 12:33 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Jesper Dangaard Brouer, Alexei Starovoitov, Jakub Kicinski,
	Paolo Abeni, netdev, David S. Miller, Eric Dumazet, Simon Horman,
	Michael Chan, Pavan Chebbi, Andrew Lunn, Tony Nguyen,
	Przemek Kitszel, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
	Mark Bloch, Daniel Borkmann, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team, Jesse Brandeburg,
	Willem Ferguson, Arthur Fabre

On Wed, Jan 14, 2026 at 12:49 PM +01, Toke Høiland-Jørgensen wrote:
> Jakub Sitnicki via Intel-wired-lan <intel-wired-lan@osuosl.org> writes:
>
>> On Tue, Jan 13, 2026 at 07:52 PM +01, Jesper Dangaard Brouer wrote:
>>> *BUT* this patchset isn't doing that. To me it looks like a cleanup
>>> patchset that simply makes it consistent when skb_metadata_set() is called.
>>> Selling it as a prerequisite for doing the copy later seems fishy.
>>  
>> Fair point on the framing. The interface cleanup is useful on its own -
>> I should have presented it that way rather than tying it to future work.
>>
>>> Instead of blindly copying XDP data_meta area into a single SKB
>>> extension.  What if we make it the responsibility of the TC-ingress BPF-
>>> hook to understand the data_meta format and via (kfunc) helpers
>>> transfer/create the SKB extension that it deems relevant.
>>> Would this be an acceptable approach that makes it easier to propagate
>>> metadata deeper in netstack?
>>
>> I think you and Jakub are actually proposing the same thing.
>>  
>> If we can access a buffer tied to an skb extension from BPF, this could
>> act as skb-local storage and solve the problem (with some operational
>> overhead to set up TC on ingress).
>>  
>> I'd also like to get Alexei's take on this. We had a discussion before
>> about not wanting to maintain two different storage areas for skb
>> metadata.
>>  
>> That was one of two reasons why we abandoned Arthur's patches and why I
>> tried to make the existing headroom-backed metadata area work.
>>  
>> But perhaps I misunderstood the earlier discussion. Alexei's point may
>> have been that we don't want another *headroom-backed* metadata area
>> accessible from XDP, because we already have that.
>>  
>> Looks like we have two options on the table:
>>  
>> Option A) Headroom-backed metadata
>>   - Use existing skb metadata area
>>   - Patch skb_push/pull call sites to preserve it
>>  
>> Option B) Extension-backed metadata
>>   - Store metadata in skb extension from BPF
>>   - TC BPF copies/extracts what it needs from headroom-metadata
>>  
>> Or is there an Option C I'm missing?
>
> Not sure if it's really an option C, but would it be possible to
> consolidate them using verifier tricks? I.e., the data_meta field in the
> __sk_buff struct is really a virtual pointer that the verifier rewrites
> to loading an actual pointer from struct bpf_skb_data_end in skb->cb. So
> in principle this could be loaded from an skb extension instead with the
> BPF programs being none the wiser.
>
> There's the additional wrinkle that the end of the data_meta pointer is
> compared to the 'data' start pointer to check for overflow, which
> wouldn't work anymore. Not sure if there's a way to make the verifier
> rewrite those checks in a compatible way, or if this is even a path we
> want to go down. But it would be a pretty neat way to make the whole
> thing transparent and backwards compatible, I think :)

I gave it a shot when working on [1]. Here's the challenge:

1) Keep the skb->data_meta + N <= skb->data checks working

This is what guarantees that your BPF program won't access memory
outside of the metadata area. So you can't rewrite the skb->data_meta
pseudo-pointer load. This means you must...

2) Patch the skb->data_meta pointer dereference after the check

Since deref happens at some unknown point after the skb->data_meta
pointer load, you may no longer have the context pointer in any of the
registers.

You might be able to hack it by spilling the context pointer to the
stack in the prologue, as I've seen bpf_qdisc do. But I haven't tried
that.

In general, I view it as a secondary issue since you can use a BPF dynptr
to access the skb metadata. The dynptr was added exactly for that reason:
to hide where the metadata is actually located.
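
For readers less familiar with the check in (1), here is a toy userspace
sketch of the pattern. It is not BPF code: real programs read data_meta
and data from struct __sk_buff and the verifier enforces the comparison
at load time, whereas this model (struct toy_ctx and toy_read_meta_u32()
are made-up names) does the same comparison at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy context: both pointers refer into the same packet buffer. */
struct toy_ctx {
	const unsigned char *data_meta;	/* start of metadata area */
	const unsigned char *data;	/* start of packet data */
};

/*
 * Read a u32 from the metadata area, but only if
 * data_meta + sizeof(u32) <= data. Returns 0 on success, -1 otherwise.
 */
static int toy_read_meta_u32(const struct toy_ctx *ctx, uint32_t *out)
{
	assert(ctx && out);
	if (ctx->data_meta + sizeof(*out) > ctx->data)
		return -1;	/* metadata area too small (or absent) */
	memcpy(out, ctx->data_meta, sizeof(*out));
	return 0;
}
```

The check only makes sense while data_meta and data point into the same
buffer, with the metadata directly in front of the packet data. If the
metadata lived in an skb extension, comparing against data would say
nothing about the size of the metadata area, which is why the load and
the later dereference would have to be treated differently.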

> Other than that, I like the extension-backed metadata idea!

That's what I'm going to work on. I look at it as an skb local storage
backed by an skb extension.

If the user wants to transfer the contents of the skb metadata into
local storage, they can. But the extra allocation is their decision.

[1] https://lore.kernel.org/r/20260110-skb-meta-fixup-skb_metadata_set-calls-v1-0-1047878ed1b0@cloudflare.com


* Re: [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-13 12:33   ` Jakub Sitnicki
@ 2026-01-22 20:21     ` Martin KaFai Lau
  2026-01-25 19:15       ` Jakub Sitnicki
  0 siblings, 1 reply; 34+ messages in thread
From: Martin KaFai Lau @ 2026-01-22 20:21 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Michael Chan, Pavan Chebbi, Andrew Lunn, Tony Nguyen,
	Przemek Kitszel, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
	Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team, Jakub Kicinski, Amery Hung

On 1/13/26 4:33 AM, Jakub Sitnicki wrote:
> Good point. I'm hoping we don't have to allocate from
> skb_metadata_set(), which does sound prohibitively expensive. Instead
> we'd allocate the extension together with the skb if we know upfront
> that metadata will be used.

[ Sorry for being late. Have been catching up after holidays. ]

For the sk local storage (which was mentioned in other replies as making
skb metadata look more like sk local storage), there is a plan (Amery
has been looking into it) to allocate the storage together with sk for
performance reasons. This means allocating a larger 'struct sock'. The
extra space will be at the front of sk instead of the end of sk because
of how the 'struct sock' is embedded in tcp_sock/udp_sock/... If skb is
going in the same direction, it should be useful to have a similar
scheme: upfront allocation that is then shared by multiple BPF progs.

The current thinking is to build upon the existing bpf_sk_local_storage
usage. A boot param decides how much BPF space should be allocated for 
'struct sock'. When a bpf_sk_storage_map is created (with a new 
use_reserve flag), the space will be allocated permanently from the head 
space of every sk for this map. The read (from a BPF prog) will be at 
one stable offset before a sk. If there is no more head space left, the 
map creation will fail. User can decide if it wants to retry without the 
'use_reserve' flag.
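
A rough sketch of that bookkeeping, purely hypothetical: none of the
toy_* names below exist in the kernel, and the real design may well end
up different. It only models a fixed budget (the boot param) from which
each map creation carves out a stable offset, and fails once the budget
is exhausted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-type reserve, sized once (e.g. from a boot param). */
struct toy_reserve {
	size_t total;	/* bytes reserved in front of every object */
	size_t used;	/* bytes already handed out to maps */
};

/*
 * Carve value_size bytes out of the reserve for a new map.
 * Returns the distance (in bytes) before the object start where the
 * map's value begins, or -1 if there is not enough head space left.
 */
static long toy_reserve_map(struct toy_reserve *r, size_t value_size)
{
	assert(r && r->used <= r->total);
	if (value_size == 0 || value_size > r->total - r->used)
		return -1;
	r->used += value_size;
	return (long)r->used;
}
```

On a -1 return the caller would fall back to creating the map without
the 'use_reserve' flag, i.e. the existing pointer-based storage.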



* Re: [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-22 20:21     ` Martin KaFai Lau
@ 2026-01-25 19:15       ` Jakub Sitnicki
  2026-01-27 19:33         ` Martin KaFai Lau
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Sitnicki @ 2026-01-25 19:15 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Michael Chan, Pavan Chebbi, Andrew Lunn, Tony Nguyen,
	Przemek Kitszel, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
	Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team, Jakub Kicinski, Amery Hung

On Thu, Jan 22, 2026 at 12:21 PM -08, Martin KaFai Lau wrote:
> On 1/13/26 4:33 AM, Jakub Sitnicki wrote:
>> Good point. I'm hoping we don't have to allocate from
>> skb_metadata_set(), which does sound prohibitively expensive. Instead
>> we'd allocate the extension together with the skb if we know upfront
>> that metadata will be used.
>
> [ Sorry for being late. Have been catching up after holidays. ]
>
> For the sk local storage (which was mentioned in other replies as making skb
> metadata to look more like sk local storage), there is a plan (Amery has been
> looking into it) to allocate the storage together with sk for performance
> reason. This means allocating a larger 'struct sock'. The extra space will be at
> the front of sk instead of the end of sk because of how the 'struct sock' is
> embedded in tcp_sock/udp_sock/... If skb is going in the same direction, it
> should be useful to have a similar scheme on: upfront allocation and then shared
> by multiple BPF progs.
>
> The current thinking is to build upon the existing bpf_sk_local_storage usage. A
> boot param decides how much BPF space should be allocated for 'struct
> sock'. When a bpf_sk_storage_map is created (with a new use_reserve flag), the
> space will be allocated permanently from the head space of every sk for this
> map. The read (from a BPF prog) will be at one stable offset before a sk. If
> there is no more head space left, the map creation will fail. User can decide if
> it wants to retry without the 'use_reserve' flag.

Thanks for sharing the plans.

We will definitely be looking into ways of eliminating allocations in
the long run. With one allocation for skb_ext, one for
bpf_local_storage, and one for the actual map, it seems unlikely we will
be able to attach metadata this way to every packet. Which is something
we wanted for our "label packet once, use label everywhere" use case.

I'm not sure how much we can squeeze in together with the sk_buff.
Hopefully at least skb_ext plus a pointer to bpf_local_storage.

I'm also hoping we can allocate memory for bpf_local_storage together
with the backing space for the map, whose update triggers the skb
extension activation.

Finally, bpf_local_storage itself has a pretty generous cache which
blows it up. Maybe the cache could be a flexible array, which could be
smaller for skb local storage.

All just ideas ATM. Initial RFC won't have any of these optimizations.


* Re: [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata
  2026-01-25 19:15       ` Jakub Sitnicki
@ 2026-01-27 19:33         ` Martin KaFai Lau
  0 siblings, 0 replies; 34+ messages in thread
From: Martin KaFai Lau @ 2026-01-27 19:33 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Michael Chan, Pavan Chebbi, Andrew Lunn, Tony Nguyen,
	Przemek Kitszel, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
	Mark Bloch, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	intel-wired-lan, bpf, kernel-team, Jakub Kicinski, Amery Hung



On 1/25/26 11:15 AM, Jakub Sitnicki wrote:
> On Thu, Jan 22, 2026 at 12:21 PM -08, Martin KaFai Lau wrote:
>> On 1/13/26 4:33 AM, Jakub Sitnicki wrote:
>>> Good point. I'm hoping we don't have to allocate from
>>> skb_metadata_set(), which does sound prohibitively expensive. Instead
>>> we'd allocate the extension together with the skb if we know upfront
>>> that metadata will be used.
>>
>> [ Sorry for being late. Have been catching up after holidays. ]
>>
>> For the sk local storage (which was mentioned in other replies as making skb
>> metadata to look more like sk local storage), there is a plan (Amery has been
>> looking into it) to allocate the storage together with sk for performance
>> reason. This means allocating a larger 'struct sock'. The extra space will be at
>> the front of sk instead of the end of sk because of how the 'struct sock' is
>> embedded in tcp_sock/udp_sock/... If skb is going in the same direction, it
>> should be useful to have a similar scheme on: upfront allocation and then shared
>> by multiple BPF progs.
>>
>> The current thinking is to build upon the existing bpf_sk_local_storage usage. A
>> boot param decides how much BPF space should be allocated for 'struct
>> sock'. When a bpf_sk_storage_map is created (with a new use_reserve flag), the
>> space will be allocated permanently from the head space of every sk for this
>> map. The read (from a BPF prog) will be at one stable offset before a sk. If
>> there is no more head space left, the map creation will fail. User can decide if
>> it wants to retry without the 'use_reserve' flag.
> 
> Thanks for sharing the plans.
> 
> We will definitely be looking into ways of eliminating allocations in
> the long run. With one allocation for skb_ext, one for
> bpf_local_storage, and one for the actual map, it seems unlikely we will
> be able to attach metadata this way to every packet. Which is something
> we wanted for our "label packet once, use label everywhere" use case.
> 
> I'm not sure how much we can squeeze in together with the sk_buff.
> Hopefully at least skb_ext plus a pointer to bpf_local_storage.

yeah, only a bpf_local_storage pointer is needed in skb (or in skb_ext). 
It is the same for the bpf sk/task/... storage.

To be clear, for allocation in skb, I was thinking more about Paolo's 
comment on "...increasing struct sk_buff size as an alternative to the 
mptcp skb extension...".

> 
> I'm also hoping we can allocate memory for bpf_local_storage together
> with the backing space for the map, whose update triggers the skb
> extension activation.

Allocate the actual storage at the end of bpf_local_storage? Hmm... off 
the top of my head, I don't have a good idea how to do it without 
trading off flexibility. If trading off flexibility, may as well 
allocate fixed extra space at the sk (/skb) and get a performance 
benefit (which would need to be measured).

> 
> Finally, bpf_local_storage itself has a pretty generous cache which
> blows it up. Maybe the cache could be a flexible array, which could be
> smaller for skb local storage.

For our usage, the cache has been slowly filling up, so we actually have
the other side of the issue. Improvements to bpf_local_storage are always
welcome.

I am currently more interested in getting the extra memory/headroom
allocated for an sk. Eventually, the storage(s) that will be needed for
all (or most) sk will use the extra headroom of sk. The current
bpf_local_storage (pointer) in sk will be more for testing/ad-hoc
purposes or for performance-insensitive usage.

It is probably off topic now. It seems having extra tail space in an skb
is not in your current plan for the next respin.



Thread overview: 34+ messages
2026-01-10 21:05 [PATCH net-next 00/10] Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
2026-01-10 21:05 ` [PATCH net-next 01/10] net: Document skb_metadata_set contract with the drivers Jakub Sitnicki
2026-01-12 11:28   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-10 21:05 ` [PATCH net-next 02/10] bnxt_en: Call skb_metadata_set when skb->data points past metadata Jakub Sitnicki
2026-01-12 11:29   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-10 21:05 ` [PATCH net-next 03/10] i40e: " Jakub Sitnicki
2026-01-12 11:30   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-10 21:05 ` [PATCH net-next 04/10] igb: " Jakub Sitnicki
2026-01-12 11:31   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-10 21:05 ` [PATCH net-next 05/10] igc: " Jakub Sitnicki
2026-01-12 11:31   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-10 21:05 ` [PATCH net-next 06/10] ixgbe: " Jakub Sitnicki
2026-01-12 11:32   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-10 21:05 ` [PATCH net-next 07/10] mlx5e: " Jakub Sitnicki
2026-01-12 11:32   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-13  6:08   ` Tariq Toukan
2026-01-13 12:52     ` Jakub Sitnicki
2026-01-10 21:05 ` [PATCH net-next 08/10] veth: " Jakub Sitnicki
2026-01-12 11:33   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-10 21:05 ` [PATCH net-next 09/10] xsk: " Jakub Sitnicki
2026-01-12 11:33   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-10 21:05 ` [PATCH net-next 10/10] xdp: " Jakub Sitnicki
2026-01-12 11:33   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-01-13  3:08 ` [PATCH net-next 00/10] " Jakub Kicinski
2026-01-13 12:09   ` Paolo Abeni
2026-01-13 12:40     ` [Intel-wired-lan] " Jakub Sitnicki
2026-01-13 18:52     ` Jesper Dangaard Brouer
2026-01-13 20:22       ` [Intel-wired-lan] " Jakub Sitnicki
2026-01-14 11:49         ` Toke Høiland-Jørgensen
2026-01-14 12:33           ` Jakub Sitnicki
2026-01-13 12:33   ` Jakub Sitnicki
2026-01-22 20:21     ` Martin KaFai Lau
2026-01-25 19:15       ` Jakub Sitnicki
2026-01-27 19:33         ` Martin KaFai Lau
