public inbox for netdev@vger.kernel.org
* [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset
@ 2026-01-05 12:14 Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 01/16] bnxt_en: Call skb_metadata_set when skb->data points at metadata end Jakub Sitnicki
                   ` (17 more replies)
  0 siblings, 18 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

This series continues the effort to provide reliable access to xdp/skb
metadata from BPF context on the receive path. We have recently talked
about it at Plumbers [1].

Currently skb metadata location is tied to the MAC header offset:

  [headroom][metadata][MAC hdr][L3 pkt]
                      ^
                      skb_metadata_end = head + mac_header

This design breaks on L2 decapsulation (VLAN, GRE, etc.) when the MAC
offset is reset. The naive fix is to memmove metadata on every decap path,
but we can avoid this cost by tracking metadata position independently.

Introduce a dedicated meta_end field in skb_shared_info that records where
metadata ends relative to skb->head:

  [headroom][metadata][gap][MAC hdr][L3 pkt]
                      ^
                      skb_metadata_end = head + meta_end

This allows BPF dynptr access (bpf_dynptr_from_skb_meta()) to work without
a memmove. For skb->data_meta pointer access, which expects metadata to sit
immediately before skb->data, make the verifier inject realignment code
into the TC BPF prologue.
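
The before/after lookup rules can be sketched with a toy model (plain C,
not kernel code; the struct and field names are illustrative stand-ins for
sk_buff and skb_shared_info):

```c
#include <assert.h>

/* Toy model: offsets are relative to the head of the buffer. */
struct toy_skb {
	unsigned char head[256];
	unsigned int data;       /* offset of skb->data */
	unsigned int mac_header; /* offset of the MAC header */
	unsigned int meta_end;   /* new: offset where metadata ends */
	unsigned char meta_len;
};

/* Old scheme: metadata assumed to end at the MAC header. */
static unsigned char *old_metadata_end(struct toy_skb *skb)
{
	return skb->head + skb->mac_header;
}

/* New scheme: metadata end tracked independently of mac_header. */
static unsigned char *new_metadata_end(struct toy_skb *skb)
{
	return skb->head + skb->meta_end;
}

/* L2 decap resets mac_header past the pulled header; metadata stays put. */
static void toy_decap(struct toy_skb *skb, unsigned int hdr_len)
{
	skb->data += hdr_len;
	skb->mac_header = skb->data;
}
```

After a decap, the old rule points past the real metadata while the new
rule still finds it, which is exactly the gap the series closes.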

Patches 1-9 enforce the calling convention: skb_metadata_set() must be
called after skb->data points past the metadata area, ensuring meta_end
captures the correct position. Patch 10 implements the core change.
Patches 11-14 extend the verifier to track data_meta usage, and patch 15
adds the realignment logic. Patch 16 adds selftests covering L2 decap
scenarios.

Note: This series does not address moving metadata on L2 encapsulation
when forwarding packets. VLAN and QinQ were already handled when fixing the
TC BPF helpers [2], but other tagging/tunnel code still requires changes.

Note to maintainers: This is not a typical series, in the sense that it
touches both the networking drivers and the BPF verifier. The driver
changes (patches 1-9) can be split out, if it makes patch wrangling easier.

Thanks,
-jkbs

[1] https://lpc.events/event/19/contributions/2269/
[2] https://lore.kernel.org/all/20251105-skb-meta-rx-path-v4-0-5ceb08a9b37b@cloudflare.com/

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
Changes in v2:
- Add veth driver fix (patch 7)
- Add selftests for L2 decap paths (patch 16)
- Link to RFC: https://lore.kernel.org/r/20251124-skb-meta-safeproof-netdevs-rx-only-v1-0-8978f5054417@cloudflare.com

---
Jakub Sitnicki (16):
      bnxt_en: Call skb_metadata_set when skb->data points at metadata end
      i40e: Call skb_metadata_set when skb->data points at metadata end
      igb: Call skb_metadata_set when skb->data points at metadata end
      igc: Call skb_metadata_set when skb->data points at metadata end
      ixgbe: Call skb_metadata_set when skb->data points at metadata end
      net/mlx5e: Call skb_metadata_set when skb->data points at metadata end
      veth: Call skb_metadata_set when skb->data points at metadata end
      xsk: Call skb_metadata_set when skb->data points at metadata end
      xdp: Call skb_metadata_set when skb->data points at metadata end
      net: Track skb metadata end separately from MAC offset
      bpf, verifier: Remove side effects from may_access_direct_pkt_data
      bpf, verifier: Turn seen_direct_write flag into a bitmap
      bpf, verifier: Propagate packet access flags to gen_prologue
      bpf, verifier: Track when data_meta pointer is loaded
      bpf: Realign skb metadata for TC progs using data_meta
      selftests/bpf: Test skb metadata access after L2 decapsulation

 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |   2 +-
 drivers/net/ethernet/intel/i40e/i40e_xsk.c         |   2 +-
 drivers/net/ethernet/intel/igb/igb_xsk.c           |   2 +-
 drivers/net/ethernet/intel/igc/igc_main.c          |   4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c       |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c    |   2 +-
 drivers/net/veth.c                                 |   4 +-
 include/linux/bpf.h                                |   2 +-
 include/linux/bpf_verifier.h                       |   7 +-
 include/linux/skbuff.h                             |  37 ++-
 kernel/bpf/cgroup.c                                |   2 +-
 kernel/bpf/verifier.c                              |  42 ++-
 net/core/dev.c                                     |   5 +-
 net/core/filter.c                                  |  66 ++++-
 net/core/skbuff.c                                  |  10 +-
 net/core/xdp.c                                     |   2 +-
 net/sched/bpf_qdisc.c                              |   3 +-
 tools/testing/selftests/bpf/config                 |   6 +-
 .../bpf/prog_tests/xdp_context_test_run.c          | 292 +++++++++++++++++++++
 tools/testing/selftests/bpf/progs/test_xdp_meta.c  |  48 ++--
 .../testing/selftests/bpf/test_kmods/bpf_testmod.c |   6 +-
 21 files changed, 459 insertions(+), 87 deletions(-)


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH bpf-next v2 01/16] bnxt_en: Call skb_metadata_set when skb->data points at metadata end
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 02/16] i40e: " Jakub Sitnicki
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Prepare to track skb metadata location independently of MAC header offset.

The following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().
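
Why the order matters can be checked with a toy model (not kernel code;
names are illustrative, assuming the new skb_metadata_set() semantics of
snapshotting the current headroom as the metadata end):

```c
#include <assert.h>

/* Toy model: meta_end records where skb->data sits at set time, so the
 * ordering with __skb_pull() is significant.
 */
struct toy_skb {
	unsigned int data;     /* offset of skb->data from skb->head */
	unsigned int meta_end; /* recorded metadata end offset */
	unsigned char meta_len;
};

static void toy_skb_pull(struct toy_skb *skb, unsigned int len)
{
	skb->data += len;
}

static void toy_metadata_set(struct toy_skb *skb, unsigned char meta_len)
{
	skb->meta_end = skb->data; /* headroom at the time of the call */
	skb->meta_len = meta_len;
}
```

With the old order (set, then pull) meta_end would record the metadata
start; pulling first records the true end.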

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index d17d0ea89c36..cb1aa2895172 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1440,8 +1440,8 @@ static struct sk_buff *bnxt_copy_xdp(struct bnxt_napi *bnapi,
 		return skb;
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	return skb;

-- 
2.43.0



* [PATCH bpf-next v2 02/16] i40e: Call skb_metadata_set when skb->data points at metadata end
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 01/16] bnxt_en: Call skb_metadata_set when skb->data points at metadata end Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 03/16] igb: " Jakub Sitnicki
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Prepare to track skb metadata location independently of MAC header offset.

The following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 9f47388eaba5..11eff5bd840b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -310,8 +310,8 @@ static struct sk_buff *i40e_construct_skb_zc(struct i40e_ring *rx_ring,
 	       ALIGN(totalsize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	if (likely(!xdp_buff_has_frags(xdp)))

-- 
2.43.0



* [PATCH bpf-next v2 03/16] igb: Call skb_metadata_set when skb->data points at metadata end
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 01/16] bnxt_en: Call skb_metadata_set when skb->data points at metadata end Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 02/16] i40e: " Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 04/16] igc: " Jakub Sitnicki
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Prepare to track skb metadata location independently of MAC header offset.

The following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/intel/igb/igb_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
index 30ce5fbb5b77..9202da66e32c 100644
--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
@@ -284,8 +284,8 @@ static struct sk_buff *igb_construct_skb_zc(struct igb_ring *rx_ring,
 	       ALIGN(totalsize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	return skb;

-- 
2.43.0



* [PATCH bpf-next v2 04/16] igc: Call skb_metadata_set when skb->data points at metadata end
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (2 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 03/16] igb: " Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 05/16] ixgbe: " Jakub Sitnicki
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Prepare to track skb metadata location independently of MAC header offset.

The following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 7aafa60ba0c8..ba758399615b 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2024,8 +2024,8 @@ static struct sk_buff *igc_construct_skb(struct igc_ring *rx_ring,
 	       ALIGN(headlen + metasize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	/* update all of the pointers */
@@ -2752,8 +2752,8 @@ static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
 	       ALIGN(totalsize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	if (ctx->rx_ts) {

-- 
2.43.0



* [PATCH bpf-next v2 05/16] ixgbe: Call skb_metadata_set when skb->data points at metadata end
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (3 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 04/16] igc: " Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 06/16] net/mlx5e: " Jakub Sitnicki
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Prepare to track skb metadata location independently of MAC header offset.

The following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index 7b941505a9d0..69104f432f8d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -228,8 +228,8 @@ static struct sk_buff *ixgbe_construct_skb_zc(struct ixgbe_ring *rx_ring,
 	       ALIGN(totalsize, sizeof(long)));
 
 	if (metasize) {
-		skb_metadata_set(skb, metasize);
 		__skb_pull(skb, metasize);
+		skb_metadata_set(skb, metasize);
 	}
 
 	return skb;

-- 
2.43.0



* [PATCH bpf-next v2 06/16] net/mlx5e: Call skb_metadata_set when skb->data points at metadata end
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (4 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 05/16] ixgbe: " Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 07/16] veth: " Jakub Sitnicki
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Prepare to track skb metadata location independently of MAC header offset.

The following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust the driver to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index 2b05536d564a..20c983c3ce62 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -237,8 +237,8 @@ static struct sk_buff *mlx5e_xsk_construct_skb(struct mlx5e_rq *rq, struct xdp_b
 	skb_put_data(skb, xdp->data_meta, totallen);
 
 	if (metalen) {
-		skb_metadata_set(skb, metalen);
 		__skb_pull(skb, metalen);
+		skb_metadata_set(skb, metalen);
 	}
 
 	return skb;

-- 
2.43.0



* [PATCH bpf-next v2 07/16] veth: Call skb_metadata_set when skb->data points at metadata end
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (5 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 06/16] net/mlx5e: " Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 08/16] xsk: " Jakub Sitnicki
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Prepare to track skb metadata location independently of MAC header offset.

The following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
points right past the metadata area.

Unlike other drivers, veth calls skb_metadata_set() after eth_type_trans(),
which pulls the Ethernet header and moves skb->data. This violates the
future calling convention.

Adjust the driver to pull the MAC header after calling skb_metadata_set().
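
The veth-specific ordering can be illustrated with a toy model (not kernel
code; toy_eth_type_trans() is an illustrative stand-in for eth_type_trans()
pulling the 14-byte Ethernet header):

```c
#include <assert.h>

struct toy_skb {
	unsigned int data;     /* offset of skb->data from skb->head */
	unsigned int meta_end; /* recorded metadata end offset */
	unsigned char meta_len;
};

/* Stand-in for eth_type_trans(): pulls the Ethernet header. */
static void toy_eth_type_trans(struct toy_skb *skb)
{
	skb->data += 14;
}

/* New semantics: snapshot current headroom as the metadata end. */
static void toy_metadata_set(struct toy_skb *skb, unsigned char meta_len)
{
	skb->meta_end = skb->data;
	skb->meta_len = meta_len;
}
```

Calling the set helper after eth_type_trans() would record an end 14 bytes
past the real metadata; setting first keeps it correct.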

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 drivers/net/veth.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 14e6f2a2fb77..1d1dbfa2e5ef 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -874,11 +874,11 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
 	else
 		skb->data_len = 0;
 
-	skb->protocol = eth_type_trans(skb, rq->dev);
-
 	metalen = xdp->data - xdp->data_meta;
 	if (metalen)
 		skb_metadata_set(skb, metalen);
+
+	skb->protocol = eth_type_trans(skb, rq->dev);
 out:
 	return skb;
 drop:

-- 
2.43.0



* [PATCH bpf-next v2 08/16] xsk: Call skb_metadata_set when skb->data points at metadata end
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (6 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 07/16] veth: " Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 09/16] xdp: " Jakub Sitnicki
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Prepare to track skb metadata location independently of MAC header offset.

The following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
already points past the metadata area.

Adjust AF_XDP to pull from skb->data before calling skb_metadata_set().

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/xdp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/xdp.c b/net/core/xdp.c
index fee6d080ee85..f35661305660 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -768,8 +768,8 @@ struct sk_buff *xdp_build_skb_from_zc(struct xdp_buff *xdp)
 
 	metalen = xdp->data - xdp->data_meta;
 	if (metalen > 0) {
-		skb_metadata_set(skb, metalen);
 		__skb_pull(skb, metalen);
+		skb_metadata_set(skb, metalen);
 	}
 
 	skb_record_rx_queue(skb, rxq->queue_index);

-- 
2.43.0



* [PATCH bpf-next v2 09/16] xdp: Call skb_metadata_set when skb->data points at metadata end
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (7 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 08/16] xsk: " Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 10/16] net: Track skb metadata end separately from MAC offset Jakub Sitnicki
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Prepare to track skb metadata location independently of MAC header offset.

The following changes will make skb_metadata_set() record where metadata ends
relative to skb->head. Hence the helper must be called when skb->data
points just past the metadata area.

Tweak XDP generic mode accordingly.
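
The push/set/pull dance can be sketched with a toy model (not kernel code;
names are illustrative, assuming skb->data already sits past the MAC header
when XDP_PASS is handled in generic mode):

```c
#include <assert.h>

struct toy_skb {
	unsigned int data;     /* offset of skb->data from skb->head */
	unsigned int meta_end; /* recorded metadata end offset */
	unsigned char meta_len;
};

static void toy_push(struct toy_skb *skb, unsigned int len) { skb->data -= len; }
static void toy_pull(struct toy_skb *skb, unsigned int len) { skb->data += len; }

/* New semantics: snapshot current headroom as the metadata end. */
static void toy_metadata_set(struct toy_skb *skb, unsigned char meta_len)
{
	skb->meta_end = skb->data;
	skb->meta_len = meta_len;
}

/* Mirrors the patched XDP_PASS branch: push mac_len, set, pull mac_len. */
static void toy_xdp_pass(struct toy_skb *skb, unsigned int mac_len,
			 unsigned char metalen)
{
	toy_push(skb, mac_len);
	toy_metadata_set(skb, metalen);
	toy_pull(skb, mac_len);
}
```

The temporary push restores skb->data to the metadata end for the duration
of the set call, then the pull puts it back where generic XDP expects it.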

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 net/core/dev.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 9094c0fb8c68..7f984d86dfe9 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5467,8 +5467,11 @@ u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp,
 		break;
 	case XDP_PASS:
 		metalen = xdp->data - xdp->data_meta;
-		if (metalen)
+		if (metalen) {
+			__skb_push(skb, mac_len);
 			skb_metadata_set(skb, metalen);
+			__skb_pull(skb, mac_len);
+		}
 		break;
 	}
 

-- 
2.43.0



* [PATCH bpf-next v2 10/16] net: Track skb metadata end separately from MAC offset
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (8 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 09/16] xdp: " Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 12:14 ` [PATCH bpf-next v2 11/16] bpf, verifier: Remove side effects from may_access_direct_pkt_data Jakub Sitnicki
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Currently skb metadata location is derived from the MAC header offset.
This breaks when L2 tunnel/tagging devices (VLAN, GRE, etc.) reset the MAC
offset after pulling the encapsulation header, making the metadata
inaccessible.

A naive fix would be to move metadata on every skb_pull() path. However, we
can avoid a memmove on L2 decapsulation if we can locate metadata
independently of the MAC offset.

Introduce a meta_end field in skb_shared_info to track where metadata ends,
decoupling it from mac_header. The new field takes 2 bytes out of an
existing 4-byte hole; with the gso_type field reordered, the structure size
stays unchanged.

Update skb_metadata_set() to record meta_end at the time of the call, and
adjust skb_data_move() and pskb_expand_head() to keep meta_end in sync with
head buffer layout.

Remove the now-unneeded metadata adjustment in skb_reorder_vlan_header().

Note that this breaks BPF skb metadata access through skb->data_meta when
there is a gap between meta_end and skb->data. Following BPF verifier
changes address this.

Also, we still need to relocate the metadata on encapsulation on the
forwarding path. VLAN and QinQ were already handled when fixing the TC BPF
helpers [1], but other tagging/tunnel code still requires similar changes.
This will be done as a follow-up.

[1] https://lore.kernel.org/all/20251105-skb-meta-rx-path-v4-0-5ceb08a9b37b@cloudflare.com/
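
The bookkeeping this adds to skb_data_move() and pskb_expand_head() can be
sketched with a toy model (not kernel code; names are illustrative, and
meta_end is an offset relative to skb->head):

```c
#include <assert.h>

/* Toy model: whenever bytes move relative to skb->head, meta_end must
 * move with them, since it is recorded as an offset from head.
 */
struct toy_skb {
	unsigned int data;
	unsigned int meta_end;
};

/* Like skb_data_move(): headroom contents shift by len within the buffer. */
static void toy_data_move(struct toy_skb *skb, int len)
{
	skb->meta_end += len;
}

/* Like pskb_expand_head(): nhead bytes of new headroom are prepended, so
 * every offset measured from the (new) head grows by nhead.
 */
static void toy_expand_head(struct toy_skb *skb, unsigned int nhead)
{
	skb->data += nhead;
	skb->meta_end += nhead;
}
```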

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 include/linux/skbuff.h | 14 ++++++++++++--
 net/core/skbuff.c      | 10 ++--------
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 86737076101d..6dd09f55a975 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -595,15 +595,16 @@ struct skb_shared_info {
 	__u8		meta_len;
 	__u8		nr_frags;
 	__u8		tx_flags;
+	u16		meta_end;
 	unsigned short	gso_size;
 	/* Warning: this field is not always filled in (UFO)! */
 	unsigned short	gso_segs;
+	unsigned int	gso_type;
 	struct sk_buff	*frag_list;
 	union {
 		struct skb_shared_hwtstamps hwtstamps;
 		struct xsk_tx_metadata_compl xsk_meta;
 	};
-	unsigned int	gso_type;
 	u32		tskey;
 
 	/*
@@ -4499,7 +4500,7 @@ static inline u8 skb_metadata_len(const struct sk_buff *skb)
 
 static inline void *skb_metadata_end(const struct sk_buff *skb)
 {
-	return skb_mac_header(skb);
+	return skb->head + skb_shinfo(skb)->meta_end;
 }
 
 static inline bool __skb_metadata_differs(const struct sk_buff *skb_a,
@@ -4554,8 +4555,16 @@ static inline bool skb_metadata_differs(const struct sk_buff *skb_a,
 	       true : __skb_metadata_differs(skb_a, skb_b, len_a);
 }
 
+/**
+ * skb_metadata_set - Record packet metadata length and location.
+ * @skb: packet carrying the metadata
+ * @meta_len: number of bytes of metadata preceding skb->data
+ *
+ * Must be called when skb->data already points past the metadata area.
+ */
 static inline void skb_metadata_set(struct sk_buff *skb, u8 meta_len)
 {
+	skb_shinfo(skb)->meta_end = skb_headroom(skb);
 	skb_shinfo(skb)->meta_len = meta_len;
 }
 
@@ -4601,6 +4610,7 @@ static inline void skb_data_move(struct sk_buff *skb, const int len,
 	}
 
 	memmove(meta + len, meta, meta_len + n);
+	skb_shinfo(skb)->meta_end += len;
 	return;
 
 no_metadata:
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a00808f7be6a..afff023e70b1 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2325,6 +2325,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 #endif
 	skb->tail	      += off;
 	skb_headers_offset_update(skb, nhead);
+	skb_shinfo(skb)->meta_end += nhead;
 	skb->cloned   = 0;
 	skb->hdr_len  = 0;
 	skb->nohdr    = 0;
@@ -6238,8 +6239,7 @@ EXPORT_SYMBOL_GPL(skb_scrub_packet);
 
 static struct sk_buff *skb_reorder_vlan_header(struct sk_buff *skb)
 {
-	int mac_len, meta_len;
-	void *meta;
+	int mac_len;
 
 	if (skb_cow(skb, skb_headroom(skb)) < 0) {
 		kfree_skb(skb);
@@ -6252,12 +6252,6 @@ static struct sk_buff *skb_reorder_vlan_header(struct sk_buff *skb)
 			mac_len - VLAN_HLEN - ETH_TLEN);
 	}
 
-	meta_len = skb_metadata_len(skb);
-	if (meta_len) {
-		meta = skb_metadata_end(skb) - meta_len;
-		memmove(meta + VLAN_HLEN, meta, meta_len);
-	}
-
 	skb->mac_header += VLAN_HLEN;
 	return skb;
 }

-- 
2.43.0



* [PATCH bpf-next v2 11/16] bpf, verifier: Remove side effects from may_access_direct_pkt_data
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (9 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 10/16] net: Track skb metadata end separately from MAC offset Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 18:23   ` Eduard Zingerman
  2026-01-05 12:14 ` [PATCH bpf-next v2 12/16] bpf, verifier: Turn seen_direct_write flag into a bitmap Jakub Sitnicki
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

The may_access_direct_pkt_data() helper sets env->seen_direct_write as a
side effect, which creates awkward calling patterns:

- check_special_kfunc() has a comment warning readers about the side effect
- specialize_kfunc() must save and restore the flag around the call

Make the helper a pure function by moving the seen_direct_write flag
setting to call sites that need it.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 kernel/bpf/verifier.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9394b0de2ef0..52d76a848f65 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6151,13 +6151,9 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
 		if (meta)
 			return meta->pkt_access;
 
-		env->seen_direct_write = true;
 		return true;
 
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
-		if (t == BPF_WRITE)
-			env->seen_direct_write = true;
-
 		return true;
 
 	default:
@@ -7708,15 +7704,17 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 			err = check_stack_write(env, regno, off, size,
 						value_regno, insn_idx);
 	} else if (reg_is_pkt_pointer(reg)) {
-		if (t == BPF_WRITE && !may_access_direct_pkt_data(env, NULL, t)) {
-			verbose(env, "cannot write into packet\n");
-			return -EACCES;
-		}
-		if (t == BPF_WRITE && value_regno >= 0 &&
-		    is_pointer_value(env, value_regno)) {
-			verbose(env, "R%d leaks addr into packet\n",
-				value_regno);
-			return -EACCES;
+		if (t == BPF_WRITE) {
+			if (!may_access_direct_pkt_data(env, NULL, BPF_WRITE)) {
+				verbose(env, "cannot write into packet\n");
+				return -EACCES;
+			}
+			if (value_regno >= 0 && is_pointer_value(env, value_regno)) {
+				verbose(env, "R%d leaks addr into packet\n",
+					value_regno);
+				return -EACCES;
+			}
+			env->seen_direct_write = true;
 		}
 		err = check_packet_access(env, regno, off, size, false);
 		if (!err && t == BPF_READ && value_regno >= 0)
@@ -13883,11 +13881,11 @@ static int check_special_kfunc(struct bpf_verifier_env *env, struct bpf_kfunc_ca
 		if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_slice]) {
 			regs[BPF_REG_0].type |= MEM_RDONLY;
 		} else {
-			/* this will set env->seen_direct_write to true */
 			if (!may_access_direct_pkt_data(env, NULL, BPF_WRITE)) {
 				verbose(env, "the prog does not allow writes to packet data\n");
 				return -EINVAL;
 			}
+			env->seen_direct_write = true;
 		}
 
 		if (!meta->initialized_dynptr.id) {
@@ -22388,7 +22386,6 @@ static int fixup_call_args(struct bpf_verifier_env *env)
 static int specialize_kfunc(struct bpf_verifier_env *env, struct bpf_kfunc_desc *desc, int insn_idx)
 {
 	struct bpf_prog *prog = env->prog;
-	bool seen_direct_write;
 	void *xdp_kfunc;
 	bool is_rdonly;
 	u32 func_id = desc->func_id;
@@ -22404,16 +22401,10 @@ static int specialize_kfunc(struct bpf_verifier_env *env, struct bpf_kfunc_desc
 			addr = (unsigned long)xdp_kfunc;
 		/* fallback to default kfunc when not supported by netdev */
 	} else if (func_id == special_kfunc_list[KF_bpf_dynptr_from_skb]) {
-		seen_direct_write = env->seen_direct_write;
 		is_rdonly = !may_access_direct_pkt_data(env, NULL, BPF_WRITE);
 
 		if (is_rdonly)
 			addr = (unsigned long)bpf_dynptr_from_skb_rdonly;
-
-		/* restore env->seen_direct_write to its original value, since
-		 * may_access_direct_pkt_data mutates it
-		 */
-		env->seen_direct_write = seen_direct_write;
 	} else if (func_id == special_kfunc_list[KF_bpf_set_dentry_xattr]) {
 		if (bpf_lsm_has_d_inode_locked(prog))
 			addr = (unsigned long)bpf_set_dentry_xattr_locked;

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH bpf-next v2 12/16] bpf, verifier: Turn seen_direct_write flag into a bitmap
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (10 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 11/16] bpf, verifier: Remove side effects from may_access_direct_pkt_data Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 18:48   ` Eduard Zingerman
  2026-01-05 12:14 ` [PATCH bpf-next v2 13/16] bpf, verifier: Propagate packet access flags to gen_prologue Jakub Sitnicki
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Convert seen_direct_write from a boolean to a bitmap (seen_packet_access)
in preparation for tracking additional packet access patterns.

No functional change.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 include/linux/bpf_verifier.h |  6 +++++-
 kernel/bpf/verifier.c        | 11 ++++++-----
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 130bcbd66f60..c8397ae51880 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -647,6 +647,10 @@ enum priv_stack_mode {
 	PRIV_STACK_ADAPTIVE,
 };
 
+enum packet_access_flags {
+	PA_F_DIRECT_WRITE = BIT(0),
+};
+
 struct bpf_subprog_info {
 	/* 'start' has to be the first field otherwise find_subprog() won't work */
 	u32 start; /* insn idx of function entry point */
@@ -773,7 +777,7 @@ struct bpf_verifier_env {
 	bool bpf_capable;
 	bool bypass_spec_v1;
 	bool bypass_spec_v4;
-	bool seen_direct_write;
+	u8 seen_packet_access;	/* combination of enum packet_access_flags */
 	bool seen_exception;
 	struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
 	const struct bpf_line_info *prev_linfo;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 52d76a848f65..f6094fd3fd94 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7714,7 +7714,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 					value_regno);
 				return -EACCES;
 			}
-			env->seen_direct_write = true;
+			env->seen_packet_access |= PA_F_DIRECT_WRITE;
 		}
 		err = check_packet_access(env, regno, off, size, false);
 		if (!err && t == BPF_READ && value_regno >= 0)
@@ -13885,7 +13885,7 @@ static int check_special_kfunc(struct bpf_verifier_env *env, struct bpf_kfunc_ca
 				verbose(env, "the prog does not allow writes to packet data\n");
 				return -EINVAL;
 			}
-			env->seen_direct_write = true;
+			env->seen_packet_access |= PA_F_DIRECT_WRITE;
 		}
 
 		if (!meta->initialized_dynptr.id) {
@@ -21758,6 +21758,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 	struct bpf_prog *new_prog;
 	enum bpf_access_type type;
 	bool is_narrower_load;
+	bool seen_direct_write;
 	int epilogue_idx = 0;
 
 	if (ops->gen_epilogue) {
@@ -21785,13 +21786,13 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 		}
 	}
 
-	if (ops->gen_prologue || env->seen_direct_write) {
+	seen_direct_write = env->seen_packet_access & PA_F_DIRECT_WRITE;
+	if (ops->gen_prologue || seen_direct_write) {
 		if (!ops->gen_prologue) {
 			verifier_bug(env, "gen_prologue is null");
 			return -EFAULT;
 		}
-		cnt = ops->gen_prologue(insn_buf, env->seen_direct_write,
-					env->prog);
+		cnt = ops->gen_prologue(insn_buf, seen_direct_write, env->prog);
 		if (cnt >= INSN_BUF_SIZE) {
 			verifier_bug(env, "prologue is too long");
 			return -EFAULT;

-- 
2.43.0



* [PATCH bpf-next v2 13/16] bpf, verifier: Propagate packet access flags to gen_prologue
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (11 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 12/16] bpf, verifier: Turn seen_direct_write flag into a bitmap Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 18:49   ` Eduard Zingerman
  2026-01-05 12:14 ` [PATCH bpf-next v2 14/16] bpf, verifier: Track when data_meta pointer is loaded Jakub Sitnicki
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Change gen_prologue() to accept the packet access flags bitmap. This allows
gen_prologue() to inspect multiple access patterns when needed.

No functional change.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 include/linux/bpf.h                                  |  2 +-
 kernel/bpf/cgroup.c                                  |  2 +-
 kernel/bpf/verifier.c                                |  6 ++----
 net/core/filter.c                                    | 15 ++++++++-------
 net/sched/bpf_qdisc.c                                |  3 ++-
 tools/testing/selftests/bpf/test_kmods/bpf_testmod.c |  6 +++---
 6 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a63e47d2109c..e279a16bcccb 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1106,7 +1106,7 @@ struct bpf_verifier_ops {
 	bool (*is_valid_access)(int off, int size, enum bpf_access_type type,
 				const struct bpf_prog *prog,
 				struct bpf_insn_access_aux *info);
-	int (*gen_prologue)(struct bpf_insn *insn, bool direct_write,
+	int (*gen_prologue)(struct bpf_insn *insn, u32 pkt_access_flags,
 			    const struct bpf_prog *prog);
 	int (*gen_epilogue)(struct bpf_insn *insn, const struct bpf_prog *prog,
 			    s16 ctx_stack_off);
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 69988af44b37..d96465cd7d43 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -2694,7 +2694,7 @@ static u32 cg_sockopt_convert_ctx_access(enum bpf_access_type type,
 }
 
 static int cg_sockopt_get_prologue(struct bpf_insn *insn_buf,
-				   bool direct_write,
+				   u32 pkt_access_flags,
 				   const struct bpf_prog *prog)
 {
 	/* Nothing to do for sockopt argument. The data is kzalloc'ated.
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f6094fd3fd94..558c983f30e3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21758,7 +21758,6 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 	struct bpf_prog *new_prog;
 	enum bpf_access_type type;
 	bool is_narrower_load;
-	bool seen_direct_write;
 	int epilogue_idx = 0;
 
 	if (ops->gen_epilogue) {
@@ -21786,13 +21785,12 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 		}
 	}
 
-	seen_direct_write = env->seen_packet_access & PA_F_DIRECT_WRITE;
-	if (ops->gen_prologue || seen_direct_write) {
+	if (ops->gen_prologue || (env->seen_packet_access & PA_F_DIRECT_WRITE)) {
 		if (!ops->gen_prologue) {
 			verifier_bug(env, "gen_prologue is null");
 			return -EFAULT;
 		}
-		cnt = ops->gen_prologue(insn_buf, seen_direct_write, env->prog);
+		cnt = ops->gen_prologue(insn_buf, env->seen_packet_access, env->prog);
 		if (cnt >= INSN_BUF_SIZE) {
 			verifier_bug(env, "prologue is too long");
 			return -EFAULT;
diff --git a/net/core/filter.c b/net/core/filter.c
index d43df98e1ded..07af2a94cc9a 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -9052,7 +9052,7 @@ static bool sock_filter_is_valid_access(int off, int size,
 					       prog->expected_attach_type);
 }
 
-static int bpf_noop_prologue(struct bpf_insn *insn_buf, bool direct_write,
+static int bpf_noop_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
 			     const struct bpf_prog *prog)
 {
 	/* Neither direct read nor direct write requires any preliminary
@@ -9061,12 +9061,12 @@ static int bpf_noop_prologue(struct bpf_insn *insn_buf, bool direct_write,
 	return 0;
 }
 
-static int bpf_unclone_prologue(struct bpf_insn *insn_buf, bool direct_write,
+static int bpf_unclone_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
 				const struct bpf_prog *prog, int drop_verdict)
 {
 	struct bpf_insn *insn = insn_buf;
 
-	if (!direct_write)
+	if (!(pkt_access_flags & PA_F_DIRECT_WRITE))
 		return 0;
 
 	/* if (!skb->cloned)
@@ -9135,10 +9135,11 @@ static int bpf_gen_ld_abs(const struct bpf_insn *orig,
 	return insn - insn_buf;
 }
 
-static int tc_cls_act_prologue(struct bpf_insn *insn_buf, bool direct_write,
+static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
 			       const struct bpf_prog *prog)
 {
-	return bpf_unclone_prologue(insn_buf, direct_write, prog, TC_ACT_SHOT);
+	return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
+				    TC_ACT_SHOT);
 }
 
 static bool tc_cls_act_is_valid_access(int off, int size,
@@ -9476,10 +9477,10 @@ static bool sock_ops_is_valid_access(int off, int size,
 	return true;
 }
 
-static int sk_skb_prologue(struct bpf_insn *insn_buf, bool direct_write,
+static int sk_skb_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
 			   const struct bpf_prog *prog)
 {
-	return bpf_unclone_prologue(insn_buf, direct_write, prog, SK_DROP);
+	return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog, SK_DROP);
 }
 
 static bool sk_skb_is_valid_access(int off, int size,
diff --git a/net/sched/bpf_qdisc.c b/net/sched/bpf_qdisc.c
index b9771788b9b3..dc7b9db7e785 100644
--- a/net/sched/bpf_qdisc.c
+++ b/net/sched/bpf_qdisc.c
@@ -132,7 +132,8 @@ static int bpf_qdisc_btf_struct_access(struct bpf_verifier_log *log,
 
 BTF_ID_LIST_SINGLE(bpf_qdisc_init_prologue_ids, func, bpf_qdisc_init_prologue)
 
-static int bpf_qdisc_gen_prologue(struct bpf_insn *insn_buf, bool direct_write,
+static int bpf_qdisc_gen_prologue(struct bpf_insn *insn_buf,
+				  u32 pkt_access_flags,
 				  const struct bpf_prog *prog)
 {
 	struct bpf_insn *insn = insn_buf;
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index 1c41d03bd5a1..d698e45783e3 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -1397,7 +1397,7 @@ static int bpf_test_mod_st_ops__test_pro_epilogue(struct st_ops_args *args)
 static int bpf_cgroup_from_id_id;
 static int bpf_cgroup_release_id;
 
-static int st_ops_gen_prologue_with_kfunc(struct bpf_insn *insn_buf, bool direct_write,
+static int st_ops_gen_prologue_with_kfunc(struct bpf_insn *insn_buf,
 					  const struct bpf_prog *prog)
 {
 	struct bpf_insn *insn = insn_buf;
@@ -1473,7 +1473,7 @@ static int st_ops_gen_epilogue_with_kfunc(struct bpf_insn *insn_buf, const struc
 }
 
 #define KFUNC_PRO_EPI_PREFIX "test_kfunc_"
-static int st_ops_gen_prologue(struct bpf_insn *insn_buf, bool direct_write,
+static int st_ops_gen_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
 			       const struct bpf_prog *prog)
 {
 	struct bpf_insn *insn = insn_buf;
@@ -1483,7 +1483,7 @@ static int st_ops_gen_prologue(struct bpf_insn *insn_buf, bool direct_write,
 		return 0;
 
 	if (!strncmp(prog->aux->name, KFUNC_PRO_EPI_PREFIX, strlen(KFUNC_PRO_EPI_PREFIX)))
-		return st_ops_gen_prologue_with_kfunc(insn_buf, direct_write, prog);
+		return st_ops_gen_prologue_with_kfunc(insn_buf, prog);
 
 	/* r6 = r1[0]; // r6 will be "struct st_ops *args". r1 is "u64 *ctx".
 	 * r7 = r6->a;

-- 
2.43.0



* [PATCH bpf-next v2 14/16] bpf, verifier: Track when data_meta pointer is loaded
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (12 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 13/16] bpf, verifier: Propagate packet access flags to gen_prologue Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 18:48   ` Eduard Zingerman
  2026-01-05 12:14 ` [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta Jakub Sitnicki
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Introduce PA_F_DATA_META_LOAD flag to track when a BPF program loads the
skb->data_meta pointer.

This information will be used by gen_prologue() to handle cases where there
is a gap between metadata end and skb->data, requiring metadata to be
realigned.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 include/linux/bpf_verifier.h | 1 +
 kernel/bpf/verifier.c        | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index c8397ae51880..b32ddf0f0ab3 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -649,6 +649,7 @@ enum priv_stack_mode {
 
 enum packet_access_flags {
 	PA_F_DIRECT_WRITE = BIT(0),
+	PA_F_DATA_META_LOAD = BIT(1),
 };
 
 struct bpf_subprog_info {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 558c983f30e3..1ca5c5e895ee 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6226,6 +6226,10 @@ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off,
 		} else {
 			env->insn_aux_data[insn_idx].ctx_field_size = info->ctx_field_size;
 		}
+
+		if (base_type(info->reg_type) == PTR_TO_PACKET_META)
+			env->seen_packet_access |= PA_F_DATA_META_LOAD;
+
 		/* remember the offset of last byte accessed in ctx */
 		if (env->prog->aux->max_ctx_offset < off + size)
 			env->prog->aux->max_ctx_offset = off + size;

-- 
2.43.0



* [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (13 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 14/16] bpf, verifier: Track when data_meta pointer is loaded Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-05 19:14   ` Alexei Starovoitov
  2026-01-05 12:14 ` [PATCH bpf-next v2 16/16] selftests/bpf: Test skb metadata access after L2 decapsulation Jakub Sitnicki
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

After decoupling metadata location from the MAC header offset, a gap can
appear between metadata and skb->data on L2 decapsulation (e.g., VLAN,
GRE). This breaks the BPF data_meta pointer, which assumes metadata sits
immediately before skb->data.

Introduce bpf_skb_meta_realign() kfunc to close the gap by moving metadata
to immediately precede the MAC header. Inject a call to it in
tc_cls_act_prologue() when the verifier detects data_meta access
(PA_F_DATA_META_LOAD flag).

Update skb_data_move() to handle the gap case: on skb_push(), move metadata
to the top of the head buffer; on skb_pull(), where metadata is already
detached, leave it in place.

This restores data_meta functionality for TC programs while keeping the
performance benefit of avoiding memmove on L2 decapsulation for programs
that don't use data_meta.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 include/linux/skbuff.h | 25 +++++++++++++++--------
 net/core/filter.c      | 55 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 70 insertions(+), 10 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6dd09f55a975..0fc4df42826e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4600,19 +4600,28 @@ static inline void skb_data_move(struct sk_buff *skb, const int len,
 	if (!meta_len)
 		goto no_metadata;
 
-	meta_end = skb_metadata_end(skb);
-	meta = meta_end - meta_len;
-
-	if (WARN_ON_ONCE(meta_end + len != skb->data ||
-			 meta_len > skb_headroom(skb))) {
+	/* Not enough headroom left for metadata. Drop it. */
+	if (WARN_ONCE(meta_len > skb_headroom(skb),
+		      "skb headroom smaller than metadata")) {
 		skb_metadata_clear(skb);
 		goto no_metadata;
 	}
 
-	memmove(meta + len, meta, meta_len + n);
-	skb_shinfo(skb)->meta_end += len;
-	return;
+	meta_end = skb_metadata_end(skb);
+	meta = meta_end - meta_len;
 
+	/* Metadata in front of data before push/pull. Keep it that way. */
+	if (meta_end == skb->data - len) {
+		memmove(meta + len, meta, meta_len + n);
+		skb_shinfo(skb)->meta_end += len;
+		return;
+	}
+
+	if (len < 0) {
+		/* Data pushed. Move metadata to the top. */
+		memmove(skb->head, meta, meta_len);
+		skb_shinfo(skb)->meta_end = meta_len;
+	}
 no_metadata:
 	memmove(skb->data, skb->data - len, n);
 }
diff --git a/net/core/filter.c b/net/core/filter.c
index 07af2a94cc9a..7f5bc6a505e1 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -9135,11 +9135,62 @@ static int bpf_gen_ld_abs(const struct bpf_insn *orig,
 	return insn - insn_buf;
 }
 
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
+{
+	struct sk_buff *skb = (struct sk_buff *)skb_;
+	u8 *meta_end = skb_metadata_end(skb);
+	u8 meta_len = skb_metadata_len(skb);
+	u8 *meta;
+	int gap;
+
+	gap = skb_mac_header(skb) - meta_end;
+	if (!meta_len || !gap)
+		return;
+
+	if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
+		skb_metadata_clear(skb);
+		return;
+	}
+
+	meta = meta_end - meta_len;
+	memmove(meta + gap, meta, meta_len);
+	skb_shinfo(skb)->meta_end += gap;
+
+	bpf_compute_data_pointers(skb);
+}
+
+__bpf_kfunc_end_defs();
+
+BTF_KFUNCS_START(tc_cls_act_hidden_ids)
+BTF_ID_FLAGS(func, bpf_skb_meta_realign)
+BTF_KFUNCS_END(tc_cls_act_hidden_ids)
+
+BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
+
 static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
 			       const struct bpf_prog *prog)
 {
-	return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
-				    TC_ACT_SHOT);
+	struct bpf_insn *insn = insn_buf;
+	int cnt;
+
+	if (pkt_access_flags & PA_F_DATA_META_LOAD) {
+		/* Realign skb metadata for access through data_meta pointer.
+		 *
+		 * r6 = r1; // r6 will be "u64 *ctx"
+		 * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
+		 * r1 = r6;
+		 */
+		*insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
+		*insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
+		*insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
+	}
+	cnt = bpf_unclone_prologue(insn, pkt_access_flags, prog, TC_ACT_SHOT);
+	if (!cnt && insn > insn_buf)
+		*insn++ = prog->insnsi[0];
+
+	return cnt + insn - insn_buf;
 }
 
 static bool tc_cls_act_is_valid_access(int off, int size,

-- 
2.43.0



* [PATCH bpf-next v2 16/16] selftests/bpf: Test skb metadata access after L2 decapsulation
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (14 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta Jakub Sitnicki
@ 2026-01-05 12:14 ` Jakub Sitnicki
  2026-01-10 21:07 ` [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
  2026-01-14 11:52 ` Jakub Sitnicki
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 12:14 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

Add tests to verify that XDP/skb metadata remains accessible after L2
decapsulation now that metadata tracking has been decoupled from the MAC
header offset.

Use a 2-netns setup to test each L2 decapsulation path that resets the MAC
header offset: GRE (IPv4/IPv6), VXLAN, GENEVE, L2TPv3, VLAN, QinQ, and
MPLS. SRv6 and NSH are not tested due to setup complexity (SRv6 requires 4
namespaces according to selftests/net/srv6_hl2encap_red_l2vpn_test.sh, NSH
requires Open vSwitch).

For each encapsulation type, test three access scenarios after L2 decap:
- direct skb->data_meta pointer access
- dynptr-based metadata access (bpf_dynptr_from_skb_meta)
- metadata access after move or skb head realloc (bpf_skb_adjust_room)

Change test_xdp_meta.c to identify test packets by payload content instead
of source MAC address, since L2 encapsulation pushes its own MAC header.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
---
 tools/testing/selftests/bpf/config                 |   6 +-
 .../bpf/prog_tests/xdp_context_test_run.c          | 292 +++++++++++++++++++++
 tools/testing/selftests/bpf/progs/test_xdp_meta.c  |  48 ++--
 3 files changed, 325 insertions(+), 21 deletions(-)

diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 558839e3c185..f867b7eff085 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -130,4 +130,8 @@ CONFIG_INFINIBAND=y
 CONFIG_SMC=y
 CONFIG_SMC_HS_CTRL_BPF=y
 CONFIG_DIBS=y
-CONFIG_DIBS_LO=y
\ No newline at end of file
+CONFIG_DIBS_LO=y
+CONFIG_L2TP=y
+CONFIG_L2TP_V3=y
+CONFIG_L2TP_IP=y
+CONFIG_L2TP_ETH=y
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
index ee94c281888a..dc7f936216db 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
@@ -8,9 +8,14 @@
 #define TX_NAME "veth1"
 #define TX_NETNS "xdp_context_tx"
 #define RX_NETNS "xdp_context_rx"
+#define RX_MAC "02:00:00:00:00:01"
+#define TX_MAC "02:00:00:00:00:02"
 #define TAP_NAME "tap0"
 #define DUMMY_NAME "dum0"
 #define TAP_NETNS "xdp_context_tuntap"
+#define ENCAP_DEV "encap"
+#define DECAP_RX_NETNS "xdp_context_decap_rx"
+#define DECAP_TX_NETNS "xdp_context_decap_tx"
 
 #define TEST_PAYLOAD_LEN 32
 static const __u8 test_payload[TEST_PAYLOAD_LEN] = {
@@ -127,6 +132,7 @@ static int send_test_packet(int ifindex)
 	/* We use the Ethernet header only to identify the test packet */
 	struct ethhdr eth = {
 		.h_source = { 0x12, 0x34, 0xDE, 0xAD, 0xBE, 0xEF },
+		.h_proto = htons(ETH_P_IP),
 	};
 
 	memcpy(packet, &eth, sizeof(eth));
@@ -155,6 +161,34 @@ static int send_test_packet(int ifindex)
 	return -1;
 }
 
+static int send_routed_packet(int af, const char *ip)
+{
+	struct sockaddr_storage addr;
+	socklen_t alen;
+	int r, sock = -1;
+
+	r = make_sockaddr(af, ip, 42, &addr, &alen);
+	if (!ASSERT_OK(r, "make_sockaddr"))
+		goto err;
+
+	sock = socket(af, SOCK_DGRAM, 0);
+	if (!ASSERT_OK_FD(sock, "socket"))
+		goto err;
+
+	r = sendto(sock, test_payload, sizeof(test_payload), 0,
+		   (struct sockaddr *)&addr, alen);
+	if (!ASSERT_EQ(r, sizeof(test_payload), "sendto"))
+		goto err;
+
+	close(sock);
+	return 0;
+
+err:
+	if (sock >= 0)
+		close(sock);
+	return -1;
+}
+
 static int write_test_packet(int tap_fd)
 {
 	__u8 packet[sizeof(struct ethhdr) + TEST_PAYLOAD_LEN];
@@ -510,3 +544,261 @@ void test_xdp_context_tuntap(void)
 
 	test_xdp_meta__destroy(skel);
 }
+
+enum l2_encap_type {
+	GRE4_ENCAP,
+	GRE6_ENCAP,
+	VXLAN_ENCAP,
+	GENEVE_ENCAP,
+	L2TPV3_ENCAP,
+	VLAN_ENCAP,
+	QINQ_ENCAP,
+	MPLS_ENCAP,
+};
+
+static bool l2_encap_uses_ipv6(enum l2_encap_type encap_type)
+{
+	return encap_type == GRE6_ENCAP;
+}
+
+static bool l2_encap_uses_routing(enum l2_encap_type encap_type)
+{
+	return encap_type == MPLS_ENCAP;
+}
+
+static bool setup_l2_encap_dev(enum l2_encap_type encap_type,
+			       const char *encap_dev, const char *lower_dev,
+			       const char *net_prefix, const char *local_addr,
+			       const char *remote_addr)
+{
+	switch (encap_type) {
+	case GRE4_ENCAP:
+		SYS(fail, "ip link add %s type gretap local %s remote %s",
+		    encap_dev, local_addr, remote_addr);
+		return true;
+
+	case GRE6_ENCAP:
+		SYS(fail, "ip link add %s type ip6gretap local %s remote %s",
+		    encap_dev, local_addr, remote_addr);
+		return true;
+
+	case VXLAN_ENCAP:
+		SYS(fail,
+		    "ip link add %s type vxlan id 42 local %s remote %s dstport 4789",
+		    encap_dev, local_addr, remote_addr);
+		return true;
+
+	case GENEVE_ENCAP:
+		SYS(fail,
+		    "ip link add %s type geneve id 42 remote %s",
+		    encap_dev, remote_addr);
+		return true;
+
+	case L2TPV3_ENCAP:
+		SYS(fail,
+		    "ip l2tp add tunnel tunnel_id 42 peer_tunnel_id 42 encap ip local %s remote %s",
+		    local_addr, remote_addr);
+		SYS(fail,
+		    "ip l2tp add session name %s tunnel_id 42 session_id 42 peer_session_id 42",
+		    encap_dev);
+		return true;
+
+	case VLAN_ENCAP:
+		SYS(fail, "ip link set dev %s down", lower_dev);
+		SYS(fail, "ethtool -K %s rx-vlan-hw-parse off", lower_dev);
+		SYS(fail, "ethtool -K %s tx-vlan-hw-insert off", lower_dev);
+		SYS(fail, "ip link set dev %s up", lower_dev);
+		SYS(fail, "ip link add %s link %s type vlan id 42", encap_dev,
+		    lower_dev);
+		return true;
+
+	case QINQ_ENCAP:
+		SYS(fail, "ip link set dev %s down", lower_dev);
+		SYS(fail, "ethtool -K %s rx-vlan-hw-parse off", lower_dev);
+		SYS(fail, "ethtool -K %s tx-vlan-hw-insert off", lower_dev);
+		SYS(fail, "ethtool -K %s rx-vlan-stag-hw-parse off", lower_dev);
+		SYS(fail, "ethtool -K %s tx-vlan-stag-hw-insert off", lower_dev);
+		SYS(fail, "ip link set dev %s up", lower_dev);
+		SYS(fail, "ip link add vlan.100 link %s type vlan proto 802.1ad id 100", lower_dev);
+		SYS(fail, "ip link set dev vlan.100 up");
+		SYS(fail, "ip link add %s link vlan.100 type vlan id 42", encap_dev);
+		return true;
+
+	case MPLS_ENCAP:
+		SYS(fail, "sysctl -wq net.mpls.platform_labels=65535");
+		SYS(fail, "sysctl -wq net.mpls.conf.%s.input=1", lower_dev);
+		SYS(fail, "ip route change %s encap mpls 42 via %s", net_prefix, remote_addr);
+		SYS(fail, "ip -f mpls route add 42 dev lo");
+		SYS(fail, "ip link set dev lo name %s", encap_dev);
+
+		return true;
+	}
+fail:
+	return false;
+}
+
+static void test_l2_decap(enum l2_encap_type encap_type,
+			  struct bpf_program *xdp_prog,
+			  struct bpf_program *tc_prog, bool *test_pass)
+{
+	LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS);
+	LIBBPF_OPTS(bpf_tc_opts, tc_opts, .handle = 1, .priority = 1);
+	const char *net, *rx_ip, *tx_ip, *addr_opts;
+	int af, plen;
+	struct netns_obj *rx_ns = NULL, *tx_ns = NULL;
+	struct nstoken *nstoken = NULL;
+	int lower_ifindex, upper_ifindex;
+	int ret;
+
+	if (l2_encap_uses_ipv6(encap_type)) {
+		af = AF_INET6;
+		net = "fd00::/64";
+		rx_ip = "fd00::1";
+		tx_ip = "fd00::2";
+		plen = 64;
+		addr_opts = "nodad";
+	} else {
+		af = AF_INET;
+		net = "192.0.2.0/24";
+		rx_ip = "192.0.2.1";
+		tx_ip = "192.0.2.2";
+		plen = 24;
+		addr_opts = "";
+	}
+
+	*test_pass = false;
+
+	rx_ns = netns_new(DECAP_RX_NETNS, false);
+	if (!ASSERT_OK_PTR(rx_ns, "create rx_ns"))
+		return;
+
+	tx_ns = netns_new(DECAP_TX_NETNS, false);
+	if (!ASSERT_OK_PTR(tx_ns, "create tx_ns"))
+		goto close;
+
+	SYS(close, "ip link add " RX_NAME " address " RX_MAC " netns " DECAP_RX_NETNS
+		   " type veth peer name " TX_NAME " address " TX_MAC " netns " DECAP_TX_NETNS);
+
+	nstoken = open_netns(DECAP_RX_NETNS);
+	if (!ASSERT_OK_PTR(nstoken, "setns rx_ns"))
+		goto close;
+
+	SYS(close, "ip addr add %s/%u dev %s %s", rx_ip, plen, RX_NAME, addr_opts);
+	SYS(close, "ip link set dev %s up", RX_NAME);
+
+	if (!setup_l2_encap_dev(encap_type, ENCAP_DEV, RX_NAME, net, rx_ip, tx_ip))
+		goto close;
+	SYS(close, "ip link set dev %s up", ENCAP_DEV);
+
+	lower_ifindex = if_nametoindex(RX_NAME);
+	if (!ASSERT_GT(lower_ifindex, 0, "if_nametoindex lower"))
+		goto close;
+
+	upper_ifindex = if_nametoindex(ENCAP_DEV);
+	if (!ASSERT_GT(upper_ifindex, 0, "if_nametoindex upper"))
+		goto close;
+
+	ret = bpf_xdp_attach(lower_ifindex, bpf_program__fd(xdp_prog), 0, NULL);
+	if (!ASSERT_GE(ret, 0, "bpf_xdp_attach"))
+		goto close;
+
+	tc_hook.ifindex = upper_ifindex;
+	ret = bpf_tc_hook_create(&tc_hook);
+	if (!ASSERT_OK(ret, "bpf_tc_hook_create"))
+		goto close;
+
+	tc_opts.prog_fd = bpf_program__fd(tc_prog);
+	ret = bpf_tc_attach(&tc_hook, &tc_opts);
+	if (!ASSERT_OK(ret, "bpf_tc_attach"))
+		goto close;
+
+	close_netns(nstoken);
+
+	nstoken = open_netns(DECAP_TX_NETNS);
+	if (!ASSERT_OK_PTR(nstoken, "setns tx_ns"))
+		goto close;
+
+	SYS(close, "ip addr add %s/%u dev %s %s", tx_ip, plen, TX_NAME, addr_opts);
+	SYS(close, "ip neigh add %s lladdr %s nud permanent dev %s", rx_ip, RX_MAC, TX_NAME);
+	SYS(close, "ip link set dev %s up", TX_NAME);
+
+	if (!setup_l2_encap_dev(encap_type, ENCAP_DEV, TX_NAME, net, tx_ip, rx_ip))
+		goto close;
+	SYS(close, "ip link set dev %s up", ENCAP_DEV);
+
+	upper_ifindex = if_nametoindex(ENCAP_DEV);
+	if (!ASSERT_GT(upper_ifindex, 0, "if_nametoindex upper"))
+		goto close;
+
+	if (l2_encap_uses_routing(encap_type))
+		ret = send_routed_packet(af, rx_ip);
+	else
+		ret = send_test_packet(upper_ifindex);
+	if (!ASSERT_OK(ret, "send packet"))
+		goto close;
+
+	if (!ASSERT_TRUE(*test_pass, "test_pass"))
+		dump_err_stream(tc_prog);
+
+close:
+	close_netns(nstoken);
+	netns_free(rx_ns);
+	netns_free(tx_ns);
+}
+
+__printf(1, 2) static bool start_subtest(const char *fmt, ...)
+{
+	char *subtest_name;
+	va_list ap;
+	int r;
+
+	va_start(ap, fmt);
+	r = vasprintf(&subtest_name, fmt, ap);
+	va_end(ap);
+	if (!ASSERT_GE(r, 0, "format string"))
+		return false;
+
+	r = test__start_subtest(subtest_name);
+	free(subtest_name);
+	return r;
+}
+
+void test_xdp_context_l2_decap(void)
+{
+	const struct test {
+		enum l2_encap_type encap_type;
+		const char *encap_name;
+	} tests[] = {
+		{ GRE4_ENCAP, "gre4" },
+		{ GRE6_ENCAP, "gre6" },
+		{ VXLAN_ENCAP, "vxlan" },
+		{ GENEVE_ENCAP, "geneve" },
+		{ L2TPV3_ENCAP, "l2tpv3" },
+		{ VLAN_ENCAP, "vlan" },
+		{ QINQ_ENCAP, "qinq" },
+		{ MPLS_ENCAP, "mpls" },
+	};
+	struct test_xdp_meta *skel;
+	const struct test *t;
+
+	skel = test_xdp_meta__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open and load skeleton"))
+		return;
+
+	for (t = tests; t < tests + ARRAY_SIZE(tests); t++) {
+		if (start_subtest("%s_direct_access", t->encap_name))
+			test_l2_decap(t->encap_type, skel->progs.ing_xdp,
+				      skel->progs.ing_cls,
+				      &skel->bss->test_pass);
+		if (start_subtest("%s_dynptr_read", t->encap_name))
+			test_l2_decap(t->encap_type, skel->progs.ing_xdp,
+				      skel->progs.ing_cls_dynptr_read,
+				      &skel->bss->test_pass);
+		if (start_subtest("%s_helper_adjust_room", t->encap_name))
+			test_l2_decap(t->encap_type, skel->progs.ing_xdp,
+				      skel->progs.helper_skb_adjust_room,
+				      &skel->bss->test_pass);
+	}
+
+	test_xdp_meta__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_meta.c b/tools/testing/selftests/bpf/progs/test_xdp_meta.c
index 0a0f371a2dec..69130e250e84 100644
--- a/tools/testing/selftests/bpf/progs/test_xdp_meta.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_meta.c
@@ -280,18 +280,35 @@ int ing_cls_dynptr_offset_oob(struct __sk_buff *ctx)
 	return TC_ACT_SHOT;
 }
 
+/* Test packets carry test metadata pattern as payload. */
+static bool is_test_packet(struct xdp_md *ctx)
+{
+	__u8 meta_have[META_SIZE];
+	__u32 len;
+	int ret;
+
+	len = bpf_xdp_get_buff_len(ctx);
+	if (len < META_SIZE)
+		return false;
+	ret = bpf_xdp_load_bytes(ctx, len - META_SIZE, meta_have, META_SIZE);
+	if (ret)
+		return false;
+	ret = __builtin_memcmp(meta_have, meta_want, META_SIZE);
+	if (ret)
+		return false;
+
+	return true;
+}
+
 /* Reserve and clear space for metadata but don't populate it */
 SEC("xdp")
 int ing_xdp_zalloc_meta(struct xdp_md *ctx)
 {
-	struct ethhdr *eth = ctx_ptr(ctx, data);
 	__u8 *meta;
 	int ret;
 
 	/* Drop any non-test packets */
-	if (eth + 1 > ctx_ptr(ctx, data_end))
-		return XDP_DROP;
-	if (!check_smac(eth))
+	if (!is_test_packet(ctx))
 		return XDP_DROP;
 
 	ret = bpf_xdp_adjust_meta(ctx, -META_SIZE);
@@ -310,33 +327,24 @@ int ing_xdp_zalloc_meta(struct xdp_md *ctx)
 SEC("xdp")
 int ing_xdp(struct xdp_md *ctx)
 {
-	__u8 *data, *data_meta, *data_end, *payload;
-	struct ethhdr *eth;
+	__u8 *data, *data_meta;
 	int ret;
 
+	/* Drop any non-test packets */
+	if (!is_test_packet(ctx))
+		return XDP_DROP;
+
 	ret = bpf_xdp_adjust_meta(ctx, -META_SIZE);
 	if (ret < 0)
 		return XDP_DROP;
 
 	data_meta = ctx_ptr(ctx, data_meta);
-	data_end  = ctx_ptr(ctx, data_end);
 	data      = ctx_ptr(ctx, data);
 
-	eth = (struct ethhdr *)data;
-	payload = data + sizeof(struct ethhdr);
-
-	if (payload + META_SIZE > data_end ||
-	    data_meta + META_SIZE > data)
-		return XDP_DROP;
-
-	/* The Linux networking stack may send other packets on the test
-	 * interface that interfere with the test. Just drop them.
-	 * The test packets can be recognized by their source MAC address.
-	 */
-	if (!check_smac(eth))
+	if (data_meta + META_SIZE > data)
 		return XDP_DROP;
 
-	__builtin_memcpy(data_meta, payload, META_SIZE);
+	__builtin_memcpy(data_meta, meta_want, META_SIZE);
 	return XDP_PASS;
 }
 

-- 
2.43.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 11/16] bpf, verifier: Remove side effects from may_access_direct_pkt_data
  2026-01-05 12:14 ` [PATCH bpf-next v2 11/16] bpf, verifier: Remove side effects from may_access_direct_pkt_data Jakub Sitnicki
@ 2026-01-05 18:23   ` Eduard Zingerman
  0 siblings, 0 replies; 37+ messages in thread
From: Eduard Zingerman @ 2026-01-05 18:23 UTC (permalink / raw)
  To: Jakub Sitnicki, bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, 2026-01-05 at 13:14 +0100, Jakub Sitnicki wrote:
> The may_access_direct_pkt_data() helper sets env->seen_direct_write as a
> side effect, which creates awkward calling patterns:
> 
> - check_special_kfunc() has a comment warning readers about the side effect
> - specialize_kfunc() must save and restore the flag around the call
> 
> Make the helper a pure function by moving the seen_direct_write flag
> setting to call sites that need it.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

>  kernel/bpf/verifier.c | 33 ++++++++++++---------------------
>  1 file changed, 12 insertions(+), 21 deletions(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 9394b0de2ef0..52d76a848f65 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -6151,13 +6151,9 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
>  		if (meta)
>  			return meta->pkt_access;
>  
> -		env->seen_direct_write = true;

Note to reviewers:
the call to may_access_direct_pkt_data() in check_func_arg() always
has a non-NULL 'meta', so it is correct to skip setting
'env->seen_direct_write' there, behavior does not change.

>  		return true;
>  
>  	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
> -		if (t == BPF_WRITE)
> -			env->seen_direct_write = true;
> -
>  		return true;
>  
>  	default:

[...]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 14/16] bpf, verifier: Track when data_meta pointer is loaded
  2026-01-05 12:14 ` [PATCH bpf-next v2 14/16] bpf, verifier: Track when data_meta pointer is loaded Jakub Sitnicki
@ 2026-01-05 18:48   ` Eduard Zingerman
  0 siblings, 0 replies; 37+ messages in thread
From: Eduard Zingerman @ 2026-01-05 18:48 UTC (permalink / raw)
  To: Jakub Sitnicki, bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, 2026-01-05 at 13:14 +0100, Jakub Sitnicki wrote:
> Introduce PA_F_DATA_META_LOAD flag to track when a BPF program loads the
> skb->data_meta pointer.
> 
> This information will be used by gen_prologue() to handle cases where there
> is a gap between metadata end and skb->data, requiring metadata to be
> realigned.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 12/16] bpf, verifier: Turn seen_direct_write flag into a bitmap
  2026-01-05 12:14 ` [PATCH bpf-next v2 12/16] bpf, verifier: Turn seen_direct_write flag into a bitmap Jakub Sitnicki
@ 2026-01-05 18:48   ` Eduard Zingerman
  0 siblings, 0 replies; 37+ messages in thread
From: Eduard Zingerman @ 2026-01-05 18:48 UTC (permalink / raw)
  To: Jakub Sitnicki, bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, 2026-01-05 at 13:14 +0100, Jakub Sitnicki wrote:
> Convert seen_direct_write from a boolean to a bitmap (seen_packet_access)
> in preparation for tracking additional packet access patterns.
> 
> No functional change.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 13/16] bpf, verifier: Propagate packet access flags to gen_prologue
  2026-01-05 12:14 ` [PATCH bpf-next v2 13/16] bpf, verifier: Propagate packet access flags to gen_prologue Jakub Sitnicki
@ 2026-01-05 18:49   ` Eduard Zingerman
  0 siblings, 0 replies; 37+ messages in thread
From: Eduard Zingerman @ 2026-01-05 18:49 UTC (permalink / raw)
  To: Jakub Sitnicki, bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, 2026-01-05 at 13:14 +0100, Jakub Sitnicki wrote:
> Change gen_prologue() to accept the packet access flags bitmap. This allows
> gen_prologue() to inspect multiple access patterns when needed.
> 
> No functional change.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---

Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 12:14 ` [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta Jakub Sitnicki
@ 2026-01-05 19:14   ` Alexei Starovoitov
  2026-01-05 19:20     ` Jakub Sitnicki
  2026-01-05 19:42     ` Amery Hung
  0 siblings, 2 replies; 37+ messages in thread
From: Alexei Starovoitov @ 2026-01-05 19:14 UTC (permalink / raw)
  To: Jakub Sitnicki, Amery Hung, Martin KaFai Lau
  Cc: bpf, Network Development, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

On Mon, Jan 5, 2026 at 4:15 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
>
> +__bpf_kfunc_start_defs();
> +
> +__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
> +{
> +       struct sk_buff *skb = (typeof(skb))skb_;
> +       u8 *meta_end = skb_metadata_end(skb);
> +       u8 meta_len = skb_metadata_len(skb);
> +       u8 *meta;
> +       int gap;
> +
> +       gap = skb_mac_header(skb) - meta_end;
> +       if (!meta_len || !gap)
> +               return;
> +
> +       if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
> +               skb_metadata_clear(skb);
> +               return;
> +       }
> +
> +       meta = meta_end - meta_len;
> +       memmove(meta + gap, meta, meta_len);
> +       skb_shinfo(skb)->meta_end += gap;
> +
> +       bpf_compute_data_pointers(skb);
> +}
> +
> +__bpf_kfunc_end_defs();
> +
> +BTF_KFUNCS_START(tc_cls_act_hidden_ids)
> +BTF_ID_FLAGS(func, bpf_skb_meta_realign)
> +BTF_KFUNCS_END(tc_cls_act_hidden_ids)
> +
> +BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
> +
>  static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
>                                const struct bpf_prog *prog)
>  {
> -       return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
> -                                   TC_ACT_SHOT);
> +       struct bpf_insn *insn = insn_buf;
> +       int cnt;
> +
> +       if (pkt_access_flags & PA_F_DATA_META_LOAD) {
> +               /* Realign skb metadata for access through data_meta pointer.
> +                *
> +                * r6 = r1; // r6 will be "u64 *ctx"
> +                * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
> +                * r1 = r6;
> +                */
> +               *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
> +               *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
> +               *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
> +       }

I see that we already did this hack with bpf_qdisc_init_prologue()
and bpf_qdisc_reset_destroy_epilogue().
Not sure why we went that route back then.

imo much cleaner to do BPF_EMIT_CALL() and wrap
BPF_CALL_1(bpf_skb_meta_realign, struct sk_buff *, skb)

BPF_CALL_x doesn't make it an uapi helper.
It's still a hidden kernel function,
while this kfunc stuff looks wrong, since kfunc isn't really hidden.

I suspect progs can call this bpf_skb_meta_realign() explicitly,
just like they can call bpf_qdisc_init_prologue() ?

cc Amery, Martin.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 19:14   ` Alexei Starovoitov
@ 2026-01-05 19:20     ` Jakub Sitnicki
  2026-01-05 19:42     ` Amery Hung
  1 sibling, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-05 19:20 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Amery Hung, Martin KaFai Lau, bpf, Network Development,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Simon Horman, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, Jan 05, 2026 at 11:14 AM -08, Alexei Starovoitov wrote:
> On Mon, Jan 5, 2026 at 4:15 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>
>>
>> +__bpf_kfunc_start_defs();
>> +
>> +__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
>> +{
>> +       struct sk_buff *skb = (typeof(skb))skb_;
>> +       u8 *meta_end = skb_metadata_end(skb);
>> +       u8 meta_len = skb_metadata_len(skb);
>> +       u8 *meta;
>> +       int gap;
>> +
>> +       gap = skb_mac_header(skb) - meta_end;
>> +       if (!meta_len || !gap)
>> +               return;
>> +
>> +       if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
>> +               skb_metadata_clear(skb);
>> +               return;
>> +       }
>> +
>> +       meta = meta_end - meta_len;
>> +       memmove(meta + gap, meta, meta_len);
>> +       skb_shinfo(skb)->meta_end += gap;
>> +
>> +       bpf_compute_data_pointers(skb);
>> +}
>> +
>> +__bpf_kfunc_end_defs();
>> +
>> +BTF_KFUNCS_START(tc_cls_act_hidden_ids)
>> +BTF_ID_FLAGS(func, bpf_skb_meta_realign)
>> +BTF_KFUNCS_END(tc_cls_act_hidden_ids)
>> +
>> +BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
>> +
>>  static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
>>                                const struct bpf_prog *prog)
>>  {
>> -       return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
>> -                                   TC_ACT_SHOT);
>> +       struct bpf_insn *insn = insn_buf;
>> +       int cnt;
>> +
>> +       if (pkt_access_flags & PA_F_DATA_META_LOAD) {
>> +               /* Realign skb metadata for access through data_meta pointer.
>> +                *
>> +                * r6 = r1; // r6 will be "u64 *ctx"
>> +                * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
>> +                * r1 = r6;
>> +                */
>> +               *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
>> +               *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
>> +               *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
>> +       }
>
> I see that we already did this hack with bpf_qdisc_init_prologue()
> and bpf_qdisc_reset_destroy_epilogue().
> Not sure why we went that route back then.

Right. That's where I "stole" it from.

> imo much cleaner to do BPF_EMIT_CALL() and wrap
> BPF_CALL_1(bpf_skb_meta_realign, struct sk_buff *, skb)
>
> BPF_CALL_x doesn't make it an uapi helper.
> It's still a hidden kernel function,
> while this kfunc stuff looks wrong, since kfunc isn't really hidden.
>
> I suspect progs can call this bpf_skb_meta_realign() explicitly,
> just like they can call bpf_qdisc_init_prologue() ?

Thanks for pointing me in the right direction. Will rework this.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 19:14   ` Alexei Starovoitov
  2026-01-05 19:20     ` Jakub Sitnicki
@ 2026-01-05 19:42     ` Amery Hung
  2026-01-05 20:02       ` Alexei Starovoitov
  2026-01-05 20:54       ` Martin KaFai Lau
  1 sibling, 2 replies; 37+ messages in thread
From: Amery Hung @ 2026-01-05 19:42 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jakub Sitnicki, Martin KaFai Lau, bpf, Network Development,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Simon Horman, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, Jan 5, 2026 at 11:14 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Jan 5, 2026 at 4:15 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >
> >
> > +__bpf_kfunc_start_defs();
> > +
> > +__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
> > +{
> > +       struct sk_buff *skb = (typeof(skb))skb_;
> > +       u8 *meta_end = skb_metadata_end(skb);
> > +       u8 meta_len = skb_metadata_len(skb);
> > +       u8 *meta;
> > +       int gap;
> > +
> > +       gap = skb_mac_header(skb) - meta_end;
> > +       if (!meta_len || !gap)
> > +               return;
> > +
> > +       if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
> > +               skb_metadata_clear(skb);
> > +               return;
> > +       }
> > +
> > +       meta = meta_end - meta_len;
> > +       memmove(meta + gap, meta, meta_len);
> > +       skb_shinfo(skb)->meta_end += gap;
> > +
> > +       bpf_compute_data_pointers(skb);
> > +}
> > +
> > +__bpf_kfunc_end_defs();
> > +
> > +BTF_KFUNCS_START(tc_cls_act_hidden_ids)
> > +BTF_ID_FLAGS(func, bpf_skb_meta_realign)
> > +BTF_KFUNCS_END(tc_cls_act_hidden_ids)
> > +
> > +BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
> > +
> >  static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
> >                                const struct bpf_prog *prog)
> >  {
> > -       return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
> > -                                   TC_ACT_SHOT);
> > +       struct bpf_insn *insn = insn_buf;
> > +       int cnt;
> > +
> > +       if (pkt_access_flags & PA_F_DATA_META_LOAD) {
> > +               /* Realign skb metadata for access through data_meta pointer.
> > +                *
> > +                * r6 = r1; // r6 will be "u64 *ctx"
> > +                * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
> > +                * r1 = r6;
> > +                */
> > +               *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
> > +               *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
> > +               *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
> > +       }
>
> I see that we already did this hack with bpf_qdisc_init_prologue()
> and bpf_qdisc_reset_destroy_epilogue().
> Not sure why we went that route back then.
>
> imo much cleaner to do BPF_EMIT_CALL() and wrap
> BPF_CALL_1(bpf_skb_meta_realign, struct sk_buff *, skb)
>
> BPF_CALL_x doesn't make it an uapi helper.
> It's still a hidden kernel function,
> while this kfunc stuff looks wrong, since kfunc isn't really hidden.
>
> I suspect progs can call this bpf_skb_meta_realign() explicitly,
> just like they can call bpf_qdisc_init_prologue() ?
>

The qdisc prologue and epilogue kfuncs should be hidden from users.
The kfunc filter, bpf_qdisc_kfunc_filter(), determines which kfuncs are
actually exposed.

BPF_CALL_x is simpler as there is no need for a kfunc filter to hide
it. However, IMO for qdisc they don't make too much difference since
bpf qdisc already needs the filter to limit the .enqueue- and
.dequeue-specific kfuncs.

Am I missing anything?

> cc Amery, Martin.
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 19:42     ` Amery Hung
@ 2026-01-05 20:02       ` Alexei Starovoitov
  2026-01-05 20:54       ` Martin KaFai Lau
  1 sibling, 0 replies; 37+ messages in thread
From: Alexei Starovoitov @ 2026-01-05 20:02 UTC (permalink / raw)
  To: Amery Hung
  Cc: Jakub Sitnicki, Martin KaFai Lau, bpf, Network Development,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Simon Horman, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, Jan 5, 2026 at 11:43 AM Amery Hung <ameryhung@gmail.com> wrote:
>
> On Mon, Jan 5, 2026 at 11:14 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Jan 5, 2026 at 4:15 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> > >
> > >
> > > +__bpf_kfunc_start_defs();
> > > +
> > > +__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
> > > +{
> > > +       struct sk_buff *skb = (typeof(skb))skb_;
> > > +       u8 *meta_end = skb_metadata_end(skb);
> > > +       u8 meta_len = skb_metadata_len(skb);
> > > +       u8 *meta;
> > > +       int gap;
> > > +
> > > +       gap = skb_mac_header(skb) - meta_end;
> > > +       if (!meta_len || !gap)
> > > +               return;
> > > +
> > > +       if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
> > > +               skb_metadata_clear(skb);
> > > +               return;
> > > +       }
> > > +
> > > +       meta = meta_end - meta_len;
> > > +       memmove(meta + gap, meta, meta_len);
> > > +       skb_shinfo(skb)->meta_end += gap;
> > > +
> > > +       bpf_compute_data_pointers(skb);
> > > +}
> > > +
> > > +__bpf_kfunc_end_defs();
> > > +
> > > +BTF_KFUNCS_START(tc_cls_act_hidden_ids)
> > > +BTF_ID_FLAGS(func, bpf_skb_meta_realign)
> > > +BTF_KFUNCS_END(tc_cls_act_hidden_ids)
> > > +
> > > +BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
> > > +
> > >  static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
> > >                                const struct bpf_prog *prog)
> > >  {
> > > -       return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
> > > -                                   TC_ACT_SHOT);
> > > +       struct bpf_insn *insn = insn_buf;
> > > +       int cnt;
> > > +
> > > +       if (pkt_access_flags & PA_F_DATA_META_LOAD) {
> > > +               /* Realign skb metadata for access through data_meta pointer.
> > > +                *
> > > +                * r6 = r1; // r6 will be "u64 *ctx"
> > > +                * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
> > > +                * r1 = r6;
> > > +                */
> > > +               *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
> > > +               *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
> > > +               *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
> > > +       }
> >
> > I see that we already did this hack with bpf_qdisc_init_prologue()
> > and bpf_qdisc_reset_destroy_epilogue().
> > Not sure why we went that route back then.
> >
> > imo much cleaner to do BPF_EMIT_CALL() and wrap
> > BPF_CALL_1(bpf_skb_meta_realign, struct sk_buff *, skb)
> >
> > BPF_CALL_x doesn't make it an uapi helper.
> > It's still a hidden kernel function,
> > while this kfunc stuff looks wrong, since kfunc isn't really hidden.
> >
> > I suspect progs can call this bpf_skb_meta_realign() explicitly,
> > just like they can call bpf_qdisc_init_prologue() ?
> >
>
> qdisc prologue and epilogue qdisc kfuncs should be hidden from users.
> The kfunc filter, bpf_qdisc_kfunc_filter(), determines what kfunc are
> actually exposed.

I see.

> BPF_CALL_x is simpler as there is no need for a kfunc filter to hide
> it. However, IMO for qdisc they don't make too much difference since
> bpf qdisc already needs the filter to limit the .enqueue- and
> .dequeue-specific kfuncs.
>
> Am I missing anything?

Just weird to have kfuncs that are not really kfuncs.
A special bpf_qdisc_init_prologue_ids[] just to call it, etc.
BPF_EMIT_CALL() is lower cognitive load.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 19:42     ` Amery Hung
  2026-01-05 20:02       ` Alexei Starovoitov
@ 2026-01-05 20:54       ` Martin KaFai Lau
  2026-01-05 21:47         ` Alexei Starovoitov
  1 sibling, 1 reply; 37+ messages in thread
From: Martin KaFai Lau @ 2026-01-05 20:54 UTC (permalink / raw)
  To: Amery Hung, Alexei Starovoitov
  Cc: Jakub Sitnicki, Martin KaFai Lau, bpf, Network Development,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Simon Horman, Andrii Nakryiko,
	Eduard Zingerman, Song Liu, Yonghong Song, KP Singh, Hao Luo,
	Jiri Olsa, kernel-team



On 1/5/26 11:42 AM, Amery Hung wrote:
> On Mon, Jan 5, 2026 at 11:14 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Mon, Jan 5, 2026 at 4:15 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>>
>>>
>>> +__bpf_kfunc_start_defs();
>>> +
>>> +__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
>>> +{
>>> +       struct sk_buff *skb = (typeof(skb))skb_;
>>> +       u8 *meta_end = skb_metadata_end(skb);
>>> +       u8 meta_len = skb_metadata_len(skb);
>>> +       u8 *meta;
>>> +       int gap;
>>> +
>>> +       gap = skb_mac_header(skb) - meta_end;
>>> +       if (!meta_len || !gap)
>>> +               return;
>>> +
>>> +       if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
>>> +               skb_metadata_clear(skb);
>>> +               return;
>>> +       }
>>> +
>>> +       meta = meta_end - meta_len;
>>> +       memmove(meta + gap, meta, meta_len);
>>> +       skb_shinfo(skb)->meta_end += gap;
>>> +
>>> +       bpf_compute_data_pointers(skb);
>>> +}
>>> +
>>> +__bpf_kfunc_end_defs();
>>> +
>>> +BTF_KFUNCS_START(tc_cls_act_hidden_ids)
>>> +BTF_ID_FLAGS(func, bpf_skb_meta_realign)
>>> +BTF_KFUNCS_END(tc_cls_act_hidden_ids)
>>> +
>>> +BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
>>> +
>>>   static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
>>>                                 const struct bpf_prog *prog)
>>>   {
>>> -       return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
>>> -                                   TC_ACT_SHOT);
>>> +       struct bpf_insn *insn = insn_buf;
>>> +       int cnt;
>>> +
>>> +       if (pkt_access_flags & PA_F_DATA_META_LOAD) {
>>> +               /* Realign skb metadata for access through data_meta pointer.
>>> +                *
>>> +                * r6 = r1; // r6 will be "u64 *ctx"
>>> +                * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
>>> +                * r1 = r6;
>>> +                */
>>> +               *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
>>> +               *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
>>> +               *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
>>> +       }
>>
>> I see that we already did this hack with bpf_qdisc_init_prologue()
>> and bpf_qdisc_reset_destroy_epilogue().
>> Not sure why we went that route back then.
>>
>> imo much cleaner to do BPF_EMIT_CALL() and wrap
>> BPF_CALL_1(bpf_skb_meta_realign, struct sk_buff *, skb)
>>
>> BPF_CALL_x doesn't make it an uapi helper.
>> It's still a hidden kernel function,
>> while this kfunc stuff looks wrong, since kfunc isn't really hidden.
>>
>> I suspect progs can call this bpf_skb_meta_realign() explicitly,
>> just like they can call bpf_qdisc_init_prologue() ?
>>
> 
> The qdisc prologue and epilogue kfuncs should be hidden from users.
> The kfunc filter, bpf_qdisc_kfunc_filter(), determines which kfuncs are
> actually exposed.

Similar to Amery's comment, I recalled I tried the BPF_CALL_1 in the 
qdisc but stopped at the "fn = env->ops->get_func_proto(insn->imm, 
env->prog);" in do_misc_fixups(). Potentially it could add a new enum
value (> __BPF_FUNC_MAX_ID) outside of the uapi, and user space tools
would also need to handle the unknown helper, but we went with the
kfunc+filter approach without thinking too much about it.

https://lore.kernel.org/bpf/3961c9ce-21d3-4a35-956c-5e1a6eb6031b@linux.dev/

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 20:54       ` Martin KaFai Lau
@ 2026-01-05 21:47         ` Alexei Starovoitov
  2026-01-05 22:25           ` Martin KaFai Lau
  0 siblings, 1 reply; 37+ messages in thread
From: Alexei Starovoitov @ 2026-01-05 21:47 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Amery Hung, Jakub Sitnicki, Martin KaFai Lau, bpf,
	Network Development, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, Jan 5, 2026 at 12:55 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
>
>
> On 1/5/26 11:42 AM, Amery Hung wrote:
> > On Mon, Jan 5, 2026 at 11:14 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> >>
> >> On Mon, Jan 5, 2026 at 4:15 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >>>
> >>>
> >>> +__bpf_kfunc_start_defs();
> >>> +
> >>> +__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
> >>> +{
> >>> +       struct sk_buff *skb = (typeof(skb))skb_;
> >>> +       u8 *meta_end = skb_metadata_end(skb);
> >>> +       u8 meta_len = skb_metadata_len(skb);
> >>> +       u8 *meta;
> >>> +       int gap;
> >>> +
> >>> +       gap = skb_mac_header(skb) - meta_end;
> >>> +       if (!meta_len || !gap)
> >>> +               return;
> >>> +
> >>> +       if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
> >>> +               skb_metadata_clear(skb);
> >>> +               return;
> >>> +       }
> >>> +
> >>> +       meta = meta_end - meta_len;
> >>> +       memmove(meta + gap, meta, meta_len);
> >>> +       skb_shinfo(skb)->meta_end += gap;
> >>> +
> >>> +       bpf_compute_data_pointers(skb);
> >>> +}
> >>> +
> >>> +__bpf_kfunc_end_defs();
> >>> +
> >>> +BTF_KFUNCS_START(tc_cls_act_hidden_ids)
> >>> +BTF_ID_FLAGS(func, bpf_skb_meta_realign)
> >>> +BTF_KFUNCS_END(tc_cls_act_hidden_ids)
> >>> +
> >>> +BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
> >>> +
> >>>   static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
> >>>                                 const struct bpf_prog *prog)
> >>>   {
> >>> -       return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
> >>> -                                   TC_ACT_SHOT);
> >>> +       struct bpf_insn *insn = insn_buf;
> >>> +       int cnt;
> >>> +
> >>> +       if (pkt_access_flags & PA_F_DATA_META_LOAD) {
> >>> +               /* Realign skb metadata for access through data_meta pointer.
> >>> +                *
> >>> +                * r6 = r1; // r6 will be "u64 *ctx"
> >>> +                * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
> >>> +                * r1 = r6;
> >>> +                */
> >>> +               *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
> >>> +               *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
> >>> +               *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
> >>> +       }
> >>
> >> I see that we already did this hack with bpf_qdisc_init_prologue()
> >> and bpf_qdisc_reset_destroy_epilogue().
> >> Not sure why we went that route back then.
> >>
> >> imo much cleaner to do BPF_EMIT_CALL() and wrap
> >> BPF_CALL_1(bpf_skb_meta_realign, struct sk_buff *, skb)
> >>
> >> BPF_CALL_x doesn't make it an uapi helper.
> >> It's still a hidden kernel function,
> >> while this kfunc stuff looks wrong, since kfunc isn't really hidden.
> >>
> >> I suspect progs can call this bpf_skb_meta_realign() explicitly,
> >> just like they can call bpf_qdisc_init_prologue() ?
> >>
> >
> > qdisc prologue and epilogue qdisc kfuncs should be hidden from users.
> > The kfunc filter, bpf_qdisc_kfunc_filter(), determines what kfunc are
> > actually exposed.
>
> Similar to Amery's comment, I recalled I tried the BPF_CALL_1 in the
> qdisc but stopped at the "fn = env->ops->get_func_proto(insn->imm,
> env->prog);" in do_misc_fixups(). Potentially it could add new enum ( >
> __BPF_FUNC_MAX_ID) outside of the uapi and the user space tool should be
> able to handle unknown helper also but we went with the kfunc+filter
> approach without thinking too much about it.

hmm. BPF_EMIT_CALL() does:
#define BPF_CALL_IMM(x) ((void *)(x) - (void *)__bpf_call_base)
.imm   = BPF_CALL_IMM(FUNC)

the imm shouldn't be going through validation anymore.
none of the if (insn->imm == BPF_FUNC_...) in do_misc_fixups()
will match, so I think I see the path where get_func_proto() is
called.
But how does it work then for all cases of BPF_EMIT_CALL?
All of them happen after do_misc_fixups() ?

I guess we can mark such emitted call in insn_aux_data as finalized
and get_func_proto() isn't needed.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 21:47         ` Alexei Starovoitov
@ 2026-01-05 22:25           ` Martin KaFai Lau
  2026-01-05 23:19             ` Amery Hung
  0 siblings, 1 reply; 37+ messages in thread
From: Martin KaFai Lau @ 2026-01-05 22:25 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Amery Hung, Jakub Sitnicki, Martin KaFai Lau, bpf,
	Network Development, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team



On 1/5/26 1:47 PM, Alexei Starovoitov wrote:
> On Mon, Jan 5, 2026 at 12:55 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>>
>>
>> On 1/5/26 11:42 AM, Amery Hung wrote:
>>> On Mon, Jan 5, 2026 at 11:14 AM Alexei Starovoitov
>>> <alexei.starovoitov@gmail.com> wrote:
>>>>
>>>> On Mon, Jan 5, 2026 at 4:15 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>>>>
>>>>>
>>>>> +__bpf_kfunc_start_defs();
>>>>> +
>>>>> +__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
>>>>> +{
>>>>> +       struct sk_buff *skb = (typeof(skb))skb_;
>>>>> +       u8 *meta_end = skb_metadata_end(skb);
>>>>> +       u8 meta_len = skb_metadata_len(skb);
>>>>> +       u8 *meta;
>>>>> +       int gap;
>>>>> +
>>>>> +       gap = skb_mac_header(skb) - meta_end;
>>>>> +       if (!meta_len || !gap)
>>>>> +               return;
>>>>> +
>>>>> +       if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
>>>>> +               skb_metadata_clear(skb);
>>>>> +               return;
>>>>> +       }
>>>>> +
>>>>> +       meta = meta_end - meta_len;
>>>>> +       memmove(meta + gap, meta, meta_len);
>>>>> +       skb_shinfo(skb)->meta_end += gap;
>>>>> +
>>>>> +       bpf_compute_data_pointers(skb);
>>>>> +}
>>>>> +
>>>>> +__bpf_kfunc_end_defs();
>>>>> +
>>>>> +BTF_KFUNCS_START(tc_cls_act_hidden_ids)
>>>>> +BTF_ID_FLAGS(func, bpf_skb_meta_realign)
>>>>> +BTF_KFUNCS_END(tc_cls_act_hidden_ids)
>>>>> +
>>>>> +BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
>>>>> +
>>>>>    static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
>>>>>                                  const struct bpf_prog *prog)
>>>>>    {
>>>>> -       return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
>>>>> -                                   TC_ACT_SHOT);
>>>>> +       struct bpf_insn *insn = insn_buf;
>>>>> +       int cnt;
>>>>> +
>>>>> +       if (pkt_access_flags & PA_F_DATA_META_LOAD) {
>>>>> +               /* Realign skb metadata for access through data_meta pointer.
>>>>> +                *
>>>>> +                * r6 = r1; // r6 will be "u64 *ctx"
>>>>> +                * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
>>>>> +                * r1 = r6;
>>>>> +                */
>>>>> +               *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
>>>>> +               *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
>>>>> +               *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
>>>>> +       }
>>>>
>>>> I see that we already did this hack with bpf_qdisc_init_prologue()
>>>> and bpf_qdisc_reset_destroy_epilogue().
>>>> Not sure why we went that route back then.
>>>>
>>>> imo much cleaner to do BPF_EMIT_CALL() and wrap
>>>> BPF_CALL_1(bpf_skb_meta_realign, struct sk_buff *, skb)
>>>>
>>>> BPF_CALL_x doesn't make it an uapi helper.
>>>> It's still a hidden kernel function,
>>>> while this kfunc stuff looks wrong, since kfunc isn't really hidden.
>>>>
>>>> I suspect progs can call this bpf_skb_meta_realign() explicitly,
>>>> just like they can call bpf_qdisc_init_prologue() ?
>>>>
>>>
>>> qdisc prologue and epilogue qdisc kfuncs should be hidden from users.
>>> The kfunc filter, bpf_qdisc_kfunc_filter(), determines what kfunc are
>>> actually exposed.
>>
>> Similar to Amery's comment, I recalled I tried the BPF_CALL_1 in the
>> qdisc but stopped at the "fn = env->ops->get_func_proto(insn->imm,
>> env->prog);" in do_misc_fixups(). Potentially it could add new enum ( >
>> __BPF_FUNC_MAX_ID) outside of the uapi and the user space tool should be
>> able to handle unknown helper also but we went with the kfunc+filter
>> approach without thinking too much about it.
> 
> hmm. BPF_EMIT_CALL() does:
> #define BPF_CALL_IMM(x) ((void *)(x) - (void *)__bpf_call_base)
> .imm   = BPF_CALL_IMM(FUNC)
> 
> the imm shouldn't be going through validation anymore.
> none of the if (insn->imm == BPF_FUNC_...) in do_misc_fixups()
> will match, so I think I see the path where get_func_proto() is
> called.
> But how does it work then for all cases of BPF_EMIT_CALL?
> All of them happen after do_misc_fixups() ?

yeah, I think most (all?) of them (e.g. map_gen_lookup) happen in
do_misc_fixups(), which then does "goto next_insn;" to skip the
get_func_proto().

> 
> I guess we can mark such emitted call in insn_aux_data as finalized
> and get_func_proto() isn't needed.

It is a good idea.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 22:25           ` Martin KaFai Lau
@ 2026-01-05 23:19             ` Amery Hung
  2026-01-06  2:04               ` Alexei Starovoitov
  0 siblings, 1 reply; 37+ messages in thread
From: Amery Hung @ 2026-01-05 23:19 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Alexei Starovoitov, Jakub Sitnicki, Martin KaFai Lau, bpf,
	Network Development, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, Jan 5, 2026 at 2:26 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
>
>
> On 1/5/26 1:47 PM, Alexei Starovoitov wrote:
> > On Mon, Jan 5, 2026 at 12:55 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>
> >>
> >>
> >> On 1/5/26 11:42 AM, Amery Hung wrote:
> >>> On Mon, Jan 5, 2026 at 11:14 AM Alexei Starovoitov
> >>> <alexei.starovoitov@gmail.com> wrote:
> >>>>
> >>>> On Mon, Jan 5, 2026 at 4:15 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >>>>>
> >>>>>
> >>>>> +__bpf_kfunc_start_defs();
> >>>>> +
> >>>>> +__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
> >>>>> +{
> >>>>> +       struct sk_buff *skb = (typeof(skb))skb_;
> >>>>> +       u8 *meta_end = skb_metadata_end(skb);
> >>>>> +       u8 meta_len = skb_metadata_len(skb);
> >>>>> +       u8 *meta;
> >>>>> +       int gap;
> >>>>> +
> >>>>> +       gap = skb_mac_header(skb) - meta_end;
> >>>>> +       if (!meta_len || !gap)
> >>>>> +               return;
> >>>>> +
> >>>>> +       if (WARN_ONCE(gap < 0, "skb metadata end past mac header")) {
> >>>>> +               skb_metadata_clear(skb);
> >>>>> +               return;
> >>>>> +       }
> >>>>> +
> >>>>> +       meta = meta_end - meta_len;
> >>>>> +       memmove(meta + gap, meta, meta_len);
> >>>>> +       skb_shinfo(skb)->meta_end += gap;
> >>>>> +
> >>>>> +       bpf_compute_data_pointers(skb);
> >>>>> +}
> >>>>> +
> >>>>> +__bpf_kfunc_end_defs();
> >>>>> +
> >>>>> +BTF_KFUNCS_START(tc_cls_act_hidden_ids)
> >>>>> +BTF_ID_FLAGS(func, bpf_skb_meta_realign)
> >>>>> +BTF_KFUNCS_END(tc_cls_act_hidden_ids)
> >>>>> +
> >>>>> +BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
> >>>>> +
> >>>>>    static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
> >>>>>                                  const struct bpf_prog *prog)
> >>>>>    {
> >>>>> -       return bpf_unclone_prologue(insn_buf, pkt_access_flags, prog,
> >>>>> -                                   TC_ACT_SHOT);
> >>>>> +       struct bpf_insn *insn = insn_buf;
> >>>>> +       int cnt;
> >>>>> +
> >>>>> +       if (pkt_access_flags & PA_F_DATA_META_LOAD) {
> >>>>> +               /* Realign skb metadata for access through data_meta pointer.
> >>>>> +                *
> >>>>> +                * r6 = r1; // r6 will be "u64 *ctx"
> >>>>> +                * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
> >>>>> +                * r1 = r6;
> >>>>> +                */
> >>>>> +               *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
> >>>>> +               *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
> >>>>> +               *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
> >>>>> +       }
> >>>>
> >>>> I see that we already did this hack with bpf_qdisc_init_prologue()
> >>>> and bpf_qdisc_reset_destroy_epilogue().
> >>>> Not sure why we went that route back then.
> >>>>
> >>>> imo much cleaner to do BPF_EMIT_CALL() and wrap
> >>>> BPF_CALL_1(bpf_skb_meta_realign, struct sk_buff *, skb)
> >>>>
> >>>> BPF_CALL_x doesn't make it an uapi helper.
> >>>> It's still a hidden kernel function,
> >>>> while this kfunc stuff looks wrong, since kfunc isn't really hidden.
> >>>>
> >>>> I suspect progs can call this bpf_skb_meta_realign() explicitly,
> >>>> just like they can call bpf_qdisc_init_prologue() ?
> >>>>
> >>>
> >>> qdisc prologue and epilogue qdisc kfuncs should be hidden from users.
> >>> The kfunc filter, bpf_qdisc_kfunc_filter(), determines what kfunc are
> >>> actually exposed.
> >>
> >> Similar to Amery's comment, I recalled I tried the BPF_CALL_1 in the
> >> qdisc but stopped at the "fn = env->ops->get_func_proto(insn->imm,
> >> env->prog);" in do_misc_fixups(). Potentially it could add new enum ( >
> >> __BPF_FUNC_MAX_ID) outside of the uapi and the user space tool should be
> >> able to handle unknown helper also but we went with the kfunc+filter
> >> approach without thinking too much about it.
> >
> > hmm. BPF_EMIT_CALL() does:
> > #define BPF_CALL_IMM(x) ((void *)(x) - (void *)__bpf_call_base)
> > .imm   = BPF_CALL_IMM(FUNC)
> >
> > the imm shouldn't be going through validation anymore.
> > none of the if (insn->imm == BPF_FUNC_...) in do_misc_fixups()
> > will match, so I think I see the path where get_func_proto() is
> > called.
> > But how does it work then for all cases of BPF_EMIT_CALL?
> > All of them happen after do_misc_fixups() ?
>
> yeah, I think most (all?) of them (e.g. map_gen_lookup) happens in
> do_misc_fixups which then does "goto next_insn;" to skip the
> get_func_proto().
>
> >
> > I guess we can mark such emitted call in insn_aux_data as finalized
> > and get_func_proto() isn't needed.
>
> It is a good idea.
>

Hmm, insn_aux_data has to be marked in gen_{pro,epi}logue since this
is the only place we know whether the call needs fixup or not. However
insn_aux_data is not available yet in gen_{pro,epi}logue because we
haven't resized insn_aux_data.

Can we do some hack based on the fact that calls emitted by
BPF_EMIT_CALL() are finalized while calls emitted by BPF_RAW_INSN()
most likely are not?
Let BPF_EMIT_CALL() mark the call insn as finalized temporarily (e.g.,
.off = 1). Then, when do_misc_fixups() encounters it, just reset off to
0 and skip get_func_proto().

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-05 23:19             ` Amery Hung
@ 2026-01-06  2:04               ` Alexei Starovoitov
  2026-01-06 17:36                 ` Jakub Sitnicki
  0 siblings, 1 reply; 37+ messages in thread
From: Alexei Starovoitov @ 2026-01-06  2:04 UTC (permalink / raw)
  To: Amery Hung
  Cc: Martin KaFai Lau, Jakub Sitnicki, Martin KaFai Lau, bpf,
	Network Development, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, Jan 5, 2026 at 3:19 PM Amery Hung <ameryhung@gmail.com> wrote:
>
> >
> > >
> > > I guess we can mark such emitted call in insn_aux_data as finalized
> > > and get_func_proto() isn't needed.
> >
> > It is a good idea.
> >
>
> Hmm, insn_aux_data has to be marked in gen_{pro,epi}logue since this
> is the only place we know whether the call needs fixup or not. However
> insn_aux_data is not available yet in gen_{pro,epi}logue because we
> haven't resized insn_aux_data.
>
> Can we do some hack based on the fact that calls emitted by
> BPF_EMIT_CALL() are finalized while calls emitted by BPF_RAW_INSN()
> most likely are not?
> Let BPF_EMIT_CALL() mark the call insn as finalized temporarily (e.g.,
> .off = 1). Then, when do_misc_fixups() encounters it just reset off to
> 0 and don't call get_func_proto().

marking inside insn via off=1 or whatever is an option,
but once we remove BPF_CALL_KFUNC from gen_prologue we can
delete add_kfunc_in_insns() altogether and replace it with
a similar loop that does
if (bpf_helper_call()) mark insn_aux_data.

That would be a nice benefit, since add_kfunc_call() from there
was always a bit odd: because of gen_prologue we were adding kfuncs
both early, before the main verifier pass, and again after it.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-06  2:04               ` Alexei Starovoitov
@ 2026-01-06 17:36                 ` Jakub Sitnicki
  2026-01-06 17:46                   ` Amery Hung
  0 siblings, 1 reply; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-06 17:36 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Amery Hung, Martin KaFai Lau, Martin KaFai Lau, bpf,
	Network Development, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Mon, Jan 05, 2026 at 06:04 PM -08, Alexei Starovoitov wrote:
> On Mon, Jan 5, 2026 at 3:19 PM Amery Hung <ameryhung@gmail.com> wrote:
>>
>> >
>> > >
>> > > I guess we can mark such emitted call in insn_aux_data as finalized
>> > > and get_func_proto() isn't needed.
>> >
>> > It is a good idea.
>> >
>>
>> Hmm, insn_aux_data has to be marked in gen_{pro,epi}logue since this
>> is the only place we know whether the call needs fixup or not. However
>> insn_aux_data is not available yet in gen_{pro,epi}logue because we
>> haven't resized insn_aux_data.
>>
>> Can we do some hack based on the fact that calls emitted by
>> BPF_EMIT_CALL() are finalized while calls emitted by BPF_RAW_INSN()
>> most likely are not?
>> Let BPF_EMIT_CALL() mark the call insn as finalized temporarily (e.g.,
>> .off = 1). Then, when do_misc_fixups() encounters it just reset off to
>> 0 and don't call get_func_proto().
>
> marking inside insn via off=1 or whatever is an option,
> but once we remove BPF_CALL_KFUNC from gen_prologue we can
> delete add_kfunc_in_insns() altogether and replace it with
> a similar loop that does
> if (bpf_helper_call()) mark insn_aux_data.
>
> That would be a nice benefit, since add_kfunc_call() from there
> was always a bit odd, since we're adding kfuncs early before the main
> verifier pass and after, because of gen_prologue.

Thanks for all the pointers.

If I understood correctly, we're looking for something like this:

---8<---
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index b32ddf0f0ab3..9ccd56c04a45 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -561,6 +561,7 @@ struct bpf_insn_aux_data {
 	bool non_sleepable; /* helper/kfunc may be called from non-sleepable context */
 	bool is_iter_next; /* bpf_iter_<type>_next() kfunc call */
 	bool call_with_percpu_alloc_ptr; /* {this,per}_cpu_ptr() with prog percpu alloc */
+	bool finalized_call; /* call holds function offset relative to __bpf_call_base */
 	u8 alu_state; /* used in combination with alu_limit */
 	/* true if STX or LDX instruction is a part of a spill/fill
 	 * pattern for a bpf_fastcall call.
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1ca5c5e895ee..cc737d103cdd 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21806,6 +21806,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 			env->prog = new_prog;
 			delta += cnt - 1;
 
+			/* gen_prologue emits function calls with target address
+			 * relative to __bpf_call_base. Skip patch_call_imm fixup.
+			 */
+			for (i = 0; i < cnt - 1; i++) {
+				if (bpf_helper_call(&env->prog->insnsi[i]))
+					env->insn_aux_data[i].finalized_call = true;
+			}
+
 			ret = add_kfunc_in_insns(env, insn_buf, cnt - 1);
 			if (ret < 0)
 				return ret;
@@ -23412,6 +23420,9 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			goto next_insn;
 		}
 patch_call_imm:
+		if (env->insn_aux_data[i + delta].finalized_call)
+			goto next_insn;
+
 		fn = env->ops->get_func_proto(insn->imm, env->prog);
 		/* all functions that have prototype and verifier allowed
 		 * programs to call them, must be real in-kernel functions
@@ -23423,6 +23434,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			return -EFAULT;
 		}
 		insn->imm = fn->func - __bpf_call_base;
+		env->insn_aux_data[i + delta].finalized_call = true;
 next_insn:
 		if (subprogs[cur_subprog + 1].start == i + delta + 1) {
 			subprogs[cur_subprog].stack_depth += stack_depth_extra;
diff --git a/net/core/filter.c b/net/core/filter.c
index 7f5bc6a505e1..53993c2c492d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -9082,8 +9082,7 @@ static int bpf_unclone_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
 	/* ret = bpf_skb_pull_data(skb, 0); */
 	*insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
 	*insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2);
-	*insn++ = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
-			       BPF_FUNC_skb_pull_data);
+	*insn++ = BPF_EMIT_CALL(bpf_skb_pull_data);
 	/* if (!ret)
 	 *      goto restore;
 	 * return TC_ACT_SHOT;
@@ -9135,11 +9134,8 @@ static int bpf_gen_ld_abs(const struct bpf_insn *orig,
 	return insn - insn_buf;
 }
 
-__bpf_kfunc_start_defs();
-
-__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
+static void bpf_skb_meta_realign(struct sk_buff *skb)
 {
-	struct sk_buff *skb = (typeof(skb))skb_;
 	u8 *meta_end = skb_metadata_end(skb);
 	u8 meta_len = skb_metadata_len(skb);
 	u8 *meta;
@@ -9161,14 +9157,6 @@ __bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
 	bpf_compute_data_pointers(skb);
 }
 
-__bpf_kfunc_end_defs();
-
-BTF_KFUNCS_START(tc_cls_act_hidden_ids)
-BTF_ID_FLAGS(func, bpf_skb_meta_realign)
-BTF_KFUNCS_END(tc_cls_act_hidden_ids)
-
-BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
-
 static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
 			       const struct bpf_prog *prog)
 {
@@ -9182,8 +9170,10 @@ static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
 		 * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
 		 * r1 = r6;
 		 */
+		BUILD_BUG_ON(!__same_type(&bpf_skb_meta_realign,
+					  (void (*)(struct sk_buff *skb))NULL));
 		*insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
-		*insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
+		*insn++ = BPF_EMIT_CALL(bpf_skb_meta_realign);
 		*insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
 	}
 	cnt = bpf_unclone_prologue(insn, pkt_access_flags, prog, TC_ACT_SHOT);

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-06 17:36                 ` Jakub Sitnicki
@ 2026-01-06 17:46                   ` Amery Hung
  2026-01-06 18:40                     ` Alexei Starovoitov
  2026-01-06 19:12                     ` Jakub Sitnicki
  0 siblings, 2 replies; 37+ messages in thread
From: Amery Hung @ 2026-01-06 17:46 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Alexei Starovoitov, Martin KaFai Lau, Martin KaFai Lau, bpf,
	Network Development, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Tue, Jan 6, 2026 at 9:36 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Mon, Jan 05, 2026 at 06:04 PM -08, Alexei Starovoitov wrote:
> > On Mon, Jan 5, 2026 at 3:19 PM Amery Hung <ameryhung@gmail.com> wrote:
> >>
> >> >
> >> > >
> >> > > I guess we can mark such emitted call in insn_aux_data as finalized
> >> > > and get_func_proto() isn't needed.
> >> >
> >> > It is a good idea.
> >> >
> >>
> >> Hmm, insn_aux_data has to be marked in gen_{pro,epi}logue since this
> >> is the only place we know whether the call needs fixup or not. However
> >> insn_aux_data is not available yet in gen_{pro,epi}logue because we
> >> haven't resized insn_aux_data.
> >>
> >> Can we do some hack based on the fact that calls emitted by
> >> BPF_EMIT_CALL() are finalized while calls emitted by BPF_RAW_INSN()
> >> most likely are not?
> >> Let BPF_EMIT_CALL() mark the call insn as finalized temporarily (e.g.,
> >> .off = 1). Then, when do_misc_fixups() encounters it just reset off to
> >> 0 and don't call get_func_proto().
> >
> > marking inside insn via off=1 or whatever is an option,
> > but once we remove BPF_CALL_KFUNC from gen_prologue we can
> > delete add_kfunc_in_insns() altogether and replace it with
> > a similar loop that does
> > if (bpf_helper_call()) mark insn_aux_data.
> >
> > That would be a nice benefit, since add_kfunc_call() from there
> > was always a bit odd, since we're adding kfuncs early before the main
> > verifier pass and after, because of gen_prologue.
>
> Thanks for all the pointers.
>
> I understood we're looking for something like this:
>
> ---8<---
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index b32ddf0f0ab3..9ccd56c04a45 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -561,6 +561,7 @@ struct bpf_insn_aux_data {
>         bool non_sleepable; /* helper/kfunc may be called from non-sleepable context */
>         bool is_iter_next; /* bpf_iter_<type>_next() kfunc call */
>         bool call_with_percpu_alloc_ptr; /* {this,per}_cpu_ptr() with prog percpu alloc */
> +       bool finalized_call; /* call holds function offset relative to __bpf_call_base */
>         u8 alu_state; /* used in combination with alu_limit */
>         /* true if STX or LDX instruction is a part of a spill/fill
>          * pattern for a bpf_fastcall call.
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 1ca5c5e895ee..cc737d103cdd 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -21806,6 +21806,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
>                         env->prog = new_prog;
>                         delta += cnt - 1;
>
> +                       /* gen_prologue emits function calls with target address
> +                        * relative to __bpf_call_base. Skip patch_call_imm fixup.
> +                        */
> +                       for (i = 0; i < cnt - 1; i++) {
> +                               if (bpf_helper_call(&env->prog->insnsi[i]))
> +                                       env->insn_aux_data[i].finalized_call = true;
> +                       }
> +
>                         ret = add_kfunc_in_insns(env, insn_buf, cnt - 1);

And then we can get rid of this function as there is no use case for
having a new kfunc in gen_{pro,epi}logue.

>                         if (ret < 0)
>                                 return ret;
> @@ -23412,6 +23420,9 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>                         goto next_insn;
>                 }
>  patch_call_imm:
> +               if (env->insn_aux_data[i + delta].finalized_call)
> +                       goto next_insn;
> +
>                 fn = env->ops->get_func_proto(insn->imm, env->prog);
>                 /* all functions that have prototype and verifier allowed
>                  * programs to call them, must be real in-kernel functions
> @@ -23423,6 +23434,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>                         return -EFAULT;
>                 }
>                 insn->imm = fn->func - __bpf_call_base;
> +               env->insn_aux_data[i + delta].finalized_call = true;
>  next_insn:
>                 if (subprogs[cur_subprog + 1].start == i + delta + 1) {
>                         subprogs[cur_subprog].stack_depth += stack_depth_extra;
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 7f5bc6a505e1..53993c2c492d 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -9082,8 +9082,7 @@ static int bpf_unclone_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
>         /* ret = bpf_skb_pull_data(skb, 0); */
>         *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
>         *insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2);
> -       *insn++ = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> -                              BPF_FUNC_skb_pull_data);

This is why I was suggesting setting off = 1 in BPF_EMIT_CALL to mark
a call as finalized, so that we can continue to support using
BPF_RAW_INSN to emit a helper call in the prologue and epilogue.

> +       *insn++ = BPF_EMIT_CALL(bpf_skb_pull_data);
>         /* if (!ret)
>          *      goto restore;
>          * return TC_ACT_SHOT;
> @@ -9135,11 +9134,8 @@ static int bpf_gen_ld_abs(const struct bpf_insn *orig,
>         return insn - insn_buf;
>  }
>
> -__bpf_kfunc_start_defs();
> -
> -__bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
> +static void bpf_skb_meta_realign(struct sk_buff *skb)
>  {
> -       struct sk_buff *skb = (typeof(skb))skb_;
>         u8 *meta_end = skb_metadata_end(skb);
>         u8 meta_len = skb_metadata_len(skb);
>         u8 *meta;
> @@ -9161,14 +9157,6 @@ __bpf_kfunc void bpf_skb_meta_realign(struct __sk_buff *skb_)
>         bpf_compute_data_pointers(skb);
>  }
>
> -__bpf_kfunc_end_defs();
> -
> -BTF_KFUNCS_START(tc_cls_act_hidden_ids)
> -BTF_ID_FLAGS(func, bpf_skb_meta_realign)
> -BTF_KFUNCS_END(tc_cls_act_hidden_ids)
> -
> -BTF_ID_LIST_SINGLE(bpf_skb_meta_realign_ids, func, bpf_skb_meta_realign)
> -
>  static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
>                                const struct bpf_prog *prog)
>  {
> @@ -9182,8 +9170,10 @@ static int tc_cls_act_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
>                  * r0 = bpf_skb_meta_realign(r1); // r0 is undefined
>                  * r1 = r6;
>                  */
> +               BUILD_BUG_ON(!__same_type(&bpf_skb_meta_realign,
> +                                         (void (*)(struct sk_buff *skb))NULL));
>                 *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
> -               *insn++ = BPF_CALL_KFUNC(0, bpf_skb_meta_realign_ids[0]);
> +               *insn++ = BPF_EMIT_CALL(bpf_skb_meta_realign);
>                 *insn++ = BPF_MOV64_REG(BPF_REG_1, BPF_REG_6);
>         }
>         cnt = bpf_unclone_prologue(insn, pkt_access_flags, prog, TC_ACT_SHOT);

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-06 17:46                   ` Amery Hung
@ 2026-01-06 18:40                     ` Alexei Starovoitov
  2026-01-06 19:12                     ` Jakub Sitnicki
  1 sibling, 0 replies; 37+ messages in thread
From: Alexei Starovoitov @ 2026-01-06 18:40 UTC (permalink / raw)
  To: Amery Hung
  Cc: Jakub Sitnicki, Martin KaFai Lau, Martin KaFai Lau, bpf,
	Network Development, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Tue, Jan 6, 2026 at 9:47 AM Amery Hung <ameryhung@gmail.com> wrote:
>
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 7f5bc6a505e1..53993c2c492d 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -9082,8 +9082,7 @@ static int bpf_unclone_prologue(struct bpf_insn *insn_buf, u32 pkt_access_flags,
> >         /* ret = bpf_skb_pull_data(skb, 0); */
> >         *insn++ = BPF_MOV64_REG(BPF_REG_6, BPF_REG_1);
> >         *insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_2, BPF_REG_2);
> > -       *insn++ = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > -                              BPF_FUNC_skb_pull_data);
>
> This is why I was suggesting setting off = 1 in BPF_EMIT_CALL to mark
> a call as finalized. So that we can continue to support using
> BPF_RAW_INSN to emit a helper call in prologue and epilogue.

That's the only place in the code where BPF_RAW_INSN(CALL, BPF_FUNC_xxx)
is used, and it was done this way only because we didn't think
of the finalized_call concept.
So I don't think we should introduce more corner cases with off=1.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-06 17:46                   ` Amery Hung
  2026-01-06 18:40                     ` Alexei Starovoitov
@ 2026-01-06 19:12                     ` Jakub Sitnicki
  2026-01-06 19:42                       ` Amery Hung
  1 sibling, 1 reply; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-06 19:12 UTC (permalink / raw)
  To: Amery Hung, Alexei Starovoitov, Martin KaFai Lau
  Cc: Martin KaFai Lau, bpf, Network Development, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Simon Horman, Andrii Nakryiko,
	Eduard Zingerman, Song Liu, Yonghong Song, KP Singh, Hao Luo,
	Jiri Olsa, kernel-team

On Tue, Jan 06, 2026 at 09:46 AM -08, Amery Hung wrote:
> On Tue, Jan 6, 2026 at 9:36 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -21806,6 +21806,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
>>                         env->prog = new_prog;
>>                         delta += cnt - 1;
>>
>> +                       /* gen_prologue emits function calls with target address
>> +                        * relative to __bpf_call_base. Skip patch_call_imm fixup.
>> +                        */
>> +                       for (i = 0; i < cnt - 1; i++) {
>> +                               if (bpf_helper_call(&env->prog->insnsi[i]))
>> +                                       env->insn_aux_data[i].finalized_call = true;
>> +                       }
>> +
>>                         ret = add_kfunc_in_insns(env, insn_buf, cnt - 1);
>
> And then we can get rid of this function as there is no use case for
> having a new kfunc in gen_{pro,epi}logue.

Happy to convert bpf_{qdisc,testmod} gen_{pro,epi}logue to use
BPF_EMIT_CALL instead of BPF_CALL_KFUNC.

If it's alright with you, I'd like to kill kfunc support in
{pro,epi}logue as a follow up.

Looks like there will be a bit of churn in selftests to remove the
coverage. And this series is getting quite long.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta
  2026-01-06 19:12                     ` Jakub Sitnicki
@ 2026-01-06 19:42                       ` Amery Hung
  0 siblings, 0 replies; 37+ messages in thread
From: Amery Hung @ 2026-01-06 19:42 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Alexei Starovoitov, Martin KaFai Lau, Martin KaFai Lau, bpf,
	Network Development, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, kernel-team

On Tue, Jan 6, 2026 at 11:12 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Tue, Jan 06, 2026 at 09:46 AM -08, Amery Hung wrote:
> > On Tue, Jan 6, 2026 at 9:36 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >> --- a/kernel/bpf/verifier.c
> >> +++ b/kernel/bpf/verifier.c
> >> @@ -21806,6 +21806,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
> >>                         env->prog = new_prog;
> >>                         delta += cnt - 1;
> >>
> >> +                       /* gen_prologue emits function calls with target address
> >> +                        * relative to __bpf_call_base. Skip patch_call_imm fixup.
> >> +                        */
> >> +                       for (i = 0; i < cnt - 1; i++) {
> >> +                               if (bpf_helper_call(&env->prog->insnsi[i]))
> >> +                                       env->insn_aux_data[i].finalized_call = true;
> >> +                       }
> >> +
> >>                         ret = add_kfunc_in_insns(env, insn_buf, cnt - 1);
> >
> > And then we can get rid of this function as there is no use case for
> > having a new kfunc in gen_{pro,epi}logue.
>
> Happy to convert bpf_{qdisc,testmod} gen_{pro,epi}logue to use
> BPF_EMIT_CALL instead of BPF_CALL_KFUNC.
>
> If it's alright with you, I'd like to kill kfunc support in
> {pro,epi}logue as a follow up.
>

Totally. Appreciate it!

Makes sense to do in another patchset.

> Looks like there will be a bit of churn in selftests to remove the
> coverage. And this series is getting quite long.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (15 preceding siblings ...)
  2026-01-05 12:14 ` [PATCH bpf-next v2 16/16] selftests/bpf: Test skb metadata access after L2 decapsulation Jakub Sitnicki
@ 2026-01-10 21:07 ` Jakub Sitnicki
  2026-01-14 11:52 ` Jakub Sitnicki
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-10 21:07 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

I've split out the driver changes:

https://lore.kernel.org/r/20260110-skb-meta-fixup-skb_metadata_set-calls-v1-0-1047878ed1b0@cloudflare.com

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset
  2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
                   ` (16 preceding siblings ...)
  2026-01-10 21:07 ` [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
@ 2026-01-14 11:52 ` Jakub Sitnicki
  17 siblings, 0 replies; 37+ messages in thread
From: Jakub Sitnicki @ 2026-01-14 11:52 UTC (permalink / raw)
  To: bpf
  Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Song Liu, Yonghong Song, KP Singh, Hao Luo, Jiri Olsa,
	kernel-team

On Mon, Jan 05, 2026 at 01:14 PM +01, Jakub Sitnicki wrote:
> This series continues the effort to provide reliable access to xdp/skb
> metadata from BPF context on the receive path.

FYI, I'm dropping the effort to make xdp/skb metadata survive L2
encap/decap, following the discussion here and in [1].

Instead, I'll work on an skb local storage backed by an skb extension,
similar to what we have for cgroups/sockets/etc.

This approach avoids patching skb_push/pull call sites and implicit
allocations on hot paths.
 
I'm happy to split out BPF_EMIT_CALL support for prologue/epilogue if
it's considered useful on its own.

[1] https://lore.kernel.org/r/20260110-skb-meta-fixup-skb_metadata_set-calls-v1-0-1047878ed1b0@cloudflare.com

Thanks,
-jkbs

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2026-01-14 11:52 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-05 12:14 [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 01/16] bnxt_en: Call skb_metadata_set when skb->data points at metadata end Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 02/16] i40e: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 03/16] igb: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 04/16] igc: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 05/16] ixgbe: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 06/16] net/mlx5e: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 07/16] veth: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 08/16] xsk: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 09/16] xdp: " Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 10/16] net: Track skb metadata end separately from MAC offset Jakub Sitnicki
2026-01-05 12:14 ` [PATCH bpf-next v2 11/16] bpf, verifier: Remove side effects from may_access_direct_pkt_data Jakub Sitnicki
2026-01-05 18:23   ` Eduard Zingerman
2026-01-05 12:14 ` [PATCH bpf-next v2 12/16] bpf, verifier: Turn seen_direct_write flag into a bitmap Jakub Sitnicki
2026-01-05 18:48   ` Eduard Zingerman
2026-01-05 12:14 ` [PATCH bpf-next v2 13/16] bpf, verifier: Propagate packet access flags to gen_prologue Jakub Sitnicki
2026-01-05 18:49   ` Eduard Zingerman
2026-01-05 12:14 ` [PATCH bpf-next v2 14/16] bpf, verifier: Track when data_meta pointer is loaded Jakub Sitnicki
2026-01-05 18:48   ` Eduard Zingerman
2026-01-05 12:14 ` [PATCH bpf-next v2 15/16] bpf: Realign skb metadata for TC progs using data_meta Jakub Sitnicki
2026-01-05 19:14   ` Alexei Starovoitov
2026-01-05 19:20     ` Jakub Sitnicki
2026-01-05 19:42     ` Amery Hung
2026-01-05 20:02       ` Alexei Starovoitov
2026-01-05 20:54       ` Martin KaFai Lau
2026-01-05 21:47         ` Alexei Starovoitov
2026-01-05 22:25           ` Martin KaFai Lau
2026-01-05 23:19             ` Amery Hung
2026-01-06  2:04               ` Alexei Starovoitov
2026-01-06 17:36                 ` Jakub Sitnicki
2026-01-06 17:46                   ` Amery Hung
2026-01-06 18:40                     ` Alexei Starovoitov
2026-01-06 19:12                     ` Jakub Sitnicki
2026-01-06 19:42                       ` Amery Hung
2026-01-05 12:14 ` [PATCH bpf-next v2 16/16] selftests/bpf: Test skb metadata access after L2 decapsulation Jakub Sitnicki
2026-01-10 21:07 ` [PATCH bpf-next v2 00/16] Decouple skb metadata tracking from MAC header offset Jakub Sitnicki
2026-01-14 11:52 ` Jakub Sitnicki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox