public inbox for netdev@vger.kernel.org
* [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags
@ 2026-03-18 13:42 Nick Hudson
  2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
  To: bpf, netdev; +Cc: Willem de Bruijn, Nick Hudson

This series refactors the bpf_skb_adjust_room() helper to support tunnel
decapsulation with L4 and IPXIP modes, in addition to the existing L3
decapsulation support.

The changes are structured as follows:
1. Name the enum for BPF_FUNC_skb_adjust_room flags
2. Add new BPF_F_ADJ_ROOM_DECAP_* flags for L4 and IPXIP decap modes
3. Introduce helper masks to simplify validation logic
4. Enable the new decap flags and add guard rails for invalid combinations
5. Clear tunnel GSO state when decapsulating in skb_adjust_room

These patches enable BPF programs to efficiently decapsulate various tunnel
encapsulation formats and properly handle GSO state transitions.

Changes v1 -> v2:
- Patch 3: Decap flag acceptance intentionally remains L3-only
  while adding the helper masks.
- Patch 4: Decap with L4/IPXIP support enabled with guard rails.

Nick Hudson (5):
  bpf: name the enum for BPF_FUNC_skb_adjust_room flags
  bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation
  bpf: add helper masks for ADJ_ROOM flags and encap validation
  bpf: allow new DECAP flags and add guard rails
  bpf: clear decap tunnel GSO state in skb_adjust_room

 include/uapi/linux/bpf.h       |  36 ++++++++++-
 net/core/filter.c              | 107 +++++++++++++++++++++++++++------
 tools/include/uapi/linux/bpf.h |  36 ++++++++++-
 3 files changed, 156 insertions(+), 23 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags
  2026-03-18 13:42 [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
  2026-03-24 17:34   ` Martin KaFai Lau
  2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	linux-kernel

Give the existing anonymous enum of BPF_FUNC_skb_adjust_room flags a
name, enum bpf_adj_room_flags, to enable CO-RE (Compile Once -
Run Everywhere) lookups in BPF programs.

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 include/uapi/linux/bpf.h       | 2 +-
 tools/include/uapi/linux/bpf.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c8d400b7680a..bc4b25eb72ce 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -6209,7 +6209,7 @@ enum {
 };
 
 /* BPF_FUNC_skb_adjust_room flags. */
-enum {
+enum bpf_adj_room_flags {
 	BPF_F_ADJ_ROOM_FIXED_GSO	= (1ULL << 0),
 	BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	= (1ULL << 1),
 	BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	= (1ULL << 2),
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 5e38b4887de6..db2c520d0e92 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -6209,7 +6209,7 @@ enum {
 };
 
 /* BPF_FUNC_skb_adjust_room flags. */
-enum {
+enum bpf_adj_room_flags {
 	BPF_F_ADJ_ROOM_FIXED_GSO	= (1ULL << 0),
 	BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	= (1ULL << 1),
 	BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	= (1ULL << 2),
-- 
2.34.1



* [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation
  2026-03-18 13:42 [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Nick Hudson
  2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
  2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	linux-kernel

Add new bpf_skb_adjust_room() decapsulation flags:

- BPF_F_ADJ_ROOM_DECAP_L4_GRE
- BPF_F_ADJ_ROOM_DECAP_L4_UDP
- BPF_F_ADJ_ROOM_DECAP_IPXIP4
- BPF_F_ADJ_ROOM_DECAP_IPXIP6

These flags let BPF programs describe which tunnel layer is being
removed, so later changes can update tunnel-related GSO state
accordingly during decapsulation.

This patch only introduces the UAPI flag definitions and helper
documentation.

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 include/uapi/linux/bpf.h       | 34 ++++++++++++++++++++++++++++++++--
 tools/include/uapi/linux/bpf.h | 34 ++++++++++++++++++++++++++++++++--
 2 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index bc4b25eb72ce..2ef886dc9685 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3010,8 +3010,34 @@ union bpf_attr {
  *
  *		* **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
  *		  **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
- *		  Indicate the new IP header version after decapsulating the outer
- *		  IP header. Used when the inner and outer IP versions are different.
+ *		  Indicate the new IP header version after decapsulating the
+ *		  outer IP header. Used when the inner and outer IP versions
+ *		  are different. These flags only trigger a protocol change
+ *		  without clearing any tunnel-specific GSO flags.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L4_GRE**:
+ *		  Clear GRE tunnel GSO flags (SKB_GSO_GRE and SKB_GSO_GRE_CSUM)
+ *		  when decapsulating a GRE tunnel.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L4_UDP**:
+ *		  Clear UDP tunnel GSO flags (SKB_GSO_UDP_TUNNEL and
+ *		  SKB_GSO_UDP_TUNNEL_CSUM) when decapsulating a UDP tunnel.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_IPXIP4**:
+ *		  Clear IPIP/SIT tunnel GSO flag (SKB_GSO_IPXIP4) when decapsulating
+ *		  a tunnel with an outer IPv4 header (IPv4-in-IPv4 or IPv6-in-IPv4).
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_IPXIP6**:
+ *		  Clear IPv6 encapsulation tunnel GSO flag (SKB_GSO_IPXIP6) when
+ *		  decapsulating a tunnel with an outer IPv6 header (IPv6-in-IPv6
+ *		  or IPv4-in-IPv6).
+ *
+ *		When using the decapsulation flags above, the skb->encapsulation
+ *		flag is automatically cleared if all tunnel-specific GSO flags
+ *		(SKB_GSO_UDP_TUNNEL, SKB_GSO_UDP_TUNNEL_CSUM, SKB_GSO_GRE,
+ *		SKB_GSO_GRE_CSUM, SKB_GSO_IPXIP4, SKB_GSO_IPXIP6) have been
+ *		removed from the packet. This handles cases where all tunnel
+ *		layers have been decapsulated.
  *
  * 		A call to this helper is susceptible to change the underlying
  * 		packet buffer. Therefore, at load time, all checks on pointers
@@ -6219,6 +6245,10 @@ enum bpf_adj_room_flags {
 	BPF_F_ADJ_ROOM_ENCAP_L2_ETH	= (1ULL << 6),
 	BPF_F_ADJ_ROOM_DECAP_L3_IPV4	= (1ULL << 7),
 	BPF_F_ADJ_ROOM_DECAP_L3_IPV6	= (1ULL << 8),
+	BPF_F_ADJ_ROOM_DECAP_L4_GRE	= (1ULL << 9),
+	BPF_F_ADJ_ROOM_DECAP_L4_UDP	= (1ULL << 10),
+	BPF_F_ADJ_ROOM_DECAP_IPXIP4	= (1ULL << 11),
+	BPF_F_ADJ_ROOM_DECAP_IPXIP6	= (1ULL << 12),
 };
 
 enum {
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index db2c520d0e92..e9a5c67ff5e2 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3010,8 +3010,34 @@ union bpf_attr {
  *
  *		* **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
  *		  **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
- *		  Indicate the new IP header version after decapsulating the outer
- *		  IP header. Used when the inner and outer IP versions are different.
+ *		  Indicate the new IP header version after decapsulating the
+ *		  outer IP header. Used when the inner and outer IP versions
+ *		  are different. These flags only trigger a protocol change
+ *		  without clearing any tunnel-specific GSO flags.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L4_GRE**:
+ *		  Clear GRE tunnel GSO flags (SKB_GSO_GRE and SKB_GSO_GRE_CSUM)
+ *		  when decapsulating a GRE tunnel.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L4_UDP**:
+ *		  Clear UDP tunnel GSO flags (SKB_GSO_UDP_TUNNEL and
+ *		  SKB_GSO_UDP_TUNNEL_CSUM) when decapsulating a UDP tunnel.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_IPXIP4**:
+ *		  Clear IPIP/SIT tunnel GSO flag (SKB_GSO_IPXIP4) when decapsulating
+ *		  a tunnel with an outer IPv4 header (IPv4-in-IPv4 or IPv6-in-IPv4).
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_IPXIP6**:
+ *		  Clear IPv6 encapsulation tunnel GSO flag (SKB_GSO_IPXIP6) when
+ *		  decapsulating a tunnel with an outer IPv6 header (IPv6-in-IPv6
+ *		  or IPv4-in-IPv6).
+ *
+ *		When using the decapsulation flags above, the skb->encapsulation
+ *		flag is automatically cleared if all tunnel-specific GSO flags
+ *		(SKB_GSO_UDP_TUNNEL, SKB_GSO_UDP_TUNNEL_CSUM, SKB_GSO_GRE,
+ *		SKB_GSO_GRE_CSUM, SKB_GSO_IPXIP4, SKB_GSO_IPXIP6) have been
+ *		removed from the packet. This handles cases where all tunnel
+ *		layers have been decapsulated.
  *
  * 		A call to this helper is susceptible to change the underlying
  * 		packet buffer. Therefore, at load time, all checks on pointers
@@ -6219,6 +6245,10 @@ enum bpf_adj_room_flags {
 	BPF_F_ADJ_ROOM_ENCAP_L2_ETH	= (1ULL << 6),
 	BPF_F_ADJ_ROOM_DECAP_L3_IPV4	= (1ULL << 7),
 	BPF_F_ADJ_ROOM_DECAP_L3_IPV6	= (1ULL << 8),
+	BPF_F_ADJ_ROOM_DECAP_L4_GRE	= (1ULL << 9),
+	BPF_F_ADJ_ROOM_DECAP_L4_UDP	= (1ULL << 10),
+	BPF_F_ADJ_ROOM_DECAP_IPXIP4	= (1ULL << 11),
+	BPF_F_ADJ_ROOM_DECAP_IPXIP6	= (1ULL << 12),
 };
 
 enum {
-- 
2.34.1
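
As a rough illustration of how a caller might choose among the new flags,
the sketch below maps a tunnel type to a flag combination. The bit
positions are copied from the enum in the patch above; enum tunnel_kind
and decap_flags_for() are hypothetical names that exist only in this
sketch. Pairing an IPXIP flag with an L3 flag when the inner and outer IP
versions differ is one reading of the helper documentation, not something
the series states outright.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from enum bpf_adj_room_flags in the patch above. */
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)
#define BPF_F_ADJ_ROOM_DECAP_L4_GRE	(1ULL << 9)
#define BPF_F_ADJ_ROOM_DECAP_L4_UDP	(1ULL << 10)
#define BPF_F_ADJ_ROOM_DECAP_IPXIP4	(1ULL << 11)
#define BPF_F_ADJ_ROOM_DECAP_IPXIP6	(1ULL << 12)

/* Hypothetical tunnel classification, used only by this sketch. */
enum tunnel_kind {
	TUNNEL_GRE,
	TUNNEL_UDP,
	TUNNEL_IPIP,	/* IPv4-in-IPv4 */
	TUNNEL_SIT,	/* IPv6-in-IPv4 */
	TUNNEL_IP6IP6,	/* IPv6-in-IPv6 */
	TUNNEL_IP4IP6,	/* IPv4-in-IPv6 */
};

/* Pick the decap flags that describe the tunnel layer being removed.
 * When the inner IP version differs from the outer one, an L3 flag is
 * added to name the IP version that remains (assumption, see lead-in).
 */
uint64_t decap_flags_for(enum tunnel_kind kind)
{
	switch (kind) {
	case TUNNEL_GRE:
		return BPF_F_ADJ_ROOM_DECAP_L4_GRE;
	case TUNNEL_UDP:
		return BPF_F_ADJ_ROOM_DECAP_L4_UDP;
	case TUNNEL_IPIP:
		return BPF_F_ADJ_ROOM_DECAP_IPXIP4;
	case TUNNEL_SIT:
		return BPF_F_ADJ_ROOM_DECAP_IPXIP4 |
		       BPF_F_ADJ_ROOM_DECAP_L3_IPV6;
	case TUNNEL_IP6IP6:
		return BPF_F_ADJ_ROOM_DECAP_IPXIP6;
	case TUNNEL_IP4IP6:
		return BPF_F_ADJ_ROOM_DECAP_IPXIP6 |
		       BPF_F_ADJ_ROOM_DECAP_L3_IPV4;
	}
	return 0;
}

/* Small self-check showing the intended use. */
void decap_flags_examples(void)
{
	assert(decap_flags_for(TUNNEL_GRE) == BPF_F_ADJ_ROOM_DECAP_L4_GRE);
	assert(decap_flags_for(TUNNEL_IPIP) == BPF_F_ADJ_ROOM_DECAP_IPXIP4);
}
```

In a real BPF program the resulting value would be passed as the flags
argument of bpf_skb_adjust_room() together with a negative len_diff.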



* [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-18 13:42 [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Nick Hudson
  2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
  2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
  2026-03-24 18:12   ` Martin KaFai Lau
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Introduce helper masks for bpf_skb_adjust_room() flags to simplify
validation logic:

- BPF_F_ADJ_ROOM_DECAP_L4_MASK
- BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
- BPF_F_ADJ_ROOM_ENCAP_MASK
- BPF_F_ADJ_ROOM_DECAP_MASK

Add flag validation to bpf_skb_net_grow() to reject invalid encap
flags early. Refactor existing validation checks in bpf_skb_net_shrink()
and bpf_skb_adjust_room() to use the new masks (no behavior change).

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 net/core/filter.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 0d5d5a17acb2..7c2871b40fe4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3483,14 +3483,25 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 #define BPF_F_ADJ_ROOM_DECAP_L3_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
 					 BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
 
-#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
-					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
+#define BPF_F_ADJ_ROOM_DECAP_L4_MASK	(BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
+					 BPF_F_ADJ_ROOM_DECAP_L4_GRE)
+
+#define BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK	(BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
+					 BPF_F_ADJ_ROOM_DECAP_IPXIP6)
+
+#define BPF_F_ADJ_ROOM_ENCAP_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
 					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
 					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
 					 BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
 					 BPF_F_ADJ_ROOM_ENCAP_L2( \
-					  BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
-					 BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
+
+#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+
+#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
+					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
+					 BPF_F_ADJ_ROOM_DECAP_MASK | \
+					 BPF_F_ADJ_ROOM_NO_CSUM_RESET)
 
 static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 			    u64 flags)
@@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 	unsigned int gso_type = SKB_GSO_DODGY;
 	int ret;
 
+	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
+			       BPF_F_ADJ_ROOM_NO_CSUM_RESET |
+			       BPF_F_ADJ_ROOM_FIXED_GSO)))
+		return -EINVAL;
+
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
 		/* udp gso_size delineates datagrams, only allow if fixed */
 		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
@@ -3611,8 +3627,8 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
 {
 	int ret;
 
-	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
-			       BPF_F_ADJ_ROOM_DECAP_L3_MASK |
+	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_DECAP_MASK |
+			       BPF_F_ADJ_ROOM_FIXED_GSO |
 			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
 		return -EINVAL;
 
@@ -3708,8 +3724,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	u32 off;
 	int ret;
 
-	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_MASK |
-			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
+	if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
 		return -EINVAL;
 	if (unlikely(len_diff_abs > 0xfffU))
 		return -EFAULT;
-- 
2.34.1
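
The mask layering in this patch can be modelled in user space as below. This
is a minimal sketch, not the kernel code: the flag values mirror the UAPI
enum, the mask definitions follow the hunk above (with the decap mask still
L3-only, as in this patch), and the three *_flags_ok() helpers are
hypothetical names standing in for the checks in bpf_skb_adjust_room(),
bpf_skb_net_grow() and bpf_skb_net_shrink().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values mirrored from enum bpf_adj_room_flags. */
#define BPF_F_ADJ_ROOM_FIXED_GSO	(1ULL << 0)
#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	(1ULL << 1)
#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	(1ULL << 2)
#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE	(1ULL << 3)
#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP	(1ULL << 4)
#define BPF_F_ADJ_ROOM_NO_CSUM_RESET	(1ULL << 5)
#define BPF_F_ADJ_ROOM_ENCAP_L2_ETH	(1ULL << 6)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)
#define BPF_F_ADJ_ROOM_DECAP_L4_GRE	(1ULL << 9)

#define BPF_ADJ_ROOM_ENCAP_L2_MASK	0xffULL
#define BPF_ADJ_ROOM_ENCAP_L2_SHIFT	56
#define BPF_F_ADJ_ROOM_ENCAP_L2(len) \
	(((uint64_t)(len) & BPF_ADJ_ROOM_ENCAP_L2_MASK) << \
	 BPF_ADJ_ROOM_ENCAP_L2_SHIFT)

/* Masks as composed in the patch above. */
#define ENCAP_L3_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
			 BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
#define DECAP_L3_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
			 BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
#define ENCAP_MASK	(ENCAP_L3_MASK | \
			 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
			 BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
			 BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
			 BPF_F_ADJ_ROOM_ENCAP_L2(BPF_ADJ_ROOM_ENCAP_L2_MASK))
#define DECAP_MASK	(DECAP_L3_MASK)	/* L3-only at this point in the series */
#define ADJ_ROOM_MASK	(BPF_F_ADJ_ROOM_FIXED_GSO | ENCAP_MASK | \
			 DECAP_MASK | BPF_F_ADJ_ROOM_NO_CSUM_RESET)

/* Top-level check: any bit outside the combined mask is rejected. */
bool adjust_room_flags_ok(uint64_t flags)
{
	return !(flags & ~ADJ_ROOM_MASK);
}

/* Grow path: only encap flags plus the two generic flags are accepted. */
bool grow_flags_ok(uint64_t flags)
{
	return !(flags & ~(ENCAP_MASK | BPF_F_ADJ_ROOM_NO_CSUM_RESET |
			   BPF_F_ADJ_ROOM_FIXED_GSO));
}

/* Shrink path: only decap flags plus the two generic flags are accepted. */
bool shrink_flags_ok(uint64_t flags)
{
	return !(flags & ~(DECAP_MASK | BPF_F_ADJ_ROOM_FIXED_GSO |
			   BPF_F_ADJ_ROOM_NO_CSUM_RESET));
}

/* Small self-check showing the intended use. */
void mask_examples(void)
{
	assert(grow_flags_ok(BPF_F_ADJ_ROOM_ENCAP_L3_IPV4));
	assert(!grow_flags_ok(BPF_F_ADJ_ROOM_DECAP_L3_IPV4));
}
```

Keeping the grow and shrink checks symmetric means a decap flag passed while
growing, or an encap flag passed while shrinking, fails with the same
-EINVAL path instead of being silently ignored.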



* [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-18 13:42 [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Nick Hudson
                   ` (2 preceding siblings ...)
  2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-18 20:02   ` Willem de Bruijn
                     ` (2 more replies)
  2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
  2026-03-18 20:01 ` [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Willem de Bruijn
  5 siblings, 3 replies; 21+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Add checks to require shrink-only decap, reject conflicting decap flag
combinations, and verify removed length is sufficient for claimed header
decapsulation.

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 37 insertions(+), 10 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 7c2871b40fe4..47aec44a9cd3 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -56,6 +56,7 @@
 #include <net/sock_reuseport.h>
 #include <net/busy_poll.h>
 #include <net/tcp.h>
+#include <net/gre.h>
 #include <net/xfrm.h>
 #include <net/udp.h>
 #include <linux/bpf_trace.h>
@@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 					 BPF_F_ADJ_ROOM_ENCAP_L2( \
 					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
 
-#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
+					 BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
+					 BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
 
 #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
 					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
@@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 		return -ENOTSUPP;
 	}
 
-	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
+	if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
+		u32 len_decap_min = 0;
+
 		if (!shrink)
 			return -EINVAL;
 
-		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
-		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
+		/* Reject mutually exclusive decap flag pairs. */
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
+		    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+			return -EINVAL;
+
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
+		    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
+			return -EINVAL;
+
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
+		    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
+			return -EINVAL;
+
+		/* Reject mutually exclusive decap tunnel type flags. */
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
+		    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
+			return -EINVAL;
+
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
+			len_decap_min += sizeof(struct udphdr);
+
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
+			len_decap_min += sizeof(struct gre_base_hdr);
+
+		if (len_diff_abs < len_decap_min)
+			return -EINVAL;
+
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
 			len_min = sizeof(struct iphdr);
-			break;
-		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
+
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
 			len_min = sizeof(struct ipv6hdr);
-			break;
-		default:
-			return -EINVAL;
-		}
 	}
 
 	len_cur = skb->len - skb_network_offset(skb);
-- 
2.34.1
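
The guard rails added by this patch can be exercised outside the kernel with
a sketch like the one below. It assumes the usual header sizes (8 bytes for
struct udphdr, 4 bytes for struct gre_base_hdr) and models only the checks
visible in the hunk above; the later len_min comparison for the L3 flags is
left out. check_decap_flags() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values mirrored from enum bpf_adj_room_flags. */
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)
#define BPF_F_ADJ_ROOM_DECAP_L4_GRE	(1ULL << 9)
#define BPF_F_ADJ_ROOM_DECAP_L4_UDP	(1ULL << 10)
#define BPF_F_ADJ_ROOM_DECAP_IPXIP4	(1ULL << 11)
#define BPF_F_ADJ_ROOM_DECAP_IPXIP6	(1ULL << 12)

#define DECAP_L3_MASK	 (BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
			  BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
#define DECAP_L4_MASK	 (BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
			  BPF_F_ADJ_ROOM_DECAP_L4_GRE)
#define DECAP_IPXIP_MASK (BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
			  BPF_F_ADJ_ROOM_DECAP_IPXIP6)
#define DECAP_MASK	 (DECAP_L3_MASK | DECAP_L4_MASK | DECAP_IPXIP_MASK)

/* Assumed header sizes, standing in for sizeof() on the kernel structs. */
#define UDP_HDR_LEN		8U	/* struct udphdr */
#define GRE_BASE_HDR_LEN	4U	/* struct gre_base_hdr */

/* Returns 0 if the decap flags are acceptable, -EINVAL otherwise. */
int check_decap_flags(uint64_t flags, bool shrink, uint32_t len_diff_abs)
{
	uint32_t len_decap_min = 0;

	if (!(flags & DECAP_MASK))
		return 0;

	/* Decapsulation only makes sense when removing bytes. */
	if (!shrink)
		return -EINVAL;

	/* Both flags of one pair set at once is contradictory. */
	if ((flags & DECAP_L3_MASK) == DECAP_L3_MASK)
		return -EINVAL;
	if ((flags & DECAP_L4_MASK) == DECAP_L4_MASK)
		return -EINVAL;
	if ((flags & DECAP_IPXIP_MASK) == DECAP_IPXIP_MASK)
		return -EINVAL;

	/* An L4 tunnel and an IP-in-IP tunnel cannot both be claimed. */
	if ((flags & DECAP_L4_MASK) && (flags & DECAP_IPXIP_MASK))
		return -EINVAL;

	/* The removed length must cover the claimed tunnel header. */
	if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
		len_decap_min += UDP_HDR_LEN;
	if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
		len_decap_min += GRE_BASE_HDR_LEN;
	if (len_diff_abs < len_decap_min)
		return -EINVAL;

	return 0;
}

/* Small self-check showing the intended use. */
void guard_rail_examples(void)
{
	assert(check_decap_flags(BPF_F_ADJ_ROOM_DECAP_L4_UDP, true, 8) == 0);
	assert(check_decap_flags(BPF_F_ADJ_ROOM_DECAP_L4_UDP, true, 7) == -EINVAL);
}
```

Because the two L4 flags are mutually exclusive, len_decap_min ends up as
either the UDP or the GRE base header size, never their sum.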



* [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room
  2026-03-18 13:42 [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Nick Hudson
                   ` (3 preceding siblings ...)
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-18 20:09   ` Willem de Bruijn
  2026-03-18 20:01 ` [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Willem de Bruijn
  5 siblings, 1 reply; 21+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

On shrink in bpf_skb_adjust_room(), clear tunnel-specific GSO flags
according to the decapsulation flags:

- BPF_F_ADJ_ROOM_DECAP_L4_UDP clears SKB_GSO_UDP_TUNNEL{,_CSUM}
- BPF_F_ADJ_ROOM_DECAP_L4_GRE clears SKB_GSO_GRE{,_CSUM}
- BPF_F_ADJ_ROOM_DECAP_IPXIP4 clears SKB_GSO_IPXIP4
- BPF_F_ADJ_ROOM_DECAP_IPXIP6 clears SKB_GSO_IPXIP6

When all tunnel-related GSO bits are cleared, also clear
skb->encapsulation.

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 net/core/filter.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 47aec44a9cd3..35af1199ab97 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3665,6 +3665,37 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
 		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
 			skb_increase_gso_size(shinfo, len_diff);
 
+		/* Selective GSO flag clearing based on decap type.
+		 * Only clear the flags for the tunnel layer being removed.
+		 */
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP) &&
+		    (shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
+					 SKB_GSO_UDP_TUNNEL_CSUM)))
+			shinfo->gso_type &= ~(SKB_GSO_UDP_TUNNEL |
+					      SKB_GSO_UDP_TUNNEL_CSUM);
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE) &&
+		    (shinfo->gso_type & (SKB_GSO_GRE | SKB_GSO_GRE_CSUM)))
+			shinfo->gso_type &= ~(SKB_GSO_GRE |
+					      SKB_GSO_GRE_CSUM);
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP4) &&
+		    (shinfo->gso_type & SKB_GSO_IPXIP4))
+			shinfo->gso_type &= ~SKB_GSO_IPXIP4;
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP6) &&
+		    (shinfo->gso_type & SKB_GSO_IPXIP6))
+			shinfo->gso_type &= ~SKB_GSO_IPXIP6;
+
+		/* Clear encapsulation flag only when no tunnel GSO flags remain */
+		if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
+			if (!(shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
+						  SKB_GSO_UDP_TUNNEL_CSUM |
+						  SKB_GSO_GRE |
+						  SKB_GSO_GRE_CSUM |
+						  SKB_GSO_IPXIP4 |
+						  SKB_GSO_IPXIP6)))
+				if (skb->encapsulation)
+					skb->encapsulation = 0;
+		}
+
 		/* Header must be checked, and gso_segs recomputed. */
 		shinfo->gso_type |= SKB_GSO_DODGY;
 		shinfo->gso_segs = 0;
-- 
2.34.1
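
The clearing logic in this patch can be followed with a small user-space
model. Everything here is a sketch under stated assumptions: struct
sketch_skb and the GSO_* bit values are placeholders invented for the
example (the real SKB_GSO_* constants and struct sk_buff are not
reproduced), and clear_decap_gso_state() is a hypothetical name. The patch
tests each GSO bit before clearing it; clearing unconditionally, as done
here, gives the same result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values mirrored from enum bpf_adj_room_flags. */
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)
#define BPF_F_ADJ_ROOM_DECAP_L4_GRE	(1ULL << 9)
#define BPF_F_ADJ_ROOM_DECAP_L4_UDP	(1ULL << 10)
#define BPF_F_ADJ_ROOM_DECAP_IPXIP4	(1ULL << 11)
#define BPF_F_ADJ_ROOM_DECAP_IPXIP6	(1ULL << 12)
#define DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
			 BPF_F_ADJ_ROOM_DECAP_L3_IPV6 | \
			 BPF_F_ADJ_ROOM_DECAP_L4_GRE | \
			 BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
			 BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
			 BPF_F_ADJ_ROOM_DECAP_IPXIP6)

/* Placeholder bits: NOT the kernel's SKB_GSO_* values. */
#define GSO_GRE			(1u << 0)
#define GSO_GRE_CSUM		(1u << 1)
#define GSO_IPXIP4		(1u << 2)
#define GSO_IPXIP6		(1u << 3)
#define GSO_UDP_TUNNEL		(1u << 4)
#define GSO_UDP_TUNNEL_CSUM	(1u << 5)
#define GSO_TUNNEL_MASK		(GSO_GRE | GSO_GRE_CSUM | GSO_IPXIP4 | \
				 GSO_IPXIP6 | GSO_UDP_TUNNEL | \
				 GSO_UDP_TUNNEL_CSUM)

/* Minimal stand-in for the two skb fields the patch touches. */
struct sketch_skb {
	uint32_t gso_type;
	bool encapsulation;
};

void clear_decap_gso_state(struct sketch_skb *skb, uint64_t flags)
{
	/* Clear only the GSO bits of the tunnel layer being removed. */
	if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
		skb->gso_type &= ~(GSO_UDP_TUNNEL | GSO_UDP_TUNNEL_CSUM);
	if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
		skb->gso_type &= ~(GSO_GRE | GSO_GRE_CSUM);
	if (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP4)
		skb->gso_type &= ~GSO_IPXIP4;
	if (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP6)
		skb->gso_type &= ~GSO_IPXIP6;

	/* Drop the encapsulation mark once no tunnel GSO bit remains. */
	if ((flags & DECAP_MASK) && !(skb->gso_type & GSO_TUNNEL_MASK))
		skb->encapsulation = false;
}

/* Small self-check showing the intended use. */
void gso_clear_examples(void)
{
	struct sketch_skb skb = { GSO_IPXIP4, true };

	clear_decap_gso_state(&skb, BPF_F_ADJ_ROOM_DECAP_IPXIP4);
	assert(skb.gso_type == 0 && !skb.encapsulation);
}
```

With nested tunnels, for example UDP over GRE, removing the UDP layer leaves
the GRE bits set and the encapsulation mark untouched; only the final decap
call clears it.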



* Re: [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags
  2026-03-18 13:42 [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Nick Hudson
                   ` (4 preceding siblings ...)
  2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
@ 2026-03-18 20:01 ` Willem de Bruijn
  5 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2026-03-18 20:01 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev; +Cc: Willem de Bruijn, Nick Hudson

Nick Hudson wrote:
> This series refactors the bpf_skb_adjust_room() helper to support tunnel
> decapsulation with L4 and IPXIP modes, in addition to the existing L3
> decapsulation support.
> 
> The changes are structured as follows:
> 1. Name the enum for BPF_FUNC_skb_adjust_room flags
> 2. Add new BPF_F_ADJ_ROOM_DECAP_* flags for L4 and IPXIP decap modes
> 3. Introduce helper masks to simplify validation logic
> 4. Enable the new decap flags and add guard rails for invalid combinations
> 5. Clear tunnel GSO state when decapsulating in skb_adjust_room
> 
> These patches enable BPF programs to efficiently decapsulate various tunnel
> encapsulation formats and properly handle GSO state transitions.

Consider expanding the existing BPF tunneling tests, similar to
commit 166b5a7f2ca3 ("selftests_bpf: extend test_tc_tunnel for UDP
encap"). Given how far along this series is, it is fine to do that
separately.


* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
@ 2026-03-18 20:02   ` Willem de Bruijn
  2026-03-19  8:17     ` Hudson, Nick
  2026-03-21  0:40   ` Willem de Bruijn
  2026-03-24 18:30   ` Martin KaFai Lau
  2 siblings, 1 reply; 21+ messages in thread
From: Willem de Bruijn @ 2026-03-18 20:02 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Nick Hudson wrote:
> Add checks to require shrink-only decap, reject conflicting decap flag
> combinations, and verify removed length is sufficient for claimed header
> decapsulation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
>  net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 7c2871b40fe4..47aec44a9cd3 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -56,6 +56,7 @@
>  #include <net/sock_reuseport.h>
>  #include <net/busy_poll.h>
>  #include <net/tcp.h>
> +#include <net/gre.h>
>  #include <net/xfrm.h>
>  #include <net/udp.h>
>  #include <linux/bpf_trace.h>
> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>  					 BPF_F_ADJ_ROOM_ENCAP_L2( \
>  					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
>  
> -#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>  
>  #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
>  					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>  		return -ENOTSUPP;
>  	}
>  
> -	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> +	if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> +		u32 len_decap_min = 0;
> +
>  		if (!shrink)
>  			return -EINVAL;
>  
> -		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
> +		/* Reject mutually exclusive decap flag pairs. */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +			return -EINVAL;
> +
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
> +			return -EINVAL;
> +
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> +			return -EINVAL;
> +
> +		/* Reject mutually exclusive decap tunnel type flags. */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
> +		    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
> +			return -EINVAL;
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
> +			len_decap_min += sizeof(struct udphdr);
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> +			len_decap_min += sizeof(struct gre_base_hdr);
> +
> +		if (len_diff_abs < len_decap_min)
> +			return -EINVAL;

Should this test come after the below IP flags?

> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>  			len_min = sizeof(struct iphdr);
> -			break;
> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>  			len_min = sizeof(struct ipv6hdr);
> -			break;
> -		default:
> -			return -EINVAL;
> -		}
>  	}
>  
>  	len_cur = skb->len - skb_network_offset(skb);
> -- 
> 2.34.1
> 




* Re: [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room
  2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
@ 2026-03-18 20:09   ` Willem de Bruijn
  0 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2026-03-18 20:09 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Nick Hudson wrote:
> On shrink in bpf_skb_adjust_room(), clear tunnel-specific GSO flags
> according to the decapsulation flags:
> 
> - BPF_F_ADJ_ROOM_DECAP_L4_UDP clears SKB_GSO_UDP_TUNNEL{,_CSUM}
> - BPF_F_ADJ_ROOM_DECAP_L4_GRE clears SKB_GSO_GRE{,_CSUM}
> - BPF_F_ADJ_ROOM_DECAP_IPXIP4 clears SKB_GSO_IPXIP4
> - BPF_F_ADJ_ROOM_DECAP_IPXIP6 clears SKB_GSO_IPXIP6
> 
> When all tunnel-related GSO bits are cleared, also clear
> skb->encapsulation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
>  net/core/filter.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 47aec44a9cd3..35af1199ab97 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3665,6 +3665,37 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
>  		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
>  			skb_increase_gso_size(shinfo, len_diff);
>  
> +		/* Selective GSO flag clearing based on decap type.
> +		 * Only clear the flags for the tunnel layer being removed.
> +		 */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP) &&
> +		    (shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
> +					 SKB_GSO_UDP_TUNNEL_CSUM)))
> +			shinfo->gso_type &= ~(SKB_GSO_UDP_TUNNEL |
> +					      SKB_GSO_UDP_TUNNEL_CSUM);
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE) &&
> +		    (shinfo->gso_type & (SKB_GSO_GRE | SKB_GSO_GRE_CSUM)))
> +			shinfo->gso_type &= ~(SKB_GSO_GRE |
> +					      SKB_GSO_GRE_CSUM);
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP4) &&
> +		    (shinfo->gso_type & SKB_GSO_IPXIP4))
> +			shinfo->gso_type &= ~SKB_GSO_IPXIP4;
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP6) &&
> +		    (shinfo->gso_type & SKB_GSO_IPXIP6))
> +			shinfo->gso_type &= ~SKB_GSO_IPXIP6;
> +
> +		/* Clear encapsulation flag only when no tunnel GSO flags remain */
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> +			if (!(shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
> +						  SKB_GSO_UDP_TUNNEL_CSUM |
> +						  SKB_GSO_GRE |
> +						  SKB_GSO_GRE_CSUM |
> +						  SKB_GSO_IPXIP4 |
> +						  SKB_GSO_IPXIP6)))
> +				if (skb->encapsulation)
> +					skb->encapsulation = 0;

Is there any chance that this might clear it while some other tunnel
is still active? From a quick grep on skb->encapsulation the only
possible hit I see is SKB_GSO_ESP.

> +		}
> +
>  		/* Header must be checked, and gso_segs recomputed. */
>  		shinfo->gso_type |= SKB_GSO_DODGY;
>  		shinfo->gso_segs = 0;
> -- 
> 2.34.1
> 




* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-18 20:02   ` Willem de Bruijn
@ 2026-03-19  8:17     ` Hudson, Nick
  2026-03-19 13:24       ` Willem de Bruijn
  0 siblings, 1 reply; 21+ messages in thread
From: Hudson, Nick @ 2026-03-19  8:17 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, Tottenham, Max,
	Glasgall, Anna, Martin KaFai Lau, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel@vger.kernel.org


> On 18 Mar 2026, at 20:02, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> 
> Nick Hudson wrote:
>> Add checks to require shrink-only decap, reject conflicting decap flag
>> combinations, and verify removed length is sufficient for claimed header
>> decapsulation.
>> 
>> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
>> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
>> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Nick Hudson <nhudson@akamai.com>
>> ---
>> net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>> 1 file changed, 37 insertions(+), 10 deletions(-)
>> 
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 7c2871b40fe4..47aec44a9cd3 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -56,6 +56,7 @@
>> #include <net/sock_reuseport.h>
>> #include <net/busy_poll.h>
>> #include <net/tcp.h>
>> +#include <net/gre.h>
>> #include <net/xfrm.h>
>> #include <net/udp.h>
>> #include <linux/bpf_trace.h>
>> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>> BPF_F_ADJ_ROOM_ENCAP_L2( \
>>  BPF_ADJ_ROOM_ENCAP_L2_MASK))
>> 
>> -#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
>> + BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
>> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>> 
>> #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> BPF_F_ADJ_ROOM_ENCAP_MASK | \
>> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>> return -ENOTSUPP;
>> }
>> 
>> - if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
>> + u32 len_decap_min = 0;
>> +
>> if (!shrink)
>> return -EINVAL;
>> 
>> - switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
>> + /* Reject mutually exclusive decap flag pairs. */
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
>> +    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
>> +    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
>> +    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>> + return -EINVAL;
>> +
>> + /* Reject mutually exclusive decap tunnel type flags. */
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
>> +    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
>> + return -EINVAL;
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
>> + len_decap_min += sizeof(struct udphdr);
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
>> + len_decap_min += sizeof(struct gre_base_hdr);
>> +
>> + if (len_diff_abs < len_decap_min)
>> + return -EINVAL;
> 
> Should this test come after the below IP flags?

Should it?

It seems to me that it can bail out early without having to check the IP flags, since it compares against len_decap_min rather than len_min.

What am I missing?

> 
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>> len_min = sizeof(struct iphdr);
>> - break;
>> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>> len_min = sizeof(struct ipv6hdr);
>> - break;
>> - default:
>> - return -EINVAL;
>> - }
>> }
>> 
>> len_cur = skb->len - skb_network_offset(skb);
>> -- 
>> 2.34.1
>> 
> 
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-19  8:17     ` Hudson, Nick
@ 2026-03-19 13:24       ` Willem de Bruijn
  0 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2026-03-19 13:24 UTC (permalink / raw)
  To: Hudson, Nick, Willem de Bruijn
  Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, Tottenham, Max,
	Glasgall, Anna, Martin KaFai Lau, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel@vger.kernel.org

Hudson, Nick wrote:
> 
> 
> > On 18 Mar 2026, at 20:02, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> > 
> > Nick Hudson wrote:
> >> Add checks to require shrink-only decap, reject conflicting decap flag
> >> combinations, and verify removed length is sufficient for claimed header
> >> decapsulation.
> >> 
> >> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> >> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> >> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> >> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> >> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> >> ---
> >> net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
> >> 1 file changed, 37 insertions(+), 10 deletions(-)
> >> 
> >> diff --git a/net/core/filter.c b/net/core/filter.c
> >> index 7c2871b40fe4..47aec44a9cd3 100644
> >> --- a/net/core/filter.c
> >> +++ b/net/core/filter.c
> >> @@ -56,6 +56,7 @@
> >> #include <net/sock_reuseport.h>
> >> #include <net/busy_poll.h>
> >> #include <net/tcp.h>
> >> +#include <net/gre.h>
> >> #include <net/xfrm.h>
> >> #include <net/udp.h>
> >> #include <linux/bpf_trace.h>
> >> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
> >> BPF_F_ADJ_ROOM_ENCAP_L2( \
> >>  BPF_ADJ_ROOM_ENCAP_L2_MASK))
> >> 
> >> -#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> >> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
> >> + BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
> >> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> >> 
> >> #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
> >> BPF_F_ADJ_ROOM_ENCAP_MASK | \
> >> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
> >> return -ENOTSUPP;
> >> }
> >> 
> >> - if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> >> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> >> + u32 len_decap_min = 0;
> >> +
> >> if (!shrink)
> >> return -EINVAL;
> >> 
> >> - switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> >> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
> >> + /* Reject mutually exclusive decap flag pairs. */
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> >> +    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> >> + return -EINVAL;
> >> +
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
> >> +    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
> >> + return -EINVAL;
> >> +
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
> >> +    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> >> + return -EINVAL;
> >> +
> >> + /* Reject mutually exclusive decap tunnel type flags. */
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
> >> +    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
> >> + return -EINVAL;
> >> +
> >> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
> >> + len_decap_min += sizeof(struct udphdr);
> >> +
> >> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> >> + len_decap_min += sizeof(struct gre_base_hdr);
> >> +
> >> + if (len_diff_abs < len_decap_min)
> >> + return -EINVAL;
> > 
> > Should this test come after the below IP flags?
> 
> Should it?
> 
> It seems to me that it can bail out early without having to check the IP flags, since it compares against len_decap_min rather than len_min.
> 
> What am I missing?

I would think it common that UDP decap also includes an L3 decap, in
which case the len_decap_min should include both header lengths.
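
One way to read this suggestion, as a userspace sketch rather than the
actual filter.c change: fold the length of the stripped L3 header into the
same minimum as the L4 header before comparing against len_diff_abs. The
flag bit values, the l3_hdr_len parameter and both function names are
assumptions made for illustration; only the UDP (8 byte) and base GRE
(4 byte) header sizes are taken as given.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real flags live in include/uapi/linux/bpf.h. */
#define DECAP_L3_IPV4	(1ULL << 0)
#define DECAP_L3_IPV6	(1ULL << 1)
#define DECAP_L4_UDP	(1ULL << 2)
#define DECAP_L4_GRE	(1ULL << 3)

#define UDP_HDR_LEN		8u	/* sizeof(struct udphdr) */
#define GRE_BASE_HDR_LEN	4u	/* sizeof(struct gre_base_hdr) */

/*
 * Minimum number of bytes that must be removed for the claimed decap.
 * l3_hdr_len is the length of the L3 header being stripped; this sketch
 * assumes the caller already knows it.
 */
static uint32_t decap_len_min(uint64_t flags, uint32_t l3_hdr_len)
{
	uint32_t min = 0;

	if (flags & (DECAP_L3_IPV4 | DECAP_L3_IPV6))
		min += l3_hdr_len;
	if (flags & DECAP_L4_UDP)
		min += UDP_HDR_LEN;
	if (flags & DECAP_L4_GRE)
		min += GRE_BASE_HDR_LEN;

	return min;
}

/* True when the requested shrink covers every header the flags claim. */
static bool decap_len_ok(uint64_t flags, uint32_t l3_hdr_len,
			 uint32_t len_diff_abs)
{
	return len_diff_abs >= decap_len_min(flags, l3_hdr_len);
}
```

With a 20 byte IPv4 header plus UDP, a shrink of only 8 bytes would be
rejected by this model, while a shrink of 28 bytes passes.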

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags
  2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
@ 2026-03-21  0:39   ` Willem de Bruijn
  2026-03-24 17:34   ` Martin KaFai Lau
  1 sibling, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2026-03-21  0:39 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	linux-kernel

Nick Hudson wrote:
> The existing anonymous enum for BPF_FUNC_skb_adjust_room flags is
> named to enum bpf_adj_room_flags to enable CO-RE (Compile Once -
> Run Everywhere) lookups in BPF programs.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation
  2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
@ 2026-03-21  0:39   ` Willem de Bruijn
  0 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2026-03-21  0:39 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	linux-kernel

Nick Hudson wrote:
> Add new bpf_skb_adjust_room() decapsulation flags:
> 
> - BPF_F_ADJ_ROOM_DECAP_L4_GRE
> - BPF_F_ADJ_ROOM_DECAP_L4_UDP
> - BPF_F_ADJ_ROOM_DECAP_IPXIP4
> - BPF_F_ADJ_ROOM_DECAP_IPXIP6
> 
> These flags let BPF programs describe which tunnel layer is being
> removed, so later changes can update tunnel-related GSO state
> accordingly during decapsulation.
> 
> This patch only introduces the UAPI flag definitions and helper
> documentation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
@ 2026-03-21  0:39   ` Willem de Bruijn
  2026-03-24 18:12   ` Martin KaFai Lau
  1 sibling, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2026-03-21  0:39 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Nick Hudson wrote:
> Introduce helper masks for bpf_skb_adjust_room() flags to simplify
> validation logic:
> 
> - BPF_F_ADJ_ROOM_DECAP_L4_MASK
> - BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
> - BPF_F_ADJ_ROOM_ENCAP_MASK
> - BPF_F_ADJ_ROOM_DECAP_MASK
> 
> Add flag validation to bpf_skb_net_grow() to reject invalid encap
> flags early. Refactor existing validation checks in bpf_skb_net_shrink()
> and bpf_skb_adjust_room() to use the new masks (no behavior change).
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
  2026-03-18 20:02   ` Willem de Bruijn
@ 2026-03-21  0:40   ` Willem de Bruijn
  2026-03-24 18:30   ` Martin KaFai Lau
  2 siblings, 0 replies; 21+ messages in thread
From: Willem de Bruijn @ 2026-03-21  0:40 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Nick Hudson wrote:
> Add checks to require shrink-only decap, reject conflicting decap flag
> combinations, and verify removed length is sufficient for claimed header
> decapsulation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags
  2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
@ 2026-03-24 17:34   ` Martin KaFai Lau
  1 sibling, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2026-03-24 17:34 UTC (permalink / raw)
  To: Nick Hudson
  Cc: Willem de Bruijn, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, netdev,
	linux-kernel


On 3/18/26 6:42 AM, Nick Hudson wrote:
> The existing anonymous enum for BPF_FUNC_skb_adjust_room flags is
> named to enum bpf_adj_room_flags to enable CO-RE (Compile Once -
> Run Everywhere) lookups in BPF programs.

It would be useful to demonstrate the intended CO-RE usage in a 
selftest. I suspect it is bpf_core_enum_value_exists().

There are existing tests in test_tc_tunnel.c for the earlier 
BPF_F_ADJ_ROOM_* flag additions. Please add similar tests for the new 
flags introduced in this series.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
@ 2026-03-24 18:12   ` Martin KaFai Lau
  2026-03-26 17:02     ` Hudson, Nick
  1 sibling, 1 reply; 21+ messages in thread
From: Martin KaFai Lau @ 2026-03-24 18:12 UTC (permalink / raw)
  To: Nick Hudson
  Cc: Willem de Bruijn, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf,
	netdev, linux-kernel

On 3/18/26 6:42 AM, Nick Hudson wrote:
> Introduce helper masks for bpf_skb_adjust_room() flags to simplify
> validation logic:
> 
> - BPF_F_ADJ_ROOM_DECAP_L4_MASK
> - BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
> - BPF_F_ADJ_ROOM_ENCAP_MASK
> - BPF_F_ADJ_ROOM_DECAP_MASK
> 
> Add flag validation to bpf_skb_net_grow() to reject invalid encap
> flags early. Refactor existing validation checks in bpf_skb_net_shrink()
> and bpf_skb_adjust_room() to use the new masks (no behavior change).
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
>   net/core/filter.c | 31 +++++++++++++++++++++++--------
>   1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 0d5d5a17acb2..7c2871b40fe4 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3483,14 +3483,25 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>   #define BPF_F_ADJ_ROOM_DECAP_L3_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
>   					 BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>   
> -#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
> -					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
> +#define BPF_F_ADJ_ROOM_DECAP_L4_MASK	(BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
> +					 BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> +
> +#define BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK	(BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
> +					 BPF_F_ADJ_ROOM_DECAP_IPXIP6)
> +
> +#define BPF_F_ADJ_ROOM_ENCAP_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
>   					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
>   					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
>   					 BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
>   					 BPF_F_ADJ_ROOM_ENCAP_L2( \
> -					  BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
> -					 BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
> +
> +#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +
> +#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
> +					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_MASK | \
> +					 BPF_F_ADJ_ROOM_NO_CSUM_RESET)

The patch does two things: refactoring of existing macros 
(BPF_F_ADJ_ROOM_ENCAP_MASK, BPF_F_ADJ_ROOM_DECAP_MASK) and new additions 
(BPF_F_ADJ_ROOM_DECAP_L4_MASK, BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) that 
depend on the new flags from the UAPI changes in patch 2.

The refactoring does not depend on the new UAPI flags and could be a 
separate patch placed earlier in the series. That way a reviewer can 
verify it is a no-op without the new flag additions getting in the way. 
The new masks (BPF_F_ADJ_ROOM_DECAP_L4_MASK, 
BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) can be introduced together in patch 4, 
where they are first used.

>   
>   static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>   			    u64 flags)
> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>   	unsigned int gso_type = SKB_GSO_DODGY;
>   	int ret;
>   
> +	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
> +			       BPF_F_ADJ_ROOM_NO_CSUM_RESET |
> +			       BPF_F_ADJ_ROOM_FIXED_GSO)))

In which case will this new check be hit?

> +		return -EINVAL;
> +
>   	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
>   		/* udp gso_size delineates datagrams, only allow if fixed */
>   		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
> @@ -3611,8 +3627,8 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
>   {
>   	int ret;
>   
> -	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
> -			       BPF_F_ADJ_ROOM_DECAP_L3_MASK |
> +	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_DECAP_MASK |
> +			       BPF_F_ADJ_ROOM_FIXED_GSO |
>   			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
>   		return -EINVAL;
>   
> @@ -3708,8 +3724,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>   	u32 off;
>   	int ret;
>   
> -	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_MASK |
> -			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
> +	if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
>   		return -EINVAL;
>   	if (unlikely(len_diff_abs > 0xfffU))
>   		return -EFAULT;


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
  2026-03-18 20:02   ` Willem de Bruijn
  2026-03-21  0:40   ` Willem de Bruijn
@ 2026-03-24 18:30   ` Martin KaFai Lau
  2026-03-26 17:02     ` Hudson, Nick
  2 siblings, 1 reply; 21+ messages in thread
From: Martin KaFai Lau @ 2026-03-24 18:30 UTC (permalink / raw)
  To: Nick Hudson
  Cc: Willem de Bruijn, Max Tottenham, Anna Glasgall, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf, netdev,
	linux-kernel



On 3/18/26 6:42 AM, Nick Hudson wrote:
> Add checks to require shrink-only decap, reject conflicting decap flag
> combinations, and verify removed length is sufficient for claimed header
> decapsulation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
>   net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>   1 file changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 7c2871b40fe4..47aec44a9cd3 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -56,6 +56,7 @@
>   #include <net/sock_reuseport.h>
>   #include <net/busy_poll.h>
>   #include <net/tcp.h>
> +#include <net/gre.h>
>   #include <net/xfrm.h>
>   #include <net/udp.h>
>   #include <linux/bpf_trace.h>
> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>   					 BPF_F_ADJ_ROOM_ENCAP_L2( \
>   					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
>   
> -#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>   
>   #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
>   					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>   		return -ENOTSUPP;
>   	}
>   
> -	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> +	if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {

This change should be done together with the macro refactoring patch 
mentioned in patch 3.

> +		u32 len_decap_min = 0;
> +
>   		if (!shrink)
>   			return -EINVAL;
>   
> -		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
> +		/* Reject mutually exclusive decap flag pairs. */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_L3_MASK)

iiuc, this 'if' and the len_min assignment changes below replace the 
existing switch case. Please separate this no-op change from the new 
flag validation logic. It is small enough to be done together in the 
macro refactoring patch also.

> +			return -EINVAL;
> +
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
> +			return -EINVAL;
> +
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> +			return -EINVAL;
> +
> +		/* Reject mutually exclusive decap tunnel type flags. */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
> +		    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
> +			return -EINVAL;
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
> +			len_decap_min += sizeof(struct udphdr);
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> +			len_decap_min += sizeof(struct gre_base_hdr);
> +
> +		if (len_diff_abs < len_decap_min)
> +			return -EINVAL;
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>   			len_min = sizeof(struct iphdr);
> -			break;
> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>   			len_min = sizeof(struct ipv6hdr);
> -			break;
> -		default:
> -			return -EINVAL;
> -		}
>   	}
>   
>   	len_cur = skb->len - skb_network_offset(skb);


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-24 18:30   ` Martin KaFai Lau
@ 2026-03-26 17:02     ` Hudson, Nick
  0 siblings, 0 replies; 21+ messages in thread
From: Hudson, Nick @ 2026-03-26 17:02 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org




> On Mar 24, 2026, at 6:30 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
> 
> 
> 
> On 3/18/26 6:42 AM, Nick Hudson wrote:
>> Add checks to require shrink-only decap, reject conflicting decap flag
>> combinations, and verify removed length is sufficient for claimed header
>> decapsulation.
>> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
>> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
>> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Nick Hudson <nhudson@akamai.com>
>> ---
>>  net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>>  1 file changed, 37 insertions(+), 10 deletions(-)
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 7c2871b40fe4..47aec44a9cd3 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -56,6 +56,7 @@
>>  #include <net/sock_reuseport.h>
>>  #include <net/busy_poll.h>
>>  #include <net/tcp.h>
>> +#include <net/gre.h>
>>  #include <net/xfrm.h>
>>  #include <net/udp.h>
>>  #include <linux/bpf_trace.h>
>> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>>  					 BPF_F_ADJ_ROOM_ENCAP_L2( \
>>  					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
>>  -#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
>> +					 BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
>> +					 BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>>    #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
>>  					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
>> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>>  		return -ENOTSUPP;
>>  	}
>>  -	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> +	if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> 
> This change should be done together with the macro refactoring patch mentioned in patch 3.

OK, will send a new version with it done this way.

> 
>> +		u32 len_decap_min = 0;
>> +
>>  		if (!shrink)
>>  			return -EINVAL;
>>  -		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
>> +		/* Reject mutually exclusive decap flag pairs. */
>> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
>> +		    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> 
> iiuc, this 'if' and the len_min assignment changes below replace the existing switch case. Please separate this no-op change from the new flag validation logic. It is small enough to be done together in the macro refactoring patch also.
> 
>> +			return -EINVAL;
>> +
>> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
>> +		    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
>> +			return -EINVAL;
>> +
>> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
>> +		    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>> +			return -EINVAL;
>> +
>> +		/* Reject mutually exclusive decap tunnel type flags. */
>> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
>> +		    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
>> +			return -EINVAL;
>> +
>> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
>> +			len_decap_min += sizeof(struct udphdr);
>> +
>> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
>> +			len_decap_min += sizeof(struct gre_base_hdr);
>> +
>> +		if (len_diff_abs < len_decap_min)
>> +			return -EINVAL;
>> +
>> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>>  			len_min = sizeof(struct iphdr);
>> -			break;
>> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
>> +
>> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>>  			len_min = sizeof(struct ipv6hdr);
>> -			break;
>> -		default:
>> -			return -EINVAL;
>> -		}
>>  	}
>>    	len_cur = skb->len - skb_network_offset(skb);
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-24 18:12   ` Martin KaFai Lau
@ 2026-03-26 17:02     ` Hudson, Nick
  2026-03-26 17:49       ` Martin KaFai Lau
  0 siblings, 1 reply; 21+ messages in thread
From: Hudson, Nick @ 2026-03-26 17:02 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org




> On Mar 24, 2026, at 6:12 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
> 
> On 3/18/26 6:42 AM, Nick Hudson wrote:
>> Introduce helper masks for bpf_skb_adjust_room() flags to simplify
>> validation logic:
>> - BPF_F_ADJ_ROOM_DECAP_L4_MASK
>> - BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
>> - BPF_F_ADJ_ROOM_ENCAP_MASK
>> - BPF_F_ADJ_ROOM_DECAP_MASK
>> Add flag validation to bpf_skb_net_grow() to reject invalid encap
>> flags early. Refactor existing validation checks in bpf_skb_net_shrink()
>> and bpf_skb_adjust_room() to use the new masks (no behavior change).
>> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
>> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
>> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Nick Hudson <nhudson@akamai.com>
>> ---
>>  net/core/filter.c | 31 +++++++++++++++++++++++--------
>>  1 file changed, 23 insertions(+), 8 deletions(-)
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 0d5d5a17acb2..7c2871b40fe4 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3483,14 +3483,25 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>>  #define BPF_F_ADJ_ROOM_DECAP_L3_MASK (BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
>>    BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>>  -#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> -  BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
>> +#define BPF_F_ADJ_ROOM_DECAP_L4_MASK (BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
>> +  BPF_F_ADJ_ROOM_DECAP_L4_GRE)
>> +
>> +#define BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK (BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
>> +  BPF_F_ADJ_ROOM_DECAP_IPXIP6)
>> +
>> +#define BPF_F_ADJ_ROOM_ENCAP_MASK (BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
>>    BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
>>    BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
>>    BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
>>    BPF_F_ADJ_ROOM_ENCAP_L2( \
>> -   BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
>> -  BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +   BPF_ADJ_ROOM_ENCAP_L2_MASK))
>> +
>> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +
>> +#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> +  BPF_F_ADJ_ROOM_ENCAP_MASK | \
>> +  BPF_F_ADJ_ROOM_DECAP_MASK | \
>> +  BPF_F_ADJ_ROOM_NO_CSUM_RESET)
> 
> The patch does two things: refactoring of existing macros (BPF_F_ADJ_ROOM_ENCAP_MASK, BPF_F_ADJ_ROOM_DECAP_MASK) and new additions (BPF_F_ADJ_ROOM_DECAP_L4_MASK, BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) that depend on the new flags from the UAPI changes in patch 2.
> 
> The refactoring does not depend on the new UAPI flags and could be a separate patch placed earlier in the series. That way a reviewer can verify it is a no-op without the new flag additions getting in the way. The new masks (BPF_F_ADJ_ROOM_DECAP_L4_MASK, BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) can be introduced together in patch 4, where they are first used.

OK, will split further.

> 
>>    static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>       u64 flags)
>> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>   unsigned int gso_type = SKB_GSO_DODGY;
>>   int ret;
>>  + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
>> +        BPF_F_ADJ_ROOM_NO_CSUM_RESET |
>> +        BPF_F_ADJ_ROOM_FIXED_GSO)))
> 
> In which case will this new check be hit?

If a user supplies a positive len_diff and attempts to pass a DECAP flag.

The commit message says:

    Add flag validation to bpf_skb_net_grow() to reject invalid encap
    flags early.

> 
>> + return -EINVAL;
>> +
>>   if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
>>   /* udp gso_size delineates datagrams, only allow if fixed */
>>   if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
>> @@ -3611,8 +3627,8 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
>>  {
>>   int ret;
>>  - if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
>> -        BPF_F_ADJ_ROOM_DECAP_L3_MASK |
>> + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_DECAP_MASK |
>> +        BPF_F_ADJ_ROOM_FIXED_GSO |
>>          BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
>>   return -EINVAL;
>>  @@ -3708,8 +3724,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>>   u32 off;
>>   int ret;
>>  - if (unlikely(flags & ~(BPF_F_ADJ_ROOM_MASK |
>> -        BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
>> + if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
>>   return -EINVAL;
>>   if (unlikely(len_diff_abs > 0xfffU))
>>   return -EFAULT;




* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-26 17:02     ` Hudson, Nick
@ 2026-03-26 17:49       ` Martin KaFai Lau
  0 siblings, 0 replies; 21+ messages in thread
From: Martin KaFai Lau @ 2026-03-26 17:49 UTC (permalink / raw)
  To: Hudson, Nick
  Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org



On 3/26/26 10:02 AM, Hudson, Nick wrote:
>>>     static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>>        u64 flags)
>>> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>>    unsigned int gso_type = SKB_GSO_DODGY;
>>>    int ret;
>>>   + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
>>> +        BPF_F_ADJ_ROOM_NO_CSUM_RESET |
>>> +        BPF_F_ADJ_ROOM_FIXED_GSO)))
>> In which case will this new check be hit?
> If a user supplies a positive len_diff and attempts to pass a DECAP flag.
> 
> The commit message had
> 
>      Add flag validation to bpf_skb_net_grow() to reject invalid encap
>      flags early.

There is already a DECAP_MASK check in bpf_skb_adjust_room(), and a decap
flag with !shrink is rejected there. What am I missing?
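
To make the question concrete, here is a minimal user-space model of the two layers, assuming the caller's decap check runs before the grow/shrink dispatch as described above. All EX_* names, bit values and the ex_* functions are hypothetical stand-ins for illustration, not the kernel code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in bits; NOT the real BPF_F_ADJ_ROOM_* values. */
#define EX_FIXED_GSO      (1ULL << 0)
#define EX_ENCAP_MASK     (1ULL << 1)
#define EX_NO_CSUM_RESET  (1ULL << 2)
#define EX_DECAP_MASK     (1ULL << 3)
#define EX_ALL_MASK       (EX_FIXED_GSO | EX_ENCAP_MASK | \
			   EX_NO_CSUM_RESET | EX_DECAP_MASK)

/* Counts how often the grow-side check itself rejects a call. */
static int ex_grow_rejects;

/* Models the flag check this patch adds to the grow path. */
static int ex_net_grow(uint64_t flags)
{
	if (flags & ~(EX_ENCAP_MASK | EX_NO_CSUM_RESET | EX_FIXED_GSO)) {
		ex_grow_rejects++;
		return -EINVAL;
	}
	return 0;
}

/* Models the existing flag check on the shrink path. */
static int ex_net_shrink(uint64_t flags)
{
	if (flags & ~(EX_DECAP_MASK | EX_FIXED_GSO | EX_NO_CSUM_RESET))
		return -EINVAL;
	return 0;
}

/* Models the caller: unknown flags and decap-on-grow are rejected first. */
static int ex_adjust_room(int32_t len_diff, uint64_t flags)
{
	int shrink = len_diff < 0;

	if (flags & ~EX_ALL_MASK)
		return -EINVAL;
	if ((flags & EX_DECAP_MASK) && !shrink)
		return -EINVAL;

	return shrink ? ex_net_shrink(flags) : ex_net_grow(flags);
}
```

In this model, ex_adjust_room(8, EX_DECAP_MASK) fails in the caller and ex_grow_rejects stays at zero; the grow-side check only fires when ex_net_grow() is called directly with a decap flag.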


end of thread, other threads:[~2026-03-26 17:49 UTC | newest]

Thread overview: 21+ messages
2026-03-18 13:42 [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Nick Hudson
2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
2026-03-21  0:39   ` Willem de Bruijn
2026-03-24 17:34   ` Martin KaFai Lau
2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
2026-03-21  0:39   ` Willem de Bruijn
2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
2026-03-21  0:39   ` Willem de Bruijn
2026-03-24 18:12   ` Martin KaFai Lau
2026-03-26 17:02     ` Hudson, Nick
2026-03-26 17:49       ` Martin KaFai Lau
2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
2026-03-18 20:02   ` Willem de Bruijn
2026-03-19  8:17     ` Hudson, Nick
2026-03-19 13:24       ` Willem de Bruijn
2026-03-21  0:40   ` Willem de Bruijn
2026-03-24 18:30   ` Martin KaFai Lau
2026-03-26 17:02     ` Hudson, Nick
2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
2026-03-18 20:09   ` Willem de Bruijn
2026-03-18 20:01 ` [PATCH v2 0/5] bpf: skb_adjust_room helper refactor and tunnel decap flags Willem de Bruijn
