public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags
       [not found] <20260318134242.2725749-1-nhudson@akamai.com>
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
  2026-03-24 17:34   ` Martin KaFai Lau
  2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	linux-kernel

Give the anonymous enum that holds the BPF_FUNC_skb_adjust_room flags
a name, enum bpf_adj_room_flags, to enable CO-RE (Compile Once -
Run Everywhere) lookups in BPF programs.

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 include/uapi/linux/bpf.h       | 2 +-
 tools/include/uapi/linux/bpf.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c8d400b7680a..bc4b25eb72ce 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -6209,7 +6209,7 @@ enum {
 };
 
 /* BPF_FUNC_skb_adjust_room flags. */
-enum {
+enum bpf_adj_room_flags {
 	BPF_F_ADJ_ROOM_FIXED_GSO	= (1ULL << 0),
 	BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	= (1ULL << 1),
 	BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	= (1ULL << 2),
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 5e38b4887de6..db2c520d0e92 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -6209,7 +6209,7 @@ enum {
 };
 
 /* BPF_FUNC_skb_adjust_room flags. */
-enum {
+enum bpf_adj_room_flags {
 	BPF_F_ADJ_ROOM_FIXED_GSO	= (1ULL << 0),
 	BPF_F_ADJ_ROOM_ENCAP_L3_IPV4	= (1ULL << 1),
 	BPF_F_ADJ_ROOM_ENCAP_L3_IPV6	= (1ULL << 2),
-- 
2.34.1



* [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation
       [not found] <20260318134242.2725749-1-nhudson@akamai.com>
  2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
  2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	linux-kernel

Add new bpf_skb_adjust_room() decapsulation flags:

- BPF_F_ADJ_ROOM_DECAP_L4_GRE
- BPF_F_ADJ_ROOM_DECAP_L4_UDP
- BPF_F_ADJ_ROOM_DECAP_IPXIP4
- BPF_F_ADJ_ROOM_DECAP_IPXIP6

These flags let BPF programs describe which tunnel layer is being
removed, so later changes can update tunnel-related GSO state
accordingly during decapsulation.

This patch only introduces the UAPI flag definitions and helper
documentation.
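As a rough illustration of how a caller might choose among the new
flags, here is a minimal user-space sketch. The EX_* macros are local
stand-ins whose bit positions are copied from the UAPI hunk in this
patch; enum ex_tunnel_kind and ex_decap_flags() are hypothetical names
invented for this example and are not part of the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the UAPI bits; values copied from the hunk in this patch. */
#define EX_DECAP_L3_IPV4	(1ULL << 7)
#define EX_DECAP_L3_IPV6	(1ULL << 8)
#define EX_DECAP_L4_GRE		(1ULL << 9)
#define EX_DECAP_L4_UDP		(1ULL << 10)
#define EX_DECAP_IPXIP4		(1ULL << 11)
#define EX_DECAP_IPXIP6		(1ULL << 12)

/* Hypothetical tunnel classification, not part of the patch. */
enum ex_tunnel_kind { EX_TUNNEL_GRE, EX_TUNNEL_UDP, EX_TUNNEL_IPXIP };

/* Pick the decap flags that describe the tunnel layer being removed. */
static inline uint64_t ex_decap_flags(enum ex_tunnel_kind kind,
				      bool outer_is_v6, bool inner_is_v6)
{
	uint64_t flags = 0;

	switch (kind) {
	case EX_TUNNEL_GRE:
		flags |= EX_DECAP_L4_GRE;
		break;
	case EX_TUNNEL_UDP:
		flags |= EX_DECAP_L4_UDP;
		break;
	case EX_TUNNEL_IPXIP:
		/* IPXIP4/IPXIP6 are keyed on the outer header's IP version. */
		flags |= outer_is_v6 ? EX_DECAP_IPXIP6 : EX_DECAP_IPXIP4;
		break;
	}

	/* The L3 flags are only needed when the IP version changes. */
	if (outer_is_v6 != inner_is_v6)
		flags |= inner_is_v6 ? EX_DECAP_L3_IPV6 : EX_DECAP_L3_IPV4;

	return flags;
}

/* Usage: IPv6-in-IPv4 and plain GRE with matching IP versions. */
static inline void ex_decap_flags_demo(void)
{
	assert(ex_decap_flags(EX_TUNNEL_IPXIP, false, true) ==
	       (EX_DECAP_IPXIP4 | EX_DECAP_L3_IPV6));
	assert(ex_decap_flags(EX_TUNNEL_GRE, false, false) == EX_DECAP_L4_GRE);
}
```

In a real BPF program the resulting value would be passed as the flags
argument of bpf_skb_adjust_room() together with a negative len_diff.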

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 include/uapi/linux/bpf.h       | 34 ++++++++++++++++++++++++++++++++--
 tools/include/uapi/linux/bpf.h | 34 ++++++++++++++++++++++++++++++++--
 2 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index bc4b25eb72ce..2ef886dc9685 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3010,8 +3010,34 @@ union bpf_attr {
  *
  *		* **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
  *		  **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
- *		  Indicate the new IP header version after decapsulating the outer
- *		  IP header. Used when the inner and outer IP versions are different.
+ *		  Indicate the new IP header version after decapsulating the
+ *		  outer IP header. Used when the inner and outer IP versions
+ *		  are different. These flags only trigger a protocol change
+ *		  without clearing any tunnel-specific GSO flags.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L4_GRE**:
+ *		  Clear GRE tunnel GSO flags (SKB_GSO_GRE and SKB_GSO_GRE_CSUM)
+ *		  when decapsulating a GRE tunnel.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L4_UDP**:
+ *		  Clear UDP tunnel GSO flags (SKB_GSO_UDP_TUNNEL and
+ *		  SKB_GSO_UDP_TUNNEL_CSUM) when decapsulating a UDP tunnel.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_IPXIP4**:
+ *		  Clear IPIP/SIT tunnel GSO flag (SKB_GSO_IPXIP4) when decapsulating
+ *		  a tunnel with an outer IPv4 header (IPv4-in-IPv4 or IPv6-in-IPv4).
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_IPXIP6**:
+ *		  Clear IPv6 encapsulation tunnel GSO flag (SKB_GSO_IPXIP6) when
+ *		  decapsulating a tunnel with an outer IPv6 header (IPv6-in-IPv6
+ *		  or IPv4-in-IPv6).
+ *
+ *		When using the decapsulation flags above, the skb->encapsulation
+ *		flag is automatically cleared if all tunnel-specific GSO flags
+ *		(SKB_GSO_UDP_TUNNEL, SKB_GSO_UDP_TUNNEL_CSUM, SKB_GSO_GRE,
+ *		SKB_GSO_GRE_CSUM, SKB_GSO_IPXIP4, SKB_GSO_IPXIP6) have been
+ *		removed from the packet. This handles cases where all tunnel
+ *		layers have been decapsulated.
  *
  * 		A call to this helper is susceptible to change the underlying
  * 		packet buffer. Therefore, at load time, all checks on pointers
@@ -6219,6 +6245,10 @@ enum bpf_adj_room_flags {
 	BPF_F_ADJ_ROOM_ENCAP_L2_ETH	= (1ULL << 6),
 	BPF_F_ADJ_ROOM_DECAP_L3_IPV4	= (1ULL << 7),
 	BPF_F_ADJ_ROOM_DECAP_L3_IPV6	= (1ULL << 8),
+	BPF_F_ADJ_ROOM_DECAP_L4_GRE	= (1ULL << 9),
+	BPF_F_ADJ_ROOM_DECAP_L4_UDP	= (1ULL << 10),
+	BPF_F_ADJ_ROOM_DECAP_IPXIP4	= (1ULL << 11),
+	BPF_F_ADJ_ROOM_DECAP_IPXIP6	= (1ULL << 12),
 };
 
 enum {
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index db2c520d0e92..e9a5c67ff5e2 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3010,8 +3010,34 @@ union bpf_attr {
  *
  *		* **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
  *		  **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
- *		  Indicate the new IP header version after decapsulating the outer
- *		  IP header. Used when the inner and outer IP versions are different.
+ *		  Indicate the new IP header version after decapsulating the
+ *		  outer IP header. Used when the inner and outer IP versions
+ *		  are different. These flags only trigger a protocol change
+ *		  without clearing any tunnel-specific GSO flags.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L4_GRE**:
+ *		  Clear GRE tunnel GSO flags (SKB_GSO_GRE and SKB_GSO_GRE_CSUM)
+ *		  when decapsulating a GRE tunnel.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L4_UDP**:
+ *		  Clear UDP tunnel GSO flags (SKB_GSO_UDP_TUNNEL and
+ *		  SKB_GSO_UDP_TUNNEL_CSUM) when decapsulating a UDP tunnel.
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_IPXIP4**:
+ *		  Clear IPIP/SIT tunnel GSO flag (SKB_GSO_IPXIP4) when decapsulating
+ *		  a tunnel with an outer IPv4 header (IPv4-in-IPv4 or IPv6-in-IPv4).
+ *
+ *		* **BPF_F_ADJ_ROOM_DECAP_IPXIP6**:
+ *		  Clear IPv6 encapsulation tunnel GSO flag (SKB_GSO_IPXIP6) when
+ *		  decapsulating a tunnel with an outer IPv6 header (IPv6-in-IPv6
+ *		  or IPv4-in-IPv6).
+ *
+ *		When using the decapsulation flags above, the skb->encapsulation
+ *		flag is automatically cleared if all tunnel-specific GSO flags
+ *		(SKB_GSO_UDP_TUNNEL, SKB_GSO_UDP_TUNNEL_CSUM, SKB_GSO_GRE,
+ *		SKB_GSO_GRE_CSUM, SKB_GSO_IPXIP4, SKB_GSO_IPXIP6) have been
+ *		removed from the packet. This handles cases where all tunnel
+ *		layers have been decapsulated.
  *
  * 		A call to this helper is susceptible to change the underlying
  * 		packet buffer. Therefore, at load time, all checks on pointers
@@ -6219,6 +6245,10 @@ enum bpf_adj_room_flags {
 	BPF_F_ADJ_ROOM_ENCAP_L2_ETH	= (1ULL << 6),
 	BPF_F_ADJ_ROOM_DECAP_L3_IPV4	= (1ULL << 7),
 	BPF_F_ADJ_ROOM_DECAP_L3_IPV6	= (1ULL << 8),
+	BPF_F_ADJ_ROOM_DECAP_L4_GRE	= (1ULL << 9),
+	BPF_F_ADJ_ROOM_DECAP_L4_UDP	= (1ULL << 10),
+	BPF_F_ADJ_ROOM_DECAP_IPXIP4	= (1ULL << 11),
+	BPF_F_ADJ_ROOM_DECAP_IPXIP6	= (1ULL << 12),
 };
 
 enum {
-- 
2.34.1



* [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
       [not found] <20260318134242.2725749-1-nhudson@akamai.com>
  2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
  2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
  2026-03-24 18:12   ` Martin KaFai Lau
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
  2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
  4 siblings, 2 replies; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Introduce helper masks for bpf_skb_adjust_room() flags to simplify
validation logic:

- BPF_F_ADJ_ROOM_DECAP_L4_MASK
- BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
- BPF_F_ADJ_ROOM_ENCAP_MASK
- BPF_F_ADJ_ROOM_DECAP_MASK

Add flag validation to bpf_skb_net_grow() to reject invalid encap
flags early. Refactor existing validation checks in bpf_skb_net_shrink()
and bpf_skb_adjust_room() to use the new masks (no behavior change).
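The per-direction validation described above can be modelled in user
space roughly as follows. This is a sketch under assumptions, not the
kernel code: the EX_* macros are local stand-ins, bits 3-5 are assumed
values that should be checked against include/uapi/linux/bpf.h, and the
L2 header length field of the real encap mask is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in bits. Bits 0-2 and 6-8 appear in this series; bits 3-5 are
 * assumptions made for this sketch.
 */
#define EX_FIXED_GSO		(1ULL << 0)
#define EX_ENCAP_L3_IPV4	(1ULL << 1)
#define EX_ENCAP_L3_IPV6	(1ULL << 2)
#define EX_ENCAP_L4_GRE		(1ULL << 3)	/* assumed */
#define EX_ENCAP_L4_UDP		(1ULL << 4)	/* assumed */
#define EX_NO_CSUM_RESET	(1ULL << 5)	/* assumed */
#define EX_ENCAP_L2_ETH		(1ULL << 6)
#define EX_DECAP_L3_IPV4	(1ULL << 7)
#define EX_DECAP_L3_IPV6	(1ULL << 8)

/* Simplified masks: the L2 length field of the real encap mask is omitted. */
#define EX_ENCAP_MASK	(EX_ENCAP_L3_IPV4 | EX_ENCAP_L3_IPV6 | \
			 EX_ENCAP_L4_GRE | EX_ENCAP_L4_UDP | \
			 EX_ENCAP_L2_ETH)
#define EX_DECAP_MASK	(EX_DECAP_L3_IPV4 | EX_DECAP_L3_IPV6)

/* Grow path: only encap flags plus the two direction-neutral flags. */
static inline int ex_check_grow_flags(uint64_t flags)
{
	if (flags & ~(EX_ENCAP_MASK | EX_NO_CSUM_RESET | EX_FIXED_GSO))
		return -EINVAL;
	return 0;
}

/* Shrink path: only decap flags plus the two direction-neutral flags. */
static inline int ex_check_shrink_flags(uint64_t flags)
{
	if (flags & ~(EX_DECAP_MASK | EX_FIXED_GSO | EX_NO_CSUM_RESET))
		return -EINVAL;
	return 0;
}

/* Usage: a decap flag is rejected on grow, an encap flag on shrink. */
static inline void ex_check_flags_demo(void)
{
	assert(ex_check_grow_flags(EX_ENCAP_L3_IPV4 | EX_FIXED_GSO) == 0);
	assert(ex_check_grow_flags(EX_DECAP_L3_IPV4) == -EINVAL);
	assert(ex_check_shrink_flags(EX_DECAP_L3_IPV6) == 0);
	assert(ex_check_shrink_flags(EX_ENCAP_L4_UDP) == -EINVAL);
}
```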

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 net/core/filter.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 0d5d5a17acb2..7c2871b40fe4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3483,14 +3483,25 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 #define BPF_F_ADJ_ROOM_DECAP_L3_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
 					 BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
 
-#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
-					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
+#define BPF_F_ADJ_ROOM_DECAP_L4_MASK	(BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
+					 BPF_F_ADJ_ROOM_DECAP_L4_GRE)
+
+#define BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK	(BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
+					 BPF_F_ADJ_ROOM_DECAP_IPXIP6)
+
+#define BPF_F_ADJ_ROOM_ENCAP_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
 					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
 					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
 					 BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
 					 BPF_F_ADJ_ROOM_ENCAP_L2( \
-					  BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
-					 BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
+
+#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+
+#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
+					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
+					 BPF_F_ADJ_ROOM_DECAP_MASK | \
+					 BPF_F_ADJ_ROOM_NO_CSUM_RESET)
 
 static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 			    u64 flags)
@@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 	unsigned int gso_type = SKB_GSO_DODGY;
 	int ret;
 
+	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
+			       BPF_F_ADJ_ROOM_NO_CSUM_RESET |
+			       BPF_F_ADJ_ROOM_FIXED_GSO)))
+		return -EINVAL;
+
 	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
 		/* udp gso_size delineates datagrams, only allow if fixed */
 		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
@@ -3611,8 +3627,8 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
 {
 	int ret;
 
-	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
-			       BPF_F_ADJ_ROOM_DECAP_L3_MASK |
+	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_DECAP_MASK |
+			       BPF_F_ADJ_ROOM_FIXED_GSO |
 			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
 		return -EINVAL;
 
@@ -3708,8 +3724,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	u32 off;
 	int ret;
 
-	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_MASK |
-			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
+	if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
 		return -EINVAL;
 	if (unlikely(len_diff_abs > 0xfffU))
 		return -EFAULT;
-- 
2.34.1



* [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
       [not found] <20260318134242.2725749-1-nhudson@akamai.com>
                   ` (2 preceding siblings ...)
  2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-18 20:02   ` Willem de Bruijn
                     ` (2 more replies)
  2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
  4 siblings, 3 replies; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Add checks to require shrink-only decap, reject conflicting decap flag
combinations, and verify removed length is sufficient for claimed header
decapsulation.
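The three guard rails can be sketched in user space like this. It
mirrors the order of the checks in the hunk below, but it is only a
model: the EX_* names are local stand-ins, ex_check_decap() is a
hypothetical helper, and the two header sizes are hard-coded on the
assumption that struct udphdr is 8 bytes and struct gre_base_hdr is
4 bytes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; bit positions copied from patch 2/5 of this series. */
#define EX_DECAP_L3_IPV4	(1ULL << 7)
#define EX_DECAP_L3_IPV6	(1ULL << 8)
#define EX_DECAP_L4_GRE		(1ULL << 9)
#define EX_DECAP_L4_UDP		(1ULL << 10)
#define EX_DECAP_IPXIP4		(1ULL << 11)
#define EX_DECAP_IPXIP6		(1ULL << 12)

#define EX_DECAP_L3_MASK	(EX_DECAP_L3_IPV4 | EX_DECAP_L3_IPV6)
#define EX_DECAP_L4_MASK	(EX_DECAP_L4_UDP | EX_DECAP_L4_GRE)
#define EX_DECAP_IPXIP_MASK	(EX_DECAP_IPXIP4 | EX_DECAP_IPXIP6)
#define EX_DECAP_MASK		(EX_DECAP_L3_MASK | EX_DECAP_L4_MASK | \
				 EX_DECAP_IPXIP_MASK)

/* Assumed header sizes, hard-coded to keep the sketch self-contained. */
#define EX_UDP_HDR_LEN		8u	/* sizeof(struct udphdr) */
#define EX_GRE_BASE_HDR_LEN	4u	/* sizeof(struct gre_base_hdr) */

static inline int ex_check_decap(uint64_t flags, bool shrink,
				 uint32_t len_diff_abs)
{
	uint32_t len_decap_min = 0;

	if (!(flags & EX_DECAP_MASK))
		return 0;

	/* Decap flags only make sense when the packet shrinks. */
	if (!shrink)
		return -EINVAL;

	/* Both flags of a pair set at once is contradictory. */
	if ((flags & EX_DECAP_L3_MASK) == EX_DECAP_L3_MASK)
		return -EINVAL;
	if ((flags & EX_DECAP_L4_MASK) == EX_DECAP_L4_MASK)
		return -EINVAL;
	if ((flags & EX_DECAP_IPXIP_MASK) == EX_DECAP_IPXIP_MASK)
		return -EINVAL;

	/* An L4 tunnel type and an IPXIP tunnel type cannot be combined. */
	if ((flags & EX_DECAP_L4_MASK) && (flags & EX_DECAP_IPXIP_MASK))
		return -EINVAL;

	/* The removed length must cover the claimed tunnel header. */
	if (flags & EX_DECAP_L4_UDP)
		len_decap_min += EX_UDP_HDR_LEN;
	if (flags & EX_DECAP_L4_GRE)
		len_decap_min += EX_GRE_BASE_HDR_LEN;
	if (len_diff_abs < len_decap_min)
		return -EINVAL;

	return 0;
}

/* Usage: one accepted call and one rejection per guard rail. */
static inline void ex_check_decap_demo(void)
{
	assert(ex_check_decap(EX_DECAP_L4_UDP, true, 8) == 0);
	assert(ex_check_decap(EX_DECAP_L4_GRE, false, 4) == -EINVAL);
	assert(ex_check_decap(EX_DECAP_L4_UDP | EX_DECAP_L4_GRE, true, 64) == -EINVAL);
	assert(ex_check_decap(EX_DECAP_L4_UDP, true, 7) == -EINVAL);
}
```

The sketch leaves out the len_min assignment for the L3 flags, which
the real helper applies in a later length check.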

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 37 insertions(+), 10 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 7c2871b40fe4..47aec44a9cd3 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -56,6 +56,7 @@
 #include <net/sock_reuseport.h>
 #include <net/busy_poll.h>
 #include <net/tcp.h>
+#include <net/gre.h>
 #include <net/xfrm.h>
 #include <net/udp.h>
 #include <linux/bpf_trace.h>
@@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 					 BPF_F_ADJ_ROOM_ENCAP_L2( \
 					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
 
-#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
+					 BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
+					 BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
 
 #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
 					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
@@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 		return -ENOTSUPP;
 	}
 
-	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
+	if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
+		u32 len_decap_min = 0;
+
 		if (!shrink)
 			return -EINVAL;
 
-		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
-		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
+		/* Reject mutually exclusive decap flag pairs. */
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
+		    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+			return -EINVAL;
+
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
+		    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
+			return -EINVAL;
+
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
+		    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
+			return -EINVAL;
+
+		/* Reject mutually exclusive decap tunnel type flags. */
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
+		    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
+			return -EINVAL;
+
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
+			len_decap_min += sizeof(struct udphdr);
+
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
+			len_decap_min += sizeof(struct gre_base_hdr);
+
+		if (len_diff_abs < len_decap_min)
+			return -EINVAL;
+
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
 			len_min = sizeof(struct iphdr);
-			break;
-		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
+
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
 			len_min = sizeof(struct ipv6hdr);
-			break;
-		default:
-			return -EINVAL;
-		}
 	}
 
 	len_cur = skb->len - skb_network_offset(skb);
-- 
2.34.1



* [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room
       [not found] <20260318134242.2725749-1-nhudson@akamai.com>
                   ` (3 preceding siblings ...)
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
  2026-03-18 20:09   ` Willem de Bruijn
  4 siblings, 1 reply; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC
  To: bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

On shrink in bpf_skb_adjust_room(), clear tunnel-specific GSO flags
according to the decapsulation flags:

- BPF_F_ADJ_ROOM_DECAP_L4_UDP clears SKB_GSO_UDP_TUNNEL{,_CSUM}
- BPF_F_ADJ_ROOM_DECAP_L4_GRE clears SKB_GSO_GRE{,_CSUM}
- BPF_F_ADJ_ROOM_DECAP_IPXIP4 clears SKB_GSO_IPXIP4
- BPF_F_ADJ_ROOM_DECAP_IPXIP6 clears SKB_GSO_IPXIP6

When all tunnel-related GSO bits are cleared, also clear
skb->encapsulation.
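The intended effect can be modelled in user space as below. This is a
sketch only: struct ex_skb is a hypothetical two-field stand-in for the
relevant skb state, the EX_GSO_* bit positions are arbitrary and do not
match the kernel's SKB_GSO_* values, and EX_GSO_OTHER represents any
non-tunnel GSO type. Unlike the hunk below, the sketch masks
unconditionally rather than testing the bits first, which gives the
same result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decap flag stand-ins; bit positions copied from patch 2/5. */
#define EX_DECAP_L3_IPV4	(1ULL << 7)
#define EX_DECAP_L3_IPV6	(1ULL << 8)
#define EX_DECAP_L4_GRE		(1ULL << 9)
#define EX_DECAP_L4_UDP		(1ULL << 10)
#define EX_DECAP_IPXIP4		(1ULL << 11)
#define EX_DECAP_IPXIP6		(1ULL << 12)
#define EX_DECAP_MASK		(EX_DECAP_L3_IPV4 | EX_DECAP_L3_IPV6 | \
				 EX_DECAP_L4_GRE | EX_DECAP_L4_UDP | \
				 EX_DECAP_IPXIP4 | EX_DECAP_IPXIP6)

/* Arbitrary stand-in GSO bits, NOT the kernel's SKB_GSO_* values. */
#define EX_GSO_UDP_TUNNEL	(1u << 0)
#define EX_GSO_UDP_TUNNEL_CSUM	(1u << 1)
#define EX_GSO_GRE		(1u << 2)
#define EX_GSO_GRE_CSUM		(1u << 3)
#define EX_GSO_IPXIP4		(1u << 4)
#define EX_GSO_IPXIP6		(1u << 5)
#define EX_GSO_OTHER		(1u << 6)	/* any non-tunnel GSO type */
#define EX_GSO_TUNNEL_MASK	(EX_GSO_UDP_TUNNEL | EX_GSO_UDP_TUNNEL_CSUM | \
				 EX_GSO_GRE | EX_GSO_GRE_CSUM | \
				 EX_GSO_IPXIP4 | EX_GSO_IPXIP6)

/* Hypothetical model of the two pieces of skb state involved. */
struct ex_skb {
	uint32_t gso_type;
	bool encapsulation;
};

static inline void ex_clear_decap_gso(struct ex_skb *skb, uint64_t flags)
{
	/* Clear only the GSO bits of the tunnel layer being removed. */
	if (flags & EX_DECAP_L4_UDP)
		skb->gso_type &= ~(EX_GSO_UDP_TUNNEL | EX_GSO_UDP_TUNNEL_CSUM);
	if (flags & EX_DECAP_L4_GRE)
		skb->gso_type &= ~(EX_GSO_GRE | EX_GSO_GRE_CSUM);
	if (flags & EX_DECAP_IPXIP4)
		skb->gso_type &= ~EX_GSO_IPXIP4;
	if (flags & EX_DECAP_IPXIP6)
		skb->gso_type &= ~EX_GSO_IPXIP6;

	/* Drop the encapsulation mark once no tunnel GSO bit remains. */
	if ((flags & EX_DECAP_MASK) && !(skb->gso_type & EX_GSO_TUNNEL_MASK))
		skb->encapsulation = false;
}

/* Usage: peel a UDP tunnel first, then the GRE tunnel underneath. */
static inline void ex_clear_decap_gso_demo(void)
{
	struct ex_skb skb = {
		.gso_type = EX_GSO_UDP_TUNNEL | EX_GSO_UDP_TUNNEL_CSUM |
			    EX_GSO_GRE | EX_GSO_OTHER,
		.encapsulation = true,
	};

	ex_clear_decap_gso(&skb, EX_DECAP_L4_UDP);
	assert(skb.gso_type == (EX_GSO_GRE | EX_GSO_OTHER));
	assert(skb.encapsulation);	/* GRE layer still present */

	ex_clear_decap_gso(&skb, EX_DECAP_L4_GRE);
	assert(skb.gso_type == EX_GSO_OTHER);
	assert(!skb.encapsulation);
}
```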

Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
 net/core/filter.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 47aec44a9cd3..35af1199ab97 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3665,6 +3665,37 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
 		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
 			skb_increase_gso_size(shinfo, len_diff);
 
+		/* Selective GSO flag clearing based on decap type.
+		 * Only clear the flags for the tunnel layer being removed.
+		 */
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP) &&
+		    (shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
+					 SKB_GSO_UDP_TUNNEL_CSUM)))
+			shinfo->gso_type &= ~(SKB_GSO_UDP_TUNNEL |
+					      SKB_GSO_UDP_TUNNEL_CSUM);
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE) &&
+		    (shinfo->gso_type & (SKB_GSO_GRE | SKB_GSO_GRE_CSUM)))
+			shinfo->gso_type &= ~(SKB_GSO_GRE |
+					      SKB_GSO_GRE_CSUM);
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP4) &&
+		    (shinfo->gso_type & SKB_GSO_IPXIP4))
+			shinfo->gso_type &= ~SKB_GSO_IPXIP4;
+		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP6) &&
+		    (shinfo->gso_type & SKB_GSO_IPXIP6))
+			shinfo->gso_type &= ~SKB_GSO_IPXIP6;
+
+		/* Clear encapsulation flag only when no tunnel GSO flags remain */
+		if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
+			if (!(shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
+						  SKB_GSO_UDP_TUNNEL_CSUM |
+						  SKB_GSO_GRE |
+						  SKB_GSO_GRE_CSUM |
+						  SKB_GSO_IPXIP4 |
+						  SKB_GSO_IPXIP6)))
+				if (skb->encapsulation)
+					skb->encapsulation = 0;
+		}
+
 		/* Header must be checked, and gso_segs recomputed. */
 		shinfo->gso_type |= SKB_GSO_DODGY;
 		shinfo->gso_segs = 0;
-- 
2.34.1



* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
@ 2026-03-18 20:02   ` Willem de Bruijn
  2026-03-19  8:17     ` Hudson, Nick
  2026-03-21  0:40   ` Willem de Bruijn
  2026-03-24 18:30   ` Martin KaFai Lau
  2 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-18 20:02 UTC
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Nick Hudson wrote:
> Add checks to require shrink-only decap, reject conflicting decap flag
> combinations, and verify removed length is sufficient for claimed header
> decapsulation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
>  net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 7c2871b40fe4..47aec44a9cd3 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -56,6 +56,7 @@
>  #include <net/sock_reuseport.h>
>  #include <net/busy_poll.h>
>  #include <net/tcp.h>
> +#include <net/gre.h>
>  #include <net/xfrm.h>
>  #include <net/udp.h>
>  #include <linux/bpf_trace.h>
> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>  					 BPF_F_ADJ_ROOM_ENCAP_L2( \
>  					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
>  
> -#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>  
>  #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
>  					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>  		return -ENOTSUPP;
>  	}
>  
> -	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> +	if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> +		u32 len_decap_min = 0;
> +
>  		if (!shrink)
>  			return -EINVAL;
>  
> -		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
> +		/* Reject mutually exclusive decap flag pairs. */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +			return -EINVAL;
> +
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
> +			return -EINVAL;
> +
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> +			return -EINVAL;
> +
> +		/* Reject mutually exclusive decap tunnel type flags. */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
> +		    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
> +			return -EINVAL;
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
> +			len_decap_min += sizeof(struct udphdr);
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> +			len_decap_min += sizeof(struct gre_base_hdr);
> +
> +		if (len_diff_abs < len_decap_min)
> +			return -EINVAL;

Should this test come after the below IP flags?

> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>  			len_min = sizeof(struct iphdr);
> -			break;
> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>  			len_min = sizeof(struct ipv6hdr);
> -			break;
> -		default:
> -			return -EINVAL;
> -		}
>  	}
>  
>  	len_cur = skb->len - skb_network_offset(skb);
> -- 
> 2.34.1
> 




* Re: [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room
  2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
@ 2026-03-18 20:09   ` Willem de Bruijn
  0 siblings, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-18 20:09 UTC
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Nick Hudson wrote:
> On shrink in bpf_skb_adjust_room(), clear tunnel-specific GSO flags
> according to the decapsulation flags:
> 
> - BPF_F_ADJ_ROOM_DECAP_L4_UDP clears SKB_GSO_UDP_TUNNEL{,_CSUM}
> - BPF_F_ADJ_ROOM_DECAP_L4_GRE clears SKB_GSO_GRE{,_CSUM}
> - BPF_F_ADJ_ROOM_DECAP_IPXIP4 clears SKB_GSO_IPXIP4
> - BPF_F_ADJ_ROOM_DECAP_IPXIP6 clears SKB_GSO_IPXIP6
> 
> When all tunnel-related GSO bits are cleared, also clear
> skb->encapsulation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
>  net/core/filter.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 47aec44a9cd3..35af1199ab97 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3665,6 +3665,37 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
>  		if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
>  			skb_increase_gso_size(shinfo, len_diff);
>  
> +		/* Selective GSO flag clearing based on decap type.
> +		 * Only clear the flags for the tunnel layer being removed.
> +		 */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP) &&
> +		    (shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
> +					 SKB_GSO_UDP_TUNNEL_CSUM)))
> +			shinfo->gso_type &= ~(SKB_GSO_UDP_TUNNEL |
> +					      SKB_GSO_UDP_TUNNEL_CSUM);
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE) &&
> +		    (shinfo->gso_type & (SKB_GSO_GRE | SKB_GSO_GRE_CSUM)))
> +			shinfo->gso_type &= ~(SKB_GSO_GRE |
> +					      SKB_GSO_GRE_CSUM);
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP4) &&
> +		    (shinfo->gso_type & SKB_GSO_IPXIP4))
> +			shinfo->gso_type &= ~SKB_GSO_IPXIP4;
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP6) &&
> +		    (shinfo->gso_type & SKB_GSO_IPXIP6))
> +			shinfo->gso_type &= ~SKB_GSO_IPXIP6;
> +
> +		/* Clear encapsulation flag only when no tunnel GSO flags remain */
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> +			if (!(shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
> +						  SKB_GSO_UDP_TUNNEL_CSUM |
> +						  SKB_GSO_GRE |
> +						  SKB_GSO_GRE_CSUM |
> +						  SKB_GSO_IPXIP4 |
> +						  SKB_GSO_IPXIP6)))
> +				if (skb->encapsulation)
> +					skb->encapsulation = 0;

Is there any chance that this might clear it while some other tunnel
is still active? From a quick grep on skb->encapsulation the only
possible hit I see is SKB_GSO_ESP.

> +		}
> +
>  		/* Header must be checked, and gso_segs recomputed. */
>  		shinfo->gso_type |= SKB_GSO_DODGY;
>  		shinfo->gso_segs = 0;
> -- 
> 2.34.1
> 




* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-18 20:02   ` Willem de Bruijn
@ 2026-03-19  8:17     ` Hudson, Nick
  2026-03-19 13:24       ` Willem de Bruijn
  0 siblings, 1 reply; 20+ messages in thread
From: Hudson, Nick @ 2026-03-19  8:17 UTC
  To: Willem de Bruijn
  Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, Tottenham, Max,
	Glasgall, Anna, Martin KaFai Lau, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel@vger.kernel.org

> On 18 Mar 2026, at 20:02, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> 
> Nick Hudson wrote:
>> Add checks to require shrink-only decap, reject conflicting decap flag
>> combinations, and verify removed length is sufficient for claimed header
>> decapsulation.
>> 
>> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
>> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
>> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Nick Hudson <nhudson@akamai.com>
>> ---
>> net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>> 1 file changed, 37 insertions(+), 10 deletions(-)
>> 
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 7c2871b40fe4..47aec44a9cd3 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -56,6 +56,7 @@
>> #include <net/sock_reuseport.h>
>> #include <net/busy_poll.h>
>> #include <net/tcp.h>
>> +#include <net/gre.h>
>> #include <net/xfrm.h>
>> #include <net/udp.h>
>> #include <linux/bpf_trace.h>
>> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>> BPF_F_ADJ_ROOM_ENCAP_L2( \
>>  BPF_ADJ_ROOM_ENCAP_L2_MASK))
>> 
>> -#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
>> + BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
>> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>> 
>> #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> BPF_F_ADJ_ROOM_ENCAP_MASK | \
>> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>> return -ENOTSUPP;
>> }
>> 
>> - if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
>> + u32 len_decap_min = 0;
>> +
>> if (!shrink)
>> return -EINVAL;
>> 
>> - switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
>> + /* Reject mutually exclusive decap flag pairs. */
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
>> +    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
>> +    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
>> +    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>> + return -EINVAL;
>> +
>> + /* Reject mutually exclusive decap tunnel type flags. */
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
>> +    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
>> + return -EINVAL;
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
>> + len_decap_min += sizeof(struct udphdr);
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
>> + len_decap_min += sizeof(struct gre_base_hdr);
>> +
>> + if (len_diff_abs < len_decap_min)
>> + return -EINVAL;
> 
> Should this test come after the below IP flags?

Should it?

It seems to me the check can bail out early without examining the IP flags, since it tests len_decap_min rather than len_min.

What am I missing?

> 
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>> len_min = sizeof(struct iphdr);
>> - break;
>> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>> len_min = sizeof(struct ipv6hdr);
>> - break;
>> - default:
>> - return -EINVAL;
>> - }
>> }
>> 
>> len_cur = skb->len - skb_network_offset(skb);
>> -- 
>> 2.34.1
>> 
> 
> 



* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-19  8:17     ` Hudson, Nick
@ 2026-03-19 13:24       ` Willem de Bruijn
  0 siblings, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-19 13:24 UTC
  To: Hudson, Nick, Willem de Bruijn
  Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, Tottenham, Max,
	Glasgall, Anna, Martin KaFai Lau, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel@vger.kernel.org

Hudson, Nick wrote:
> 
> 
> > On 18 Mar 2026, at 20:02, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> > 
> > Nick Hudson wrote:
> >> Add checks to require shrink-only decap, reject conflicting decap flag
> >> combinations, and verify removed length is sufficient for claimed header
> >> decapsulation.
> >> 
> >> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> >> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> >> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> >> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> >> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> >> ---
> >> net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
> >> 1 file changed, 37 insertions(+), 10 deletions(-)
> >> 
> >> diff --git a/net/core/filter.c b/net/core/filter.c
> >> index 7c2871b40fe4..47aec44a9cd3 100644
> >> --- a/net/core/filter.c
> >> +++ b/net/core/filter.c
> >> @@ -56,6 +56,7 @@
> >> #include <net/sock_reuseport.h>
> >> #include <net/busy_poll.h>
> >> #include <net/tcp.h>
> >> +#include <net/gre.h>
> >> #include <net/xfrm.h>
> >> #include <net/udp.h>
> >> #include <linux/bpf_trace.h>
> >> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
> >> BPF_F_ADJ_ROOM_ENCAP_L2( \
> >>  BPF_ADJ_ROOM_ENCAP_L2_MASK))
> >> 
> >> -#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> >> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
> >> + BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
> >> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> >> 
> >> #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
> >> BPF_F_ADJ_ROOM_ENCAP_MASK | \
> >> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
> >> return -ENOTSUPP;
> >> }
> >> 
> >> - if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> >> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> >> + u32 len_decap_min = 0;
> >> +
> >> if (!shrink)
> >> return -EINVAL;
> >> 
> >> - switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> >> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
> >> + /* Reject mutually exclusive decap flag pairs. */
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> >> +    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> >> + return -EINVAL;
> >> +
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
> >> +    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
> >> + return -EINVAL;
> >> +
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
> >> +    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> >> + return -EINVAL;
> >> +
> >> + /* Reject mutually exclusive decap tunnel type flags. */
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
> >> +    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
> >> + return -EINVAL;
> >> +
> >> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
> >> + len_decap_min += sizeof(struct udphdr);
> >> +
> >> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> >> + len_decap_min += sizeof(struct gre_base_hdr);
> >> +
> >> + if (len_diff_abs < len_decap_min)
> >> + return -EINVAL;
> > 
> > Should this test come after the below IP flags?
> 
> Should it?
> 
> Seems to me it can bail early without having to check the IP flags. len_decap_min vs len_min.
> 
> What am I missing?

I would think it common that UDP decap also includes an L3 decap, in
which case the len_decap_min should include both header lengths.
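
A minimal user-space sketch of the combined check suggested above, not the kernel code. The flag bits are placeholders, the header sizes are hard-coded stand-ins for the kernel structs, and outer_l3_len is a hypothetical parameter, since the thread does not settle how the outer L3 header length would be derived.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder flag bits for illustration; not the UAPI values. */
#define DECAP_L4_UDP	(1ULL << 0)
#define DECAP_L4_GRE	(1ULL << 1)

/* Stand-ins for the kernel header sizes. */
#define UDP_HDR_LEN		8u	/* sizeof(struct udphdr) */
#define GRE_BASE_HDR_LEN	4u	/* sizeof(struct gre_base_hdr) */
#define IPV4_HDR_LEN		20u	/* sizeof(struct iphdr) */
#define IPV6_HDR_LEN		40u	/* sizeof(struct ipv6hdr) */

/* Minimum bytes to remove: outer L3 header plus the L4 tunnel header. */
static inline uint32_t decap_min_len(uint64_t flags, uint32_t outer_l3_len)
{
	uint32_t min = outer_l3_len;

	if (flags & DECAP_L4_UDP)
		min += UDP_HDR_LEN;
	if (flags & DECAP_L4_GRE)
		min += GRE_BASE_HDR_LEN;
	return min;
}

/* Returns 1 when the requested shrink covers every claimed header. */
static inline int decap_len_ok(uint64_t flags, uint32_t outer_l3_len,
			       uint32_t len_diff_abs)
{
	return len_diff_abs >= decap_min_len(flags, outer_l3_len);
}

/* Usage: an IPv4 + UDP decap has to remove at least 20 + 8 bytes. */
static inline void decap_len_example(void)
{
	assert(!decap_len_ok(DECAP_L4_UDP, IPV4_HDR_LEN, UDP_HDR_LEN));
	assert(decap_len_ok(DECAP_L4_UDP, IPV4_HDR_LEN, 28));
}
```

With only the L4 size counted, a shrink of 8 bytes would pass the early test even though the outer IP header is also being removed. Summing both sizes closes that gap.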


* Re: [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags
  2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
@ 2026-03-21  0:39   ` Willem de Bruijn
  2026-03-24 17:34   ` Martin KaFai Lau
  1 sibling, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-21  0:39 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	linux-kernel

Nick Hudson wrote:
> The existing anonymous enum for BPF_FUNC_skb_adjust_room flags is
> named to enum bpf_adj_room_flags to enable CO-RE (Compile Once -
> Run Everywhere) lookups in BPF programs.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>


* Re: [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation
  2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
@ 2026-03-21  0:39   ` Willem de Bruijn
  0 siblings, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-21  0:39 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	linux-kernel

Nick Hudson wrote:
> Add new bpf_skb_adjust_room() decapsulation flags:
> 
> - BPF_F_ADJ_ROOM_DECAP_L4_GRE
> - BPF_F_ADJ_ROOM_DECAP_L4_UDP
> - BPF_F_ADJ_ROOM_DECAP_IPXIP4
> - BPF_F_ADJ_ROOM_DECAP_IPXIP6
> 
> These flags let BPF programs describe which tunnel layer is being
> removed, so later changes can update tunnel-related GSO state
> accordingly during decapsulation.
> 
> This patch only introduces the UAPI flag definitions and helper
> documentation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>


* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
@ 2026-03-21  0:39   ` Willem de Bruijn
  2026-03-24 18:12   ` Martin KaFai Lau
  1 sibling, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-21  0:39 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Nick Hudson wrote:
> Introduce helper masks for bpf_skb_adjust_room() flags to simplify
> validation logic:
> 
> - BPF_F_ADJ_ROOM_DECAP_L4_MASK
> - BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
> - BPF_F_ADJ_ROOM_ENCAP_MASK
> - BPF_F_ADJ_ROOM_DECAP_MASK
> 
> Add flag validation to bpf_skb_net_grow() to reject invalid encap
> flags early. Refactor existing validation checks in bpf_skb_net_shrink()
> and bpf_skb_adjust_room() to use the new masks (no behavior change).
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>


* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
  2026-03-18 20:02   ` Willem de Bruijn
@ 2026-03-21  0:40   ` Willem de Bruijn
  2026-03-24 18:30   ` Martin KaFai Lau
  2 siblings, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-21  0:40 UTC (permalink / raw)
  To: Nick Hudson, bpf, netdev
  Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
	Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Nick Hudson wrote:
> Add checks to require shrink-only decap, reject conflicting decap flag
> combinations, and verify removed length is sufficient for claimed header
> decapsulation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>


* Re: [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags
  2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
@ 2026-03-24 17:34   ` Martin KaFai Lau
  1 sibling, 0 replies; 20+ messages in thread
From: Martin KaFai Lau @ 2026-03-24 17:34 UTC (permalink / raw)
  To: Nick Hudson
  Cc: Willem de Bruijn, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, netdev,
	linux-kernel


On 3/18/26 6:42 AM, Nick Hudson wrote:
> The existing anonymous enum for BPF_FUNC_skb_adjust_room flags is
> named to enum bpf_adj_room_flags to enable CO-RE (Compile Once -
> Run Everywhere) lookups in BPF programs.

It would be useful to demonstrate the intended CO-RE usage in a 
selftest. I suspect it is bpf_core_enum_value_exists().

There are existing tests in test_tc_tunnel.c for the earlier 
BPF_F_ADJ_ROOM_* flag additions. Please add similar tests for the new 
flags introduced in this series.



* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
  2026-03-21  0:39   ` Willem de Bruijn
@ 2026-03-24 18:12   ` Martin KaFai Lau
  2026-03-26 17:02     ` Hudson, Nick
  1 sibling, 1 reply; 20+ messages in thread
From: Martin KaFai Lau @ 2026-03-24 18:12 UTC (permalink / raw)
  To: Nick Hudson
  Cc: Willem de Bruijn, Max Tottenham, Anna Glasgall,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf,
	netdev, linux-kernel

On 3/18/26 6:42 AM, Nick Hudson wrote:
> Introduce helper masks for bpf_skb_adjust_room() flags to simplify
> validation logic:
> 
> - BPF_F_ADJ_ROOM_DECAP_L4_MASK
> - BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
> - BPF_F_ADJ_ROOM_ENCAP_MASK
> - BPF_F_ADJ_ROOM_DECAP_MASK
> 
> Add flag validation to bpf_skb_net_grow() to reject invalid encap
> flags early. Refactor existing validation checks in bpf_skb_net_shrink()
> and bpf_skb_adjust_room() to use the new masks (no behavior change).
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
>   net/core/filter.c | 31 +++++++++++++++++++++++--------
>   1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 0d5d5a17acb2..7c2871b40fe4 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3483,14 +3483,25 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>   #define BPF_F_ADJ_ROOM_DECAP_L3_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
>   					 BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>   
> -#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
> -					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
> +#define BPF_F_ADJ_ROOM_DECAP_L4_MASK	(BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
> +					 BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> +
> +#define BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK	(BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
> +					 BPF_F_ADJ_ROOM_DECAP_IPXIP6)
> +
> +#define BPF_F_ADJ_ROOM_ENCAP_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
>   					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
>   					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
>   					 BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
>   					 BPF_F_ADJ_ROOM_ENCAP_L2( \
> -					  BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
> -					 BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
> +
> +#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +
> +#define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
> +					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_MASK | \
> +					 BPF_F_ADJ_ROOM_NO_CSUM_RESET)

The patch does two things: refactoring of existing macros 
(BPF_F_ADJ_ROOM_ENCAP_MASK, BPF_F_ADJ_ROOM_DECAP_MASK) and new additions 
(BPF_F_ADJ_ROOM_DECAP_L4_MASK, BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) that 
depend on the new flags from the UAPI changes in patch 2.

The refactoring does not depend on the new UAPI flags and could be a 
separate patch placed earlier in the series. That way a reviewer can 
verify it is a no-op without the new flag additions getting in the way. 
The new masks (BPF_F_ADJ_ROOM_DECAP_L4_MASK, 
BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) can be introduced together in patch 4, 
where they are first used.
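
One way to check the no-op claim outside the kernel is sketched below. It is a user-space model only: the F_* names are shortened stand-ins for the BPF_F_ADJ_ROOM_* flags, and the bit positions are assumptions that should be confirmed against include/uapi/linux/bpf.h. The result does not depend on the exact positions as long as the bits are distinct.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for existing flags; bit positions are assumptions. */
#define F_FIXED_GSO	(1ULL << 0)
#define F_ENCAP_L3_IPV4	(1ULL << 1)
#define F_ENCAP_L3_IPV6	(1ULL << 2)
#define F_ENCAP_L4_GRE	(1ULL << 3)
#define F_ENCAP_L4_UDP	(1ULL << 4)
#define F_NO_CSUM_RESET	(1ULL << 5)
#define F_ENCAP_L2_ETH	(1ULL << 6)
#define F_DECAP_L3_IPV4	(1ULL << 7)
#define F_DECAP_L3_IPV6	(1ULL << 8)
#define F_ENCAP_L2(len)	(((uint64_t)(len) & 0xffULL) << 56)

#define ENCAP_L3_MASK	(F_ENCAP_L3_IPV4 | F_ENCAP_L3_IPV6)
#define DECAP_L3_MASK	(F_DECAP_L3_IPV4 | F_DECAP_L3_IPV6)

/* Before: NO_CSUM_RESET was OR-ed in at the call site. */
#define OLD_ROOM_MASK	(F_FIXED_GSO | ENCAP_L3_MASK | F_ENCAP_L4_GRE | \
			 F_ENCAP_L4_UDP | F_ENCAP_L2_ETH | \
			 F_ENCAP_L2(0xff) | DECAP_L3_MASK)
#define OLD_ACCEPTED	(OLD_ROOM_MASK | F_NO_CSUM_RESET)

/* After: split into ENCAP/DECAP masks, NO_CSUM_RESET folded in. */
#define NEW_ENCAP_MASK	(ENCAP_L3_MASK | F_ENCAP_L4_GRE | F_ENCAP_L4_UDP | \
			 F_ENCAP_L2_ETH | F_ENCAP_L2(0xff))
#define NEW_DECAP_MASK	(DECAP_L3_MASK)
#define NEW_ROOM_MASK	(F_FIXED_GSO | NEW_ENCAP_MASK | NEW_DECAP_MASK | \
			 F_NO_CSUM_RESET)

static inline int old_flags_valid(uint64_t flags)
{
	return !(flags & ~OLD_ACCEPTED);
}

static inline int new_flags_valid(uint64_t flags)
{
	return !(flags & ~NEW_ROOM_MASK);
}

/* Every single bit must be accepted or rejected identically. */
static inline void check_refactor_is_noop(void)
{
	int bit;

	assert(OLD_ACCEPTED == NEW_ROOM_MASK);
	for (bit = 0; bit < 64; bit++)
		assert(old_flags_valid(1ULL << bit) ==
		       new_flags_valid(1ULL << bit));
}
```

Because the new DECAP_L4 and IPXIP masks are left out of this model, the comparison stays between the old and the refactored definitions only, which is the property a standalone refactoring patch would need to preserve.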

>   
>   static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>   			    u64 flags)
> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>   	unsigned int gso_type = SKB_GSO_DODGY;
>   	int ret;
>   
> +	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
> +			       BPF_F_ADJ_ROOM_NO_CSUM_RESET |
> +			       BPF_F_ADJ_ROOM_FIXED_GSO)))

In which case will this new check be hit?

> +		return -EINVAL;
> +
>   	if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
>   		/* udp gso_size delineates datagrams, only allow if fixed */
>   		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
> @@ -3611,8 +3627,8 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
>   {
>   	int ret;
>   
> -	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
> -			       BPF_F_ADJ_ROOM_DECAP_L3_MASK |
> +	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_DECAP_MASK |
> +			       BPF_F_ADJ_ROOM_FIXED_GSO |
>   			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
>   		return -EINVAL;
>   
> @@ -3708,8 +3724,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>   	u32 off;
>   	int ret;
>   
> -	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_MASK |
> -			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
> +	if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
>   		return -EINVAL;
>   	if (unlikely(len_diff_abs > 0xfffU))
>   		return -EFAULT;



* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
  2026-03-18 20:02   ` Willem de Bruijn
  2026-03-21  0:40   ` Willem de Bruijn
@ 2026-03-24 18:30   ` Martin KaFai Lau
  2026-03-26 17:02     ` Hudson, Nick
  2 siblings, 1 reply; 20+ messages in thread
From: Martin KaFai Lau @ 2026-03-24 18:30 UTC (permalink / raw)
  To: Nick Hudson
  Cc: Willem de Bruijn, Max Tottenham, Anna Glasgall, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf, netdev,
	linux-kernel



On 3/18/26 6:42 AM, Nick Hudson wrote:
> Add checks to require shrink-only decap, reject conflicting decap flag
> combinations, and verify removed length is sufficient for claimed header
> decapsulation.
> 
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
>   net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>   1 file changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 7c2871b40fe4..47aec44a9cd3 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -56,6 +56,7 @@
>   #include <net/sock_reuseport.h>
>   #include <net/busy_poll.h>
>   #include <net/tcp.h>
> +#include <net/gre.h>
>   #include <net/xfrm.h>
>   #include <net/udp.h>
>   #include <linux/bpf_trace.h>
> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>   					 BPF_F_ADJ_ROOM_ENCAP_L2( \
>   					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
>   
> -#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
> +					 BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>   
>   #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
>   					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>   		return -ENOTSUPP;
>   	}
>   
> -	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> +	if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {

This change should be done together with the macro refactoring patch 
mentioned in patch 3.

> +		u32 len_decap_min = 0;
> +
>   		if (!shrink)
>   			return -EINVAL;
>   
> -		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
> +		/* Reject mutually exclusive decap flag pairs. */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_L3_MASK)

IIUC, this 'if' and the len_min assignment changes below replace the 
existing switch statement. Please separate this no-op change from the 
new flag validation logic. It is small enough to fold into the macro 
refactoring patch as well.

> +			return -EINVAL;
> +
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
> +			return -EINVAL;
> +
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
> +		    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> +			return -EINVAL;
> +
> +		/* Reject mutually exclusive decap tunnel type flags. */
> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
> +		    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
> +			return -EINVAL;
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
> +			len_decap_min += sizeof(struct udphdr);
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> +			len_decap_min += sizeof(struct gre_base_hdr);
> +
> +		if (len_diff_abs < len_decap_min)
> +			return -EINVAL;
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>   			len_min = sizeof(struct iphdr);
> -			break;
> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
> +
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>   			len_min = sizeof(struct ipv6hdr);
> -			break;
> -		default:
> -			return -EINVAL;
> -		}
>   	}
>   
>   	len_cur = skb->len - skb_network_offset(skb);



* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
  2026-03-24 18:30   ` Martin KaFai Lau
@ 2026-03-26 17:02     ` Hudson, Nick
  0 siblings, 0 replies; 20+ messages in thread
From: Hudson, Nick @ 2026-03-26 17:02 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org




> On Mar 24, 2026, at 6:30 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
> 
> On 3/18/26 6:42 AM, Nick Hudson wrote:
>> Add checks to require shrink-only decap, reject conflicting decap flag
>> combinations, and verify removed length is sufficient for claimed header
>> decapsulation.
>> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
>> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
>> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Nick Hudson <nhudson@akamai.com>
>> ---
>>  net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>>  1 file changed, 37 insertions(+), 10 deletions(-)
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 7c2871b40fe4..47aec44a9cd3 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -56,6 +56,7 @@
>>  #include <net/sock_reuseport.h>
>>  #include <net/busy_poll.h>
>>  #include <net/tcp.h>
>> +#include <net/gre.h>
>>  #include <net/xfrm.h>
>>  #include <net/udp.h>
>>  #include <linux/bpf_trace.h>
>> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>>  					 BPF_F_ADJ_ROOM_ENCAP_L2( \
>>  					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
>>  -#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +#define BPF_F_ADJ_ROOM_DECAP_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
>> +					 BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
>> +					 BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>>    #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
>>  					 BPF_F_ADJ_ROOM_ENCAP_MASK | \
>> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>>  		return -ENOTSUPP;
>>  	}
>>  -	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> +	if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> 
> This change should be done together with the macro refactoring patch mentioned in patch 3.

OK, will send a new version with it done this way.

> 
>> +		u32 len_decap_min = 0;
>> +
>>  		if (!shrink)
>>  			return -EINVAL;
>>  -		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
>> +		/* Reject mutually exclusive decap flag pairs. */
>> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
>> +		    BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> 
> iiuc, this 'if' and the len_min assignment changes below replace the existing switch case. Please separate this no-op change from the new flag validation logic. It is small enough to be done together in the macro refactoring patch also.
> 
>> +			return -EINVAL;
>> +
>> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
>> +		    BPF_F_ADJ_ROOM_DECAP_L4_MASK)
>> +			return -EINVAL;
>> +
>> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
>> +		    BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>> +			return -EINVAL;
>> +
>> +		/* Reject mutually exclusive decap tunnel type flags. */
>> +		if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
>> +		    (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
>> +			return -EINVAL;
>> +
>> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
>> +			len_decap_min += sizeof(struct udphdr);
>> +
>> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
>> +			len_decap_min += sizeof(struct gre_base_hdr);
>> +
>> +		if (len_diff_abs < len_decap_min)
>> +			return -EINVAL;
>> +
>> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>>  			len_min = sizeof(struct iphdr);
>> -			break;
>> -		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
>> +
>> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>>  			len_min = sizeof(struct ipv6hdr);
>> -			break;
>> -		default:
>> -			return -EINVAL;
>> -		}
>>  	}
>>    	len_cur = skb->len - skb_network_offset(skb);
> 



* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-24 18:12   ` Martin KaFai Lau
@ 2026-03-26 17:02     ` Hudson, Nick
  2026-03-26 17:49       ` Martin KaFai Lau
  0 siblings, 1 reply; 20+ messages in thread
From: Hudson, Nick @ 2026-03-26 17:02 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org




> On Mar 24, 2026, at 6:12 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
> 
> On 3/18/26 6:42 AM, Nick Hudson wrote:
>> Introduce helper masks for bpf_skb_adjust_room() flags to simplify
>> validation logic:
>> - BPF_F_ADJ_ROOM_DECAP_L4_MASK
>> - BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
>> - BPF_F_ADJ_ROOM_ENCAP_MASK
>> - BPF_F_ADJ_ROOM_DECAP_MASK
>> Add flag validation to bpf_skb_net_grow() to reject invalid encap
>> flags early. Refactor existing validation checks in bpf_skb_net_shrink()
>> and bpf_skb_adjust_room() to use the new masks (no behavior change).
>> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
>> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
>> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Nick Hudson <nhudson@akamai.com>
>> ---
>>  net/core/filter.c | 31 +++++++++++++++++++++++--------
>>  1 file changed, 23 insertions(+), 8 deletions(-)
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 0d5d5a17acb2..7c2871b40fe4 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3483,14 +3483,25 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>>  #define BPF_F_ADJ_ROOM_DECAP_L3_MASK (BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
>>    BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>>  -#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> -  BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
>> +#define BPF_F_ADJ_ROOM_DECAP_L4_MASK (BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
>> +  BPF_F_ADJ_ROOM_DECAP_L4_GRE)
>> +
>> +#define BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK (BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
>> +  BPF_F_ADJ_ROOM_DECAP_IPXIP6)
>> +
>> +#define BPF_F_ADJ_ROOM_ENCAP_MASK (BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
>>    BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
>>    BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
>>    BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
>>    BPF_F_ADJ_ROOM_ENCAP_L2( \
>> -   BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
>> -  BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +   BPF_ADJ_ROOM_ENCAP_L2_MASK))
>> +
>> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +
>> +#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> +  BPF_F_ADJ_ROOM_ENCAP_MASK | \
>> +  BPF_F_ADJ_ROOM_DECAP_MASK | \
>> +  BPF_F_ADJ_ROOM_NO_CSUM_RESET)
> 
> The patch does two things: refactoring of existing macros (BPF_F_ADJ_ROOM_ENCAP_MASK, BPF_F_ADJ_ROOM_DECAP_MASK) and new additions (BPF_F_ADJ_ROOM_DECAP_L4_MASK, BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) that depend on the new flags from the UAPI changes in patch 2.
> 
> The refactoring does not depend on the new UAPI flags and could be a separate patch placed earlier in the series. That way a reviewer can verify it is a no-op without the new flag additions getting in
> the way. The (BPF_F_ADJ_ROOM_DECAP_L4_MASK, BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) can be introduced together in patch 4 when it is first used.

OK, will split further.

> 
>>    static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>       u64 flags)
>> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>   unsigned int gso_type = SKB_GSO_DODGY;
>>   int ret;
>>  + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
>> +        BPF_F_ADJ_ROOM_NO_CSUM_RESET |
>> +        BPF_F_ADJ_ROOM_FIXED_GSO)))
> 
> Under which case this new check will be hit?

If a user supplies a positive len_diff and attempts to pass a DECAP flag.

The commit message had

    Add flag validation to bpf_skb_net_grow() to reject invalid encap
    flags early.

> 
>> + return -EINVAL;
>> +
>>   if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
>>   /* udp gso_size delineates datagrams, only allow if fixed */
>>   if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
>> @@ -3611,8 +3627,8 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
>>  {
>>   int ret;
>>  - if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
>> -        BPF_F_ADJ_ROOM_DECAP_L3_MASK |
>> + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_DECAP_MASK |
>> +        BPF_F_ADJ_ROOM_FIXED_GSO |
>>          BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
>>   return -EINVAL;
>>  @@ -3708,8 +3724,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>>   u32 off;
>>   int ret;
>>  - if (unlikely(flags & ~(BPF_F_ADJ_ROOM_MASK |
>> -        BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
>> + if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
>>   return -EINVAL;
>>   if (unlikely(len_diff_abs > 0xfffU))
>>   return -EFAULT;




* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-26 17:02     ` Hudson, Nick
@ 2026-03-26 17:49       ` Martin KaFai Lau
  2026-03-27 10:55         ` Hudson, Nick
  0 siblings, 1 reply; 20+ messages in thread
From: Martin KaFai Lau @ 2026-03-26 17:49 UTC (permalink / raw)
  To: Hudson, Nick
  Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org



On 3/26/26 10:02 AM, Hudson, Nick wrote:
>>>     static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>>        u64 flags)
>>> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>>    unsigned int gso_type = SKB_GSO_DODGY;
>>>    int ret;
>>>   + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
>>> +        BPF_F_ADJ_ROOM_NO_CSUM_RESET |
>>> +        BPF_F_ADJ_ROOM_FIXED_GSO)))
>> Under which case this new check will be hit?
> If a user supplies +ve len_diff and attempts to pass a DECAP flag.
> 
> The commit message had
> 
>      Add flag validation to bpf_skb_net_grow() to reject invalid encap
>      flags early.

There is already a DECAP_MASK check in bpf_skb_adjust_room(), and it 
rejects !shrink. What am I missing?
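
The reachability argument can be modelled in user space as below. This is a sketch only: model_adjust_room() and model_net_grow() are hypothetical names, DECAP_MASK uses placeholder bits, and only the DECAP part of the flag validation is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder bits standing in for BPF_F_ADJ_ROOM_DECAP_MASK. */
#define DECAP_MASK	((1ULL << 7) | (1ULL << 8))

/* Model of the grow path: counts how often its DECAP rejection fires. */
static inline int model_net_grow(uint64_t flags, unsigned int *grow_rejects)
{
	if (flags & DECAP_MASK) {
		(*grow_rejects)++;
		return -EINVAL;
	}
	return 0;
}

/* Model of the caller: DECAP flags are only allowed when shrinking. */
static inline int model_adjust_room(int32_t len_diff, uint64_t flags,
				    unsigned int *grow_rejects)
{
	int shrink = len_diff < 0;

	if ((flags & DECAP_MASK) && !shrink)
		return -EINVAL;		/* rejected before the grow path */

	return shrink ? 0 : model_net_grow(flags, grow_rejects);
}

/* Usage: try every DECAP combination with a positive len_diff. */
static inline unsigned int count_grow_rejects(void)
{
	unsigned int grow_rejects = 0;
	uint64_t flags;

	for (flags = 0; flags <= DECAP_MASK; flags += 1ULL << 7) {
		int ret = model_adjust_room(8, flags, &grow_rejects);

		/* Only the flag-free call may succeed. */
		assert(flags == 0 ? ret == 0 : ret == -EINVAL);
	}
	return grow_rejects;
}
```

In this model the counter stays at zero for a positive len_diff, because the caller returns -EINVAL before the grow path runs. That matches the observation that the added check in the grow helper is not reachable with a DECAP flag.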


* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
  2026-03-26 17:49       ` Martin KaFai Lau
@ 2026-03-27 10:55         ` Hudson, Nick
  0 siblings, 0 replies; 20+ messages in thread
From: Hudson, Nick @ 2026-03-27 10:55 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org




> On Mar 26, 2026, at 5:49 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
> 
> On 3/26/26 10:02 AM, Hudson, Nick wrote:
>>>>    static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>>>       u64 flags)
>>>> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>>>   unsigned int gso_type = SKB_GSO_DODGY;
>>>>   int ret;
>>>>  + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
>>>> +        BPF_F_ADJ_ROOM_NO_CSUM_RESET |
>>>> +        BPF_F_ADJ_ROOM_FIXED_GSO)))
>>> Under which case this new check will be hit?
>> If a user supplies +ve len_diff and attempts to pass a DECAP flag.
>> The commit message had
>>     Add flag validation to bpf_skb_net_grow() to reject invalid encap
>>     flags early.
> 
> There is DECAP_MASK check in bpf_skb_adjust_room() and then !shrink is rejected. What am I missing?

Duh, right.

Do you prefer to do all the flag checking in bpf_skb_adjust_room(), or to keep the encap/decap split?



Thread overview: 20+ messages
     [not found] <20260318134242.2725749-1-nhudson@akamai.com>
2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
2026-03-21  0:39   ` Willem de Bruijn
2026-03-24 17:34   ` Martin KaFai Lau
2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
2026-03-21  0:39   ` Willem de Bruijn
2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
2026-03-21  0:39   ` Willem de Bruijn
2026-03-24 18:12   ` Martin KaFai Lau
2026-03-26 17:02     ` Hudson, Nick
2026-03-26 17:49       ` Martin KaFai Lau
2026-03-27 10:55         ` Hudson, Nick
2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
2026-03-18 20:02   ` Willem de Bruijn
2026-03-19  8:17     ` Hudson, Nick
2026-03-19 13:24       ` Willem de Bruijn
2026-03-21  0:40   ` Willem de Bruijn
2026-03-24 18:30   ` Martin KaFai Lau
2026-03-26 17:02     ` Hudson, Nick
2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
2026-03-18 20:09   ` Willem de Bruijn
