* [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags
[not found] <20260318134242.2725749-1-nhudson@akamai.com>
@ 2026-03-18 13:42 ` Nick Hudson
2026-03-21 0:39 ` Willem de Bruijn
2026-03-24 17:34 ` Martin KaFai Lau
2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
` (3 subsequent siblings)
4 siblings, 2 replies; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
To: bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
linux-kernel
Name the existing anonymous enum for BPF_FUNC_skb_adjust_room flags
enum bpf_adj_room_flags to enable CO-RE (Compile Once - Run
Everywhere) lookups in BPF programs.
Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
include/uapi/linux/bpf.h | 2 +-
tools/include/uapi/linux/bpf.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c8d400b7680a..bc4b25eb72ce 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -6209,7 +6209,7 @@ enum {
};
/* BPF_FUNC_skb_adjust_room flags. */
-enum {
+enum bpf_adj_room_flags {
BPF_F_ADJ_ROOM_FIXED_GSO = (1ULL << 0),
BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 = (1ULL << 1),
BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 = (1ULL << 2),
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 5e38b4887de6..db2c520d0e92 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -6209,7 +6209,7 @@ enum {
};
/* BPF_FUNC_skb_adjust_room flags. */
-enum {
+enum bpf_adj_room_flags {
BPF_F_ADJ_ROOM_FIXED_GSO = (1ULL << 0),
BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 = (1ULL << 1),
BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 = (1ULL << 2),
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation
[not found] <20260318134242.2725749-1-nhudson@akamai.com>
2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
2026-03-21 0:39 ` Willem de Bruijn
2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
` (2 subsequent siblings)
4 siblings, 1 reply; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
To: bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
linux-kernel
Add new bpf_skb_adjust_room() decapsulation flags:
- BPF_F_ADJ_ROOM_DECAP_L4_GRE
- BPF_F_ADJ_ROOM_DECAP_L4_UDP
- BPF_F_ADJ_ROOM_DECAP_IPXIP4
- BPF_F_ADJ_ROOM_DECAP_IPXIP6
These flags let BPF programs describe which tunnel layer is being
removed, so later changes can update tunnel-related GSO state
accordingly during decapsulation.
This patch only introduces the UAPI flag definitions and helper
documentation.
Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
include/uapi/linux/bpf.h | 34 ++++++++++++++++++++++++++++++++--
tools/include/uapi/linux/bpf.h | 34 ++++++++++++++++++++++++++++++++--
2 files changed, 64 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index bc4b25eb72ce..2ef886dc9685 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3010,8 +3010,34 @@ union bpf_attr {
*
* * **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
* **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
- * Indicate the new IP header version after decapsulating the outer
- * IP header. Used when the inner and outer IP versions are different.
+ * Indicate the new IP header version after decapsulating the
+ * outer IP header. Used when the inner and outer IP versions
+ * are different. These flags only trigger a protocol change
+ * without clearing any tunnel-specific GSO flags.
+ *
+ * * **BPF_F_ADJ_ROOM_DECAP_L4_GRE**:
+ * Clear GRE tunnel GSO flags (SKB_GSO_GRE and SKB_GSO_GRE_CSUM)
+ * when decapsulating a GRE tunnel.
+ *
+ * * **BPF_F_ADJ_ROOM_DECAP_L4_UDP**:
+ * Clear UDP tunnel GSO flags (SKB_GSO_UDP_TUNNEL and
+ * SKB_GSO_UDP_TUNNEL_CSUM) when decapsulating a UDP tunnel.
+ *
+ * * **BPF_F_ADJ_ROOM_DECAP_IPXIP4**:
+ * Clear IPIP/SIT tunnel GSO flag (SKB_GSO_IPXIP4) when decapsulating
+ * a tunnel with an outer IPv4 header (IPv4-in-IPv4 or IPv6-in-IPv4).
+ *
+ * * **BPF_F_ADJ_ROOM_DECAP_IPXIP6**:
+ * Clear IPv6 encapsulation tunnel GSO flag (SKB_GSO_IPXIP6) when
+ * decapsulating a tunnel with an outer IPv6 header (IPv6-in-IPv6
+ * or IPv4-in-IPv6).
+ *
+ * When using the decapsulation flags above, the skb->encapsulation
+ * flag is automatically cleared if all tunnel-specific GSO flags
+ * (SKB_GSO_UDP_TUNNEL, SKB_GSO_UDP_TUNNEL_CSUM, SKB_GSO_GRE,
+ * SKB_GSO_GRE_CSUM, SKB_GSO_IPXIP4, SKB_GSO_IPXIP6) have been
+ * removed from the packet. This handles cases where all tunnel
+ * layers have been decapsulated.
*
* A call to this helper is susceptible to change the underlying
* packet buffer. Therefore, at load time, all checks on pointers
@@ -6219,6 +6245,10 @@ enum bpf_adj_room_flags {
BPF_F_ADJ_ROOM_ENCAP_L2_ETH = (1ULL << 6),
BPF_F_ADJ_ROOM_DECAP_L3_IPV4 = (1ULL << 7),
BPF_F_ADJ_ROOM_DECAP_L3_IPV6 = (1ULL << 8),
+ BPF_F_ADJ_ROOM_DECAP_L4_GRE = (1ULL << 9),
+ BPF_F_ADJ_ROOM_DECAP_L4_UDP = (1ULL << 10),
+ BPF_F_ADJ_ROOM_DECAP_IPXIP4 = (1ULL << 11),
+ BPF_F_ADJ_ROOM_DECAP_IPXIP6 = (1ULL << 12),
};
enum {
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index db2c520d0e92..e9a5c67ff5e2 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3010,8 +3010,34 @@ union bpf_attr {
*
* * **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
* **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
- * Indicate the new IP header version after decapsulating the outer
- * IP header. Used when the inner and outer IP versions are different.
+ * Indicate the new IP header version after decapsulating the
+ * outer IP header. Used when the inner and outer IP versions
+ * are different. These flags only trigger a protocol change
+ * without clearing any tunnel-specific GSO flags.
+ *
+ * * **BPF_F_ADJ_ROOM_DECAP_L4_GRE**:
+ * Clear GRE tunnel GSO flags (SKB_GSO_GRE and SKB_GSO_GRE_CSUM)
+ * when decapsulating a GRE tunnel.
+ *
+ * * **BPF_F_ADJ_ROOM_DECAP_L4_UDP**:
+ * Clear UDP tunnel GSO flags (SKB_GSO_UDP_TUNNEL and
+ * SKB_GSO_UDP_TUNNEL_CSUM) when decapsulating a UDP tunnel.
+ *
+ * * **BPF_F_ADJ_ROOM_DECAP_IPXIP4**:
+ * Clear IPIP/SIT tunnel GSO flag (SKB_GSO_IPXIP4) when decapsulating
+ * a tunnel with an outer IPv4 header (IPv4-in-IPv4 or IPv6-in-IPv4).
+ *
+ * * **BPF_F_ADJ_ROOM_DECAP_IPXIP6**:
+ * Clear IPv6 encapsulation tunnel GSO flag (SKB_GSO_IPXIP6) when
+ * decapsulating a tunnel with an outer IPv6 header (IPv6-in-IPv6
+ * or IPv4-in-IPv6).
+ *
+ * When using the decapsulation flags above, the skb->encapsulation
+ * flag is automatically cleared if all tunnel-specific GSO flags
+ * (SKB_GSO_UDP_TUNNEL, SKB_GSO_UDP_TUNNEL_CSUM, SKB_GSO_GRE,
+ * SKB_GSO_GRE_CSUM, SKB_GSO_IPXIP4, SKB_GSO_IPXIP6) have been
+ * removed from the packet. This handles cases where all tunnel
+ * layers have been decapsulated.
*
* A call to this helper is susceptible to change the underlying
* packet buffer. Therefore, at load time, all checks on pointers
@@ -6219,6 +6245,10 @@ enum bpf_adj_room_flags {
BPF_F_ADJ_ROOM_ENCAP_L2_ETH = (1ULL << 6),
BPF_F_ADJ_ROOM_DECAP_L3_IPV4 = (1ULL << 7),
BPF_F_ADJ_ROOM_DECAP_L3_IPV6 = (1ULL << 8),
+ BPF_F_ADJ_ROOM_DECAP_L4_GRE = (1ULL << 9),
+ BPF_F_ADJ_ROOM_DECAP_L4_UDP = (1ULL << 10),
+ BPF_F_ADJ_ROOM_DECAP_IPXIP4 = (1ULL << 11),
+ BPF_F_ADJ_ROOM_DECAP_IPXIP6 = (1ULL << 12),
};
enum {
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
[not found] <20260318134242.2725749-1-nhudson@akamai.com>
2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
2026-03-21 0:39 ` Willem de Bruijn
2026-03-24 18:12 ` Martin KaFai Lau
2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
4 siblings, 2 replies; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
To: bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-kernel
Introduce helper masks for bpf_skb_adjust_room() flags to simplify
validation logic:
- BPF_F_ADJ_ROOM_DECAP_L4_MASK
- BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
- BPF_F_ADJ_ROOM_ENCAP_MASK
- BPF_F_ADJ_ROOM_DECAP_MASK
Add flag validation to bpf_skb_net_grow() to reject invalid encap
flags early. Refactor existing validation checks in bpf_skb_net_shrink()
and bpf_skb_adjust_room() to use the new masks (no behavior change).
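The split into per-direction masks can be sketched in userspace as follows. This is only an illustration of the validation pattern, not the kernel code: the ADJ_* names and the check_grow_flags()/check_shrink_flags() helpers are hypothetical stand-ins, and the bit positions are copied from enum bpf_adj_room_flags so the sketch builds outside the kernel tree.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace mirrors of the UAPI flag bits. */
#define ADJ_FIXED_GSO		(1ULL << 0)
#define ADJ_ENCAP_L3_IPV4	(1ULL << 1)
#define ADJ_ENCAP_L3_IPV6	(1ULL << 2)
#define ADJ_ENCAP_L4_GRE	(1ULL << 3)
#define ADJ_ENCAP_L4_UDP	(1ULL << 4)
#define ADJ_NO_CSUM_RESET	(1ULL << 5)
#define ADJ_ENCAP_L2_ETH	(1ULL << 6)
#define ADJ_DECAP_L3_IPV4	(1ULL << 7)
#define ADJ_DECAP_L3_IPV6	(1ULL << 8)
/* Inner L2 header length field carried in the top byte of flags. */
#define ADJ_ENCAP_L2_LEN	(0xffULL << 56)

#define ADJ_ENCAP_MASK	(ADJ_ENCAP_L3_IPV4 | ADJ_ENCAP_L3_IPV6 | \
			 ADJ_ENCAP_L4_GRE | ADJ_ENCAP_L4_UDP | \
			 ADJ_ENCAP_L2_ETH | ADJ_ENCAP_L2_LEN)
/* At this point in the series only the L3 decap flags exist. */
#define ADJ_DECAP_MASK	(ADJ_DECAP_L3_IPV4 | ADJ_DECAP_L3_IPV6)
/* Flags accepted in both directions. */
#define ADJ_COMMON	(ADJ_FIXED_GSO | ADJ_NO_CSUM_RESET)

/* Grow path: any bit outside encap + common flags is rejected. */
int check_grow_flags(uint64_t flags)
{
	return (flags & ~(ADJ_ENCAP_MASK | ADJ_COMMON)) ? -EINVAL : 0;
}

/* Shrink path: any bit outside decap + common flags is rejected. */
int check_shrink_flags(uint64_t flags)
{
	return (flags & ~(ADJ_DECAP_MASK | ADJ_COMMON)) ? -EINVAL : 0;
}
```

With these helpers, a decap flag passed on the grow path fails early, which is the new behavior this patch adds to bpf_skb_net_grow().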
Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
net/core/filter.c | 31 +++++++++++++++++++++++--------
1 file changed, 23 insertions(+), 8 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index 0d5d5a17acb2..7c2871b40fe4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3483,14 +3483,25 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
#define BPF_F_ADJ_ROOM_DECAP_L3_MASK (BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
-#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
- BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
+#define BPF_F_ADJ_ROOM_DECAP_L4_MASK (BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
+ BPF_F_ADJ_ROOM_DECAP_L4_GRE)
+
+#define BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK (BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
+ BPF_F_ADJ_ROOM_DECAP_IPXIP6)
+
+#define BPF_F_ADJ_ROOM_ENCAP_MASK (BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
BPF_F_ADJ_ROOM_ENCAP_L2( \
- BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
- BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+ BPF_ADJ_ROOM_ENCAP_L2_MASK))
+
+#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+
+#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
+ BPF_F_ADJ_ROOM_ENCAP_MASK | \
+ BPF_F_ADJ_ROOM_DECAP_MASK | \
+ BPF_F_ADJ_ROOM_NO_CSUM_RESET)
static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
u64 flags)
@@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
unsigned int gso_type = SKB_GSO_DODGY;
int ret;
+ if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
+ BPF_F_ADJ_ROOM_NO_CSUM_RESET |
+ BPF_F_ADJ_ROOM_FIXED_GSO)))
+ return -EINVAL;
+
if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
/* udp gso_size delineates datagrams, only allow if fixed */
if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
@@ -3611,8 +3627,8 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
{
int ret;
- if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
- BPF_F_ADJ_ROOM_DECAP_L3_MASK |
+ if (unlikely(flags & ~(BPF_F_ADJ_ROOM_DECAP_MASK |
+ BPF_F_ADJ_ROOM_FIXED_GSO |
BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
return -EINVAL;
@@ -3708,8 +3724,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
u32 off;
int ret;
- if (unlikely(flags & ~(BPF_F_ADJ_ROOM_MASK |
- BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
+ if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
return -EINVAL;
if (unlikely(len_diff_abs > 0xfffU))
return -EFAULT;
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
[not found] <20260318134242.2725749-1-nhudson@akamai.com>
` (2 preceding siblings ...)
2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
2026-03-18 20:02 ` Willem de Bruijn
` (2 more replies)
2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
4 siblings, 3 replies; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
To: bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-kernel
Add checks to require shrink-only decap, reject conflicting decap flag
combinations, and verify removed length is sufficient for claimed header
decapsulation.
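The checks described above can be sketched in userspace as follows. This is a simplified illustration under stated assumptions, not the kernel implementation: check_decap() and the DECAP_* names are hypothetical, the bit positions follow the new UAPI flags, and the header sizes are hard-coded stand-ins for sizeof(struct udphdr) and sizeof(struct gre_base_hdr).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirrors of the decap flag bits. */
#define DECAP_L3_IPV4	(1ULL << 7)
#define DECAP_L3_IPV6	(1ULL << 8)
#define DECAP_L4_GRE	(1ULL << 9)
#define DECAP_L4_UDP	(1ULL << 10)
#define DECAP_IPXIP4	(1ULL << 11)
#define DECAP_IPXIP6	(1ULL << 12)

#define DECAP_L3_MASK		(DECAP_L3_IPV4 | DECAP_L3_IPV6)
#define DECAP_L4_MASK		(DECAP_L4_GRE | DECAP_L4_UDP)
#define DECAP_IPXIP_MASK	(DECAP_IPXIP4 | DECAP_IPXIP6)
#define DECAP_MASK		(DECAP_L3_MASK | DECAP_L4_MASK | DECAP_IPXIP_MASK)

#define UDP_HDR_LEN		8u	/* stand-in for sizeof(struct udphdr) */
#define GRE_BASE_HDR_LEN	4u	/* stand-in for sizeof(struct gre_base_hdr) */

int check_decap(uint64_t flags, bool shrink, uint32_t len_diff_abs)
{
	uint32_t len_decap_min = 0;

	if (!(flags & DECAP_MASK))
		return 0;

	/* Decap flags only make sense when removing bytes. */
	if (!shrink)
		return -EINVAL;

	/* Both flags of one pair set at once is contradictory. */
	if ((flags & DECAP_L3_MASK) == DECAP_L3_MASK ||
	    (flags & DECAP_L4_MASK) == DECAP_L4_MASK ||
	    (flags & DECAP_IPXIP_MASK) == DECAP_IPXIP_MASK)
		return -EINVAL;

	/* An L4 tunnel and an IP-in-IP tunnel cannot be removed together. */
	if ((flags & DECAP_L4_MASK) && (flags & DECAP_IPXIP_MASK))
		return -EINVAL;

	if (flags & DECAP_L4_UDP)
		len_decap_min += UDP_HDR_LEN;
	if (flags & DECAP_L4_GRE)
		len_decap_min += GRE_BASE_HDR_LEN;

	/* The removed length must cover the claimed tunnel header. */
	if (len_diff_abs < len_decap_min)
		return -EINVAL;

	return 0;
}
```

For example, check_decap(DECAP_L4_UDP, true, 7) is rejected because fewer than 8 bytes are removed, while the same call with 8 bytes passes.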
Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
1 file changed, 37 insertions(+), 10 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index 7c2871b40fe4..47aec44a9cd3 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -56,6 +56,7 @@
#include <net/sock_reuseport.h>
#include <net/busy_poll.h>
#include <net/tcp.h>
+#include <net/gre.h>
#include <net/xfrm.h>
#include <net/udp.h>
#include <linux/bpf_trace.h>
@@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
BPF_F_ADJ_ROOM_ENCAP_L2( \
BPF_ADJ_ROOM_ENCAP_L2_MASK))
-#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
+ BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
+ BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
BPF_F_ADJ_ROOM_ENCAP_MASK | \
@@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
return -ENOTSUPP;
}
- if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
+ if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
+ u32 len_decap_min = 0;
+
if (!shrink)
return -EINVAL;
- switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
- case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
+ /* Reject mutually exclusive decap flag pairs. */
+ if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
+ BPF_F_ADJ_ROOM_DECAP_L3_MASK)
+ return -EINVAL;
+
+ if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
+ BPF_F_ADJ_ROOM_DECAP_L4_MASK)
+ return -EINVAL;
+
+ if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
+ BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
+ return -EINVAL;
+
+ /* Reject mutually exclusive decap tunnel type flags. */
+ if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
+ (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
+ return -EINVAL;
+
+ if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
+ len_decap_min += sizeof(struct udphdr);
+
+ if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
+ len_decap_min += sizeof(struct gre_base_hdr);
+
+ if (len_diff_abs < len_decap_min)
+ return -EINVAL;
+
+ if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
len_min = sizeof(struct iphdr);
- break;
- case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
+
+ if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
len_min = sizeof(struct ipv6hdr);
- break;
- default:
- return -EINVAL;
- }
}
len_cur = skb->len - skb_network_offset(skb);
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room
[not found] <20260318134242.2725749-1-nhudson@akamai.com>
` (3 preceding siblings ...)
2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
@ 2026-03-18 13:42 ` Nick Hudson
2026-03-18 20:09 ` Willem de Bruijn
4 siblings, 1 reply; 20+ messages in thread
From: Nick Hudson @ 2026-03-18 13:42 UTC (permalink / raw)
To: bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-kernel
On shrink in bpf_skb_adjust_room(), clear tunnel-specific GSO flags
according to the decapsulation flags:
- BPF_F_ADJ_ROOM_DECAP_L4_UDP clears SKB_GSO_UDP_TUNNEL{,_CSUM}
- BPF_F_ADJ_ROOM_DECAP_L4_GRE clears SKB_GSO_GRE{,_CSUM}
- BPF_F_ADJ_ROOM_DECAP_IPXIP4 clears SKB_GSO_IPXIP4
- BPF_F_ADJ_ROOM_DECAP_IPXIP6 clears SKB_GSO_IPXIP6
When all tunnel-related GSO bits are cleared, also clear
skb->encapsulation.
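The clearing rule above can be modelled in userspace as follows. This is a sketch only: struct mock_skb, clear_decap_gso() and the GSO_* bit positions are hypothetical and chosen for illustration; the kernel's SKB_GSO_* values and struct sk_buff are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirrors of the decap flag bits. */
#define DECAP_L3_IPV4	(1ULL << 7)
#define DECAP_L3_IPV6	(1ULL << 8)
#define DECAP_L4_GRE	(1ULL << 9)
#define DECAP_L4_UDP	(1ULL << 10)
#define DECAP_IPXIP4	(1ULL << 11)
#define DECAP_IPXIP6	(1ULL << 12)
#define DECAP_MASK	(DECAP_L3_IPV4 | DECAP_L3_IPV6 | DECAP_L4_GRE | \
			 DECAP_L4_UDP | DECAP_IPXIP4 | DECAP_IPXIP6)

/* Illustrative GSO bits; NOT the kernel's SKB_GSO_* values. */
#define GSO_GRE			(1u << 0)
#define GSO_GRE_CSUM		(1u << 1)
#define GSO_IPXIP4		(1u << 2)
#define GSO_IPXIP6		(1u << 3)
#define GSO_UDP_TUNNEL		(1u << 4)
#define GSO_UDP_TUNNEL_CSUM	(1u << 5)
#define GSO_TUNNEL_MASK		(GSO_GRE | GSO_GRE_CSUM | GSO_IPXIP4 | \
				 GSO_IPXIP6 | GSO_UDP_TUNNEL | \
				 GSO_UDP_TUNNEL_CSUM)

/* Minimal stand-in for the two skb fields the patch touches. */
struct mock_skb {
	uint32_t gso_type;
	bool encapsulation;
};

void clear_decap_gso(struct mock_skb *skb, uint64_t flags)
{
	/* Clear only the GSO bits of the tunnel layer being removed. */
	if (flags & DECAP_L4_UDP)
		skb->gso_type &= ~(GSO_UDP_TUNNEL | GSO_UDP_TUNNEL_CSUM);
	if (flags & DECAP_L4_GRE)
		skb->gso_type &= ~(GSO_GRE | GSO_GRE_CSUM);
	if (flags & DECAP_IPXIP4)
		skb->gso_type &= ~GSO_IPXIP4;
	if (flags & DECAP_IPXIP6)
		skb->gso_type &= ~GSO_IPXIP6;

	/* Drop the encapsulation mark once no tunnel GSO bit remains. */
	if ((flags & DECAP_MASK) && !(skb->gso_type & GSO_TUNNEL_MASK))
		skb->encapsulation = false;
}
```

In this model, removing the UDP layer of a packet that also carries an IP-in-IP GSO bit leaves encapsulation set, because one tunnel layer is still present.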
Co-developed-by: Max Tottenham <mtottenh@akamai.com>
Signed-off-by: Max Tottenham <mtottenh@akamai.com>
Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
Signed-off-by: Nick Hudson <nhudson@akamai.com>
---
net/core/filter.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/net/core/filter.c b/net/core/filter.c
index 47aec44a9cd3..35af1199ab97 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3665,6 +3665,37 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
skb_increase_gso_size(shinfo, len_diff);
+ /* Selective GSO flag clearing based on decap type.
+ * Only clear the flags for the tunnel layer being removed.
+ */
+ if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP) &&
+ (shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
+ SKB_GSO_UDP_TUNNEL_CSUM)))
+ shinfo->gso_type &= ~(SKB_GSO_UDP_TUNNEL |
+ SKB_GSO_UDP_TUNNEL_CSUM);
+ if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE) &&
+ (shinfo->gso_type & (SKB_GSO_GRE | SKB_GSO_GRE_CSUM)))
+ shinfo->gso_type &= ~(SKB_GSO_GRE |
+ SKB_GSO_GRE_CSUM);
+ if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP4) &&
+ (shinfo->gso_type & SKB_GSO_IPXIP4))
+ shinfo->gso_type &= ~SKB_GSO_IPXIP4;
+ if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP6) &&
+ (shinfo->gso_type & SKB_GSO_IPXIP6))
+ shinfo->gso_type &= ~SKB_GSO_IPXIP6;
+
+ /* Clear encapsulation flag only when no tunnel GSO flags remain */
+ if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
+ if (!(shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
+ SKB_GSO_UDP_TUNNEL_CSUM |
+ SKB_GSO_GRE |
+ SKB_GSO_GRE_CSUM |
+ SKB_GSO_IPXIP4 |
+ SKB_GSO_IPXIP6)))
+ if (skb->encapsulation)
+ skb->encapsulation = 0;
+ }
+
/* Header must be checked, and gso_segs recomputed. */
shinfo->gso_type |= SKB_GSO_DODGY;
shinfo->gso_segs = 0;
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
@ 2026-03-18 20:02 ` Willem de Bruijn
2026-03-19 8:17 ` Hudson, Nick
2026-03-21 0:40 ` Willem de Bruijn
2026-03-24 18:30 ` Martin KaFai Lau
2 siblings, 1 reply; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-18 20:02 UTC (permalink / raw)
To: Nick Hudson, bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-kernel
Nick Hudson wrote:
> Add checks to require shrink-only decap, reject conflicting decap flag
> combinations, and verify removed length is sufficient for claimed header
> decapsulation.
>
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
> net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 37 insertions(+), 10 deletions(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 7c2871b40fe4..47aec44a9cd3 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -56,6 +56,7 @@
> #include <net/sock_reuseport.h>
> #include <net/busy_poll.h>
> #include <net/tcp.h>
> +#include <net/gre.h>
> #include <net/xfrm.h>
> #include <net/udp.h>
> #include <linux/bpf_trace.h>
> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
> BPF_F_ADJ_ROOM_ENCAP_L2( \
> BPF_ADJ_ROOM_ENCAP_L2_MASK))
>
> -#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
> + BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>
> #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
> BPF_F_ADJ_ROOM_ENCAP_MASK | \
> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
> return -ENOTSUPP;
> }
>
> - if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> + u32 len_decap_min = 0;
> +
> if (!shrink)
> return -EINVAL;
>
> - switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
> + /* Reject mutually exclusive decap flag pairs. */
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> + BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> + return -EINVAL;
> +
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
> + BPF_F_ADJ_ROOM_DECAP_L4_MASK)
> + return -EINVAL;
> +
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> + return -EINVAL;
> +
> + /* Reject mutually exclusive decap tunnel type flags. */
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
> + (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
> + return -EINVAL;
> +
> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
> + len_decap_min += sizeof(struct udphdr);
> +
> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> + len_decap_min += sizeof(struct gre_base_hdr);
> +
> + if (len_diff_abs < len_decap_min)
> + return -EINVAL;
Should this test come after the below IP flags?
> +
> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
> len_min = sizeof(struct iphdr);
> - break;
> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
> +
> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
> len_min = sizeof(struct ipv6hdr);
> - break;
> - default:
> - return -EINVAL;
> - }
> }
>
> len_cur = skb->len - skb_network_offset(skb);
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room
2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
@ 2026-03-18 20:09 ` Willem de Bruijn
0 siblings, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-18 20:09 UTC (permalink / raw)
To: Nick Hudson, bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-kernel
Nick Hudson wrote:
> On shrink in bpf_skb_adjust_room(), clear tunnel-specific GSO flags
> according to the decapsulation flags:
>
> - BPF_F_ADJ_ROOM_DECAP_L4_UDP clears SKB_GSO_UDP_TUNNEL{,_CSUM}
> - BPF_F_ADJ_ROOM_DECAP_L4_GRE clears SKB_GSO_GRE{,_CSUM}
> - BPF_F_ADJ_ROOM_DECAP_IPXIP4 clears SKB_GSO_IPXIP4
> - BPF_F_ADJ_ROOM_DECAP_IPXIP6 clears SKB_GSO_IPXIP6
>
> When all tunnel-related GSO bits are cleared, also clear
> skb->encapsulation.
>
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
> net/core/filter.c | 31 +++++++++++++++++++++++++++++++
> 1 file changed, 31 insertions(+)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 47aec44a9cd3..35af1199ab97 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3665,6 +3665,37 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
> if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
> skb_increase_gso_size(shinfo, len_diff);
>
> + /* Selective GSO flag clearing based on decap type.
> + * Only clear the flags for the tunnel layer being removed.
> + */
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP) &&
> + (shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
> + SKB_GSO_UDP_TUNNEL_CSUM)))
> + shinfo->gso_type &= ~(SKB_GSO_UDP_TUNNEL |
> + SKB_GSO_UDP_TUNNEL_CSUM);
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE) &&
> + (shinfo->gso_type & (SKB_GSO_GRE | SKB_GSO_GRE_CSUM)))
> + shinfo->gso_type &= ~(SKB_GSO_GRE |
> + SKB_GSO_GRE_CSUM);
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP4) &&
> + (shinfo->gso_type & SKB_GSO_IPXIP4))
> + shinfo->gso_type &= ~SKB_GSO_IPXIP4;
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP6) &&
> + (shinfo->gso_type & SKB_GSO_IPXIP6))
> + shinfo->gso_type &= ~SKB_GSO_IPXIP6;
> +
> + /* Clear encapsulation flag only when no tunnel GSO flags remain */
> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> + if (!(shinfo->gso_type & (SKB_GSO_UDP_TUNNEL |
> + SKB_GSO_UDP_TUNNEL_CSUM |
> + SKB_GSO_GRE |
> + SKB_GSO_GRE_CSUM |
> + SKB_GSO_IPXIP4 |
> + SKB_GSO_IPXIP6)))
> + if (skb->encapsulation)
> + skb->encapsulation = 0;
Is there any chance that this might clear it while some other tunnel
is still active? From a quick grep on skb->encapsulation the only
possible hit I see is SKB_GSO_ESP.
> + }
> +
> /* Header must be checked, and gso_segs recomputed. */
> shinfo->gso_type |= SKB_GSO_DODGY;
> shinfo->gso_segs = 0;
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
2026-03-18 20:02 ` Willem de Bruijn
@ 2026-03-19 8:17 ` Hudson, Nick
2026-03-19 13:24 ` Willem de Bruijn
0 siblings, 1 reply; 20+ messages in thread
From: Hudson, Nick @ 2026-03-19 8:17 UTC (permalink / raw)
To: Willem de Bruijn
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, Tottenham, Max,
Glasgall, Anna, Martin KaFai Lau, Daniel Borkmann,
Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
linux-kernel@vger.kernel.org
> On 18 Mar 2026, at 20:02, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
>
> Nick Hudson wrote:
>> Add checks to require shrink-only decap, reject conflicting decap flag
>> combinations, and verify removed length is sufficient for claimed header
>> decapsulation.
>>
>> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
>> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
>> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Nick Hudson <nhudson@akamai.com>
>> ---
>> net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>> 1 file changed, 37 insertions(+), 10 deletions(-)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 7c2871b40fe4..47aec44a9cd3 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -56,6 +56,7 @@
>> #include <net/sock_reuseport.h>
>> #include <net/busy_poll.h>
>> #include <net/tcp.h>
>> +#include <net/gre.h>
>> #include <net/xfrm.h>
>> #include <net/udp.h>
>> #include <linux/bpf_trace.h>
>> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>> BPF_F_ADJ_ROOM_ENCAP_L2( \
>> BPF_ADJ_ROOM_ENCAP_L2_MASK))
>>
>> -#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
>> + BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
>> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>>
>> #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> BPF_F_ADJ_ROOM_ENCAP_MASK | \
>> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>> return -ENOTSUPP;
>> }
>>
>> - if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
>> + u32 len_decap_min = 0;
>> +
>> if (!shrink)
>> return -EINVAL;
>>
>> - switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
>> + /* Reject mutually exclusive decap flag pairs. */
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
>> + BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
>> + BPF_F_ADJ_ROOM_DECAP_L4_MASK)
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
>> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>> + return -EINVAL;
>> +
>> + /* Reject mutually exclusive decap tunnel type flags. */
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
>> + (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
>> + return -EINVAL;
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
>> + len_decap_min += sizeof(struct udphdr);
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
>> + len_decap_min += sizeof(struct gre_base_hdr);
>> +
>> + if (len_diff_abs < len_decap_min)
>> + return -EINVAL;
>
> Should this test come after the below IP flags?
Should it?
Seems to me it can bail early without having to check the IP flags. len_decap_min vs len_min.
What am I missing?
>
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>> len_min = sizeof(struct iphdr);
>> - break;
>> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>> len_min = sizeof(struct ipv6hdr);
>> - break;
>> - default:
>> - return -EINVAL;
>> - }
>> }
>>
>> len_cur = skb->len - skb_network_offset(skb);
>> --
>> 2.34.1
>>
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
2026-03-19 8:17 ` Hudson, Nick
@ 2026-03-19 13:24 ` Willem de Bruijn
0 siblings, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-19 13:24 UTC (permalink / raw)
To: Hudson, Nick, Willem de Bruijn
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, Tottenham, Max,
Glasgall, Anna, Martin KaFai Lau, Daniel Borkmann,
Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
linux-kernel@vger.kernel.org
Hudson, Nick wrote:
>
>
> > On 18 Mar 2026, at 20:02, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Nick Hudson wrote:
> >> Add checks to require shrink-only decap, reject conflicting decap flag
> >> combinations, and verify removed length is sufficient for claimed header
> >> decapsulation.
> >>
> >> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> >> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> >> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> >> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> >> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> >> ---
> >> net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
> >> 1 file changed, 37 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/net/core/filter.c b/net/core/filter.c
> >> index 7c2871b40fe4..47aec44a9cd3 100644
> >> --- a/net/core/filter.c
> >> +++ b/net/core/filter.c
> >> @@ -56,6 +56,7 @@
> >> #include <net/sock_reuseport.h>
> >> #include <net/busy_poll.h>
> >> #include <net/tcp.h>
> >> +#include <net/gre.h>
> >> #include <net/xfrm.h>
> >> #include <net/udp.h>
> >> #include <linux/bpf_trace.h>
> >> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
> >> BPF_F_ADJ_ROOM_ENCAP_L2( \
> >> BPF_ADJ_ROOM_ENCAP_L2_MASK))
> >>
> >> -#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> >> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
> >> + BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
> >> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> >>
> >> #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
> >> BPF_F_ADJ_ROOM_ENCAP_MASK | \
> >> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
> >> return -ENOTSUPP;
> >> }
> >>
> >> - if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> >> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
> >> + u32 len_decap_min = 0;
> >> +
> >> if (!shrink)
> >> return -EINVAL;
> >>
> >> - switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> >> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
> >> + /* Reject mutually exclusive decap flag pairs. */
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> >> + BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> >> + return -EINVAL;
> >> +
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
> >> + BPF_F_ADJ_ROOM_DECAP_L4_MASK)
> >> + return -EINVAL;
> >> +
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
> >> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> >> + return -EINVAL;
> >> +
> >> + /* Reject mutually exclusive decap tunnel type flags. */
> >> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
> >> + (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
> >> + return -EINVAL;
> >> +
> >> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
> >> + len_decap_min += sizeof(struct udphdr);
> >> +
> >> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> >> + len_decap_min += sizeof(struct gre_base_hdr);
> >> +
> >> + if (len_diff_abs < len_decap_min)
> >> + return -EINVAL;
> >
> > Should this test come after the below IP flags?
>
> Should it?
>
> Seems to me it can bail early without having to check the IP flags. len_decap_min vs len_min.
>
> What am I missing?
I would think it common that UDP decap also includes an L3 decap, in
which case the len_decap_min should include both header lengths.
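
The arithmetic behind this suggestion can be sketched in userspace C. This is not the kernel code: the flag bit positions are placeholders rather than the UAPI values from the series, the helper names are hypothetical, and the header sizes are the fixed sizes of struct iphdr (20), struct ipv6hdr (40), struct udphdr (8) and struct gre_base_hdr (4). Whether the L3 flags should count toward the removed length in exactly this way is the open question in this exchange; the sketch only shows what summing both layers would look like.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions, NOT the UAPI values from the series. */
#define DECAP_L3_IPV4 (1ULL << 0)
#define DECAP_L3_IPV6 (1ULL << 1)
#define DECAP_L4_UDP  (1ULL << 2)
#define DECAP_L4_GRE  (1ULL << 3)

/* Fixed header sizes in bytes (no IP options or extension headers). */
enum { IPV4_HLEN = 20, IPV6_HLEN = 40, UDP_HLEN = 8, GRE_BASE_HLEN = 4 };

/* Sum the fixed size of every header the flags claim to remove. */
static uint32_t decap_len_min(uint64_t flags)
{
	uint32_t len = 0;

	if (flags & DECAP_L3_IPV4)
		len += IPV4_HLEN;
	if (flags & DECAP_L3_IPV6)
		len += IPV6_HLEN;
	if (flags & DECAP_L4_UDP)
		len += UDP_HLEN;
	if (flags & DECAP_L4_GRE)
		len += GRE_BASE_HLEN;

	return len;
}

/* Return 0 if the removed length covers the claimed headers, -1 if not. */
static int check_decap_len(uint64_t flags, uint32_t len_diff_abs)
{
	return len_diff_abs < decap_len_min(flags) ? -1 : 0;
}

/* A few spot checks of the combined L3 + L4 minimum. */
static void self_check(void)
{
	assert(decap_len_min(DECAP_L4_UDP) == 8);
	assert(decap_len_min(DECAP_L3_IPV4 | DECAP_L4_UDP) == 28);
	/* Removing only the UDP header is too short for an IPv4 + UDP decap. */
	assert(check_decap_len(DECAP_L3_IPV4 | DECAP_L4_UDP, 8) == -1);
	assert(check_decap_len(DECAP_L3_IPV4 | DECAP_L4_UDP, 28) == 0);
}
```

With the check placed before the L3 flags are considered, as in the posted patch, an IPv4 + UDP decap that removes only 8 bytes would pass the UDP-only test; summing both layers first rejects it.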
* Re: [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags
2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
@ 2026-03-21 0:39 ` Willem de Bruijn
2026-03-24 17:34 ` Martin KaFai Lau
1 sibling, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-21 0:39 UTC (permalink / raw)
To: Nick Hudson, bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
linux-kernel
Nick Hudson wrote:
> The existing anonymous enum for BPF_FUNC_skb_adjust_room flags is
> named to enum bpf_adj_room_flags to enable CO-RE (Compile Once -
> Run Everywhere) lookups in BPF programs.
>
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
* Re: [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation
2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
@ 2026-03-21 0:39 ` Willem de Bruijn
0 siblings, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-21 0:39 UTC (permalink / raw)
To: Nick Hudson, bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
linux-kernel
Nick Hudson wrote:
> Add new bpf_skb_adjust_room() decapsulation flags:
>
> - BPF_F_ADJ_ROOM_DECAP_L4_GRE
> - BPF_F_ADJ_ROOM_DECAP_L4_UDP
> - BPF_F_ADJ_ROOM_DECAP_IPXIP4
> - BPF_F_ADJ_ROOM_DECAP_IPXIP6
>
> These flags let BPF programs describe which tunnel layer is being
> removed, so later changes can update tunnel-related GSO state
> accordingly during decapsulation.
>
> This patch only introduces the UAPI flag definitions and helper
> documentation.
>
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
@ 2026-03-21 0:39 ` Willem de Bruijn
2026-03-24 18:12 ` Martin KaFai Lau
1 sibling, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-21 0:39 UTC (permalink / raw)
To: Nick Hudson, bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-kernel
Nick Hudson wrote:
> Introduce helper masks for bpf_skb_adjust_room() flags to simplify
> validation logic:
>
> - BPF_F_ADJ_ROOM_DECAP_L4_MASK
> - BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
> - BPF_F_ADJ_ROOM_ENCAP_MASK
> - BPF_F_ADJ_ROOM_DECAP_MASK
>
> Add flag validation to bpf_skb_net_grow() to reject invalid encap
> flags early. Refactor existing validation checks in bpf_skb_net_shrink()
> and bpf_skb_adjust_room() to use the new masks (no behavior change).
>
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
2026-03-18 20:02 ` Willem de Bruijn
@ 2026-03-21 0:40 ` Willem de Bruijn
2026-03-24 18:30 ` Martin KaFai Lau
2 siblings, 0 replies; 20+ messages in thread
From: Willem de Bruijn @ 2026-03-21 0:40 UTC (permalink / raw)
To: Nick Hudson, bpf, netdev
Cc: Willem de Bruijn, Nick Hudson, Max Tottenham, Anna Glasgall,
Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
Andrii Nakryiko, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-kernel
Nick Hudson wrote:
> Add checks to require shrink-only decap, reject conflicting decap flag
> combinations, and verify removed length is sufficient for claimed header
> decapsulation.
>
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
* Re: [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags
2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
2026-03-21 0:39 ` Willem de Bruijn
@ 2026-03-24 17:34 ` Martin KaFai Lau
1 sibling, 0 replies; 20+ messages in thread
From: Martin KaFai Lau @ 2026-03-24 17:34 UTC (permalink / raw)
To: Nick Hudson
Cc: Willem de Bruijn, Max Tottenham, Anna Glasgall,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf, netdev,
linux-kernel
On 3/18/26 6:42 AM, Nick Hudson wrote:
> The existing anonymous enum for BPF_FUNC_skb_adjust_room flags is
> named to enum bpf_adj_room_flags to enable CO-RE (Compile Once -
> Run Everywhere) lookups in BPF programs.
It would be useful to demonstrate the intended CO-RE usage in a
selftest. I suspect it is bpf_core_enum_value_exists().
There are existing tests in test_tc_tunnel.c for the earlier
BPF_F_ADJ_ROOM_* flag additions. Please add similar tests for the new
flags introduced in this series.
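
A minimal sketch of what such CO-RE usage might look like in a tc program follows. It is an illustration, not the selftest requested here. It assumes a vmlinux.h generated from a kernel with this series applied (so that enum bpf_adj_room_flags and BPF_F_ADJ_ROOM_DECAP_L4_UDP are visible at compile time), a hypothetical outer IPv4 + UDP tunnel layout, and a hypothetical program name. On a kernel whose BTF lacks the named enum or the enumerator, the existence check should evaluate to false at load time and the program falls back to passing no decap flag.

```c
/* Sketch only: assumes vmlinux.h from a kernel with this series applied. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

/* vmlinux.h carries types, not macros, so define the tc verdicts here. */
#define TC_ACT_OK   0
#define TC_ACT_SHOT 2

SEC("tc")
int decap_outer_udp(struct __sk_buff *skb)
{
	/* Hypothetical layout: outer IPv4 + UDP in front of the inner packet. */
	int len_diff = -(int)(sizeof(struct iphdr) + sizeof(struct udphdr));
	__u64 flags = 0;

	/* Relocated by libbpf at load time against the running kernel's BTF. */
	if (bpf_core_enum_value_exists(enum bpf_adj_room_flags,
				       BPF_F_ADJ_ROOM_DECAP_L4_UDP))
		flags |= BPF_F_ADJ_ROOM_DECAP_L4_UDP;

	/* Negative len_diff shrinks the packet at the network layer boundary. */
	if (bpf_skb_adjust_room(skb, len_diff, BPF_ADJ_ROOM_MAC, flags))
		return TC_ACT_SHOT;

	return TC_ACT_OK;
}

char LICENSE[] SEC("license") = "GPL";
```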
* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
2026-03-21 0:39 ` Willem de Bruijn
@ 2026-03-24 18:12 ` Martin KaFai Lau
2026-03-26 17:02 ` Hudson, Nick
1 sibling, 1 reply; 20+ messages in thread
From: Martin KaFai Lau @ 2026-03-24 18:12 UTC (permalink / raw)
To: Nick Hudson
Cc: Willem de Bruijn, Max Tottenham, Anna Glasgall,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf,
netdev, linux-kernel
On 3/18/26 6:42 AM, Nick Hudson wrote:
> Introduce helper masks for bpf_skb_adjust_room() flags to simplify
> validation logic:
>
> - BPF_F_ADJ_ROOM_DECAP_L4_MASK
> - BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
> - BPF_F_ADJ_ROOM_ENCAP_MASK
> - BPF_F_ADJ_ROOM_DECAP_MASK
>
> Add flag validation to bpf_skb_net_grow() to reject invalid encap
> flags early. Refactor existing validation checks in bpf_skb_net_shrink()
> and bpf_skb_adjust_room() to use the new masks (no behavior change).
>
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
> net/core/filter.c | 31 +++++++++++++++++++++++--------
> 1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 0d5d5a17acb2..7c2871b40fe4 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3483,14 +3483,25 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
> #define BPF_F_ADJ_ROOM_DECAP_L3_MASK (BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
> BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>
> -#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
> - BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
> +#define BPF_F_ADJ_ROOM_DECAP_L4_MASK (BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
> + BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> +
> +#define BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK (BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
> + BPF_F_ADJ_ROOM_DECAP_IPXIP6)
> +
> +#define BPF_F_ADJ_ROOM_ENCAP_MASK (BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
> BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
> BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
> BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
> BPF_F_ADJ_ROOM_ENCAP_L2( \
> - BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
> - BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> + BPF_ADJ_ROOM_ENCAP_L2_MASK))
> +
> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +
> +#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
> + BPF_F_ADJ_ROOM_ENCAP_MASK | \
> + BPF_F_ADJ_ROOM_DECAP_MASK | \
> + BPF_F_ADJ_ROOM_NO_CSUM_RESET)
The patch does two things: refactoring of existing macros
(BPF_F_ADJ_ROOM_ENCAP_MASK, BPF_F_ADJ_ROOM_DECAP_MASK) and new additions
(BPF_F_ADJ_ROOM_DECAP_L4_MASK, BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) that
depend on the new flags from the UAPI changes in patch 2.
The refactoring does not depend on the new UAPI flags and could be a
separate patch placed earlier in the series. That way a reviewer can
verify it is a no-op without the new flag additions getting in
the way. The new masks (BPF_F_ADJ_ROOM_DECAP_L4_MASK and
BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) can be introduced together in patch 4,
where they are first used.
>
> static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
> u64 flags)
> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
> unsigned int gso_type = SKB_GSO_DODGY;
> int ret;
>
> + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
> + BPF_F_ADJ_ROOM_NO_CSUM_RESET |
> + BPF_F_ADJ_ROOM_FIXED_GSO)))
In which case will this new check be hit?
> + return -EINVAL;
> +
> if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
> /* udp gso_size delineates datagrams, only allow if fixed */
> if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
> @@ -3611,8 +3627,8 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
> {
> int ret;
>
> - if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
> - BPF_F_ADJ_ROOM_DECAP_L3_MASK |
> + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_DECAP_MASK |
> + BPF_F_ADJ_ROOM_FIXED_GSO |
> BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
> return -EINVAL;
>
> @@ -3708,8 +3724,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
> u32 off;
> int ret;
>
> - if (unlikely(flags & ~(BPF_F_ADJ_ROOM_MASK |
> - BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
> + if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
> return -EINVAL;
> if (unlikely(len_diff_abs > 0xfffU))
> return -EFAULT;
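
The no-op claim for the mask refactoring can be checked mechanically. The sketch below is a userspace model, not kernel code: the F_* names are local stand-ins, and the bit positions (including the 0xff mask and shift of 56 for the L2 length field) are written from recollection of include/uapi/linux/bpf.h, so treat them as assumptions and verify them against your tree. It compares the set of flags accepted by the top-level check in bpf_skb_adjust_room() before and after patch 3.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions; check against include/uapi/linux/bpf.h. */
#define F_FIXED_GSO      (1ULL << 0)
#define F_ENCAP_L3_IPV4  (1ULL << 1)
#define F_ENCAP_L3_IPV6  (1ULL << 2)
#define F_ENCAP_L4_GRE   (1ULL << 3)
#define F_ENCAP_L4_UDP   (1ULL << 4)
#define F_NO_CSUM_RESET  (1ULL << 5)
#define F_ENCAP_L2_ETH   (1ULL << 6)
#define F_DECAP_L3_IPV4  (1ULL << 7)
#define F_DECAP_L3_IPV6  (1ULL << 8)
#define F_ENCAP_L2(len)  (((uint64_t)(len) & 0xffULL) << 56)

#define ENCAP_L3_MASK (F_ENCAP_L3_IPV4 | F_ENCAP_L3_IPV6)
#define DECAP_L3_MASK (F_DECAP_L3_IPV4 | F_DECAP_L3_IPV6)

/* Before patch 3: NO_CSUM_RESET was OR-ed in at the call site. */
#define OLD_ROOM_MASK (F_FIXED_GSO | ENCAP_L3_MASK | F_ENCAP_L4_GRE | \
		       F_ENCAP_L4_UDP | F_ENCAP_L2_ETH | F_ENCAP_L2(0xff) | \
		       DECAP_L3_MASK)

/* After patch 3: split into ENCAP/DECAP masks, NO_CSUM_RESET folded in. */
#define ENCAP_MASK (ENCAP_L3_MASK | F_ENCAP_L4_GRE | F_ENCAP_L4_UDP | \
		    F_ENCAP_L2_ETH | F_ENCAP_L2(0xff))
#define DECAP_MASK (DECAP_L3_MASK)
#define NEW_ROOM_MASK (F_FIXED_GSO | ENCAP_MASK | DECAP_MASK | F_NO_CSUM_RESET)

/* Compile-time proof that the accepted flag set is unchanged. */
_Static_assert((OLD_ROOM_MASK | F_NO_CSUM_RESET) == NEW_ROOM_MASK,
	       "refactor must not change the accepted flags");

static int old_rejects(uint64_t flags)
{
	return (flags & ~(OLD_ROOM_MASK | F_NO_CSUM_RESET)) != 0;
}

static int new_rejects(uint64_t flags)
{
	return (flags & ~NEW_ROOM_MASK) != 0;
}

/* Runtime cross-check: every single-bit flag gets the same verdict. */
static void self_check(void)
{
	for (int bit = 0; bit < 64; bit++) {
		uint64_t flag = 1ULL << bit;

		assert(old_rejects(flag) == new_rejects(flag));
	}
}
```

The equality holds only while BPF_F_ADJ_ROOM_DECAP_MASK contains just the L3 flags, which is the point of keeping the refactoring separate from the new flag additions.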
* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
2026-03-18 20:02 ` Willem de Bruijn
2026-03-21 0:40 ` Willem de Bruijn
@ 2026-03-24 18:30 ` Martin KaFai Lau
2026-03-26 17:02 ` Hudson, Nick
2 siblings, 1 reply; 20+ messages in thread
From: Martin KaFai Lau @ 2026-03-24 18:30 UTC (permalink / raw)
To: Nick Hudson
Cc: Willem de Bruijn, Max Tottenham, Anna Glasgall, Daniel Borkmann,
Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf, netdev,
linux-kernel
On 3/18/26 6:42 AM, Nick Hudson wrote:
> Add checks to require shrink-only decap, reject conflicting decap flag
> combinations, and verify removed length is sufficient for claimed header
> decapsulation.
>
> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
> Signed-off-by: Nick Hudson <nhudson@akamai.com>
> ---
> net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 37 insertions(+), 10 deletions(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 7c2871b40fe4..47aec44a9cd3 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -56,6 +56,7 @@
> #include <net/sock_reuseport.h>
> #include <net/busy_poll.h>
> #include <net/tcp.h>
> +#include <net/gre.h>
> #include <net/xfrm.h>
> #include <net/udp.h>
> #include <linux/bpf_trace.h>
> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
> BPF_F_ADJ_ROOM_ENCAP_L2( \
> BPF_ADJ_ROOM_ENCAP_L2_MASK))
>
> -#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
> + BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>
> #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
> BPF_F_ADJ_ROOM_ENCAP_MASK | \
> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
> return -ENOTSUPP;
> }
>
> - if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
This change should be done together with the macro refactoring patch
mentioned in patch 3.
> + u32 len_decap_min = 0;
> +
> if (!shrink)
> return -EINVAL;
>
> - switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
> + /* Reject mutually exclusive decap flag pairs. */
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> + BPF_F_ADJ_ROOM_DECAP_L3_MASK)
IIUC, this 'if' and the len_min assignments below replace the
existing switch statement. Please separate this no-op change from the
new flag validation logic. It is small enough to go into the macro
refactoring patch as well.
> + return -EINVAL;
> +
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
> + BPF_F_ADJ_ROOM_DECAP_L4_MASK)
> + return -EINVAL;
> +
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
> + return -EINVAL;
> +
> + /* Reject mutually exclusive decap tunnel type flags. */
> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
> + (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
> + return -EINVAL;
> +
> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
> + len_decap_min += sizeof(struct udphdr);
> +
> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
> + len_decap_min += sizeof(struct gre_base_hdr);
> +
> + if (len_diff_abs < len_decap_min)
> + return -EINVAL;
> +
> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
> len_min = sizeof(struct iphdr);
> - break;
> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
> +
> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
> len_min = sizeof(struct ipv6hdr);
> - break;
> - default:
> - return -EINVAL;
> - }
> }
>
> len_cur = skb->len - skb_network_offset(skb);
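
To illustrate why replacing the switch with an 'if' chain can be reviewed as a no-op on its own, here is a userspace model of the two forms. It is a sketch under assumptions: the flag bit positions are placeholders, the function names are hypothetical, and the header sizes are the fixed sizes of struct iphdr (20) and struct ipv6hdr (40). The two forms agree whenever at least one L3 flag is set, which is the only case the old code reached.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder bit positions for the two L3 decap flags. */
#define DECAP_L3_IPV4 (1ULL << 7)
#define DECAP_L3_IPV6 (1ULL << 8)
#define DECAP_L3_MASK (DECAP_L3_IPV4 | DECAP_L3_IPV6)

/* Old form: exactly one L3 flag selects len_min, anything else fails. */
static int len_min_switch(uint64_t flags, uint32_t *len_min)
{
	switch (flags & DECAP_L3_MASK) {
	case DECAP_L3_IPV4:
		*len_min = 20;
		return 0;
	case DECAP_L3_IPV6:
		*len_min = 40;
		return 0;
	default:
		return -EINVAL;
	}
}

/* New form: reject both flags together, then test each one. */
static int len_min_ifs(uint64_t flags, uint32_t *len_min)
{
	if ((flags & DECAP_L3_MASK) == DECAP_L3_MASK)
		return -EINVAL;
	if (flags & DECAP_L3_IPV4)
		*len_min = 20;
	if (flags & DECAP_L3_IPV6)
		*len_min = 40;
	return 0;
}

/* Compare both forms on every input with at least one L3 flag set. */
static void self_check(void)
{
	const uint64_t combos[] = {
		DECAP_L3_IPV4, DECAP_L3_IPV6, DECAP_L3_IPV4 | DECAP_L3_IPV6,
	};

	for (unsigned int i = 0; i < sizeof(combos) / sizeof(combos[0]); i++) {
		uint32_t a = 0, b = 0;

		assert(len_min_switch(combos[i], &a) ==
		       len_min_ifs(combos[i], &b));
		assert(a == b);
	}
}
```

With no L3 flag set, the 'if' chain returns 0 and leaves len_min untouched, while the switch would return -EINVAL. That difference only becomes reachable once the enclosing condition is widened from the L3 mask to the full decap mask.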
* Re: [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails
2026-03-24 18:30 ` Martin KaFai Lau
@ 2026-03-26 17:02 ` Hudson, Nick
0 siblings, 0 replies; 20+ messages in thread
From: Hudson, Nick @ 2026-03-26 17:02 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna, Daniel Borkmann,
Alexei Starovoitov, Andrii Nakryiko, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
> On Mar 24, 2026, at 6:30 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 3/18/26 6:42 AM, Nick Hudson wrote:
>> Add checks to require shrink-only decap, reject conflicting decap flag
>> combinations, and verify removed length is sufficient for claimed header
>> decapsulation.
>> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
>> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
>> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Nick Hudson <nhudson@akamai.com>
>> ---
>> net/core/filter.c | 47 +++++++++++++++++++++++++++++++++++++----------
>> 1 file changed, 37 insertions(+), 10 deletions(-)
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 7c2871b40fe4..47aec44a9cd3 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -56,6 +56,7 @@
>> #include <net/sock_reuseport.h>
>> #include <net/busy_poll.h>
>> #include <net/tcp.h>
>> +#include <net/gre.h>
>> #include <net/xfrm.h>
>> #include <net/udp.h>
>> #include <linux/bpf_trace.h>
>> @@ -3496,7 +3497,9 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>> BPF_F_ADJ_ROOM_ENCAP_L2( \
>> BPF_ADJ_ROOM_ENCAP_L2_MASK))
>> -#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK | \
>> + BPF_F_ADJ_ROOM_DECAP_L4_MASK | \
>> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>> #define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> BPF_F_ADJ_ROOM_ENCAP_MASK | \
>> @@ -3743,20 +3746,44 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>> return -ENOTSUPP;
>> }
>> - if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_MASK) {
>
> This change should be done together with the macro refactoring patch mentioned in patch 3.
OK, will send a new version with it done this way.
>
>> + u32 len_decap_min = 0;
>> +
>> if (!shrink)
>> return -EINVAL;
>> - switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
>> + /* Reject mutually exclusive decap flag pairs. */
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
>> + BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>
> iiuc, this 'if' and the len_min assignment changes below replace the existing switch case. Please separate this no-op change from the new flag validation logic. It is small enough to be done together in the macro refactoring patch also.
>
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) ==
>> + BPF_F_ADJ_ROOM_DECAP_L4_MASK)
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) ==
>> + BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK)
>> + return -EINVAL;
>> +
>> + /* Reject mutually exclusive decap tunnel type flags. */
>> + if ((flags & BPF_F_ADJ_ROOM_DECAP_L4_MASK) &&
>> + (flags & BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK))
>> + return -EINVAL;
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_UDP)
>> + len_decap_min += sizeof(struct udphdr);
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L4_GRE)
>> + len_decap_min += sizeof(struct gre_base_hdr);
>> +
>> + if (len_diff_abs < len_decap_min)
>> + return -EINVAL;
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>> len_min = sizeof(struct iphdr);
>> - break;
>> - case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
>> +
>> + if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>> len_min = sizeof(struct ipv6hdr);
>> - break;
>> - default:
>> - return -EINVAL;
>> - }
>> }
>> len_cur = skb->len - skb_network_offset(skb);
>
* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
2026-03-24 18:12 ` Martin KaFai Lau
@ 2026-03-26 17:02 ` Hudson, Nick
2026-03-26 17:49 ` Martin KaFai Lau
0 siblings, 1 reply; 20+ messages in thread
From: Hudson, Nick @ 2026-03-26 17:02 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
bpf@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
> On Mar 24, 2026, at 6:12 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 3/18/26 6:42 AM, Nick Hudson wrote:
>> Introduce helper masks for bpf_skb_adjust_room() flags to simplify
>> validation logic:
>> - BPF_F_ADJ_ROOM_DECAP_L4_MASK
>> - BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK
>> - BPF_F_ADJ_ROOM_ENCAP_MASK
>> - BPF_F_ADJ_ROOM_DECAP_MASK
>> Add flag validation to bpf_skb_net_grow() to reject invalid encap
>> flags early. Refactor existing validation checks in bpf_skb_net_shrink()
>> and bpf_skb_adjust_room() to use the new masks (no behavior change).
>> Co-developed-by: Max Tottenham <mtottenh@akamai.com>
>> Signed-off-by: Max Tottenham <mtottenh@akamai.com>
>> Co-developed-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Anna Glasgall <aglasgal@akamai.com>
>> Signed-off-by: Nick Hudson <nhudson@akamai.com>
>> ---
>> net/core/filter.c | 31 +++++++++++++++++++++++--------
>> 1 file changed, 23 insertions(+), 8 deletions(-)
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 0d5d5a17acb2..7c2871b40fe4 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3483,14 +3483,25 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
>> #define BPF_F_ADJ_ROOM_DECAP_L3_MASK (BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
>> BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>> -#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> - BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
>> +#define BPF_F_ADJ_ROOM_DECAP_L4_MASK (BPF_F_ADJ_ROOM_DECAP_L4_UDP | \
>> + BPF_F_ADJ_ROOM_DECAP_L4_GRE)
>> +
>> +#define BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK (BPF_F_ADJ_ROOM_DECAP_IPXIP4 | \
>> + BPF_F_ADJ_ROOM_DECAP_IPXIP6)
>> +
>> +#define BPF_F_ADJ_ROOM_ENCAP_MASK (BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
>> BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
>> BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
>> BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
>> BPF_F_ADJ_ROOM_ENCAP_L2( \
>> - BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
>> - BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> + BPF_ADJ_ROOM_ENCAP_L2_MASK))
>> +
>> +#define BPF_F_ADJ_ROOM_DECAP_MASK (BPF_F_ADJ_ROOM_DECAP_L3_MASK)
>> +
>> +#define BPF_F_ADJ_ROOM_MASK (BPF_F_ADJ_ROOM_FIXED_GSO | \
>> + BPF_F_ADJ_ROOM_ENCAP_MASK | \
>> + BPF_F_ADJ_ROOM_DECAP_MASK | \
>> + BPF_F_ADJ_ROOM_NO_CSUM_RESET)
>
> The patch does two things: refactoring of existing macros (BPF_F_ADJ_ROOM_ENCAP_MASK, BPF_F_ADJ_ROOM_DECAP_MASK) and new additions (BPF_F_ADJ_ROOM_DECAP_L4_MASK, BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) that depend on the new flags from the UAPI changes in patch 2.
>
> The refactoring does not depend on the new UAPI flags and could be a separate patch placed earlier in the series. That way a reviewer can verify it is a no-op without the new flag additions getting in
> the way. The new masks (BPF_F_ADJ_ROOM_DECAP_L4_MASK and BPF_F_ADJ_ROOM_DECAP_IPXIP_MASK) can be introduced together in patch 4, where they are first used.
OK, will split further.
>
>> static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>> u64 flags)
>> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>> unsigned int gso_type = SKB_GSO_DODGY;
>> int ret;
>> + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
>> + BPF_F_ADJ_ROOM_NO_CSUM_RESET |
>> + BPF_F_ADJ_ROOM_FIXED_GSO)))
>
> In which case will this new check be hit?
If a user supplies a positive len_diff and attempts to pass a DECAP flag.
The commit message says:

  Add flag validation to bpf_skb_net_grow() to reject invalid encap
  flags early.
>
>> + return -EINVAL;
>> +
>> if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
>> /* udp gso_size delineates datagrams, only allow if fixed */
>> if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) ||
>> @@ -3611,8 +3627,8 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
>> {
>> int ret;
>> - if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
>> - BPF_F_ADJ_ROOM_DECAP_L3_MASK |
>> + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_DECAP_MASK |
>> + BPF_F_ADJ_ROOM_FIXED_GSO |
>> BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
>> return -EINVAL;
>> @@ -3708,8 +3724,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>> u32 off;
>> int ret;
>> - if (unlikely(flags & ~(BPF_F_ADJ_ROOM_MASK |
>> - BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
>> + if (unlikely(flags & ~BPF_F_ADJ_ROOM_MASK))
>> return -EINVAL;
>> if (unlikely(len_diff_abs > 0xfffU))
>> return -EFAULT;
* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
2026-03-26 17:02 ` Hudson, Nick
@ 2026-03-26 17:49 ` Martin KaFai Lau
2026-03-27 10:55 ` Hudson, Nick
0 siblings, 1 reply; 20+ messages in thread
From: Martin KaFai Lau @ 2026-03-26 17:49 UTC (permalink / raw)
To: Hudson, Nick
Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
bpf@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
On 3/26/26 10:02 AM, Hudson, Nick wrote:
>>> static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>> u64 flags)
>>> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>> unsigned int gso_type = SKB_GSO_DODGY;
>>> int ret;
>>> + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
>>> + BPF_F_ADJ_ROOM_NO_CSUM_RESET |
>>> + BPF_F_ADJ_ROOM_FIXED_GSO)))
>> In which case will this new check be hit?
> If a user supplies +ve len_diff and attempts to pass a DECAP flag.
>
> The commit message had
>
> Add flag validation to bpf_skb_net_grow() to reject invalid encap
> flags early.
There is a DECAP_MASK check in bpf_skb_adjust_room(), and !shrink is then
rejected. What am I missing?
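
The reachability argument can be modelled in userspace. The sketch below is not the kernel code: it keeps only the flag-related checks in the order described in this thread, uses local F_* names with bit positions assumed from include/uapi/linux/bpf.h, and counts how often the new check in the grow path fires when reached through the top-level helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed bit positions; check against include/uapi/linux/bpf.h. */
#define F_FIXED_GSO      (1ULL << 0)
#define F_ENCAP_L3_IPV4  (1ULL << 1)
#define F_ENCAP_L3_IPV6  (1ULL << 2)
#define F_ENCAP_L4_GRE   (1ULL << 3)
#define F_ENCAP_L4_UDP   (1ULL << 4)
#define F_NO_CSUM_RESET  (1ULL << 5)
#define F_ENCAP_L2_ETH   (1ULL << 6)
#define F_DECAP_L3_IPV4  (1ULL << 7)
#define F_DECAP_L3_IPV6  (1ULL << 8)
#define F_ENCAP_L2(len)  (((uint64_t)(len) & 0xffULL) << 56)

#define ENCAP_MASK (F_ENCAP_L3_IPV4 | F_ENCAP_L3_IPV6 | F_ENCAP_L4_GRE | \
		    F_ENCAP_L4_UDP | F_ENCAP_L2_ETH | F_ENCAP_L2(0xff))
#define DECAP_MASK (F_DECAP_L3_IPV4 | F_DECAP_L3_IPV6)
#define ROOM_MASK  (F_FIXED_GSO | ENCAP_MASK | DECAP_MASK | F_NO_CSUM_RESET)

/* Counts how often the new check in the grow path rejects a call. */
static int grow_check_hits;

/* Model of the flag check that patch 3 adds to bpf_skb_net_grow(). */
static int net_grow_model(uint64_t flags)
{
	if (flags & ~(ENCAP_MASK | F_NO_CSUM_RESET | F_FIXED_GSO)) {
		grow_check_hits++;
		return -EINVAL;
	}
	return 0;
}

/* Model of the flag-related checks in bpf_skb_adjust_room(). */
static int adjust_room_model(int32_t len_diff, uint64_t flags)
{
	int shrink = len_diff < 0;

	if (flags & ~ROOM_MASK)
		return -EINVAL;
	/* Decap flags with a growing packet are rejected before the grow path. */
	if ((flags & DECAP_MASK) && !shrink)
		return -EINVAL;

	return shrink ? 0 : net_grow_model(flags);
}

/* Try every combination of the nine low flag bits, growing and shrinking. */
static void self_check(void)
{
	for (uint64_t low = 0; low < 512; low++) {
		adjust_room_model(8, low);
		adjust_room_model(8, low | F_ENCAP_L2(0xff));
		adjust_room_model(-8, low);
	}
	assert(grow_check_hits == 0);
}
```

In this model the added check never fires, because any flag outside the encap, checksum and GSO set is either unknown (rejected by the full mask) or a decap flag (rejected by the !shrink test).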
* Re: [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation
2026-03-26 17:49 ` Martin KaFai Lau
@ 2026-03-27 10:55 ` Hudson, Nick
0 siblings, 0 replies; 20+ messages in thread
From: Hudson, Nick @ 2026-03-27 10:55 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: Willem de Bruijn, Tottenham, Max, Glasgall, Anna,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
bpf@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
> On Mar 26, 2026, at 5:49 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 3/26/26 10:02 AM, Hudson, Nick wrote:
>>>> static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>>> u64 flags)
>>>> @@ -3502,6 +3513,11 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
>>>> unsigned int gso_type = SKB_GSO_DODGY;
>>>> int ret;
>>>> + if (unlikely(flags & ~(BPF_F_ADJ_ROOM_ENCAP_MASK |
>>>> + BPF_F_ADJ_ROOM_NO_CSUM_RESET |
>>>> + BPF_F_ADJ_ROOM_FIXED_GSO)))
>>> In which case will this new check be hit?
>> If a user supplies +ve len_diff and attempts to pass a DECAP flag.
>> The commit message had
>> Add flag validation to bpf_skb_net_grow() to reject invalid encap
>> flags early.
>
> There is DECAP_MASK check in bpf_skb_adjust_room() and then !shrink is rejected. What am I missing?
Duh, right.
Do you prefer to do all the flag checking in bpf_skb_adjust_room(), or to keep the encap/decap split?
end of thread, other threads:[~2026-03-27 10:56 UTC | newest]
Thread overview: 20+ messages
[not found] <20260318134242.2725749-1-nhudson@akamai.com>
2026-03-18 13:42 ` [PATCH v2 1/5] bpf: name the enum for BPF_FUNC_skb_adjust_room flags Nick Hudson
2026-03-21 0:39 ` Willem de Bruijn
2026-03-24 17:34 ` Martin KaFai Lau
2026-03-18 13:42 ` [PATCH v2 2/5] bpf: add BPF_F_ADJ_ROOM_DECAP_* flags for tunnel decapsulation Nick Hudson
2026-03-21 0:39 ` Willem de Bruijn
2026-03-18 13:42 ` [PATCH v2 3/5] bpf: add helper masks for ADJ_ROOM flags and encap validation Nick Hudson
2026-03-21 0:39 ` Willem de Bruijn
2026-03-24 18:12 ` Martin KaFai Lau
2026-03-26 17:02 ` Hudson, Nick
2026-03-26 17:49 ` Martin KaFai Lau
2026-03-27 10:55 ` Hudson, Nick
2026-03-18 13:42 ` [PATCH v2 4/5] bpf: allow new DECAP flags and add guard rails Nick Hudson
2026-03-18 20:02 ` Willem de Bruijn
2026-03-19 8:17 ` Hudson, Nick
2026-03-19 13:24 ` Willem de Bruijn
2026-03-21 0:40 ` Willem de Bruijn
2026-03-24 18:30 ` Martin KaFai Lau
2026-03-26 17:02 ` Hudson, Nick
2026-03-18 13:42 ` [PATCH v2 5/5] bpf: clear decap tunnel GSO state in skb_adjust_room Nick Hudson
2026-03-18 20:09 ` Willem de Bruijn