* [PATCH net-next v2 00/12] BIG TCP for UDP tunnels
@ 2026-02-26 20:15 Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 01/12] net/sched: act_csum: don't mangle UDP tunnel GSO packets Alice Mikityanska
` (12 more replies)
0 siblings, 13 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
This series is a follow-up to "BIG TCP without HBH in IPv6", and it adds
support for BIG TCP IPv4/IPv6 workloads in vxlan and geneve. Now that
IPv6 BIG TCP no longer requires stripping the HBH in all the various
combinations of tunneled traffic, adding BIG TCP there becomes feasible.
Patches 01-03 are small fixups to some related code that I'm changing in
the series.
Patch 04 adds accessors for the length field in the UDP header, as
suggested by Paolo in review. The usage of udp_set_len is then added in
the following patches that start using length=0 in BIG TCP UDP packets.
Patches 05-07 close the gaps that prevent BIG TCP packets from going
through UDP tunnel code.
Patch 08 re-adds proper validation of malformed packets that arrive with
length=0 from the wire.
Patch 09 is for proper formatting in tcpdump (set UDP len to 0 rather
than a trimmed value on overflow).
Patches 10-11 bump up tso_max_size for VXLAN and GENEVE.
Patch 12 adds selftests.
Thanks all!
v2 changes: Addressed the review comments: added UDP len helpers,
consolidated UDP len sanity checks in patch 08 into one, added
selftests. Added fixups to related code (patches 01-03).
v1: https://lore.kernel.org/netdev/20250923134742.1399800-1-maxtram95@gmail.com/
Alice Mikityanska (11):
net/sched: act_csum: don't mangle UDP tunnel GSO packets
udp: gso: Simplify handling length in GSO_PARTIAL
geneve: Fix off-by-one comparing with GRO_LEGACY_MAX_SIZE
net: Use helpers to get/set UDP len tree-wide
net: Enable BIG TCP with partial GSO
udp: Support gro_ipv4_max_size > 65536
udp: Support BIG TCP GSO packets where they can occur
udp: Validate UDP length in udp_gro_receive
udp: Set length in UDP header to 0 for big GSO packets
vxlan: Enable BIG TCP packets
selftests: net: Add a test for BIG TCP in UDP tunnels
Daniel Borkmann (1):
geneve: Enable BIG TCP packets
drivers/infiniband/core/lag.c | 2 +-
drivers/infiniband/sw/rxe/rxe_net.c | 4 +-
drivers/net/amt.c | 6 +-
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +-
drivers/net/ethernet/intel/iavf/iavf_txrx.c | 2 +-
drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +-
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 2 +-
.../marvell/octeontx2/nic/otx2_txrx.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/en_rx.c | 4 +-
.../ethernet/mellanox/mlx5/core/en_selftest.c | 2 +-
drivers/net/ethernet/sfc/falcon/selftest.c | 4 +-
drivers/net/ethernet/sfc/selftest.c | 4 +-
drivers/net/ethernet/sfc/siena/selftest.c | 4 +-
drivers/net/ethernet/sfc/tc_encap_actions.c | 2 +-
.../stmicro/stmmac/stmmac_selftests.c | 4 +-
drivers/net/geneve.c | 6 +-
drivers/net/netdevsim/dev.c | 2 +-
drivers/net/netdevsim/psample.c | 2 +-
drivers/net/netdevsim/psp.c | 8 +-
drivers/net/vxlan/vxlan_core.c | 2 +
drivers/net/wireguard/receive.c | 2 +-
include/linux/udp.h | 16 ++
include/net/udplite.h | 4 +-
include/trace/events/icmp.h | 2 +-
lib/tests/blackhole_dev_kunit.c | 2 +-
net/6lowpan/nhc_udp.c | 10 +-
net/core/netpoll.c | 2 +-
net/core/pktgen.c | 4 +-
net/core/selftests.c | 4 +-
net/core/skbuff.c | 10 +-
net/core/tso.c | 3 +-
net/ipv4/esp4.c | 2 +-
net/ipv4/fou_core.c | 2 +-
net/ipv4/ipconfig.c | 6 +-
net/ipv4/netfilter/nf_nat_snmp_basic_main.c | 4 +-
net/ipv4/route.c | 2 +-
net/ipv4/udp.c | 8 +-
net/ipv4/udp_offload.c | 58 +++----
net/ipv4/udp_tunnel_core.c | 2 +-
net/ipv6/esp6.c | 5 +-
net/ipv6/fou6.c | 2 +-
net/ipv6/ip6_udp_tunnel.c | 2 +-
net/ipv6/udp.c | 3 +-
net/ipv6/udp_offload.c | 2 +-
net/l2tp/l2tp_core.c | 2 +-
net/netfilter/ipvs/ip_vs_xmit.c | 2 +-
net/netfilter/nf_conntrack_proto_udp.c | 19 ++-
net/netfilter/nf_log_syslog.c | 2 +-
net/netfilter/nf_nat_helper.c | 2 +-
net/psp/psp_main.c | 2 +-
net/sched/act_csum.c | 12 +-
net/xfrm/xfrm_nat_keepalive.c | 2 +-
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/big_tcp_tunnels.sh | 145 ++++++++++++++++++
54 files changed, 298 insertions(+), 114 deletions(-)
create mode 100755 tools/testing/selftests/net/big_tcp_tunnels.sh
--
2.52.0
* [PATCH net-next v2 01/12] net/sched: act_csum: don't mangle UDP tunnel GSO packets
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL Alice Mikityanska
` (11 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
Similar to commit add641e7dee3 ("sched: act_csum: don't mangle TCP and
UDP GSO packets"), UDP tunnel GSO packets going through act_csum
shouldn't have their checksum calculated at this point, because it will
be done after segmentation. Setting the checksum in act_csum modifies
skb->ip_summed and prevents inner IP csum offload from kicking in,
resulting in a packet with a bad checksum.
Add UDP tunnel GSO packets to the exceptions, and also add UDP GSO
(SKB_GSO_UDP_L4), as the same logic as in the commit mentioned above
applies to UDP GSO too.
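
For illustration only, the widened check can be sketched in userspace as
below. The SKETCH_GSO_* bit values and the function names are made up for
this example and are not the kernel's definitions; only the shape of the
test mirrors the patch. Note that in C `&` binds tighter than `&&`, so
the mask is applied to gso_type first and the result is then combined
with the "is GSO" test.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real SKB_GSO_* constants are
 * defined in include/linux/skbuff.h and are not reproduced here.
 */
enum {
	SKETCH_GSO_UDP             = 1 << 0,
	SKETCH_GSO_UDP_L4          = 1 << 1,
	SKETCH_GSO_UDP_TUNNEL      = 1 << 2,
	SKETCH_GSO_UDP_TUNNEL_CSUM = 1 << 3,
	SKETCH_GSO_TCPV4           = 1 << 4,
};

#define SKETCH_UDP_SKIP_MASK \
	(SKETCH_GSO_UDP | SKETCH_GSO_UDP_L4 | \
	 SKETCH_GSO_UDP_TUNNEL | SKETCH_GSO_UDP_TUNNEL_CSUM)

/* Returns true when the checksum action should leave the packet alone:
 * the packet is GSO (non-zero gso_size) and one of the UDP-related GSO
 * types is set, so the checksum is filled in after segmentation.
 */
static bool sketch_udp_should_skip_csum(unsigned int gso_size,
					unsigned int gso_type)
{
	bool is_gso = gso_size != 0;

	return is_gso && (gso_type & SKETCH_UDP_SKIP_MASK);
}

/* Usage example: call from main() to exercise the helper. */
static void sketch_csum_demo(void)
{
	/* UDP tunnel GSO packet: skipped after this patch. */
	assert(sketch_udp_should_skip_csum(1400, SKETCH_GSO_UDP_TUNNEL));
	/* Not a GSO packet: checksum is computed as before. */
	assert(!sketch_udp_should_skip_csum(0, SKETCH_GSO_UDP_TUNNEL));
}
```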
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
---
net/sched/act_csum.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index 213e1ce9d2da..a5cc76613f32 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -259,7 +259,9 @@ static int tcf_csum_ipv4_udp(struct sk_buff *skb, unsigned int ihl,
const struct iphdr *iph;
u16 ul;
- if (skb_is_gso(skb) && skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
+ if (skb_is_gso(skb) && skb_shinfo(skb)->gso_type &
+ (SKB_GSO_UDP | SKB_GSO_UDP_L4 |
+ SKB_GSO_UDP_TUNNEL | SKB_GSO_UDP_TUNNEL_CSUM))
return 1;
/*
@@ -315,7 +317,9 @@ static int tcf_csum_ipv6_udp(struct sk_buff *skb, unsigned int ihl,
const struct ipv6hdr *ip6h;
u16 ul;
- if (skb_is_gso(skb) && skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
+ if (skb_is_gso(skb) && skb_shinfo(skb)->gso_type &
+ (SKB_GSO_UDP | SKB_GSO_UDP_L4 |
+ SKB_GSO_UDP_TUNNEL | SKB_GSO_UDP_TUNNEL_CSUM))
return 1;
/*
--
2.52.0
* [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 01/12] net/sched: act_csum: don't mangle UDP tunnel GSO packets Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-03-06 20:55 ` Willem de Bruijn
2026-02-26 20:15 ` [PATCH net-next v2 03/12] geneve: Fix off-by-one comparing with GRO_LEGACY_MAX_SIZE Alice Mikityanska
` (10 subsequent siblings)
12 siblings, 1 reply; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska, Gal Pressman
From: Alice Mikityanska <alice@isovalent.com>
Taking further the idea of commit b10b446ce7ad ("udp: gso: Use single
MSS length in UDP header for GSO_PARTIAL"), simplify the implementation
and fix the checksum (apparently ignored by hardware anyway).
The mentioned commit started using msslen for uh->len, but still uses
newlen to adjust uh->check. If the formula for check is fixed, newlen is
assigned but never used before the loop, and newlen is overwritten after
the loop. This makes msslen not really necessary, as we can reuse
newlen, if we don't adjust mss before. The adjustment of mss can be
simply dropped, because mss is not used anywhere else below.
This brings us back to one variable, drops the unneeded arithmetic on
mss, and fixes the UDP checksum.
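
As a rough illustration of why a single-MSS length is the natural value
for every non-final segment, the userspace sketch below computes the UDP
length that each wire segment ends up carrying when a payload is split
into MSS-sized pieces. The names and the helper are hypothetical and do
not come from udp_offload.c; it assumes mss > 0, payload_len > 0 and a
valid segment index.

```c
#include <assert.h>

#define SKETCH_UDP_HDR_LEN 8u	/* size of a UDP header in bytes */

/* UDP length (header + payload) of wire segment idx when payload_len
 * bytes are split into mss-sized segments. Every segment except the
 * last carries exactly one MSS; the last carries the remainder.
 */
static unsigned int sketch_seg_udp_len(unsigned int payload_len,
				       unsigned int mss,
				       unsigned int idx)
{
	unsigned int nsegs = (payload_len + mss - 1) / mss;

	if (idx + 1 < nsegs)
		return SKETCH_UDP_HDR_LEN + mss;

	return SKETCH_UDP_HDR_LEN + (payload_len - (nsegs - 1) * mss);
}

/* Usage example: 3000 bytes of payload with an MSS of 1400. */
static void sketch_seg_demo(void)
{
	assert(sketch_seg_udp_len(3000, 1400, 0) == 1408);
	assert(sketch_seg_udp_len(3000, 1400, 1) == 1408);
	assert(sketch_seg_udp_len(3000, 1400, 2) == 208);
}
```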
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Cc: Gal Pressman <gal@nvidia.com>
---
net/ipv4/udp_offload.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 6b1654c1ad4a..e831234326c4 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -483,11 +483,11 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
struct sock *sk = gso_skb->sk;
unsigned int sum_truesize = 0;
struct sk_buff *segs, *seg;
- __be16 newlen, msslen;
struct udphdr *uh;
unsigned int mss;
bool copy_dtor;
__sum16 check;
+ __be16 newlen;
int ret = 0;
mss = skb_shinfo(gso_skb)->gso_size;
@@ -556,15 +556,6 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
return segs;
}
- msslen = htons(sizeof(*uh) + mss);
-
- /* GSO partial and frag_list segmentation only requires splitting
- * the frame into an MSS multiple and possibly a remainder, both
- * cases return a GSO skb. So update the mss now.
- */
- if (skb_is_gso(segs))
- mss *= skb_shinfo(segs)->gso_segs;
-
seg = segs;
uh = udp_hdr(seg);
@@ -587,7 +578,7 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
if (!seg->next)
break;
- uh->len = msslen;
+ uh->len = newlen;
uh->check = check;
if (seg->ip_summed == CHECKSUM_PARTIAL)
--
2.52.0
* [PATCH net-next v2 03/12] geneve: Fix off-by-one comparing with GRO_LEGACY_MAX_SIZE
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 01/12] net/sched: act_csum: don't mangle UDP tunnel GSO packets Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-02-26 20:20 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 04/12] net: Use helpers to get/set UDP len tree-wide Alice Mikityanska
` (9 subsequent siblings)
12 siblings, 1 reply; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
GRO_LEGACY_MAX_SIZE = 65536; a total_len of 65536 is too big to fit
into a u16. As can be seen in skb_gro_receive, packets bigger than or
equal to gro_max_size (or GRO_LEGACY_MAX_SIZE) are dropped with -E2BIG.
Apply the same boundary to geneve_post_decap_hint to avoid overflowing
the 16-bit iph->tot_len field by writing 65536 to it.
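
The boundary can be checked with a small userspace sketch. The macro and
function names below are invented for the example (the value mirrors
GRO_LEGACY_MAX_SIZE as stated above); it only demonstrates that 65536
wraps to 0 in a 16-bit field, which is why the comparison has to be >=
rather than >.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_GRO_LEGACY_MAX_SIZE 65536u

/* True when total_len can be stored in a 16-bit length field. */
static bool sketch_fits_tot_len(unsigned int total_len)
{
	return total_len < SKETCH_GRO_LEGACY_MAX_SIZE;
}

/* What a 16-bit field would actually hold after the store. */
static uint16_t sketch_store_u16(unsigned int total_len)
{
	return (uint16_t)total_len;
}

/* Usage example: the largest valid value and the first invalid one. */
static void sketch_tot_len_demo(void)
{
	assert(sketch_fits_tot_len(65535));
	assert(sketch_store_u16(65535) == 65535);

	/* The old "> 65536" test let this value through. */
	assert(!sketch_fits_tot_len(65536));
	assert(sketch_store_u16(65536) == 0);
}
```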
Fixes: fd0dd796576e ("geneve: use GRO hint option in the RX path")
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
---
drivers/net/geneve.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 01cdd06102e0..7a26e2439d48 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -604,7 +604,7 @@ static int geneve_post_decap_hint(const struct sock *sk, struct sk_buff *skb,
ipv6h = (void *)skb->data + gro_hint->nested_nh_offset;
iph = (struct iphdr *)ipv6h;
total_len = skb->len - gro_hint->nested_nh_offset;
- if (total_len > GRO_LEGACY_MAX_SIZE)
+ if (total_len >= GRO_LEGACY_MAX_SIZE)
return -E2BIG;
/*
--
2.52.0
* [PATCH net-next v2 04/12] net: Use helpers to get/set UDP len tree-wide
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (2 preceding siblings ...)
2026-02-26 20:15 ` [PATCH net-next v2 03/12] geneve: Fix off-by-one comparing with GRO_LEGACY_MAX_SIZE Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-02-26 20:19 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 05/12] net: Enable BIG TCP with partial GSO Alice Mikityanska
` (8 subsequent siblings)
12 siblings, 1 reply; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
Since BIG TCP for UDP tunnels will start using len=0 in the UDP header
as an indicator of a GSO packet bigger than 65535 bytes, this commit
introduces the following getter and setters to use tree-wide, in order
to explicitly mark places where len=0 may be expected, and handle them
properly:
1. udp_get_len_short() returns len in host byte order: to be used on the
RX side to deal with non-aggregated packets, or to access the raw value
of the len field.
2. udp_set_len() sets uh->len to its real value if it's not bigger than
65535, and to 0 otherwise: to be used in GSO context with aggregated
packets.
3. udp_set_len_short() is to be used when the length is known to fit 16
bits. It WARNs when the caller tries to assign a bigger value if
CONFIG_DEBUG_NET=y.
At the moment udp_set_len() is not used; a following commit will start
using it after enabling len>65535 for GSO.
Raw uh->len (in network byte order) is still accessed in a few places
for checksum calculation purposes, and in __udp6_lib_rcv and nsim_do_psp
to decode len=0 (__udp4_lib_rcv will be modified to parse len=0 in the
corresponding commit).
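
The intended semantics of the helpers can be tried out in userspace with
the sketch below. Everything prefixed with sketch_ is hypothetical: the
struct stands in for struct udphdr, sketch_set_len_short() returns a
bool where the kernel helper would WARN under CONFIG_DEBUG_NET, and
sketch_effective_len() is an invented name for the "len=0 means use the
skb length" decoding done in nsim_do_psp.

```c
#include <arpa/inet.h>	/* htons, ntohs */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_LEGACY_MAX_SIZE 65536u	/* mirrors GRO_LEGACY_MAX_SIZE */

struct sketch_udphdr {
	uint16_t source;
	uint16_t dest;
	uint16_t len;	/* network byte order */
	uint16_t check;
};

/* Raw value of the len field, in host byte order. */
static unsigned int sketch_get_len_short(const struct sketch_udphdr *uh)
{
	return ntohs(uh->len);
}

/* GSO context: store the real length if it fits, 0 otherwise. */
static void sketch_set_len(struct sketch_udphdr *uh, unsigned int len)
{
	uh->len = len < SKETCH_LEGACY_MAX_SIZE ? htons((uint16_t)len) : 0;
}

/* Length known to fit in 16 bits. Returns false where the kernel
 * helper would WARN; the (truncated) value is stored either way.
 */
static bool sketch_set_len_short(struct sketch_udphdr *uh, unsigned int len)
{
	bool fits = len < SKETCH_LEGACY_MAX_SIZE;

	uh->len = htons((uint16_t)len);
	return fits;
}

/* Decode len=0 by falling back to the length known from the buffer. */
static unsigned int sketch_effective_len(const struct sketch_udphdr *uh,
					 unsigned int fallback_len)
{
	unsigned int len = sketch_get_len_short(uh);

	return len ? len : fallback_len;
}

/* Usage example: a normal packet and an oversized GSO packet. */
static void sketch_len_demo(void)
{
	struct sketch_udphdr uh = { 0 };

	sketch_set_len(&uh, 1500);
	assert(sketch_get_len_short(&uh) == 1500);

	sketch_set_len(&uh, 200000);	/* bigger than 65535: stored as 0 */
	assert(sketch_get_len_short(&uh) == 0);
	assert(sketch_effective_len(&uh, 200000) == 200000);
}
```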
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
---
drivers/infiniband/core/lag.c | 2 +-
drivers/infiniband/sw/rxe/rxe_net.c | 4 +-
drivers/net/amt.c | 6 +--
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +-
drivers/net/ethernet/intel/iavf/iavf_txrx.c | 2 +-
drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +-
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 2 +-
.../marvell/octeontx2/nic/otx2_txrx.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/en_rx.c | 4 +-
.../ethernet/mellanox/mlx5/core/en_selftest.c | 2 +-
drivers/net/ethernet/sfc/falcon/selftest.c | 4 +-
drivers/net/ethernet/sfc/selftest.c | 4 +-
drivers/net/ethernet/sfc/siena/selftest.c | 4 +-
drivers/net/ethernet/sfc/tc_encap_actions.c | 2 +-
.../stmicro/stmmac/stmmac_selftests.c | 4 +-
drivers/net/geneve.c | 2 +-
drivers/net/netdevsim/dev.c | 2 +-
drivers/net/netdevsim/psample.c | 2 +-
drivers/net/netdevsim/psp.c | 8 ++--
drivers/net/wireguard/receive.c | 2 +-
include/linux/udp.h | 16 ++++++++
include/net/udplite.h | 4 +-
include/trace/events/icmp.h | 2 +-
lib/tests/blackhole_dev_kunit.c | 2 +-
net/6lowpan/nhc_udp.c | 10 ++---
net/core/netpoll.c | 2 +-
net/core/pktgen.c | 4 +-
net/core/selftests.c | 4 +-
net/core/tso.c | 3 +-
net/ipv4/esp4.c | 2 +-
net/ipv4/fou_core.c | 2 +-
net/ipv4/ipconfig.c | 6 +--
net/ipv4/netfilter/nf_nat_snmp_basic_main.c | 4 +-
net/ipv4/route.c | 2 +-
net/ipv4/udp.c | 3 +-
net/ipv4/udp_offload.c | 37 +++++++++----------
net/ipv4/udp_tunnel_core.c | 2 +-
net/ipv6/esp6.c | 5 ++-
net/ipv6/fou6.c | 2 +-
net/ipv6/ip6_udp_tunnel.c | 2 +-
net/ipv6/udp.c | 3 +-
net/ipv6/udp_offload.c | 2 +-
net/l2tp/l2tp_core.c | 2 +-
net/netfilter/ipvs/ip_vs_xmit.c | 2 +-
net/netfilter/nf_conntrack_proto_udp.c | 17 +++++++--
net/netfilter/nf_log_syslog.c | 2 +-
net/netfilter/nf_nat_helper.c | 2 +-
net/psp/psp_main.c | 2 +-
net/sched/act_csum.c | 4 +-
net/xfrm/xfrm_nat_keepalive.c | 2 +-
50 files changed, 123 insertions(+), 91 deletions(-)
diff --git a/drivers/infiniband/core/lag.c b/drivers/infiniband/core/lag.c
index 8fd80adfe833..00fe241737ff 100644
--- a/drivers/infiniband/core/lag.c
+++ b/drivers/infiniband/core/lag.c
@@ -36,7 +36,7 @@ static struct sk_buff *rdma_build_skb(struct net_device *netdev,
uh->source =
htons(rdma_flow_label_to_udp_sport(ah_attr->grh.flow_label));
uh->dest = htons(ROCE_V2_UDP_DPORT);
- uh->len = htons(sizeof(struct udphdr));
+ udp_set_len_short(uh, sizeof(struct udphdr));
if (is_ipv4) {
skb_push(skb, sizeof(struct iphdr));
diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
index 0bd0902b11f7..7508d2c3a306 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -237,7 +237,7 @@ static int rxe_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
pkt->port_num = 1;
pkt->hdr = (u8 *)(udph + 1);
pkt->mask = RXE_GRH_MASK;
- pkt->paylen = be16_to_cpu(udph->len) - sizeof(*udph);
+ pkt->paylen = udp_get_len_short(udph) - sizeof(*udph);
/* remove udp header */
skb_pull(skb, sizeof(struct udphdr));
@@ -300,7 +300,7 @@ static void prepare_udp_hdr(struct sk_buff *skb, __be16 src_port,
udph->dest = dst_port;
udph->source = src_port;
- udph->len = htons(skb->len);
+ udp_set_len_short(udph, skb->len);
udph->check = 0;
}
diff --git a/drivers/net/amt.c b/drivers/net/amt.c
index f2f3139e38a5..01511eca7d84 100644
--- a/drivers/net/amt.c
+++ b/drivers/net/amt.c
@@ -667,7 +667,7 @@ static void amt_send_discovery(struct amt_dev *amt)
udph = udp_hdr(skb);
udph->source = amt->gw_port;
udph->dest = amt->relay_port;
- udph->len = htons(sizeof(*udph) + sizeof(*amtd));
+ udp_set_len_short(udph, sizeof(*udph) + sizeof(*amtd));
udph->check = 0;
offset = skb_transport_offset(skb);
skb->csum = skb_checksum(skb, offset, skb->len - offset, 0);
@@ -758,7 +758,7 @@ static void amt_send_request(struct amt_dev *amt, bool v6)
udph = udp_hdr(skb);
udph->source = amt->gw_port;
udph->dest = amt->relay_port;
- udph->len = htons(sizeof(*amtrh) + sizeof(*udph));
+ udp_set_len_short(udph, sizeof(*amtrh) + sizeof(*udph));
udph->check = 0;
offset = skb_transport_offset(skb);
skb->csum = skb_checksum(skb, offset, skb->len - offset, 0);
@@ -2608,7 +2608,7 @@ static void amt_send_advertisement(struct amt_dev *amt, __be32 nonce,
udph = udp_hdr(skb);
udph->source = amt->relay_port;
udph->dest = dport;
- udph->len = htons(sizeof(*amta) + sizeof(*udph));
+ udp_set_len_short(udph, sizeof(*amta) + sizeof(*udph));
udph->check = 0;
offset = skb_transport_offset(skb);
skb->csum = skb_checksum(skb, offset, skb->len - offset, 0);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 34db7d8866b0..63433279e3c3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -3128,7 +3128,7 @@ static int i40e_tso(struct i40e_tx_buffer *first, u8 *hdr_len,
SKB_GSO_UDP_TUNNEL_CSUM)) {
if (!(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)) {
- l4.udp->len = 0;
+ udp_set_len_short(l4.udp, 0);
/* determine offset of outer transport header */
l4_offset = l4.hdr - skb->data;
diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
index 363c42bf3dcf..c30abf17cf5d 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
@@ -1774,7 +1774,7 @@ static int iavf_tso(struct iavf_tx_buffer *first, u8 *hdr_len,
SKB_GSO_UDP_TUNNEL_CSUM)) {
if (!(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)) {
- l4.udp->len = 0;
+ udp_set_len_short(l4.udp, 0);
/* determine offset of outer transport header */
l4_offset = l4.hdr - skb->data;
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index a5bbce68f76c..b45db305dd64 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1882,7 +1882,7 @@ int ice_tso(struct ice_tx_buf *first, struct ice_tx_offload_params *off)
SKB_GSO_UDP_TUNNEL_CSUM)) {
if (!(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
(skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)) {
- l4.udp->len = 0;
+ udp_set_len_short(l4.udp, 0);
/* determine offset of outer transport header */
l4_start = (u8)(l4.hdr - skb->data);
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index 05a162094d10..3a160801e3b8 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -2865,7 +2865,7 @@ int idpf_tso(struct sk_buff *skb, struct idpf_tx_offload_params *off)
(__force __wsum)htonl(paylen));
/* compute length of segmentation header */
off->tso_hdr_len = sizeof(struct udphdr) + l4_start;
- l4.udp->len = htons(shinfo->gso_size + sizeof(struct udphdr));
+ udp_set_len_short(l4.udp, shinfo->gso_size + sizeof(struct udphdr));
break;
default:
return -EINVAL;
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
index 625bb5a05344..8d2d607bc92f 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
@@ -750,7 +750,7 @@ static void otx2_sqe_add_ext(struct otx2_nic *pfvf, struct otx2_snd_queue *sq,
ext->lso_format = pfvf->hw.lso_udpv6_idx;
}
- udph->len = htons(sizeof(struct udphdr));
+ udp_set_len_short(udph, sizeof(struct udphdr));
}
} else if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {
ext->tstmp = 1;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 8fb57a4f36dd..6bb1971083c2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1036,7 +1036,7 @@ static void mlx5e_shampo_update_ipv4_udp_hdr(struct mlx5e_rq *rq, struct iphdr *
struct udphdr *uh;
uh = (struct udphdr *)(skb->data + udp_off);
- uh->len = htons(skb->len - udp_off);
+ udp_set_len_short(uh, skb->len - udp_off);
if (uh->check)
uh->check = ~udp_v4_check(skb->len - udp_off, ipv4->saddr,
@@ -1055,7 +1055,7 @@ static void mlx5e_shampo_update_ipv6_udp_hdr(struct mlx5e_rq *rq, struct ipv6hdr
struct udphdr *uh;
uh = (struct udphdr *)(skb->data + udp_off);
- uh->len = htons(skb->len - udp_off);
+ udp_set_len_short(uh, skb->len - udp_off);
if (uh->check)
uh->check = ~udp_v6_check(skb->len - udp_off, &ipv6->saddr,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
index accc26d1a872..1dcdb86690bb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -113,7 +113,7 @@ static struct sk_buff *mlx5e_test_get_udp_skb(struct mlx5e_priv *priv)
/* Fill UDP header */
udph->source = htons(9);
udph->dest = htons(9); /* Discard Protocol */
- udph->len = htons(sizeof(struct mlx5ehdr) + sizeof(struct udphdr));
+ udp_set_len_short(udph, sizeof(struct mlx5ehdr) + sizeof(struct udphdr));
udph->check = 0;
/* Fill IP header */
diff --git a/drivers/net/ethernet/sfc/falcon/selftest.c b/drivers/net/ethernet/sfc/falcon/selftest.c
index db4dd7fb77f5..4d29e0baf2eb 100644
--- a/drivers/net/ethernet/sfc/falcon/selftest.c
+++ b/drivers/net/ethernet/sfc/falcon/selftest.c
@@ -401,8 +401,8 @@ static void ef4_iterate_state(struct ef4_nic *efx)
/* Initialise udp header */
payload->udp.source = 0;
- payload->udp.len = htons(sizeof(*payload) -
- offsetof(struct ef4_loopback_payload, udp));
+ udp_set_len_short(&payload->udp, sizeof(*payload) -
+ offsetof(struct ef4_loopback_payload, udp));
payload->udp.check = 0; /* checksum ignored */
/* Fill out payload */
diff --git a/drivers/net/ethernet/sfc/selftest.c b/drivers/net/ethernet/sfc/selftest.c
index 8ec76329237a..dc716feb79cb 100644
--- a/drivers/net/ethernet/sfc/selftest.c
+++ b/drivers/net/ethernet/sfc/selftest.c
@@ -398,8 +398,8 @@ static void efx_iterate_state(struct efx_nic *efx)
/* Initialise udp header */
payload->udp.source = 0;
- payload->udp.len = htons(sizeof(*payload) -
- offsetof(struct efx_loopback_payload, udp));
+ udp_set_len_short(&payload->udp, sizeof(*payload) -
+ offsetof(struct efx_loopback_payload, udp));
payload->udp.check = 0; /* checksum ignored */
/* Fill out payload */
diff --git a/drivers/net/ethernet/sfc/siena/selftest.c b/drivers/net/ethernet/sfc/siena/selftest.c
index 930643612df5..c74cf5131364 100644
--- a/drivers/net/ethernet/sfc/siena/selftest.c
+++ b/drivers/net/ethernet/sfc/siena/selftest.c
@@ -399,8 +399,8 @@ static void efx_iterate_state(struct efx_nic *efx)
/* Initialise udp header */
payload->udp.source = 0;
- payload->udp.len = htons(sizeof(*payload) -
- offsetof(struct efx_loopback_payload, udp));
+ udp_set_len_short(&payload->udp, sizeof(*payload) -
+ offsetof(struct efx_loopback_payload, udp));
payload->udp.check = 0; /* checksum ignored */
/* Fill out payload */
diff --git a/drivers/net/ethernet/sfc/tc_encap_actions.c b/drivers/net/ethernet/sfc/tc_encap_actions.c
index da35705cc5e1..b2428e098817 100644
--- a/drivers/net/ethernet/sfc/tc_encap_actions.c
+++ b/drivers/net/ethernet/sfc/tc_encap_actions.c
@@ -312,7 +312,7 @@ static void efx_gen_tun_header_udp(struct efx_tc_encap_action *encap, u8 len)
encap->encap_hdr_len += sizeof(*udp);
udp->dest = key->tp_dst;
- udp->len = cpu_to_be16(sizeof(*udp) + len);
+ udp_set_len_short(udp, sizeof(*udp) + len);
}
static void efx_gen_tun_header_vxlan(struct efx_tc_encap_action *encap)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
index a0c75886587c..29e824bd90ca 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
@@ -154,9 +154,9 @@ static struct sk_buff *stmmac_test_get_udp_skb(struct stmmac_priv *priv,
} else {
uhdr->source = htons(attr->sport);
uhdr->dest = htons(attr->dport);
- uhdr->len = htons(sizeof(*shdr) + sizeof(*uhdr) + attr->size);
+ udp_set_len_short(uhdr, sizeof(*shdr) + sizeof(*uhdr) + attr->size);
if (attr->max_size)
- uhdr->len = htons(attr->max_size -
+ udp_set_len_short(uhdr, attr->max_size -
(sizeof(*ihdr) + sizeof(*ehdr)));
uhdr->check = 0;
}
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 7a26e2439d48..5aa5c0e81b12 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -631,7 +631,7 @@ static int geneve_post_decap_hint(const struct sock *sk, struct sk_buff *skb,
/* Adjust the nested UDP header len and checksum. */
uh = udp_hdr(skb);
- uh->len = htons(skb->len - gro_hint->nested_tp_offset);
+ udp_set_len_short(uh, skb->len - gro_hint->nested_tp_offset);
if (uh->check) {
len = skb->len - gro_hint->nested_nh_offset;
skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_TUNNEL_CSUM;
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index e82de0fd3157..9f10ff039252 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -845,7 +845,7 @@ static struct sk_buff *nsim_dev_trap_skb_build(void)
udph = skb_put_zero(skb, sizeof(struct udphdr) + data_len);
get_random_bytes(&udph->source, sizeof(u16));
get_random_bytes(&udph->dest, sizeof(u16));
- udph->len = htons(sizeof(struct udphdr) + data_len);
+ udp_set_len_short(udph, sizeof(struct udphdr) + data_len);
return skb;
}
diff --git a/drivers/net/netdevsim/psample.c b/drivers/net/netdevsim/psample.c
index 47d24bc64ee4..ef3116686707 100644
--- a/drivers/net/netdevsim/psample.c
+++ b/drivers/net/netdevsim/psample.c
@@ -73,7 +73,7 @@ static struct sk_buff *nsim_dev_psample_skb_build(void)
udph = skb_put_zero(skb, sizeof(struct udphdr) + data_len);
get_random_bytes(&udph->source, sizeof(u16));
get_random_bytes(&udph->dest, sizeof(u16));
- udph->len = htons(sizeof(struct udphdr) + data_len);
+ udp_set_len_short(udph, sizeof(struct udphdr) + data_len);
return skb;
}
diff --git a/drivers/net/netdevsim/psp.c b/drivers/net/netdevsim/psp.c
index 0b4d717253b0..e81b69d6a577 100644
--- a/drivers/net/netdevsim/psp.c
+++ b/drivers/net/netdevsim/psp.c
@@ -84,6 +84,7 @@ nsim_do_psp(struct sk_buff *skb, struct netdevsim *ns,
struct iphdr *iph;
struct udphdr *uh;
__wsum csum;
+ u16 udplen;
/* Do not decapsulate. Receive the skb with the udp and psp
* headers still there as if this is a normal udp packet.
@@ -91,19 +92,20 @@ nsim_do_psp(struct sk_buff *skb, struct netdevsim *ns,
* provide a valid checksum here, so the skb isn't dropped.
*/
uh = udp_hdr(skb);
+ udplen = ntohs(uh->len) ?: skb->len;
csum = skb_checksum(skb, skb_transport_offset(skb),
- ntohs(uh->len), 0);
+ udplen, 0);
switch (skb->protocol) {
case htons(ETH_P_IP):
iph = ip_hdr(skb);
- uh->check = udp_v4_check(ntohs(uh->len), iph->saddr,
+ uh->check = udp_v4_check(udplen, iph->saddr,
iph->daddr, csum);
break;
#if IS_ENABLED(CONFIG_IPV6)
case htons(ETH_P_IPV6):
ip6h = ipv6_hdr(skb);
- uh->check = udp_v6_check(ntohs(uh->len), &ip6h->saddr,
+ uh->check = udp_v6_check(udplen, &ip6h->saddr,
&ip6h->daddr, csum);
break;
#endif
diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
index eb8851113654..275fe1bc994c 100644
--- a/drivers/net/wireguard/receive.c
+++ b/drivers/net/wireguard/receive.c
@@ -62,7 +62,7 @@ static int prepare_skb_header(struct sk_buff *skb, struct wg_device *wg)
* to have UDP fields.
*/
return -EINVAL;
- data_len = ntohs(udp->len);
+ data_len = udp_get_len_short(udp); /* GRO not expected here. */
if (unlikely(data_len < sizeof(struct udphdr) ||
data_len > skb->len - data_offset))
/* UDP packet is reporting too small of a size or lying about
diff --git a/include/linux/udp.h b/include/linux/udp.h
index 1cbf6b4d3aab..3a79fa23918f 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -23,6 +23,22 @@ static inline struct udphdr *udp_hdr(const struct sk_buff *skb)
return (struct udphdr *)skb_transport_header(skb);
}
+static inline unsigned int udp_get_len_short(const struct udphdr *uh)
+{
+ return ntohs(uh->len);
+}
+
+static inline void udp_set_len(struct udphdr *uh, unsigned int len)
+{
+ uh->len = len < GRO_LEGACY_MAX_SIZE ? htons(len) : 0;
+}
+
+static inline void udp_set_len_short(struct udphdr *uh, unsigned int len)
+{
+ DEBUG_NET_WARN_ON_ONCE(len >= GRO_LEGACY_MAX_SIZE);
+ uh->len = htons(len);
+}
+
#define UDP_HTABLE_SIZE_MIN_PERNET 128
#define UDP_HTABLE_SIZE_MIN (IS_ENABLED(CONFIG_BASE_SMALL) ? 128 : 256)
#define UDP_HTABLE_SIZE_MAX 65536
diff --git a/include/net/udplite.h b/include/net/udplite.h
index 786919d29f8d..b7148bc5b7c3 100644
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -40,7 +40,7 @@ static inline int udplite_checksum_init(struct sk_buff *skb, struct udphdr *uh)
return 1;
}
- cscov = ntohs(uh->len);
+ cscov = udp_get_len_short(uh);
if (cscov == 0) /* Indicates that full coverage is required. */
;
@@ -76,7 +76,7 @@ static inline __wsum udplite_csum(struct sk_buff *skb)
if (pcslen < len) {
if (pcslen > 0)
len = pcslen;
- udp_hdr(skb)->len = htons(pcslen);
+ udp_set_len_short(udp_hdr(skb), pcslen);
}
}
skb->ip_summed = CHECKSUM_NONE; /* no HW support for checksumming */
diff --git a/include/trace/events/icmp.h b/include/trace/events/icmp.h
index 31559796949a..09ae115099df 100644
--- a/include/trace/events/icmp.h
+++ b/include/trace/events/icmp.h
@@ -44,7 +44,7 @@ TRACE_EVENT(icmp_send,
} else {
__entry->sport = ntohs(uh->source);
__entry->dport = ntohs(uh->dest);
- __entry->ulen = ntohs(uh->len);
+ __entry->ulen = udp_get_len_short(uh);
}
p32 = (__be32 *) __entry->saddr;
diff --git a/lib/tests/blackhole_dev_kunit.c b/lib/tests/blackhole_dev_kunit.c
index 06834ab35f43..fa3e0533038d 100644
--- a/lib/tests/blackhole_dev_kunit.c
+++ b/lib/tests/blackhole_dev_kunit.c
@@ -46,7 +46,7 @@ static void test_blackholedev(struct kunit *test)
uh = (struct udphdr *)skb_push(skb, sizeof(struct udphdr));
skb_set_transport_header(skb, 0);
uh->source = uh->dest = htons(UDP_PORT);
- uh->len = htons(data_len);
+ udp_set_len_short(uh, data_len);
uh->check = 0;
/* (Network) IPv6 */
ip6h = (struct ipv6hdr *)skb_push(skb, sizeof(struct ipv6hdr));
diff --git a/net/6lowpan/nhc_udp.c b/net/6lowpan/nhc_udp.c
index 0a506c77283d..ed4227e6db74 100644
--- a/net/6lowpan/nhc_udp.c
+++ b/net/6lowpan/nhc_udp.c
@@ -88,16 +88,16 @@ static int udp_uncompress(struct sk_buff *skb, size_t needed)
switch (lowpan_dev(skb->dev)->lltype) {
case LOWPAN_LLTYPE_IEEE802154:
if (lowpan_802154_cb(skb)->d_size)
- uh.len = htons(lowpan_802154_cb(skb)->d_size -
- sizeof(struct ipv6hdr));
+ udp_set_len_short(&uh, lowpan_802154_cb(skb)->d_size -
+ sizeof(struct ipv6hdr));
else
- uh.len = htons(skb->len + sizeof(struct udphdr));
+ udp_set_len_short(&uh, skb->len + sizeof(struct udphdr));
break;
default:
- uh.len = htons(skb->len + sizeof(struct udphdr));
+ udp_set_len_short(&uh, skb->len + sizeof(struct udphdr));
break;
}
- pr_debug("uncompressed UDP length: src = %d", ntohs(uh.len));
+ pr_debug("uncompressed UDP length: src = %d", udp_get_len_short(&uh));
/* replace the compressed UDP head by the uncompressed UDP
* header
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index a8558a52884f..a3f737974d51 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -474,7 +474,7 @@ static void push_udp(struct netpoll *np, struct sk_buff *skb, int len)
udph = udp_hdr(skb);
udph->source = htons(np->local_port);
udph->dest = htons(np->remote_port);
- udph->len = htons(udp_len);
+ udp_set_len_short(udph, udp_len);
netpoll_udp_checksum(np, skb, len);
}
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 8e185b318288..5b4dd04d6124 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3005,7 +3005,7 @@ static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
udph->source = htons(pkt_dev->cur_udp_src);
udph->dest = htons(pkt_dev->cur_udp_dst);
- udph->len = htons(datalen + 8); /* DATA + udphdr */
+ udp_set_len_short(udph, datalen + 8); /* DATA + udphdr */
udph->check = 0;
iph->ihl = 5;
@@ -3138,7 +3138,7 @@ static struct sk_buff *fill_packet_ipv6(struct net_device *odev,
udplen = datalen + sizeof(struct udphdr);
udph->source = htons(pkt_dev->cur_udp_src);
udph->dest = htons(pkt_dev->cur_udp_dst);
- udph->len = htons(udplen);
+ udp_set_len_short(udph, udplen);
udph->check = 0;
*(__be32 *) iph = htonl(0x60000000); /* Version + flow */
diff --git a/net/core/selftests.c b/net/core/selftests.c
index 0a203d3fb9dc..36b949ae520b 100644
--- a/net/core/selftests.c
+++ b/net/core/selftests.c
@@ -72,9 +72,9 @@ struct sk_buff *net_test_get_skb(struct net_device *ndev, u8 id,
} else {
uhdr->source = htons(attr->sport);
uhdr->dest = htons(attr->dport);
- uhdr->len = htons(sizeof(*shdr) + sizeof(*uhdr) + attr->size);
+ udp_set_len_short(uhdr, sizeof(*shdr) + sizeof(*uhdr) + attr->size);
if (attr->max_size)
- uhdr->len = htons(attr->max_size -
+ udp_set_len_short(uhdr, attr->max_size -
(sizeof(*ihdr) + sizeof(*ehdr)));
uhdr->check = 0;
}
diff --git a/net/core/tso.c b/net/core/tso.c
index 6df997b9076e..3cc5a03e7a12 100644
--- a/net/core/tso.c
+++ b/net/core/tso.c
@@ -38,7 +38,8 @@ void tso_build_hdr(const struct sk_buff *skb, char *hdr, struct tso_t *tso,
} else {
struct udphdr *uh = (struct udphdr *)hdr;
- uh->len = htons(sizeof(*uh) + size);
+ /* size is after segmentation. */
+ udp_set_len_short(uh, sizeof(*uh) + size);
}
}
EXPORT_SYMBOL(tso_build_hdr);
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 2c922afadb8f..25ee1ea9f548 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -317,7 +317,7 @@ static struct ip_esp_hdr *esp_output_udp_encap(struct sk_buff *skb,
uh = (struct udphdr *)esp->esph;
uh->source = sport;
uh->dest = dport;
- uh->len = htons(len);
+ udp_set_len_short(uh, len);
uh->check = 0;
/* For IPv4 ESP with UDP encapsulation, if xo is not null, the skb is in the crypto offload
diff --git a/net/ipv4/fou_core.c b/net/ipv4/fou_core.c
index 3baaa4df7e42..7aeb6efbfc44 100644
--- a/net/ipv4/fou_core.c
+++ b/net/ipv4/fou_core.c
@@ -1043,7 +1043,7 @@ static void fou_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
uh->dest = e->dport;
uh->source = sport;
- uh->len = htons(skb->len);
+ udp_set_len_short(uh, skb->len);
udp_set_csum(!(e->flags & TUNNEL_ENCAP_FLAG_CSUM), skb,
fl4->saddr, fl4->daddr, skb->len);
diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index a35ffedacc7c..155db067eaec 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -847,7 +847,7 @@ static void __init ic_bootp_send_if(struct ic_device *d, unsigned long jiffies_d
/* Construct UDP header */
b->udph.source = htons(68);
b->udph.dest = htons(67);
- b->udph.len = htons(sizeof(struct bootp_pkt) - sizeof(struct iphdr));
+ udp_set_len_short(&b->udph, sizeof(struct bootp_pkt) - sizeof(struct iphdr));
/* UDP checksum not calculated -- explicitly allowed in BOOTP RFC */
/* Construct DHCP/BOOTP header */
@@ -1025,10 +1025,10 @@ static int __init ic_bootp_recv(struct sk_buff *skb, struct net_device *dev, str
if (b->udph.source != htons(67) || b->udph.dest != htons(68))
goto drop;
- if (ntohs(h->tot_len) < ntohs(b->udph.len) + sizeof(struct iphdr))
+ if (ntohs(h->tot_len) < udp_get_len_short(&b->udph) + sizeof(struct iphdr))
goto drop;
- len = ntohs(b->udph.len) - sizeof(struct udphdr);
+ len = udp_get_len_short(&b->udph) - sizeof(struct udphdr);
ext_len = len - (sizeof(*b) -
sizeof(struct iphdr) -
sizeof(struct udphdr) -
diff --git a/net/ipv4/netfilter/nf_nat_snmp_basic_main.c b/net/ipv4/netfilter/nf_nat_snmp_basic_main.c
index 717b726504fe..afe0f4a328d0 100644
--- a/net/ipv4/netfilter/nf_nat_snmp_basic_main.c
+++ b/net/ipv4/netfilter/nf_nat_snmp_basic_main.c
@@ -127,7 +127,7 @@ static int snmp_translate(struct nf_conn *ct, int dir, struct sk_buff *skb)
{
struct iphdr *iph = ip_hdr(skb);
struct udphdr *udph = (struct udphdr *)((__be32 *)iph + iph->ihl);
- u16 datalen = ntohs(udph->len) - sizeof(struct udphdr);
+ u16 datalen = udp_get_len_short(udph) - sizeof(struct udphdr);
char *data = (unsigned char *)udph + sizeof(struct udphdr);
struct snmp_ctx ctx;
int ret;
@@ -181,7 +181,7 @@ static int help(struct sk_buff *skb, unsigned int protoff,
* enough room for a UDP header. Just verify the UDP length field so we
* can mess around with the payload.
*/
- if (ntohs(udph->len) != skb->len - (iph->ihl << 2)) {
+ if (udp_get_len_short(udph) != skb->len - (iph->ihl << 2)) {
nf_ct_helper_log(skb, ct, "dropping malformed packet\n");
return NF_DROP;
}
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 463236e0dc2d..2eed102231b8 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -3190,7 +3190,7 @@ static struct sk_buff *inet_rtm_getroute_build_skb(__be32 src, __be32 dst,
udph = skb_put_zero(skb, sizeof(struct udphdr));
udph->source = sport;
udph->dest = dport;
- udph->len = htons(sizeof(struct udphdr));
+ udp_set_len_short(udph, sizeof(struct udphdr));
udph->check = 0;
break;
}
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 6c6b68a66dcd..345ef93001fc 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1133,7 +1133,8 @@ static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4,
uh = udp_hdr(skb);
uh->source = inet->inet_sport;
uh->dest = fl4->fl4_dport;
- uh->len = htons(len);
+ /* Datagram length checked in udp_sendmsg. */
+ udp_set_len_short(uh, len);
uh->check = 0;
if (cork->gso_size) {
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index e831234326c4..2f35b485ff40 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -279,11 +279,11 @@ static struct sk_buff *__skb_udp_tunnel_segment(struct sk_buff *skb,
* segment instead of the entire frame.
*/
if (gso_partial && skb_is_gso(skb)) {
- uh->len = htons(skb_shinfo(skb)->gso_size +
- SKB_GSO_CB(skb)->data_offset +
- skb->head - (unsigned char *)uh);
+ udp_set_len_short(uh, skb_shinfo(skb)->gso_size +
+ SKB_GSO_CB(skb)->data_offset +
+ skb->head - (unsigned char *)uh);
} else {
- uh->len = htons(len);
+ udp_set_len_short(uh, len);
}
if (!need_csum)
@@ -469,7 +469,7 @@ static struct sk_buff *__udp_gso_segment_list(struct sk_buff *skb,
if (IS_ERR(skb))
return skb;
- udp_hdr(skb)->len = htons(sizeof(struct udphdr) + mss);
+ udp_set_len_short(udp_hdr(skb), sizeof(struct udphdr) + mss);
if (is_ipv6)
return __udpv6_gso_segment_list_csum(skb);
@@ -487,8 +487,8 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
unsigned int mss;
bool copy_dtor;
__sum16 check;
- __be16 newlen;
int ret = 0;
+ u16 newlen;
mss = skb_shinfo(gso_skb)->gso_size;
if (gso_skb->len <= sizeof(*uh) + mss)
@@ -565,8 +565,8 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
(skb_shinfo(gso_skb)->tx_flags & SKBTX_ANY_TSTAMP);
/* compute checksum adjustment based on old length versus new */
- newlen = htons(sizeof(*uh) + mss);
- check = csum16_add(csum16_sub(uh->check, uh->len), newlen);
+ newlen = sizeof(*uh) + mss;
+ check = csum16_add(csum16_sub(uh->check, uh->len), htons(newlen));
for (;;) {
if (copy_dtor) {
@@ -578,7 +578,7 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
if (!seg->next)
break;
- uh->len = newlen;
+ udp_set_len_short(uh, newlen);
uh->check = check;
if (seg->ip_summed == CHECKSUM_PARTIAL)
@@ -592,11 +592,10 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
}
/* last packet can be partial gso_size, account for that in checksum */
- newlen = htons(skb_tail_pointer(seg) - skb_transport_header(seg) +
- seg->data_len);
- check = csum16_add(csum16_sub(uh->check, uh->len), newlen);
+ newlen = skb_tail_pointer(seg) - skb_transport_header(seg) + seg->data_len;
+ check = csum16_add(csum16_sub(uh->check, uh->len), htons(newlen));
- uh->len = newlen;
+ udp_set_len_short(uh, newlen);
uh->check = check;
if (seg->ip_summed == CHECKSUM_PARTIAL)
@@ -708,7 +707,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
}
/* Do not deal with padded or malicious packets, sorry ! */
- ulen = ntohs(uh->len);
+ ulen = udp_get_len_short(uh);
if (ulen <= sizeof(*uh) || ulen != skb_gro_len(skb)) {
NAPI_GRO_CB(skb)->flush = 1;
return NULL;
@@ -741,7 +740,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
* On len mismatch merge the first packet shorter than gso_size,
* otherwise complete the GRO packet.
*/
- if (ulen > ntohs(uh2->len) || flush) {
+ if (ulen > udp_get_len_short(uh2) || flush) {
pp = p;
} else {
if (NAPI_GRO_CB(skb)->is_flist) {
@@ -764,7 +763,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
}
}
- if (ret || ulen != ntohs(uh2->len) ||
+ if (ret || ulen != udp_get_len_short(uh2) ||
NAPI_GRO_CB(p)->count >= UDP_GRO_CNT_MAX)
pp = p;
@@ -916,12 +915,12 @@ static int udp_gro_complete_segment(struct sk_buff *skb)
int udp_gro_complete(struct sk_buff *skb, int nhoff,
udp_lookup_t lookup)
{
- __be16 newlen = htons(skb->len - nhoff);
+ unsigned int newlen = skb->len - nhoff;
struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
struct sock *sk;
int err;
- uh->len = newlen;
+ udp_set_len_short(uh, newlen);
sk = INDIRECT_CALL_INET(lookup, udp6_lib_lookup_skb,
udp4_lib_lookup_skb, skb, uh->source, uh->dest);
@@ -959,7 +958,7 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
/* do fraglist only if there is no outer UDP encap (or we already processed it) */
if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) {
- uh->len = htons(skb->len - nhoff);
+ udp_set_len_short(uh, skb->len - nhoff);
skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index b1f667c52cb2..18f789d9383e 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -184,7 +184,7 @@ void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb
uh->dest = dst_port;
uh->source = src_port;
- uh->len = htons(skb->len);
+ udp_set_len_short(uh, skb->len);
memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index e75da98f5283..194566129477 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -227,7 +227,8 @@ static void esp_output_encap_csum(struct sk_buff *skb)
if (*skb_mac_header(skb) == IPPROTO_UDP) {
struct udphdr *uh = udp_hdr(skb);
struct ipv6hdr *ip6h = ipv6_hdr(skb);
- int len = ntohs(uh->len);
+ /* esp6_output_udp_encap limits len to U16_MAX. */
+ int len = udp_get_len_short(uh);
unsigned int offset = skb_transport_offset(skb);
__wsum csum = skb_checksum(skb, offset, skb->len - offset, 0);
@@ -352,7 +353,7 @@ static struct ip_esp_hdr *esp6_output_udp_encap(struct sk_buff *skb,
uh = (struct udphdr *)esp->esph;
uh->source = sport;
uh->dest = dport;
- uh->len = htons(len);
+ udp_set_len_short(uh, len);
uh->check = 0;
*skb_mac_header(skb) = IPPROTO_UDP;
diff --git a/net/ipv6/fou6.c b/net/ipv6/fou6.c
index 430518ae26fa..abcf23500299 100644
--- a/net/ipv6/fou6.c
+++ b/net/ipv6/fou6.c
@@ -30,7 +30,7 @@ static void fou6_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
uh->dest = e->dport;
uh->source = sport;
- uh->len = htons(skb->len);
+ udp_set_len_short(uh, skb->len);
udp6_set_csum(!(e->flags & TUNNEL_ENCAP_FLAG_CSUM6), skb,
&fl6->saddr, &fl6->daddr, skb->len);
diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c
index cef3e0210744..26b140fea7b7 100644
--- a/net/ipv6/ip6_udp_tunnel.c
+++ b/net/ipv6/ip6_udp_tunnel.c
@@ -93,7 +93,7 @@ void udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk,
uh->dest = dst_port;
uh->source = src_port;
- uh->len = htons(skb->len);
+ udp_set_len_short(uh, skb->len);
skb_dst_set(skb, dst);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 48f73401adf4..dbc41008d286 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1431,7 +1431,8 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
uh = udp_hdr(skb);
uh->source = fl6->fl6_sport;
uh->dest = fl6->fl6_dport;
- uh->len = htons(len);
+ /* Datagram length checked in udpv6_sendmsg. */
+ udp_set_len_short(uh, len);
uh->check = 0;
if (cork->gso_size) {
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index e003b8494dc0..bfe0d7104e8a 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -172,7 +172,7 @@ int udp6_gro_complete(struct sk_buff *skb, int nhoff)
/* do fraglist only if there is no outer UDP encap (or we already processed it) */
if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) {
- uh->len = htons(skb->len - nhoff);
+ udp_set_len_short(uh, skb->len - nhoff);
skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index c89ae52764b8..432bac206990 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1290,7 +1290,7 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb, uns
uh->source = inet->inet_sport;
uh->dest = inet->inet_dport;
udp_len = uhlen + session->hdr_len + data_len;
- uh->len = htons(udp_len);
+ udp_set_len_short(uh, udp_len);
/* Calculate UDP checksum if configured to do so */
#if IS_ENABLED(CONFIG_IPV6)
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 0fb5162992e5..b460998e348e 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -1089,7 +1089,7 @@ ipvs_gue_encap(struct net *net, struct sk_buff *skb,
dport = cp->dest->tun_port;
udph->dest = dport;
udph->source = sport;
- udph->len = htons(skb->len);
+ udp_set_len_short(udph, skb->len);
udph->check = 0;
*next_protocol = IPPROTO_UDP;
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index 0030fbe8885c..e9bd1632304f 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -41,11 +41,22 @@ static void udp_error_log(const struct sk_buff *skb,
nf_l4proto_log_invalid(skb, state, IPPROTO_UDP, "%s", msg);
}
+static bool udp_validate_len(struct sk_buff *skb,
+ const struct udphdr *hdr,
+ unsigned int dataoff)
+{
+ unsigned int udplen = udp_get_len_short(hdr);
+ unsigned int skblen = skb->len - dataoff;
+
+ if (udplen > skblen || udplen < sizeof(*hdr))
+ return false;
+ return true;
+}
+
static bool udp_error(struct sk_buff *skb,
unsigned int dataoff,
const struct nf_hook_state *state)
{
- unsigned int udplen = skb->len - dataoff;
const struct udphdr *hdr;
struct udphdr _hdr;
@@ -57,7 +68,7 @@ static bool udp_error(struct sk_buff *skb,
}
/* Truncated/malformed packets */
- if (ntohs(hdr->len) > udplen || ntohs(hdr->len) < sizeof(*hdr)) {
+ if (!udp_validate_len(skb, hdr, dataoff)) {
udp_error_log(skb, state, "truncated/malformed packet");
return true;
}
@@ -153,7 +164,7 @@ static bool udplite_error(struct sk_buff *skb,
return true;
}
- cscov = ntohs(hdr->len);
+ cscov = udp_get_len_short(hdr);
if (cscov == 0) {
cscov = udplen;
} else if (cscov < sizeof(*hdr) || cscov > udplen) {
diff --git a/net/netfilter/nf_log_syslog.c b/net/netfilter/nf_log_syslog.c
index 41503847d9d7..0254db8b97ce 100644
--- a/net/netfilter/nf_log_syslog.c
+++ b/net/netfilter/nf_log_syslog.c
@@ -290,7 +290,7 @@ nf_log_dump_udp_header(struct nf_log_buf *m,
/* Max length: 20 "SPT=65535 DPT=65535 " */
nf_log_buf_add(m, "SPT=%u DPT=%u LEN=%u ",
- ntohs(uh->source), ntohs(uh->dest), ntohs(uh->len));
+ ntohs(uh->source), ntohs(uh->dest), udp_get_len_short(uh));
out:
return 0;
diff --git a/net/netfilter/nf_nat_helper.c b/net/netfilter/nf_nat_helper.c
index bf591e6af005..3853f41db499 100644
--- a/net/netfilter/nf_nat_helper.c
+++ b/net/netfilter/nf_nat_helper.c
@@ -161,7 +161,7 @@ nf_nat_mangle_udp_packet(struct sk_buff *skb,
/* update the length of the UDP packet */
datalen = skb->len - protoff;
- udph->len = htons(datalen);
+ udp_set_len_short(udph, datalen);
/* fix udp checksum if udp checksum was previously calculated */
if (!udph->check && skb->ip_summed != CHECKSUM_PARTIAL)
diff --git a/net/psp/psp_main.c b/net/psp/psp_main.c
index d4c04c923c5a..2415b75a2a12 100644
--- a/net/psp/psp_main.c
+++ b/net/psp/psp_main.c
@@ -207,7 +207,7 @@ static void psp_write_headers(struct net *net, struct sk_buff *skb, __be32 spi,
uh->source = udp_flow_src_port(net, skb, 0, 0, false);
}
uh->check = 0;
- uh->len = htons(udp_len);
+ udp_set_len_short(uh, udp_len);
psph->nexthdr = IPPROTO_TCP;
psph->hdrlen = PSP_HDRLEN_NOOPT;
diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
index a5cc76613f32..5315f851b7a4 100644
--- a/net/sched/act_csum.c
+++ b/net/sched/act_csum.c
@@ -276,7 +276,7 @@ static int tcf_csum_ipv4_udp(struct sk_buff *skb, unsigned int ihl,
return 0;
iph = ip_hdr(skb);
- ul = ntohs(udph->len);
+ ul = udp_get_len_short(udph);
if (udplite || udph->check) {
@@ -334,7 +334,7 @@ static int tcf_csum_ipv6_udp(struct sk_buff *skb, unsigned int ihl,
return 0;
ip6h = ipv6_hdr(skb);
- ul = ntohs(udph->len);
+ ul = udp_get_len_short(udph);
udph->check = 0;
diff --git a/net/xfrm/xfrm_nat_keepalive.c b/net/xfrm/xfrm_nat_keepalive.c
index ebf95d48e86c..678626ae3229 100644
--- a/net/xfrm/xfrm_nat_keepalive.c
+++ b/net/xfrm/xfrm_nat_keepalive.c
@@ -133,7 +133,7 @@ static void nat_keepalive_send(struct nat_keepalive *ka)
uh = skb_push(skb, sizeof(*uh));
uh->source = ka->encap_sport;
uh->dest = ka->encap_dport;
- uh->len = htons(skb->len);
+ udp_set_len_short(uh, skb->len);
uh->check = 0;
skb->mark = ka->smark;
--
2.52.0
* [PATCH net-next v2 05/12] net: Enable BIG TCP with partial GSO
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (3 preceding siblings ...)
2026-02-26 20:15 ` [PATCH net-next v2 04/12] net: Use helpers to get/set UDP len tree-wide Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 06/12] udp: Support gro_ipv4_max_size > 65536 Alice Mikityanska
` (7 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
skb_segment is called for partial GSO, when netif_needs_gso returns true
in validate_xmit_skb. Partial GSO is needed, for example, when
segmentation of tunneled traffic is offloaded to a NIC that only
supports inner checksum offload.
Currently, skb_segment clamps the segment length to 65534 bytes, because
gso_size == 65535 is a special value GSO_BY_FRAGS, and we don't want
to accidentally assign mss = 65535, as it would fall into the
GSO_BY_FRAGS check further in the function.
This implementation, however, artificially blocks len > 65534, which is
possible since the introduction of BIG TCP. To allow bigger lengths and
avoid resegmentation of BIG TCP packets, store the gso_by_frags flag at
the beginning and stop relying on a special value of mss for this
purpose after mss has been modified.
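The collision described above can be sketched in userspace C. This is a simplified illustration, not the kernel code: struct seg_plan and plan_partial_gso are hypothetical names, and the feature checks that guard the partial GSO branch in skb_segment are left out. It only shows why the sentinel test has to be latched before mss is scaled.

```c
#include <stdbool.h>

/* Sentinel from the commit message: gso_size == 65535 means GSO_BY_FRAGS. */
#define GSO_BY_FRAGS 0xFFFFu

/* Hypothetical result type for this sketch only. */
struct seg_plan {
	bool gso_by_frags;	/* latched from the original gso_size */
	unsigned int mss;	/* possibly scaled by partial_segs */
};

/* Hypothetical helper mirroring the mss scaling done for partial GSO. */
static struct seg_plan plan_partial_gso(unsigned int gso_size, unsigned int len)
{
	struct seg_plan plan = {
		.gso_by_frags = gso_size == GSO_BY_FRAGS,
		.mss = gso_size,
	};
	unsigned int partial_segs;

	if (plan.gso_by_frags || plan.mss == 0)
		return plan;

	/* No clamp on len: the product may exceed, or equal, 65535. */
	partial_segs = len / plan.mss;
	if (partial_segs > 1)
		plan.mss *= partial_segs;

	return plan;
}
```

With gso_size = 13107 and len = 65535 the scaled mss is exactly 65535, yet the latched flag stays false, so a later test of the flag does not mistake the packet for a GSO_BY_FRAGS one.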
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
---
net/core/skbuff.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0e217041958a..490fc55759c3 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4773,6 +4773,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
struct sk_buff *tail = NULL;
struct sk_buff *list_skb = skb_shinfo(head_skb)->frag_list;
unsigned int mss = skb_shinfo(head_skb)->gso_size;
+ bool gso_by_frags = mss == GSO_BY_FRAGS;
unsigned int doffset = head_skb->data - skb_mac_header(head_skb);
unsigned int offset = doffset;
unsigned int tnl_hlen = skb_tnl_header_len(head_skb);
@@ -4788,7 +4789,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
int nfrags, pos;
if ((skb_shinfo(head_skb)->gso_type & SKB_GSO_DODGY) &&
- mss != GSO_BY_FRAGS && mss != skb_headlen(head_skb)) {
+ !gso_by_frags && mss != skb_headlen(head_skb)) {
struct sk_buff *check_skb;
for (check_skb = list_skb; check_skb; check_skb = check_skb->next) {
@@ -4816,7 +4817,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
sg = !!(features & NETIF_F_SG);
csum = !!can_checksum_protocol(features, proto);
- if (sg && csum && (mss != GSO_BY_FRAGS)) {
+ if (sg && csum && !gso_by_frags) {
if (!(features & NETIF_F_GSO_PARTIAL)) {
struct sk_buff *iter;
unsigned int frag_len;
@@ -4850,9 +4851,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
/* GSO partial only requires that we trim off any excess that
* doesn't fit into an MSS sized block, so take care of that
* now.
- * Cap len to not accidentally hit GSO_BY_FRAGS.
*/
- partial_segs = min(len, GSO_BY_FRAGS - 1) / mss;
+ partial_segs = len / mss;
if (partial_segs > 1)
mss *= partial_segs;
else
@@ -4876,7 +4876,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
int hsize;
int size;
- if (unlikely(mss == GSO_BY_FRAGS)) {
+ if (unlikely(gso_by_frags)) {
len = list_skb->len;
} else {
len = head_skb->len - offset;
--
2.52.0
* [PATCH net-next v2 06/12] udp: Support gro_ipv4_max_size > 65536
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (4 preceding siblings ...)
2026-02-26 20:15 ` [PATCH net-next v2 05/12] net: Enable BIG TCP with partial GSO Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-03-06 21:24 ` Willem de Bruijn
2026-02-26 20:15 ` [PATCH net-next v2 07/12] udp: Support BIG TCP GSO packets where they can occur Alice Mikityanska
` (6 subsequent siblings)
12 siblings, 1 reply; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
Currently, gro_max_size and gro_ipv4_max_size can be set to values
bigger than 65536, and GRO will happily aggregate UDP to the configured
size (for example, with TCP traffic in VXLAN tunnels). However,
udp_gro_complete uses the 16-bit length field in the UDP header to store
the length of the aggregated packet. This leads to packet truncation
later in __udp4_lib_rcv.
Fix this by storing 0 in the UDP length field and by restoring the real
length from skb->len in __udp4_lib_rcv.
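A minimal userspace sketch of the store/restore round trip follows. The struct and the sketch_ prefixed functions are stand-ins invented for illustration; the set rule matches udp_set_len from patch 04, and skb_len is assumed to be the buffer length measured from the start of the UDP header.

```c
#include <arpa/inet.h>
#include <stdint.h>

#define GRO_LEGACY_MAX_SIZE 65536u

/* Minimal stand-in for the kernel's struct udphdr. */
struct udp_hdr_sketch {
	uint16_t source;
	uint16_t dest;
	uint16_t len;	/* network byte order */
	uint16_t check;
};

/* Same rule as udp_set_len: 0 marks a length that needs more than 16 bits. */
static void sketch_udp_set_len(struct udp_hdr_sketch *uh, unsigned int len)
{
	uh->len = len < GRO_LEGACY_MAX_SIZE ? htons((uint16_t)len) : 0;
}

/* Receive side: fall back to the buffer length when the field is 0. */
static unsigned int sketch_udp_rcv_len(const struct udp_hdr_sketch *uh,
				       unsigned int skb_len)
{
	unsigned int ulen = ntohs(uh->len);

	return ulen ? ulen : skb_len;
}
```

Storing 70000 leaves 0 in the header, and the receive helper then reports the buffer length instead; lengths below 65536 pass through unchanged.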
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
---
net/ipv4/udp.c | 5 ++++-
net/ipv4/udp_offload.c | 4 ++--
net/ipv6/udp_offload.c | 2 +-
3 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 345ef93001fc..870b35107ede 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2690,7 +2690,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
{
struct sock *sk = NULL;
struct udphdr *uh;
- unsigned short ulen;
+ unsigned int ulen;
struct rtable *rt = skb_rtable(skb);
__be32 saddr, daddr;
struct net *net = dev_net(skb->dev);
@@ -2714,6 +2714,9 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
goto short_packet;
if (proto == IPPROTO_UDP) {
+ if (!ulen)
+ ulen = skb->len;
+
/* UDP validates ulen. */
if (ulen < sizeof(*uh) || pskb_trim_rcsum(skb, ulen))
goto short_packet;
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 2f35b485ff40..780df257a8d9 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -920,7 +920,7 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff,
struct sock *sk;
int err;
- udp_set_len_short(uh, newlen);
+ udp_set_len(uh, newlen);
sk = INDIRECT_CALL_INET(lookup, udp6_lib_lookup_skb,
udp4_lib_lookup_skb, skb, uh->source, uh->dest);
@@ -958,7 +958,7 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
/* do fraglist only if there is no outer UDP encap (or we already processed it) */
if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) {
- udp_set_len_short(uh, skb->len - nhoff);
+ udp_set_len(uh, skb->len - nhoff);
skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index bfe0d7104e8a..37b90ad9f9b2 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -172,7 +172,7 @@ int udp6_gro_complete(struct sk_buff *skb, int nhoff)
/* do fraglist only if there is no outer UDP encap (or we already processed it) */
if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) {
- udp_set_len_short(uh, skb->len - nhoff);
+ udp_set_len(uh, skb->len - nhoff);
skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
--
2.52.0
* [PATCH net-next v2 07/12] udp: Support BIG TCP GSO packets where they can occur
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (5 preceding siblings ...)
2026-02-26 20:15 ` [PATCH net-next v2 06/12] udp: Support gro_ipv4_max_size > 65536 Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 08/12] udp: Validate UDP length in udp_gro_receive Alice Mikityanska
` (5 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
Wherever a GSO packet can occur and its length is used to fill the UDP
header, use udp_set_len, which assigns 0 if the length doesn't fit in 16
bits, so that the packet can be properly parsed and segmented later
instead of carrying a truncated length.
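To make the truncation concrete, and to show how a parser can accept the 0 marker only for oversized buffers (as the udp_validate_len change in this patch does), here is a small userspace sketch. The function names and UDP_HDR_SIZE are hypothetical; buflen stands in for skb->len - dataoff.

```c
#include <stdbool.h>
#include <stdint.h>

#define GRO_LEGACY_MAX_SIZE 65536u
#define UDP_HDR_SIZE 8u

/* What a plain 16-bit store would leave in the length field. */
static unsigned int sketch_truncated_len(unsigned int len)
{
	return (uint16_t)len;
}

/* udplen: length field in host order; buflen: bytes from the UDP header on. */
static bool sketch_udp_len_ok(unsigned int udplen, unsigned int buflen)
{
	/* 0 is only meaningful when the buffer cannot be described in 16 bits. */
	if (!udplen && buflen >= GRO_LEGACY_MAX_SIZE)
		return true;
	return udplen >= UDP_HDR_SIZE && udplen <= buflen;
}
```

For a 70000-byte packet the truncated field would read 4464, which later code would take at face value; a 0 field paired with a 70000-byte buffer is accepted, while a 0 field on a 1500-byte buffer is still treated as malformed.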
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
---
net/ipv4/fou_core.c | 2 +-
net/ipv6/fou6.c | 2 +-
net/netfilter/ipvs/ip_vs_xmit.c | 2 +-
net/netfilter/nf_conntrack_proto_udp.c | 4 +++-
net/netfilter/nf_nat_helper.c | 2 +-
net/psp/psp_main.c | 2 +-
6 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/fou_core.c b/net/ipv4/fou_core.c
index 7aeb6efbfc44..d934860e1f34 100644
--- a/net/ipv4/fou_core.c
+++ b/net/ipv4/fou_core.c
@@ -1043,7 +1043,7 @@ static void fou_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
uh->dest = e->dport;
uh->source = sport;
- udp_set_len_short(uh, skb->len);
+ udp_set_len(uh, skb->len);
udp_set_csum(!(e->flags & TUNNEL_ENCAP_FLAG_CSUM), skb,
fl4->saddr, fl4->daddr, skb->len);
diff --git a/net/ipv6/fou6.c b/net/ipv6/fou6.c
index abcf23500299..3c79fe8aba1b 100644
--- a/net/ipv6/fou6.c
+++ b/net/ipv6/fou6.c
@@ -30,7 +30,7 @@ static void fou6_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
uh->dest = e->dport;
uh->source = sport;
- udp_set_len_short(uh, skb->len);
+ udp_set_len(uh, skb->len);
udp6_set_csum(!(e->flags & TUNNEL_ENCAP_FLAG_CSUM6), skb,
&fl6->saddr, &fl6->daddr, skb->len);
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index b460998e348e..08b0b5bfe4ec 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -1089,7 +1089,7 @@ ipvs_gue_encap(struct net *net, struct sk_buff *skb,
dport = cp->dest->tun_port;
udph->dest = dport;
udph->source = sport;
- udp_set_len_short(udph, skb->len);
+ udp_set_len(udph, skb->len);
udph->check = 0;
*next_protocol = IPPROTO_UDP;
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index e9bd1632304f..ca7d259ded8b 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -45,9 +45,11 @@ static bool udp_validate_len(struct sk_buff *skb,
const struct udphdr *hdr,
unsigned int dataoff)
{
- unsigned int udplen = udp_get_len_short(hdr);
+ unsigned int udplen = ntohs(hdr->len);
unsigned int skblen = skb->len - dataoff;
+ if (!udplen && skblen >= GRO_LEGACY_MAX_SIZE)
+ return true;
if (udplen > skblen || udplen < sizeof(*hdr))
return false;
return true;
diff --git a/net/netfilter/nf_nat_helper.c b/net/netfilter/nf_nat_helper.c
index 3853f41db499..ec34a2f4baa8 100644
--- a/net/netfilter/nf_nat_helper.c
+++ b/net/netfilter/nf_nat_helper.c
@@ -161,7 +161,7 @@ nf_nat_mangle_udp_packet(struct sk_buff *skb,
/* update the length of the UDP packet */
datalen = skb->len - protoff;
- udp_set_len_short(udph, datalen);
+ udp_set_len(udph, datalen);
/* fix udp checksum if udp checksum was previously calculated */
if (!udph->check && skb->ip_summed != CHECKSUM_PARTIAL)
diff --git a/net/psp/psp_main.c b/net/psp/psp_main.c
index 2415b75a2a12..c9c60cb69daa 100644
--- a/net/psp/psp_main.c
+++ b/net/psp/psp_main.c
@@ -207,7 +207,7 @@ static void psp_write_headers(struct net *net, struct sk_buff *skb, __be32 spi,
uh->source = udp_flow_src_port(net, skb, 0, 0, false);
}
uh->check = 0;
- udp_set_len_short(uh, udp_len);
+ udp_set_len(uh, udp_len);
psph->nexthdr = IPPROTO_TCP;
psph->hdrlen = PSP_HDRLEN_NOOPT;
--
2.52.0
* [PATCH net-next v2 08/12] udp: Validate UDP length in udp_gro_receive
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (6 preceding siblings ...)
2026-02-26 20:15 ` [PATCH net-next v2 07/12] udp: Support BIG TCP GSO packets where they can occur Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 09/12] udp: Set length in UDP header to 0 for big GSO packets Alice Mikityanska
` (4 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
In the previous commit we started using uh->len = 0 as a marker of a GRO
packet bigger than 65535 bytes. To prevent abuse by maliciously crafted
packets, check the length in the UDP header in udp_gro_receive.
Note that a similar check was present in udp_gro_receive_segment, but
not in the UDP socket gro_receive flow. By adding an early check to
udp_gro_receive, the check in udp_gro_receive_segment can be dropped.
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
---
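For illustration, the condition that this patch moves into udp_gro_receive
can be sketched in userspace C. This is a minimal sketch, not kernel code:
udp_len_ok_for_gro() and UDP_HDR_LEN are hypothetical names, ulen stands for
the UDP length field in host byte order, and gro_len stands for what
skb_gro_len(skb) returns at the UDP header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UDP_HDR_LEN 8u	/* stand-in for sizeof(struct udphdr) */

/*
 * Hypothetical helper mirroring the check
 *   ulen <= sizeof(*uh) || ulen != skb_gro_len(skb)
 * from the diff below. Returns true when GRO may proceed.
 */
static bool udp_len_ok_for_gro(uint16_t ulen, size_t gro_len)
{
	/* Header-only or shorter, including len=0 arriving from the wire. */
	if (ulen <= UDP_HDR_LEN)
		return false;
	/* Padded or otherwise inconsistent datagram. */
	if (ulen != gro_len)
		return false;
	return true;
}
```

With this condition, a wire packet carrying len=0 is never aggregated,
whatever its real size, so the len=0 marker can only be produced locally.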
net/ipv4/udp_offload.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 780df257a8d9..5d9de8998867 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -706,12 +706,8 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
return NULL;
}
- /* Do not deal with padded or malicious packets, sorry ! */
ulen = udp_get_len_short(uh);
- if (ulen <= sizeof(*uh) || ulen != skb_gro_len(skb)) {
- NAPI_GRO_CB(skb)->flush = 1;
- return NULL;
- }
+
/* pull encapsulating udp header */
skb_gro_pull(skb, sizeof(struct udphdr));
@@ -781,8 +777,14 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
struct sk_buff *p;
struct udphdr *uh2;
unsigned int off = skb_gro_offset(skb);
+ unsigned int ulen;
int flush = 1;
+ /* Do not deal with padded or malicious packets, sorry! */
+ ulen = udp_get_len_short(uh);
+ if (ulen <= sizeof(*uh) || ulen != skb_gro_len(skb))
+ goto out;
+
/* We can do L4 aggregation only if the packet can't land in a tunnel
* otherwise we could corrupt the inner stream. Detecting such packets
* cannot be foolproof and the aggregation might still happen in some
--
2.52.0
* [PATCH net-next v2 09/12] udp: Set length in UDP header to 0 for big GSO packets
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (7 preceding siblings ...)
2026-02-26 20:15 ` [PATCH net-next v2 08/12] udp: Validate UDP length in udp_gro_receive Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 10/12] vxlan: Enable BIG TCP packets Alice Mikityanska
` (3 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
skb->len may be bigger than 65535 in UDP-based tunnels that have BIG TCP
enabled. If GSO aggregates packets that large, set the length in the UDP
header to 0, so that tcpdump can print such packets properly (treating
them as RFC 2675 jumbograms). Later in the pipeline, __udp_gso_segment
will set uh->len to the size of individual packets.
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
---
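To show why 0 is preferable to a plain 16-bit store, here is a small
userspace sketch of the arithmetic. The function names are hypothetical,
values are in host byte order, and the byte swap done by the real helpers is
left out on purpose.

```c
#include <stdint.h>

/* A plain 16-bit store keeps only the low 16 bits of the length. */
static uint16_t udp_len_truncated(uint32_t skb_len)
{
	return (uint16_t)skb_len;
}

/* Behaviour described in the commit message: 0 means "over 65535". */
static uint16_t udp_len_or_zero(uint32_t skb_len)
{
	return skb_len > UINT16_MAX ? 0 : (uint16_t)skb_len;
}
```

For an illustrative length of 129556 bytes, the truncated store yields 64020
(129556 - 65536), which a dissector would take for a valid but wrong length,
while the second variant yields 0.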
net/ipv4/udp_tunnel_core.c | 2 +-
net/ipv6/ip6_udp_tunnel.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index 18f789d9383e..8c586dc08f3b 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -184,7 +184,7 @@ void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb
uh->dest = dst_port;
uh->source = src_port;
- udp_set_len_short(uh, skb->len);
+ udp_set_len(uh, skb->len);
memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c
index 26b140fea7b7..b7810736d1c6 100644
--- a/net/ipv6/ip6_udp_tunnel.c
+++ b/net/ipv6/ip6_udp_tunnel.c
@@ -93,7 +93,7 @@ void udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk,
uh->dest = dst_port;
uh->source = src_port;
- udp_set_len_short(uh, skb->len);
+ udp_set_len(uh, skb->len);
skb_dst_set(skb, dst);
--
2.52.0
* [PATCH net-next v2 10/12] vxlan: Enable BIG TCP packets
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (8 preceding siblings ...)
2026-02-26 20:15 ` [PATCH net-next v2 09/12] udp: Set length in UDP header to 0 for big GSO packets Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 11/12] geneve: " Alice Mikityanska
` (2 subsequent siblings)
12 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
In Cilium we do support BIG TCP, but so far the latter has only been
enabled for direct routing use-cases. A lot of users rely on Cilium
with vxlan/geneve tunneling though. The underlying kernel infra for
tunneling has not been supporting BIG TCP up to this point.
Given we do now, bump tso_max_size for vxlan netdevs up to GSO_MAX_SIZE
to allow the admin to use BIG TCP with vxlan tunnels.
BIG TCP on vxlan disabled:
Standard MTU:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 30.00 34440.00
8k MTU:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
262144 32768 32768 30.00 55684.26
BIG TCP on vxlan enabled:
Standard MTU:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 30.00 39564.78
8k MTU:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
262144 32768 32768 30.00 61466.47
When tunnel offloads are not enabled/exposed and we need to rely fully on
SW-based segmentation on transmit (e.g. in the case of Azure), the more
aggressive batching also has a visible effect. The example below was run on
the same setup as the benchmarks above, but with HW support disabled:
# ethtool -k enp10s0f0np0 | grep udp
tx-udp_tnl-segmentation: off
tx-udp_tnl-csum-segmentation: off
tx-udp-segmentation: off
rx-udp_tunnel-port-offload: off
rx-udp-gro-forwarding: off
Before:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 60.00 21820.82
After:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 60.00 29390.78
Example receive side:
swapper 0 [002] 4712.645070: net:netif_receive_skb: dev=enp10s0f0np0 skbaddr=0xffff8f3b086e0200 len=129542
ffffffff8cfe3aaa __netif_receive_skb_core.constprop.0+0x6ca ([kernel.kallsyms])
ffffffff8cfe3aaa __netif_receive_skb_core.constprop.0+0x6ca ([kernel.kallsyms])
ffffffff8cfe47dd __netif_receive_skb_list_core+0xed ([kernel.kallsyms])
ffffffff8cfe4e52 netif_receive_skb_list_internal+0x1d2 ([kernel.kallsyms])
ffffffff8d0210d8 gro_complete.constprop.0+0x108 ([kernel.kallsyms])
ffffffff8d021724 dev_gro_receive+0x4e4 ([kernel.kallsyms])
ffffffff8d021a99 gro_receive_skb+0x89 ([kernel.kallsyms])
ffffffffc06edb71 mlx5e_handle_rx_cqe_mpwrq+0x131 ([kernel.kallsyms])
ffffffffc06ee38a mlx5e_poll_rx_cq+0x9a ([kernel.kallsyms])
ffffffffc06ef2c7 mlx5e_napi_poll+0x107 ([kernel.kallsyms])
ffffffff8cfe586d __napi_poll+0x2d ([kernel.kallsyms])
ffffffff8cfe5f8d net_rx_action+0x20d ([kernel.kallsyms])
ffffffff8c35d252 handle_softirqs+0xe2 ([kernel.kallsyms])
ffffffff8c35d556 __irq_exit_rcu+0xd6 ([kernel.kallsyms])
ffffffff8c35d81e irq_exit_rcu+0xe ([kernel.kallsyms])
ffffffff8d2602b8 common_interrupt+0x98 ([kernel.kallsyms])
ffffffff8c000da7 asm_common_interrupt+0x27 ([kernel.kallsyms])
ffffffff8d2645c5 cpuidle_enter_state+0xd5 ([kernel.kallsyms])
ffffffff8cf6358e cpuidle_enter+0x2e ([kernel.kallsyms])
ffffffff8c3ba932 call_cpuidle+0x22 ([kernel.kallsyms])
ffffffff8c3bfb5e do_idle+0x1ce ([kernel.kallsyms])
ffffffff8c3bfd79 cpu_startup_entry+0x29 ([kernel.kallsyms])
ffffffff8c30a6c2 start_secondary+0x112 ([kernel.kallsyms])
ffffffff8c2c142d common_startup_64+0x13e ([kernel.kallsyms])
Example transmit side:
swapper 0 [005] 4768.021375: net:net_dev_xmit: dev=enp10s0f0np0 skbaddr=0xffff8af32ebe1200 len=129556 rc=0
ffffffffa75e19c3 dev_hard_start_xmit+0x173 ([kernel.kallsyms])
ffffffffa75e19c3 dev_hard_start_xmit+0x173 ([kernel.kallsyms])
ffffffffa7653823 sch_direct_xmit+0x143 ([kernel.kallsyms])
ffffffffa75e2780 __dev_queue_xmit+0xc70 ([kernel.kallsyms])
ffffffffa76a1205 ip_finish_output2+0x265 ([kernel.kallsyms])
ffffffffa76a1577 __ip_finish_output+0x87 ([kernel.kallsyms])
ffffffffa76a165b ip_finish_output+0x2b ([kernel.kallsyms])
ffffffffa76a179e ip_output+0x5e ([kernel.kallsyms])
ffffffffa76a19d5 ip_local_out+0x35 ([kernel.kallsyms])
ffffffffa770d0e5 iptunnel_xmit+0x185 ([kernel.kallsyms])
ffffffffc179634e nf_nat_used_tuple_new.cold+0x1129 ([kernel.kallsyms])
ffffffffc17a7301 vxlan_xmit_one+0xc21 ([kernel.kallsyms])
ffffffffc17a80a2 vxlan_xmit+0x4a2 ([kernel.kallsyms])
ffffffffa75e18af dev_hard_start_xmit+0x5f ([kernel.kallsyms])
ffffffffa75e1d3f __dev_queue_xmit+0x22f ([kernel.kallsyms])
ffffffffa76a1205 ip_finish_output2+0x265 ([kernel.kallsyms])
ffffffffa76a1577 __ip_finish_output+0x87 ([kernel.kallsyms])
ffffffffa76a165b ip_finish_output+0x2b ([kernel.kallsyms])
ffffffffa76a179e ip_output+0x5e ([kernel.kallsyms])
ffffffffa76a1de2 __ip_queue_xmit+0x1b2 ([kernel.kallsyms])
ffffffffa76a2135 ip_queue_xmit+0x15 ([kernel.kallsyms])
ffffffffa76c70a2 __tcp_transmit_skb+0x522 ([kernel.kallsyms])
ffffffffa76c931a tcp_write_xmit+0x65a ([kernel.kallsyms])
ffffffffa76cb42e tcp_tsq_write+0x5e ([kernel.kallsyms])
ffffffffa76cb7ef tcp_tasklet_func+0x10f ([kernel.kallsyms])
ffffffffa695d9f7 tasklet_action_common+0x107 ([kernel.kallsyms])
ffffffffa695db99 tasklet_action+0x29 ([kernel.kallsyms])
ffffffffa695d252 handle_softirqs+0xe2 ([kernel.kallsyms])
ffffffffa695d556 __irq_exit_rcu+0xd6 ([kernel.kallsyms])
ffffffffa695d81e irq_exit_rcu+0xe ([kernel.kallsyms])
ffffffffa78602b8 common_interrupt+0x98 ([kernel.kallsyms])
ffffffffa6600da7 asm_common_interrupt+0x27 ([kernel.kallsyms])
ffffffffa78645c5 cpuidle_enter_state+0xd5 ([kernel.kallsyms])
ffffffffa756358e cpuidle_enter+0x2e ([kernel.kallsyms])
ffffffffa69ba932 call_cpuidle+0x22 ([kernel.kallsyms])
ffffffffa69bfb5e do_idle+0x1ce ([kernel.kallsyms])
ffffffffa69bfd79 cpu_startup_entry+0x29 ([kernel.kallsyms])
ffffffffa690a6c2 start_secondary+0x112 ([kernel.kallsyms])
ffffffffa68c142d common_startup_64+0x13e ([kernel.kallsyms])
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
---
drivers/net/vxlan/vxlan_core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 05558b6afecd..88ecbd151433 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -3363,6 +3363,8 @@ static void vxlan_setup(struct net_device *dev)
dev->mangleid_features = NETIF_F_GSO_PARTIAL;
netif_keep_dst(dev);
+ netif_set_tso_max_size(dev, GSO_MAX_SIZE);
+
dev->priv_flags |= IFF_NO_QUEUE;
dev->change_proto_down = true;
dev->lltx = true;
--
2.52.0
* [PATCH net-next v2 11/12] geneve: Enable BIG TCP packets
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (9 preceding siblings ...)
2026-02-26 20:15 ` [PATCH net-next v2 10/12] vxlan: Enable BIG TCP packets Alice Mikityanska
@ 2026-02-26 20:15 ` Alice Mikityanska
2026-02-26 20:16 ` [PATCH net-next v2 12/12] selftests: net: Add a test for BIG TCP in UDP tunnels Alice Mikityanska
2026-02-27 18:17 ` [syzbot ci] Re: BIG TCP for " syzbot ci
12 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:15 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
From: Daniel Borkmann <daniel@iogearbox.net>
In Cilium we do support BIG TCP, but so far the latter has only been
enabled for direct routing use-cases. A lot of users rely on Cilium
with vxlan/geneve tunneling though. The underlying kernel infra for
tunneling has not been supporting BIG TCP up to this point.
Given we do now, bump tso_max_size for geneve netdevs up to GSO_MAX_SIZE
to allow the admin to use BIG TCP with geneve tunnels.
BIG TCP on geneve disabled:
Standard MTU:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 30.00 37391.34
8k MTU:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
262144 32768 32768 60.00 58030.19
BIG TCP on geneve enabled:
Standard MTU:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
131072 16384 16384 30.00 40891.57
8k MTU:
# netperf -H 10.1.0.2 -t TCP_STREAM -l60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.1.0.2 () port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
262144 32768 32768 60.00 61458.39
Example receive side:
swapper 0 [008] 3682.509996: net:netif_receive_skb: dev=geneve0 skbaddr=0xffff8f3b0a781800 len=129492
ffffffff8cfe3aaa __netif_receive_skb_core.constprop.0+0x6ca ([kernel.kallsyms])
ffffffff8cfe3aaa __netif_receive_skb_core.constprop.0+0x6ca ([kernel.kallsyms])
ffffffff8cfe47dd __netif_receive_skb_list_core+0xed ([kernel.kallsyms])
ffffffff8cfe4e52 netif_receive_skb_list_internal+0x1d2 ([kernel.kallsyms])
ffffffff8cfe573c napi_complete_done+0x7c ([kernel.kallsyms])
ffffffff8d046c23 gro_cell_poll+0x83 ([kernel.kallsyms])
ffffffff8cfe586d __napi_poll+0x2d ([kernel.kallsyms])
ffffffff8cfe5f8d net_rx_action+0x20d ([kernel.kallsyms])
ffffffff8c35d252 handle_softirqs+0xe2 ([kernel.kallsyms])
ffffffff8c35d556 __irq_exit_rcu+0xd6 ([kernel.kallsyms])
ffffffff8c35d81e irq_exit_rcu+0xe ([kernel.kallsyms])
ffffffff8d2602b8 common_interrupt+0x98 ([kernel.kallsyms])
ffffffff8c000da7 asm_common_interrupt+0x27 ([kernel.kallsyms])
ffffffff8d2645c5 cpuidle_enter_state+0xd5 ([kernel.kallsyms])
ffffffff8cf6358e cpuidle_enter+0x2e ([kernel.kallsyms])
ffffffff8c3ba932 call_cpuidle+0x22 ([kernel.kallsyms])
ffffffff8c3bfb5e do_idle+0x1ce ([kernel.kallsyms])
ffffffff8c3bfd79 cpu_startup_entry+0x29 ([kernel.kallsyms])
ffffffff8c30a6c2 start_secondary+0x112 ([kernel.kallsyms])
ffffffff8c2c142d common_startup_64+0x13e ([kernel.kallsyms])
Example transmit side:
swapper 0 [002] 3403.688687: net:net_dev_xmit: dev=enp10s0f0np0 skbaddr=0xffff8af31d104ae8 len=129556 rc=0
ffffffffa75e19c3 dev_hard_start_xmit+0x173 ([kernel.kallsyms])
ffffffffa75e19c3 dev_hard_start_xmit+0x173 ([kernel.kallsyms])
ffffffffa7653823 sch_direct_xmit+0x143 ([kernel.kallsyms])
ffffffffa75e2780 __dev_queue_xmit+0xc70 ([kernel.kallsyms])
ffffffffa76a1205 ip_finish_output2+0x265 ([kernel.kallsyms])
ffffffffa76a1577 __ip_finish_output+0x87 ([kernel.kallsyms])
ffffffffa76a165b ip_finish_output+0x2b ([kernel.kallsyms])
ffffffffa76a179e ip_output+0x5e ([kernel.kallsyms])
ffffffffa76a19d5 ip_local_out+0x35 ([kernel.kallsyms])
ffffffffa770d0e5 iptunnel_xmit+0x185 ([kernel.kallsyms])
ffffffffc179634e nf_nat_used_tuple_new.cold+0x1129 ([kernel.kallsyms])
ffffffffc179d3e0 geneve_xmit+0x920 ([kernel.kallsyms])
ffffffffa75e18af dev_hard_start_xmit+0x5f ([kernel.kallsyms])
ffffffffa75e1d3f __dev_queue_xmit+0x22f ([kernel.kallsyms])
ffffffffa76a1205 ip_finish_output2+0x265 ([kernel.kallsyms])
ffffffffa76a1577 __ip_finish_output+0x87 ([kernel.kallsyms])
ffffffffa76a165b ip_finish_output+0x2b ([kernel.kallsyms])
ffffffffa76a179e ip_output+0x5e ([kernel.kallsyms])
ffffffffa76a1de2 __ip_queue_xmit+0x1b2 ([kernel.kallsyms])
ffffffffa76a2135 ip_queue_xmit+0x15 ([kernel.kallsyms])
ffffffffa76c70a2 __tcp_transmit_skb+0x522 ([kernel.kallsyms])
ffffffffa76c931a tcp_write_xmit+0x65a ([kernel.kallsyms])
ffffffffa76ca3b9 __tcp_push_pending_frames+0x39 ([kernel.kallsyms])
ffffffffa76c1fb6 tcp_rcv_established+0x276 ([kernel.kallsyms])
ffffffffa76d3957 tcp_v4_do_rcv+0x157 ([kernel.kallsyms])
ffffffffa76d6053 tcp_v4_rcv+0x1243 ([kernel.kallsyms])
ffffffffa769b8ea ip_protocol_deliver_rcu+0x2a ([kernel.kallsyms])
ffffffffa769bab7 ip_local_deliver_finish+0x77 ([kernel.kallsyms])
ffffffffa769bb4d ip_local_deliver+0x6d ([kernel.kallsyms])
ffffffffa769abe7 ip_sublist_rcv_finish+0x37 ([kernel.kallsyms])
ffffffffa769b713 ip_sublist_rcv+0x173 ([kernel.kallsyms])
ffffffffa769bde2 ip_list_rcv+0x102 ([kernel.kallsyms])
ffffffffa75e4868 __netif_receive_skb_list_core+0x178 ([kernel.kallsyms])
ffffffffa75e4e52 netif_receive_skb_list_internal+0x1d2 ([kernel.kallsyms])
ffffffffa75e573c napi_complete_done+0x7c ([kernel.kallsyms])
ffffffffa7646c23 gro_cell_poll+0x83 ([kernel.kallsyms])
ffffffffa75e586d __napi_poll+0x2d ([kernel.kallsyms])
ffffffffa75e5f8d net_rx_action+0x20d ([kernel.kallsyms])
ffffffffa695d252 handle_softirqs+0xe2 ([kernel.kallsyms])
ffffffffa695d556 __irq_exit_rcu+0xd6 ([kernel.kallsyms])
ffffffffa695d81e irq_exit_rcu+0xe ([kernel.kallsyms])
ffffffffa78602b8 common_interrupt+0x98 ([kernel.kallsyms])
ffffffffa6600da7 asm_common_interrupt+0x27 ([kernel.kallsyms])
ffffffffa78645c5 cpuidle_enter_state+0xd5 ([kernel.kallsyms])
ffffffffa756358e cpuidle_enter+0x2e ([kernel.kallsyms])
ffffffffa69ba932 call_cpuidle+0x22 ([kernel.kallsyms])
ffffffffa69bfb5e do_idle+0x1ce ([kernel.kallsyms])
ffffffffa69bfd79 cpu_startup_entry+0x29 ([kernel.kallsyms])
ffffffffa690a6c2 start_secondary+0x112 ([kernel.kallsyms])
ffffffffa68c142d common_startup_64+0x13e ([kernel.kallsyms])
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: Alice Mikityanska <alice@isovalent.com>
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
---
drivers/net/geneve.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 5aa5c0e81b12..c92af9b6171f 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1700,6 +1700,8 @@ static void geneve_setup(struct net_device *dev)
dev->max_mtu = IP_MAX_MTU - GENEVE_BASE_HLEN - dev->hard_header_len;
netif_keep_dst(dev);
+ netif_set_tso_max_size(dev, GSO_MAX_SIZE);
+
dev->priv_flags &= ~IFF_TX_SKB_SHARING;
dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_NO_QUEUE;
dev->lltx = true;
--
2.52.0
* [PATCH net-next v2 12/12] selftests: net: Add a test for BIG TCP in UDP tunnels
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (10 preceding siblings ...)
2026-02-26 20:15 ` [PATCH net-next v2 11/12] geneve: " Alice Mikityanska
@ 2026-02-26 20:16 ` Alice Mikityanska
2026-02-27 1:30 ` Jakub Kicinski
2026-02-27 18:17 ` [syzbot ci] Re: BIG TCP for " syzbot ci
12 siblings, 1 reply; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:16 UTC (permalink / raw)
To: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
From: Alice Mikityanska <alice@isovalent.com>
The test sets up VXLAN and GENEVE tunnels over IPv4 and IPv6 and runs
IPv4 and IPv6 traffic through them with BIG TCP enabled. It checks that
a non-negligible number of big aggregated packets is seen in tcpdump.
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
---
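The pass criterion of do_test() in the script below can be restated as a
small C sketch. The function names are hypothetical; the constant and the
arithmetic are taken from the script, which counts tcpdump output lines
rather than packets.

```c
#include <stdbool.h>

#define PACKETS_THRESHOLD 10000L

/*
 * tcpdump line count to packet count, as printed by the script:
 * one line is empty and each packet takes two lines (inner and outer).
 */
static long lines_to_packets(long lines)
{
	return (lines - 1) / 2;
}

/* The capture passes when it has more than 2 * threshold + 1 lines. */
static bool capture_passes(long lines)
{
	return lines > PACKETS_THRESHOLD * 2 + 1;
}
```

The same criterion is applied to the server-side and to the client-side
capture, and both have to pass.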
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/big_tcp_tunnels.sh | 145 ++++++++++++++++++
2 files changed, 146 insertions(+)
create mode 100755 tools/testing/selftests/net/big_tcp_tunnels.sh
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index e97c90886f34..0134fa84fc2c 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -13,6 +13,7 @@ TEST_PROGS := \
arp_ndisc_untracked_subnets.sh \
bareudp.sh \
big_tcp.sh \
+ big_tcp_tunnels.sh \
bind_bhash.sh \
bpf_offload.py \
broadcast_ether_dst.sh \
diff --git a/tools/testing/selftests/net/big_tcp_tunnels.sh b/tools/testing/selftests/net/big_tcp_tunnels.sh
new file mode 100755
index 000000000000..6896bf36ee00
--- /dev/null
+++ b/tools/testing/selftests/net/big_tcp_tunnels.sh
@@ -0,0 +1,145 @@
+#!/usr/bin/env bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Testing for IPv4 and IPv6 BIG TCP over VXLAN and GENEVE tunnels.
+
+SERVER_NS=$(mktemp -u server-XXXXXXXX)
+SERVER_IP4="192.168.1.1"
+SERVER_IP6="2001:db8::1:1"
+SERVER_IP4_TUN="192.168.2.1"
+SERVER_IP6_TUN="2001:db8::2:1"
+
+CLIENT_NS=$(mktemp -u client-XXXXXXXX)
+CLIENT_IP4="192.168.1.2"
+CLIENT_IP6="2001:db8::1:2"
+CLIENT_IP4_TUN="192.168.2.2"
+CLIENT_IP6_TUN="2001:db8::2:2"
+
+PACKETS_THRESHOLD=10000
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+setup() {
+ ip netns add "$SERVER_NS"
+ ip netns add "$CLIENT_NS"
+ ip -netns "$SERVER_NS" link add link1 type veth peer name link0 netns "$CLIENT_NS"
+
+ ip -netns "$CLIENT_NS" link set link0 up
+ ip -netns "$CLIENT_NS" addr replace "$CLIENT_IP4/24" dev link0
+ ip -netns "$CLIENT_NS" addr replace "$CLIENT_IP6/112" dev link0 nodad
+ ip -netns "$CLIENT_NS" link set link0 \
+ gso_max_size 196608 gso_ipv4_max_size 196608 \
+ gro_max_size 196608 gro_ipv4_max_size 196608
+ ip -netns "$SERVER_NS" link set link1 up
+ ip -netns "$SERVER_NS" addr replace "$SERVER_IP4/24" dev link1
+ ip -netns "$SERVER_NS" addr replace "$SERVER_IP6/112" dev link1 nodad
+ ip -netns "$SERVER_NS" link set link1 \
+ gso_max_size 196608 gso_ipv4_max_size 196608 \
+ gro_max_size 196608 gro_ipv4_max_size 196608
+
+ ip netns exec "$SERVER_NS" netserver 2>&1 >/dev/null
+}
+
+setup_tunnel() {
+ if [ "$2" = 4 ]; then
+ SERVER_IP="$SERVER_IP4"
+ CLIENT_IP="$CLIENT_IP4"
+ echo "Setting up ${1^^} over IPv4"
+ else
+ SERVER_IP="$SERVER_IP6"
+ CLIENT_IP="$CLIENT_IP6"
+ echo "Setting up ${1^^} over IPv6"
+ fi
+
+ if [ "$1" = vxlan ]; then
+ ip -netns "$CLIENT_NS" link add tun0 type vxlan \
+ id 5001 remote "$SERVER_IP" local "$CLIENT_IP" dev link0 dstport 4789
+ else
+ ip -netns "$CLIENT_NS" link add tun0 type geneve \
+ id 5001 remote "$SERVER_IP"
+ fi
+ ip -netns "$CLIENT_NS" link set tun0 up
+ ip -netns "$CLIENT_NS" addr replace "$CLIENT_IP4_TUN/24" dev tun0
+ ip -netns "$CLIENT_NS" addr replace "$CLIENT_IP6_TUN/112" dev tun0 nodad
+ ip -netns "$CLIENT_NS" link set tun0 \
+ gso_max_size 196608 gso_ipv4_max_size 196608 \
+ gro_max_size 196608 gro_ipv4_max_size 196608
+ if [ "$1" = vxlan ]; then
+ ip -netns "$SERVER_NS" link add tun1 type vxlan \
+ id 5001 remote "$CLIENT_IP" local "$SERVER_IP" dev link1 dstport 4789
+ else
+ ip -netns "$SERVER_NS" link add tun1 type geneve \
+ id 5001 remote "$CLIENT_IP"
+ fi
+ ip -netns "$SERVER_NS" link set tun1 up
+ ip -netns "$SERVER_NS" addr replace "$SERVER_IP4_TUN/24" dev tun1
+ ip -netns "$SERVER_NS" addr replace "$SERVER_IP6_TUN/112" dev tun1 nodad
+ ip -netns "$SERVER_NS" link set tun1 \
+ gso_max_size 196608 gso_ipv4_max_size 196608 \
+ gro_max_size 196608 gro_ipv4_max_size 196608
+}
+
+cleanup_tunnel() {
+ ip -netns "$CLIENT_NS" link del tun0
+ ip -netns "$SERVER_NS" link del tun1
+}
+
+cleanup() {
+ ip netns exec "$SERVER_NS" killall netserver
+ ip netns del "$SERVER_NS"
+ ip netns del "$CLIENT_NS"
+}
+
+do_test() {
+ exec 3< <(ip netns exec "$SERVER_NS" tcpdump -nn -i link1 greater 65536 2> /dev/null)
+ TCPDUMP_SERVER_PID="$!"
+ exec 4< <(wc -l <&3)
+ exec 5< <(ip netns exec "$CLIENT_NS" tcpdump -nn -i link0 greater 65536 2> /dev/null)
+ TCPDUMP_CLIENT_PID="$!"
+ exec 6< <(wc -l <&5)
+
+ if [ "$1" = 4 ]; then
+ SERVER_IP="$SERVER_IP4_TUN"
+ echo "Running IPv4 traffic in the tunnel"
+ else
+ SERVER_IP="$SERVER_IP6_TUN"
+ echo "Running IPv6 traffic in the tunnel"
+ fi
+
+ ip netns exec "$CLIENT_NS" netperf -t TCP_STREAM -l 5 -H "$SERVER_IP" -- \
+ -r 80000:80000 > /dev/null
+ kill "$TCPDUMP_SERVER_PID" "$TCPDUMP_CLIENT_PID"
+ wait "$TCPDUMP_SERVER_PID" "$TCPDUMP_CLIENT_PID"
+ PACKETS_SERVER=$(cat <&4)
+ PACKETS_CLIENT=$(cat <&6)
+ exec 3>&- 4>&- 5>&- 6>&-
+
+ # One line is empty, each packet is two lines (inner and outer).
+ echo "Captured BIG TCP GRO packets: $(((PACKETS_SERVER - 1) / 2))"
+ echo "Captured BIG TCP GSO packets: $(((PACKETS_CLIENT - 1) / 2))"
+ [ "$PACKETS_SERVER" -gt "$(( PACKETS_THRESHOLD * 2 + 1))" ] || return 1
+ [ "$PACKETS_CLIENT" -gt "$(( PACKETS_THRESHOLD * 2 + 1))" ] || return 1
+}
+
+if ! netperf -V &> /dev/null; then
+ echo "SKIP: Could not run test without netperf tool"
+ exit "$ksft_skip"
+fi
+
+if ! ip link help 2>&1 | grep gso_ipv4_max_size &> /dev/null; then
+ echo "SKIP: Could not run test without gso/gro_ipv4_max_size supported in ip-link"
+ exit "$ksft_skip"
+fi
+
+trap cleanup EXIT
+setup
+for tunnel in vxlan geneve; do
+ for tun_family in 4 6; do
+ for traffic_family in 4 6; do
+ setup_tunnel "$tunnel" "$tun_family" || exit "$?"
+ do_test "$traffic_family" || exit "$?"
+ cleanup_tunnel
+ done
+ done
+done
--
2.52.0
* Re: [PATCH net-next v2 04/12] net: Use helpers to get/set UDP len tree-wide
2026-02-26 20:15 ` [PATCH net-next v2 04/12] net: Use helpers to get/set UDP len tree-wide Alice Mikityanska
@ 2026-02-26 20:19 ` Alice Mikityanska
0 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:19 UTC (permalink / raw)
To: Alice Mikityanska
Cc: Daniel Borkmann, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov, Shuah Khan, Stanislav Fomichev, Andrew Lunn,
Simon Horman, Florian Westphal, netdev
On Thu, 26 Feb 2026 at 22:16, Alice Mikityanska
<alice.kernel@fastmail.im> wrote:
>
> From: Alice Mikityanska <alice@isovalent.com>
>
> Since BIG TCP for UDP tunnels will start using len=0 in the UDP header
> as an indicator of a GSO packet bigger than 65535 bytes, this commit
> introduces the following getter and setters to use tree-wide, in order
> to explicitly mark places where len=0 may be expected, and handle them
> properly:
>
> 1. udp_get_len_short() returns len in host byte order: to be used on the
> RX side to deal with non-aggregated packets, or to access the raw value
> of the len field.
>
> 2. udp_set_len() sets uh->len to its real value if it's not bigger than
> 65535, and to 0 otherwise: to be used in GSO context with aggregated
> packets.
>
> 3. udp_set_len_short() is to be used when the length is known to fit 16
> bits. It WARNs when the caller tries to assign a bigger value if
> CONFIG_DEBUG_NET=y.
>
> At the moment udp_set_len() is not used, a following commit will start
> using it after enabling len>65535 for GSO.
>
> Raw uh->len (in network byte order) is still accessed in a few places
> for checksum calculation purposes, and in __udp6_lib_rcv and nsim_do_psp
> to decode len=0 (__udp4_lib_rcv will be modified to parse len=0 in the
> corresponding commit).
>
> Signed-off-by: Alice Mikityanska <alice@isovalent.com>
I went over all usages of uh->len.
I think I missed the fraglist case in udp4_gro_complete last time; we
should use udp_set_len there too. I picked the places where udp_set_len
is necessary at my best discretion. In addition to UDP tunnels, I also
added it to FOU (BIG TCP is blocked by tso_max_size, but lifting this
limitation seems to make it work), IPVS and netfilter mangle (they
handle GSO packets), and psp_write_headers (can be called from mlx5,
but I don't know if it's needed in practice).
I haven't added it to l2tp_xmit_core, as I think it doesn't operate on
GSO packets, but if I'm wrong, please correct me.
Pktgen assigns udph->len (and then iph->payload_len) seemingly without
range checking, so I didn't use a helper there to avoid emitting a WARN
if the user sets a value too big.
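For readers skimming the quoted commit message, the intended semantics of
the three helpers can be sketched in userspace C as follows. This is only a
sketch under assumptions: the struct and the sketch_* names are hypothetical,
the real definitions are the ones added to include/linux/udp.h, and the
truncating store in sketch_set_len_short() is an assumption about what
happens after the WARN.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Userspace stand-in for struct udphdr; fields are in network byte order. */
struct udp_hdr_sketch {
	uint16_t source;
	uint16_t dest;
	uint16_t len;
	uint16_t check;
};

/* Like udp_get_len_short(): raw length field in host byte order. */
static unsigned int sketch_get_len_short(const struct udp_hdr_sketch *uh)
{
	return ntohs(uh->len);
}

/* Like udp_set_len(): real value if it fits 16 bits, 0 otherwise. */
static void sketch_set_len(struct udp_hdr_sketch *uh, unsigned int len)
{
	uh->len = len > UINT16_MAX ? 0 : htons((uint16_t)len);
}

/*
 * Like udp_set_len_short(): the caller promises that len fits 16 bits.
 * Returns nonzero where the kernel would WARN under CONFIG_DEBUG_NET=y.
 */
static int sketch_set_len_short(struct udp_hdr_sketch *uh, unsigned int len)
{
	int too_big = len > UINT16_MAX;

	uh->len = htons((uint16_t)len);	/* assumed: truncating store */
	return too_big;
}
```

The split keeps the len=0 case confined to callers that can actually see
aggregated packets, while every other writer states that its length is
short.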
> ---
> drivers/infiniband/core/lag.c | 2 +-
> drivers/infiniband/sw/rxe/rxe_net.c | 4 +-
> drivers/net/amt.c | 6 +--
> drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +-
> drivers/net/ethernet/intel/iavf/iavf_txrx.c | 2 +-
> drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +-
> drivers/net/ethernet/intel/idpf/idpf_txrx.c | 2 +-
> .../marvell/octeontx2/nic/otx2_txrx.c | 2 +-
> .../net/ethernet/mellanox/mlx5/core/en_rx.c | 4 +-
> .../ethernet/mellanox/mlx5/core/en_selftest.c | 2 +-
> drivers/net/ethernet/sfc/falcon/selftest.c | 4 +-
> drivers/net/ethernet/sfc/selftest.c | 4 +-
> drivers/net/ethernet/sfc/siena/selftest.c | 4 +-
> drivers/net/ethernet/sfc/tc_encap_actions.c | 2 +-
> .../stmicro/stmmac/stmmac_selftests.c | 4 +-
> drivers/net/geneve.c | 2 +-
> drivers/net/netdevsim/dev.c | 2 +-
> drivers/net/netdevsim/psample.c | 2 +-
> drivers/net/netdevsim/psp.c | 8 ++--
> drivers/net/wireguard/receive.c | 2 +-
> include/linux/udp.h | 16 ++++++++
> include/net/udplite.h | 4 +-
> include/trace/events/icmp.h | 2 +-
> lib/tests/blackhole_dev_kunit.c | 2 +-
> net/6lowpan/nhc_udp.c | 10 ++---
> net/core/netpoll.c | 2 +-
> net/core/pktgen.c | 4 +-
> net/core/selftests.c | 4 +-
> net/core/tso.c | 3 +-
> net/ipv4/esp4.c | 2 +-
> net/ipv4/fou_core.c | 2 +-
> net/ipv4/ipconfig.c | 6 +--
> net/ipv4/netfilter/nf_nat_snmp_basic_main.c | 4 +-
> net/ipv4/route.c | 2 +-
> net/ipv4/udp.c | 3 +-
> net/ipv4/udp_offload.c | 37 +++++++++----------
> net/ipv4/udp_tunnel_core.c | 2 +-
> net/ipv6/esp6.c | 5 ++-
> net/ipv6/fou6.c | 2 +-
> net/ipv6/ip6_udp_tunnel.c | 2 +-
> net/ipv6/udp.c | 3 +-
> net/ipv6/udp_offload.c | 2 +-
> net/l2tp/l2tp_core.c | 2 +-
> net/netfilter/ipvs/ip_vs_xmit.c | 2 +-
> net/netfilter/nf_conntrack_proto_udp.c | 17 +++++++--
> net/netfilter/nf_log_syslog.c | 2 +-
> net/netfilter/nf_nat_helper.c | 2 +-
> net/psp/psp_main.c | 2 +-
> net/sched/act_csum.c | 4 +-
> net/xfrm/xfrm_nat_keepalive.c | 2 +-
> 50 files changed, 123 insertions(+), 91 deletions(-)
>
> diff --git a/drivers/infiniband/core/lag.c b/drivers/infiniband/core/lag.c
> index 8fd80adfe833..00fe241737ff 100644
> --- a/drivers/infiniband/core/lag.c
> +++ b/drivers/infiniband/core/lag.c
> @@ -36,7 +36,7 @@ static struct sk_buff *rdma_build_skb(struct net_device *netdev,
> uh->source =
> htons(rdma_flow_label_to_udp_sport(ah_attr->grh.flow_label));
> uh->dest = htons(ROCE_V2_UDP_DPORT);
> - uh->len = htons(sizeof(struct udphdr));
> + udp_set_len_short(uh, sizeof(struct udphdr));
>
> if (is_ipv4) {
> skb_push(skb, sizeof(struct iphdr));
> diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
> index 0bd0902b11f7..7508d2c3a306 100644
> --- a/drivers/infiniband/sw/rxe/rxe_net.c
> +++ b/drivers/infiniband/sw/rxe/rxe_net.c
> @@ -237,7 +237,7 @@ static int rxe_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
> pkt->port_num = 1;
> pkt->hdr = (u8 *)(udph + 1);
> pkt->mask = RXE_GRH_MASK;
> - pkt->paylen = be16_to_cpu(udph->len) - sizeof(*udph);
> + pkt->paylen = udp_get_len_short(udph) - sizeof(*udph);
>
> /* remove udp header */
> skb_pull(skb, sizeof(struct udphdr));
> @@ -300,7 +300,7 @@ static void prepare_udp_hdr(struct sk_buff *skb, __be16 src_port,
>
> udph->dest = dst_port;
> udph->source = src_port;
> - udph->len = htons(skb->len);
> + udp_set_len_short(udph, skb->len);
> udph->check = 0;
> }
>
> diff --git a/drivers/net/amt.c b/drivers/net/amt.c
> index f2f3139e38a5..01511eca7d84 100644
> --- a/drivers/net/amt.c
> +++ b/drivers/net/amt.c
> @@ -667,7 +667,7 @@ static void amt_send_discovery(struct amt_dev *amt)
> udph = udp_hdr(skb);
> udph->source = amt->gw_port;
> udph->dest = amt->relay_port;
> - udph->len = htons(sizeof(*udph) + sizeof(*amtd));
> + udp_set_len_short(udph, sizeof(*udph) + sizeof(*amtd));
> udph->check = 0;
> offset = skb_transport_offset(skb);
> skb->csum = skb_checksum(skb, offset, skb->len - offset, 0);
> @@ -758,7 +758,7 @@ static void amt_send_request(struct amt_dev *amt, bool v6)
> udph = udp_hdr(skb);
> udph->source = amt->gw_port;
> udph->dest = amt->relay_port;
> - udph->len = htons(sizeof(*amtrh) + sizeof(*udph));
> + udp_set_len_short(udph, sizeof(*amtrh) + sizeof(*udph));
> udph->check = 0;
> offset = skb_transport_offset(skb);
> skb->csum = skb_checksum(skb, offset, skb->len - offset, 0);
> @@ -2608,7 +2608,7 @@ static void amt_send_advertisement(struct amt_dev *amt, __be32 nonce,
> udph = udp_hdr(skb);
> udph->source = amt->relay_port;
> udph->dest = dport;
> - udph->len = htons(sizeof(*amta) + sizeof(*udph));
> + udp_set_len_short(udph, sizeof(*amta) + sizeof(*udph));
> udph->check = 0;
> offset = skb_transport_offset(skb);
> skb->csum = skb_checksum(skb, offset, skb->len - offset, 0);
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index 34db7d8866b0..63433279e3c3 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -3128,7 +3128,7 @@ static int i40e_tso(struct i40e_tx_buffer *first, u8 *hdr_len,
> SKB_GSO_UDP_TUNNEL_CSUM)) {
> if (!(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
> (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)) {
> - l4.udp->len = 0;
> + udp_set_len_short(l4.udp, 0);
>
> /* determine offset of outer transport header */
> l4_offset = l4.hdr - skb->data;
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
> index 363c42bf3dcf..c30abf17cf5d 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
> @@ -1774,7 +1774,7 @@ static int iavf_tso(struct iavf_tx_buffer *first, u8 *hdr_len,
> SKB_GSO_UDP_TUNNEL_CSUM)) {
> if (!(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
> (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)) {
> - l4.udp->len = 0;
> + udp_set_len_short(l4.udp, 0);
>
> /* determine offset of outer transport header */
> l4_offset = l4.hdr - skb->data;
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index a5bbce68f76c..b45db305dd64 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -1882,7 +1882,7 @@ int ice_tso(struct ice_tx_buf *first, struct ice_tx_offload_params *off)
> SKB_GSO_UDP_TUNNEL_CSUM)) {
> if (!(skb_shinfo(skb)->gso_type & SKB_GSO_PARTIAL) &&
> (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)) {
> - l4.udp->len = 0;
> + udp_set_len_short(l4.udp, 0);
>
> /* determine offset of outer transport header */
> l4_start = (u8)(l4.hdr - skb->data);
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> index 05a162094d10..3a160801e3b8 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> @@ -2865,7 +2865,7 @@ int idpf_tso(struct sk_buff *skb, struct idpf_tx_offload_params *off)
> (__force __wsum)htonl(paylen));
> /* compute length of segmentation header */
> off->tso_hdr_len = sizeof(struct udphdr) + l4_start;
> - l4.udp->len = htons(shinfo->gso_size + sizeof(struct udphdr));
> + udp_set_len_short(l4.udp, shinfo->gso_size + sizeof(struct udphdr));
> break;
> default:
> return -EINVAL;
> diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
> index 625bb5a05344..8d2d607bc92f 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
> @@ -750,7 +750,7 @@ static void otx2_sqe_add_ext(struct otx2_nic *pfvf, struct otx2_snd_queue *sq,
> ext->lso_format = pfvf->hw.lso_udpv6_idx;
> }
>
> - udph->len = htons(sizeof(struct udphdr));
> + udp_set_len_short(udph, sizeof(struct udphdr));
> }
> } else if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {
> ext->tstmp = 1;
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index 8fb57a4f36dd..6bb1971083c2 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -1036,7 +1036,7 @@ static void mlx5e_shampo_update_ipv4_udp_hdr(struct mlx5e_rq *rq, struct iphdr *
> struct udphdr *uh;
>
> uh = (struct udphdr *)(skb->data + udp_off);
> - uh->len = htons(skb->len - udp_off);
> + udp_set_len_short(uh, skb->len - udp_off);
>
> if (uh->check)
> uh->check = ~udp_v4_check(skb->len - udp_off, ipv4->saddr,
> @@ -1055,7 +1055,7 @@ static void mlx5e_shampo_update_ipv6_udp_hdr(struct mlx5e_rq *rq, struct ipv6hdr
> struct udphdr *uh;
>
> uh = (struct udphdr *)(skb->data + udp_off);
> - uh->len = htons(skb->len - udp_off);
> + udp_set_len_short(uh, skb->len - udp_off);
>
> if (uh->check)
> uh->check = ~udp_v6_check(skb->len - udp_off, &ipv6->saddr,
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
> index accc26d1a872..1dcdb86690bb 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
> @@ -113,7 +113,7 @@ static struct sk_buff *mlx5e_test_get_udp_skb(struct mlx5e_priv *priv)
> /* Fill UDP header */
> udph->source = htons(9);
> udph->dest = htons(9); /* Discard Protocol */
> - udph->len = htons(sizeof(struct mlx5ehdr) + sizeof(struct udphdr));
> + udp_set_len_short(udph, sizeof(struct mlx5ehdr) + sizeof(struct udphdr));
> udph->check = 0;
>
> /* Fill IP header */
> diff --git a/drivers/net/ethernet/sfc/falcon/selftest.c b/drivers/net/ethernet/sfc/falcon/selftest.c
> index db4dd7fb77f5..4d29e0baf2eb 100644
> --- a/drivers/net/ethernet/sfc/falcon/selftest.c
> +++ b/drivers/net/ethernet/sfc/falcon/selftest.c
> @@ -401,8 +401,8 @@ static void ef4_iterate_state(struct ef4_nic *efx)
>
> /* Initialise udp header */
> payload->udp.source = 0;
> - payload->udp.len = htons(sizeof(*payload) -
> - offsetof(struct ef4_loopback_payload, udp));
> + udp_set_len_short(&payload->udp, sizeof(*payload) -
> + offsetof(struct ef4_loopback_payload, udp));
> payload->udp.check = 0; /* checksum ignored */
>
> /* Fill out payload */
> diff --git a/drivers/net/ethernet/sfc/selftest.c b/drivers/net/ethernet/sfc/selftest.c
> index 8ec76329237a..dc716feb79cb 100644
> --- a/drivers/net/ethernet/sfc/selftest.c
> +++ b/drivers/net/ethernet/sfc/selftest.c
> @@ -398,8 +398,8 @@ static void efx_iterate_state(struct efx_nic *efx)
>
> /* Initialise udp header */
> payload->udp.source = 0;
> - payload->udp.len = htons(sizeof(*payload) -
> - offsetof(struct efx_loopback_payload, udp));
> + udp_set_len_short(&payload->udp, sizeof(*payload) -
> + offsetof(struct efx_loopback_payload, udp));
> payload->udp.check = 0; /* checksum ignored */
>
> /* Fill out payload */
> diff --git a/drivers/net/ethernet/sfc/siena/selftest.c b/drivers/net/ethernet/sfc/siena/selftest.c
> index 930643612df5..c74cf5131364 100644
> --- a/drivers/net/ethernet/sfc/siena/selftest.c
> +++ b/drivers/net/ethernet/sfc/siena/selftest.c
> @@ -399,8 +399,8 @@ static void efx_iterate_state(struct efx_nic *efx)
>
> /* Initialise udp header */
> payload->udp.source = 0;
> - payload->udp.len = htons(sizeof(*payload) -
> - offsetof(struct efx_loopback_payload, udp));
> + udp_set_len_short(&payload->udp, sizeof(*payload) -
> + offsetof(struct efx_loopback_payload, udp));
> payload->udp.check = 0; /* checksum ignored */
>
> /* Fill out payload */
> diff --git a/drivers/net/ethernet/sfc/tc_encap_actions.c b/drivers/net/ethernet/sfc/tc_encap_actions.c
> index da35705cc5e1..b2428e098817 100644
> --- a/drivers/net/ethernet/sfc/tc_encap_actions.c
> +++ b/drivers/net/ethernet/sfc/tc_encap_actions.c
> @@ -312,7 +312,7 @@ static void efx_gen_tun_header_udp(struct efx_tc_encap_action *encap, u8 len)
> encap->encap_hdr_len += sizeof(*udp);
>
> udp->dest = key->tp_dst;
> - udp->len = cpu_to_be16(sizeof(*udp) + len);
> + udp_set_len_short(udp, sizeof(*udp) + len);
> }
>
> static void efx_gen_tun_header_vxlan(struct efx_tc_encap_action *encap)
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
> index a0c75886587c..29e824bd90ca 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c
> @@ -154,9 +154,9 @@ static struct sk_buff *stmmac_test_get_udp_skb(struct stmmac_priv *priv,
> } else {
> uhdr->source = htons(attr->sport);
> uhdr->dest = htons(attr->dport);
> - uhdr->len = htons(sizeof(*shdr) + sizeof(*uhdr) + attr->size);
> + udp_set_len_short(uhdr, sizeof(*shdr) + sizeof(*uhdr) + attr->size);
> if (attr->max_size)
> - uhdr->len = htons(attr->max_size -
> + udp_set_len_short(uhdr, attr->max_size -
> (sizeof(*ihdr) + sizeof(*ehdr)));
> uhdr->check = 0;
> }
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index 7a26e2439d48..5aa5c0e81b12 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -631,7 +631,7 @@ static int geneve_post_decap_hint(const struct sock *sk, struct sk_buff *skb,
>
> /* Adjust the nested UDP header len and checksum. */
> uh = udp_hdr(skb);
> - uh->len = htons(skb->len - gro_hint->nested_tp_offset);
> + udp_set_len_short(uh, skb->len - gro_hint->nested_tp_offset);
> if (uh->check) {
> len = skb->len - gro_hint->nested_nh_offset;
> skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_TUNNEL_CSUM;
> diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
> index e82de0fd3157..9f10ff039252 100644
> --- a/drivers/net/netdevsim/dev.c
> +++ b/drivers/net/netdevsim/dev.c
> @@ -845,7 +845,7 @@ static struct sk_buff *nsim_dev_trap_skb_build(void)
> udph = skb_put_zero(skb, sizeof(struct udphdr) + data_len);
> get_random_bytes(&udph->source, sizeof(u16));
> get_random_bytes(&udph->dest, sizeof(u16));
> - udph->len = htons(sizeof(struct udphdr) + data_len);
> + udp_set_len_short(udph, sizeof(struct udphdr) + data_len);
>
> return skb;
> }
> diff --git a/drivers/net/netdevsim/psample.c b/drivers/net/netdevsim/psample.c
> index 47d24bc64ee4..ef3116686707 100644
> --- a/drivers/net/netdevsim/psample.c
> +++ b/drivers/net/netdevsim/psample.c
> @@ -73,7 +73,7 @@ static struct sk_buff *nsim_dev_psample_skb_build(void)
> udph = skb_put_zero(skb, sizeof(struct udphdr) + data_len);
> get_random_bytes(&udph->source, sizeof(u16));
> get_random_bytes(&udph->dest, sizeof(u16));
> - udph->len = htons(sizeof(struct udphdr) + data_len);
> + udp_set_len_short(udph, sizeof(struct udphdr) + data_len);
>
> return skb;
> }
> diff --git a/drivers/net/netdevsim/psp.c b/drivers/net/netdevsim/psp.c
> index 0b4d717253b0..e81b69d6a577 100644
> --- a/drivers/net/netdevsim/psp.c
> +++ b/drivers/net/netdevsim/psp.c
> @@ -84,6 +84,7 @@ nsim_do_psp(struct sk_buff *skb, struct netdevsim *ns,
> struct iphdr *iph;
> struct udphdr *uh;
> __wsum csum;
> + u16 udplen;
>
> /* Do not decapsulate. Receive the skb with the udp and psp
> * headers still there as if this is a normal udp packet.
> @@ -91,19 +92,20 @@ nsim_do_psp(struct sk_buff *skb, struct netdevsim *ns,
> * provide a valid checksum here, so the skb isn't dropped.
> */
> uh = udp_hdr(skb);
> + udplen = ntohs(uh->len) ?: skb->len;
> csum = skb_checksum(skb, skb_transport_offset(skb),
> - ntohs(uh->len), 0);
> + udplen, 0);
>
> switch (skb->protocol) {
> case htons(ETH_P_IP):
> iph = ip_hdr(skb);
> - uh->check = udp_v4_check(ntohs(uh->len), iph->saddr,
> + uh->check = udp_v4_check(udplen, iph->saddr,
> iph->daddr, csum);
> break;
> #if IS_ENABLED(CONFIG_IPV6)
> case htons(ETH_P_IPV6):
> ip6h = ipv6_hdr(skb);
> - uh->check = udp_v6_check(ntohs(uh->len), &ip6h->saddr,
> + uh->check = udp_v6_check(udplen, &ip6h->saddr,
> &ip6h->daddr, csum);
> break;
> #endif
> diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
> index eb8851113654..275fe1bc994c 100644
> --- a/drivers/net/wireguard/receive.c
> +++ b/drivers/net/wireguard/receive.c
> @@ -62,7 +62,7 @@ static int prepare_skb_header(struct sk_buff *skb, struct wg_device *wg)
> * to have UDP fields.
> */
> return -EINVAL;
> - data_len = ntohs(udp->len);
> + data_len = udp_get_len_short(udp); /* GRO not expected here. */
> if (unlikely(data_len < sizeof(struct udphdr) ||
> data_len > skb->len - data_offset))
> /* UDP packet is reporting too small of a size or lying about
> diff --git a/include/linux/udp.h b/include/linux/udp.h
> index 1cbf6b4d3aab..3a79fa23918f 100644
> --- a/include/linux/udp.h
> +++ b/include/linux/udp.h
> @@ -23,6 +23,22 @@ static inline struct udphdr *udp_hdr(const struct sk_buff *skb)
> return (struct udphdr *)skb_transport_header(skb);
> }
>
> +static inline unsigned int udp_get_len_short(const struct udphdr *uh)
> +{
> + return ntohs(uh->len);
> +}
> +
> +static inline void udp_set_len(struct udphdr *uh, unsigned int len)
> +{
> + uh->len = len < GRO_LEGACY_MAX_SIZE ? htons(len) : 0;
> +}
> +
> +static inline void udp_set_len_short(struct udphdr *uh, unsigned int len)
> +{
> + DEBUG_NET_WARN_ON_ONCE(len >= GRO_LEGACY_MAX_SIZE);
> + uh->len = htons(len);
> +}
> +
> #define UDP_HTABLE_SIZE_MIN_PERNET 128
> #define UDP_HTABLE_SIZE_MIN (IS_ENABLED(CONFIG_BASE_SMALL) ? 128 : 256)
> #define UDP_HTABLE_SIZE_MAX 65536
> diff --git a/include/net/udplite.h b/include/net/udplite.h
> index 786919d29f8d..b7148bc5b7c3 100644
> --- a/include/net/udplite.h
> +++ b/include/net/udplite.h
> @@ -40,7 +40,7 @@ static inline int udplite_checksum_init(struct sk_buff *skb, struct udphdr *uh)
> return 1;
> }
>
> - cscov = ntohs(uh->len);
> + cscov = udp_get_len_short(uh);
>
> if (cscov == 0) /* Indicates that full coverage is required. */
> ;
> @@ -76,7 +76,7 @@ static inline __wsum udplite_csum(struct sk_buff *skb)
> if (pcslen < len) {
> if (pcslen > 0)
> len = pcslen;
> - udp_hdr(skb)->len = htons(pcslen);
> + udp_set_len_short(udp_hdr(skb), pcslen);
> }
> }
> skb->ip_summed = CHECKSUM_NONE; /* no HW support for checksumming */
> diff --git a/include/trace/events/icmp.h b/include/trace/events/icmp.h
> index 31559796949a..09ae115099df 100644
> --- a/include/trace/events/icmp.h
> +++ b/include/trace/events/icmp.h
> @@ -44,7 +44,7 @@ TRACE_EVENT(icmp_send,
> } else {
> __entry->sport = ntohs(uh->source);
> __entry->dport = ntohs(uh->dest);
> - __entry->ulen = ntohs(uh->len);
> + __entry->ulen = udp_get_len_short(uh);
> }
>
> p32 = (__be32 *) __entry->saddr;
> diff --git a/lib/tests/blackhole_dev_kunit.c b/lib/tests/blackhole_dev_kunit.c
> index 06834ab35f43..fa3e0533038d 100644
> --- a/lib/tests/blackhole_dev_kunit.c
> +++ b/lib/tests/blackhole_dev_kunit.c
> @@ -46,7 +46,7 @@ static void test_blackholedev(struct kunit *test)
> uh = (struct udphdr *)skb_push(skb, sizeof(struct udphdr));
> skb_set_transport_header(skb, 0);
> uh->source = uh->dest = htons(UDP_PORT);
> - uh->len = htons(data_len);
> + udp_set_len_short(uh, data_len);
> uh->check = 0;
> /* (Network) IPv6 */
> ip6h = (struct ipv6hdr *)skb_push(skb, sizeof(struct ipv6hdr));
> diff --git a/net/6lowpan/nhc_udp.c b/net/6lowpan/nhc_udp.c
> index 0a506c77283d..ed4227e6db74 100644
> --- a/net/6lowpan/nhc_udp.c
> +++ b/net/6lowpan/nhc_udp.c
> @@ -88,16 +88,16 @@ static int udp_uncompress(struct sk_buff *skb, size_t needed)
> switch (lowpan_dev(skb->dev)->lltype) {
> case LOWPAN_LLTYPE_IEEE802154:
> if (lowpan_802154_cb(skb)->d_size)
> - uh.len = htons(lowpan_802154_cb(skb)->d_size -
> - sizeof(struct ipv6hdr));
> + udp_set_len_short(&uh, lowpan_802154_cb(skb)->d_size -
> + sizeof(struct ipv6hdr));
> else
> - uh.len = htons(skb->len + sizeof(struct udphdr));
> + udp_set_len_short(&uh, skb->len + sizeof(struct udphdr));
> break;
> default:
> - uh.len = htons(skb->len + sizeof(struct udphdr));
> + udp_set_len_short(&uh, skb->len + sizeof(struct udphdr));
> break;
> }
> - pr_debug("uncompressed UDP length: src = %d", ntohs(uh.len));
> + pr_debug("uncompressed UDP length: src = %d", udp_get_len_short(&uh));
>
> /* replace the compressed UDP head by the uncompressed UDP
> * header
> diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> index a8558a52884f..a3f737974d51 100644
> --- a/net/core/netpoll.c
> +++ b/net/core/netpoll.c
> @@ -474,7 +474,7 @@ static void push_udp(struct netpoll *np, struct sk_buff *skb, int len)
> udph = udp_hdr(skb);
> udph->source = htons(np->local_port);
> udph->dest = htons(np->remote_port);
> - udph->len = htons(udp_len);
> + udp_set_len_short(udph, udp_len);
>
> netpoll_udp_checksum(np, skb, len);
> }
> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> index 8e185b318288..5b4dd04d6124 100644
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -3005,7 +3005,7 @@ static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
>
> udph->source = htons(pkt_dev->cur_udp_src);
> udph->dest = htons(pkt_dev->cur_udp_dst);
> - udph->len = htons(datalen + 8); /* DATA + udphdr */
> + udp_set_len_short(udph, datalen + 8); /* DATA + udphdr */
> udph->check = 0;
>
> iph->ihl = 5;
> @@ -3138,7 +3138,7 @@ static struct sk_buff *fill_packet_ipv6(struct net_device *odev,
> udplen = datalen + sizeof(struct udphdr);
> udph->source = htons(pkt_dev->cur_udp_src);
> udph->dest = htons(pkt_dev->cur_udp_dst);
> - udph->len = htons(udplen);
> + udp_set_len_short(udph, udplen);
> udph->check = 0;
>
> *(__be32 *) iph = htonl(0x60000000); /* Version + flow */
> diff --git a/net/core/selftests.c b/net/core/selftests.c
> index 0a203d3fb9dc..36b949ae520b 100644
> --- a/net/core/selftests.c
> +++ b/net/core/selftests.c
> @@ -72,9 +72,9 @@ struct sk_buff *net_test_get_skb(struct net_device *ndev, u8 id,
> } else {
> uhdr->source = htons(attr->sport);
> uhdr->dest = htons(attr->dport);
> - uhdr->len = htons(sizeof(*shdr) + sizeof(*uhdr) + attr->size);
> + udp_set_len_short(uhdr, sizeof(*shdr) + sizeof(*uhdr) + attr->size);
> if (attr->max_size)
> - uhdr->len = htons(attr->max_size -
> + udp_set_len_short(uhdr, attr->max_size -
> (sizeof(*ihdr) + sizeof(*ehdr)));
> uhdr->check = 0;
> }
> diff --git a/net/core/tso.c b/net/core/tso.c
> index 6df997b9076e..3cc5a03e7a12 100644
> --- a/net/core/tso.c
> +++ b/net/core/tso.c
> @@ -38,7 +38,8 @@ void tso_build_hdr(const struct sk_buff *skb, char *hdr, struct tso_t *tso,
> } else {
> struct udphdr *uh = (struct udphdr *)hdr;
>
> - uh->len = htons(sizeof(*uh) + size);
> + /* size is after segmentation. */
> + udp_set_len_short(uh, sizeof(*uh) + size);
> }
> }
> EXPORT_SYMBOL(tso_build_hdr);
> diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
> index 2c922afadb8f..25ee1ea9f548 100644
> --- a/net/ipv4/esp4.c
> +++ b/net/ipv4/esp4.c
> @@ -317,7 +317,7 @@ static struct ip_esp_hdr *esp_output_udp_encap(struct sk_buff *skb,
> uh = (struct udphdr *)esp->esph;
> uh->source = sport;
> uh->dest = dport;
> - uh->len = htons(len);
> + udp_set_len_short(uh, len);
> uh->check = 0;
>
> /* For IPv4 ESP with UDP encapsulation, if xo is not null, the skb is in the crypto offload
> diff --git a/net/ipv4/fou_core.c b/net/ipv4/fou_core.c
> index 3baaa4df7e42..7aeb6efbfc44 100644
> --- a/net/ipv4/fou_core.c
> +++ b/net/ipv4/fou_core.c
> @@ -1043,7 +1043,7 @@ static void fou_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
>
> uh->dest = e->dport;
> uh->source = sport;
> - uh->len = htons(skb->len);
> + udp_set_len_short(uh, skb->len);
> udp_set_csum(!(e->flags & TUNNEL_ENCAP_FLAG_CSUM), skb,
> fl4->saddr, fl4->daddr, skb->len);
>
> diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
> index a35ffedacc7c..155db067eaec 100644
> --- a/net/ipv4/ipconfig.c
> +++ b/net/ipv4/ipconfig.c
> @@ -847,7 +847,7 @@ static void __init ic_bootp_send_if(struct ic_device *d, unsigned long jiffies_d
> /* Construct UDP header */
> b->udph.source = htons(68);
> b->udph.dest = htons(67);
> - b->udph.len = htons(sizeof(struct bootp_pkt) - sizeof(struct iphdr));
> + udp_set_len_short(&b->udph, sizeof(struct bootp_pkt) - sizeof(struct iphdr));
> /* UDP checksum not calculated -- explicitly allowed in BOOTP RFC */
>
> /* Construct DHCP/BOOTP header */
> @@ -1025,10 +1025,10 @@ static int __init ic_bootp_recv(struct sk_buff *skb, struct net_device *dev, str
> if (b->udph.source != htons(67) || b->udph.dest != htons(68))
> goto drop;
>
> - if (ntohs(h->tot_len) < ntohs(b->udph.len) + sizeof(struct iphdr))
> + if (ntohs(h->tot_len) < udp_get_len_short(&b->udph) + sizeof(struct iphdr))
> goto drop;
>
> - len = ntohs(b->udph.len) - sizeof(struct udphdr);
> + len = udp_get_len_short(&b->udph) - sizeof(struct udphdr);
> ext_len = len - (sizeof(*b) -
> sizeof(struct iphdr) -
> sizeof(struct udphdr) -
> diff --git a/net/ipv4/netfilter/nf_nat_snmp_basic_main.c b/net/ipv4/netfilter/nf_nat_snmp_basic_main.c
> index 717b726504fe..afe0f4a328d0 100644
> --- a/net/ipv4/netfilter/nf_nat_snmp_basic_main.c
> +++ b/net/ipv4/netfilter/nf_nat_snmp_basic_main.c
> @@ -127,7 +127,7 @@ static int snmp_translate(struct nf_conn *ct, int dir, struct sk_buff *skb)
> {
> struct iphdr *iph = ip_hdr(skb);
> struct udphdr *udph = (struct udphdr *)((__be32 *)iph + iph->ihl);
> - u16 datalen = ntohs(udph->len) - sizeof(struct udphdr);
> + u16 datalen = udp_get_len_short(udph) - sizeof(struct udphdr);
> char *data = (unsigned char *)udph + sizeof(struct udphdr);
> struct snmp_ctx ctx;
> int ret;
> @@ -181,7 +181,7 @@ static int help(struct sk_buff *skb, unsigned int protoff,
> * enough room for a UDP header. Just verify the UDP length field so we
> * can mess around with the payload.
> */
> - if (ntohs(udph->len) != skb->len - (iph->ihl << 2)) {
> + if (udp_get_len_short(udph) != skb->len - (iph->ihl << 2)) {
> nf_ct_helper_log(skb, ct, "dropping malformed packet\n");
> return NF_DROP;
> }
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 463236e0dc2d..2eed102231b8 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -3190,7 +3190,7 @@ static struct sk_buff *inet_rtm_getroute_build_skb(__be32 src, __be32 dst,
> udph = skb_put_zero(skb, sizeof(struct udphdr));
> udph->source = sport;
> udph->dest = dport;
> - udph->len = htons(sizeof(struct udphdr));
> + udp_set_len_short(udph, sizeof(struct udphdr));
> udph->check = 0;
> break;
> }
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 6c6b68a66dcd..345ef93001fc 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1133,7 +1133,8 @@ static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4,
> uh = udp_hdr(skb);
> uh->source = inet->inet_sport;
> uh->dest = fl4->fl4_dport;
> - uh->len = htons(len);
> + /* Datagram length checked in udp_sendmsg. */
> + udp_set_len_short(uh, len);
> uh->check = 0;
>
> if (cork->gso_size) {
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index e831234326c4..2f35b485ff40 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -279,11 +279,11 @@ static struct sk_buff *__skb_udp_tunnel_segment(struct sk_buff *skb,
> * segment instead of the entire frame.
> */
> if (gso_partial && skb_is_gso(skb)) {
> - uh->len = htons(skb_shinfo(skb)->gso_size +
> - SKB_GSO_CB(skb)->data_offset +
> - skb->head - (unsigned char *)uh);
> + udp_set_len_short(uh, skb_shinfo(skb)->gso_size +
> + SKB_GSO_CB(skb)->data_offset +
> + skb->head - (unsigned char *)uh);
> } else {
> - uh->len = htons(len);
> + udp_set_len_short(uh, len);
> }
>
> if (!need_csum)
> @@ -469,7 +469,7 @@ static struct sk_buff *__udp_gso_segment_list(struct sk_buff *skb,
> if (IS_ERR(skb))
> return skb;
>
> - udp_hdr(skb)->len = htons(sizeof(struct udphdr) + mss);
> + udp_set_len_short(udp_hdr(skb), sizeof(struct udphdr) + mss);
>
> if (is_ipv6)
> return __udpv6_gso_segment_list_csum(skb);
> @@ -487,8 +487,8 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> unsigned int mss;
> bool copy_dtor;
> __sum16 check;
> - __be16 newlen;
> int ret = 0;
> + u16 newlen;
>
> mss = skb_shinfo(gso_skb)->gso_size;
> if (gso_skb->len <= sizeof(*uh) + mss)
> @@ -565,8 +565,8 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> (skb_shinfo(gso_skb)->tx_flags & SKBTX_ANY_TSTAMP);
>
> /* compute checksum adjustment based on old length versus new */
> - newlen = htons(sizeof(*uh) + mss);
> - check = csum16_add(csum16_sub(uh->check, uh->len), newlen);
> + newlen = sizeof(*uh) + mss;
> + check = csum16_add(csum16_sub(uh->check, uh->len), htons(newlen));
>
> for (;;) {
> if (copy_dtor) {
> @@ -578,7 +578,7 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> if (!seg->next)
> break;
>
> - uh->len = newlen;
> + udp_set_len_short(uh, newlen);
> uh->check = check;
>
> if (seg->ip_summed == CHECKSUM_PARTIAL)
> @@ -592,11 +592,10 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> }
>
> /* last packet can be partial gso_size, account for that in checksum */
> - newlen = htons(skb_tail_pointer(seg) - skb_transport_header(seg) +
> - seg->data_len);
> - check = csum16_add(csum16_sub(uh->check, uh->len), newlen);
> + newlen = skb_tail_pointer(seg) - skb_transport_header(seg) + seg->data_len;
> + check = csum16_add(csum16_sub(uh->check, uh->len), htons(newlen));
>
> - uh->len = newlen;
> + udp_set_len_short(uh, newlen);
> uh->check = check;
>
> if (seg->ip_summed == CHECKSUM_PARTIAL)
> @@ -708,7 +707,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
> }
>
> /* Do not deal with padded or malicious packets, sorry ! */
> - ulen = ntohs(uh->len);
> + ulen = udp_get_len_short(uh);
> if (ulen <= sizeof(*uh) || ulen != skb_gro_len(skb)) {
> NAPI_GRO_CB(skb)->flush = 1;
> return NULL;
> @@ -741,7 +740,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
> * On len mismatch merge the first packet shorter than gso_size,
> * otherwise complete the GRO packet.
> */
> - if (ulen > ntohs(uh2->len) || flush) {
> + if (ulen > udp_get_len_short(uh2) || flush) {
> pp = p;
> } else {
> if (NAPI_GRO_CB(skb)->is_flist) {
> @@ -764,7 +763,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
> }
> }
>
> - if (ret || ulen != ntohs(uh2->len) ||
> + if (ret || ulen != udp_get_len_short(uh2) ||
> NAPI_GRO_CB(p)->count >= UDP_GRO_CNT_MAX)
> pp = p;
>
> @@ -916,12 +915,12 @@ static int udp_gro_complete_segment(struct sk_buff *skb)
> int udp_gro_complete(struct sk_buff *skb, int nhoff,
> udp_lookup_t lookup)
> {
> - __be16 newlen = htons(skb->len - nhoff);
> + unsigned int newlen = skb->len - nhoff;
> struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
> struct sock *sk;
> int err;
>
> - uh->len = newlen;
> + udp_set_len_short(uh, newlen);
>
> sk = INDIRECT_CALL_INET(lookup, udp6_lib_lookup_skb,
> udp4_lib_lookup_skb, skb, uh->source, uh->dest);
> @@ -959,7 +958,7 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
>
> /* do fraglist only if there is no outer UDP encap (or we already processed it) */
> if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) {
> - uh->len = htons(skb->len - nhoff);
> + udp_set_len_short(uh, skb->len - nhoff);
>
> skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
> skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
> diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
> index b1f667c52cb2..18f789d9383e 100644
> --- a/net/ipv4/udp_tunnel_core.c
> +++ b/net/ipv4/udp_tunnel_core.c
> @@ -184,7 +184,7 @@ void udp_tunnel_xmit_skb(struct rtable *rt, struct sock *sk, struct sk_buff *skb
>
> uh->dest = dst_port;
> uh->source = src_port;
> - uh->len = htons(skb->len);
> + udp_set_len_short(uh, skb->len);
>
> memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
>
> diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
> index e75da98f5283..194566129477 100644
> --- a/net/ipv6/esp6.c
> +++ b/net/ipv6/esp6.c
> @@ -227,7 +227,8 @@ static void esp_output_encap_csum(struct sk_buff *skb)
> if (*skb_mac_header(skb) == IPPROTO_UDP) {
> struct udphdr *uh = udp_hdr(skb);
> struct ipv6hdr *ip6h = ipv6_hdr(skb);
> - int len = ntohs(uh->len);
> + /* esp6_output_udp_encap limits len to U16_MAX. */
> + int len = udp_get_len_short(uh);
> unsigned int offset = skb_transport_offset(skb);
> __wsum csum = skb_checksum(skb, offset, skb->len - offset, 0);
>
> @@ -352,7 +353,7 @@ static struct ip_esp_hdr *esp6_output_udp_encap(struct sk_buff *skb,
> uh = (struct udphdr *)esp->esph;
> uh->source = sport;
> uh->dest = dport;
> - uh->len = htons(len);
> + udp_set_len_short(uh, len);
> uh->check = 0;
>
> *skb_mac_header(skb) = IPPROTO_UDP;
> diff --git a/net/ipv6/fou6.c b/net/ipv6/fou6.c
> index 430518ae26fa..abcf23500299 100644
> --- a/net/ipv6/fou6.c
> +++ b/net/ipv6/fou6.c
> @@ -30,7 +30,7 @@ static void fou6_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
>
> uh->dest = e->dport;
> uh->source = sport;
> - uh->len = htons(skb->len);
> + udp_set_len_short(uh, skb->len);
> udp6_set_csum(!(e->flags & TUNNEL_ENCAP_FLAG_CSUM6), skb,
> &fl6->saddr, &fl6->daddr, skb->len);
>
> diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c
> index cef3e0210744..26b140fea7b7 100644
> --- a/net/ipv6/ip6_udp_tunnel.c
> +++ b/net/ipv6/ip6_udp_tunnel.c
> @@ -93,7 +93,7 @@ void udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk,
> uh->dest = dst_port;
> uh->source = src_port;
>
> - uh->len = htons(skb->len);
> + udp_set_len_short(uh, skb->len);
>
> skb_dst_set(skb, dst);
>
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 48f73401adf4..dbc41008d286 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -1431,7 +1431,8 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
> uh = udp_hdr(skb);
> uh->source = fl6->fl6_sport;
> uh->dest = fl6->fl6_dport;
> - uh->len = htons(len);
> + /* Datagram length checked in udpv6_sendmsg. */
> + udp_set_len_short(uh, len);
> uh->check = 0;
>
> if (cork->gso_size) {
> diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
> index e003b8494dc0..bfe0d7104e8a 100644
> --- a/net/ipv6/udp_offload.c
> +++ b/net/ipv6/udp_offload.c
> @@ -172,7 +172,7 @@ int udp6_gro_complete(struct sk_buff *skb, int nhoff)
>
> /* do fraglist only if there is no outer UDP encap (or we already processed it) */
> if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) {
> - uh->len = htons(skb->len - nhoff);
> + udp_set_len_short(uh, skb->len - nhoff);
>
> skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
> skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index c89ae52764b8..432bac206990 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1290,7 +1290,7 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb, uns
> uh->source = inet->inet_sport;
> uh->dest = inet->inet_dport;
> udp_len = uhlen + session->hdr_len + data_len;
> - uh->len = htons(udp_len);
> + udp_set_len_short(uh, udp_len);
>
> /* Calculate UDP checksum if configured to do so */
> #if IS_ENABLED(CONFIG_IPV6)
> diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
> index 0fb5162992e5..b460998e348e 100644
> --- a/net/netfilter/ipvs/ip_vs_xmit.c
> +++ b/net/netfilter/ipvs/ip_vs_xmit.c
> @@ -1089,7 +1089,7 @@ ipvs_gue_encap(struct net *net, struct sk_buff *skb,
> dport = cp->dest->tun_port;
> udph->dest = dport;
> udph->source = sport;
> - udph->len = htons(skb->len);
> + udp_set_len_short(udph, skb->len);
> udph->check = 0;
>
> *next_protocol = IPPROTO_UDP;
> diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
> index 0030fbe8885c..e9bd1632304f 100644
> --- a/net/netfilter/nf_conntrack_proto_udp.c
> +++ b/net/netfilter/nf_conntrack_proto_udp.c
> @@ -41,11 +41,22 @@ static void udp_error_log(const struct sk_buff *skb,
> nf_l4proto_log_invalid(skb, state, IPPROTO_UDP, "%s", msg);
> }
>
> +static bool udp_validate_len(struct sk_buff *skb,
> + const struct udphdr *hdr,
> + unsigned int dataoff)
> +{
> + unsigned int udplen = udp_get_len_short(hdr);
> + unsigned int skblen = skb->len - dataoff;
> +
> + if (udplen > skblen || udplen < sizeof(*hdr))
> + return false;
> + return true;
> +}
> +
> static bool udp_error(struct sk_buff *skb,
> unsigned int dataoff,
> const struct nf_hook_state *state)
> {
> - unsigned int udplen = skb->len - dataoff;
> const struct udphdr *hdr;
> struct udphdr _hdr;
>
> @@ -57,7 +68,7 @@ static bool udp_error(struct sk_buff *skb,
> }
>
> /* Truncated/malformed packets */
> - if (ntohs(hdr->len) > udplen || ntohs(hdr->len) < sizeof(*hdr)) {
> + if (!udp_validate_len(skb, hdr, dataoff)) {
> udp_error_log(skb, state, "truncated/malformed packet");
> return true;
> }
> @@ -153,7 +164,7 @@ static bool udplite_error(struct sk_buff *skb,
> return true;
> }
>
> - cscov = ntohs(hdr->len);
> + cscov = udp_get_len_short(hdr);
> if (cscov == 0) {
> cscov = udplen;
> } else if (cscov < sizeof(*hdr) || cscov > udplen) {
> diff --git a/net/netfilter/nf_log_syslog.c b/net/netfilter/nf_log_syslog.c
> index 41503847d9d7..0254db8b97ce 100644
> --- a/net/netfilter/nf_log_syslog.c
> +++ b/net/netfilter/nf_log_syslog.c
> @@ -290,7 +290,7 @@ nf_log_dump_udp_header(struct nf_log_buf *m,
>
> /* Max length: 20 "SPT=65535 DPT=65535 " */
> nf_log_buf_add(m, "SPT=%u DPT=%u LEN=%u ",
> - ntohs(uh->source), ntohs(uh->dest), ntohs(uh->len));
> + ntohs(uh->source), ntohs(uh->dest), udp_get_len_short(uh));
>
> out:
> return 0;
> diff --git a/net/netfilter/nf_nat_helper.c b/net/netfilter/nf_nat_helper.c
> index bf591e6af005..3853f41db499 100644
> --- a/net/netfilter/nf_nat_helper.c
> +++ b/net/netfilter/nf_nat_helper.c
> @@ -161,7 +161,7 @@ nf_nat_mangle_udp_packet(struct sk_buff *skb,
>
> /* update the length of the UDP packet */
> datalen = skb->len - protoff;
> - udph->len = htons(datalen);
> + udp_set_len_short(udph, datalen);
>
> /* fix udp checksum if udp checksum was previously calculated */
> if (!udph->check && skb->ip_summed != CHECKSUM_PARTIAL)
> diff --git a/net/psp/psp_main.c b/net/psp/psp_main.c
> index d4c04c923c5a..2415b75a2a12 100644
> --- a/net/psp/psp_main.c
> +++ b/net/psp/psp_main.c
> @@ -207,7 +207,7 @@ static void psp_write_headers(struct net *net, struct sk_buff *skb, __be32 spi,
> uh->source = udp_flow_src_port(net, skb, 0, 0, false);
> }
> uh->check = 0;
> - uh->len = htons(udp_len);
> + udp_set_len_short(uh, udp_len);
>
> psph->nexthdr = IPPROTO_TCP;
> psph->hdrlen = PSP_HDRLEN_NOOPT;
> diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c
> index a5cc76613f32..5315f851b7a4 100644
> --- a/net/sched/act_csum.c
> +++ b/net/sched/act_csum.c
> @@ -276,7 +276,7 @@ static int tcf_csum_ipv4_udp(struct sk_buff *skb, unsigned int ihl,
> return 0;
>
> iph = ip_hdr(skb);
> - ul = ntohs(udph->len);
> + ul = udp_get_len_short(udph);
>
> if (udplite || udph->check) {
>
> @@ -334,7 +334,7 @@ static int tcf_csum_ipv6_udp(struct sk_buff *skb, unsigned int ihl,
> return 0;
>
> ip6h = ipv6_hdr(skb);
> - ul = ntohs(udph->len);
> + ul = udp_get_len_short(udph);
>
> udph->check = 0;
>
> diff --git a/net/xfrm/xfrm_nat_keepalive.c b/net/xfrm/xfrm_nat_keepalive.c
> index ebf95d48e86c..678626ae3229 100644
> --- a/net/xfrm/xfrm_nat_keepalive.c
> +++ b/net/xfrm/xfrm_nat_keepalive.c
> @@ -133,7 +133,7 @@ static void nat_keepalive_send(struct nat_keepalive *ka)
> uh = skb_push(skb, sizeof(*uh));
> uh->source = ka->encap_sport;
> uh->dest = ka->encap_dport;
> - uh->len = htons(skb->len);
> + udp_set_len_short(uh, skb->len);
> uh->check = 0;
>
> skb->mark = ka->smark;
> --
> 2.52.0
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 03/12] geneve: Fix off-by-one comparing with GRO_LEGACY_MAX_SIZE
2026-02-26 20:15 ` [PATCH net-next v2 03/12] geneve: Fix off-by-one comparing with GRO_LEGACY_MAX_SIZE Alice Mikityanska
@ 2026-02-26 20:20 ` Alice Mikityanska
0 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-26 20:20 UTC (permalink / raw)
To: Paolo Abeni
Cc: Alice Mikityanska, Daniel Borkmann, David S. Miller, Eric Dumazet,
Jakub Kicinski, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov, Shuah Khan, Stanislav Fomichev, Andrew Lunn,
Simon Horman, Florian Westphal, netdev
On Thu, 26 Feb 2026 at 22:16, Alice Mikityanska
<alice.kernel@fastmail.im> wrote:
>
> From: Alice Mikityanska <alice@isovalent.com>
>
> GRO_LEGACY_MAX_SIZE = 65536; total_len being 65536 is too big to fit
> into a u16. As can be seen in skb_gro_receive, packets bigger than or
> equal to gro_max_size (or GRO_LEGACY_MAX_SIZE) are dropped with -E2BIG.
> Apply the same boundary to geneve_post_decap_hint to avoid overflowing
> the 16-bit iph->tot_len field by writing 65536 to it.
>
> Fixes: fd0dd796576e ("geneve: use GRO hint option in the RX path")
> Signed-off-by: Alice Mikityanska <alice@isovalent.com>
> ---
> drivers/net/geneve.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index 01cdd06102e0..7a26e2439d48 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -604,7 +604,7 @@ static int geneve_post_decap_hint(const struct sock *sk, struct sk_buff *skb,
> ipv6h = (void *)skb->data + gro_hint->nested_nh_offset;
> iph = (struct iphdr *)ipv6h;
> total_len = skb->len - gro_hint->nested_nh_offset;
> - if (total_len > GRO_LEGACY_MAX_SIZE)
> + if (total_len >= GRO_LEGACY_MAX_SIZE)
> return -E2BIG;
>
> /*
> --
> 2.52.0
>
Paolo, while looking at the surrounding code, I had a question about
your patch [1].
len = skb->len - gro_hint->nested_nh_offset;
This len is calculated as the pseudo header length for checksum purposes.
The pseudo header length includes the UDP header and payload. Shouldn't
it then subtract nested_tp_offset (the beginning of the UDP header),
i.e. have the same value as uh->len? I may be missing some context, but
it caught my eye, and I wanted to double-check.
[1]: https://lore.kernel.org/all/4a9a390588a429191e0ffe48ccdd288bb69e567e.1769011015.git.pabeni@redhat.com/
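To make the off-by-one fixed by the quoted patch concrete, here is a minimal userspace sketch, not kernel code. It assumes only that GRO_LEGACY_MAX_SIZE is 65536, as the commit message states. The names fits_legacy_len, store_tot_len and example_legacy_len are hypothetical and exist only for illustration; byte-order conversion is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the commit message; redefined for this userspace sketch. */
#define GRO_LEGACY_MAX_SIZE 65536u

/* Hypothetical helper: models the fixed (>=) boundary, which rejects 65536. */
static inline bool fits_legacy_len(unsigned int total_len)
{
	return total_len < GRO_LEGACY_MAX_SIZE;
}

/* Hypothetical helper: what a 16-bit field such as iph->tot_len would keep. */
static inline uint16_t store_tot_len(unsigned int total_len)
{
	return (uint16_t)total_len;
}

/* 65535 still fits; 65536 wraps to 0 in 16 bits and therefore must be rejected. */
static inline void example_legacy_len(void)
{
	assert(fits_legacy_len(65535) && store_tot_len(65535) == 65535);
	assert(!fits_legacy_len(65536) && store_tot_len(65536) == 0);
}
```

With the old `>` comparison, 65536 would have passed the check and been stored as 0.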
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 12/12] selftests: net: Add a test for BIG TCP in UDP tunnels
2026-02-26 20:16 ` [PATCH net-next v2 12/12] selftests: net: Add a test for BIG TCP in UDP tunnels Alice Mikityanska
@ 2026-02-27 1:30 ` Jakub Kicinski
2026-02-27 9:35 ` Alice Mikityanska
0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2026-02-27 1:30 UTC (permalink / raw)
To: Alice Mikityanska
Cc: Daniel Borkmann, David S. Miller, Eric Dumazet, Paolo Abeni,
Xin Long, Willem de Bruijn, David Ahern, Nikolay Aleksandrov,
Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
On Thu, 26 Feb 2026 22:16:00 +0200 Alice Mikityanska wrote:
> + ip netns exec "$SERVER_NS" netserver 2>&1 >/dev/null
I'm assuming you want to suppress both stdout and stderr?
shellcheck points out this won't work, then. 2>&1 has to be last
--
pw-bot: cr
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 12/12] selftests: net: Add a test for BIG TCP in UDP tunnels
2026-02-27 1:30 ` Jakub Kicinski
@ 2026-02-27 9:35 ` Alice Mikityanska
0 siblings, 0 replies; 25+ messages in thread
From: Alice Mikityanska @ 2026-02-27 9:35 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Alice Mikityanska, Daniel Borkmann, David S. Miller, Eric Dumazet,
Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
Nikolay Aleksandrov, Shuah Khan, Stanislav Fomichev, Andrew Lunn,
Simon Horman, Florian Westphal, netdev
On Fri, 27 Feb 2026 at 03:30, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 26 Feb 2026 22:16:00 +0200 Alice Mikityanska wrote:
> > + ip netns exec "$SERVER_NS" netserver 2>&1 >/dev/null
>
> I'm assuming you want to suppress both stdout and stderr?
I think I only wanted to suppress stdout (speed stats, etc.), but
still see any possible errors from stderr. To be honest, I don't
remember why I had the stderr>stdout redirect then (I see I don't have
it with the netperf client). Thanks for pointing it out, let's
probably keep `> /dev/null` and drop `2>&1`.
> shellcheck points out this won't work, then. 2>&1 has to be last
> --
> pw-bot: cr
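The redirection order discussed above can be illustrated with a small userspace C sketch, under the assumption that a shell applies redirections left to right with dup2-like semantics. It uses stand-in descriptors instead of the real fds 1 and 2, and a pipe as the "original stdout". The function names apply_redirects and err_reaches_original_out are hypothetical; error paths leak descriptors, which is tolerable only in a sketch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Apply two shell-style redirections to stand-in descriptors: out_fd plays
 * the role of fd 1 and err_fd the role of fd 2.
 *   err_first == true  models `2>&1 >/dev/null`
 *   err_first == false models `>/dev/null 2>&1`
 */
static inline int apply_redirects(int out_fd, int err_fd, bool err_first)
{
	int null_fd = open("/dev/null", O_WRONLY);

	assert(out_fd != err_fd);
	if (null_fd < 0)
		return -1;
	if (err_first) {
		dup2(out_fd, err_fd);  /* 2>&1: err now refers to the old out */
		dup2(null_fd, out_fd); /* >/dev/null: only out is redirected */
	} else {
		dup2(null_fd, out_fd); /* >/dev/null: out is redirected first */
		dup2(out_fd, err_fd);  /* 2>&1: err follows out to /dev/null */
	}
	close(null_fd);
	return 0;
}

/*
 * Returns 1 if a byte written to the stand-in stderr still reaches the
 * original stdout (a pipe here), 0 if it does not, -1 on failure.
 */
static inline int err_reaches_original_out(bool err_first)
{
	int orig[2], other[2];
	int out_fd, err_fd;
	ssize_t wrote, got;
	char c;

	if (pipe(orig) < 0 || pipe(other) < 0)
		return -1;
	out_fd = dup(orig[1]);   /* stand-in for fd 1 */
	err_fd = dup(other[1]);  /* stand-in for fd 2 */
	if (out_fd < 0 || err_fd < 0 ||
	    apply_redirects(out_fd, err_fd, err_first) < 0)
		return -1;

	wrote = write(err_fd, "x", 1);

	/* Close every writer so that read() sees EOF if nothing arrived. */
	close(out_fd);
	close(err_fd);
	close(orig[1]);
	close(other[1]);
	close(other[0]);

	got = read(orig[0], &c, 1);
	close(orig[0]);
	if (wrote != 1)
		return -1;
	return got == 1;
}
```

In this model `2>&1 >/dev/null` silences only stdout while stderr keeps flowing to wherever stdout used to point, and `>/dev/null 2>&1` silences both.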
^ permalink raw reply [flat|nested] 25+ messages in thread
* [syzbot ci] Re: BIG TCP for UDP tunnels
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
` (11 preceding siblings ...)
2026-02-26 20:16 ` [PATCH net-next v2 12/12] selftests: net: Add a test for BIG TCP in UDP tunnels Alice Mikityanska
@ 2026-02-27 18:17 ` syzbot ci
12 siblings, 0 replies; 25+ messages in thread
From: syzbot ci @ 2026-02-27 18:17 UTC (permalink / raw)
To: alice.kernel, alice, andrew, daniel, davem, dsahern, edumazet, fw,
gal, horms, kuba, lucien.xin, netdev, pabeni, razor, shuah,
stfomichev, willemdebruijn.kernel
Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v2] BIG TCP for UDP tunnels
https://lore.kernel.org/all/20260226201600.222044-1-alice.kernel@fastmail.im
* [PATCH net-next v2 01/12] net/sched: act_csum: don't mangle UDP tunnel GSO packets
* [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL
* [PATCH net-next v2 03/12] geneve: Fix off-by-one comparing with GRO_LEGACY_MAX_SIZE
* [PATCH net-next v2 04/12] net: Use helpers to get/set UDP len tree-wide
* [PATCH net-next v2 05/12] net: Enable BIG TCP with partial GSO
* [PATCH net-next v2 06/12] udp: Support gro_ipv4_max_size > 65536
* [PATCH net-next v2 07/12] udp: Support BIG TCP GSO packets where they can occur
* [PATCH net-next v2 08/12] udp: Validate UDP length in udp_gro_receive
* [PATCH net-next v2 09/12] udp: Set length in UDP header to 0 for big GSO packets
* [PATCH net-next v2 10/12] vxlan: Enable BIG TCP packets
* [PATCH net-next v2 11/12] geneve: Enable BIG TCP packets
* [PATCH net-next v2 12/12] selftests: net: Add a test for BIG TCP in UDP tunnels
and found the following issue:
WARNING in l2tp_xmit_skb
Full report is available here:
https://ci.syzbot.org/series/8de1e666-84bd-4f74-b980-539c4b651492
***
WARNING in l2tp_xmit_skb
tree: net-next
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base: fd6dad4e1ae296b67b87291256878a58dad36c93
arch: amd64
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config: https://ci.syzbot.org/builds/f58ab8e4-fdf3-4b35-8063-129b877f1ed8/config
C repro: https://ci.syzbot.org/findings/a09a9cb9-7ad3-49ee-8662-3581bd42cd46/c_repro
syz repro: https://ci.syzbot.org/findings/a09a9cb9-7ad3-49ee-8662-3581bd42cd46/syz_repro
------------[ cut here ]------------
len >= 65536u
WARNING: ./include/linux/udp.h:38 at udp_set_len_short include/linux/udp.h:38 [inline], CPU#1: syz.0.17/5957
WARNING: ./include/linux/udp.h:38 at l2tp_xmit_core net/l2tp/l2tp_core.c:1293 [inline], CPU#1: syz.0.17/5957
WARNING: ./include/linux/udp.h:38 at l2tp_xmit_skb+0x1204/0x18d0 net/l2tp/l2tp_core.c:1327, CPU#1: syz.0.17/5957
Modules linked in:
CPU: 1 UID: 0 PID: 5957 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:udp_set_len_short include/linux/udp.h:38 [inline]
RIP: 0010:l2tp_xmit_core net/l2tp/l2tp_core.c:1293 [inline]
RIP: 0010:l2tp_xmit_skb+0x1204/0x18d0 net/l2tp/l2tp_core.c:1327
Code: 0f 0b 90 e9 21 f9 ff ff e8 e9 05 ec f6 90 0f 0b 90 e9 8d f9 ff ff e8 db 05 ec f6 90 0f 0b 90 e9 cc f9 ff ff e8 cd 05 ec f6 90 <0f> 0b 90 e9 de fa ff ff 44 89 f1 80 e1 07 80 c1 03 38 c1 0f 8c 4f
RSP: 0018:ffffc90003d67878 EFLAGS: 00010293
RAX: ffffffff8ad985e3 RBX: ffff8881a6400090 RCX: ffff8881697f0000
RDX: 0000000000000000 RSI: 0000000000034010 RDI: 000000000000ffff
RBP: dffffc0000000000 R08: 0000000000000003 R09: 0000000000000004
R10: dffffc0000000000 R11: fffff520007acf00 R12: ffff8881baf20900
R13: 0000000000034010 R14: ffff8881a640008e R15: ffff8881760f7000
FS: 000055557e81f500(0000) GS:ffff8882a9467000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000033000 CR3: 00000001612f4000 CR4: 00000000000006f0
Call Trace:
<TASK>
pppol2tp_sendmsg+0x40a/0x5f0 net/l2tp/l2tp_ppp.c:302
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
sock_write_iter+0x503/0x550 net/socket.c:1195
do_iter_readv_writev+0x619/0x8c0 fs/read_write.c:-1
vfs_writev+0x33c/0x990 fs/read_write.c:1059
do_writev+0x154/0x2e0 fs/read_write.c:1105
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f636479c629
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffffd4241c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
RAX: ffffffffffffffda RBX: 00007f6364a15fa0 RCX: 00007f636479c629
RDX: 0000000000000001 RSI: 0000200000000080 RDI: 0000000000000003
RBP: 00007f6364832b39 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f6364a15fac R14: 00007f6364a15fa0 R15: 00007f6364a15fa0
</TASK>
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
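The condition string `len >= 65536u` in the report suggests that udp_set_len_short warns when the length does not fit into the 16-bit UDP length field. The following userspace sketch only illustrates that kind of guard; it is not the helper from the series, whose definition is not shown in this part of the thread. The struct, the function name set_len_short_sketch and the choice to store the truncated value are all assumptions. The value 0x34010 is copied from RSI and R13 in the register dump; that it is the offending length is a guess.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the UDP header layout (fields in network order). */
struct udp_hdr_sketch {
	uint16_t source;
	uint16_t dest;
	uint16_t len;
	uint16_t check;
};

/*
 * Hypothetical guard: report (instead of the kernel WARN) when len does not
 * fit into 16 bits, then store the low 16 bits. Returns true if len fit.
 */
static inline bool set_len_short_sketch(struct udp_hdr_sketch *uh,
					unsigned int len)
{
	bool fits = len < 65536u;

	assert(uh != NULL);
	if (!fits)
		fprintf(stderr, "warning: len >= 65536u (len=%u)\n", len);
	uh->len = htons((uint16_t)len);
	return fits;
}
```

A caller such as the l2tp transmit path in the trace would trip this guard whenever header plus payload reaches 65536 bytes, which points at a missing size check before the header is built rather than at the guard itself.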
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL
2026-02-26 20:15 ` [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL Alice Mikityanska
@ 2026-03-06 20:55 ` Willem de Bruijn
2026-03-06 22:19 ` Alice Mikityanska
0 siblings, 1 reply; 25+ messages in thread
From: Willem de Bruijn @ 2026-03-06 20:55 UTC (permalink / raw)
To: Alice Mikityanska, Daniel Borkmann, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Xin Long, Willem de Bruijn,
David Ahern, Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska, Gal Pressman
Alice Mikityanska wrote:
> From: Alice Mikityanska <alice@isovalent.com>
>
> Taking further the idea of commit b10b446ce7ad ("udp: gso: Use single
> MSS length in UDP header for GSO_PARTIAL"), simplify the implementation
> and fix the checksum (apparently ignored by hardware anyway).
>
> The mentioned commit started using msslen for uh->len, but still uses
> newlen to adjust uh->check. If the formula for check is fixed, newlen is
> assigned but never used before the loop, and newlen is overwritten after
> the loop. This makes msslen not really necessary, as we can reuse
> newlen, if we don't adjust mss before.
That's a big if. Why is the udp length now set to the segment length,
and not to the GSO packet length, as before?
> The adjustment of mss can be
> simply dropped, because mss is not used anywhere else below.
>
> This brings us back to one variable, drops an unneeded arithmetic for
> mss, and fixes the UDP checksum.
>
> Signed-off-by: Alice Mikityanska <alice@isovalent.com>
> Cc: Gal Pressman <gal@nvidia.com>
> ---
> net/ipv4/udp_offload.c | 13 ++-----------
> 1 file changed, 2 insertions(+), 11 deletions(-)
>
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index 6b1654c1ad4a..e831234326c4 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -483,11 +483,11 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> struct sock *sk = gso_skb->sk;
> unsigned int sum_truesize = 0;
> struct sk_buff *segs, *seg;
> - __be16 newlen, msslen;
> struct udphdr *uh;
> unsigned int mss;
> bool copy_dtor;
> __sum16 check;
> + __be16 newlen;
> int ret = 0;
>
> mss = skb_shinfo(gso_skb)->gso_size;
> @@ -556,15 +556,6 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> return segs;
> }
>
> - msslen = htons(sizeof(*uh) + mss);
> -
> - /* GSO partial and frag_list segmentation only requires splitting
> - * the frame into an MSS multiple and possibly a remainder, both
> - * cases return a GSO skb. So update the mss now.
> - */
> - if (skb_is_gso(segs))
> - mss *= skb_shinfo(segs)->gso_segs;
> -
> seg = segs;
> uh = udp_hdr(seg);
>
> @@ -587,7 +578,7 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> if (!seg->next)
> break;
>
> - uh->len = msslen;
> + uh->len = newlen;
> uh->check = check;
>
> if (seg->ip_summed == CHECKSUM_PARTIAL)
> --
> 2.52.0
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 06/12] udp: Support gro_ipv4_max_size > 65536
2026-02-26 20:15 ` [PATCH net-next v2 06/12] udp: Support gro_ipv4_max_size > 65536 Alice Mikityanska
@ 2026-03-06 21:24 ` Willem de Bruijn
2026-03-06 21:31 ` Willem de Bruijn
0 siblings, 1 reply; 25+ messages in thread
From: Willem de Bruijn @ 2026-03-06 21:24 UTC (permalink / raw)
To: Alice Mikityanska, Daniel Borkmann, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Xin Long, Willem de Bruijn,
David Ahern, Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
Alice Mikityanska wrote:
> From: Alice Mikityanska <alice@isovalent.com>
>
> Currently, gro_max_size and gro_ipv4_max_size can be set to values
> bigger than 65536, and GRO will happily aggregate UDP to the configured
> size (for example, with TCP traffic in VXLAN tunnels). However,
> udp_gro_complete uses the 16-bit length field in the UDP header to store
> the length of the aggregated packet. It leads to the packet truncation
> later in __udp4_lib_rcv.
>
> Fix this by storing 0 to the UDP length field and by restoring the real
> length from skb->len in __udp4_lib_rcv.
>
> Signed-off-by: Alice Mikityanska <alice@isovalent.com>
> ---
> net/ipv4/udp.c | 5 ++++-
> net/ipv4/udp_offload.c | 4 ++--
> net/ipv6/udp_offload.c | 2 +-
> 3 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 345ef93001fc..870b35107ede 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -2690,7 +2690,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
> {
> struct sock *sk = NULL;
> struct udphdr *uh;
> - unsigned short ulen;
> + unsigned int ulen;
> struct rtable *rt = skb_rtable(skb);
> __be32 saddr, daddr;
> struct net *net = dev_net(skb->dev);
> @@ -2714,6 +2714,9 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
> goto short_packet;
>
> if (proto == IPPROTO_UDP) {
> + if (!ulen)
> + ulen = skb->len;
> +
For normal packets, ip_rcv_core truncates skbs to their IP length.
I don't immediately see the GRO layer taking care of this. Which is fine
if it can be done later, but not if these protocol fields are zeroed.
Should we validate that packets have no padding before we coalesce?
> /* UDP validates ulen. */
> if (ulen < sizeof(*uh) || pskb_trim_rcsum(skb, ulen))
> goto short_packet;
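The receive-side convention in the quoted hunk (a zero UDP length means "take the length from skb->len") can be sketched in userspace as follows. This is a simplified model, not the kernel function: effective_udp_len and UDP_HDR_LEN are hypothetical names, the `ulen > skb_len` check is assumed from the `goto short_packet` context line, and the return value 0 stands in for the short_packet path. It does not model the validation of wire packets that arrive with length 0, which the cover letter assigns to patch 08.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for sizeof(struct udphdr). */
#define UDP_HDR_LEN 8u

/*
 * Returns the length the receive path would work with, or 0 if the packet
 * would be treated as short/malformed. wire_len is in host byte order.
 */
static inline unsigned int effective_udp_len(uint16_t wire_len,
					     unsigned int skb_len)
{
	unsigned int ulen = wire_len;

	if (ulen > skb_len)       /* assumed pre-existing check */
		return 0;
	if (!ulen)                /* BIG TCP convention from the patch */
		ulen = skb_len;
	if (ulen < UDP_HDR_LEN)   /* shorter than the UDP header itself */
		return 0;
	return ulen;
}

/* A zero wire length recovers the full skb length; other values are kept. */
static inline void example_udp_len(void)
{
	assert(effective_udp_len(0, 100000) == 100000);
	assert(effective_udp_len(20, 100) == 20);
	assert(effective_udp_len(4, 100) == 0);
}
```

Widening ulen from unsigned short to unsigned int in the patch is what allows the recovered value to exceed 65535.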
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 06/12] udp: Support gro_ipv4_max_size > 65536
2026-03-06 21:24 ` Willem de Bruijn
@ 2026-03-06 21:31 ` Willem de Bruijn
0 siblings, 0 replies; 25+ messages in thread
From: Willem de Bruijn @ 2026-03-06 21:31 UTC (permalink / raw)
To: Willem de Bruijn, Alice Mikityanska, Daniel Borkmann,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Xin Long, Willem de Bruijn, David Ahern, Nikolay Aleksandrov
Cc: Shuah Khan, Stanislav Fomichev, Andrew Lunn, Simon Horman,
Florian Westphal, netdev, Alice Mikityanska
Willem de Bruijn wrote:
> Alice Mikityanska wrote:
> > From: Alice Mikityanska <alice@isovalent.com>
> >
> > Currently, gro_max_size and gro_ipv4_max_size can be set to values
> > bigger than 65536, and GRO will happily aggregate UDP to the configured
> > size (for example, with TCP traffic in VXLAN tunnels). However,
> > udp_gro_complete uses the 16-bit length field in the UDP header to store
> > the length of the aggregated packet. It leads to the packet truncation
> > later in __udp4_lib_rcv.
> >
> > Fix this by storing 0 to the UDP length field and by restoring the real
> > length from skb->len in __udp4_lib_rcv.
> >
> > Signed-off-by: Alice Mikityanska <alice@isovalent.com>
> > ---
> > net/ipv4/udp.c | 5 ++++-
> > net/ipv4/udp_offload.c | 4 ++--
> > net/ipv6/udp_offload.c | 2 +-
> > 3 files changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 345ef93001fc..870b35107ede 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -2690,7 +2690,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
> > {
> > struct sock *sk = NULL;
> > struct udphdr *uh;
> > - unsigned short ulen;
> > + unsigned int ulen;
> > struct rtable *rt = skb_rtable(skb);
> > __be32 saddr, daddr;
> > struct net *net = dev_net(skb->dev);
> > @@ -2714,6 +2714,9 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
> > goto short_packet;
> >
> > if (proto == IPPROTO_UDP) {
> > + if (!ulen)
> > + ulen = skb->len;
> > +
>
> For normal packets, ip_rcv_core truncates skbs to their IP length.
>
> I don't immediately see the GRO layer taking care of this. Which is fine
> if it can be done later, but not if these protocol fields are zeroed.
>
> Should we validate that packets have no padding before we coalesce?
I should have read ahead. This is addressed in patch 8. Great! Never mind this comment.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL
2026-03-06 20:55 ` Willem de Bruijn
@ 2026-03-06 22:19 ` Alice Mikityanska
2026-03-07 23:23 ` Willem de Bruijn
0 siblings, 1 reply; 25+ messages in thread
From: Alice Mikityanska @ 2026-03-06 22:19 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Alice Mikityanska, Daniel Borkmann, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Xin Long, David Ahern,
Nikolay Aleksandrov, Shuah Khan, Stanislav Fomichev, Andrew Lunn,
Simon Horman, Florian Westphal, netdev, Gal Pressman
Thanks for reviewing my series!
On Fri, 6 Mar 2026 at 22:55, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Alice Mikityanska wrote:
> > From: Alice Mikityanska <alice@isovalent.com>
> >
> > Taking further the idea of commit b10b446ce7ad ("udp: gso: Use single
> > MSS length in UDP header for GSO_PARTIAL"), simplify the implementation
> > and fix the checksum (apparently ignored by hardware anyway).
> >
> > The mentioned commit started using msslen for uh->len, but still uses
> > newlen to adjust uh->check. If the formula for check is fixed, newlen is
> > assigned but never used before the loop, and newlen is overwritten after
> > the loop. This makes msslen not really necessary, as we can reuse
> > newlen, if we don't adjust mss before.
>
> That's a big if. Why is the udp length now set to the segment length,
> and not to the GSO packet length, as before?
Just so that we are on the same page: the behavior of the UDP length
field changed in Gal's commit b10b446ce7ad, not in mine. You reviewed
that commit and approved it:
https://lore.kernel.org/netdev/willemdebruijn.kernel.218d53621fba7@gmail.com/
I'm just refactoring to simplify the code a little bit + fixing the
checksum field that went out of sync with length after Gal's change.
Hope that clarifies it.
> > The adjustment of mss can be
> > simply dropped, because mss is not used anywhere else below.
> >
> > This brings us back to one variable, drops an unneeded arithmetic for
> > mss, and fixes the UDP checksum.
> >
> > Signed-off-by: Alice Mikityanska <alice@isovalent.com>
> > Cc: Gal Pressman <gal@nvidia.com>
> > ---
> > net/ipv4/udp_offload.c | 13 ++-----------
> > 1 file changed, 2 insertions(+), 11 deletions(-)
> >
> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> > index 6b1654c1ad4a..e831234326c4 100644
> > --- a/net/ipv4/udp_offload.c
> > +++ b/net/ipv4/udp_offload.c
> > @@ -483,11 +483,11 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> > struct sock *sk = gso_skb->sk;
> > unsigned int sum_truesize = 0;
> > struct sk_buff *segs, *seg;
> > - __be16 newlen, msslen;
> > struct udphdr *uh;
> > unsigned int mss;
> > bool copy_dtor;
> > __sum16 check;
> > + __be16 newlen;
> > int ret = 0;
> >
> > mss = skb_shinfo(gso_skb)->gso_size;
> > @@ -556,15 +556,6 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> > return segs;
> > }
> >
> > - msslen = htons(sizeof(*uh) + mss);
> > -
> > - /* GSO partial and frag_list segmentation only requires splitting
> > - * the frame into an MSS multiple and possibly a remainder, both
> > - * cases return a GSO skb. So update the mss now.
> > - */
> > - if (skb_is_gso(segs))
> > - mss *= skb_shinfo(segs)->gso_segs;
> > -
> > seg = segs;
> > uh = udp_hdr(seg);
> >
> > @@ -587,7 +578,7 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> > if (!seg->next)
> > break;
> >
> > - uh->len = msslen;
> > + uh->len = newlen;
> > uh->check = check;
> >
> > if (seg->ip_summed == CHECKSUM_PARTIAL)
> > --
> > 2.52.0
> >
>
>
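The point about the checksum going out of sync with the length can be shown with a simplified userspace sketch. It uses the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m')) on a toy four-word "header" with no pseudo header and no payload, so it is not the kernel's csum16_add/csum16_sub code path. All function names are hypothetical. The sketch checks that the checksum stays valid only when it is patched with the same value that is written into the length word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit sum into 16 bits with end-around carry. */
static inline uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* One's complement checksum over 16-bit words; 0 means "verifies". */
static inline uint16_t csum_words(const uint16_t *w, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += w[i];
	return (uint16_t)~fold16(sum);
}

/* RFC 1624 eqn. 3: patch check when a 16-bit field changes from old to newv. */
static inline uint16_t csum_replace16(uint16_t check, uint16_t old,
				      uint16_t newv)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old;
	sum += newv;
	return (uint16_t)~fold16(sum);
}

/*
 * Write written_len into the length word, but patch the checksum as if
 * csum_len had been written. Returns the verification result (0 = valid).
 */
static inline uint16_t rewrite_len(uint16_t written_len, uint16_t csum_len)
{
	uint16_t w[4] = { 0x1234, 0x5678, 0x0100, 0 }; /* ports, len, check */

	w[3] = csum_words(w, 4);
	assert(csum_words(w, 4) == 0);  /* the initial header verifies */

	w[3] = csum_replace16(w[3], w[2], csum_len);
	w[2] = written_len;
	return csum_words(w, 4);
}
```

Here rewrite_len(0x0020, 0x0020) verifies, while rewrite_len(0x0020, 0x0200) does not, which mirrors writing one length into uh->len while adjusting uh->check for another.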
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL
2026-03-06 22:19 ` Alice Mikityanska
@ 2026-03-07 23:23 ` Willem de Bruijn
2026-03-07 23:34 ` Alice Mikityanska
0 siblings, 1 reply; 25+ messages in thread
From: Willem de Bruijn @ 2026-03-07 23:23 UTC (permalink / raw)
To: Alice Mikityanska, Willem de Bruijn
Cc: Alice Mikityanska, Daniel Borkmann, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Xin Long, David Ahern,
Nikolay Aleksandrov, Shuah Khan, Stanislav Fomichev, Andrew Lunn,
Simon Horman, Florian Westphal, netdev, Gal Pressman
Alice Mikityanska wrote:
> Thanks for reviewing my series!
>
> On Fri, 6 Mar 2026 at 22:55, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Alice Mikityanska wrote:
> > > From: Alice Mikityanska <alice@isovalent.com>
> > >
> > > Taking further the idea of commit b10b446ce7ad ("udp: gso: Use single
> > > MSS length in UDP header for GSO_PARTIAL"), simplify the implementation
> > > and fix the checksum (apparently ignored by hardware anyway).
> > >
> > > The mentioned commit started using msslen for uh->len, but still uses
> > > newlen to adjust uh->check. If the formula for check is fixed, newlen is
> > > assigned but never used before the loop, and newlen is overwritten after
> > > the loop. This makes msslen not really necessary, as we can reuse
> > > newlen, if we don't adjust mss before.
> >
> > That's a big if. Why is the udp length now set to the segment length,
> > and not to the GSO packet length, as before?
>
> Just so that we are on the same page: the behavior of the UDP length
> field changed in Gal's commit b10b446ce7ad, not in mine. You reviewed
> that commit and approved it:
>
> https://lore.kernel.org/netdev/willemdebruijn.kernel.218d53621fba7@gmail.com/
>
> I'm just refactoring to simplify the code a little bit + fixing the
> checksum field that went out of sync with length after Gal's change.
> Hope that clarifies it.
Oh right. When respinning can you add this brief summary?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL
2026-03-07 23:23 ` Willem de Bruijn
@ 2026-03-07 23:34 ` Alice Mikityanska
2026-03-07 23:53 ` Willem de Bruijn
0 siblings, 1 reply; 25+ messages in thread
From: Alice Mikityanska @ 2026-03-07 23:34 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Alice Mikityanska, Daniel Borkmann, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Xin Long, David Ahern,
Nikolay Aleksandrov, Shuah Khan, Stanislav Fomichev, Andrew Lunn,
Simon Horman, Florian Westphal, netdev, Gal Pressman
On Sun, 8 Mar 2026 at 01:23, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Alice Mikityanska wrote:
> > Thanks for reviewing my series!
> >
> > On Fri, 6 Mar 2026 at 22:55, Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > Alice Mikityanska wrote:
> > > > From: Alice Mikityanska <alice@isovalent.com>
> > > >
> > > > Taking further the idea of commit b10b446ce7ad ("udp: gso: Use single
> > > > MSS length in UDP header for GSO_PARTIAL"), simplify the implementation
> > > > and fix the checksum (apparently ignored by hardware anyway).
> > > >
> > > > The mentioned commit started using msslen for uh->len, but still uses
> > > > newlen to adjust uh->check. If the formula for check is fixed, newlen is
> > > > assigned but never used before the loop, and newlen is overwritten after
> > > > the loop. This makes msslen not really necessary, as we can reuse
> > > > newlen, if we don't adjust mss before.
> > >
> > > That's a big if. Why is the udp length now set to the segment length,
> > > and not to the GSO packet length, as before?
> >
> > Just so that we are on the same page: the behavior of the UDP length
> > field changed in Gal's commit b10b446ce7ad, not in mine. You reviewed
> > that commit and approved it:
> >
> > https://lore.kernel.org/netdev/willemdebruijn.kernel.218d53621fba7@gmail.com/
> >
> > I'm just refactoring to simplify the code a little bit + fixing the
> > checksum field that went out of sync with length after Gal's change.
> > Hope that clarifies it.
>
> Oh right. When respinning can you add this brief summary?
It's already in the first sentence of the commit message... I can
rephrase it to add something about "refactoring without changing
behavior" if "simplify the implementation" reads as ambiguous.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL
2026-03-07 23:34 ` Alice Mikityanska
@ 2026-03-07 23:53 ` Willem de Bruijn
0 siblings, 0 replies; 25+ messages in thread
From: Willem de Bruijn @ 2026-03-07 23:53 UTC (permalink / raw)
To: Alice Mikityanska, Willem de Bruijn
Cc: Alice Mikityanska, Daniel Borkmann, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Xin Long, David Ahern,
Nikolay Aleksandrov, Shuah Khan, Stanislav Fomichev, Andrew Lunn,
Simon Horman, Florian Westphal, netdev, Gal Pressman
Alice Mikityanska wrote:
> On Sun, 8 Mar 2026 at 01:23, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Alice Mikityanska wrote:
> > > Thanks for reviewing my series!
> > >
> > > On Fri, 6 Mar 2026 at 22:55, Willem de Bruijn
> > > <willemdebruijn.kernel@gmail.com> wrote:
> > > >
> > > > Alice Mikityanska wrote:
> > > > > From: Alice Mikityanska <alice@isovalent.com>
> > > > >
> > > > > Taking further the idea of commit b10b446ce7ad ("udp: gso: Use single
> > > > > MSS length in UDP header for GSO_PARTIAL"), simplify the implementation
> > > > > and fix the checksum (apparently ignored by hardware anyway).
> > > > >
> > > > > The mentioned commit started using msslen for uh->len, but still uses
> > > > > newlen to adjust uh->check. If the formula for check is fixed, newlen is
> > > > > assigned but never used before the loop, and newlen is overwritten after
> > > > > the loop. This makes msslen not really necessary, as we can reuse
> > > > > newlen, if we don't adjust mss before.
> > > >
> > > > That's a big if. Why is the udp length now set to the segment length,
> > > > and not to the GSO packet length, as before?
> > >
> > > Just so that we are on the same page: the behavior of the UDP length
> > > field changed in Gal's commit b10b446ce7ad, not in mine. You reviewed
> > > that commit and approved it:
> > >
> > > https://lore.kernel.org/netdev/willemdebruijn.kernel.218d53621fba7@gmail.com/
> > >
> > > I'm just refactoring to simplify the code a little bit + fixing the
> > > checksum field that went out of sync with length after Gal's change.
> > > Hope that clarifies it.
> >
> > Oh right. When respinning can you add this brief summary?
>
> It's already in the first sentence of the commit message... I can
> rephrase it to add something about "refactoring without changing
> behavior" if "simplify the implementation" reads ambiguous.
I totally read over that. As is is great. Thanks.
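
For readers following the point above about the checksum going out of sync with the length field: when a 16-bit field covered by an Internet checksum is rewritten, the checksum can be adjusted incrementally instead of recomputed. Below is a minimal user-space sketch of that technique (RFC 1624, equation 3). It is not the kernel code under review; the helper names fold16, sum_words and csum_update16 are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* One's-complement sum over n 16-bit words, computed from scratch. */
static uint16_t sum_words(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += words[i];
	return fold16(sum);
}

/*
 * Incremental update when one covered 16-bit word changes from
 * old_val to new_val: HC' = ~(~HC + ~m + m')  (RFC 1624, eqn. 3).
 * Hypothetical helper, written for illustration only.
 */
static uint16_t csum_update16(uint16_t check, uint16_t old_val,
			      uint16_t new_val)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old_val;
	sum += new_val;
	return (uint16_t)~fold16(sum);
}
```

If the length word is rewritten but the checksum is adjusted using a different value, the two fields no longer agree, which is the kind of mismatch the commit message describes fixing.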
Thread overview: 25+ messages
2026-02-26 20:15 [PATCH net-next v2 00/12] BIG TCP for UDP tunnels Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 01/12] net/sched: act_csum: don't mangle UDP tunnel GSO packets Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 02/12] udp: gso: Simplify handling length in GSO_PARTIAL Alice Mikityanska
2026-03-06 20:55 ` Willem de Bruijn
2026-03-06 22:19 ` Alice Mikityanska
2026-03-07 23:23 ` Willem de Bruijn
2026-03-07 23:34 ` Alice Mikityanska
2026-03-07 23:53 ` Willem de Bruijn
2026-02-26 20:15 ` [PATCH net-next v2 03/12] geneve: Fix off-by-one comparing with GRO_LEGACY_MAX_SIZE Alice Mikityanska
2026-02-26 20:20 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 04/12] net: Use helpers to get/set UDP len tree-wide Alice Mikityanska
2026-02-26 20:19 ` Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 05/12] net: Enable BIG TCP with partial GSO Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 06/12] udp: Support gro_ipv4_max_size > 65536 Alice Mikityanska
2026-03-06 21:24 ` Willem de Bruijn
2026-03-06 21:31 ` Willem de Bruijn
2026-02-26 20:15 ` [PATCH net-next v2 07/12] udp: Support BIG TCP GSO packets where they can occur Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 08/12] udp: Validate UDP length in udp_gro_receive Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 09/12] udp: Set length in UDP header to 0 for big GSO packets Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 10/12] vxlan: Enable BIG TCP packets Alice Mikityanska
2026-02-26 20:15 ` [PATCH net-next v2 11/12] geneve: " Alice Mikityanska
2026-02-26 20:16 ` [PATCH net-next v2 12/12] selftests: net: Add a test for BIG TCP in UDP tunnels Alice Mikityanska
2026-02-27 1:30 ` Jakub Kicinski
2026-02-27 9:35 ` Alice Mikityanska
2026-02-27 18:17 ` [syzbot ci] Re: BIG TCP for " syzbot ci