* [PATCH net-next v4 0/2] udp: fix FOU/GUE over multicast
@ 2026-07-01 23:10 Anton Danilov
2026-07-01 23:10 ` [PATCH net-next v4 1/2] udp: fix encapsulation packet resubmit in multicast deliver Anton Danilov
2026-07-01 23:10 ` [PATCH net-next v4 2/2] selftests: net: add FOU multicast encapsulation resubmit test Anton Danilov
0 siblings, 2 replies; 3+ messages in thread
From: Anton Danilov @ 2026-07-01 23:10 UTC (permalink / raw)
To: netdev
Cc: Willem de Bruijn, David S . Miller, David Ahern, Eric Dumazet,
Kuniyuki Iwashima, Jakub Kicinski, Paolo Abeni, Simon Horman,
Shuah Khan, linux-kselftest
UDP encapsulation (FOU, GUE) has never worked correctly with multicast
destination addresses. When a FOU-encapsulated packet arrives at a
multicast address, it enters __udp4_lib_mcast_deliver() which calls
consume_skb() on packets that need resubmission to the inner protocol
handler, silently dropping them instead.
The unicast delivery path handles this correctly by returning -ret,
but the multicast path was never updated to support UDP encapsulation
resubmit.
This causes silent packet loss for FOU/GRETAP tunnels configured with
multicast remote addresses. The loss ratio depends on the early demux
cache hit rate - packets that hit early demux bypass the multicast path
and work correctly, masking the issue.
Reproducing the issue:
ip netns add ns_a && ip netns add ns_b
ip link add veth0 type veth peer name veth1
ip link set veth0 netns ns_a && ip link set veth1 netns ns_b
ip -n ns_a addr add 10.0.0.1/24 dev veth0 && ip -n ns_a link set veth0 up
ip -n ns_b addr add 10.0.0.2/24 dev veth1 && ip -n ns_b link set veth1 up
# Multicast routes
ip -n ns_a route add 239.0.0.0/8 dev veth0
ip -n ns_b route add 239.0.0.0/8 dev veth1
# Disable early demux to expose the issue (otherwise it's partially masked)
ip netns exec ns_b sysctl -w net.ipv4.ip_early_demux=0
# Join multicast group on receiver
ip -n ns_b addr add 239.0.0.1/32 dev veth1 autojoin
# Sender: GRETAP with FOU encap
ip -n ns_a link add eoudp0 type gretap \
remote 239.0.0.1 local 10.0.0.1 \
encap fou encap-sport 4797 encap-dport 4797 key 239.0.0.1
ip -n ns_a link set eoudp0 up
ip -n ns_a addr add 192.168.99.1/24 dev eoudp0
# Receiver: FOU listener + GRETAP
ip netns exec ns_b ip fou add port 4797 ipproto 47
ip -n ns_b link add eoudp0 type gretap \
remote 239.0.0.1 local 10.0.0.2 \
encap fou encap-sport 4797 encap-dport 4797 key 239.0.0.1
ip -n ns_b link set eoudp0 up
ip -n ns_b addr add 192.168.99.2/24 dev eoudp0
# Static neigh: ARP replies can't traverse unidirectional mcast tunnel
recv_mac=$(ip -n ns_b link show eoudp0 | awk '/ether/{print $2}')
ip -n ns_a neigh add 192.168.99.2 lladdr $recv_mac dev eoudp0
# Test: ping through the FOU/GRETAP tunnel
ip netns exec ns_a ping -c 100 192.168.99.2
# -> without this patch: 0 packets received on eoudp0
# -> with this patch: all packets received on eoudp0
AI assistance (Claude, claude-opus-4-6) was used during root cause
analysis of the kernel source code (tracing the call chain from
udp_queue_rcv_skb through encap_rcv to ip_protocol_deliver_rcu,
comparing unicast/GSO/multicast paths) and during patch and selftest
authoring. The fix approach was identified by observing that the
unicast path (udp_unicast_rcv_skb) already handles encap resubmit
correctly via return -ret, while the multicast path did not.
v4:
- Promoted from RFC to PATCH; no functional changes since v3.
v3 was posted as RFC and consequently dropped from patchwork,
which explains the lack of review feedback.
v3: https://lore.kernel.org/netdev/cover.1777934869.git.littlesmilingcloud@gmail.com/
- Use return -ret instead of calling ip_protocol_deliver_rcu()
directly, matching the unicast path and avoiding call stack
growth with nested encapsulations (Kuniyuki Iwashima)
- Only change the first-socket path; the clone loop is not
reachable for tunnel sockets (no SO_REUSEADDR/SO_REUSEPORT)
- Replace Python packet generator with ping through a properly
configured FOU/GRETAP tunnel in the selftest
- Add static neighbor entry (ARP replies cannot traverse the
unidirectional multicast tunnel)
v2: https://lore.kernel.org/netdev/ad_dal164gVmImWl@dau-home-pc/
- Moved inline Python packet generator into a separate helper
- Fixed author email typo in Signed-off-by
v1 (RFC): https://lore.kernel.org/netdev/ad7MsSJOuUU6EGwS@dau-home-pc/
Anton Danilov (2):
udp: fix encapsulation packet resubmit in multicast deliver
selftests: net: add FOU multicast encapsulation resubmit test
net/ipv4/udp.c | 6 +-
net/ipv6/udp.c | 6 +-
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/fou_mcast_encap.sh | 112 ++++++++++++++++++
4 files changed, 121 insertions(+), 4 deletions(-)
create mode 100755 tools/testing/selftests/net/fou_mcast_encap.sh
--
2.47.3
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH net-next v4 1/2] udp: fix encapsulation packet resubmit in multicast deliver
2026-07-01 23:10 [PATCH net-next v4 0/2] udp: fix FOU/GUE over multicast Anton Danilov
@ 2026-07-01 23:10 ` Anton Danilov
2026-07-01 23:10 ` [PATCH net-next v4 2/2] selftests: net: add FOU multicast encapsulation resubmit test Anton Danilov
1 sibling, 0 replies; 3+ messages in thread
From: Anton Danilov @ 2026-07-01 23:10 UTC (permalink / raw)
To: netdev
Cc: Willem de Bruijn, David S . Miller, David Ahern, Eric Dumazet,
Kuniyuki Iwashima, Jakub Kicinski, Paolo Abeni, Simon Horman,
Shuah Khan, linux-kselftest
When a UDP encapsulation socket (e.g., FOU) receives a multicast
packet, __udp4_lib_mcast_deliver() and __udp6_lib_mcast_deliver()
call consume_skb() when udp_queue_rcv_skb() returns a positive value.
A positive return value from udp_queue_rcv_skb() indicates that the
encap_rcv handler (e.g., fou_udp_recv) has consumed the UDP header
and wants the packet to be resubmitted to the IP protocol handler
for further processing (e.g., as a GRE packet).
The unicast path in udp_unicast_rcv_skb() handles this correctly by
returning -ret, which propagates up to ip_protocol_deliver_rcu() for
resubmission. However, the multicast path destroys the packet via
consume_skb() instead of resubmitting it, causing silent packet loss.
This affects any UDP encapsulation (FOU, GUE) combined with multicast
destination addresses.
Fix this by returning -ret instead of calling consume_skb() when the
return value is positive, matching the behavior of the unicast path.
This avoids growing the call stack compared to calling
ip_protocol_deliver_rcu() directly.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Assisted-by: Claude:claude-opus-4-6
---
net/ipv4/udp.c | 6 ++++--
net/ipv6/udp.c | 6 ++++--
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 70f6cbd4ef73..b0910659391e 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2475,6 +2475,7 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
struct udp_hslot *hslot;
struct sk_buff *nskb;
bool use_hash2;
+ int ret;
hash2_any = 0;
hash2 = 0;
@@ -2519,8 +2520,9 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
}
if (first) {
- if (udp_queue_rcv_skb(first, skb) > 0)
- consume_skb(skb);
+ ret = udp_queue_rcv_skb(first, skb);
+ if (ret > 0)
+ return -ret;
} else {
kfree_skb(skb);
__UDP_INC_STATS(net, UDP_MIB_IGNOREDMULTI);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 15e032194ecc..ff2e389e286b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -949,6 +949,7 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
struct udp_hslot *hslot;
struct sk_buff *nskb;
bool use_hash2;
+ int ret;
hash2_any = 0;
hash2 = 0;
@@ -998,8 +999,9 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
}
if (first) {
- if (udpv6_queue_rcv_skb(first, skb) > 0)
- consume_skb(skb);
+ ret = udpv6_queue_rcv_skb(first, skb);
+ if (ret > 0)
+ return -ret;
} else {
kfree_skb(skb);
__UDP6_INC_STATS(net, UDP_MIB_IGNOREDMULTI);
--
2.47.3
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH net-next v4 2/2] selftests: net: add FOU multicast encapsulation resubmit test
2026-07-01 23:10 [PATCH net-next v4 0/2] udp: fix FOU/GUE over multicast Anton Danilov
2026-07-01 23:10 ` [PATCH net-next v4 1/2] udp: fix encapsulation packet resubmit in multicast deliver Anton Danilov
@ 2026-07-01 23:10 ` Anton Danilov
1 sibling, 0 replies; 3+ messages in thread
From: Anton Danilov @ 2026-07-01 23:10 UTC (permalink / raw)
To: netdev
Cc: Willem de Bruijn, David S . Miller, David Ahern, Eric Dumazet,
Kuniyuki Iwashima, Jakub Kicinski, Paolo Abeni, Simon Horman,
Shuah Khan, linux-kselftest
Add a selftest to verify that FOU-encapsulated packets addressed to a
multicast destination are correctly resubmitted to the inner protocol
handler (GRE) via the UDP multicast delivery path.
The test creates two network namespaces connected by a veth pair with
a FOU/GRETAP tunnel using a multicast remote address (239.0.0.1).
Ping is sent through the tunnel and received packets are counted on
the receiver's tunnel interface.
A static neighbor entry is configured on the sender because ARP
replies from the receiver cannot traverse the unidirectional multicast
tunnel back to the sender.
The early demux optimization (net.ipv4.ip_early_demux) is disabled on
the receiver to force packets through __udp4_lib_mcast_deliver(),
which is the code path being tested.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Assisted-by: Claude:claude-opus-4-6
---
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/fou_mcast_encap.sh | 112 ++++++++++++++++++
2 files changed, 113 insertions(+)
create mode 100755 tools/testing/selftests/net/fou_mcast_encap.sh
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 708d960ae07d..7e9ae937cffa 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -39,6 +39,7 @@ TEST_PROGS := \
fib_rule_tests.sh \
fib_tests.sh \
fin_ack_lat.sh \
+ fou_mcast_encap.sh \
fq_band_pktlimit.sh \
gre_gso.sh \
gre_ipv6_lladdr.sh \
diff --git a/tools/testing/selftests/net/fou_mcast_encap.sh b/tools/testing/selftests/net/fou_mcast_encap.sh
new file mode 100755
index 000000000000..8db9633f4c28
--- /dev/null
+++ b/tools/testing/selftests/net/fou_mcast_encap.sh
@@ -0,0 +1,112 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test that UDP encapsulation (FOU) correctly handles packet resubmit
+# when packets are delivered via the multicast UDP delivery path.
+#
+# When a FOU-encapsulated packet arrives with a multicast destination IP,
+# __udp4_lib_mcast_deliver() must resubmit it to the inner protocol
+# handler (e.g., GRE) rather than consuming it. This test verifies that
+# by creating a FOU/GRETAP tunnel with a multicast remote address and
+# sending ping through it.
+#
+# The early demux optimization can mask this issue by routing packets via
+# the unicast path (udp_unicast_rcv_skb), so we disable it to force
+# packets through __udp4_lib_mcast_deliver().
+
+source lib.sh
+
+NSENDER=""
+NRECV=""
+
+cleanup() {
+ cleanup_all_ns
+}
+
+trap cleanup EXIT
+
+setup() {
+ setup_ns NSENDER NRECV
+
+ ip link add veth_s type veth peer name veth_r
+ ip link set veth_s netns "$NSENDER"
+ ip link set veth_r netns "$NRECV"
+
+ ip -n "$NSENDER" addr add 10.0.0.1/24 dev veth_s
+ ip -n "$NSENDER" link set veth_s up
+
+ ip -n "$NRECV" addr add 10.0.0.2/24 dev veth_r
+ ip -n "$NRECV" link set veth_r up
+
+ # Disable early demux to force multicast delivery path
+ ip netns exec "$NRECV" sysctl -wq net.ipv4.ip_early_demux=0
+
+ # Join multicast group on receiver
+ ip -n "$NRECV" addr add 239.0.0.1/32 dev veth_r autojoin
+
+ # Multicast routes
+ ip -n "$NRECV" route add 239.0.0.0/8 dev veth_r
+ ip -n "$NSENDER" route add 239.0.0.0/8 dev veth_s
+
+ # Sender: GRETAP with FOU encap (no FOU listener needed on TX side)
+ ip -n "$NSENDER" link add eoudp0 type gretap \
+ remote 239.0.0.1 local 10.0.0.1 \
+ encap fou encap-sport 4797 encap-dport 4797 \
+ key 239.0.0.1
+ ip -n "$NSENDER" link set eoudp0 up
+ ip -n "$NSENDER" addr add 192.168.99.1/24 dev eoudp0
+
+ # Receiver: FOU listener + GRETAP
+ ip netns exec "$NRECV" ip fou add port 4797 ipproto 47
+ ip -n "$NRECV" link add eoudp0 type gretap \
+ remote 239.0.0.1 local 10.0.0.2 \
+ encap fou encap-sport 4797 encap-dport 4797 \
+ key 239.0.0.1
+ ip -n "$NRECV" link set eoudp0 up
+ ip -n "$NRECV" addr add 192.168.99.2/24 dev eoudp0
+
+ # Static neigh entry on sender: ARP replies cannot traverse the
+ # multicast tunnel back, so pre-populate the neighbor cache.
+ local recv_mac
+ recv_mac=$(ip -n "$NRECV" link show eoudp0 | awk '/ether/{print $2}')
+ ip -n "$NSENDER" neigh add 192.168.99.2 lladdr "$recv_mac" dev eoudp0
+}
+
+get_rx_packets() {
+ ip -n "$NRECV" -s link show eoudp0 | awk '/RX:/{getline; print $2}'
+}
+
+test_fou_mcast_encap() {
+ local count=100
+ local rx_before
+ local rx_after
+ local rx_delta
+
+ # Warmup: let any initial broadcast/ARP traffic settle
+ ip netns exec "$NSENDER" ping -c 1 -W 1 192.168.99.2 >/dev/null 2>&1
+ sleep 1
+
+ rx_before=$(get_rx_packets)
+ ip netns exec "$NSENDER" ping -c $count -W 1 192.168.99.2 >/dev/null 2>&1
+ sleep 1
+ rx_after=$(get_rx_packets)
+
+ rx_delta=$((rx_after - rx_before))
+
+ if [ "$rx_delta" -ge "$count" ]; then
+ echo "PASS: received $rx_delta/$count packets via multicast FOU/GRETAP"
+ return "$ksft_pass"
+ elif [ "$rx_delta" -gt 0 ]; then
+ echo "FAIL: only $rx_delta/$count packets received (partial delivery)"
+ return "$ksft_fail"
+ else
+ echo "FAIL: 0/$count packets received (multicast encap resubmit broken)"
+ return "$ksft_fail"
+ fi
+}
+
+echo "TEST: FOU/GRETAP multicast encapsulation resubmit"
+
+setup
+test_fou_mcast_encap
+exit $?
--
2.47.3
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-07-01 23:11 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-07-01 23:10 [PATCH net-next v4 0/2] udp: fix FOU/GUE over multicast Anton Danilov
2026-07-01 23:10 ` [PATCH net-next v4 1/2] udp: fix encapsulation packet resubmit in multicast deliver Anton Danilov
2026-07-01 23:10 ` [PATCH net-next v4 2/2] selftests: net: add FOU multicast encapsulation resubmit test Anton Danilov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox