Netdev List
* [PATCH net-next v5 0/2] udp: fix FOU/GUE over multicast
From: Anton Danilov @ 2026-07-05  2:36 UTC
  To: netdev
  Cc: Willem de Bruijn, David S . Miller, David Ahern, Eric Dumazet,
	Kuniyuki Iwashima, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan, linux-kselftest

UDP encapsulation (FOU, GUE) has never worked correctly with multicast
destination addresses. When a FOU-encapsulated packet arrives at a
multicast address, it enters __udp4_lib_mcast_deliver() /
__udp6_lib_mcast_deliver() which call consume_skb() on packets that
need resubmission to the inner protocol handler, silently dropping
them instead.

The unicast delivery paths handle this correctly by propagating the
return value up to ip[6]_protocol_deliver_rcu() for resubmission, but
the multicast paths were never updated to support UDP encapsulation
resubmit.

This causes silent packet loss for FOU/GRETAP tunnels configured with
multicast remote addresses (both IPv4 and IPv6).

Reproducing the issue (IPv4):

  ip netns add ns_a && ip netns add ns_b
  ip link add veth0 netns ns_a type veth peer name veth1 netns ns_b

  ip -n ns_a addr add 10.0.0.1/24 dev veth0 && ip -n ns_a link set veth0 up
  ip -n ns_b addr add 10.0.0.2/24 dev veth1 && ip -n ns_b link set veth1 up

  ip -n ns_a route add 239.0.0.0/8 dev veth0
  ip -n ns_b route add 239.0.0.0/8 dev veth1

  # Disable early demux to expose the issue (otherwise it's partially masked)
  ip netns exec ns_b sysctl -w net.ipv4.ip_early_demux=0

  # Join multicast group on receiver
  ip -n ns_b addr add 239.0.0.1/32 dev veth1 autojoin

  # Sender: GRETAP with FOU encap
  ip -n ns_a link add eoudp0 type gretap \
      remote 239.0.0.1 local 10.0.0.1 \
      encap fou encap-sport 4797 encap-dport 4797 key 239.0.0.1
  ip -n ns_a link set eoudp0 up
  ip -n ns_a addr add 192.168.99.1/24 dev eoudp0

  # Receiver: FOU listener + GRETAP
  ip netns exec ns_b ip fou add port 4797 ipproto 47
  ip -n ns_b link add eoudp0 type gretap \
      remote 239.0.0.1 local 10.0.0.2 \
      encap fou encap-sport 4797 encap-dport 4797 key 239.0.0.1
  ip -n ns_b link set eoudp0 up
  ip -n ns_b addr add 192.168.99.2/24 dev eoudp0

  # Static neigh: ARP replies can't traverse unidirectional mcast tunnel
  recv_mac=$(ip -n ns_b link show eoudp0 | awk '/ether/{print $2}')
  ip -n ns_a neigh add 192.168.99.2 lladdr $recv_mac dev eoudp0

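  # Test: ping through the FOU/GRETAP tunnel. Echo replies cannot return
  # over the unidirectional tunnel, so check the RX counters on ns_b.
  ip netns exec ns_a ping -c 100 192.168.99.2
  ip -n ns_b -s link show eoudp0
  # -> without this patch: 0 packets received on eoudp0
  # -> with this patch: all packets received on eoudp0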

IPv6 (using fou6 + ip6gretap) exhibits the same silent drop but needs a
slightly different fix (see 1/2 for the sign-of-ret difference between
ip_protocol_deliver_rcu() and ip6_protocol_deliver_rcu()).

AI assistance (Claude, claude-opus-4-6) was used during root cause
analysis of the kernel source code (tracing the call chain from
udp[6]_queue_rcv_skb through encap_rcv to ip[6]_protocol_deliver_rcu,
comparing unicast/GSO/multicast paths) and during patch and selftest
authoring.

v5:
  - Fix IPv6 patch: return ret, not -ret (Willem de Bruijn)
  - selftest: add IPv6 test case
  - selftest: create the veth pair inside the namespaces
    (Willem de Bruijn)
v4: https://lore.kernel.org/netdev/cover.1782945956.git.littlesmilingcloud@gmail.com/
  - Promoted from RFC to PATCH; no functional changes since v3.
    v3 was posted as RFC and consequently dropped from patchwork,
    which explains the lack of review feedback.
v3: https://lore.kernel.org/netdev/cover.1777934869.git.littlesmilingcloud@gmail.com/
  - Use return -ret instead of calling ip_protocol_deliver_rcu()
    directly, matching the unicast path and avoiding call stack
    growth with nested encapsulations (Kuniyuki Iwashima)
  - Only change the first-socket path; the clone loop is not
    reachable for tunnel sockets (no SO_REUSEADDR/SO_REUSEPORT)
  - Replace Python packet generator with ping through a properly
    configured FOU/GRETAP tunnel in the selftest
  - Add static neighbor entry (ARP replies cannot traverse the
    unidirectional multicast tunnel)
v2: https://lore.kernel.org/netdev/ad_dal164gVmImWl@dau-home-pc/
  - Moved inline Python packet generator into a separate helper
  - Fixed author email typo in Signed-off-by
v1 (RFC): https://lore.kernel.org/netdev/ad7MsSJOuUU6EGwS@dau-home-pc/

Anton Danilov (2):
  udp: fix encapsulation packet resubmit in multicast deliver
  selftests: net: add FOU multicast encapsulation resubmit test

 net/ipv4/udp.c                                |   6 +-
 net/ipv6/udp.c                                |   6 +-
 tools/testing/selftests/net/Makefile          |   1 +
 .../testing/selftests/net/fou_mcast_encap.sh  | 177 ++++++++++++++++++
 4 files changed, 186 insertions(+), 4 deletions(-)
 create mode 100755 tools/testing/selftests/net/fou_mcast_encap.sh

-- 
2.47.3



* [PATCH net-next v5 1/2] udp: fix encapsulation packet resubmit in multicast deliver
From: Anton Danilov @ 2026-07-05  2:36 UTC
  To: netdev
  Cc: Willem de Bruijn, David S . Miller, David Ahern, Eric Dumazet,
	Kuniyuki Iwashima, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan, linux-kselftest

When a UDP encapsulation socket (e.g., FOU) receives a multicast
packet, __udp4_lib_mcast_deliver() and __udp6_lib_mcast_deliver()
call consume_skb() when udp_queue_rcv_skb() / udpv6_queue_rcv_skb()
returns a positive value. A positive return value indicates that the
encap_rcv handler (e.g., fou_udp_recv) has consumed the UDP header
and wants the packet to be resubmitted to the IP protocol handler
for further processing (e.g., as a GRE packet).

The unicast paths handle this correctly by propagating the return
value up to ip_protocol_deliver_rcu() / ip6_protocol_deliver_rcu()
for resubmission. However, the multicast paths destroy the packet
via consume_skb() instead of resubmitting it, causing silent packet
loss.

This affects any UDP encapsulation (FOU, GUE) combined with multicast
destination addresses.

Fix this by propagating a positive return value from
udp_queue_rcv_skb() / udpv6_queue_rcv_skb() to the caller, matching
the behavior of the corresponding unicast paths. Note the sign
difference between IPv4 and IPv6:

  - IPv4: udp_unicast_rcv_skb() returns -ret, and
    ip_protocol_deliver_rcu() resubmits when ret < 0
    (using -ret as the protocol number).
  - IPv6: udp6_unicast_rcv_skb() returns ret, and
    ip6_protocol_deliver_rcu() resubmits when ret > 0
    (using ret as the nexthdr).

Both mcast paths now follow the same convention as their respective
unicast paths.
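
For illustration only, the sign convention above can be sketched as a
small userspace model. This is not kernel code: every model_* name and
MODEL_IPPROTO_GRE are hypothetical stand-ins, and the model reduces each
function to the return-value handling described in this message.

```c
#include <assert.h>

/* Userspace model only; none of the model_* names exist in the kernel. */

#define MODEL_IPPROTO_GRE 47	/* matches "ipproto 47" in the FOU setup */

/*
 * Stand-in for udp_queue_rcv_skb() / udpv6_queue_rcv_skb(): returns the
 * inner protocol number (> 0) when the encap handler asks for a
 * resubmit, or 0 when the packet was delivered to the socket.
 */
static int model_queue_rcv(int inner_proto)
{
	return inner_proto > 0 ? inner_proto : 0;
}

/* Old mcast behavior: a positive return led to consume_skb(). */
static int model_mcast_deliver_old(int inner_proto)
{
	(void)model_queue_rcv(inner_proto);	/* result discarded */
	return 0;
}

/* New IPv4 mcast behavior: negate, as udp_unicast_rcv_skb() does. */
static int model_udp4_mcast_deliver(int inner_proto)
{
	int ret = model_queue_rcv(inner_proto);

	return ret > 0 ? -ret : 0;
}

/* New IPv6 mcast behavior: pass through, as udp6_unicast_rcv_skb() does. */
static int model_udp6_mcast_deliver(int inner_proto)
{
	int ret = model_queue_rcv(inner_proto);

	return ret > 0 ? ret : 0;
}

/* ip_protocol_deliver_rcu(): resubmits when ret < 0, to protocol -ret. */
static int model_ip4_resubmit_proto(int handler_ret)
{
	return handler_ret < 0 ? -handler_ret : 0;
}

/* ip6_protocol_deliver_rcu(): resubmits when ret > 0, to nexthdr ret. */
static int model_ip6_resubmit_nexthdr(int handler_ret)
{
	return handler_ret > 0 ? handler_ret : 0;
}
```

In this model a return of 0 from the resubmit helpers means "no
resubmit". Passing the IPv4-style negative value to the IPv6 helper
yields 0, which is why the IPv6 path has to return ret rather than
-ret.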

Suggested-by: Willem de Bruijn <willemb@google.com>
Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Assisted-by: Claude:claude-opus-4-6
---
 net/ipv4/udp.c | 6 ++++--
 net/ipv6/udp.c | 6 ++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 59248a59358c..d3ddcbfc8477 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2476,6 +2476,7 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	struct udp_hslot *hslot;
 	struct sk_buff *nskb;
 	bool use_hash2;
+	int ret;
 
 	hash2_any = 0;
 	hash2 = 0;
@@ -2520,8 +2521,9 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	}
 
 	if (first) {
-		if (udp_queue_rcv_skb(first, skb) > 0)
-			consume_skb(skb);
+		ret = udp_queue_rcv_skb(first, skb);
+		if (ret > 0)
+			return -ret;
 	} else {
 		kfree_skb(skb);
 		__UDP_INC_STATS(net, UDP_MIB_IGNOREDMULTI);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 392e18b97045..0910cc171776 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -949,6 +949,7 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	struct udp_hslot *hslot;
 	struct sk_buff *nskb;
 	bool use_hash2;
+	int ret;
 
 	hash2_any = 0;
 	hash2 = 0;
@@ -998,8 +999,9 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	}
 
 	if (first) {
-		if (udpv6_queue_rcv_skb(first, skb) > 0)
-			consume_skb(skb);
+		ret = udpv6_queue_rcv_skb(first, skb);
+		if (ret > 0)
+			return ret;
 	} else {
 		kfree_skb(skb);
 		__UDP6_INC_STATS(net, UDP_MIB_IGNOREDMULTI);
-- 
2.47.3



* [PATCH net-next v5 2/2] selftests: net: add FOU multicast encapsulation resubmit test
From: Anton Danilov @ 2026-07-05  2:36 UTC
  To: netdev
  Cc: Willem de Bruijn, David S . Miller, David Ahern, Eric Dumazet,
	Kuniyuki Iwashima, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan, linux-kselftest

Add a selftest to verify that FOU-encapsulated packets addressed to a
multicast destination are correctly resubmitted to the inner protocol
handler (GRE) via the UDP multicast delivery path.  Both IPv4 and IPv6
paths are tested.

The test creates two network namespaces connected by a veth pair with
a FOU/GRETAP (IPv4) and FOU/ip6gretap (IPv6) tunnel using multicast
remote addresses (239.0.0.1 and ff0e::1).  Ping is sent through each
tunnel and received packets are counted on the receiver's tunnel
interface.

The veth pair is created directly inside the namespaces to avoid
possible name collisions with devices in the root namespace.

Static neighbor entries are configured on the sender because ARP/ND
replies from the receiver cannot traverse the unidirectional multicast
tunnel back to the sender.

The early demux optimization (net.ipv4.ip_early_demux, which controls
both IPv4 and IPv6) is disabled on the receiver to force packets
through __udp4_lib_mcast_deliver() / __udp6_lib_mcast_deliver(), which
is the code path being tested.

Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Assisted-by: Claude:claude-opus-4-6
---
 tools/testing/selftests/net/Makefile          |   1 +
 .../testing/selftests/net/fou_mcast_encap.sh  | 177 ++++++++++++++++++
 2 files changed, 178 insertions(+)
 create mode 100755 tools/testing/selftests/net/fou_mcast_encap.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 708d960ae07d..7e9ae937cffa 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -39,6 +39,7 @@ TEST_PROGS := \
 	fib_rule_tests.sh \
 	fib_tests.sh \
 	fin_ack_lat.sh \
+	fou_mcast_encap.sh \
 	fq_band_pktlimit.sh \
 	gre_gso.sh \
 	gre_ipv6_lladdr.sh \
diff --git a/tools/testing/selftests/net/fou_mcast_encap.sh b/tools/testing/selftests/net/fou_mcast_encap.sh
new file mode 100755
index 000000000000..728513d55db4
--- /dev/null
+++ b/tools/testing/selftests/net/fou_mcast_encap.sh
@@ -0,0 +1,177 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test that UDP encapsulation (FOU) correctly handles packet resubmit
+# when packets are delivered via the multicast UDP delivery path.
+#
+# When a FOU-encapsulated packet arrives with a multicast destination IP,
+# __udp4_lib_mcast_deliver() / __udp6_lib_mcast_deliver() must resubmit
+# it to the inner protocol handler (e.g., GRE) rather than consuming it.
+# This test verifies both IPv4 and IPv6 paths by creating a FOU/GRETAP
+# tunnel with a multicast remote address and sending ping through it.
+#
+# The early demux optimization can mask this issue by routing packets via
+# the unicast path (udp[6]_unicast_rcv_skb), so we disable it to force
+# packets through the multicast delivery function.
+
+source lib.sh
+
+NSENDER=""
+NRECV=""
+
+FOU_PORT4=4797
+FOU_PORT6=4798
+MCAST4=239.0.0.1
+MCAST6=ff0e::1
+
+TUN4_S=192.168.99.1
+TUN4_R=192.168.99.2
+TUN6_S=2001:db8:99::1
+TUN6_R=2001:db8:99::2
+
+cleanup() {
+	cleanup_all_ns
+}
+
+trap cleanup EXIT
+
+setup_common() {
+	setup_ns NSENDER NRECV
+
+	# Create veth pair directly inside namespaces to avoid name
+	# collisions with devices in the root namespace.
+	ip link add veth_s netns "$NSENDER" type veth \
+		peer name veth_r netns "$NRECV"
+
+	ip -n "$NSENDER" link set veth_s up
+	ip -n "$NRECV" link set veth_r up
+
+	# Same sysctl controls early demux for both IPv4 and IPv6.
+	ip netns exec "$NRECV" sysctl -wq net.ipv4.ip_early_demux=0
+}
+
+setup_ipv4() {
+	ip -n "$NSENDER" addr add 10.0.0.1/24 dev veth_s
+	ip -n "$NRECV" addr add 10.0.0.2/24 dev veth_r
+
+	# Join multicast group on receiver
+	ip -n "$NRECV" addr add "$MCAST4/32" dev veth_r autojoin
+
+	ip -n "$NSENDER" route add 239.0.0.0/8 dev veth_s
+	ip -n "$NRECV" route add 239.0.0.0/8 dev veth_r
+
+	# Sender: GRETAP with FOU encap (no FOU listener needed on TX side)
+	ip -n "$NSENDER" link add eoudp4 type gretap \
+		remote "$MCAST4" local 10.0.0.1 \
+		encap fou encap-sport "$FOU_PORT4" encap-dport "$FOU_PORT4" \
+		key "$MCAST4"
+	ip -n "$NSENDER" link set eoudp4 up
+	ip -n "$NSENDER" addr add "$TUN4_S/24" dev eoudp4
+
+	# Receiver: FOU listener + GRETAP
+	ip netns exec "$NRECV" ip fou add port "$FOU_PORT4" ipproto 47
+	ip -n "$NRECV" link add eoudp4 type gretap \
+		remote "$MCAST4" local 10.0.0.2 \
+		encap fou encap-sport "$FOU_PORT4" encap-dport "$FOU_PORT4" \
+		key "$MCAST4"
+	ip -n "$NRECV" link set eoudp4 up
+	ip -n "$NRECV" addr add "$TUN4_R/24" dev eoudp4
+
+	# Static neigh on sender: ARP replies cannot traverse the
+	# unidirectional multicast tunnel.
+	local recv_mac
+	recv_mac=$(ip -n "$NRECV" link show eoudp4 | awk '/ether/{print $2}')
+	ip -n "$NSENDER" neigh add "$TUN4_R" lladdr "$recv_mac" dev eoudp4
+}
+
+setup_ipv6() {
+	# Skip cleanly if IPv6 is not available in the running kernel.
+	[ -e /proc/sys/net/ipv6 ] || return "$ksft_skip"
+	modprobe -q fou6 || return "$ksft_skip"
+
+	ip -n "$NSENDER" addr add 2001:db8::1/64 dev veth_s nodad
+	ip -n "$NRECV" addr add 2001:db8::2/64 dev veth_r nodad
+
+	# Join multicast group on receiver
+	ip -n "$NRECV" addr add "$MCAST6/128" dev veth_r autojoin
+
+	ip -n "$NSENDER" -6 route add ff00::/8 dev veth_s
+	ip -n "$NRECV" -6 route add ff00::/8 dev veth_r
+
+	# Sender: ip6gretap with FOU encap
+	ip -n "$NSENDER" link add eoudp6 type ip6gretap \
+		remote "$MCAST6" local 2001:db8::1 \
+		encap fou encap-sport "$FOU_PORT6" encap-dport "$FOU_PORT6" \
+		key 42
+	ip -n "$NSENDER" link set eoudp6 up
+	ip -n "$NSENDER" addr add "$TUN6_S/64" dev eoudp6 nodad
+
+	# Receiver: FOU listener (IPv6) + ip6gretap
+	ip netns exec "$NRECV" ip fou add port "$FOU_PORT6" ipproto 47 -6
+	ip -n "$NRECV" link add eoudp6 type ip6gretap \
+		remote "$MCAST6" local 2001:db8::2 \
+		encap fou encap-sport "$FOU_PORT6" encap-dport "$FOU_PORT6" \
+		key 42
+	ip -n "$NRECV" link set eoudp6 up
+	ip -n "$NRECV" addr add "$TUN6_R/64" dev eoudp6 nodad
+
+	# Static neigh on sender: neighbor discovery cannot traverse the
+	# unidirectional multicast tunnel.
+	local recv_mac
+	recv_mac=$(ip -n "$NRECV" link show eoudp6 | awk '/ether/{print $2}')
+	ip -n "$NSENDER" neigh add "$TUN6_R" lladdr "$recv_mac" dev eoudp6
+}
+
+get_rx_packets() {
+	local dev="$1"
+
+	ip -n "$NRECV" -s link show "$dev" | awk '/RX:/{getline; print $2}'
+}
+
+run_ping_test() {
+	local family="$1"
+	local dev="$2"
+	local dst="$3"
+	local count=100
+	local rx_before rx_after rx_delta
+
+	# Warmup: let any initial broadcast/ND traffic settle
+	ip netns exec "$NSENDER" ping "$family" -c 1 -W 1 "$dst" \
+		>/dev/null 2>&1
+	sleep 1
+
+	rx_before=$(get_rx_packets "$dev")
+	ip netns exec "$NSENDER" ping "$family" -c $count -W 1 "$dst" \
+		>/dev/null 2>&1
+	sleep 1
+	rx_after=$(get_rx_packets "$dev")
+
+	rx_delta=$((rx_after - rx_before))
+
+	if [ "$rx_delta" -ge "$count" ]; then
+		echo "PASS: received $rx_delta/$count packets"
+		return "$ksft_pass"
+	elif [ "$rx_delta" -gt 0 ]; then
+		echo "FAIL: only $rx_delta/$count packets received"
+		return "$ksft_fail"
+	else
+		echo "FAIL: 0/$count packets received"
+		return "$ksft_fail"
+	fi
+}
+
+ret=0
+
+echo "TEST: FOU/GRETAP IPv4 multicast encapsulation resubmit"
+setup_common
+setup_ipv4
+run_ping_test -4 eoudp4 "$TUN4_R" || ret=$?
+
+echo "TEST: FOU/GRETAP IPv6 multicast encapsulation resubmit"
+if setup_ipv6; then
+	run_ping_test -6 eoudp6 "$TUN6_R" || ret=$?
+else
+	echo "SKIP: IPv6 unavailable"
+fi
+
+exit $ret
-- 
2.47.3

