* [PATCH net v4 0/2] gre: Fix regressions in IPv6 link-local address generation.
@ 2025-03-07 19:28 Guillaume Nault
2025-03-07 19:28 ` [PATCH net v4 1/2] gre: Fix " Guillaume Nault
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Guillaume Nault @ 2025-03-07 19:28 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: netdev, Simon Horman, David Ahern, Antonio Quartulli,
Ido Schimmel, Petr Machata
IPv6 link-local address generation has some special cases for GRE
devices. This has led to several regressions in the past, and some of
them are still not fixed. This series fixes the remaining problems,
like the ipv6.conf.<dev>.addr_gen_mode sysctl being ignored and the
router discovery process not being started (see details in patch 1).
To avoid any further regressions, patch 2 adds selftests covering
IPv4 and IPv6 gre/gretap devices with all combinations of currently
supported addr_gen_mode values.
v4: Use lib.sh in the selftest (patch 2).
v3: Rework patch 1's commit message (typos + GRE device types
clarifications).
v2: Add Makefile entry for the new selftest (patch 2).
Guillaume Nault (2):
gre: Fix IPv6 link-local address generation.
selftests: Add IPv6 link-local address generation tests for GRE
devices.
net/ipv6/addrconf.c | 15 +-
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/gre_ipv6_lladdr.sh | 177 ++++++++++++++++++
3 files changed, 187 insertions(+), 6 deletions(-)
create mode 100755 tools/testing/selftests/net/gre_ipv6_lladdr.sh
--
2.39.2
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-07 19:28 [PATCH net v4 0/2] gre: Fix regressions in IPv6 link-local address generation Guillaume Nault
@ 2025-03-07 19:28 ` Guillaume Nault
2025-03-09 18:03 ` Ido Schimmel
2025-03-14 15:18 ` Stanislav Fomichev
2025-03-07 19:28 ` [PATCH net v4 2/2] selftests: Add IPv6 link-local address generation tests for GRE devices Guillaume Nault
2025-03-13 9:30 ` [PATCH net v4 0/2] gre: Fix regressions in IPv6 link-local address generation patchwork-bot+netdevbpf
2 siblings, 2 replies; 15+ messages in thread
From: Guillaume Nault @ 2025-03-07 19:28 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: netdev, Simon Horman, David Ahern, Antonio Quartulli,
Ido Schimmel, Petr Machata
Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE
devices in most cases and fall back to using add_v4_addrs() only in
case the GRE configuration is incompatible with addrconf_addr_gen().
GRE used to use addrconf_addr_gen() until commit e5dd729460ca
("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL
address") restricted this use to gretap and ip6gretap devices, and
created add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones.
The original problem came when commit 9af28511be10 ("addrconf: refuse
isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its
addr parameter was 0. The commit says that this would create an invalid
address, however, I couldn't find any RFC saying that the generated
interface identifier would be wrong. Anyway, since gre over IPv4
devices pass their local tunnel address to __ipv6_isatap_ifid(), that
commit broke their IPv6 link-local address generation when the local
address was unspecified.
Then commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT
interfaces when computing v6LL address") tried to fix that case by
defining add_v4_addrs() and calling it to generate the IPv6 link-local
address instead of using addrconf_addr_gen() (apart for gretap and
ip6gretap devices, which would still use the regular
addrconf_addr_gen(), since they have a MAC address).
That broke several use cases because add_v4_addrs() isn't properly
integrated into the rest of IPv6 Neighbor Discovery code. Several of
these shortcomings have been fixed over time, but add_v4_addrs()
remains broken on several aspects. In particular, it doesn't send any
Router Sollicitations, so the SLAAC process doesn't start until the
interface receives a Router Advertisement. Also, add_v4_addrs() mostly
ignores the address generation mode of the interface
(/proc/sys/net/ipv6/conf/*/addr_gen_mode), thus breaking the
IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases.
Fix the situation by using add_v4_addrs() only in the specific scenario
where the normal method would fail. That is, for interfaces that have
all of the following characteristics:
* run over IPv4,
* transport IP packets directly, not Ethernet (that is, not gretap
interfaces),
* tunnel endpoint is INADDR_ANY (that is, 0),
* device address generation mode is EUI64.
In all other cases, revert back to the regular addrconf_addr_gen().
Also, remove the special case for ip6gre interfaces in add_v4_addrs(),
since ip6gre devices now always use addrconf_addr_gen() instead.
Fixes: e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
v4: No changes.
v3: Rework commit message to make it clearer which types of GRE devices
we're talking about (Ido).
v2: No changes.
net/ipv6/addrconf.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ac8cc1076536..8b6258819dad 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3209,16 +3209,13 @@ static void add_v4_addrs(struct inet6_dev *idev)
struct in6_addr addr;
struct net_device *dev;
struct net *net = dev_net(idev->dev);
- int scope, plen, offset = 0;
+ int scope, plen;
u32 pflags = 0;
ASSERT_RTNL();
memset(&addr, 0, sizeof(struct in6_addr));
- /* in case of IP6GRE the dev_addr is an IPv6 and therefore we use only the last 4 bytes */
- if (idev->dev->addr_len == sizeof(struct in6_addr))
- offset = sizeof(struct in6_addr) - 4;
- memcpy(&addr.s6_addr32[3], idev->dev->dev_addr + offset, 4);
+ memcpy(&addr.s6_addr32[3], idev->dev->dev_addr, 4);
if (!(idev->dev->flags & IFF_POINTOPOINT) && idev->dev->type == ARPHRD_SIT) {
scope = IPV6_ADDR_COMPATv4;
@@ -3529,7 +3526,13 @@ static void addrconf_gre_config(struct net_device *dev)
return;
}
- if (dev->type == ARPHRD_ETHER) {
+ /* Generate the IPv6 link-local address using addrconf_addr_gen(),
+ * unless we have an IPv4 GRE device not bound to an IP address and
+ * which is in EUI64 mode (as __ipv6_isatap_ifid() would fail in this
+ * case). Such devices fall back to add_v4_addrs() instead.
+ */
+ if (!(dev->type == ARPHRD_IPGRE && *(__be32 *)dev->dev_addr == 0 &&
+ idev->cnf.addr_gen_mode == IN6_ADDR_GEN_MODE_EUI64)) {
addrconf_addr_gen(idev, true);
return;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net v4 2/2] selftests: Add IPv6 link-local address generation tests for GRE devices.
2025-03-07 19:28 [PATCH net v4 0/2] gre: Fix regressions in IPv6 link-local address generation Guillaume Nault
2025-03-07 19:28 ` [PATCH net v4 1/2] gre: Fix " Guillaume Nault
@ 2025-03-07 19:28 ` Guillaume Nault
2025-03-09 18:04 ` Ido Schimmel
2025-03-10 9:44 ` Petr Machata
2025-03-13 9:30 ` [PATCH net v4 0/2] gre: Fix regressions in IPv6 link-local address generation patchwork-bot+netdevbpf
2 siblings, 2 replies; 15+ messages in thread
From: Guillaume Nault @ 2025-03-07 19:28 UTC (permalink / raw)
To: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
Cc: netdev, Simon Horman, David Ahern, Antonio Quartulli,
Ido Schimmel, Petr Machata
GRE devices have their special code for IPv6 link-local address
generation that has been the source of several regressions in the past.
Add selftest to check that all gre, ip6gre, gretap and ip6gretap get an
IPv6 link-link local address in accordance with the
net.ipv6.conf.<dev>.addr_gen_mode sysctl.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
---
v4: Use lib.sh (Paolo, Petr).
v3: No changes.
v2: Add selftest to Makefile.
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/gre_ipv6_lladdr.sh | 177 ++++++++++++++++++
2 files changed, 178 insertions(+)
create mode 100755 tools/testing/selftests/net/gre_ipv6_lladdr.sh
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 73ee88d6b043..5916f3b81c39 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -31,6 +31,7 @@ TEST_PROGS += veth.sh
TEST_PROGS += ioam6.sh
TEST_PROGS += gro.sh
TEST_PROGS += gre_gso.sh
+TEST_PROGS += gre_ipv6_lladdr.sh
TEST_PROGS += cmsg_so_mark.sh
TEST_PROGS += cmsg_so_priority.sh
TEST_PROGS += cmsg_time.sh cmsg_ipv6.sh
diff --git a/tools/testing/selftests/net/gre_ipv6_lladdr.sh b/tools/testing/selftests/net/gre_ipv6_lladdr.sh
new file mode 100755
index 000000000000..5b34f6e1f831
--- /dev/null
+++ b/tools/testing/selftests/net/gre_ipv6_lladdr.sh
@@ -0,0 +1,177 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+source ./lib.sh
+
+PAUSE_ON_FAIL="no"
+
+# The trap function handler
+#
+exit_cleanup_all()
+{
+ cleanup_all_ns
+
+ exit "${EXIT_STATUS}"
+}
+
+# Add fake IPv4 and IPv6 networks on the loopback device, to be used as
+# underlay by future GRE devices.
+#
+setup_basenet()
+{
+ ip -netns "${NS0}" link set dev lo up
+ ip -netns "${NS0}" address add dev lo 192.0.2.10/24
+ ip -netns "${NS0}" address add dev lo 2001:db8::10/64 nodad
+}
+
+# Check if network device has an IPv6 link-local address assigned.
+#
+# Parameters:
+#
+# * $1: The network device to test
+# * $2: An extra regular expression that should be matched (to verify the
+# presence of extra attributes)
+# * $3: The expected return code from grep (to allow checking the absence of
+# a link-local address)
+# * $4: The user visible name for the scenario being tested
+#
+check_ipv6_ll_addr()
+{
+ local DEV="$1"
+ local EXTRA_MATCH="$2"
+ local XRET="$3"
+ local MSG="$4"
+
+ RET=0
+ set +e
+ ip -netns "${NS0}" -6 address show dev "${DEV}" scope link | grep "fe80::" | grep -q "${EXTRA_MATCH}"
+ check_err_fail "${XRET}" $? ""
+ log_test "${MSG}"
+ set -e
+}
+
+# Create a GRE device and verify that it gets an IPv6 link-local address as
+# expected.
+#
+# Parameters:
+#
+# * $1: The device type (gre, ip6gre, gretap or ip6gretap)
+# * $2: The local underlay IP address (can be an IPv4, an IPv6 or "any")
+# * $3: The remote underlay IP address (can be an IPv4, an IPv6 or "any")
+# * $4: The IPv6 interface identifier generation mode to use for the GRE
+# device (eui64, none, stable-privacy or random).
+#
+test_gre_device()
+{
+ local GRE_TYPE="$1"
+ local LOCAL_IP="$2"
+ local REMOTE_IP="$3"
+ local MODE="$4"
+ local ADDR_GEN_MODE
+ local MATCH_REGEXP
+ local MSG
+
+ ip link add netns "${NS0}" name gretest type "${GRE_TYPE}" local "${LOCAL_IP}" remote "${REMOTE_IP}"
+
+ case "${MODE}" in
+ "eui64")
+ ADDR_GEN_MODE=0
+ MATCH_REGEXP=""
+ MSG="${GRE_TYPE}, mode: 0 (EUI64), ${LOCAL_IP} -> ${REMOTE_IP}"
+ XRET=0
+ ;;
+ "none")
+ ADDR_GEN_MODE=1
+ MATCH_REGEXP=""
+ MSG="${GRE_TYPE}, mode: 1 (none), ${LOCAL_IP} -> ${REMOTE_IP}"
+ XRET=1 # No link-local address should be generated
+ ;;
+ "stable-privacy")
+ ADDR_GEN_MODE=2
+ MATCH_REGEXP="stable-privacy"
+ MSG="${GRE_TYPE}, mode: 2 (stable privacy), ${LOCAL_IP} -> ${REMOTE_IP}"
+ XRET=0
+ # Initialise stable_secret (required for stable-privacy mode)
+ ip netns exec "${NS0}" sysctl -qw net.ipv6.conf.gretest.stable_secret="2001:db8::abcd"
+ ;;
+ "random")
+ ADDR_GEN_MODE=3
+ MATCH_REGEXP="stable-privacy"
+ MSG="${GRE_TYPE}, mode: 3 (random), ${LOCAL_IP} -> ${REMOTE_IP}"
+ XRET=0
+ ;;
+ esac
+
+ # Check that IPv6 link-local address is generated when device goes up
+ ip netns exec "${NS0}" sysctl -qw net.ipv6.conf.gretest.addr_gen_mode="${ADDR_GEN_MODE}"
+ ip -netns "${NS0}" link set dev gretest up
+ check_ipv6_ll_addr gretest "${MATCH_REGEXP}" "${XRET}" "config: ${MSG}"
+
+ # Now disable link-local address generation
+ ip -netns "${NS0}" link set dev gretest down
+ ip netns exec "${NS0}" sysctl -qw net.ipv6.conf.gretest.addr_gen_mode=1
+ ip -netns "${NS0}" link set dev gretest up
+
+ # Check that link-local address generation works when re-enabled while
+ # the device is already up
+ ip netns exec "${NS0}" sysctl -qw net.ipv6.conf.gretest.addr_gen_mode="${ADDR_GEN_MODE}"
+ check_ipv6_ll_addr gretest "${MATCH_REGEXP}" "${XRET}" "update: ${MSG}"
+
+ ip -netns "${NS0}" link del dev gretest
+}
+
+test_gre4()
+{
+ local GRE_TYPE
+ local MODE
+
+ for GRE_TYPE in "gre" "gretap"; do
+ printf "\n####\nTesting IPv6 link-local address generation on ${GRE_TYPE} devices\n####\n\n"
+
+ for MODE in "eui64" "none" "stable-privacy" "random"; do
+ test_gre_device "${GRE_TYPE}" 192.0.2.10 192.0.2.11 "${MODE}"
+ test_gre_device "${GRE_TYPE}" any 192.0.2.11 "${MODE}"
+ test_gre_device "${GRE_TYPE}" 192.0.2.10 any "${MODE}"
+ done
+ done
+}
+
+test_gre6()
+{
+ local GRE_TYPE
+ local MODE
+
+ for GRE_TYPE in "ip6gre" "ip6gretap"; do
+ printf "\n####\nTesting IPv6 link-local address generation on ${GRE_TYPE} devices\n####\n\n"
+
+ for MODE in "eui64" "none" "stable-privacy" "random"; do
+ test_gre_device "${GRE_TYPE}" 2001:db8::10 2001:db8::11 "${MODE}"
+ test_gre_device "${GRE_TYPE}" any 2001:db8::11 "${MODE}"
+ test_gre_device "${GRE_TYPE}" 2001:db8::10 any "${MODE}"
+ done
+ done
+}
+
+usage()
+{
+ echo "Usage: $0 [-p]"
+ exit 1
+}
+
+while getopts :p o
+do
+ case $o in
+ p) PAUSE_ON_FAIL="yes";;
+ *) usage;;
+ esac
+done
+
+setup_ns NS0
+
+set -e
+trap exit_cleanup_all EXIT
+
+setup_basenet
+
+test_gre4
+test_gre6
--
2.39.2
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-07 19:28 ` [PATCH net v4 1/2] gre: Fix " Guillaume Nault
@ 2025-03-09 18:03 ` Ido Schimmel
2025-03-14 15:18 ` Stanislav Fomichev
1 sibling, 0 replies; 15+ messages in thread
From: Ido Schimmel @ 2025-03-09 18:03 UTC (permalink / raw)
To: Guillaume Nault
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet, netdev,
Simon Horman, David Ahern, Antonio Quartulli, Petr Machata
On Fri, Mar 07, 2025 at 08:28:53PM +0100, Guillaume Nault wrote:
> Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE
> devices in most cases and fall back to using add_v4_addrs() only in
> case the GRE configuration is incompatible with addrconf_addr_gen().
>
> GRE used to use addrconf_addr_gen() until commit e5dd729460ca
> ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL
> address") restricted this use to gretap and ip6gretap devices, and
> created add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones.
>
> The original problem came when commit 9af28511be10 ("addrconf: refuse
> isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its
> addr parameter was 0. The commit says that this would create an invalid
> address, however, I couldn't find any RFC saying that the generated
> interface identifier would be wrong. Anyway, since gre over IPv4
> devices pass their local tunnel address to __ipv6_isatap_ifid(), that
> commit broke their IPv6 link-local address generation when the local
> address was unspecified.
>
> Then commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT
> interfaces when computing v6LL address") tried to fix that case by
> defining add_v4_addrs() and calling it to generate the IPv6 link-local
> address instead of using addrconf_addr_gen() (apart for gretap and
> ip6gretap devices, which would still use the regular
> addrconf_addr_gen(), since they have a MAC address).
>
> That broke several use cases because add_v4_addrs() isn't properly
> integrated into the rest of IPv6 Neighbor Discovery code. Several of
> these shortcomings have been fixed over time, but add_v4_addrs()
> remains broken on several aspects. In particular, it doesn't send any
> Router Sollicitations, so the SLAAC process doesn't start until the
> interface receives a Router Advertisement. Also, add_v4_addrs() mostly
> ignores the address generation mode of the interface
> (/proc/sys/net/ipv6/conf/*/addr_gen_mode), thus breaking the
> IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases.
>
> Fix the situation by using add_v4_addrs() only in the specific scenario
> where the normal method would fail. That is, for interfaces that have
> all of the following characteristics:
>
> * run over IPv4,
> * transport IP packets directly, not Ethernet (that is, not gretap
> interfaces),
> * tunnel endpoint is INADDR_ANY (that is, 0),
> * device address generation mode is EUI64.
>
> In all other cases, revert back to the regular addrconf_addr_gen().
>
> Also, remove the special case for ip6gre interfaces in add_v4_addrs(),
> since ip6gre devices now always use addrconf_addr_gen() instead.
>
> Fixes: e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address")
> Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 2/2] selftests: Add IPv6 link-local address generation tests for GRE devices.
2025-03-07 19:28 ` [PATCH net v4 2/2] selftests: Add IPv6 link-local address generation tests for GRE devices Guillaume Nault
@ 2025-03-09 18:04 ` Ido Schimmel
2025-03-10 9:44 ` Petr Machata
1 sibling, 0 replies; 15+ messages in thread
From: Ido Schimmel @ 2025-03-09 18:04 UTC (permalink / raw)
To: Guillaume Nault
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet, netdev,
Simon Horman, David Ahern, Antonio Quartulli, Petr Machata
On Fri, Mar 07, 2025 at 08:28:58PM +0100, Guillaume Nault wrote:
> GRE devices have their special code for IPv6 link-local address
> generation that has been the source of several regressions in the past.
>
> Add selftest to check that all gre, ip6gre, gretap and ip6gretap get an
> IPv6 link-link local address in accordance with the
> net.ipv6.conf.<dev>.addr_gen_mode sysctl.
>
> Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 2/2] selftests: Add IPv6 link-local address generation tests for GRE devices.
2025-03-07 19:28 ` [PATCH net v4 2/2] selftests: Add IPv6 link-local address generation tests for GRE devices Guillaume Nault
2025-03-09 18:04 ` Ido Schimmel
@ 2025-03-10 9:44 ` Petr Machata
1 sibling, 0 replies; 15+ messages in thread
From: Petr Machata @ 2025-03-10 9:44 UTC (permalink / raw)
To: Guillaume Nault
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet, netdev,
Simon Horman, David Ahern, Antonio Quartulli, Ido Schimmel,
Petr Machata
Guillaume Nault <gnault@redhat.com> writes:
> GRE devices have their special code for IPv6 link-local address
> generation that has been the source of several regressions in the past.
>
> Add selftest to check that all gre, ip6gre, gretap and ip6gretap get an
> IPv6 link-link local address in accordance with the
> net.ipv6.conf.<dev>.addr_gen_mode sysctl.
>
> Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 0/2] gre: Fix regressions in IPv6 link-local address generation.
2025-03-07 19:28 [PATCH net v4 0/2] gre: Fix regressions in IPv6 link-local address generation Guillaume Nault
2025-03-07 19:28 ` [PATCH net v4 1/2] gre: Fix " Guillaume Nault
2025-03-07 19:28 ` [PATCH net v4 2/2] selftests: Add IPv6 link-local address generation tests for GRE devices Guillaume Nault
@ 2025-03-13 9:30 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-03-13 9:30 UTC (permalink / raw)
To: Guillaume Nault
Cc: davem, kuba, pabeni, edumazet, netdev, horms, dsahern, antonio,
idosch, petrm
Hello:
This series was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Fri, 7 Mar 2025 20:28:48 +0100 you wrote:
> IPv6 link-local address generation has some special cases for GRE
> devices. This has led to several regressions in the past, and some of
> them are still not fixed. This series fixes the remaining problems,
> like the ipv6.conf.<dev>.addr_gen_mode sysctl being ignored and the
> router discovery process not being started (see details in patch 1).
>
> To avoid any further regressions, patch 2 adds selftests covering
> IPv4 and IPv6 gre/gretap devices with all combinations of currently
> supported addr_gen_mode values.
>
> [...]
Here is the summary with links:
- [net,v4,1/2] gre: Fix IPv6 link-local address generation.
https://git.kernel.org/netdev/net/c/183185a18ff9
- [net,v4,2/2] selftests: Add IPv6 link-local address generation tests for GRE devices.
https://git.kernel.org/netdev/net/c/6f50175ccad4
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-07 19:28 ` [PATCH net v4 1/2] gre: Fix " Guillaume Nault
2025-03-09 18:03 ` Ido Schimmel
@ 2025-03-14 15:18 ` Stanislav Fomichev
2025-03-14 19:22 ` Guillaume Nault
1 sibling, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2025-03-14 15:18 UTC (permalink / raw)
To: Guillaume Nault
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet, netdev,
Simon Horman, David Ahern, Antonio Quartulli, Ido Schimmel,
Petr Machata
On 03/07, Guillaume Nault wrote:
> Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE
> devices in most cases and fall back to using add_v4_addrs() only in
> case the GRE configuration is incompatible with addrconf_addr_gen().
>
> GRE used to use addrconf_addr_gen() until commit e5dd729460ca
> ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL
> address") restricted this use to gretap and ip6gretap devices, and
> created add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones.
>
> The original problem came when commit 9af28511be10 ("addrconf: refuse
> isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its
> addr parameter was 0. The commit says that this would create an invalid
> address, however, I couldn't find any RFC saying that the generated
> interface identifier would be wrong. Anyway, since gre over IPv4
> devices pass their local tunnel address to __ipv6_isatap_ifid(), that
> commit broke their IPv6 link-local address generation when the local
> address was unspecified.
>
> Then commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT
> interfaces when computing v6LL address") tried to fix that case by
> defining add_v4_addrs() and calling it to generate the IPv6 link-local
> address instead of using addrconf_addr_gen() (apart for gretap and
> ip6gretap devices, which would still use the regular
> addrconf_addr_gen(), since they have a MAC address).
>
> That broke several use cases because add_v4_addrs() isn't properly
> integrated into the rest of IPv6 Neighbor Discovery code. Several of
> these shortcomings have been fixed over time, but add_v4_addrs()
> remains broken on several aspects. In particular, it doesn't send any
> Router Sollicitations, so the SLAAC process doesn't start until the
> interface receives a Router Advertisement. Also, add_v4_addrs() mostly
> ignores the address generation mode of the interface
> (/proc/sys/net/ipv6/conf/*/addr_gen_mode), thus breaking the
> IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases.
>
> Fix the situation by using add_v4_addrs() only in the specific scenario
> where the normal method would fail. That is, for interfaces that have
> all of the following characteristics:
>
> * run over IPv4,
> * transport IP packets directly, not Ethernet (that is, not gretap
> interfaces),
> * tunnel endpoint is INADDR_ANY (that is, 0),
> * device address generation mode is EUI64.
Could you please double check net/forwarding/ip6gre_custom_multipath_hash.sh ?
It seems like it started falling after this series has been pulled:
https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/31301/2-ip6gre-custom-multipath-hash-sh/stdout
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-14 15:18 ` Stanislav Fomichev
@ 2025-03-14 19:22 ` Guillaume Nault
2025-03-14 20:18 ` Stanislav Fomichev
0 siblings, 1 reply; 15+ messages in thread
From: Guillaume Nault @ 2025-03-14 19:22 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet, netdev,
Simon Horman, David Ahern, Antonio Quartulli, Ido Schimmel,
Petr Machata
On Fri, Mar 14, 2025 at 08:18:32AM -0700, Stanislav Fomichev wrote:
> On 03/07, Guillaume Nault wrote:
> > Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE
> > devices in most cases and fall back to using add_v4_addrs() only in
> > case the GRE configuration is incompatible with addrconf_addr_gen().
> >
> > GRE used to use addrconf_addr_gen() until commit e5dd729460ca
> > ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL
> > address") restricted this use to gretap and ip6gretap devices, and
> > created add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones.
> >
> > The original problem came when commit 9af28511be10 ("addrconf: refuse
> > isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its
> > addr parameter was 0. The commit says that this would create an invalid
> > address, however, I couldn't find any RFC saying that the generated
> > interface identifier would be wrong. Anyway, since gre over IPv4
> > devices pass their local tunnel address to __ipv6_isatap_ifid(), that
> > commit broke their IPv6 link-local address generation when the local
> > address was unspecified.
> >
> > Then commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT
> > interfaces when computing v6LL address") tried to fix that case by
> > defining add_v4_addrs() and calling it to generate the IPv6 link-local
> > address instead of using addrconf_addr_gen() (apart for gretap and
> > ip6gretap devices, which would still use the regular
> > addrconf_addr_gen(), since they have a MAC address).
> >
> > That broke several use cases because add_v4_addrs() isn't properly
> > integrated into the rest of IPv6 Neighbor Discovery code. Several of
> > these shortcomings have been fixed over time, but add_v4_addrs()
> > remains broken on several aspects. In particular, it doesn't send any
> > Router Sollicitations, so the SLAAC process doesn't start until the
> > interface receives a Router Advertisement. Also, add_v4_addrs() mostly
> > ignores the address generation mode of the interface
> > (/proc/sys/net/ipv6/conf/*/addr_gen_mode), thus breaking the
> > IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases.
> >
> > Fix the situation by using add_v4_addrs() only in the specific scenario
> > where the normal method would fail. That is, for interfaces that have
> > all of the following characteristics:
> >
> > * run over IPv4,
> > * transport IP packets directly, not Ethernet (that is, not gretap
> > interfaces),
> > * tunnel endpoint is INADDR_ANY (that is, 0),
> > * device address generation mode is EUI64.
>
> Could you please double check net/forwarding/ip6gre_custom_multipath_hash.sh ?
> It seems like it started falling after this series has been pulled:
> https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/31301/2-ip6gre-custom-multipath-hash-sh/stdout
Hum, net/forwarding/ip6gre_custom_multipath_hash.sh works for me on the
current net tree (I'm at commit 4003c9e78778). I have only one failure,
but it already happened before 183185a18ff9 ("gre: Fix IPv6 link-local
address generation.") was applied.
# ./ip6gre_custom_multipath_hash.sh
TEST: ping [ OK ]
TEST: ping6 [ OK ]
INFO: Running IPv4 overlay custom multipath hash tests
TEST: Multipath hash field: Inner source IP (balanced) [ OK ]
INFO: Packets sent on path1 / path2: 6350 / 6251
TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
INFO: Packets sent on path1 / path2: 12602 / 0
TEST: Multipath hash field: Inner destination IP (balanced) [ OK ]
INFO: Packets sent on path1 / path2: 5400 / 7201
TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
INFO: Packets sent on path1 / path2: 0 / 12600
TEST: Multipath hash field: Inner source port (balanced) [ OK ]
INFO: Packets sent on path1 / path2: 16458 / 16311
TEST: Multipath hash field: Inner source port (unbalanced) [ OK ]
INFO: Packets sent on path1 / path2: 32769 / 0
TEST: Multipath hash field: Inner destination port (balanced) [ OK ]
INFO: Packets sent on path1 / path2: 16458 / 16311
TEST: Multipath hash field: Inner destination port (unbalanced) [ OK ]
INFO: Packets sent on path1 / path2: 0 / 32769
INFO: Running IPv6 overlay custom multipath hash tests
TEST: Multipath hash field: Inner source IP (balanced) [ OK ]
INFO: Packets sent on path1 / path2: 5900 / 6700
TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
INFO: Packets sent on path1 / path2: 0 / 12600
TEST: Multipath hash field: Inner destination IP (balanced) [ OK ]
INFO: Packets sent on path1 / path2: 5900 / 6700
TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
INFO: Packets sent on path1 / path2: 12600 / 0
TEST: Multipath hash field: Inner flowlabel (balanced) [FAIL]
Expected traffic to be balanced, but it is not
INFO: Packets sent on path1 / path2: 0 / 1
TEST: Multipath hash field: Inner flowlabel (unbalanced) [ OK ]
INFO: Packets sent on path1 / path2: 0 / 12600
TEST: Multipath hash field: Inner source port (balanced) [ OK ]
INFO: Packets sent on path1 / path2: 16387 / 16385
TEST: Multipath hash field: Inner source port (unbalanced) [ OK ]
INFO: Packets sent on path1 / path2: 32770 / 0
TEST: Multipath hash field: Inner destination port (balanced) [ OK ]
INFO: Packets sent on path1 / path2: 16386 / 16384
TEST: Multipath hash field: Inner destination port (unbalanced) [ OK ]
INFO: Packets sent on path1 / path2: 32769 / 0
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-14 19:22 ` Guillaume Nault
@ 2025-03-14 20:18 ` Stanislav Fomichev
2025-03-16 13:08 ` Ido Schimmel
0 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2025-03-14 20:18 UTC (permalink / raw)
To: Guillaume Nault
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet, netdev,
Simon Horman, David Ahern, Antonio Quartulli, Ido Schimmel,
Petr Machata
On 03/14, Guillaume Nault wrote:
> On Fri, Mar 14, 2025 at 08:18:32AM -0700, Stanislav Fomichev wrote:
> > On 03/07, Guillaume Nault wrote:
> > > Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE
> > > devices in most cases and fall back to using add_v4_addrs() only in
> > > case the GRE configuration is incompatible with addrconf_addr_gen().
> > >
> > > GRE used to use addrconf_addr_gen() until commit e5dd729460ca
> > > ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL
> > > address") restricted this use to gretap and ip6gretap devices, and
> > > created add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones.
> > >
> > > The original problem came when commit 9af28511be10 ("addrconf: refuse
> > > isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its
> > > addr parameter was 0. The commit says that this would create an invalid
> > > address, however, I couldn't find any RFC saying that the generated
> > > interface identifier would be wrong. Anyway, since gre over IPv4
> > > devices pass their local tunnel address to __ipv6_isatap_ifid(), that
> > > commit broke their IPv6 link-local address generation when the local
> > > address was unspecified.
> > >
> > > Then commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT
> > > interfaces when computing v6LL address") tried to fix that case by
> > > defining add_v4_addrs() and calling it to generate the IPv6 link-local
> > > address instead of using addrconf_addr_gen() (apart for gretap and
> > > ip6gretap devices, which would still use the regular
> > > addrconf_addr_gen(), since they have a MAC address).
> > >
> > > That broke several use cases because add_v4_addrs() isn't properly
> > > integrated into the rest of IPv6 Neighbor Discovery code. Several of
> > > these shortcomings have been fixed over time, but add_v4_addrs()
> > > remains broken on several aspects. In particular, it doesn't send any
> > > Router Sollicitations, so the SLAAC process doesn't start until the
> > > interface receives a Router Advertisement. Also, add_v4_addrs() mostly
> > > ignores the address generation mode of the interface
> > > (/proc/sys/net/ipv6/conf/*/addr_gen_mode), thus breaking the
> > > IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases.
> > >
> > > Fix the situation by using add_v4_addrs() only in the specific scenario
> > > where the normal method would fail. That is, for interfaces that have
> > > all of the following characteristics:
> > >
> > > * run over IPv4,
> > > * transport IP packets directly, not Ethernet (that is, not gretap
> > > interfaces),
> > > * tunnel endpoint is INADDR_ANY (that is, 0),
> > > * device address generation mode is EUI64.
> >
> > Could you please double check net/forwarding/ip6gre_custom_multipath_hash.sh ?
> > It seems like it started falling after this series has been pulled:
> > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/31301/2-ip6gre-custom-multipath-hash-sh/stdout
>
> Hum, net/forwarding/ip6gre_custom_multipath_hash.sh works for me on the
> current net tree (I'm at commit 4003c9e78778). I have only one failure,
> but it already happened before 183185a18ff9 ("gre: Fix IPv6 link-local
> address generation.") was applied.
On my side I see the following (ignore ping6 FAILs):
bfc6c67ec2d6 - (net-next/main, net-next/HEAD) net/smc: use the correct ndev to find pnetid by pnetid table (7 hours ago) <Guangguan Wang>
TAP version 13
1..1
# timeout set to 0
# selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
[ 9.275735][ T167] ip (167) used greatest stack depth: 23536 bytes left
[ 13.769300][ T255] gre: GRE over IPv4 demultiplexor driver
[ 13.838185][ T255] ip6_gre: GRE over IPv6 tunneling driver
[ 13.951780][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
[ 14.038101][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
[ 15.148469][ T281] 8021q: 802.1Q VLAN Support v1.8
[ 17.559477][ T321] GACT probability NOT on
[ 18.551876][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
[ 18.633656][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
# TEST: ping [ OK ]
# TEST: ping6 [FAIL]
# INFO: Running IPv4 overlay custom multipath hash tests
# TEST: Multipath hash field: Inner source IP (balanced) [FAIL]
# Expected traffic to be balanced, but it is not
# INFO: Packets sent on path1 / path2: 1 / 12602
# TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
# INFO: Packets sent on path1 / path2: 0 / 12601
# TEST: Multipath hash field: Inner destination IP (balanced) [FAIL]
# Expected traffic to be balanced, but it is not
# INFO: Packets sent on path1 / path2: 1 / 12600
# TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
# INFO: Packets sent on path1 / path2: 0 / 12600
...
8ecea691e844 - (HEAD -> upstream/net-next/main) Revert "gre: Fix IPv6 link-local address generation." (2 minutes ago) <Stanislav Fomichev>
TAP version 13
1..1
# timeout set to 0
# selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
[ 13.863060][ T252] gre: GRE over IPv4 demultiplexor driver
[ 13.911551][ T252] ip6_gre: GRE over IPv6 tunneling driver
[ 15.226124][ T277] 8021q: 802.1Q VLAN Support v1.8
[ 17.629460][ T317] GACT probability NOT on
[ 17.645781][ T315] tc (315) used greatest stack depth: 23040 bytes left
# TEST: ping [ OK ]
# TEST: ping6 [FAIL]
# INFO: Running IPv4 overlay custom multipath hash tests
# TEST: Multipath hash field: Inner source IP (balanced) [ OK ]
# INFO: Packets sent on path1 / path2: 5552 / 7052
# TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
# INFO: Packets sent on path1 / path2: 12600 / 2
[ 36.278056][ C2] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1078005296 wd_nsec: 1078004682
# TEST: Multipath hash field: Inner destination IP (balanced) [ OK ]
# INFO: Packets sent on path1 / path2: 6650 / 5950
# TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
# INFO: Packets sent on path1 / path2: 0 / 12600
...
And I also see the failures on 4003c9e78778. Not sure why we see
different results. And the NIPAs fails as well:
https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/32922/1-ip6gre-custom-multipath-hash-sh/stdout
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-14 20:18 ` Stanislav Fomichev
@ 2025-03-16 13:08 ` Ido Schimmel
2025-03-17 21:10 ` Guillaume Nault
0 siblings, 1 reply; 15+ messages in thread
From: Ido Schimmel @ 2025-03-16 13:08 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: Guillaume Nault, David Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet, netdev, Simon Horman, David Ahern,
Antonio Quartulli, Petr Machata
On Fri, Mar 14, 2025 at 01:18:21PM -0700, Stanislav Fomichev wrote:
> On 03/14, Guillaume Nault wrote:
> > On Fri, Mar 14, 2025 at 08:18:32AM -0700, Stanislav Fomichev wrote:
> > > On 03/07, Guillaume Nault wrote:
> > > > Use addrconf_addr_gen() to generate IPv6 link-local addresses on GRE
> > > > devices in most cases and fall back to using add_v4_addrs() only in
> > > > case the GRE configuration is incompatible with addrconf_addr_gen().
> > > >
> > > > GRE used to use addrconf_addr_gen() until commit e5dd729460ca
> > > > ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL
> > > > address") restricted this use to gretap and ip6gretap devices, and
> > > > created add_v4_addrs() (borrowed from SIT) for non-Ethernet GRE ones.
> > > >
> > > > The original problem came when commit 9af28511be10 ("addrconf: refuse
> > > > isatap eui64 for INADDR_ANY") made __ipv6_isatap_ifid() fail when its
> > > > addr parameter was 0. The commit says that this would create an invalid
> > > > address, however, I couldn't find any RFC saying that the generated
> > > > interface identifier would be wrong. Anyway, since gre over IPv4
> > > > devices pass their local tunnel address to __ipv6_isatap_ifid(), that
> > > > commit broke their IPv6 link-local address generation when the local
> > > > address was unspecified.
> > > >
> > > > Then commit e5dd729460ca ("ip/ip6_gre: use the same logic as SIT
> > > > interfaces when computing v6LL address") tried to fix that case by
> > > > defining add_v4_addrs() and calling it to generate the IPv6 link-local
> > > > address instead of using addrconf_addr_gen() (apart for gretap and
> > > > ip6gretap devices, which would still use the regular
> > > > addrconf_addr_gen(), since they have a MAC address).
> > > >
> > > > That broke several use cases because add_v4_addrs() isn't properly
> > > > integrated into the rest of IPv6 Neighbor Discovery code. Several of
> > > > these shortcomings have been fixed over time, but add_v4_addrs()
> > > > remains broken on several aspects. In particular, it doesn't send any
> > > > Router Sollicitations, so the SLAAC process doesn't start until the
> > > > interface receives a Router Advertisement. Also, add_v4_addrs() mostly
> > > > ignores the address generation mode of the interface
> > > > (/proc/sys/net/ipv6/conf/*/addr_gen_mode), thus breaking the
> > > > IN6_ADDR_GEN_MODE_RANDOM and IN6_ADDR_GEN_MODE_STABLE_PRIVACY cases.
> > > >
> > > > Fix the situation by using add_v4_addrs() only in the specific scenario
> > > > where the normal method would fail. That is, for interfaces that have
> > > > all of the following characteristics:
> > > >
> > > > * run over IPv4,
> > > > * transport IP packets directly, not Ethernet (that is, not gretap
> > > > interfaces),
> > > > * tunnel endpoint is INADDR_ANY (that is, 0),
> > > > * device address generation mode is EUI64.
> > >
> > > Could you please double check net/forwarding/ip6gre_custom_multipath_hash.sh ?
> > > It seems like it started falling after this series has been pulled:
> > > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/31301/2-ip6gre-custom-multipath-hash-sh/stdout
> >
> > Hum, net/forwarding/ip6gre_custom_multipath_hash.sh works for me on the
> > current net tree (I'm at commit 4003c9e78778). I have only one failure,
> > but it already happened before 183185a18ff9 ("gre: Fix IPv6 link-local
> > address generation.") was applied.
>
> On my side I see the following (ignore ping6 FAILs):
>
> bfc6c67ec2d6 - (net-next/main, net-next/HEAD) net/smc: use the correct ndev to find pnetid by pnetid table (7 hours ago) <Guangguan Wang>
>
> TAP version 13
> 1..1
> # timeout set to 0
> # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> [ 9.275735][ T167] ip (167) used greatest stack depth: 23536 bytes left
> [ 13.769300][ T255] gre: GRE over IPv4 demultiplexor driver
> [ 13.838185][ T255] ip6_gre: GRE over IPv6 tunneling driver
> [ 13.951780][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> [ 14.038101][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> [ 15.148469][ T281] 8021q: 802.1Q VLAN Support v1.8
> [ 17.559477][ T321] GACT probability NOT on
> [ 18.551876][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> [ 18.633656][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> # TEST: ping [ OK ]
> # TEST: ping6 [FAIL]
> # INFO: Running IPv4 overlay custom multipath hash tests
> # TEST: Multipath hash field: Inner source IP (balanced) [FAIL]
> # Expected traffic to be balanced, but it is not
> # INFO: Packets sent on path1 / path2: 1 / 12602
> # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> # INFO: Packets sent on path1 / path2: 0 / 12601
> # TEST: Multipath hash field: Inner destination IP (balanced) [FAIL]
> # Expected traffic to be balanced, but it is not
> # INFO: Packets sent on path1 / path2: 1 / 12600
> # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> # INFO: Packets sent on path1 / path2: 0 / 12600
> ...
>
> 8ecea691e844 - (HEAD -> upstream/net-next/main) Revert "gre: Fix IPv6 link-local address generation." (2 minutes ago) <Stanislav Fomichev>
>
> TAP version 13
> 1..1
> # timeout set to 0
> # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> [ 13.863060][ T252] gre: GRE over IPv4 demultiplexor driver
> [ 13.911551][ T252] ip6_gre: GRE over IPv6 tunneling driver
> [ 15.226124][ T277] 8021q: 802.1Q VLAN Support v1.8
> [ 17.629460][ T317] GACT probability NOT on
> [ 17.645781][ T315] tc (315) used greatest stack depth: 23040 bytes left
> # TEST: ping [ OK ]
> # TEST: ping6 [FAIL]
> # INFO: Running IPv4 overlay custom multipath hash tests
> # TEST: Multipath hash field: Inner source IP (balanced) [ OK ]
> # INFO: Packets sent on path1 / path2: 5552 / 7052
> # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> # INFO: Packets sent on path1 / path2: 12600 / 2
> [ 36.278056][ C2] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1078005296 wd_nsec: 1078004682
> # TEST: Multipath hash field: Inner destination IP (balanced) [ OK ]
> # INFO: Packets sent on path1 / path2: 6650 / 5950
> # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> # INFO: Packets sent on path1 / path2: 0 / 12600
> ...
>
> And I also see the failures on 4003c9e78778. Not sure why we see
> different results. And the NIPAs fails as well:
>
> https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/32922/1-ip6gre-custom-multipath-hash-sh/stdout
I can reproduce this locally and I'm getting the exact same result as
the CI. All the balanced tests fail because the traffic is forwarded via
a single nexthop. No failures after reverting 183185a18ff9.
I'm still not sure what happens, but for some reason a neighbour is not
created on one of the nexthop devices which causes rt6_check_neigh() to
skip over this path (returning RT6_NUD_FAIL_DO_RR). Enabling
CONFIG_IPV6_ROUTER_PREF fixes the issue because then RT6_NUD_SUCCEED is
returned.
I can continue looking into this on Tuesday (mostly AFK tomorrow).
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-16 13:08 ` Ido Schimmel
@ 2025-03-17 21:10 ` Guillaume Nault
2025-03-20 16:26 ` Simon Horman
0 siblings, 1 reply; 15+ messages in thread
From: Guillaume Nault @ 2025-03-17 21:10 UTC (permalink / raw)
To: Ido Schimmel
Cc: Stanislav Fomichev, David Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet, netdev, Simon Horman, David Ahern,
Antonio Quartulli, Petr Machata
On Sun, Mar 16, 2025 at 03:08:48PM +0200, Ido Schimmel wrote:
> On Fri, Mar 14, 2025 at 01:18:21PM -0700, Stanislav Fomichev wrote:
> > On 03/14, Guillaume Nault wrote:
> > > On Fri, Mar 14, 2025 at 08:18:32AM -0700, Stanislav Fomichev wrote:
> > > >
> > > > Could you please double check net/forwarding/ip6gre_custom_multipath_hash.sh ?
> > > > It seems like it started falling after this series has been pulled:
> > > > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/31301/2-ip6gre-custom-multipath-hash-sh/stdout
> > >
> > > Hum, net/forwarding/ip6gre_custom_multipath_hash.sh works for me on the
> > > current net tree (I'm at commit 4003c9e78778). I have only one failure,
> > > but it already happened before 183185a18ff9 ("gre: Fix IPv6 link-local
> > > address generation.") was applied.
> >
> > On my side I see the following (ignore ping6 FAILs):
> >
> > bfc6c67ec2d6 - (net-next/main, net-next/HEAD) net/smc: use the correct ndev to find pnetid by pnetid table (7 hours ago) <Guangguan Wang>
> >
> > TAP version 13
> > 1..1
> > # timeout set to 0
> > # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> > [ 9.275735][ T167] ip (167) used greatest stack depth: 23536 bytes left
> > [ 13.769300][ T255] gre: GRE over IPv4 demultiplexor driver
> > [ 13.838185][ T255] ip6_gre: GRE over IPv6 tunneling driver
> > [ 13.951780][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> > [ 14.038101][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> > [ 15.148469][ T281] 8021q: 802.1Q VLAN Support v1.8
> > [ 17.559477][ T321] GACT probability NOT on
> > [ 18.551876][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> > [ 18.633656][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> > # TEST: ping [ OK ]
> > # TEST: ping6 [FAIL]
> > # INFO: Running IPv4 overlay custom multipath hash tests
> > # TEST: Multipath hash field: Inner source IP (balanced) [FAIL]
> > # Expected traffic to be balanced, but it is not
> > # INFO: Packets sent on path1 / path2: 1 / 12602
> > # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> > # INFO: Packets sent on path1 / path2: 0 / 12601
> > # TEST: Multipath hash field: Inner destination IP (balanced) [FAIL]
> > # Expected traffic to be balanced, but it is not
> > # INFO: Packets sent on path1 / path2: 1 / 12600
> > # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> > # INFO: Packets sent on path1 / path2: 0 / 12600
> > ...
> >
> > 8ecea691e844 - (HEAD -> upstream/net-next/main) Revert "gre: Fix IPv6 link-local address generation." (2 minutes ago) <Stanislav Fomichev>
> >
> > TAP version 13
> > 1..1
> > # timeout set to 0
> > # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> > [ 13.863060][ T252] gre: GRE over IPv4 demultiplexor driver
> > [ 13.911551][ T252] ip6_gre: GRE over IPv6 tunneling driver
> > [ 15.226124][ T277] 8021q: 802.1Q VLAN Support v1.8
> > [ 17.629460][ T317] GACT probability NOT on
> > [ 17.645781][ T315] tc (315) used greatest stack depth: 23040 bytes left
> > # TEST: ping [ OK ]
> > # TEST: ping6 [FAIL]
> > # INFO: Running IPv4 overlay custom multipath hash tests
> > # TEST: Multipath hash field: Inner source IP (balanced) [ OK ]
> > # INFO: Packets sent on path1 / path2: 5552 / 7052
> > # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> > # INFO: Packets sent on path1 / path2: 12600 / 2
> > [ 36.278056][ C2] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1078005296 wd_nsec: 1078004682
> > # TEST: Multipath hash field: Inner destination IP (balanced) [ OK ]
> > # INFO: Packets sent on path1 / path2: 6650 / 5950
> > # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> > # INFO: Packets sent on path1 / path2: 0 / 12600
> > ...
> >
> > And I also see the failures on 4003c9e78778. Not sure why we see
> > different results. And the NIPAs fails as well:
> >
> > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/32922/1-ip6gre-custom-multipath-hash-sh/stdout
>
> I can reproduce this locally and I'm getting the exact same result as
> the CI. All the balanced tests fail because the traffic is forwarded via
> a single nexthop. No failures after reverting 183185a18ff9.
>
> I'm still not sure what happens, but for some reason a neighbour is not
> created on one of the nexthop devices which causes rt6_check_neigh() to
> skip over this path (returning RT6_NUD_FAIL_DO_RR). Enabling
> CONFIG_IPV6_ROUTER_PREF fixes the issue because then RT6_NUD_SUCCEED is
> returned.
>
> I can continue looking into this on Tuesday (mostly AFK tomorrow).
I finally managed to reproduce the problem using vng. Still no problem
on my regular VM, no matter if I enable CONFIG_IPV6_ROUTER_PREF or not.
I'll continue investigating this problem...
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-17 21:10 ` Guillaume Nault
@ 2025-03-20 16:26 ` Simon Horman
2025-03-20 19:04 ` Ido Schimmel
0 siblings, 1 reply; 15+ messages in thread
From: Simon Horman @ 2025-03-20 16:26 UTC (permalink / raw)
To: Guillaume Nault
Cc: Ido Schimmel, Stanislav Fomichev, David Miller, Jakub Kicinski,
Paolo Abeni, Eric Dumazet, netdev, David Ahern, Antonio Quartulli,
Petr Machata
On Mon, Mar 17, 2025 at 10:10:45PM +0100, Guillaume Nault wrote:
> On Sun, Mar 16, 2025 at 03:08:48PM +0200, Ido Schimmel wrote:
> > On Fri, Mar 14, 2025 at 01:18:21PM -0700, Stanislav Fomichev wrote:
> > > On 03/14, Guillaume Nault wrote:
> > > > On Fri, Mar 14, 2025 at 08:18:32AM -0700, Stanislav Fomichev wrote:
> > > > >
> > > > > Could you please double check net/forwarding/ip6gre_custom_multipath_hash.sh ?
> > > > > It seems like it started falling after this series has been pulled:
> > > > > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/31301/2-ip6gre-custom-multipath-hash-sh/stdout
> > > >
> > > > Hum, net/forwarding/ip6gre_custom_multipath_hash.sh works for me on the
> > > > current net tree (I'm at commit 4003c9e78778). I have only one failure,
> > > > but it already happened before 183185a18ff9 ("gre: Fix IPv6 link-local
> > > > address generation.") was applied.
> > >
> > > On my side I see the following (ignore ping6 FAILs):
> > >
> > > bfc6c67ec2d6 - (net-next/main, net-next/HEAD) net/smc: use the correct ndev to find pnetid by pnetid table (7 hours ago) <Guangguan Wang>
> > >
> > > TAP version 13
> > > 1..1
> > > # timeout set to 0
> > > # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> > > [ 9.275735][ T167] ip (167) used greatest stack depth: 23536 bytes left
> > > [ 13.769300][ T255] gre: GRE over IPv4 demultiplexor driver
> > > [ 13.838185][ T255] ip6_gre: GRE over IPv6 tunneling driver
> > > [ 13.951780][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> > > [ 14.038101][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> > > [ 15.148469][ T281] 8021q: 802.1Q VLAN Support v1.8
> > > [ 17.559477][ T321] GACT probability NOT on
> > > [ 18.551876][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> > > [ 18.633656][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> > > # TEST: ping [ OK ]
> > > # TEST: ping6 [FAIL]
> > > # INFO: Running IPv4 overlay custom multipath hash tests
> > > # TEST: Multipath hash field: Inner source IP (balanced) [FAIL]
> > > # Expected traffic to be balanced, but it is not
> > > # INFO: Packets sent on path1 / path2: 1 / 12602
> > > # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> > > # INFO: Packets sent on path1 / path2: 0 / 12601
> > > # TEST: Multipath hash field: Inner destination IP (balanced) [FAIL]
> > > # Expected traffic to be balanced, but it is not
> > > # INFO: Packets sent on path1 / path2: 1 / 12600
> > > # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> > > # INFO: Packets sent on path1 / path2: 0 / 12600
> > > ...
> > >
> > > 8ecea691e844 - (HEAD -> upstream/net-next/main) Revert "gre: Fix IPv6 link-local address generation." (2 minutes ago) <Stanislav Fomichev>
> > >
> > > TAP version 13
> > > 1..1
> > > # timeout set to 0
> > > # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> > > [ 13.863060][ T252] gre: GRE over IPv4 demultiplexor driver
> > > [ 13.911551][ T252] ip6_gre: GRE over IPv6 tunneling driver
> > > [ 15.226124][ T277] 8021q: 802.1Q VLAN Support v1.8
> > > [ 17.629460][ T317] GACT probability NOT on
> > > [ 17.645781][ T315] tc (315) used greatest stack depth: 23040 bytes left
> > > # TEST: ping [ OK ]
> > > # TEST: ping6 [FAIL]
> > > # INFO: Running IPv4 overlay custom multipath hash tests
> > > # TEST: Multipath hash field: Inner source IP (balanced) [ OK ]
> > > # INFO: Packets sent on path1 / path2: 5552 / 7052
> > > # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> > > # INFO: Packets sent on path1 / path2: 12600 / 2
> > > [ 36.278056][ C2] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1078005296 wd_nsec: 1078004682
> > > # TEST: Multipath hash field: Inner destination IP (balanced) [ OK ]
> > > # INFO: Packets sent on path1 / path2: 6650 / 5950
> > > # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> > > # INFO: Packets sent on path1 / path2: 0 / 12600
> > > ...
> > >
> > > And I also see the failures on 4003c9e78778. Not sure why we see
> > > different results. And the NIPAs fails as well:
> > >
> > > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/32922/1-ip6gre-custom-multipath-hash-sh/stdout
> >
> > I can reproduce this locally and I'm getting the exact same result as
> > the CI. All the balanced tests fail because the traffic is forwarded via
> > a single nexthop. No failures after reverting 183185a18ff9.
> >
> > I'm still not sure what happens, but for some reason a neighbour is not
> > created on one of the nexthop devices which causes rt6_check_neigh() to
> > skip over this path (returning RT6_NUD_FAIL_DO_RR). Enabling
> > CONFIG_IPV6_ROUTER_PREF fixes the issue because then RT6_NUD_SUCCEED is
> > returned.
> >
> > I can continue looking into this on Tuesday (mostly AFK tomorrow).
>
> I finally managed to reproduce the problem using vng. Still no problem
> on my regular VM, no matter if I enable CONFIG_IPV6_ROUTER_PREF or not.
> I'll continue investigating this problem...
FWIIW, I have tried much, but am unable to _reliably_ reproduce this problem.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-20 16:26 ` Simon Horman
@ 2025-03-20 19:04 ` Ido Schimmel
2025-03-24 14:36 ` Guillaume Nault
0 siblings, 1 reply; 15+ messages in thread
From: Ido Schimmel @ 2025-03-20 19:04 UTC (permalink / raw)
To: Simon Horman, gnault
Cc: Stanislav Fomichev, David Miller, Jakub Kicinski, Paolo Abeni,
Eric Dumazet, netdev, David Ahern, Antonio Quartulli,
Petr Machata
On Thu, Mar 20, 2025 at 04:26:46PM +0000, Simon Horman wrote:
> On Mon, Mar 17, 2025 at 10:10:45PM +0100, Guillaume Nault wrote:
> > On Sun, Mar 16, 2025 at 03:08:48PM +0200, Ido Schimmel wrote:
> > > On Fri, Mar 14, 2025 at 01:18:21PM -0700, Stanislav Fomichev wrote:
> > > > On 03/14, Guillaume Nault wrote:
> > > > > On Fri, Mar 14, 2025 at 08:18:32AM -0700, Stanislav Fomichev wrote:
> > > > > >
> > > > > > Could you please double check net/forwarding/ip6gre_custom_multipath_hash.sh ?
> > > > > > It seems like it started falling after this series has been pulled:
> > > > > > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/31301/2-ip6gre-custom-multipath-hash-sh/stdout
> > > > >
> > > > > Hum, net/forwarding/ip6gre_custom_multipath_hash.sh works for me on the
> > > > > current net tree (I'm at commit 4003c9e78778). I have only one failure,
> > > > > but it already happened before 183185a18ff9 ("gre: Fix IPv6 link-local
> > > > > address generation.") was applied.
> > > >
> > > > On my side I see the following (ignore ping6 FAILs):
> > > >
> > > > bfc6c67ec2d6 - (net-next/main, net-next/HEAD) net/smc: use the correct ndev to find pnetid by pnetid table (7 hours ago) <Guangguan Wang>
> > > >
> > > > TAP version 13
> > > > 1..1
> > > > # timeout set to 0
> > > > # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> > > > [ 9.275735][ T167] ip (167) used greatest stack depth: 23536 bytes left
> > > > [ 13.769300][ T255] gre: GRE over IPv4 demultiplexor driver
> > > > [ 13.838185][ T255] ip6_gre: GRE over IPv6 tunneling driver
> > > > [ 13.951780][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> > > > [ 14.038101][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> > > > [ 15.148469][ T281] 8021q: 802.1Q VLAN Support v1.8
> > > > [ 17.559477][ T321] GACT probability NOT on
> > > > [ 18.551876][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> > > > [ 18.633656][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> > > > # TEST: ping [ OK ]
> > > > # TEST: ping6 [FAIL]
> > > > # INFO: Running IPv4 overlay custom multipath hash tests
> > > > # TEST: Multipath hash field: Inner source IP (balanced) [FAIL]
> > > > # Expected traffic to be balanced, but it is not
> > > > # INFO: Packets sent on path1 / path2: 1 / 12602
> > > > # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> > > > # INFO: Packets sent on path1 / path2: 0 / 12601
> > > > # TEST: Multipath hash field: Inner destination IP (balanced) [FAIL]
> > > > # Expected traffic to be balanced, but it is not
> > > > # INFO: Packets sent on path1 / path2: 1 / 12600
> > > > # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> > > > # INFO: Packets sent on path1 / path2: 0 / 12600
> > > > ...
> > > >
> > > > 8ecea691e844 - (HEAD -> upstream/net-next/main) Revert "gre: Fix IPv6 link-local address generation." (2 minutes ago) <Stanislav Fomichev>
> > > >
> > > > TAP version 13
> > > > 1..1
> > > > # timeout set to 0
> > > > # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> > > > [ 13.863060][ T252] gre: GRE over IPv4 demultiplexor driver
> > > > [ 13.911551][ T252] ip6_gre: GRE over IPv6 tunneling driver
> > > > [ 15.226124][ T277] 8021q: 802.1Q VLAN Support v1.8
> > > > [ 17.629460][ T317] GACT probability NOT on
> > > > [ 17.645781][ T315] tc (315) used greatest stack depth: 23040 bytes left
> > > > # TEST: ping [ OK ]
> > > > # TEST: ping6 [FAIL]
> > > > # INFO: Running IPv4 overlay custom multipath hash tests
> > > > # TEST: Multipath hash field: Inner source IP (balanced) [ OK ]
> > > > # INFO: Packets sent on path1 / path2: 5552 / 7052
> > > > # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> > > > # INFO: Packets sent on path1 / path2: 12600 / 2
> > > > [ 36.278056][ C2] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1078005296 wd_nsec: 1078004682
> > > > # TEST: Multipath hash field: Inner destination IP (balanced) [ OK ]
> > > > # INFO: Packets sent on path1 / path2: 6650 / 5950
> > > > # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> > > > # INFO: Packets sent on path1 / path2: 0 / 12600
> > > > ...
> > > >
> > > > And I also see the failures on 4003c9e78778. Not sure why we see
> > > > different results. And the NIPAs fails as well:
> > > >
> > > > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/32922/1-ip6gre-custom-multipath-hash-sh/stdout
> > >
> > > I can reproduce this locally and I'm getting the exact same result as
> > > the CI. All the balanced tests fail because the traffic is forwarded via
> > > a single nexthop. No failures after reverting 183185a18ff9.
> > >
> > > I'm still not sure what happens, but for some reason a neighbour is not
> > > created on one of the nexthop devices which causes rt6_check_neigh() to
> > > skip over this path (returning RT6_NUD_FAIL_DO_RR). Enabling
> > > CONFIG_IPV6_ROUTER_PREF fixes the issue because then RT6_NUD_SUCCEED is
> > > returned.
> > >
> > > I can continue looking into this on Tuesday (mostly AFK tomorrow).
> >
> > I finally managed to reproduce the problem using vng. Still no problem
> > on my regular VM, no matter if I enable CONFIG_IPV6_ROUTER_PREF or not.
> > I'll continue investigating this problem...
>
> FWIIW, I have tried much, but am unable to _reliably_ reproduce this problem.
Sorry for the delay. Busy with other tasks at the moment, but I found
some time to look into this. I believe I understand the issue and have a
fix. Guillaume's patch is fine. It simply exposed a bug elsewhere.
The test is failing because all the packets are forwarded via a single
path instead of being load balanced between both paths.
fib6_select_path() chooses the path according to the hash-threshold
algorithm. If the function is called with the last nexthop in a
multipath route, it will always choose this nexthop because the
calculated hash will always be smaller than the upper bound of this
nexthop.
Fix is to find the first nexthop (sibling route) and choose the first
matching nexthop according to hash-threshold. Given Guillaume and you
can reproduce the issue, can you please test the fix [1]?
I think Guillaume's patch exposed the issue because it caused the ip6gre
device to transmit a packet (Router Solicitation as part of the DAD
process for the IPv6 link-local address) as soon as the device is
brought up. With debug kernels this might happen while forwarding is
still disabled as the test enables forwarding at the end of the setup.
When forwarding is disabled the nexthop's neighbour state is taken into
account when choosing a route in rt6_select() and round-robin will be
performed between the two sibling routes. It is possible to end up in a
situation where rt6_select() always returns the second sibling route
which fib6_select_path() will then always select due to its upper bound.
[1]
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index fb2e99a56529..afcd66b73a92 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -412,11 +412,35 @@ static bool rt6_check_expired(const struct rt6_info *rt)
return false;
}
+static struct fib6_info *
+rt6_multipath_first_sibling_rcu(const struct fib6_info *rt)
+{
+ struct fib6_info *iter;
+ struct fib6_node *fn;
+
+ fn = rcu_dereference(rt->fib6_node);
+ if (!fn)
+ goto out;
+ iter = rcu_dereference(fn->leaf);
+ if (!iter)
+ goto out;
+
+ while (iter) {
+ if (iter->fib6_metric == rt->fib6_metric &&
+ rt6_qualify_for_ecmp(iter))
+ return iter;
+ iter = rcu_dereference(iter->fib6_next);
+ }
+
+out:
+ return NULL;
+}
+
void fib6_select_path(const struct net *net, struct fib6_result *res,
struct flowi6 *fl6, int oif, bool have_oif_match,
const struct sk_buff *skb, int strict)
{
- struct fib6_info *match = res->f6i;
+ struct fib6_info *first, *match = res->f6i;
struct fib6_info *sibling;
if (!match->nh && (!match->fib6_nsiblings || have_oif_match))
@@ -440,10 +464,18 @@ void fib6_select_path(const struct net *net, struct fib6_result *res,
return;
}
- if (fl6->mp_hash <= atomic_read(&match->fib6_nh->fib_nh_upper_bound))
+ first = rt6_multipath_first_sibling_rcu(match);
+ if (!first)
goto out;
- list_for_each_entry_rcu(sibling, &match->fib6_siblings,
+ if (fl6->mp_hash <= atomic_read(&first->fib6_nh->fib_nh_upper_bound) &&
+ rt6_score_route(first->fib6_nh, first->fib6_flags, oif,
+ strict) >= 0) {
+ match = first;
+ goto out;
+ }
+
+ list_for_each_entry_rcu(sibling, &first->fib6_siblings,
fib6_siblings) {
const struct fib6_nh *nh = sibling->fib6_nh;
int nh_upper_bound;
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net v4 1/2] gre: Fix IPv6 link-local address generation.
2025-03-20 19:04 ` Ido Schimmel
@ 2025-03-24 14:36 ` Guillaume Nault
0 siblings, 0 replies; 15+ messages in thread
From: Guillaume Nault @ 2025-03-24 14:36 UTC (permalink / raw)
To: Ido Schimmel
Cc: Simon Horman, Stanislav Fomichev, David Miller, Jakub Kicinski,
Paolo Abeni, Eric Dumazet, netdev, David Ahern, Antonio Quartulli,
Petr Machata
On Thu, Mar 20, 2025 at 09:04:29PM +0200, Ido Schimmel wrote:
> On Thu, Mar 20, 2025 at 04:26:46PM +0000, Simon Horman wrote:
> > On Mon, Mar 17, 2025 at 10:10:45PM +0100, Guillaume Nault wrote:
> > > On Sun, Mar 16, 2025 at 03:08:48PM +0200, Ido Schimmel wrote:
> > > > On Fri, Mar 14, 2025 at 01:18:21PM -0700, Stanislav Fomichev wrote:
> > > > > On 03/14, Guillaume Nault wrote:
> > > > > > On Fri, Mar 14, 2025 at 08:18:32AM -0700, Stanislav Fomichev wrote:
> > > > > > >
> > > > > > > Could you please double check net/forwarding/ip6gre_custom_multipath_hash.sh ?
> > > > > > > It seems like it started falling after this series has been pulled:
> > > > > > > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/31301/2-ip6gre-custom-multipath-hash-sh/stdout
> > > > > >
> > > > > > Hum, net/forwarding/ip6gre_custom_multipath_hash.sh works for me on the
> > > > > > current net tree (I'm at commit 4003c9e78778). I have only one failure,
> > > > > > but it already happened before 183185a18ff9 ("gre: Fix IPv6 link-local
> > > > > > address generation.") was applied.
> > > > >
> > > > > On my side I see the following (ignore ping6 FAILs):
> > > > >
> > > > > bfc6c67ec2d6 - (net-next/main, net-next/HEAD) net/smc: use the correct ndev to find pnetid by pnetid table (7 hours ago) <Guangguan Wang>
> > > > >
> > > > > TAP version 13
> > > > > 1..1
> > > > > # timeout set to 0
> > > > > # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> > > > > [ 9.275735][ T167] ip (167) used greatest stack depth: 23536 bytes left
> > > > > [ 13.769300][ T255] gre: GRE over IPv4 demultiplexor driver
> > > > > [ 13.838185][ T255] ip6_gre: GRE over IPv6 tunneling driver
> > > > > [ 13.951780][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> > > > > [ 14.038101][ T12] ip6_tunnel: g1 xmit: Local address not yet configured!
> > > > > [ 15.148469][ T281] 8021q: 802.1Q VLAN Support v1.8
> > > > > [ 17.559477][ T321] GACT probability NOT on
> > > > > [ 18.551876][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> > > > > [ 18.633656][ T12] ip6_tunnel: g2 xmit: Local address not yet configured!
> > > > > # TEST: ping [ OK ]
> > > > > # TEST: ping6 [FAIL]
> > > > > # INFO: Running IPv4 overlay custom multipath hash tests
> > > > > # TEST: Multipath hash field: Inner source IP (balanced) [FAIL]
> > > > > # Expected traffic to be balanced, but it is not
> > > > > # INFO: Packets sent on path1 / path2: 1 / 12602
> > > > > # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> > > > > # INFO: Packets sent on path1 / path2: 0 / 12601
> > > > > # TEST: Multipath hash field: Inner destination IP (balanced) [FAIL]
> > > > > # Expected traffic to be balanced, but it is not
> > > > > # INFO: Packets sent on path1 / path2: 1 / 12600
> > > > > # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> > > > > # INFO: Packets sent on path1 / path2: 0 / 12600
> > > > > ...
> > > > >
> > > > > 8ecea691e844 - (HEAD -> upstream/net-next/main) Revert "gre: Fix IPv6 link-local address generation." (2 minutes ago) <Stanislav Fomichev>
> > > > >
> > > > > TAP version 13
> > > > > 1..1
> > > > > # timeout set to 0
> > > > > # selftests: net/forwarding: ip6gre_custom_multipath_hash.sh
> > > > > [ 13.863060][ T252] gre: GRE over IPv4 demultiplexor driver
> > > > > [ 13.911551][ T252] ip6_gre: GRE over IPv6 tunneling driver
> > > > > [ 15.226124][ T277] 8021q: 802.1Q VLAN Support v1.8
> > > > > [ 17.629460][ T317] GACT probability NOT on
> > > > > [ 17.645781][ T315] tc (315) used greatest stack depth: 23040 bytes left
> > > > > # TEST: ping [ OK ]
> > > > > # TEST: ping6 [FAIL]
> > > > > # INFO: Running IPv4 overlay custom multipath hash tests
> > > > > # TEST: Multipath hash field: Inner source IP (balanced) [ OK ]
> > > > > # INFO: Packets sent on path1 / path2: 5552 / 7052
> > > > > # TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ]
> > > > > # INFO: Packets sent on path1 / path2: 12600 / 2
> > > > > [ 36.278056][ C2] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1078005296 wd_nsec: 1078004682
> > > > > # TEST: Multipath hash field: Inner destination IP (balanced) [ OK ]
> > > > > # INFO: Packets sent on path1 / path2: 6650 / 5950
> > > > > # TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ]
> > > > > # INFO: Packets sent on path1 / path2: 0 / 12600
> > > > > ...
> > > > >
> > > > > And I also see the failures on 4003c9e78778. Not sure why we see
> > > > > different results. And the NIPAs fails as well:
> > > > >
> > > > > https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/32922/1-ip6gre-custom-multipath-hash-sh/stdout
> > > >
> > > > I can reproduce this locally and I'm getting the exact same result as
> > > > the CI. All the balanced tests fail because the traffic is forwarded via
> > > > a single nexthop. No failures after reverting 183185a18ff9.
> > > >
> > > > I'm still not sure what happens, but for some reason a neighbour is not
> > > > created on one of the nexthop devices which causes rt6_check_neigh() to
> > > > skip over this path (returning RT6_NUD_FAIL_DO_RR). Enabling
> > > > CONFIG_IPV6_ROUTER_PREF fixes the issue because then RT6_NUD_SUCCEED is
> > > > returned.
> > > >
> > > > I can continue looking into this on Tuesday (mostly AFK tomorrow).
> > >
> > > I finally managed to reproduce the problem using vng. Still no problem
> > > on my regular VM, no matter if I enable CONFIG_IPV6_ROUTER_PREF or not.
> > > I'll continue investigating this problem...
> >
> > FWIIW, I have tried much, but am unable to _reliably_ reproduce this problem.
>
> Sorry for the delay. Busy with other tasks at the moment, but I found
> some time to look into this. I believe I understand the issue and have a
> fix. Guillaume's patch is fine. It simply exposed a bug elsewhere.
>
> The test is failing because all the packets are forwarded via a single
> path instead of being load balanced between both paths.
> fib6_select_path() chooses the path according to the hash-threshold
> algorithm. If the function is called with the last nexthop in a
> multipath route, it will always choose this nexthop because the
> calculated hash will always be smaller than the upper bound of this
> nexthop.
>
> Fix is to find the first nexthop (sibling route) and choose the first
> matching nexthop according to hash-threshold. Given Guillaume and you
> can reproduce the issue, can you please test the fix [1]?
Good catch!
I can confirm that your patch fixes the selftest for me.
> I think Guillaume's patch exposed the issue because it caused the ip6gre
> device to transmit a packet (Router Solicitation as part of the DAD
> process for the IPv6 link-local address) as soon as the device is
> brought up. With debug kernels this might happen while forwarding is
> still disabled as the test enables forwarding at the end of the setup.
>
> When forwarding is disabled the nexthop's neighbour state is taken into
> account when choosing a route in rt6_select() and round-robin will be
> performed between the two sibling routes. It is possible to end up in a
> situation where rt6_select() always returns the second sibling route
> which fib6_select_path() will then always select due to its upper bound.
Thanks a lot for your help Ido!
> [1]
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index fb2e99a56529..afcd66b73a92 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -412,11 +412,35 @@ static bool rt6_check_expired(const struct rt6_info *rt)
> return false;
> }
>
> +static struct fib6_info *
> +rt6_multipath_first_sibling_rcu(const struct fib6_info *rt)
> +{
> + struct fib6_info *iter;
> + struct fib6_node *fn;
> +
> + fn = rcu_dereference(rt->fib6_node);
> + if (!fn)
> + goto out;
> + iter = rcu_dereference(fn->leaf);
> + if (!iter)
> + goto out;
> +
> + while (iter) {
> + if (iter->fib6_metric == rt->fib6_metric &&
> + rt6_qualify_for_ecmp(iter))
> + return iter;
> + iter = rcu_dereference(iter->fib6_next);
> + }
> +
> +out:
> + return NULL;
> +}
> +
> void fib6_select_path(const struct net *net, struct fib6_result *res,
> struct flowi6 *fl6, int oif, bool have_oif_match,
> const struct sk_buff *skb, int strict)
> {
> - struct fib6_info *match = res->f6i;
> + struct fib6_info *first, *match = res->f6i;
> struct fib6_info *sibling;
>
> if (!match->nh && (!match->fib6_nsiblings || have_oif_match))
> @@ -440,10 +464,18 @@ void fib6_select_path(const struct net *net, struct fib6_result *res,
> return;
> }
>
> - if (fl6->mp_hash <= atomic_read(&match->fib6_nh->fib_nh_upper_bound))
> + first = rt6_multipath_first_sibling_rcu(match);
> + if (!first)
> goto out;
>
> - list_for_each_entry_rcu(sibling, &match->fib6_siblings,
> + if (fl6->mp_hash <= atomic_read(&first->fib6_nh->fib_nh_upper_bound) &&
> + rt6_score_route(first->fib6_nh, first->fib6_flags, oif,
> + strict) >= 0) {
> + match = first;
> + goto out;
> + }
> +
> + list_for_each_entry_rcu(sibling, &first->fib6_siblings,
> fib6_siblings) {
> const struct fib6_nh *nh = sibling->fib6_nh;
> int nh_upper_bound;
>
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-03-24 14:36 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-07 19:28 [PATCH net v4 0/2] gre: Fix regressions in IPv6 link-local address generation Guillaume Nault
2025-03-07 19:28 ` [PATCH net v4 1/2] gre: Fix " Guillaume Nault
2025-03-09 18:03 ` Ido Schimmel
2025-03-14 15:18 ` Stanislav Fomichev
2025-03-14 19:22 ` Guillaume Nault
2025-03-14 20:18 ` Stanislav Fomichev
2025-03-16 13:08 ` Ido Schimmel
2025-03-17 21:10 ` Guillaume Nault
2025-03-20 16:26 ` Simon Horman
2025-03-20 19:04 ` Ido Schimmel
2025-03-24 14:36 ` Guillaume Nault
2025-03-07 19:28 ` [PATCH net v4 2/2] selftests: Add IPv6 link-local address generation tests for GRE devices Guillaume Nault
2025-03-09 18:04 ` Ido Schimmel
2025-03-10 9:44 ` Petr Machata
2025-03-13 9:30 ` [PATCH net v4 0/2] gre: Fix regressions in IPv6 link-local address generation patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).