* [RFC PATCH net-next 1/2] ipv6: Honor oif when choosing nexthop for locally generated traffic
@ 2025-12-24 16:18 Ido Schimmel
2025-12-24 16:18 ` [RFC PATCH net-next 2/2] selftests: fib_tests: Add test cases for route lookup with oif Ido Schimmel
From: Ido Schimmel @ 2025-12-24 16:18 UTC
To: netdev; +Cc: davem, kuba, pabeni, edumazet, dsahern, horms, Ido Schimmel
Commit 741a11d9e410 ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is
set") made the kernel honor the oif parameter when specified as part of
output route lookup:
# ip route add 2001:db8:1::/64 dev dummy1
# ip route add ::/0 dev dummy2
# ip route get 2001:db8:1::1 oif dummy2 fibmatch
default dev dummy2 metric 1024 pref medium
Due to regression reports, the behavior was partially reverted in commit
d46a9d678e4c ("net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr
set") to only honor the oif if source address is not specified:
# ip route get 2001:db8:1::1 from 2001:db8:2::1 oif dummy2 fibmatch
2001:db8:1::/64 dev dummy1 metric 1024 pref medium
That is, when source address is specified, the kernel will choose the
most specific route even if its nexthop device does not match the
specified oif.
This creates a problem for multipath routes. After looking up a route,
when source address is not specified, the kernel will choose a nexthop
whose nexthop device matches the specified oif:
# sysctl -wq net.ipv6.conf.all.forwarding=1
# ip route add 2001:db8:10::/64 nexthop via fe80::1 dev dummy1 nexthop via fe80::2 dev dummy2
# for i in {1..100}; do ip route get 2001:db8:10::${i} oif dummy2; done | grep -o dummy[0-9] | sort | uniq -c
100 dummy2
But will disregard the oif when source address is specified despite the
fact that a matching nexthop exists:
# for i in {1..100}; do ip route get 2001:db8:10::${i} from 2001:db8:2::1 oif dummy2; done | grep -o dummy[0-9] | sort | uniq -c
53 dummy1
47 dummy2
This behavior differs from IPv4:
# ip address add 192.0.2.1/32 dev lo
# ip route add 198.51.100.0/24 nexthop via inet6 fe80::1 dev dummy1 nexthop via inet6 fe80::2 dev dummy2
# for i in {1..100}; do ip route get 198.51.100.${i} from 192.0.2.1 oif dummy2; done | grep -o dummy[0-9] | sort | uniq -c
100 dummy2
What happens is that fib6_table_lookup() returns a route with a matching
nexthop device (assuming it exists):
# perf record -e fib6:fib6_table_lookup -- bash -c "for i in {1..100}; do ip route get 2001:db8:10::${i} from 2001:db8:2::1 oif dummy2; done > /dev/null"
# perf script | grep -o dummy[0-9] | sort | uniq -c
100 dummy2
But it is later overwritten during path selection in fib6_select_path()
which instead chooses a nexthop according to the calculated hash.
Solve this by telling fib6_select_path() to skip path selection if we
have an oif match during output route lookup (iif being
LOOPBACK_IFINDEX).
Behavior after the change:
# sysctl -wq net.ipv6.conf.all.forwarding=1
# ip route add 2001:db8:10::/64 nexthop via fe80::1 dev dummy1 nexthop via fe80::2 dev dummy2
# for i in {1..100}; do ip route get 2001:db8:10::${i} from 2001:db8:2::1 oif dummy2; done | grep -o dummy[0-9] | sort | uniq -c
100 dummy2
Note that enabling forwarding is only needed because we did not add
neighbor entries for the gateway addresses. When forwarding is disabled
and CONFIG_IPV6_ROUTER_PREF is not enabled in kernel config, the kernel
will treat non-existing neighbor entries as errors and perform
round-robin between the nexthops:
# sysctl -wq net.ipv6.conf.all.forwarding=0
# for i in {1..100}; do ip route get 2001:db8:10::${i} from 2001:db8:2::1 oif dummy2; done | grep -o dummy[0-9] | sort | uniq -c
50 dummy1
50 dummy2
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
net/ipv6/route.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index aee6a10b112a..0795473ecd9b 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2254,6 +2254,7 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 {
 	struct fib6_result res = {};
 	struct rt6_info *rt = NULL;
+	bool have_oif_match;
 	int strict = 0;
 
 	WARN_ON_ONCE((flags & RT6_LOOKUP_F_DST_NOREF) &&
@@ -2270,7 +2271,9 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 	if (res.f6i == net->ipv6.fib6_null_entry)
 		goto out;
 
-	fib6_select_path(net, &res, fl6, oif, false, skb, strict);
+	have_oif_match = fl6->flowi6_iif == LOOPBACK_IFINDEX &&
+			 oif == res.nh->fib_nh_dev->ifindex;
+	fib6_select_path(net, &res, fl6, oif, have_oif_match, skb, strict);
 
 	/*Search through exception table */
 	rt = rt6_find_cached_rt(&res, &fl6->daddr, &fl6->saddr);
--
2.52.0
* [RFC PATCH net-next 2/2] selftests: fib_tests: Add test cases for route lookup with oif
2025-12-24 16:18 [RFC PATCH net-next 1/2] ipv6: Honor oif when choosing nexthop for locally generated traffic Ido Schimmel
@ 2025-12-24 16:18 ` Ido Schimmel
2026-01-06 17:59 ` David Ahern
From: Ido Schimmel @ 2025-12-24 16:18 UTC
To: netdev; +Cc: davem, kuba, pabeni, edumazet, dsahern, horms, Ido Schimmel
Test that both address families respect the oif parameter when a
matching multipath route is found, regardless of the presence of a
source address.
Output without "ipv6: Honor oif when choosing nexthop for locally
generated traffic":
# ./fib_tests.sh -t "ipv4_mpath_oif ipv6_mpath_oif"
IPv4 multipath oif test
TEST: IPv4 multipath via first nexthop [ OK ]
TEST: IPv4 multipath via second nexthop [ OK ]
TEST: IPv4 multipath via first nexthop with source address [ OK ]
TEST: IPv4 multipath via second nexthop with source address [ OK ]
IPv6 multipath oif test
TEST: IPv6 multipath via first nexthop [ OK ]
TEST: IPv6 multipath via second nexthop [ OK ]
TEST: IPv6 multipath via first nexthop with source address [FAIL]
TEST: IPv6 multipath via second nexthop with source address [FAIL]
Tests passed: 6
Tests failed: 2
Output with "ipv6: Honor oif when choosing nexthop for locally generated
traffic":
# ./fib_tests.sh -t "ipv4_mpath_oif ipv6_mpath_oif"
IPv4 multipath oif test
TEST: IPv4 multipath via first nexthop [ OK ]
TEST: IPv4 multipath via second nexthop [ OK ]
TEST: IPv4 multipath via first nexthop with source address [ OK ]
TEST: IPv4 multipath via second nexthop with source address [ OK ]
IPv6 multipath oif test
TEST: IPv6 multipath via first nexthop [ OK ]
TEST: IPv6 multipath via second nexthop [ OK ]
TEST: IPv6 multipath via first nexthop with source address [ OK ]
TEST: IPv6 multipath via second nexthop with source address [ OK ]
Tests passed: 8
Tests failed: 0
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
tools/testing/selftests/net/fib_tests.sh | 108 ++++++++++++++++++++++-
1 file changed, 107 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index a88f797c549a..8ae0adbcafe9 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -12,7 +12,7 @@ TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify \
ipv4_route_metrics ipv4_route_v6_gw rp_filter ipv4_del_addr \
ipv6_del_addr ipv4_mangle ipv6_mangle ipv4_bcast_neigh fib6_gc_test \
ipv4_mpath_list ipv6_mpath_list ipv4_mpath_balance ipv6_mpath_balance \
- fib6_ra_to_static"
+ ipv4_mpath_oif ipv6_mpath_oif fib6_ra_to_static"
VERBOSE=0
PAUSE_ON_FAIL=no
@@ -2776,6 +2776,110 @@ ipv6_mpath_balance_test()
forwarding_cleanup
}
+ipv4_mpath_oif_test_common()
+{
+ local get_param=$1; shift
+ local expected_oif=$1; shift
+ local test_name=$1; shift
+ local tmp_file
+
+ tmp_file=$(mktemp)
+
+ for i in {1..100}; do
+ $IP route get 203.0.113.${i} $get_param >> "$tmp_file"
+ done
+
+ [[ $(grep "$expected_oif" "$tmp_file" | wc -l) -eq 100 ]]
+ log_test $? 0 "$test_name"
+
+ rm "$tmp_file"
+}
+
+ipv4_mpath_oif_test()
+{
+ echo
+ echo "IPv4 multipath oif test"
+
+ setup
+
+ set -e
+ $IP link add dummy1 type dummy
+ $IP link set dev dummy1 up
+ $IP address add 192.0.2.1/28 dev dummy1
+ $IP address add 192.0.2.17/32 dev lo
+
+ $IP route add 203.0.113.0/24 \
+ nexthop via 198.51.100.2 dev dummy0 \
+ nexthop via 192.0.2.2 dev dummy1
+ set +e
+
+ ipv4_mpath_oif_test_common "oif dummy0" "dummy0" \
+ "IPv4 multipath via first nexthop"
+
+ ipv4_mpath_oif_test_common "oif dummy1" "dummy1" \
+ "IPv4 multipath via second nexthop"
+
+ ipv4_mpath_oif_test_common "oif dummy0 from 192.0.2.17" "dummy0" \
+ "IPv4 multipath via first nexthop with source address"
+
+ ipv4_mpath_oif_test_common "oif dummy1 from 192.0.2.17" "dummy1" \
+ "IPv4 multipath via second nexthop with source address"
+
+ cleanup
+}
+
+ipv6_mpath_oif_test_common()
+{
+ local get_param=$1; shift
+ local expected_oif=$1; shift
+ local test_name=$1; shift
+ local tmp_file
+
+ tmp_file=$(mktemp)
+
+ for i in {1..100}; do
+ $IP route get 2001:db8:10::${i} $get_param >> "$tmp_file"
+ done
+
+ [[ $(grep "$expected_oif" "$tmp_file" | wc -l) -eq 100 ]]
+ log_test $? 0 "$test_name"
+
+ rm "$tmp_file"
+}
+
+ipv6_mpath_oif_test()
+{
+ echo
+ echo "IPv6 multipath oif test"
+
+ setup
+
+ set -e
+ $IP link add dummy1 type dummy
+ $IP link set dev dummy1 up
+ $IP address add 2001:db8:2::1/64 dev dummy1
+ $IP address add 2001:db8:100::1/128 dev lo
+
+ $IP route add 2001:db8:10::/64 \
+ nexthop via 2001:db8:1::2 dev dummy0 \
+ nexthop via 2001:db8:2::2 dev dummy1
+ set +e
+
+ ipv6_mpath_oif_test_common "oif dummy0" "dummy0" \
+ "IPv6 multipath via first nexthop"
+
+ ipv6_mpath_oif_test_common "oif dummy1" "dummy1" \
+ "IPv6 multipath via second nexthop"
+
+ ipv6_mpath_oif_test_common "oif dummy0 from 2001:db8:100::1" "dummy0" \
+ "IPv6 multipath via first nexthop with source address"
+
+ ipv6_mpath_oif_test_common "oif dummy1 from 2001:db8:100::1" "dummy1" \
+ "IPv6 multipath via second nexthop with source address"
+
+ cleanup
+}
+
################################################################################
# usage
@@ -2861,6 +2965,8 @@ do
ipv6_mpath_list) ipv6_mpath_list_test;;
ipv4_mpath_balance) ipv4_mpath_balance_test;;
ipv6_mpath_balance) ipv6_mpath_balance_test;;
+ ipv4_mpath_oif) ipv4_mpath_oif_test;;
+ ipv6_mpath_oif) ipv6_mpath_oif_test;;
fib6_ra_to_static) fib6_ra_to_static;;
help) echo "Test names: $TESTS"; exit 0;;
--
2.52.0
* Re: [RFC PATCH net-next 2/2] selftests: fib_tests: Add test cases for route lookup with oif
2025-12-24 16:18 ` [RFC PATCH net-next 2/2] selftests: fib_tests: Add test cases for route lookup with oif Ido Schimmel
@ 2026-01-06 17:59 ` David Ahern
2026-01-07 12:51 ` Ido Schimmel
From: David Ahern @ 2026-01-06 17:59 UTC
To: Ido Schimmel, netdev; +Cc: davem, kuba, pabeni, edumazet, horms
On 12/24/25 9:18 AM, Ido Schimmel wrote:
> Test that both address families respect the oif parameter when a
> matching multipath route is found, regardless of the presence of a
> source address.
>
> Output without "ipv6: Honor oif when choosing nexthop for locally
> generated traffic":
>
> # ./fib_tests.sh -t "ipv4_mpath_oif ipv6_mpath_oif"
>
> IPv4 multipath oif test
> TEST: IPv4 multipath via first nexthop [ OK ]
> TEST: IPv4 multipath via second nexthop [ OK ]
> TEST: IPv4 multipath via first nexthop with source address [ OK ]
> TEST: IPv4 multipath via second nexthop with source address [ OK ]
>
> IPv6 multipath oif test
> TEST: IPv6 multipath via first nexthop [ OK ]
> TEST: IPv6 multipath via second nexthop [ OK ]
> TEST: IPv6 multipath via first nexthop with source address [FAIL]
> TEST: IPv6 multipath via second nexthop with source address [FAIL]
>
> Tests passed: 6
> Tests failed: 2
>
> Output with "ipv6: Honor oif when choosing nexthop for locally generated
> traffic":
>
> # ./fib_tests.sh -t "ipv4_mpath_oif ipv6_mpath_oif"
>
> IPv4 multipath oif test
> TEST: IPv4 multipath via first nexthop [ OK ]
> TEST: IPv4 multipath via second nexthop [ OK ]
> TEST: IPv4 multipath via first nexthop with source address [ OK ]
> TEST: IPv4 multipath via second nexthop with source address [ OK ]
>
> IPv6 multipath oif test
> TEST: IPv6 multipath via first nexthop [ OK ]
> TEST: IPv6 multipath via second nexthop [ OK ]
> TEST: IPv6 multipath via first nexthop with source address [ OK ]
> TEST: IPv6 multipath via second nexthop with source address [ OK ]
>
> Tests passed: 8
> Tests failed: 0
>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
> tools/testing/selftests/net/fib_tests.sh | 108 ++++++++++++++++++++++-
> 1 file changed, 107 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
> index a88f797c549a..8ae0adbcafe9 100755
> --- a/tools/testing/selftests/net/fib_tests.sh
> +++ b/tools/testing/selftests/net/fib_tests.sh
> @@ -12,7 +12,7 @@ TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify \
> ipv4_route_metrics ipv4_route_v6_gw rp_filter ipv4_del_addr \
> ipv6_del_addr ipv4_mangle ipv6_mangle ipv4_bcast_neigh fib6_gc_test \
> ipv4_mpath_list ipv6_mpath_list ipv4_mpath_balance ipv6_mpath_balance \
> - fib6_ra_to_static"
> + ipv4_mpath_oif ipv6_mpath_oif fib6_ra_to_static"
>
> VERBOSE=0
> PAUSE_ON_FAIL=no
> @@ -2776,6 +2776,110 @@ ipv6_mpath_balance_test()
> forwarding_cleanup
> }
>
> +ipv4_mpath_oif_test_common()
> +{
> + local get_param=$1; shift
> + local expected_oif=$1; shift
> + local test_name=$1; shift
> + local tmp_file
> +
> + tmp_file=$(mktemp)
> +
> + for i in {1..100}; do
> + $IP route get 203.0.113.${i} $get_param >> "$tmp_file"
> + done
> +
> + [[ $(grep "$expected_oif" "$tmp_file" | wc -l) -eq 100 ]]
> + log_test $? 0 "$test_name"
> +
> + rm "$tmp_file"
> +}
> +
> +ipv4_mpath_oif_test()
> +{
> + echo
> + echo "IPv4 multipath oif test"
> +
> + setup
> +
> + set -e
> + $IP link add dummy1 type dummy
> + $IP link set dev dummy1 up
> + $IP address add 192.0.2.1/28 dev dummy1
> + $IP address add 192.0.2.17/32 dev lo
> +
> + $IP route add 203.0.113.0/24 \
> + nexthop via 198.51.100.2 dev dummy0 \
> + nexthop via 192.0.2.2 dev dummy1
> + set +e
> +
> + ipv4_mpath_oif_test_common "oif dummy0" "dummy0" \
> + "IPv4 multipath via first nexthop"
> +
> + ipv4_mpath_oif_test_common "oif dummy1" "dummy1" \
> + "IPv4 multipath via second nexthop"
> +
> + ipv4_mpath_oif_test_common "oif dummy0 from 192.0.2.17" "dummy0" \
> + "IPv4 multipath via first nexthop with source address"
> +
> + ipv4_mpath_oif_test_common "oif dummy1 from 192.0.2.17" "dummy1" \
> + "IPv4 multipath via second nexthop with source address"
> +
> + cleanup
> +}
> +
> +ipv6_mpath_oif_test_common()
> +{
> + local get_param=$1; shift
> + local expected_oif=$1; shift
> + local test_name=$1; shift
> + local tmp_file
> +
> + tmp_file=$(mktemp)
> +
> + for i in {1..100}; do
> + $IP route get 2001:db8:10::${i} $get_param >> "$tmp_file"
> + done
> +
> + [[ $(grep "$expected_oif" "$tmp_file" | wc -l) -eq 100 ]]
> + log_test $? 0 "$test_name"
> +
> + rm "$tmp_file"
> +}
> +
> +ipv6_mpath_oif_test()
> +{
> + echo
> + echo "IPv6 multipath oif test"
> +
> + setup
> +
> + set -e
> + $IP link add dummy1 type dummy
> + $IP link set dev dummy1 up
> + $IP address add 2001:db8:2::1/64 dev dummy1
> + $IP address add 2001:db8:100::1/128 dev lo
> +
> + $IP route add 2001:db8:10::/64 \
> + nexthop via 2001:db8:1::2 dev dummy0 \
> + nexthop via 2001:db8:2::2 dev dummy1
> + set +e
> +
> + ipv6_mpath_oif_test_common "oif dummy0" "dummy0" \
> + "IPv6 multipath via first nexthop"
> +
> + ipv6_mpath_oif_test_common "oif dummy1" "dummy1" \
> + "IPv6 multipath via second nexthop"
> +
> + ipv6_mpath_oif_test_common "oif dummy0 from 2001:db8:100::1" "dummy0" \
> + "IPv6 multipath via first nexthop with source address"
> +
> + ipv6_mpath_oif_test_common "oif dummy1 from 2001:db8:100::1" "dummy1" \
> + "IPv6 multipath via second nexthop with source address"
> +
> + cleanup
> +}
> +
> ################################################################################
> # usage
>
> @@ -2861,6 +2965,8 @@ do
> ipv6_mpath_list) ipv6_mpath_list_test;;
> ipv4_mpath_balance) ipv4_mpath_balance_test;;
> ipv6_mpath_balance) ipv6_mpath_balance_test;;
> + ipv4_mpath_oif) ipv4_mpath_oif_test;;
> + ipv6_mpath_oif) ipv6_mpath_oif_test;;
> fib6_ra_to_static) fib6_ra_to_static;;
>
> help) echo "Test names: $TESTS"; exit 0;;
if VRF versions of the test also pass, I am good with the proposed change.
* Re: [RFC PATCH net-next 2/2] selftests: fib_tests: Add test cases for route lookup with oif
2026-01-06 17:59 ` David Ahern
@ 2026-01-07 12:51 ` Ido Schimmel
From: Ido Schimmel @ 2026-01-07 12:51 UTC
To: David Ahern; +Cc: netdev, davem, kuba, pabeni, edumazet, horms
On Tue, Jan 06, 2026 at 10:59:53AM -0700, David Ahern wrote:
> if VRF versions of the test also pass, I am good with the proposed change.
OK, thanks. I added VRF tests and they also pass. Will include them in
v2.