From: Benjamin Poirier <bpoirier@nvidia.com>
To: David Ahern <dsahern@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Shuah Khan <shuah@kernel.org>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-kselftest@vger.kernel.org"
<linux-kselftest@vger.kernel.org>,
Ido Schimmel <idosch@nvidia.com>
Subject: [PATCH net-next v2 3/4] nexthop: Do not return invalid nexthop object during multipath selection
Date: Wed, 19 Jul 2023 13:57:22 +0000 [thread overview]
Message-ID: <20230719-nh_select-v2-3-04383e89f868@nvidia.com> (raw)
In-Reply-To: <20230719-nh_select-v2-0-04383e89f868@nvidia.com>
With legacy nexthops, when net.ipv4.fib_multipath_use_neigh is set,
fib_select_multipath() will never set res->nhc to a nexthop that is not
good (as per fib_good_nh()). OTOH, with nexthop objects,
nexthop_select_path_hthr() may return a nexthop that failed the
nexthop_is_good_nh() test even if there was one that passed. Refactor
nexthop_select_path_hthr() to follow a selection logic more similar to
fib_select_multipath().
The issue can be demonstrated with the following sequence of commands. The
first block shows that things work as expected with legacy nexthops. The
last sequence of `ip rou get` in the second block shows the problem case -
some routes still use the .2 nexthop.
sysctl net.ipv4.fib_multipath_use_neigh=1
ip link add dummy1 up type dummy
ip rou add 198.51.100.0/24 nexthop via 192.0.2.1 dev dummy1 onlink nexthop via 192.0.2.2 dev dummy1 onlink
for i in {10..19}; do ip -o rou get 198.51.100.$i; done
ip neigh add 192.0.2.1 dev dummy1 nud failed
echo ".1 failed:" # results should not use .1
for i in {10..19}; do ip -o rou get 198.51.100.$i; done
ip neigh del 192.0.2.1 dev dummy1
ip neigh add 192.0.2.2 dev dummy1 nud failed
echo ".2 failed:" # results should not use .2
for i in {10..19}; do ip -o rou get 198.51.100.$i; done
ip link del dummy1
ip link add dummy1 up type dummy
ip nexthop add id 1 via 192.0.2.1 dev dummy1 onlink
ip nexthop add id 2 via 192.0.2.2 dev dummy1 onlink
ip nexthop add id 1001 group 1/2
ip rou add 198.51.100.0/24 nhid 1001
for i in {10..19}; do ip -o rou get 198.51.100.$i; done
ip neigh add 192.0.2.1 dev dummy1 nud failed
echo ".1 failed:" # results should not use .1
for i in {10..19}; do ip -o rou get 198.51.100.$i; done
ip neigh del 192.0.2.1 dev dummy1
ip neigh add 192.0.2.2 dev dummy1 nud failed
echo ".2 failed:" # results should not use .2
for i in {10..19}; do ip -o rou get 198.51.100.$i; done
ip link del dummy1
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
---
net/ipv4/nexthop.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index c12acbf39659..93f14d39fef6 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -1194,20 +1194,22 @@ static struct nexthop *nexthop_select_path_hthr(struct nh_group *nhg, int hash)
for (i = 0; i < nhg->num_nh; ++i) {
struct nh_grp_entry *nhge = &nhg->nh_entries[i];
- if (hash > atomic_read(&nhge->hthr.upper_bound))
- continue;
-
/* nexthops always check if it is good and does
* not rely on a sysctl for this behavior
*/
- if (nexthop_is_good_nh(nhge->nh))
- return nhge->nh;
+ if (!nexthop_is_good_nh(nhge->nh))
+ continue;
if (!rc)
rc = nhge->nh;
+
+ if (hash > atomic_read(&nhge->hthr.upper_bound))
+ continue;
+
+ return nhge->nh;
}
- return rc;
+ return rc ? : nhg->nh_entries[0].nh;
}
static struct nexthop *nexthop_select_path_res(struct nh_group *nhg, int hash)
--
2.40.1
next prev parent reply other threads:[~2023-07-19 13:57 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-19 13:57 [PATCH net-next v2 0/4] nexthop: Refactor and fix nexthop selection for multipath routes Benjamin Poirier
2023-07-19 13:57 ` [PATCH net-next v2 1/4] nexthop: Factor out hash threshold fdb nexthop selection Benjamin Poirier
2023-07-19 13:57 ` [PATCH net-next v2 2/4] nexthop: Factor out neighbor validity check Benjamin Poirier
2023-07-19 13:57 ` Benjamin Poirier [this message]
2023-07-20 15:18 ` [PATCH net-next v2 3/4] nexthop: Do not return invalid nexthop object during multipath selection David Ahern
2023-07-19 13:57 ` [PATCH net-next v2 4/4] selftests: net: Add test cases for nexthop groups with invalid neighbors Benjamin Poirier
2023-07-21 3:30 ` [PATCH net-next v2 0/4] nexthop: Refactor and fix nexthop selection for multipath routes patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230719-nh_select-v2-3-04383e89f868@nvidia.com \
--to=bpoirier@nvidia.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=edumazet@google.com \
--cc=idosch@nvidia.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=shuah@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox