From: Ido Schimmel <idosch@nvidia.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org,
edumazet@google.com, pabeni@redhat.com, dsahern@kernel.org,
horms@kernel.org, kuniyu@amazon.com,
Willem de Bruijn <willemb@google.com>
Subject: Re: [PATCH net-next v2 2/3] ip: load balance tcp connections to single dst addr and port
Date: Fri, 25 Apr 2025 18:14:53 +0300
Message-ID: <aAum7Wj5rRuwzGUp@shredder>
In-Reply-To: <20250424143549.669426-3-willemdebruijn.kernel@gmail.com>

On Thu, Apr 24, 2025 at 10:35:19AM -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> Load balance new TCP connections across nexthops also when they
> connect to the same service at a single remote address and port.
>
> This affects only port-based multipath hashing:
> fib_multipath_hash_policy 1 or 3.
>
> Local connections must choose both a source address and port when
> connecting to a remote service, in ip_route_connect. This
> "chicken-and-egg problem" (commit 2d7192d6cbab ("ipv4: Sanitize and
> simplify ip_route_{connect,newports}()")) is resolved by first
> selecting a source address, by looking up a route using the zero
> wildcard source port and address.
>
> As a result multiple connections to the same destination address and
> port have no entropy in fib_multipath_hash.
>
> This is not a problem when forwarding, as skb-based hashing has a
> 4-tuple. Nor when establishing UDP connections, as autobind there
> selects a port before reaching ip_route_connect.
>
> Load balance also TCP, by using a random port in fib_multipath_hash.
> Port assignment in inet_hash_connect is not atomic with
> ip_route_connect. Thus ports are unpredictable, effectively random.
>
> Implementation details:
>
> Do not actually pass a random fl4_sport, as that affects not only
> hashing, but routing more broadly, and can match a source port based
> policy route, which existing wildcard port 0 will not. Instead,
> define a new wildcard flowi flag that is used only for hashing.
>
> Selecting a random source is equivalent to just selecting a random
> hash entirely. But for code clarity, follow the normal 4-tuple hash
> process and only update this field.
>
> fib_multipath_hash can be reached with zero sport from other code
> paths, so explicitly pass this flowi flag, rather than trying to infer
> this case in the function itself.
>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
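
For readers following the thread, the hashing-only wildcard described in
the commit message can be illustrated with a small userspace sketch. This
is not the patch itself: the flag name SKETCH_FLAG_ANY_SPORT, the struct
flow4 layout, the FNV-style mixer and the use of rand() are all
assumptions made for illustration. The point it tries to show is that the
flow's source port stays at the zero wildcard for routing purposes, and a
random port is substituted only while computing the multipath hash.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flag name; stands in for the new wildcard flowi flag. */
#define SKETCH_FLAG_ANY_SPORT 0x1u

/* Simplified, illustrative flow key; not the kernel's struct flowi4. */
struct flow4 {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t proto;
	uint32_t flags;
};

/* Toy FNV-1a style mixer, a stand-in for the kernel's flow hashing. */
static uint32_t mix(uint32_t h, uint32_t v)
{
	for (int i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xffu;
		h *= 16777619u;
	}
	return h;
}

/* Follow the normal 4-tuple hash, but replace only the source port
 * with a random value when the caller explicitly asks for it. */
static uint32_t sketch_multipath_hash(const struct flow4 *fl)
{
	uint16_t sport = fl->sport;

	if (fl->flags & SKETCH_FLAG_ANY_SPORT)
		sport = (uint16_t)rand(); /* hashing only; fl->sport is untouched */

	uint32_t h = 2166136261u;
	h = mix(h, fl->saddr);
	h = mix(h, fl->daddr);
	h = mix(h, ((uint32_t)sport << 16) | fl->dport);
	h = mix(h, fl->proto);
	return h;
}

/* Map the hash onto one of n nexthops. */
static unsigned int sketch_select_nexthop(const struct flow4 *fl, unsigned int n)
{
	assert(n > 0);
	return sketch_multipath_hash(fl) % n;
}
```

Without the flag, repeated lookups for the same destination address and
port hash identically and therefore pick the same nexthop. With the flag
set, repeated lookups spread across nexthops while fl->sport remains 0,
so a source-port-based policy route would still see the wildcard.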