netdev.vger.kernel.org archive mirror
From: Ido Schimmel <idosch@idosch.org>
To: Dmitry <demetriousz@proton.me>, willemdebruijn.kernel@gmail.com
Cc: "David S. Miller" <davem@davemloft.net>,
	David Ahern <dsahern@kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next] net: ipv6: respect route prfsrc and fill empty saddr before ECMP hash
Date: Tue, 7 Oct 2025 20:04:01 +0300
Message-ID: <aOVIAWAxpWto8ETd@shredder>
In-Reply-To: <MZruGuax8jyrCcZTXAVhH0AaAMOZ-2Gcj5VeZO8xy8wS9FqwA3EMhPFpHLZs67FAKCu6z3GpEVeArSX2qGdSUqsysI-0o13dKK1ZmUhK_l0=@proton.me>

On Mon, Oct 06, 2025 at 06:31:10PM +0000, Dmitry wrote:
> If the 5-tuple is not changed, then both the hash and the outgoing interface
> (OIF) should remain consistent, which is not the case. Only with the fix does it
> respect the configured SRC and produce a consistent, correct 5-tuple with the
> proper hash.
> 
> Therefore, in my opinion, this should be fixed.

Note that even if the hash is consistent throughout the lifetime of the
socket, it is still possible for packets to be routed out of different
interfaces. This can happen, for example, if one of the nexthop devices
loses its carrier. This will change the hash thresholds in the ECMP
group and can cause packets to egress a different interface even if the
current one is not the one that went down. Obviously packets can also
change paths due to changes in other routers between you and the
destination. A network design that results in connections being severed
every time a flow is routed differently seems fragile to me.
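
To illustrate the point about hash thresholds, here is a minimal
sketch of hash-threshold nexthop selection. It is a simplified model
and not the kernel code: the 31-bit hash space, the
(name, weight, alive) tuples, the interface names and the flow hash
value are all assumptions made for the example.

```python
HASH_BITS = 31  # assumed size of the hash space for this model


def upper_bounds(nexthops):
    """Return (name, upper_bound) per nexthop; dead nexthops get -1.

    Each live nexthop owns a contiguous slice of the hash space that is
    proportional to its weight. Removing a nexthop shifts every slice.
    """
    total = sum(weight for _, weight, alive in nexthops if alive)
    bounds = []
    acc = 0
    for name, weight, alive in nexthops:
        if not alive:
            bounds.append((name, -1))  # never matches a flow hash
            continue
        acc += weight
        bounds.append((name, (acc << HASH_BITS) // total - 1))
    return bounds


def select_nexthop(flow_hash, nexthops):
    """Pick the first nexthop whose upper bound covers the flow hash."""
    for name, bound in upper_bounds(nexthops):
        if flow_hash <= bound:
            return name
    raise ValueError("no usable nexthop")


flow_hash = 1_200_000_000  # hypothetical, constant for the whole flow

all_up = [("eth0", 1, True), ("eth1", 1, True), ("eth2", 1, True)]
eth0_down = [("eth0", 1, False), ("eth1", 1, True), ("eth2", 1, True)]

print(select_nexthop(flow_hash, all_up))     # eth1
print(select_nexthop(flow_hash, eth0_down))  # eth2
```

In this model the flow was on eth1, eth0 is the device that lost its
carrier, and the unchanged hash now lands on eth2.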

If you still want to address the issue, then I believe that the correct
way to do it would be to align tcp_v6_connect() with tcp_v4_connect().
I'm not sure why they differ, but the IPv4 version first does a route
lookup to determine the source address, then allocates a source port,
and only once all the parameters are known does a final route lookup
and caches the result in the socket. IPv6, on the other hand, does a
single route lookup with an unknown source address and an unknown source
port.
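
The difference between the two connect flows can be sketched with a
toy model. Nothing below is kernel code: toy_hash() uses CRC32 as a
stand-in for the real multipath hash, route_lookup() picks a nexthop
by simple modulo, and NEXTHOPS, SADDR_BY_DEV and the addresses are
hypothetical values chosen for the example.

```python
import zlib

NEXTHOPS = ["eth0", "eth1"]  # hypothetical two-way ECMP group
SADDR_BY_DEV = {             # hypothetical source address per device
    "eth0": "2001:db8:0::1",
    "eth1": "2001:db8:1::1",
}


def toy_hash(saddr, sport, daddr, dport, proto=6):
    """Stand-in 5-tuple hash (CRC32), not the kernel's hash function."""
    key = f"{saddr}|{sport}|{daddr}|{dport}|{proto}".encode()
    return zlib.crc32(key)


def route_lookup(saddr, sport, daddr, dport):
    """Pick a nexthop from the hash of whatever fields are known."""
    return NEXTHOPS[toy_hash(saddr, sport, daddr, dport) % len(NEXTHOPS)]


def connect_single_lookup(daddr, dport):
    """Single lookup with unspecified source address and port."""
    return route_lookup("::", 0, daddr, dport)


def connect_two_step(daddr, dport, sport):
    """First lookup picks the source address, final lookup uses it all.

    `sport` stands in for the source port allocated between lookups.
    """
    dev = route_lookup("::", 0, daddr, dport)
    saddr = SADDR_BY_DEV[dev]
    return saddr, route_lookup(saddr, sport, daddr, dport)


daddr, dport, sport = "2001:db8:ffff::1", 443, 40000
cached_single = connect_single_lookup(daddr, dport)
saddr, cached_final = connect_two_step(daddr, dport, sport)
print("single lookup caches:", cached_single)
print("two-step caches:", cached_final, "with saddr", saddr)
```

In the two-step variant the cached result is computed from the same
5-tuple that the packets carry. In the single-lookup variant it is
computed from a tuple with an empty source address and port, so the
two results need not agree.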

This is explained in the comment above ip_route_connect_init() and
Willem also explained it here:

https://lore.kernel.org/all/20250424143549.669426-2-willemdebruijn.kernel@gmail.com/

Willem, do you happen to know why tcp_v6_connect() only performs a
single route lookup?

Link to the original patch:

https://lore.kernel.org/netdev/20251005-ipv6-set-saddr-to-prefsrc-before-hash-to-stabilize-ecmp-v1-1-d43b6ef00035@proton.me/

Thanks
