From: Florian Westphal <fw@strlen.de>
To: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Cc: Florian Westphal <fw@strlen.de>,
"netfilter-devel@vger.kernel.org"
<netfilter-devel@vger.kernel.org>,
kernel test robot <lkp@intel.com>,
"claudio.porfiri@ericsson.com" <claudio.porfiri@ericsson.com>
Subject: Re: [PATCH v2 1/2] netfilter: conntrack: introduce no_random_port proc entry
Date: Wed, 2 Nov 2022 15:00:25 +0100
Message-ID: <20221102140025.GF5040@breakpoint.cc>
In-Reply-To: <7c24bfe4-94be-6eab-d30a-6dc0500652da@est.tech>

Sriram Yagnaraman <sriram.yagnaraman@est.tech> wrote:
> On 2022-10-31 09:38, Florian Westphal wrote:
>
> > sriram.yagnaraman@est.tech <sriram.yagnaraman@est.tech> wrote:
> >> From: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
> >>
> >> This patch introduces a new proc entry to disable source port
> >> randomization for SCTP connections.
> > Hmm. Can you elaborate? The sport is never randomized, unless either
> > 1. User explicitly requested it via "random" flag passed to snat rule, or
> > 2. there is an existing connection, using the *same* sport:saddr -> daddr:dport
> > quadruple as the new request.
> >
> > In 2), this new toggle prevents communication. So I wonder why ...
>
> Thank you so much for the detailed review comments.
>
> My use case for this flag originates from a deployment of SCTP client
> endpoints in docker/kubernetes environments, where there typically exist
> SNAT rules for the endpoints on egress. The *users* in this case are the
> CNI plugins that configure the SNAT rules, and some of the most common
> plugins use --random-fully regardless of the protocol.
>
> Consider an SCTP association A -> B, which has two paths via NAT A and B
> A: 1.2.3.4:12345
> B: 5.6.7.8/9:42
> NAT A: 1.2.31.4 (used for path towards 5.6.7.8)
> NAT B: 1.2.32.4 (used for path towards 5.6.7.9)
>
>              ┌───────┐   ┌───┐
>           ┌──► NAT A ├───►   │
> ┌─────┐   │  └───────┘   │   │
> │  A  ├───┤              │ B │
> └─────┘   │  ┌───────┐   │   │
>           └──► NAT B ├───►   │
>              └───────┘   └───┘
>
> Let us assume in NAT A (1.2.31.4), the connection is set up as
>     ORIGINAL TUPLE                REPLY TUPLE
>     1.2.3.4:12345 -> 5.6.7.8:42,  5.6.7.8:42 -> 1.2.31.4:33333
>
> Let us assume in NAT B (1.2.32.4), the connection is set up as
>     ORIGINAL TUPLE                REPLY TUPLE
>     1.2.3.4:12345 -> 5.6.7.9:42,  5.6.7.9:42 -> 1.2.32.4:44444
>
> Since the port numbers are different when viewed from B, the association
> will not become multihomed, with only the primary path being active.
> Moreover, on a NAT/middlebox restart, we will end up getting new ports.
>
> I understand this is a problem in the way SNAT rules are configured; my
> proposal was to have this flag as a means of preventing such a problem
> even if the user requested randomization.

Ugh, sorry, but that sounds just wrong.

> >> As specified in RFC9260 all transport addresses used by an SCTP endpoint
> >> MUST use the same port number but can use multiple IP addresses. That
> >> means that all paths taken within an SCTP association should have the
> >> same port even if they pass through different NAT/middleboxes in the
> >> network.

Hmm, I don't understand WHY this requirement exists, since endpoints
cannot control the source port (or source address) seen by the peer;
NAT won't go away.

I read that snippet several times and it's not clear to me if
"port number" refers to sport or dport. Dport would make sense to me,
but sport...? No, not really.

Won't the endpoints notice that the path is down and re-create the flow?

AFAIU the root cause of your problem is that:
1. NAT middleboxes remap source port AND
2. NAT middleboxes restart frequently

... so fixing either 1 or 2 would avoid the problem.
I don't think adding sysctls to override 1) is a sane option.

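To make point 1 concrete, here is a rough toy model of the source port choice
described earlier in this thread: the sport is kept unless randomization was
requested or the port is already taken. This is only a sketch, not the kernel
code. `pick_sport`, the `in_use` set and the port range are made-up names and
values for illustration, and the real tuple clash check is reduced to a plain
set lookup.

```python
import random


def pick_sport(orig_sport, in_use, randomize=False, rng=None,
               port_range=(1024, 65535)):
    """Toy model of SNAT source port selection (NOT the kernel algorithm).

    orig_sport: source port chosen by the endpoint.
    in_use:     ports already taken for the same saddr -> daddr:dport
                (hypothetical stand-in for the conntrack clash check).
    randomize:  models the "random" flag on the snat rule.
    """
    rng = rng or random.Random()
    lo, hi = port_range

    # Case 0: no random flag and no clash, the source port is preserved.
    if not randomize and orig_sport not in in_use:
        return orig_sport

    # Case 1 (random flag) or case 2 (clash): pick some other free port.
    candidates = [p for p in range(lo, hi + 1) if p not in in_use]
    if not candidates:
        raise RuntimeError("no free source port left")
    return rng.choice(candidates)


# Two independent NAT boxes in front of the same endpoint.
nat_a_port = pick_sport(12345, in_use=set())
nat_b_port = pick_sport(12345, in_use=set())
print(nat_a_port, nat_b_port)
```

Without the random flag both boxes keep the original port, so the peer sees
the same sport on both paths. With `randomize=True` each box picks
independently, and nothing makes the two choices agree.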
> Since the flag is optional, the idea is to enable it only on hosts that
> are part of docker/kubernetes environments and use NAT in their datapath.
We can't fix the ruleset but we can somehow cure it via sysctl in each netns?
I don't like this.

NAT middlebox restart with --random is a problem in any case, not just
for SCTP, because the chosen "random port" is lost.
I don't see a way to fix this, other than NOT using --random mode.

If a connection is subject to sequence number rewrite (for tcp),
the connection won't survive either, as the seqadj state is lost.
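
The restart problem can be illustrated the same way. Below is a toy sketch
(again, not kernel code) of a NAT box whose translation table lives only in
memory. `ToyNat` and its methods are hypothetical names, and port clashes are
ignored to keep it short.

```python
import random


class ToyNat:
    """Toy NAT box: flow -> translated source port, kept only in memory."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self.table = {}  # (saddr, sport, daddr, dport) -> translated sport

    def translate(self, flow, randomize):
        # An existing entry is reused; a new entry either keeps the
        # original sport or draws a random one (models --random).
        if flow not in self.table:
            sport = flow[1]
            self.table[flow] = (
                self._rng.randint(1024, 65535) if randomize else sport
            )
        return self.table[flow]

    def restart(self):
        # All state is lost, including any randomly chosen port.
        self.table.clear()


flow = ("1.2.3.4", 12345, "5.6.7.8", 42)

nat = ToyNat()
before = nat.translate(flow, randomize=False)
nat.restart()
after = nat.translate(flow, randomize=False)
print(before, after)
```

Without randomization the mapping can be recomputed after a restart because
it only depends on the packet. With `randomize=True` the port drawn after
`restart()` has no relation to the one drawn before, so the peer sees a
different flow.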