From: Mat Martineau <mathew.j.martineau@linux.intel.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: Florian Westphal <fw@strlen.de>,
Geliang Tang <geliangtang@gmail.com>,
mptcp@lists.linux.dev
Subject: Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
Date: Wed, 19 May 2021 11:06:01 -0700 (PDT) [thread overview]
Message-ID: <56ee8dc0-e1b1-84c1-acbd-154cc24ee098@linux.intel.com> (raw)
In-Reply-To: <ef31d1b4b0456a02f0e8515c449ea4fd4e7c1611.camel@redhat.com>
On Wed, 19 May 2021, Paolo Abeni wrote:
> On Tue, 2021-05-18 at 12:07 -0700, Mat Martineau wrote:
>> On Tue, 18 May 2021, Paolo Abeni wrote:
>>> The in-kernel path manager is want to be as simple as possible. As soon
>>> as we start allocating memory on peer req it's very easy to be prone to
>>> DoS. Memory accounting and limits could save us or at least mitigate
>>> the problem, but they will add more complexity.
>>
>> My assumption was that high-connection-count servers would probably have
>> their limits set in a way that ADD_ADDRs would not be accepted anyway.
>> This is the use case where clients may be behind a NAT and the server is
>> accepting MP_JOINs from the client (but the server is never sending
>> MP_JOINs), and the server is possibly sending ADD_ADDRs to the client. The
>> configurable limit is already there from the existing in-kernel PM code.
>>
>> I do agree with "as simple as possible", but the RFC requirements are
>> constraining that. Even though the C bit is ignored in Linux
>> implementations so far, I don't think it should stay that way. The current
>> code always tries to connect to the initial subflow, so we need to do
>> *something* different to remember an alternate peer address. There are
>> other options (like remembering only the most recent ADD_ADDR), but given
>> the configurable limits we have it doesn't seem that different
>> memory-wise. What's important is implementing this RFC functionality well,
>> so it's really helpful to explore the alternatives here - and your
>> understanding of the resource issues and DoS tradeoffs is very useful.
>
> Uhmmm... I start feeling I misunderstand the RFC WRT to 'C' bit
> handling.
>
> """
> The third bit, labeled "C", is set to 1 to indicate
> that the sender of this option will not accept
> additional MPTCP subflows to the source address and
> port, and therefore the receiver MUST NOT try to open
> any additional subflows toward this address and port.
> """
>
> To avoid creating additional subflows towards the MPC server
> address/port, we don't need to store any additional information beyond
> the 'C' flag: net-next uses that pair only for new subflow with
> 'subflow' type (yep, the type name is not very self-explaining :(
>
> We just need to forbit them:
>
> ---
> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
> index d094588afad8..2ae9fac00623 100644
> --- a/net/mptcp/pm_netlink.c
> +++ b/net/mptcp/pm_netlink.c
> @@ -451,7 +451,7 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
>
> /* check if should create a new subflow */
> if (msk->pm.local_addr_used < local_addr_max &&
> - msk->pm.subflows < subflows_max) {
> + msk->pm.subflows < subflows_max && !msk->pm.remote_deny_join_id0) {
> local = select_local_address(pernet, msk);
> if (local) {
> struct mptcp_addr_info remote = { 0 };
>
> ---
>
>>> Just does not seam worthy to me, I really think we should start with
>>> the bare minumum to respect the RFC itself, avoiding as much complexity
>>> as possible:
>>> - setting C always to 0 on server
>>> - when the peer send C==1, cache the MPC addr/port, and do not attempt
>>> other connection to such endpoint.
>>>
>>> I'm not even sure the latter step is strictly necessary.
>>
>> For the former, it is certainly simpler to always send C==0, but sending
>> C==1 from a high-connection-count server behind a load balancer is the
>> primary use case.
>
> I *think* it would be far better (and possibly not that complex!!!
> @Florian: WDYT? ) implementing an MPTCP aware load balancer. The MPJ
> syn pkt carries the remote token, so it's possible for the load
> balancer to make the "correct" decision.
>
>> I think the "MUST NOT" wording of the RFC does require the latter step
>> (server or client), and that implies a minimally functional in-kernel PM
>> would need to track some other address to connect to.
>
> This part is unclear to me. Why should the kernel cache more
> addresses?
>
> It's already capabale of using as many ADD_ADDRs as the peer provides
> with the current options handling. Sure, there are some constraints.
> The stricter one posed by the current implementation is that the kernel
> fully processes (up to subflow creation) a given ADD_ADDR option before
> using another one.
>
> That is useful to avoid transforming an MPTCP client in a DDOS tool
> controlled by an (evil) MPTCP server, and sounds like a reasonable
> tread-off between complexity and flexibility.
>
> Looks like we have some topic for tomorrow's mtg :)
>
Yeah, I think it will be far easier to discuss in that format!
--
Mat Martineau
Intel
prev parent reply other threads:[~2021-05-19 18:06 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-14 14:32 [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 1/6] mptcp: add sysctl allow_join_initial_addr_port Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 2/6] mptcp: add allow_join_id0 in mptcp_out_options Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 3/6] mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 5/6] mptcp: add deny_join_id0 in mptcp_options_received Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 6/6] selftests: mptcp: add deny_join_id0 testcases Geliang Tang
2021-05-20 22:50 ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Mat Martineau
2021-05-14 23:36 ` [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Mat Martineau
2021-05-17 14:17 ` Paolo Abeni
2021-05-17 19:59 ` Mat Martineau
2021-05-18 10:11 ` Geliang Tang
2021-05-18 13:56 ` Paolo Abeni
2021-05-18 19:07 ` Mat Martineau
2021-05-19 9:51 ` Paolo Abeni
2021-05-19 18:06 ` Mat Martineau [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56ee8dc0-e1b1-84c1-acbd-154cc24ee098@linux.intel.com \
--to=mathew.j.martineau@linux.intel.com \
--cc=fw@strlen.de \
--cc=geliangtang@gmail.com \
--cc=mptcp@lists.linux.dev \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox