From: Mat Martineau <mathew.j.martineau@linux.intel.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: Florian Westphal <fw@strlen.de>,
Geliang Tang <geliangtang@gmail.com>,
mptcp@lists.linux.dev
Subject: Re: [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag
Date: Wed, 19 May 2021 11:06:01 -0700 (PDT) [thread overview]
Message-ID: <56ee8dc0-e1b1-84c1-acbd-154cc24ee098@linux.intel.com> (raw)
In-Reply-To: <ef31d1b4b0456a02f0e8515c449ea4fd4e7c1611.camel@redhat.com>
On Wed, 19 May 2021, Paolo Abeni wrote:
> On Tue, 2021-05-18 at 12:07 -0700, Mat Martineau wrote:
>> On Tue, 18 May 2021, Paolo Abeni wrote:
>>> The in-kernel path manager is want to be as simple as possible. As soon
>>> as we start allocating memory on peer req it's very easy to be prone to
>>> DoS. Memory accounting and limits could save us or at least mitigate
>>> the problem, but they will add more complexity.
>>
>> My assumption was that high-connection-count servers would probably have
>> their limits set in a way that ADD_ADDRs would not be accepted anyway.
>> This is the use case where clients may be behind a NAT and the server is
>> accepting MP_JOINs from the client (but the server is never sending
>> MP_JOINs), and the server is possibly sending ADD_ADDRs to the client. The
>> configurable limit is already there from the existing in-kernel PM code.
>>
>> I do agree with "as simple as possible", but the RFC requirements are
>> constraining that. Even though the C bit is ignored in Linux
>> implementations so far, I don't think it should stay that way. The current
>> code always tries to connect to the initial subflow, so we need to do
>> *something* different to remember an alternate peer address. There are
>> other options (like remembering only the most recent ADD_ADDR), but given
>> the configurable limits we have it doesn't seem that different
>> memory-wise. What's important is implementing this RFC functionality well,
>> so it's really helpful to explore the alternatives here - and your
>> understanding of the resource issues and DoS tradeoffs is very useful.
>
> Uhmmm... I start feeling I misunderstand the RFC WRT to 'C' bit
> handling.
>
> """
> The third bit, labeled "C", is set to 1 to indicate
> that the sender of this option will not accept
> additional MPTCP subflows to the source address and
> port, and therefore the receiver MUST NOT try to open
> any additional subflows toward this address and port.
> """
>
> To avoid creating additional subflows towards the MPC server
> address/port, we don't need to store any additional information beyond
> the 'C' flag: net-next uses that pair only for new subflow with
> 'subflow' type (yep, the type name is not very self-explaining :(
>
> We just need to forbit them:
>
> ---
> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
> index d094588afad8..2ae9fac00623 100644
> --- a/net/mptcp/pm_netlink.c
> +++ b/net/mptcp/pm_netlink.c
> @@ -451,7 +451,7 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
>
> /* check if should create a new subflow */
> if (msk->pm.local_addr_used < local_addr_max &&
> - msk->pm.subflows < subflows_max) {
> + msk->pm.subflows < subflows_max && !msk->pm.remote_deny_join_id0) {
> local = select_local_address(pernet, msk);
> if (local) {
> struct mptcp_addr_info remote = { 0 };
>
> ---
>
>>> Just does not seam worthy to me, I really think we should start with
>>> the bare minumum to respect the RFC itself, avoiding as much complexity
>>> as possible:
>>> - setting C always to 0 on server
>>> - when the peer send C==1, cache the MPC addr/port, and do not attempt
>>> other connection to such endpoint.
>>>
>>> I'm not even sure the latter step is strictly necessary.
>>
>> For the former, it is certainly simpler to always send C==0, but sending
>> C==1 from a high-connection-count server behind a load balancer is the
>> primary use case.
>
> I *think* it would be far better (and possibly not that complex!!!
> @Florian: WDYT? ) implementing an MPTCP aware load balancer. The MPJ
> syn pkt carries the remote token, so it's possible for the load
> balancer to make the "correct" decision.
>
>> I think the "MUST NOT" wording of the RFC does require the latter step
>> (server or client), and that implies a minimally functional in-kernel PM
>> would need to track some other address to connect to.
>
> This part is unclear to me. Why should the kernel cache more
> addresses?
>
> It's already capabale of using as many ADD_ADDRs as the peer provides
> with the current options handling. Sure, there are some constraints.
> The stricter one posed by the current implementation is that the kernel
> fully processes (up to subflow creation) a given ADD_ADDR option before
> using another one.
>
> That is useful to avoid transforming an MPTCP client in a DDOS tool
> controlled by an (evil) MPTCP server, and sounds like a reasonable
> tread-off between complexity and flexibility.
>
> Looks like we have some topic for tomorrow's mtg :)
>
Yeah, I think it will be far easier to discuss in that format!
--
Mat Martineau
Intel
prev parent reply other threads:[~2021-05-19 18:06 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-14 14:32 [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 1/6] mptcp: add sysctl allow_join_initial_addr_port Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 2/6] mptcp: add allow_join_id0 in mptcp_out_options Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 3/6] mptcp: rename mptcp_pm_add_entry to mptcp_pm_anno_entry Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 5/6] mptcp: add deny_join_id0 in mptcp_options_received Geliang Tang
2021-05-14 14:32 ` [MPTCP][PATCH v6 mptcp-next 6/6] selftests: mptcp: add deny_join_id0 testcases Geliang Tang
2021-05-20 22:50 ` [MPTCP][PATCH v6 mptcp-next 4/6] mptcp: add add_list in mptcp_pm_data Mat Martineau
2021-05-14 23:36 ` [MPTCP][PATCH v6 mptcp-next 0/6] add MP_CAPABLE 'C' flag Mat Martineau
2021-05-17 14:17 ` Paolo Abeni
2021-05-17 19:59 ` Mat Martineau
2021-05-18 10:11 ` Geliang Tang
2021-05-18 13:56 ` Paolo Abeni
2021-05-18 19:07 ` Mat Martineau
2021-05-19 9:51 ` Paolo Abeni
2021-05-19 18:06 ` Mat Martineau [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56ee8dc0-e1b1-84c1-acbd-154cc24ee098@linux.intel.com \
--to=mathew.j.martineau@linux.intel.com \
--cc=fw@strlen.de \
--cc=geliangtang@gmail.com \
--cc=mptcp@lists.linux.dev \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.