From: Paolo Abeni <pabeni@redhat.com>
To: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org, mptcp@lists.linux.dev,
linux-kernel@vger.kernel.org,
Neal Cardwell <ncardwell@google.com>,
Kuniyuki Iwashima <kuniyu@google.com>,
"Matthieu Baerts (NGI0)" <matttbe@kernel.org>,
Mat Martineau <martineau@kernel.org>,
Geliang Tang <geliang@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Simon Horman <horms@kernel.org>
Subject: Re: [PATCH net-next v2 05/15] tcp: allow mptcp to drop TS for some packets
Date: Thu, 11 Jun 2026 09:39:21 +0200 [thread overview]
Message-ID: <fb96d949-dcf5-494d-8076-eb4a972fa5e4@redhat.com> (raw)
In-Reply-To: <20260605-net-next-mptcp-add-addr6-port-ts-v2-5-758e7ca73f4d@kernel.org>
Hi Eric,
On 6/5/26 11:21 AM, Matthieu Baerts (NGI0) wrote:
> With TCP-timestamps (padded) taking 12 bytes and ADD_ADDR IPv6 + port
> taking 30 bytes, the 40-byte limit for the TCP options is reached. In
> this case, it is then not possible to send the address signal.
>
> The idea is to let MPTCP dropping the TCP-timestamps option for some
> specific packets, to be able to send some specific pure ACK carrying >28
> bytes of MPTCP options, like with this specific ADD_ADDR. A new
> parameter is passed from tcp_established_options to the MPTCP side to
> indicate if the TCP TS option is used, and if it should be dropped. The
> next commit implements the part on MPTCP side, but split into two
> patches to help TCP maintainers to identify the modifications on TCP
> side. This feature will be controlled by a new add_addr_v6_port_drop_ts
> MPTCP sysctl knob.
>
> It is important to keep in mind that dropping the TCP timestamps option
> for one packet of the connection could eventually disrupt some
> middleboxes: even if it should be unlikely, they could drop the packet
> or even block the connection. That's why this new feature will be
> controlled by a sysctl knob.
>
> Note that it would be technically possible to squeeze both options into
> the header if the ADD_ADDR is first written, and then the TCP timestamps
> without the NOPs preceding it. But this means more modifications on TCP
> side, plus some middleboxes could still be disrupted by that.
>
> In this implementation, an unused bit is used in mptcp_out_options
> structure to avoid passing an address to a local variable. Reading and
> setting it needs CONFIG_MPTCP, so the whole block now has this #if
> condition: mptcp_established_options() is then no longer used without
> CONFIG_MPTCP.
>
> About alternatives, instead of passing a new boolean (has_ts), another
> option would be to pass the whole option structure (opts), but
> 'struct tcp_out_options' is currently defined in tcp_output.c, and it
> would need to be exported. Plus that means the removal of the TCP TS
> option would be done on the MPTCP side, and not here on the TCP side.
> It feels clearer to remove other TCP options from the TCP side, than
> hiding that from the MPTCP side.
>
> Yet an other alternative would be to pass the size already taken by the
> other TCP options, and have a way to drop them all when needed. But this
> feels better to target only the timestamps option where dropping it
> should be safe, even if it is currently the only option that would be
> set before MPTCP, when MPTCP is used.
>
> Reviewed-by: Mat Martineau <martineau@kernel.org>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
> - v2: Avoid passing local variables' addresses to
> mptcp_established_options not to force the compiler to use a stack
> canary in this hot function, even for non-MPTCP flows. (Eric Dumazet)
> To: Neal Cardwell <ncardwell@google.com>
> To: Kuniyuki Iwashima <kuniyu@google.com>
> ---
> include/net/mptcp.h | 13 +++----------
> net/ipv4/tcp_output.c | 10 +++++++++-
> net/mptcp/options.c | 2 +-
> 3 files changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/mptcp.h b/include/net/mptcp.h
> index 24d1016a4664..71b9fc5a5796 100644
> --- a/include/net/mptcp.h
> +++ b/include/net/mptcp.h
> @@ -72,7 +72,8 @@ struct mptcp_out_options {
> u8 reset_reason:4,
> reset_transient:1,
> csum_reqd:1,
> - allow_join_id0:1;
> + allow_join_id0:1,
> + drop_ts:1;
> union {
> struct {
> u64 sndr_key;
> @@ -153,7 +154,7 @@ bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
> bool mptcp_synack_options(const struct request_sock *req, unsigned int *size,
> struct mptcp_out_options *opts);
> int mptcp_established_options(struct sock *sk, struct sk_buff *skb,
> - unsigned int remaining,
> + unsigned int remaining, bool has_ts,
> struct mptcp_out_options *opts);
> bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb);
>
> @@ -269,14 +270,6 @@ static inline bool mptcp_synack_options(const struct request_sock *req,
> return false;
> }
>
> -static inline int mptcp_established_options(struct sock *sk,
> - struct sk_buff *skb,
> - unsigned int remaining,
> - struct mptcp_out_options *opts)
> -{
> - return -1;
> -}
> -
> static inline bool mptcp_incoming_options(struct sock *sk,
> struct sk_buff *skb)
> {
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index d3b8e61d3c5e..26dd751ec72a 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1175,6 +1175,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
> size += TCPOLEN_TSTAMP_ALIGNED;
> }
>
> +#if IS_ENABLED(CONFIG_MPTCP)
> /* MPTCP options have precedence over SACK for the limited TCP
> * option space because a MPTCP connection would be forced to
> * fall back to regular TCP if a required multipath option is
> @@ -1183,15 +1184,22 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
> */
> if (sk_is_mptcp(sk)) {
> unsigned int remaining = MAX_TCP_OPTION_SPACE - size;
> + bool has_ts = opts->options & OPTION_TS;
> int opt_size;
>
> - opt_size = mptcp_established_options(sk, skb, remaining,
> + opts->mptcp.drop_ts = 0;
> +
> + opt_size = mptcp_established_options(sk, skb, remaining, has_ts,
> &opts->mptcp);
> if (opt_size >= 0) {
> opts->options |= OPTION_MPTCP;
> size += opt_size;
> +
> + if (opts->mptcp.drop_ts)
> + opts->options &= ~OPTION_TS;
I'm wondering if you are ok with this patch in the current form?
One thing that was discussed on the mptcp ML was exposing the
tcp_out_options layout so that mptcp_established_options() could receive
such argument and likely clean-up a bit this code.
Not done here because placing tcp_out_options under `include` felt a bit
"too much". Perhaps adding a `net/ipv4/tcp_option.h` header (and
including it from the mptcp code) would be more palatable?
Thanks,
Paolo
next prev parent reply other threads:[~2026-06-11 7:39 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-05 9:21 [PATCH net-next v2 00/15] mptcp: pm: drop TCP TS with ADD_ADDRv6 + port Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 01/15] mptcp: options: suboptions sizes can be negative Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 02/15] mptcp: pm: avoid computing rm_addr size twice Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 03/15] mptcp: pm: avoid computing add_addr " Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 04/15] mptcp: introduce add_addr_v6_port_drop_ts sysctl knob Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 05/15] tcp: allow mptcp to drop TS for some packets Matthieu Baerts (NGI0)
2026-06-11 7:39 ` Paolo Abeni [this message]
2026-06-11 8:49 ` Matthieu Baerts
2026-06-11 12:38 ` Eric Dumazet
2026-06-05 9:21 ` [PATCH net-next v2 06/15] mptcp: pm: drop TCP TS with ADD_ADDRv6 + port Matthieu Baerts (NGI0)
2026-06-10 15:13 ` Matthieu Baerts
2026-06-10 15:22 ` Jakub Kicinski
2026-06-05 9:21 ` [PATCH net-next v2 07/15] selftests: mptcp: validate ADD_ADDRv6 + TS " Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 08/15] selftests: mptcp: always check sent/dropped ADD_ADDRs Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 09/15] mptcp: pm: use for_each_subflow helper Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 10/15] mptcp: pm: rename add_entry structure to add_addr Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 11/15] mptcp: pm: uniform announced addresses helpers Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 12/15] mptcp: pm: remove add_ prefix from timer Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 13/15] mptcp: pm: make mptcp_pm_add_addr_send_ack static Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 14/15] mptcp: pm: avoid using del_timer directly Matthieu Baerts (NGI0)
2026-06-05 9:21 ` [PATCH net-next v2 15/15] mptcp: options: rst: drop unused skb parameter Matthieu Baerts (NGI0)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=fb96d949-dcf5-494d-8076-eb4a972fa5e4@redhat.com \
--to=pabeni@redhat.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=geliang@kernel.org \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=kuniyu@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=martineau@kernel.org \
--cc=matttbe@kernel.org \
--cc=mptcp@lists.linux.dev \
--cc=ncardwell@google.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox