From: Matthieu Baerts <matttbe@kernel.org>
To: Paolo Abeni <pabeni@redhat.com>, Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org, mptcp@lists.linux.dev,
linux-kernel@vger.kernel.org,
Neal Cardwell <ncardwell@google.com>,
Kuniyuki Iwashima <kuniyu@google.com>,
Mat Martineau <martineau@kernel.org>,
Geliang Tang <geliang@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Simon Horman <horms@kernel.org>
Subject: Re: [PATCH net-next v2 05/15] tcp: allow mptcp to drop TS for some packets
Date: Thu, 11 Jun 2026 10:49:39 +0200
Message-ID: <1014329c-d60f-432a-b165-62866ca6a4b4@kernel.org>
In-Reply-To: <fb96d949-dcf5-494d-8076-eb4a972fa5e4@redhat.com>

Hi Paolo, Eric,

On 11/06/2026 09:39, Paolo Abeni wrote:
> Hi Eric,
>
> On 6/5/26 11:21 AM, Matthieu Baerts (NGI0) wrote:
>> With TCP-timestamps (padded) taking 12 bytes and ADD_ADDR IPv6 + port
>> taking 30 bytes, the 40-byte limit for the TCP options is reached. In
>> this case, it is then not possible to send the address signal.
>>
>> The idea is to let MPTCP drop the TCP-timestamps option for some
>> specific packets, to be able to send specific pure ACKs carrying >28
>> bytes of MPTCP options, as with this ADD_ADDR. A new
>> parameter is passed from tcp_established_options to the MPTCP side to
>> indicate if the TCP TS option is used, and if it should be dropped. The
>> next commit implements the MPTCP side; the work is split into two
>> patches to help TCP maintainers identify the modifications on the TCP
>> side. This feature will be controlled by a new add_addr_v6_port_drop_ts
>> MPTCP sysctl knob.
>>
>> It is important to keep in mind that dropping the TCP timestamps option
>> for one packet of the connection could potentially disrupt some
>> middleboxes: although unlikely, they could drop the packet
>> or even block the connection. That's why this new feature will be
>> controlled by a sysctl knob.
>>
>> Note that it would be technically possible to squeeze both options into
>> the header if the ADD_ADDR is written first, and then the TCP timestamps
>> without the NOPs preceding it. But this means more modifications on the
>> TCP side, plus some middleboxes could still be disrupted by that.
>>
>> In this implementation, an unused bit in the mptcp_out_options
>> structure is used to avoid passing the address of a local variable.
>> Reading and setting it needs CONFIG_MPTCP, so the whole block now has
>> this #if condition: mptcp_established_options() is then no longer used
>> without CONFIG_MPTCP.
>>
>> About alternatives, instead of passing a new boolean (has_ts), another
>> option would be to pass the whole option structure (opts), but
>> 'struct tcp_out_options' is currently defined in tcp_output.c, and it
>> would need to be exported. Plus that means the removal of the TCP TS
>> option would be done on the MPTCP side, and not here on the TCP side.
>> It feels clearer to remove other TCP options on the TCP side than
>> to hide that on the MPTCP side.
>>
>> Yet another alternative would be to pass the size already taken by the
>> other TCP options, and have a way to drop them all when needed. But it
>> feels better to target only the timestamps option, where dropping it
>> should be safe, even if it is currently the only option that would be
>> set before MPTCP, when MPTCP is used.
>>
>> Reviewed-by: Mat Martineau <martineau@kernel.org>
>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>> ---
>> - v2: Avoid passing local variables' addresses to
>> mptcp_established_options not to force the compiler to use a stack
>> canary in this hot function, even for non-MPTCP flows. (Eric Dumazet)
>> To: Neal Cardwell <ncardwell@google.com>
>> To: Kuniyuki Iwashima <kuniyu@google.com>
>> ---
>> include/net/mptcp.h | 13 +++----------
>> net/ipv4/tcp_output.c | 10 +++++++++-
>> net/mptcp/options.c | 2 +-
>> 3 files changed, 13 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/net/mptcp.h b/include/net/mptcp.h
>> index 24d1016a4664..71b9fc5a5796 100644
>> --- a/include/net/mptcp.h
>> +++ b/include/net/mptcp.h
>> @@ -72,7 +72,8 @@ struct mptcp_out_options {
>>  	u8 reset_reason:4,
>>  	   reset_transient:1,
>>  	   csum_reqd:1,
>> -	   allow_join_id0:1;
>> +	   allow_join_id0:1,
>> +	   drop_ts:1;
>>  	union {
>>  		struct {
>>  			u64 sndr_key;
>> @@ -153,7 +154,7 @@ bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
>>  bool mptcp_synack_options(const struct request_sock *req, unsigned int *size,
>>  			  struct mptcp_out_options *opts);
>>  int mptcp_established_options(struct sock *sk, struct sk_buff *skb,
>> -			      unsigned int remaining,
>> +			      unsigned int remaining, bool has_ts,
>>  			      struct mptcp_out_options *opts);
>>  bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb);
>>
>> @@ -269,14 +270,6 @@ static inline bool mptcp_synack_options(const struct request_sock *req,
>>  	return false;
>>  }
>>
>> -static inline int mptcp_established_options(struct sock *sk,
>> -					    struct sk_buff *skb,
>> -					    unsigned int remaining,
>> -					    struct mptcp_out_options *opts)
>> -{
>> -	return -1;
>> -}
>> -
>>  static inline bool mptcp_incoming_options(struct sock *sk,
>>  					   struct sk_buff *skb)
>>  {
>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>> index d3b8e61d3c5e..26dd751ec72a 100644
>> --- a/net/ipv4/tcp_output.c
>> +++ b/net/ipv4/tcp_output.c
>> @@ -1175,6 +1175,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
>>  		size += TCPOLEN_TSTAMP_ALIGNED;
>>  	}
>>
>> +#if IS_ENABLED(CONFIG_MPTCP)
>>  	/* MPTCP options have precedence over SACK for the limited TCP
>>  	 * option space because a MPTCP connection would be forced to
>>  	 * fall back to regular TCP if a required multipath option is
>> @@ -1183,15 +1184,22 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
>>  	 */
>>  	if (sk_is_mptcp(sk)) {
>>  		unsigned int remaining = MAX_TCP_OPTION_SPACE - size;
>> +		bool has_ts = opts->options & OPTION_TS;
>>  		int opt_size;
>>
>> -		opt_size = mptcp_established_options(sk, skb, remaining,
>> +		opts->mptcp.drop_ts = 0;
>> +
>> +		opt_size = mptcp_established_options(sk, skb, remaining, has_ts,
>>  						     &opts->mptcp);
>>  		if (opt_size >= 0) {
>>  			opts->options |= OPTION_MPTCP;
>>  			size += opt_size;
>> +
>> +			if (opts->mptcp.drop_ts)
>> +				opts->options &= ~OPTION_TS;
>
> I'm wondering if you are ok with this patch in the current form?
>
> One thing that was discussed on the mptcp ML was exposing the
> tcp_out_options layout so that mptcp_established_options() could receive
> it as an argument, which would likely clean up this code a bit.
>
> Not done here because placing tcp_out_options under `include` felt a bit
> "too much". Perhaps adding a `net/ipv4/tcp_option.h` header (and
> including it from the mptcp code) would be more palatable?

Note that I also didn't do this because it would mean the removal of
the TCP TS option would be done on the MPTCP side, and not here on
the TCP side. It didn't feel right to hide this on the MPTCP side.
But I'm fine to change it if preferred.
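
To make the current split concrete, here is a rough userspace sketch of
the flow described in the commit message. It is not kernel code: every
sketch_* name, the knob_enabled parameter and the way the size is
adjusted after the drop are illustrative assumptions. Only the 40, 12
and 30 byte figures come from the commit message.

```c
#include <stdbool.h>

/* Sizes quoted in the commit message; the macro names are illustrative. */
#define MAX_OPT_SPACE      40  /* TCP option space limit */
#define TS_ALIGNED_LEN     12  /* TCP timestamps, padded */
#define ADD_ADDR6_PORT_LEN 30  /* ADD_ADDR IPv6 + port */

/* Hypothetical stand-in for the bit added to struct mptcp_out_options. */
struct sketch_mptcp_opts {
	unsigned int drop_ts:1;
};

/*
 * Hypothetical model of the MPTCP side: return the option size to add,
 * or -1 if nothing can be sent. When the option only fits without the
 * timestamps and the knob allows it, the drop is requested through the
 * bit in the options structure, not through a pointer to a local
 * variable of the caller.
 */
static int sketch_mptcp_options(unsigned int remaining, bool has_ts,
				bool knob_enabled, unsigned int needed,
				struct sketch_mptcp_opts *opts)
{
	if (needed <= remaining)
		return (int)needed;

	if (has_ts && knob_enabled && needed <= remaining + TS_ALIGNED_LEN) {
		opts->drop_ts = 1;
		return (int)needed;
	}

	return -1;
}

/* Hypothetical model of the TCP side: returns the total option size. */
static unsigned int sketch_tcp_options(bool has_ts, bool knob_enabled,
				       unsigned int needed,
				       struct sketch_mptcp_opts *opts)
{
	unsigned int size = has_ts ? TS_ALIGNED_LEN : 0;
	int opt_size;

	opts->drop_ts = 0;

	opt_size = sketch_mptcp_options(MAX_OPT_SPACE - size, has_ts,
					knob_enabled, needed, opts);
	if (opt_size >= 0) {
		size += (unsigned int)opt_size;
		/* The TCP side removes its own option. The size fix-up
		 * below is an assumption of this sketch only.
		 */
		if (opts->drop_ts)
			size -= TS_ALIGNED_LEN;
	}

	return size;
}
```

With timestamps in use, only 28 bytes remain, so the 30-byte ADD_ADDR
fits only when the knob allows the timestamps to be dropped. The point
of keeping the drop on the TCP side is visible in sketch_tcp_options():
the MPTCP side only raises a flag.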
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.