* [PATCH mptcp-net 0/2] mptcp: restore zero window probe
@ 2025-10-20 16:10 Paolo Abeni
2025-10-20 16:10 ` [PATCH mptcp-net 1/2] mptcp: restore " Paolo Abeni
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Paolo Abeni @ 2025-10-20 16:10 UTC (permalink / raw)
To: mptcp; +Cc: geliang
It turns out that the issue reported by Geliang on the pending backlog
patches is actually on older one that the new code made more easy
reproducible.
The first patch in the series addresses it[1], the 2nd one introduce new
mibs to hopefully catch this problem sooned.
I have a pktdrill to validate this scenario. I'll share that soon.
@Geliang could you please validate the above in your testbed, on top of
the backlog and splice patches?
[1] at least here, at least with the packet drill reproducer ;)
Paolo Abeni (2):
mptcp: restore window probe
mptcp: zero window probe mib
net/mptcp/mib.c | 1 +
net/mptcp/mib.h | 1 +
net/mptcp/protocol.c | 15 +++++++++++----
3 files changed, 13 insertions(+), 4 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH mptcp-net 1/2] mptcp: restore window probe
2025-10-20 16:10 [PATCH mptcp-net 0/2] mptcp: restore zero window probe Paolo Abeni
@ 2025-10-20 16:10 ` Paolo Abeni
2025-10-21 0:12 ` Mat Martineau
2025-10-20 16:10 ` [PATCH mptcp-net 2/2] mptcp: zero window probe mib Paolo Abeni
2025-10-20 17:38 ` [PATCH mptcp-net 0/2] mptcp: restore zero window probe MPTCP CI
2 siblings, 1 reply; 7+ messages in thread
From: Paolo Abeni @ 2025-10-20 16:10 UTC (permalink / raw)
To: mptcp; +Cc: geliang
Since commit 72377ab2d671 ("mptcp: more conservative check for zero
probes") the MPTCP-level zero window probe check is always disabled, as
the TCP-level write queue always contains at least the newly allocated
skb.
Refine the relevant check tacking in account that the above condition
and that such skb can have zero length.
Fixes: 72377ab2d671 ("mptcp: more conservative check for zero probes")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
net/mptcp/protocol.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index bde76b7311f986..fe029359b7d7a2 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1276,6 +1276,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
u64 data_seq = dfrag->data_seq + info->sent;
int offset = dfrag->offset + info->sent;
struct mptcp_sock *msk = mptcp_sk(sk);
+ struct tcp_sock *tp = tcp_sk(ssk);
bool zero_window_probe = false;
struct mptcp_ext *mpext = NULL;
bool can_coalesce = false;
@@ -1311,14 +1312,14 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
mpext = mptcp_get_ext(skb);
if (!mptcp_skb_can_collapse_to(data_seq, skb, mpext)) {
TCP_SKB_CB(skb)->eor = 1;
- tcp_mark_push(tcp_sk(ssk), skb);
+ tcp_mark_push(tp, skb);
goto alloc_skb;
}
i = skb_shinfo(skb)->nr_frags;
can_coalesce = skb_can_coalesce(skb, i, dfrag->page, offset);
if (!can_coalesce && i >= READ_ONCE(net_hotdata.sysctl_max_skb_frags)) {
- tcp_mark_push(tcp_sk(ssk), skb);
+ tcp_mark_push(tp, skb);
goto alloc_skb;
}
@@ -1339,7 +1340,12 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
if (copy == 0) {
u64 snd_una = READ_ONCE(msk->snd_una);
- if (snd_una != msk->snd_nxt || tcp_write_queue_tail(ssk)) {
+ /* No need for zero probe if there are any data pending
+ * either at the msk or ssk level; skb is the current write
+ * queue tail and can be empty at this point.
+ */
+ if (snd_una != msk->snd_nxt || skb->len ||
+ skb != tcp_send_head(ssk)) {
tcp_remove_empty_skb(ssk);
return 0;
}
@@ -1367,7 +1373,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
skb->truesize += copy;
sk_wmem_queued_add(ssk, copy);
sk_mem_charge(ssk, copy);
- WRITE_ONCE(tcp_sk(ssk)->write_seq, tcp_sk(ssk)->write_seq + copy);
+ WRITE_ONCE(tp->write_seq, tp->write_seq + copy);
TCP_SKB_CB(skb)->end_seq += copy;
tcp_skb_pcount_set(skb, 0);
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH mptcp-net 2/2] mptcp: zero window probe mib
2025-10-20 16:10 [PATCH mptcp-net 0/2] mptcp: restore zero window probe Paolo Abeni
2025-10-20 16:10 ` [PATCH mptcp-net 1/2] mptcp: restore " Paolo Abeni
@ 2025-10-20 16:10 ` Paolo Abeni
2025-10-21 0:14 ` Mat Martineau
2025-10-20 17:38 ` [PATCH mptcp-net 0/2] mptcp: restore zero window probe MPTCP CI
2 siblings, 1 reply; 7+ messages in thread
From: Paolo Abeni @ 2025-10-20 16:10 UTC (permalink / raw)
To: mptcp; +Cc: geliang
Explicitly account for MPTCP-level zero windows probe, to catch
hopefully earlier issues alike the one addressed by the previous
patch.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
net/mptcp/mib.c | 1 +
net/mptcp/mib.h | 1 +
net/mptcp/protocol.c | 1 +
3 files changed, 3 insertions(+)
diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
index 6003e47c770a7c..eb4645a9c5ac07 100644
--- a/net/mptcp/mib.c
+++ b/net/mptcp/mib.c
@@ -85,6 +85,7 @@ static const struct snmp_mib mptcp_snmp_list[] = {
SNMP_MIB_ITEM("DssFallback", MPTCP_MIB_DSSFALLBACK),
SNMP_MIB_ITEM("SimultConnectFallback", MPTCP_MIB_SIMULTCONNFALLBACK),
SNMP_MIB_ITEM("FallbackFailed", MPTCP_MIB_FALLBACKFAILED),
+ SNMP_MIB_ITEM("MPTcpWinProbe", MPTCP_MIB_MPTCPWINPROBE),
};
/* mptcp_mib_alloc - allocate percpu mib counters
diff --git a/net/mptcp/mib.h b/net/mptcp/mib.h
index 309bac6fea3252..f83a113700522e 100644
--- a/net/mptcp/mib.h
+++ b/net/mptcp/mib.h
@@ -88,6 +88,7 @@ enum linux_mptcp_mib_field {
MPTCP_MIB_DSSFALLBACK, /* Bad or missing DSS */
MPTCP_MIB_SIMULTCONNFALLBACK, /* Simultaneous connect */
MPTCP_MIB_FALLBACKFAILED, /* Can't fallback due to msk status */
+ MPTCP_MIB_MPTCPWINPROBE, /* MPTCP-level zero window probe */
__MPTCP_MIB_MAX
};
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index fe029359b7d7a2..67919ae774cc40 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1396,6 +1396,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
mpext->dsn64);
if (zero_window_probe) {
+ MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_MPTCPWINPROBE);
mptcp_subflow_ctx(ssk)->rel_write_seq += copy;
mpext->frozen = 1;
if (READ_ONCE(msk->csum_enabled))
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH mptcp-net 0/2] mptcp: restore zero window probe
2025-10-20 16:10 [PATCH mptcp-net 0/2] mptcp: restore zero window probe Paolo Abeni
2025-10-20 16:10 ` [PATCH mptcp-net 1/2] mptcp: restore " Paolo Abeni
2025-10-20 16:10 ` [PATCH mptcp-net 2/2] mptcp: zero window probe mib Paolo Abeni
@ 2025-10-20 17:38 ` MPTCP CI
2 siblings, 0 replies; 7+ messages in thread
From: MPTCP CI @ 2025-10-20 17:38 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp
Hi Paolo,
Thank you for your modifications, that's great!
Our CI did some validations and here is its report:
- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Unstable: 1 failed test(s): selftest_mptcp_join 🔴
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/18658193252
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/2f955eee50f0
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1013663
If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:
$ cd [kernel source code]
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-normal
For more details:
https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)
Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH mptcp-net 1/2] mptcp: restore window probe
2025-10-20 16:10 ` [PATCH mptcp-net 1/2] mptcp: restore " Paolo Abeni
@ 2025-10-21 0:12 ` Mat Martineau
2025-10-21 17:27 ` Paolo Abeni
0 siblings, 1 reply; 7+ messages in thread
From: Mat Martineau @ 2025-10-21 0:12 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp, geliang
On Mon, 20 Oct 2025, Paolo Abeni wrote:
> Since commit 72377ab2d671 ("mptcp: more conservative check for zero
> probes") the MPTCP-level zero window probe check is always disabled, as
> the TCP-level write queue always contains at least the newly allocated
> skb.
>
> Refine the relevant check tacking in account that the above condition
> and that such skb can have zero length.
>
> Fixes: 72377ab2d671 ("mptcp: more conservative check for zero probes")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> net/mptcp/protocol.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index bde76b7311f986..fe029359b7d7a2 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1276,6 +1276,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> u64 data_seq = dfrag->data_seq + info->sent;
> int offset = dfrag->offset + info->sent;
> struct mptcp_sock *msk = mptcp_sk(sk);
> + struct tcp_sock *tp = tcp_sk(ssk);
Hi Paolo -
I appreciate the urge to tidy up the tcp_sk(ssk)s in this function, but it
does distract from the meaningful code change - and the updated zero probe
logic doesn't use the new tp variable...
Then again, if a netdev maintainer wants this in a -net patch, that
opinion does carry some weight :)
> bool zero_window_probe = false;
> struct mptcp_ext *mpext = NULL;
> bool can_coalesce = false;
> @@ -1311,14 +1312,14 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> mpext = mptcp_get_ext(skb);
> if (!mptcp_skb_can_collapse_to(data_seq, skb, mpext)) {
> TCP_SKB_CB(skb)->eor = 1;
> - tcp_mark_push(tcp_sk(ssk), skb);
> + tcp_mark_push(tp, skb);
> goto alloc_skb;
> }
>
> i = skb_shinfo(skb)->nr_frags;
> can_coalesce = skb_can_coalesce(skb, i, dfrag->page, offset);
> if (!can_coalesce && i >= READ_ONCE(net_hotdata.sysctl_max_skb_frags)) {
> - tcp_mark_push(tcp_sk(ssk), skb);
> + tcp_mark_push(tp, skb);
> goto alloc_skb;
> }
>
> @@ -1339,7 +1340,12 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> if (copy == 0) {
> u64 snd_una = READ_ONCE(msk->snd_una);
>
> - if (snd_una != msk->snd_nxt || tcp_write_queue_tail(ssk)) {
> + /* No need for zero probe if there are any data pending
> + * either at the msk or ssk level; skb is the current write
> + * queue tail and can be empty at this point.
> + */
> + if (snd_una != msk->snd_nxt || skb->len ||
> + skb != tcp_send_head(ssk)) {
LGTM
Reviewed-by: Mat Martineau <martineau@kernel.org>
> tcp_remove_empty_skb(ssk);
> return 0;
> }
> @@ -1367,7 +1373,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> skb->truesize += copy;
> sk_wmem_queued_add(ssk, copy);
> sk_mem_charge(ssk, copy);
> - WRITE_ONCE(tcp_sk(ssk)->write_seq, tcp_sk(ssk)->write_seq + copy);
> + WRITE_ONCE(tp->write_seq, tp->write_seq + copy);
> TCP_SKB_CB(skb)->end_seq += copy;
> tcp_skb_pcount_set(skb, 0);
>
> --
> 2.51.0
>
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH mptcp-net 2/2] mptcp: zero window probe mib
2025-10-20 16:10 ` [PATCH mptcp-net 2/2] mptcp: zero window probe mib Paolo Abeni
@ 2025-10-21 0:14 ` Mat Martineau
0 siblings, 0 replies; 7+ messages in thread
From: Mat Martineau @ 2025-10-21 0:14 UTC (permalink / raw)
To: Paolo Abeni; +Cc: mptcp, geliang
On Mon, 20 Oct 2025, Paolo Abeni wrote:
> Explicitly account for MPTCP-level zero windows probe, to catch
> hopefully earlier issues alike the one addressed by the previous
> patch.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Helpful metric to track, looks good.
Reviewed-by: Mat Martineau <martineau@kernel.org>
> ---
> net/mptcp/mib.c | 1 +
> net/mptcp/mib.h | 1 +
> net/mptcp/protocol.c | 1 +
> 3 files changed, 3 insertions(+)
>
> diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
> index 6003e47c770a7c..eb4645a9c5ac07 100644
> --- a/net/mptcp/mib.c
> +++ b/net/mptcp/mib.c
> @@ -85,6 +85,7 @@ static const struct snmp_mib mptcp_snmp_list[] = {
> SNMP_MIB_ITEM("DssFallback", MPTCP_MIB_DSSFALLBACK),
> SNMP_MIB_ITEM("SimultConnectFallback", MPTCP_MIB_SIMULTCONNFALLBACK),
> SNMP_MIB_ITEM("FallbackFailed", MPTCP_MIB_FALLBACKFAILED),
> + SNMP_MIB_ITEM("MPTcpWinProbe", MPTCP_MIB_MPTCPWINPROBE),
> };
>
> /* mptcp_mib_alloc - allocate percpu mib counters
> diff --git a/net/mptcp/mib.h b/net/mptcp/mib.h
> index 309bac6fea3252..f83a113700522e 100644
> --- a/net/mptcp/mib.h
> +++ b/net/mptcp/mib.h
> @@ -88,6 +88,7 @@ enum linux_mptcp_mib_field {
> MPTCP_MIB_DSSFALLBACK, /* Bad or missing DSS */
> MPTCP_MIB_SIMULTCONNFALLBACK, /* Simultaneous connect */
> MPTCP_MIB_FALLBACKFAILED, /* Can't fallback due to msk status */
> + MPTCP_MIB_MPTCPWINPROBE, /* MPTCP-level zero window probe */
> __MPTCP_MIB_MAX
> };
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index fe029359b7d7a2..67919ae774cc40 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1396,6 +1396,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> mpext->dsn64);
>
> if (zero_window_probe) {
> + MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_MPTCPWINPROBE);
> mptcp_subflow_ctx(ssk)->rel_write_seq += copy;
> mpext->frozen = 1;
> if (READ_ONCE(msk->csum_enabled))
> --
> 2.51.0
>
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH mptcp-net 1/2] mptcp: restore window probe
2025-10-21 0:12 ` Mat Martineau
@ 2025-10-21 17:27 ` Paolo Abeni
0 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2025-10-21 17:27 UTC (permalink / raw)
To: Mat Martineau; +Cc: mptcp, geliang
On 10/21/25 2:12 AM, Mat Martineau wrote:
> On Mon, 20 Oct 2025, Paolo Abeni wrote:
>> Since commit 72377ab2d671 ("mptcp: more conservative check for zero
>> probes") the MPTCP-level zero window probe check is always disabled, as
>> the TCP-level write queue always contains at least the newly allocated
>> skb.
>>
>> Refine the relevant check tacking in account that the above condition
>> and that such skb can have zero length.
>>
>> Fixes: 72377ab2d671 ("mptcp: more conservative check for zero probes")
>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>> ---
>> net/mptcp/protocol.c | 14 ++++++++++----
>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>> index bde76b7311f986..fe029359b7d7a2 100644
>> --- a/net/mptcp/protocol.c
>> +++ b/net/mptcp/protocol.c
>> @@ -1276,6 +1276,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
>> u64 data_seq = dfrag->data_seq + info->sent;
>> int offset = dfrag->offset + info->sent;
>> struct mptcp_sock *msk = mptcp_sk(sk);
>> + struct tcp_sock *tp = tcp_sk(ssk);
>
> Hi Paolo -
>
> I appreciate the urge to tidy up the tcp_sk(ssk)s in this function, but it
> does distract from the meaningful code change - and the updated zero probe
> logic doesn't use the new tp variable...
Whoops, due to a bit of rush on my side after spending ~1w looping on
the splice selftests ;) A previous iteration of this patch introduced a
new tcp_sock() usage hence the larger diffstat.
I'll send a v2 with the minimal fix.
/P
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-10-21 17:27 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-20 16:10 [PATCH mptcp-net 0/2] mptcp: restore zero window probe Paolo Abeni
2025-10-20 16:10 ` [PATCH mptcp-net 1/2] mptcp: restore " Paolo Abeni
2025-10-21 0:12 ` Mat Martineau
2025-10-21 17:27 ` Paolo Abeni
2025-10-20 16:10 ` [PATCH mptcp-net 2/2] mptcp: zero window probe mib Paolo Abeni
2025-10-21 0:14 ` Mat Martineau
2025-10-20 17:38 ` [PATCH mptcp-net 0/2] mptcp: restore zero window probe MPTCP CI
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.