All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH mptcp-net 0/3] mptcp: a few random fixes
@ 2025-11-13 23:05 Paolo Abeni
  2025-11-13 23:05 ` [PATCH mptcp-net 1/3] mptcp: fix premature close in case of fallback Paolo Abeni
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Paolo Abeni @ 2025-11-13 23:05 UTC (permalink / raw)
  To: mptcp

The first patch fixes selftest failures in CONFIG_PREEMPT build.
The second patch addresses a recently filed issue, 2nd one is due to
code inspection while investigating the mentioned issue.

v2 -> v3:
  - new patch 1/3
  - check OoO queue in mptcp_try_fallback

v1 -> v2:
  - fix pktdrill failures due to missing rcv_wnd_sent initialization

Paolo Abeni (3):
  mptcp: fix premature close in case of fallback
  mptcp: do not fallback when OoO is present
  mptcp: do not drop partial packets.

 net/mptcp/protocol.c | 37 +++++++++++++++++++++++++++++--------
 net/mptcp/subflow.c  |  3 ++-
 2 files changed, 31 insertions(+), 9 deletions(-)

-- 
2.51.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH mptcp-net 1/3] mptcp: fix premature close in case of fallback
  2025-11-13 23:05 [PATCH mptcp-net 0/3] mptcp: a few random fixes Paolo Abeni
@ 2025-11-13 23:05 ` Paolo Abeni
  2025-11-13 23:05 ` [PATCH mptcp-net 2/3] mptcp: do not fallback when OoO is present Paolo Abeni
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2025-11-13 23:05 UTC (permalink / raw)
  To: mptcp

I'm observing very frequent self-tests failures in case of fallback when
running on a CONFIG_PREEMPT kernel.

The root cause is that subflow_sched_work_if_closed() closes any subflow
as soon as it is half-closed and has no incoming data pending.

That works well for regular subflows - MPTCP needs be-directional
connectivity to operate on a given suflow - but for fallback socket is
race prone.

When TCP peer closes the connection before the MPTCP one,
subflow_sched_work_if_closed() will schedule the MPTCP worker to gracefully
close the subflow, and shortly after will do another schedule to inject and
process a dummy incoming DATA_FIN.

On CONFIG_PREEMPT kernel, the MPTCP worker can kick-in and close the
fallback subflow before subflow_sched_work_if_closed() is able to create
the dummy DATA_FIN, unexpectedly interrupting the transfer.

Address the issue explicitly avoiding closing fallback subflows on when the
peer is only half-closed.

Note that, when the subflow is able to create the DATA_FIN before the
worker invocation, the worker will change the msk state before trying to
close the subflow and will skip the latter operation as the msk will not
match anymore the precondition in __mptcp_close_subflow().

Fixes: f09b0ad55a11 ("mptcp: close subflow when receiving TCP+FIN")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 935a32217741..1b222cd3096a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2630,7 +2630,8 @@ static void __mptcp_close_subflow(struct sock *sk)
 
 		if (ssk_state != TCP_CLOSE &&
 		    (ssk_state != TCP_CLOSE_WAIT ||
-		     inet_sk_state_load(sk) != TCP_ESTABLISHED))
+		     inet_sk_state_load(sk) != TCP_ESTABLISHED ||
+		     __mptcp_check_fallback(msk)))
 			continue;
 
 		/* 'subflow_data_ready' will re-sched once rx queue is empty */
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH mptcp-net 2/3] mptcp: do not fallback when OoO is present
  2025-11-13 23:05 [PATCH mptcp-net 0/3] mptcp: a few random fixes Paolo Abeni
  2025-11-13 23:05 ` [PATCH mptcp-net 1/3] mptcp: fix premature close in case of fallback Paolo Abeni
@ 2025-11-13 23:05 ` Paolo Abeni
  2025-11-13 23:05 ` [PATCH mptcp-net 3/3] mptcp: do not drop partial packets Paolo Abeni
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2025-11-13 23:05 UTC (permalink / raw)
  To: mptcp

In case of DSS corruption, the MPTCP protocol tries to avoid the
subflow reset if fallback is possible. Such corruptions happen in
the receive path; to ensure fallback is possible the stack additionally
need to check for OoO data, otherwise the fallback will break the data
stream.

Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
Note: this does not avoid the WARN(), but fixes the inconsistend
read() behavior; the ingress data is OoO, we should not ack it
---
 net/mptcp/protocol.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 1b222cd3096a..f5761bff288c 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -78,6 +78,13 @@ bool __mptcp_try_fallback(struct mptcp_sock *msk, int fb_mib)
 	if (__mptcp_check_fallback(msk))
 		return true;
 
+	/* The caller possibly is not holding the msk socket lock, but
+	 * in the fallback case only the current subflow is touching
+	 * the OoO queue.
+	 */
+	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue))
+		return false;
+
 	spin_lock_bh(&msk->fallback_lock);
 	if (!msk->allow_infinite_fallback) {
 		spin_unlock_bh(&msk->fallback_lock);
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH mptcp-net 3/3] mptcp: do not drop partial packets.
  2025-11-13 23:05 [PATCH mptcp-net 0/3] mptcp: a few random fixes Paolo Abeni
  2025-11-13 23:05 ` [PATCH mptcp-net 1/3] mptcp: fix premature close in case of fallback Paolo Abeni
  2025-11-13 23:05 ` [PATCH mptcp-net 2/3] mptcp: do not fallback when OoO is present Paolo Abeni
@ 2025-11-13 23:05 ` Paolo Abeni
  2025-11-14  1:43 ` [PATCH mptcp-net 0/3] mptcp: a few random fixes MPTCP CI
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2025-11-13 23:05 UTC (permalink / raw)
  To: mptcp

Currently MPTCP drops partial packets for no good reason at all.
Instead we should just skip the already acked bytes.

Also add a missing check for zero window, which in turn requires
properly initializing the rcv_wnd_sent at connection creation time.

Fixes: ab174ad8ef76 ("mptcp: move ooo skbs into msk out of order queue.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
Notes:
- We should also add a MIB for zerowin drop, but that should be a follow-up
patch, I think.
- based on top of 'mptcp: autotune related improvement', but targeting net,
  will have a conflict in __mptcp_move_skb().
---
 net/mptcp/protocol.c | 27 ++++++++++++++++++++-------
 net/mptcp/subflow.c  |  3 ++-
 2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index f5761bff288c..78ac8ba80e59 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -385,6 +385,10 @@ static bool __mptcp_move_skb(struct sock *sk, struct sk_buff *skb)
 
 	mptcp_borrow_fwdmem(sk, skb);
 
+	/* Check for zero window.*/
+	if (atomic64_read(&msk->rcv_wnd_sent) == msk->ack_seq)
+		goto drop;
+
 	if (MPTCP_SKB_CB(skb)->map_seq == msk->ack_seq) {
 		/* in sequence */
 		msk->bytes_received += copy_len;
@@ -393,18 +397,27 @@ static bool __mptcp_move_skb(struct sock *sk, struct sk_buff *skb)
 		if (tail && mptcp_try_coalesce(sk, tail, skb))
 			return true;
 
-		skb_set_owner_r(skb, sk);
-		__skb_queue_tail(&sk->sk_receive_queue, skb);
-		return true;
+		goto enqueue;
 	} else if (after64(MPTCP_SKB_CB(skb)->map_seq, msk->ack_seq)) {
 		mptcp_data_queue_ofo(msk, skb);
 		return false;
 	}
 
-	/* old data, keep it simple and drop the whole pkt, sender
-	 * will retransmit as needed, if needed.
-	 */
-	MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_DUPDATA);
+	/* Check for old data. */
+	if (!after64(MPTCP_SKB_CB(skb)->end_seq, msk->ack_seq)) {
+		MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_DUPDATA);
+		goto drop;
+	}
+
+	/* Partial packet, seq < rcv_next < end_seq. */
+	MPTCP_SKB_CB(skb)->offset += msk->ack_seq - MPTCP_SKB_CB(skb)->map_seq;
+
+enqueue:
+	skb_set_owner_r(skb, sk);
+	__skb_queue_tail(&sk->sk_receive_queue, skb);
+	return true;
+
+drop:
 	mptcp_drop(sk, skb);
 	return false;
 }
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 86ce58ae533d..43a2ff058ba5 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -497,7 +497,8 @@ static void subflow_set_remote_key(struct mptcp_sock *msk,
 	WRITE_ONCE(msk->remote_key, subflow->remote_key);
 	WRITE_ONCE(msk->ack_seq, subflow->iasn);
 	WRITE_ONCE(msk->can_ack, true);
-	atomic64_set(&msk->rcv_wnd_sent, subflow->iasn);
+	atomic64_set(&msk->rcv_wnd_sent, subflow->iasn +
+		     tcp_receive_window(tcp_sk(subflow->tcp_sock)));
 }
 
 static void mptcp_propagate_state(struct sock *sk, struct sock *ssk,
-- 
2.51.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH mptcp-net 0/3] mptcp: a few random fixes
  2025-11-13 23:05 [PATCH mptcp-net 0/3] mptcp: a few random fixes Paolo Abeni
                   ` (2 preceding siblings ...)
  2025-11-13 23:05 ` [PATCH mptcp-net 3/3] mptcp: do not drop partial packets Paolo Abeni
@ 2025-11-14  1:43 ` MPTCP CI
  2025-11-14 11:30 ` Matthieu Baerts
  2025-11-14 11:49 ` Matthieu Baerts
  5 siblings, 0 replies; 7+ messages in thread
From: MPTCP CI @ 2025-11-14  1:43 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: mptcp

Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Critical: 85 Call Trace(s) - Critical: Global Timeout ❌
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Unstable: 1 failed test(s): selftest_mptcp_join 🔴
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Unstable: 1 failed test(s): bpftest_test_progs-cpuv4_mptcp 🔴
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19349079984

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/88e88d30abed
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1023207


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH mptcp-net 0/3] mptcp: a few random fixes
  2025-11-13 23:05 [PATCH mptcp-net 0/3] mptcp: a few random fixes Paolo Abeni
                   ` (3 preceding siblings ...)
  2025-11-14  1:43 ` [PATCH mptcp-net 0/3] mptcp: a few random fixes MPTCP CI
@ 2025-11-14 11:30 ` Matthieu Baerts
  2025-11-14 11:49 ` Matthieu Baerts
  5 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2025-11-14 11:30 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

Hi Paolo,

On 14/11/2025 00:05, Paolo Abeni wrote:
> The first patch fixes selftest failures in CONFIG_PREEMPT build.
> The second patch addresses a recently filed issue, 2nd one is due to
> code inspection while investigating the mentioned issue.

Thank you for the new version!

It looks good to me:

Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH mptcp-net 0/3] mptcp: a few random fixes
  2025-11-13 23:05 [PATCH mptcp-net 0/3] mptcp: a few random fixes Paolo Abeni
                   ` (4 preceding siblings ...)
  2025-11-14 11:30 ` Matthieu Baerts
@ 2025-11-14 11:49 ` Matthieu Baerts
  5 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2025-11-14 11:49 UTC (permalink / raw)
  To: Paolo Abeni, mptcp

Hi Paolo,

On 14/11/2025 00:05, Paolo Abeni wrote:
> The first patch fixes selftest failures in CONFIG_PREEMPT build.
> The second patch addresses a recently filed issue, 2nd one is due to
> code inspection while investigating the mentioned issue.

Thank you again! Now in our tree (fixes for -net):

New patches for t/upstream-net and t/upstream:
- f00f2972e67b: mptcp: fix premature close in case of fallback
- 8e46f6ba1eea: mptcp: do not fallback when OoO is present
- 1d2ce718811a: mptcp: do not drop partial packets
- Results: 7b864be7a6e5..d3c1552df7fb (export-net)
- 42c9778f2eef: conflict in t/mptcp-borrow-forward-memory-from-subflow
- Results: 2636e284e552..13d7de45372c (export)

Tests are now in progress:

- export-net:
https://github.com/multipath-tcp/mptcp_net-next/commit/2e032d2fe4e08ca8caac051eb86c063a29cf6dca/checks
- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/a7420f3fafe4ac3877b5bafdea23b2c13d3e2d61/checks

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-11-14 11:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-13 23:05 [PATCH mptcp-net 0/3] mptcp: a few random fixes Paolo Abeni
2025-11-13 23:05 ` [PATCH mptcp-net 1/3] mptcp: fix premature close in case of fallback Paolo Abeni
2025-11-13 23:05 ` [PATCH mptcp-net 2/3] mptcp: do not fallback when OoO is present Paolo Abeni
2025-11-13 23:05 ` [PATCH mptcp-net 3/3] mptcp: do not drop partial packets Paolo Abeni
2025-11-14  1:43 ` [PATCH mptcp-net 0/3] mptcp: a few random fixes MPTCP CI
2025-11-14 11:30 ` Matthieu Baerts
2025-11-14 11:49 ` Matthieu Baerts

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.