MPTCP Linux Development
From: Paolo Abeni <pabeni@redhat.com>
To: Geliang Tang <geliang@kernel.org>,
	Matthieu Baerts <matttbe@kernel.org>,
	mptcp@lists.linux.dev
Subject: Re: [PATCH v5 mptcp-next 00/10] mptcp: introduce backlog processing
Date: Thu, 9 Oct 2025 09:52:19 +0200	[thread overview]
Message-ID: <8a8feb1d-ad10-4ba4-a448-db8a0e45c7c3@redhat.com> (raw)
In-Reply-To: <2389029f56a9fa496b59be7655987e6d9c6362f2.camel@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 4188 bytes --]

On 10/9/25 8:54 AM, Geliang Tang wrote:
> On Wed, 2025-10-08 at 09:30 +0200, Paolo Abeni wrote:
>> On 10/8/25 5:07 AM, Geliang Tang wrote:
>>> On Mon, 2025-10-06 at 19:07 +0200, Matthieu Baerts wrote:
>>>> Hi Paolo,
>>>>
>>>> On 06/10/2025 10:11, Paolo Abeni wrote:
>>>>> This series includes RX path improvements built around backlog
>>>>> processing
>>>>
>>>> Thank you for the new version! This is not a review, but just a note
>>>> to tell you that patchew didn't manage to apply the patches due to
>>>> the same conflict that was already there with the v4 (the
>>>> mptcp_init_skb() parameters have been moved to the previous line). I
>>>> just applied the patches manually. While at it, I also used this
>>>> test branch for syzkaller to validate them.
>>>>
>>>> (Also, on patch "mptcp: drop the __mptcp_data_ready() helper", git
>>>> complained that there is trailing whitespace.)
>>>
>>> Sorry, patches 9-10 break my "implement mptcp read_sock" v12 series.
>>> I rebased that series on top of patches 1-8 and it works well, but
>>> after applying patches 9-10, I changed mptcp_recv_skb() in [1] from
>>
>> Thanks for the feedback, the applied delta looks good to me.
>>
>>> # INFO: with MPTFO start
>>> # 57 ns2 MPTCP -> ns1 (10.0.1.1:10054      ) MPTCP     (duration 60989ms) [FAIL] client exit code 0, server 124
>>> #
>>> # netns ns1-RqXF2p (listener) socket stat for 10054:
>>> # Failed to find cgroup2 mount
>>> # Failed to find cgroup2 mount
>>> # Failed to find cgroup2 mount
>>> # Netid State    Recv-Q Send-Q Local Address:Port  Peer Address:Port
>>> # tcp   ESTAB    0      0           10.0.1.1:10054     10.0.1.2:55516 ino:2064372 sk:1 cgroup:unreachable:1 <->
>>> # 	 skmem:(r0,rb131072,t0,tb340992,f0,w0,o0,bl0,d0) sack cubic wscale:8,8 rto:206 rtt:5.026/10.034 ato:40 mss:1460 pmtu:1500 rcvmss:1436 advmss:1460 cwnd:10 bytes_sent:115312 bytes_retrans:1560 bytes_acked:113752 bytes_received:5136 segs_out:85 segs_in:16 data_segs_out:83 data_segs_in:4 send 23239156bps lastsnd:60939 lastrcv:61035 lastack:60912 pacing_rate 343879640bps delivery_rate 1994680bps delivered:84 busy:123ms sndbuf_limited:41ms(33.3%) retrans:0/2 dsack_dups:2 rcv_space:14600 rcv_ssthresh:75432 minrtt:0.003 rcv_wnd:75520 tcp-ulp-mptcp flags:Mec token:0000(id:0)/32ed0950(id:0) seq:2946228641406205031 sfseq:1 ssnoff:1349223625 maplen:5136
>>> # mptcp LAST-ACK 0      0           10.0.1.1:10054     10.0.1.2:55516 timer:(keepalive,59sec,0) ino:0 sk:2 cgroup:unreachable:1 ---
>>> # 	 skmem:(r0,rb131072,t0,tb345088,f4088,w352264,o0,bl0,d0) subflows_max:2 remote_key token:32ed0950 write_seq:6317574787800720824 snd_una:6317574787800376423 rcv_nxt:2946228641406210168 bytes_sent:113752 bytes_received:5136 bytes_acked:113752 subflows_total:1 last_data_sent:60954 last_data_recv:61036 last_ack_recv:60913
>>
>> bytes_sent == bytes_acked at the MPTCP level, so possibly we are
>> missing a window-open event, which in turn should be triggered by
>> mptcp_cleanup_rbuf(); AFAICS that is correctly invoked in the splice
>> code. TL;DR: I can't find anything obviously wrong :-P
>>
>> Also the default rx buf size is suspect.
>>
>> Can you reproduce the issue while capturing the traffic with tcpdump?
>> If so, could you please share the capture?
> 
> Thank you for your suggestion. I've attached several tcpdump logs from
> when the tests failed.

Oh wow! The receiver actually sends the window-open notification
(packets 527 and 528 in the trace), but the sender does not react at all.

I haven't dug into why the sender did not try a zero-window probe (it
should!), but it looks like we have some old bug in the sender wakeup
path since the MPTCP_DEQUEUE introduction (which is very surprising:
why did we not catch/observe this earlier?!). That could also explain
the sporadic mptcp_join failures.
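
To make the invariant I mean concrete, here is a minimal userspace model
(all names here are mine, this is not the kernel code): with data still
queued and a zero peer window the sender must arm a probe, and on any
window-opening ack it must wake the transmit path. The stall in the ss
output above matches the case where neither happens.

#include <stdbool.h>
#include <stdio.h>

struct sender {
	unsigned long snd_una;     /* oldest unacked byte */
	unsigned long write_seq;   /* next byte to queue */
	unsigned long peer_window; /* last advertised receive window */
	bool probe_armed;
};

/* Invoked for every incoming ack: the step that seems to be skipped. */
static void on_ack(struct sender *s, unsigned long una, unsigned long wnd)
{
	s->snd_una = una;
	s->peer_window = wnd;
	if (s->write_seq == s->snd_una)
		return; /* nothing pending, nothing to do */
	if (s->peer_window > 0)
		printf("wake tx path, %lu bytes pending\n",
		       s->write_seq - s->snd_una);
	else if (!s->probe_armed) {
		s->probe_armed = true; /* zero window: must probe later */
		printf("arming zero-window probe\n");
	}
}

int main(void)
{
	struct sender s = { .snd_una = 100, .write_seq = 500 };

	on_ack(&s, 100, 0);     /* peer closes the window: arm a probe */
	on_ack(&s, 100, 65535); /* window opens: tx must resume */
	return 0;
}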

Could you please try the attached patch?

/P

p.s. AFAICS the backlog introduction should just increase the frequency
of an already possible event...
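
p.p.s. For clarity, a minimal sketch of what the attached patch changes
(again with made-up names, not the real kernel helpers): before the
patch, the fast path in __mptcp_data_acked() cleaned the acked fragments
without waking the writer; folding the wakeup into the clean-una helper
makes every snd_una advance also signal write space.

#include <stdbool.h>
#include <stdio.h>

struct sender {
	unsigned long snd_una;
	unsigned long acked;
	bool writer_waiting;
};

static void clean_una(struct sender *s)
{
	/* drop rtx fragments fully covered by s->acked */
	s->snd_una = s->acked;
}

static void write_space(struct sender *s)
{
	if (s->writer_waiting) {
		s->writer_waiting = false;
		printf("writer woken, can queue more data\n");
	}
}

/* Post-patch behaviour: cleaning acked data always implies a wakeup. */
static void clean_una_wakeup(struct sender *s)
{
	clean_una(s);
	write_space(s);
}

static void data_acked(struct sender *s, bool owned_by_user)
{
	if (!owned_by_user)
		clean_una_wakeup(s); /* was clean_una() alone: no wakeup */
	/* else: defer to the release_sock() callback path */
}

int main(void)
{
	struct sender s = { .acked = 4096, .writer_waiting = true };

	data_acked(&s, false);
	return 0;
}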

[-- Attachment #2: always_wakeup_snd_nxt_increase.patch --]
[-- Type: text/x-patch, Size: 2138 bytes --]

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index a92ecec1beb3b3..268ec752ffc01b 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1038,12 +1038,14 @@ static void dfrag_clear(struct sock *sk, struct mptcp_data_frag *dfrag)
 }
 
 /* called under both the msk socket lock and the data lock */
-static void __mptcp_clean_una(struct sock *sk)
+static void __mptcp_clean_una_wakeup(struct sock *sk)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct mptcp_data_frag *dtmp, *dfrag;
 	u64 snd_una;
 
+	lockdep_assert_held_once(&sk->sk_lock.slock);
+
 	snd_una = msk->snd_una;
 	list_for_each_entry_safe(dfrag, dtmp, &msk->rtx_queue, list) {
 		if (after64(dfrag->data_seq + dfrag->data_len, snd_una))
@@ -1095,13 +1097,6 @@ static void __mptcp_clean_una(struct sock *sk)
 
 	if (mptcp_pending_data_fin_ack(sk))
 		mptcp_schedule_work(sk);
-}
-
-static void __mptcp_clean_una_wakeup(struct sock *sk)
-{
-	lockdep_assert_held_once(&sk->sk_lock.slock);
-
-	__mptcp_clean_una(sk);
 	mptcp_write_space(sk);
 }
 
@@ -3512,7 +3507,7 @@ static void mptcp_destroy(struct sock *sk)
 void __mptcp_data_acked(struct sock *sk)
 {
 	if (!sock_owned_by_user(sk))
-		__mptcp_clean_una(sk);
+		__mptcp_clean_una_wakeup(sk);
 	else
 		__set_bit(MPTCP_CLEAN_UNA, &mptcp_sk(sk)->cb_flags);
 }
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index 61ae6762f5b601..a185abe13b95c4 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -447,8 +447,8 @@ do_transfer()
 	local duration
 	duration=$((stop-start))
 	printf "(duration %05sms) " "${duration}"
-		mptcp_lib_pr_err_stats "${listener_ns}" "${connector_ns}" "${port}" \
-			"/tmp/${listener_ns}.out" "/tmp/${connector_ns}.out"
+	#	mptcp_lib_pr_err_stats "${listener_ns}" "${connector_ns}" "${port}" \
+	#		"/tmp/${listener_ns}.out" "/tmp/${connector_ns}.out"
 	if [ ${rets} -ne 0 ] || [ ${retc} -ne 0 ]; then
 		mptcp_lib_pr_fail "client exit code $retc, server $rets"
 		mptcp_lib_pr_err_stats "${listener_ns}" "${connector_ns}" "${port}" \


Thread overview: 33+ messages
2025-10-06  8:11 [PATCH v5 mptcp-next 00/10] mptcp: introduce backlog processing Paolo Abeni
2025-10-06  8:12 ` [PATCH v5 mptcp-next 01/10] mptcp: borrow forward memory from subflow Paolo Abeni
2025-10-06  8:12 ` [PATCH v5 mptcp-next 02/10] mptcp: cleanup fallback data fin reception Paolo Abeni
2025-10-06  8:12 ` [PATCH v5 mptcp-next 03/10] mptcp: cleanup fallback dummy mapping generation Paolo Abeni
2025-10-06  8:12 ` [PATCH v5 mptcp-next 04/10] mptcp: fix MSG_PEEK stream corruption Paolo Abeni
2025-10-06  8:12 ` [PATCH v5 mptcp-next 05/10] mptcp: ensure the kernel PM does not take action too late Paolo Abeni
2025-10-06  8:12 ` [PATCH v5 mptcp-next 06/10] mptcp: do not miss early first subflow close event notification Paolo Abeni
2025-10-06  8:12 ` [PATCH v5 mptcp-next 07/10] mptcp: make mptcp_destroy_common() static Paolo Abeni
2025-10-06  8:12 ` [PATCH v5 mptcp-next 08/10] mptcp: drop the __mptcp_data_ready() helper Paolo Abeni
2025-10-06  8:12 ` [PATCH v5 mptcp-next 09/10] mptcp: introduce mptcp-level backlog Paolo Abeni
2025-10-08  3:09   ` Geliang Tang
2025-10-20 19:45   ` Mat Martineau
2025-10-06  8:12 ` [PATCH v5 mptcp-next 10/10] mptcp: leverage the backlog for RX packet processing Paolo Abeni
2025-10-20 23:32   ` Mat Martineau
2025-10-21 17:21     ` Paolo Abeni
2025-10-21 23:53       ` Mat Martineau
2025-10-06 17:07 ` [PATCH v5 mptcp-next 00/10] mptcp: introduce backlog processing Matthieu Baerts
2025-10-08  3:07   ` Geliang Tang
2025-10-08  7:30     ` Paolo Abeni
2025-10-09  6:54       ` Geliang Tang
2025-10-09  7:52         ` Paolo Abeni [this message]
2025-10-09  9:02           ` Geliang Tang
2025-10-09 10:23             ` Paolo Abeni
2025-10-09 13:58               ` Paolo Abeni
2025-10-10  8:21                 ` Paolo Abeni
2025-10-10 12:22                   ` Geliang Tang
2025-10-13  9:07                     ` Geliang Tang
2025-10-13 13:29                       ` Paolo Abeni
2025-10-13 17:07                         ` Paolo Abeni
2025-10-15  9:00                       ` Paolo Abeni
2025-10-17  6:38                         ` Geliang Tang
2025-10-18  0:16                           ` Mat Martineau
2025-10-06 17:43 ` MPTCP CI
