MPTCP Linux Development
From: Paolo Abeni <pabeni@redhat.com>
To: Geliang Tang <geliang@kernel.org>,
	gang.yan@linux.dev, Matthieu Baerts <matttbe@kernel.org>,
	mptcp@lists.linux.dev
Subject: Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Date: Wed, 13 May 2026 11:51:14 +0200
Message-ID: <d254a85a-ec80-41ef-88f2-ba8eb97bcb30@redhat.com>
In-Reply-To: <0b9d2fbe-5d11-4c73-910a-fb99601ad54a@redhat.com>

On 5/11/26 6:35 PM, Paolo Abeni wrote:
> On 5/11/26 1:11 PM, Geliang Tang wrote:
>> On Mon, 2026-05-11 at 10:29 +0200, Paolo Abeni wrote:
>>> On 5/9/26 9:07 AM, gang.yan@linux.dev wrote:
>>>> May 8, 2026 at 6:49 PM, "Matthieu Baerts" <matttbe@kernel.org>
>>>> wrote:
>>>>>
>>>>> Hi Geliang, Gang,
>>>>>
>>>>> On 04/05/2026 17:39, Paolo Abeni wrote:
>>>>>
>>>>>>
>>>>>> This is an attempt to fix the data transfer stall reported by
>>>>>> Geliang and Gang by more carefully enforcing memory constraints at
>>>>>> the MPTCP level.
>>>>>>
>>>>>> Patch 1/10 moves the bound check before entering the TCP socket.
>>>>>> Patches 2, 3, 4 and 5 are cleanups/refactors aimed at safely
>>>>>> re-using TCP helpers on MPTCP skbs.
>>>>>> Patch 6 makes TCP pruning related helpers available to MPTCP and
>>>>>> patch 7 makes use of them. Patch 8 addresses an edge scenario that
>>>>>> could still lead to transfer stall under memory pressure.
>>>>>> Finally, patches 9 and 10 improve the MPTCP-level retransmission
>>>>>> scheme to make recovery from memory pressure significantly faster.
>>>>>>
>>>>>> Note that the diffstat is biased by the quite large patch 4/9,
>>>>>> which contains mechanical transformation of existing code; "real"
>>>>>> changes are noticeably smaller.
>>>>>>
>>>>>> Tested successfully vs the test cases proposed by Geliang and Gang
>>>>>> and vs the selftests.
>>>>>>
>>>>> At the last meeting on Wednesday, Geliang mentioned he validated
>>>>> this
>>>>> series. Just to be sure, was it the v2 -- from last week -- or
>>>>> the v3 --
>>>>> from this week, while you were in EU -- that you validated?
>>>>> Because
>>>>> Paolo couldn't reproduce the issue you mentioned on the v3 on his
>>>>> side.
>>>>>
>>>> Hi, Matt
>>>>
>>>> The issue can also be reproduced on v3. I reproduced it using
>>>> Docker's
>>>> auto-debug mode with the mptcp_data.sh selftest:
>>>>
>>>> ```
>>>> 	Not running all tests but:
>>>>
>>>> -------- 8< --------
>>>> run_loop run_selftest_one mptcp_data.sh
>>>> -------- 8< --------
>>>>
>>>> 	=== Attempt: 1 (Sat, 09 May 2026 06:55:44 +0000) ===
>>>>
>>>> Selftest Test: ./mptcp_data.sh
>>>> TAP version 13
>>>> 1..1
>>>> # add_addr_accepted 4 subflows 4
>>>> # id 1 flags signal 127.0.0.1 10001
>>>> # id 2 flags signal 127.0.0.1 10002
>>>> # id 3 flags signal 127.0.0.1 10003
>>>> # id 4 flags signal 127.0.0.1 10004
>>>> # TAP version 13
>>>> # 1..48
>>>> # # Starting 48 tests from 2 test cases.
>>>> # #  RUN           global.mptcp_v6 ...
>>>> # #            OK  global.mptcp_v6
>>>> # ok 1 global.mptcp_v6
>>>> # #  RUN           mptcp.shutdown_reuse ...
>>>> # #            OK  mptcp.shutdown_reuse
>>>> # ok 2 mptcp.shutdown_reuse
>>>> ...
>>>> # ok 48 mptcp.sendfile
>>>> # # FAILED: 41 / 48 tests passed.
>>>> # # Totals: pass:41 fail:7 xfail:0 xpass:0 skip:0 error:0
>>>> not ok 1 test: selftest_mptcp_data # FAIL
>>>> # time=33
>>>> ```
>>>>
>>>> As Geliang mentioned in the weekly meeting, we will continue
>>>> to debug and locate the problem based on Paolo's v3 patches.
>>>
>>> Thanks for testing. I went over a couple more revisions. I ran more
>>> than 100 iterations successfully on a local build on top of v5 (no
>>> failures; I stopped due to time constraints):
>>>
>>> https://lore.kernel.org/mptcp/f00bdac0-b544-87b8-2ef4-ca4de0f045de@gmail.com/T/#t
>>>
>>> Please give that later version a spin.
>>
>> On v5, mptcp_data.sh works fine in normal mode with the virtme docker
>> image, but in debug mode it's still unstable - failing after several
>> loop iterations.
> 
> How many iterations? Can you please provide a sample of such a failure?
> Does the failure always happen in the same way/on the same test case?

I could reproduce some failures after several iterations on a debug build.
Interestingly, they pointed to:
- a pre-existing issue (missing wake-up) that deserves a separate patch [1].
- an intrinsic problem with queue collapsing: it's (very) slow and can
slow down the receiver a lot. I had to raise the timeout (replacing
TEST_F with TEST_F_TIMEOUT) above 300 seconds to complete 600 mptcp_data
runs without errors.

A working alternative to the latter change is to avoid the
xtcp_*collapse() stuff entirely. The trade-off is that this option will
make ./mptcp_join.sh -R slower (as more MPTCP-level retransmissions will
be needed).
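
For anyone wanting to repeat this kind of soak run outside the virtme
docker image, here is a minimal sketch of a loop that re-runs a test
command until the first failure or timeout. It is not the run_loop helper
used above; the function name, its arguments and the return convention are
made up for illustration. Also note it applies one timeout to the whole
command, while TEST_F_TIMEOUT applies per test case inside the harness.

```python
import subprocess
import sys


def run_loop(cmd, iterations, timeout_s):
    """Run cmd up to `iterations` times, stopping at the first failure.

    Hypothetical helper, not part of any selftest tooling.
    Returns (clean_runs, reason): reason is None if every run passed,
    "timeout" if a run exceeded timeout_s seconds, or "exit:<code>" if a
    run returned a non-zero exit status.
    """
    for attempt in range(1, iterations + 1):
        try:
            proc = subprocess.run(
                cmd,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the child before re-raising.
            return attempt - 1, "timeout"
        if proc.returncode != 0:
            return attempt - 1, f"exit:{proc.returncode}"
    return iterations, None
```

Something like run_loop(["./mptcp_data.sh"], iterations=600, timeout_s=900)
would mirror the 600-run check, with the timeout value being a placeholder
to tune for a debug build.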

I'll try to share the fix [1] soonish.

/P


