From: Paolo Abeni <pabeni@redhat.com>
To: Geliang Tang <geliang@kernel.org>,
gang.yan@linux.dev, Matthieu Baerts <matttbe@kernel.org>,
mptcp@lists.linux.dev
Subject: Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Date: Wed, 13 May 2026 11:51:14 +0200
Message-ID: <d254a85a-ec80-41ef-88f2-ba8eb97bcb30@redhat.com>
In-Reply-To: <0b9d2fbe-5d11-4c73-910a-fb99601ad54a@redhat.com>
On 5/11/26 6:35 PM, Paolo Abeni wrote:
> On 5/11/26 1:11 PM, Geliang Tang wrote:
>> On Mon, 2026-05-11 at 10:29 +0200, Paolo Abeni wrote:
>>> On 5/9/26 9:07 AM, gang.yan@linux.dev wrote:
>>>> May 8, 2026 at 6:49 PM, "Matthieu Baerts" <matttbe@kernel.org> wrote:
>>>>>
>>>>> Hi Geliang, Gang,
>>>>>
>>>>> On 04/05/2026 17:39, Paolo Abeni wrote:
>>>>>
>>>>>>
>>>>>> This is an attempt to fix the data transfer stall reported by
>>>>>> Geliang and
>>>>>> Gang by more carefully enforcing memory constraints at the MPTCP
>>>>>> level.
>>>>>>
>>>>>> Patch 1/10 moves the bound check before entering the TCP
>>>>>> socket.
>>>>>> Patches 2, 3, 4 and 5 are cleanups/refactors that make it safe to
>>>>>> re-use
>>>>>> TCP helpers on MPTCP skbs.
>>>>>> Patch 6 makes TCP pruning related helpers available to MPTCP
>>>>>> and patch 7
>>>>>> makes use of them. Patch 8 addresses an edge scenario that
>>>>>> could still
>>>>>> lead to transfer stall under memory pressure.
>>>>>> Finally, patches 9 and 10 improve the MPTCP-level retransmission
>>>>>> scheme to
>>>>>> make recovery from memory pressure significantly faster.
>>>>>>
>>>>>> Note that the diffstat is biased by the quite large patch 4/10,
>>>>>> which
>>>>>> contains a mechanical transformation of existing code; the "real"
>>>>>> changes are
>>>>>> noticeably smaller.
>>>>>>
>>>>>> Tested successfully vs the test cases proposed by Geliang and
>>>>>> Gang and
>>>>>> vs the selftests.
>>>>>>
>>>>> At the last meeting on Wednesday, Geliang mentioned he validated
>>>>> this
>>>>> series. Just to be sure, was it the v2 -- from last week -- or
>>>>> the v3 --
>>>>> from this week, while you were in EU -- that you validated?
>>>>> Because
>>>>> Paolo couldn't reproduce the issue you mentioned on the v3 on his
>>>>> side.
>>>>>
>>>> Hi, Matt
>>>>
>>>> The issue can also be reproduced on v3. I reproduced it using
>>>> Docker's
>>>> auto-debug mode with the mptcp_data.sh selftest:
>>>>
>>>> ```
>>>> Not running all tests but:
>>>>
>>>> -------- 8< --------
>>>> run_loop run_selftest_one mptcp_data.sh
>>>> -------- 8< --------
>>>>
>>>>
>>>>
>>>>
>>>> === Attempt: 1 (Sat, 09 May 2026 06:55:44 +0000) ===
>>>>
>>>>
>>>> Selftest Test: ./mptcp_data.sh
>>>> TAP version 13
>>>> 1..1
>>>> # add_addr_accepted 4 subflows 4
>>>> # id 1 flags signal 127.0.0.1 10001
>>>> # id 2 flags signal 127.0.0.1 10002
>>>> # id 3 flags signal 127.0.0.1 10003
>>>> # id 4 flags signal 127.0.0.1 10004
>>>> # TAP version 13
>>>> # 1..48
>>>> # # Starting 48 tests from 2 test cases.
>>>> # # RUN global.mptcp_v6 ...
>>>> # # OK global.mptcp_v6
>>>> # ok 1 global.mptcp_v6
>>>> # # RUN mptcp.shutdown_reuse ...
>>>> # # OK mptcp.shutdown_reuse
>>>> # ok 2 mptcp.shutdown_reuse
>>>> ...
>>>> # ok 48 mptcp.sendfile
>>>> # # FAILED: 41 / 48 tests passed.
>>>> # # Totals: pass:41 fail:7 xfail:0 xpass:0 skip:0 error:0
>>>> not ok 1 test: selftest_mptcp_data # FAIL
>>>> # time=33
>>>> ```
>>>>
>>>> As Geliang mentioned in the weekly meeting, we will continue
>>>> to debug and locate the problem based on Paolo's v3 patches.
>>>
>>> Thanks for testing. I went over a couple more revisions. I ran more
>>> than
>>> 100 iterations successfully on a local build on top of v5 (no
>>> failures,
>>> I stopped due to time constraints):
>>>
>>> https://lore.kernel.org/mptcp/f00bdac0-b544-87b8-2ef4-ca4de0f045de@gmail.com/T/#t
>>>
>>> Please give that later version a spin.
>>
>> On v5, mptcp_data.sh works fine in normal mode with the virtme docker
>> image, but in debug mode it's still unstable - failing after several
>> loop iterations.
>
> How many iterations? Can you please provide a sample of such a failure?
> Does the failure always happen in the same way/on the same test case?

I could reproduce some failures after several iterations with a debug build.
Interestingly, they pointed to:

- a pre-existing issue (missing wake-up) that deserves a separate patch [1].
- an intrinsic problem with queue collapsing: it's (very) slow and can
  slow down the receiver a lot. I had to raise the timeout (replacing
  TEST_F with TEST_F_TIMEOUT) above 300 seconds to complete 600 mptcp_data
  runs without errors.

A working alternative to the latter change is to avoid the
xtcp_*collapse() stuff entirely. The trade-off is that this option makes
./mptcp_join.sh -R slower (as more MPTCP-level retransmissions will be
needed).
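
For anyone who wants to chase this kind of intermittent failure locally,
a "run until it fails" loop can be sketched as below. This is only an
illustration, not the harness used in this thread: run_loop(), its
parameters and any values passed to it are hypothetical, and it simply
treats a non-zero exit code or an expired per-run timeout as a failure.

```python
"""Re-run a command until it fails or a run count is reached (illustrative only)."""
import subprocess
import sys


def run_loop(cmd, max_runs, timeout_s=None):
    """Return the 1-based index of the first failing run, or 0 if all runs pass.

    A run counts as failed when the command exits non-zero or takes longer
    than timeout_s seconds (None disables the per-run timeout).
    """
    for attempt in range(1, max_runs + 1):
        try:
            # subprocess.run() kills the child and raises on timeout.
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt}: timed out after {timeout_s}s",
                  file=sys.stderr)
            return attempt
        if result.returncode != 0:
            print(f"attempt {attempt}: exit code {result.returncode}",
                  file=sys.stderr)
            return attempt
    return 0
```

For example, run_loop(["./mptcp_data.sh"], max_runs=600, timeout_s=600)
would stop at the first failing or hanging run; the script path and both
numbers are assumptions chosen for the example.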

I'll try to share the fix [1] soon.

/P

Thread overview: 23+ messages
2026-05-04 15:39 [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure Paolo Abeni
2026-05-04 15:39 ` [PATCH v3 mptcp-next 01/10] mptcp: move checks vs rcvbuf size earlier in the RX path Paolo Abeni
2026-05-04 15:39 ` [PATCH v3 mptcp-next 02/10] mptcp: drop the mptcp_ooo_try_coalesce() helper Paolo Abeni
2026-05-04 15:39 ` [PATCH v3 mptcp-next 03/10] mptcp: drop the cant_coalesce CB field Paolo Abeni
2026-05-04 15:39 ` [PATCH v3 mptcp-next 04/10] mptcp: remove CB offset field Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 05/10] mptcp: sync mptcp skb cb layout with tcp one Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 06/10] tcp: expose the tcp_collapse_ofo_queue() helper to mptcp usage, too Paolo Abeni
2026-05-04 16:39 ` Paolo Abeni
2026-05-04 19:07 ` Eric Dumazet
2026-05-05 15:57 ` Paolo Abeni
2026-05-05 16:35 ` Eric Dumazet
2026-05-06 9:33 ` Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 07/10] mptcp: implemented OoO queue pruning Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 08/10] mptcp: track prune recovery status Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 09/10] mptcp: move the retrans loop to a separate helper Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 10/10] mptcp: let the retrans scheduler do its job Paolo Abeni
2026-05-04 16:55 ` [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure MPTCP CI
2026-05-08 10:49 ` Matthieu Baerts
2026-05-09 7:07 ` gang.yan
2026-05-11 8:29 ` Paolo Abeni
2026-05-11 11:11 ` Geliang Tang
2026-05-11 16:35 ` Paolo Abeni
2026-05-13 9:51 ` Paolo Abeni [this message]