From: Paolo Abeni <pabeni@redhat.com>
To: Geliang Tang <geliang@kernel.org>,
gang.yan@linux.dev, Matthieu Baerts <matttbe@kernel.org>,
mptcp@lists.linux.dev
Subject: Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Date: Wed, 13 May 2026 11:51:14 +0200
Message-ID: <d254a85a-ec80-41ef-88f2-ba8eb97bcb30@redhat.com>
In-Reply-To: <0b9d2fbe-5d11-4c73-910a-fb99601ad54a@redhat.com>
On 5/11/26 6:35 PM, Paolo Abeni wrote:
> On 5/11/26 1:11 PM, Geliang Tang wrote:
>> On Mon, 2026-05-11 at 10:29 +0200, Paolo Abeni wrote:
>>> On 5/9/26 9:07 AM, gang.yan@linux.dev wrote:
>>>> May 8, 2026 at 6:49 PM, "Matthieu Baerts" <matttbe@kernel.org>
>>>> wrote:
>>>>>
>>>>> Hi Geliang, Gang,
>>>>>
>>>>> On 04/05/2026 17:39, Paolo Abeni wrote:
>>>>>
>>>>>>
>>>>>> This is an attempt to fix the data transfer stall reported by
>>>>>> Geliang and Gang by more carefully enforcing memory constraints
>>>>>> at the MPTCP level.
>>>>>>
>>>>>> Patch 1/10 moves the bound check before entering the TCP socket.
>>>>>> Patches 2, 3, 4 and 5 are cleanups/refactors aimed at safely
>>>>>> re-using TCP helpers on MPTCP skbs.
>>>>>> Patch 6 makes the TCP pruning-related helpers available to MPTCP
>>>>>> and patch 7 makes use of them. Patch 8 addresses an edge scenario
>>>>>> that could still lead to a transfer stall under memory pressure.
>>>>>> Finally, patches 9 and 10 improve the MPTCP-level retransmission
>>>>>> scheme to make recovery from memory pressure significantly
>>>>>> faster.
>>>>>>
>>>>>> Note that the diffstat is biased by the quite large patch 4/9,
>>>>>> which contains a mechanical transformation of existing code;
>>>>>> "real" changes are noticeably smaller.
>>>>>>
>>>>>> Tested successfully vs the test cases proposed by Geliang and
>>>>>> Gang and vs the selftests.
>>>>>>
>>>>> At the last meeting on Wednesday, Geliang mentioned he validated
>>>>> this series. Just to be sure, was it the v2 -- from last week --
>>>>> or the v3 -- from this week, while you were in EU -- that you
>>>>> validated? Because Paolo couldn't reproduce the issue you
>>>>> mentioned on the v3 on his side.
>>>>>
>>>> Hi, Matt
>>>>
>>>> The issue can also be reproduced on v3. I reproduced it using
>>>> Docker's auto-debug mode with the mptcp_data.sh selftest:
>>>>
>>>> '''
>>>> Not running all tests but:
>>>>
>>>> -------- 8< --------
>>>> run_loop run_selftest_one mptcp_data.sh
>>>> -------- 8< --------
>>>>
>>>> === Attempt: 1 (Sat, 09 May 2026 06:55:44 +0000) ===
>>>>
>>>> Selftest Test: ./mptcp_data.sh
>>>> TAP version 13
>>>> 1..1
>>>> # add_addr_accepted 4 subflows 4
>>>> # id 1 flags signal 127.0.0.1 10001
>>>> # id 2 flags signal 127.0.0.1 10002
>>>> # id 3 flags signal 127.0.0.1 10003
>>>> # id 4 flags signal 127.0.0.1 10004
>>>> # TAP version 13
>>>> # 1..48
>>>> # # Starting 48 tests from 2 test cases.
>>>> # # RUN global.mptcp_v6 ...
>>>> # # OK global.mptcp_v6
>>>> # ok 1 global.mptcp_v6
>>>> # # RUN mptcp.shutdown_reuse ...
>>>> # # OK mptcp.shutdown_reuse
>>>> # ok 2 mptcp.shutdown_reuse
>>>> ...
>>>> # ok 48 mptcp.sendfile
>>>> # # FAILED: 41 / 48 tests passed.
>>>> # # Totals: pass:41 fail:7 xfail:0 xpass:0 skip:0 error:0
>>>> not ok 1 test: selftest_mptcp_data # FAIL
>>>> # time=33
>>>> '''
>>>>
>>>> As Geliang mentioned in the weekly meeting, we will continue
>>>> to debug and locate the problem based on Paolo's v3 patches.
>>>
>>> Thanks for testing. I went over a couple more revisions. I ran more
>>> than 100 iterations successfully on a local build on top of v5 (no
>>> failures; I stopped due to time constraints):
>>>
>>> https://lore.kernel.org/mptcp/f00bdac0-b544-87b8-2ef4-ca4de0f045de@gmail.com/T/#t
>>>
>>> Please give that later version a spin.
>>
>> On v5, mptcp_data.sh works fine in normal mode with the virtme docker
>> image, but in debug mode it's still unstable - failing after several
>> loop iterations.
>
> How many iterations? Can you please provide a sample of such a failure?
> Does the failure always happen in the same way/on the same test case?

I could reproduce some failures after several iterations on a debug
build. Interestingly, they point to:

- a pre-existing issue (a missing wake-up) that deserves a separate
  patch [1].
- an intrinsic problem with queue collapsing: it's (very) slow and can
  slow down the receiver a lot. I had to raise the timeout above 300
  seconds (replacing TEST_F with TEST_F_TIMEOUT) to complete 600
  mptcp_data runs without errors.

An alternative to the latter change that also works is to avoid the
tcp_*collapse() helpers entirely. The trade-off is that this option
makes ./mptcp_join.sh -R slower (as more MPTCP-level retransmissions
will be needed).

I'll try to share the fix [1] soonish.

/P

Thread overview: 23+ messages
2026-05-04 15:39 [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure Paolo Abeni
2026-05-04 15:39 ` [PATCH v3 mptcp-next 01/10] mptcp: move checks vs rcvbuf size earlier in the RX path Paolo Abeni
2026-05-04 15:39 ` [PATCH v3 mptcp-next 02/10] mptcp: drop the mptcp_ooo_try_coalesce() helper Paolo Abeni
2026-05-04 15:39 ` [PATCH v3 mptcp-next 03/10] mptcp: drop the cant_coalesce CB field Paolo Abeni
2026-05-04 15:39 ` [PATCH v3 mptcp-next 04/10] mptcp: remove CB offset field Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 05/10] mptcp: sync mptcp skb cb layout with tcp one Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 06/10] tcp: expose the tcp_collapse_ofo_queue() helper to mptcp usage, too Paolo Abeni
2026-05-04 16:39 ` Paolo Abeni
2026-05-04 19:07 ` Eric Dumazet
2026-05-05 15:57 ` Paolo Abeni
2026-05-05 16:35 ` Eric Dumazet
2026-05-06 9:33 ` Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 07/10] mptcp: implemented OoO queue pruning Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 08/10] mptcp: track prune recovery status Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 09/10] mptcp: move the retrans loop to a separate helper Paolo Abeni
2026-05-04 15:40 ` [PATCH v3 mptcp-next 10/10] mptcp: let the retrans scheduler do its job Paolo Abeni
2026-05-04 16:55 ` [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure MPTCP CI
2026-05-08 10:49 ` Matthieu Baerts
2026-05-09 7:07 ` gang.yan
2026-05-11 8:29 ` Paolo Abeni
2026-05-11 11:11 ` Geliang Tang
2026-05-11 16:35 ` Paolo Abeni
2026-05-13 9:51 ` Paolo Abeni [this message]