public inbox for mptcp@lists.linux.dev
From: Paolo Abeni <pabeni@redhat.com>
To: gang.yan@linux.dev, mptcp@lists.linux.dev
Subject: Re: [PATCH mptcp-net v5 1/5] mptcp: replace backlog_list with backlog_queue
Date: Mon, 20 Apr 2026 11:34:22 +0200	[thread overview]
Message-ID: <e04bd2eb-0107-4011-97d5-a672d770767c@redhat.com> (raw)
In-Reply-To: <ef0ce8465f0b852d5e4dc5e54031a156f606b636@linux.dev>

On 4/15/26 10:21 AM, gang.yan@linux.dev wrote:
> April 15, 2026 at 3:17 PM, "Paolo Abeni" <pabeni@redhat.com> wrote:
>> AFAICS the stall in the self-tests in patch 5/5 is caused by the sysctl
>> setting taking effect on the server side _after_ the 3whs has
>> negotiated the initial window: the rcvbuf suddenly shrinks from ~128K
>> to 4K and almost every incoming packet is dropped.
>>
>> The test itself is really an extreme condition; we should accept any
>> implementation able to complete the transfer - even at very low speed.
>>
>> The initial test-case, the one using sendfile(), operates in a
>> significantly different way: it generates 1-byte-long DSS mappings
>> which prevent coalescing (I have not yet understood why coalescing
>> does not happen). That causes an extremely bad skb->truesize/skb->len
>> ratio, which in turn makes the initial window way too "optimistic",
>> leading to an extreme rcvbuf squeeze at runtime and a behavior similar
>> to the previous one.
>>
>> In both cases simply dropping incoming packets early/in
>> mptcp_incoming_options() when the rcvbuf is full does not solve the
>> issue: if the rcvbuf is used (mostly) by the OoO queue, retransmissions
>> always hit the same rcvbuf condition and are also dropped.
>>
>> The root cause of both scenarios is that some very unlikely conditions
>> call for retracting the announced rcv wnd, but MPTCP can't do that.
>>
>> Currently I'm starting to think that we need a strategy similar to
>> plain TCP's to deal with such scenarios: when the rcvbuf is full we
>> need to collapse and eventually prune the OoO queue (see
>> tcp_prune_queue(), tcp_collapse_ofo_queue(), tcp_collapse()).
>>
>> The above has some serious downsides, i.e. it could lead to a large
>> slice of almost-duplicate, complex code, as it is difficult to abstract
>> the MPTCP vs TCP differences (CB, seq numbers, drop reasons). Still
>> under investigation.
>>
>> /P
>>
> Hi Paolo,
> Thanks a lot for your detailed and insightful analysis of this problem!
> 
> I fully agree with your points: MPTCP should allow the transfer to complete
> even under extremely slow or harsh conditions, just as you mentioned.

I was likely not clear in my previous message. IMHO the key point is
that in the mentioned scenario we should consider suitable any fix that
allows completing the transfer, even at an extremely low average
bitrate, because the memory conditions are indeed extreme.

I.e. we can/should consider this case an extreme slow path.

> Regarding the TCP-style mechanisms like 'tcp_prune_queue' for handling full
> rcvbuf conditions — I have actually attempted similar implementations before.
> As you pointed out, this approach is indeed highly complex for MPTCP. There
> are far too many aspects that require careful modification and consideration,
> making it extremely challenging to implement correctly.

I agree that duplicating the TCP pruning code inside MPTCP does not
look like a viable solution.

I think we can instead share it (at least the most cumbersome helpers),
with some caveats. I have a few very rough patches doing that. Let me
add some comments to make the code at least somewhat readable, and I'll
share them here.

/P


  reply	other threads:[~2026-04-20  9:34 UTC|newest]

Thread overview: 18+ messages
2026-04-01  8:54 [PATCH mptcp-net v5 0/5] mptcp: fix stall because of data_ready Gang Yan
2026-04-01  8:54 ` [PATCH mptcp-net v5 1/5] mptcp: replace backlog_list with backlog_queue Gang Yan
2026-04-01 20:12   ` Paolo Abeni
2026-04-15  7:17     ` Paolo Abeni
2026-04-15  8:21       ` gang.yan
2026-04-20  9:34         ` Paolo Abeni [this message]
2026-04-20  9:41           ` gang.yan
2026-04-20 10:33             ` Paolo Abeni
2026-04-21  1:26               ` gang.yan
2026-04-01  8:54 ` [PATCH mptcp-net v5 2/5] mptcp: fix the stall problems using backlog_queue Gang Yan
2026-04-01  8:54 ` [PATCH mptcp-net v5 3/5] mptcp: fix the stall problems with data_ready Gang Yan
2026-04-01  8:54 ` [PATCH mptcp-net v5 4/5] mptcp: fix the dead_lock in mptcp_data_ready Gang Yan
2026-04-01 20:02   ` Paolo Abeni
     [not found]     ` <e1f58d941ad141f9f96a24d1a1d9d6ca74cc2f5e@linux.dev>
2026-04-03  6:45       ` Fwd: " gang.yan
2026-04-13 15:49         ` Paolo Abeni
2026-04-01  8:54 ` [PATCH mptcp-net v5 5/5] selftests: mptcp: test transmission with small rcvbuf Gang Yan
2026-04-03  1:17   ` Geliang Tang
2026-04-03  8:52     ` Matthieu Baerts

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e04bd2eb-0107-4011-97d5-a672d770767c@redhat.com \
    --to=pabeni@redhat.com \
    --cc=gang.yan@linux.dev \
    --cc=mptcp@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
