From: gang.yan@linux.dev
To: "Paolo Abeni" <pabeni@redhat.com>, mptcp@lists.linux.dev
Subject: Re: [PATCH mptcp-net v5 1/5] mptcp: replace backlog_list with backlog_queue
Date: Mon, 20 Apr 2026 09:41:44 +0000 [thread overview]
Message-ID: <7dc554be0c3343fee71e09a4fda5179cfe0571f0@linux.dev> (raw)
In-Reply-To: <e04bd2eb-0107-4011-97d5-a672d770767c@redhat.com>
April 20, 2026 at 5:34 PM, "Paolo Abeni" <pabeni@redhat.com> wrote:
>
> On 4/15/26 10:21 AM, gang.yan@linux.dev wrote:
>
> >
> > April 15, 2026 at 3:17 PM, "Paolo Abeni" <pabeni@redhat.com> wrote:
> >
> > >
> > > AFAICS the stall in the self-tests in patch 5/5 is caused by the sysctl
> > > setting taking effect on the server side _after_ the 3whs has negotiated
> > > the initial window; the rcvbuf suddenly shrinks from ~128K to 4K and
> > > almost every incoming packet is dropped.
> > >
> > > The test itself is really an extreme condition; we should accept any
> > > implementation able to complete the transfer - even at very low speed.
> > >
> > > The initial test-case, the one using sendfile(), operates in a
> > > significantly different way: it generates 1-byte-long DSSes, preventing
> > > coalescing (I've not yet understood why coalescing does not happen),
> > > which causes an extremely bad skb->truesize/skb->len ratio; that in turn
> > > makes the initial window way too "optimistic", leading to an extreme
> > > rcvbuf squeeze at runtime and a behavior similar to the previous one.
> > >
> > > In both cases simply dropping incoming packets early/in
> > > mptcp_incoming_options() when the rcvbuf is full does not solve the
> > > issue: if the rcvbuf is used (mostly) by the OoO queue, retransmissions
> > > always hit the same rcvbuf condition and are also dropped.
> > >
> > > The root cause of both scenarios is that some very unlikely conditions
> > > call for retracting the announced rcv wnd, but MPTCP can't do that.
> > >
> > > I'm starting to think that we need a strategy similar to plain TCP's to
> > > deal with such a scenario: when the rcvbuf is full we need to condense
> > > and eventually prune the OoO queue (see tcp_prune_queue(),
> > > tcp_collapse_ofo_queue(), tcp_collapse()).
> > >
> > > The above has some serious downsides, i.e. it could lead to a large
> > > slice of almost-duplicate complex code, as it is difficult to abstract
> > > the MPTCP vs TCP differences (CB, seq numbers, drop reasons). Still
> > > under investigation.
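
Just to check that I'm reading this direction correctly, below is a very
rough, untested sketch of the shape I imagine on the MPTCP side, mirroring
tcp_prune_ofo_queue(). The mptcp_prune_ofo_queue() name is made up, and
mptcp_drop() is only a placeholder for "uncharge the receive memory and
free the skb"; a real version would also need to fix up the msk
bookkeeping around the OoO queue.

/* Placeholder sketch: drop from the tail of the MPTCP-level OoO queue
 * until we are back under the rcvbuf limit, similar to what
 * tcp_prune_ofo_queue() does for plain TCP.
 */
static bool mptcp_prune_ofo_queue(struct sock *sk)
{
	struct mptcp_sock *msk = mptcp_sk(sk);
	struct rb_node *node, *prev;

	if (RB_EMPTY_ROOT(&msk->out_of_order_queue))
		return false;

	node = rb_last(&msk->out_of_order_queue);
	do {
		struct sk_buff *skb = rb_entry(node, struct sk_buff, rbnode);

		prev = rb_prev(node);
		rb_erase(node, &msk->out_of_order_queue);
		/* placeholder: uncharge rmem and free the skb */
		mptcp_drop(sk, skb);
		node = prev;
	} while (node &&
		 atomic_read(&sk->sk_rmem_alloc) > READ_ONCE(sk->sk_rcvbuf));

	return true;
}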
> > >
> > > /P
> > >
> > Hi Paolo,
> > Thanks a lot for your detailed and insightful analysis of this problem!
> >
> > I fully agree with your points: MPTCP should allow the transfer to complete
> > even under extremely slow or harsh conditions, just as you mentioned.
> >
> I was likely not clear in my previous message. IMHO the key point is
> that in the mentioned scenario we should consider suitable any fix that
> would allow completing the transfer, even at an extremely low average
> bitrate - because the memory conditions are indeed extreme.
>
> I.e. we can/should consider this case an extreme slow path.
>
> >
> > Regarding the TCP-style mechanisms like 'tcp_prune_queue' for handling full
> > rcvbuf conditions — I have actually attempted similar implementations before.
> > As you pointed out, this approach is indeed highly complex for MPTCP. There
> > are far too many aspects that require careful modification and consideration,
> > making it extremely challenging to implement correctly.
> >
> I agree duplicating the TCP pruning code inside MPTCP does not look like
> a viable solution.
>
> I think we can instead share it (at least the most cumbersome helper),
> with some caveats. I have a few very rough patches doing that. Let me add
> some comments to make the code at least somewhat readable, and I'll share
> them here.
>
Thank you so much for your help and for working on this!
Looking forward to your updates.
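
In the meantime, in case it is useful, this is roughly where I would expect
such pruning to hook into the MPTCP receive path, mirroring
tcp_try_rmem_schedule(). Again an untested sketch; both mptcp_* names below
are placeholders rather than existing in-tree helpers:

/* Placeholder sketch: before giving up on an incoming skb for memory
 * reasons, try to prune the OoO queue and re-check, like plain TCP does
 * via tcp_try_rmem_schedule() -> tcp_prune_queue().
 */
static bool mptcp_try_rmem_schedule(struct sock *sk, struct sk_buff *skb,
				    unsigned int size)
{
	if (atomic_read(&sk->sk_rmem_alloc) <= READ_ONCE(sk->sk_rcvbuf) &&
	    sk_rmem_schedule(sk, skb, size))
		return true;

	/* stands for the collapse + prune step discussed above */
	if (!mptcp_prune_ofo_queue(sk))
		return false;

	return atomic_read(&sk->sk_rmem_alloc) <= READ_ONCE(sk->sk_rcvbuf) &&
	       sk_rmem_schedule(sk, skb, size);
}
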
Thanks
Gang
> /P
>
Thread overview: 18+ messages
2026-04-01 8:54 [PATCH mptcp-net v5 0/5] mptcp: fix stall because of data_ready Gang Yan
2026-04-01 8:54 ` [PATCH mptcp-net v5 1/5] mptcp: replace backlog_list with backlog_queue Gang Yan
2026-04-01 20:12 ` Paolo Abeni
2026-04-15 7:17 ` Paolo Abeni
2026-04-15 8:21 ` gang.yan
2026-04-20 9:34 ` Paolo Abeni
2026-04-20 9:41 ` gang.yan [this message]
2026-04-20 10:33 ` Paolo Abeni
2026-04-21 1:26 ` gang.yan
2026-04-01 8:54 ` [PATCH mptcp-net v5 2/5] mptcp: fix the stall problems using backlog_queue Gang Yan
2026-04-01 8:54 ` [PATCH mptcp-net v5 3/5] mptcp: fix the stall problems with data_ready Gang Yan
2026-04-01 8:54 ` [PATCH mptcp-net v5 4/5] mptcp: fix the dead_lock in mptcp_data_ready Gang Yan
2026-04-01 20:02 ` Paolo Abeni
[not found] ` <e1f58d941ad141f9f96a24d1a1d9d6ca74cc2f5e@linux.dev>
2026-04-03 6:45 ` Fwd: " gang.yan
2026-04-13 15:49 ` Paolo Abeni
2026-04-01 8:54 ` [PATCH mptcp-net v5 5/5] selftests: mptcp: test transmission with small rcvbuf Gang Yan
2026-04-03 1:17 ` Geliang Tang
2026-04-03 8:52 ` Matthieu Baerts