MPTCP Linux Development
 help / color / mirror / Atom feed
From: gang.yan@linux.dev
To: "Paolo Abeni" <pabeni@redhat.com>
Cc: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>,
	"Geliang Tang" <geliang@kernel.org>,
	"MPTCP Linux" <mptcp@lists.linux.dev>
Subject: Re: [PATCH v8 mptcp-next 7/9] mptcp: implemented OoO queue pruning
Date: Thu, 28 May 2026 01:18:21 +0000	[thread overview]
Message-ID: <9af2ada39337fd17da0740c117307876cb01beeb@linux.dev> (raw)
In-Reply-To: <76116a98-e6c3-4991-b4cd-d52c354e01ca@redhat.com>

May 27, 2026 at 6:01 PM, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:


> 
> On 5/27/26 7:30 AM, gang.yan@linux.dev wrote:
> 
> > 
> > May 26, 2026 at 2:50 PM, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:
> > 
> > > 
> > > On 5/26/26 5:13 AM, gang.yan@linux.dev wrote:
> > > 
> >  May 23, 2026 at 5:43 AM, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:
> >  diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
> >  index ef65e2df709f..d9bd4f4afcc0 100644
> >  --- a/net/mptcp/mib.c
> >  +++ b/net/mptcp/mib.c
> >  @@ -87,6 +87,7 @@ static const struct snmp_mib mptcp_snmp_list[] = {
> >  SNMP_MIB_ITEM("WinProbe", MPTCP_MIB_WINPROBE),
> >  SNMP_MIB_ITEM("BacklogDrop", MPTCP_MIB_BACKLOGDROP),
> >  SNMP_MIB_ITEM("RcvPruned", MPTCP_MIB_RCVPRUNED),
> >  + SNMP_MIB_ITEM("OfoPruned", MPTCP_MIB_OFO_PRUNED),
> >  };
> >  
> >  /* mptcp_mib_alloc - allocate percpu mib counters
> >  diff --git a/net/mptcp/mib.h b/net/mptcp/mib.h
> >  index c84eb853d499..18f35f7e0a2d 100644
> >  --- a/net/mptcp/mib.h
> >  +++ b/net/mptcp/mib.h
> >  @@ -90,6 +90,7 @@ enum linux_mptcp_mib_field {
> >  MPTCP_MIB_WINPROBE, /* MPTCP-level zero window probe */
> >  MPTCP_MIB_BACKLOGDROP, /* Backlog over memory limit */
> >  MPTCP_MIB_RCVPRUNED, /* Dropped due to memory constrains */
> >  + MPTCP_MIB_OFO_PRUNED, /* MPTCP-level OoO queue pruned */
> >  __MPTCP_MIB_MAX
> >  };
> >  
> >  diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> >  index 03d6f8658467..f446e22148b9 100644
> >  --- a/net/mptcp/protocol.c
> >  +++ b/net/mptcp/protocol.c
> >  @@ -373,6 +373,43 @@ static void mptcp_init_skb(struct sock *ssk, struct sk_buff *skb, int offset,
> >  skb_dst_drop(skb);
> >  }
> >  
> >  +/* "Inspired" from the TCP version */
> >  +static void mptcp_prune_ofo_queue(struct sock *sk, u64 seq)
> >  +{
> >  + struct mptcp_sock *msk = mptcp_sk(sk);
> >  + struct rb_node *node, *prev;
> >  + bool pruned = false;
> >  + u64 mem;
> >  +
> >  + if (RB_EMPTY_ROOT(&msk->out_of_order_queue))
> >  + return;
> >  +
> >  + node = &msk->ooo_last_skb->rbnode;
> >  +
> >  + do {
> >  + struct sk_buff *skb = rb_to_skb(node);
> >  +
> >  + /* Stop pruning if the incoming skb would land in OoO tail. */
> >  + if (after64(seq, MPTCP_SKB_CB(skb)->map_seq))
> >  + break;
> >  +
> >  + pruned = true;
> >  + prev = rb_prev(node);
> >  + rb_erase(node, &msk->out_of_order_queue);
> >  + mptcp_drop(sk, skb);
> >  + msk->ooo_last_skb = rb_to_skb(prev);
> >  +
> >  + mem = (unsigned int)atomic_read(&sk->sk_rmem_alloc);
> >  + if (mem < sk->sk_rcvbuf)
> >  + break;
> > 
> >  Hi Paolo,
> >  
> >  Thanks for the v8. While going through the code, I ran into a
> >  point that I'm not entirely sure about.
> >  
> >  TCP‘s design uses sk->sk_rcvbuf >> 3 (one eighth of the buffer)
> >  as a goal. It we use sk->sk_rcvbuf here, the loop may break after
> >  deleting just one packet, right? This may fail to free enough space
> >  for the incoming out‑of‑order packet, leading to repeated pruning
> >  calls and potential packet drops.
> > 
> > > 
> > > Thank you for your review.
> > > 
> > >  Yes, here there is some difference vs plain TCP and I think it's for the
> > >  better.
> > > 
> > >  TCP tries to make a "significant" amount of space in the receiver buffer
> > >  while MPTCP tries to make room for a single packet.
> > > 
> > >  Both strategies make sense in their respective context: TCP invokes
> > >  tcp_prune_ofo_queue() only after collapsing, and the latter is very CPU
> > >  intensive; if tcp_prune_ofo_queue() would make room for a single packet,
> > >  the next incoming one could still trigger collapsing and burn a lot of
> > >  CPU cycles (note that this bad chain of events could still happen if
> > >  sk_rcvbuf / 8 is not bigger than 2 packets).
> > > 
> > >  MPTCP (intentionally, as per discussion in the last iterations here)
> > >  does not perform collapsing, and directly does pruning when over limit.
> > >  Pruning is relatively cheap - computational complexity wise we could do
> > >  it for each incoming packet with "no issues", at least compared to
> > >  collapsing.
> > > 
> > >  Dropping the minimal amount of packets to fit the incoming one, allows
> > >  MPTCP to minimize the need for reinjections (for packets already acked
> > >  at the TCP-level, which we really should avoid). I think overall this
> > >  compromise is a better fit for MPTCP.
> > > 
> > >  BTW did you have the chance to perform testing on top of this revision?
> > > 
> >  
> >  Hi Paolo,
> >  
> >  Yes, I've tested v8 many times — hundreds of runs. And the tls.c part works
> >  fine too.
> >  
> >  I also spotted some issues with MPTCP TLS adaptation, but those are unrelated
> >  to v8. I'll reply in Geliang's thread about them soon.
> > 
> Thank you for testing. I'll send a v9 to try to get sashiko comments on
> remaining patches, but no real changes in it. Feel free to add your
> formal Tested-by tag, if you want :)

Great, thank you.

Tested-by: Gang Yan <yangang@kylinos.cn>

> 
> Thanks,
> 
> Paolo
>

  reply	other threads:[~2026-05-28  1:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-22 21:43 [PATCH v8 mptcp-next 0/9] mptcp: address stall under memory pressure Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 1/9] mptcp: fix missing wakeups in edge scenarios Paolo Abeni
2026-05-26  7:47   ` Matthieu Baerts
2026-05-22 21:43 ` [PATCH v8 mptcp-next 2/9] mptcp: fix retransmission loop when csum is enabled Paolo Abeni
2026-05-26  7:48   ` Matthieu Baerts
2026-05-26 15:10     ` Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 3/9] mptcp: close TOCTOU race while computing rcv_wnd Paolo Abeni
2026-05-26  6:10   ` Geliang Tang
2026-05-26  6:34     ` Paolo Abeni
2026-05-26  7:48       ` Matthieu Baerts
2026-05-22 21:43 ` [PATCH v8 mptcp-next 4/9] mptcp: allow subflow rcv wnd to shrink Paolo Abeni
2026-05-24  8:34   ` Paolo Abeni
2026-05-26  7:02     ` Matthieu Baerts
2026-05-26  7:49       ` Matthieu Baerts
2026-05-26 15:17         ` Paolo Abeni
2026-05-27  0:51           ` Matthieu Baerts
2026-05-26  7:48   ` Matthieu Baerts
2026-05-26 15:16     ` Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 5/9] mptcp: explicitly drop over memory limits Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 6/9] mptcp: enforce hard limit on backlog flushing Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 7/9] mptcp: implemented OoO queue pruning Paolo Abeni
2026-05-26  3:13   ` gang.yan
2026-05-26  6:50     ` Paolo Abeni
2026-05-27  5:30       ` gang.yan
2026-05-27 10:01         ` Paolo Abeni
2026-05-28  1:18           ` gang.yan [this message]
2026-05-22 21:43 ` [PATCH v8 mptcp-next 8/9] mptcp: move the retrans loop to a separate helper Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 9/9] mptcp: let the retrans scheduler do its job Paolo Abeni
2026-05-27  5:46   ` Geliang Tang
2026-05-22 23:10 ` [PATCH v8 mptcp-next 0/9] mptcp: address stall under memory pressure MPTCP CI

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9af2ada39337fd17da0740c117307876cb01beeb@linux.dev \
    --to=gang.yan@linux.dev \
    --cc=geliang@kernel.org \
    --cc=matttbe@kernel.org \
    --cc=mptcp@lists.linux.dev \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox