All of lore.kernel.org
 help / color / mirror / Atom feed
From: gang.yan@linux.dev
To: "Paolo Abeni" <pabeni@redhat.com>
Cc: "Matthieu Baerts (NGI0)" <matttbe@kernel.org>,
	"Geliang Tang" <geliang@kernel.org>,
	"MPTCP Linux" <mptcp@lists.linux.dev>
Subject: Re: [PATCH v8 mptcp-next 7/9] mptcp: implemented OoO queue pruning
Date: Thu, 28 May 2026 01:18:21 +0000	[thread overview]
Message-ID: <9af2ada39337fd17da0740c117307876cb01beeb@linux.dev> (raw)
In-Reply-To: <76116a98-e6c3-4991-b4cd-d52c354e01ca@redhat.com>

May 27, 2026 at 6:01 PM, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:


> 
> On 5/27/26 7:30 AM, gang.yan@linux.dev wrote:
> 
> > 
> > May 26, 2026 at 2:50 PM, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:
> > 
> > > 
> > > On 5/26/26 5:13 AM, gang.yan@linux.dev wrote:
> > > 
> >  May 23, 2026 at 5:43 AM, "Paolo Abeni" <pabeni@redhat.com mailto:pabeni@redhat.com?to=%22Paolo%20Abeni%22%20%3Cpabeni%40redhat.com%3E > wrote:
> >  diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c
> >  index ef65e2df709f..d9bd4f4afcc0 100644
> >  --- a/net/mptcp/mib.c
> >  +++ b/net/mptcp/mib.c
> >  @@ -87,6 +87,7 @@ static const struct snmp_mib mptcp_snmp_list[] = {
> >  SNMP_MIB_ITEM("WinProbe", MPTCP_MIB_WINPROBE),
> >  SNMP_MIB_ITEM("BacklogDrop", MPTCP_MIB_BACKLOGDROP),
> >  SNMP_MIB_ITEM("RcvPruned", MPTCP_MIB_RCVPRUNED),
> >  + SNMP_MIB_ITEM("OfoPruned", MPTCP_MIB_OFO_PRUNED),
> >  };
> >  
> >  /* mptcp_mib_alloc - allocate percpu mib counters
> >  diff --git a/net/mptcp/mib.h b/net/mptcp/mib.h
> >  index c84eb853d499..18f35f7e0a2d 100644
> >  --- a/net/mptcp/mib.h
> >  +++ b/net/mptcp/mib.h
> >  @@ -90,6 +90,7 @@ enum linux_mptcp_mib_field {
> >  MPTCP_MIB_WINPROBE, /* MPTCP-level zero window probe */
> >  MPTCP_MIB_BACKLOGDROP, /* Backlog over memory limit */
> >  MPTCP_MIB_RCVPRUNED, /* Dropped due to memory constrains */
> >  + MPTCP_MIB_OFO_PRUNED, /* MPTCP-level OoO queue pruned */
> >  __MPTCP_MIB_MAX
> >  };
> >  
> >  diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> >  index 03d6f8658467..f446e22148b9 100644
> >  --- a/net/mptcp/protocol.c
> >  +++ b/net/mptcp/protocol.c
> >  @@ -373,6 +373,43 @@ static void mptcp_init_skb(struct sock *ssk, struct sk_buff *skb, int offset,
> >  skb_dst_drop(skb);
> >  }
> >  
> >  +/* "Inspired" from the TCP version */
> >  +static void mptcp_prune_ofo_queue(struct sock *sk, u64 seq)
> >  +{
> >  + struct mptcp_sock *msk = mptcp_sk(sk);
> >  + struct rb_node *node, *prev;
> >  + bool pruned = false;
> >  + u64 mem;
> >  +
> >  + if (RB_EMPTY_ROOT(&msk->out_of_order_queue))
> >  + return;
> >  +
> >  + node = &msk->ooo_last_skb->rbnode;
> >  +
> >  + do {
> >  + struct sk_buff *skb = rb_to_skb(node);
> >  +
> >  + /* Stop pruning if the incoming skb would land in OoO tail. */
> >  + if (after64(seq, MPTCP_SKB_CB(skb)->map_seq))
> >  + break;
> >  +
> >  + pruned = true;
> >  + prev = rb_prev(node);
> >  + rb_erase(node, &msk->out_of_order_queue);
> >  + mptcp_drop(sk, skb);
> >  + msk->ooo_last_skb = rb_to_skb(prev);
> >  +
> >  + mem = (unsigned int)atomic_read(&sk->sk_rmem_alloc);
> >  + if (mem < sk->sk_rcvbuf)
> >  + break;
> > 
> >  Hi Paolo,
> >  
> >  Thanks for the v8. While going through the code, I ran into a
> >  point that I'm not entirely sure about.
> >  
> >  TCP‘s design uses sk->sk_rcvbuf >> 3 (one eighth of the buffer)
> >  as a goal. It we use sk->sk_rcvbuf here, the loop may break after
> >  deleting just one packet, right? This may fail to free enough space
> >  for the incoming out‑of‑order packet, leading to repeated pruning
> >  calls and potential packet drops.
> > 
> > > 
> > > Thank you for your review.
> > > 
> > >  Yes, here there is some difference vs plain TCP and I think it's for the
> > >  better.
> > > 
> > >  TCP tries to make a "significant" amount of space in the receiver buffer
> > >  while MPTCP tries to make room for a single packet.
> > > 
> > >  Both strategies make sense in their respective context: TCP invokes
> > >  tcp_prune_ofo_queue() only after collapsing, and the latter is very CPU
> > >  intensive; if tcp_prune_ofo_queue() would make room for a single packet,
> > >  the next incoming one could still trigger collapsing and burn a lot of
> > >  CPU cycles (note that this bad chain of events could still happen if
> > >  sk_rcvbuf / 8 is not bigger than 2 packets).
> > > 
> > >  MPTCP (intentionally, as per discussion in the last iterations here)
> > >  does not perform collapsing, and directly does pruning when over limit.
> > >  Pruning is relatively cheap - computational complexity wise we could do
> > >  it for each incoming packet with "no issues", at least compared to
> > >  collapsing.
> > > 
> > >  Dropping the minimal amount of packets to fit the incoming one, allows
> > >  MPTCP to minimize the need for reinjections (for packets already acked
> > >  at the TCP-level, which we really should avoid). I think overall this
> > >  compromise is a better fit for MPTCP.
> > > 
> > >  BTW did you have the chance to perform testing on top of this revision?
> > > 
> >  
> >  Hi Paolo,
> >  
> >  Yes, I've tested v8 many times — hundreds of runs. And the tls.c part works
> >  fine too.
> >  
> >  I also spotted some issues with MPTCP TLS adaptation, but those are unrelated
> >  to v8. I'll reply in Geliang's thread about them soon.
> > 
> Thank you for testing. I'll send a v9 to try to get sashiko comments on
> remaining patches, but no real changes in it. Feel free to add your
> formal Tested-by tag, if you want :)

Great, thank you.

Tested-by: Gang Yan <yangang@kylinos.cn>

> 
> Thanks,
> 
> Paolo
>

  reply	other threads:[~2026-05-28  1:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-22 21:43 [PATCH v8 mptcp-next 0/9] mptcp: address stall under memory pressure Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 1/9] mptcp: fix missing wakeups in edge scenarios Paolo Abeni
2026-05-26  7:47   ` Matthieu Baerts
2026-05-22 21:43 ` [PATCH v8 mptcp-next 2/9] mptcp: fix retransmission loop when csum is enabled Paolo Abeni
2026-05-26  7:48   ` Matthieu Baerts
2026-05-26 15:10     ` Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 3/9] mptcp: close TOCTOU race while computing rcv_wnd Paolo Abeni
2026-05-26  6:10   ` Geliang Tang
2026-05-26  6:34     ` Paolo Abeni
2026-05-26  7:48       ` Matthieu Baerts
2026-05-22 21:43 ` [PATCH v8 mptcp-next 4/9] mptcp: allow subflow rcv wnd to shrink Paolo Abeni
2026-05-24  8:34   ` Paolo Abeni
2026-05-26  7:02     ` Matthieu Baerts
2026-05-26  7:49       ` Matthieu Baerts
2026-05-26 15:17         ` Paolo Abeni
2026-05-27  0:51           ` Matthieu Baerts
2026-05-26  7:48   ` Matthieu Baerts
2026-05-26 15:16     ` Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 5/9] mptcp: explicitly drop over memory limits Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 6/9] mptcp: enforce hard limit on backlog flushing Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 7/9] mptcp: implemented OoO queue pruning Paolo Abeni
2026-05-26  3:13   ` gang.yan
2026-05-26  6:50     ` Paolo Abeni
2026-05-27  5:30       ` gang.yan
2026-05-27 10:01         ` Paolo Abeni
2026-05-28  1:18           ` gang.yan [this message]
2026-05-22 21:43 ` [PATCH v8 mptcp-next 8/9] mptcp: move the retrans loop to a separate helper Paolo Abeni
2026-05-22 21:43 ` [PATCH v8 mptcp-next 9/9] mptcp: let the retrans scheduler do its job Paolo Abeni
2026-05-27  5:46   ` Geliang Tang
2026-05-22 23:10 ` [PATCH v8 mptcp-next 0/9] mptcp: address stall under memory pressure MPTCP CI

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9af2ada39337fd17da0740c117307876cb01beeb@linux.dev \
    --to=gang.yan@linux.dev \
    --cc=geliang@kernel.org \
    --cc=matttbe@kernel.org \
    --cc=mptcp@lists.linux.dev \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.