From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>
Cc: "Konstantin Ananyev" <konstantin.ananyev@huawei.com>,
	<techboard@dpdk.org>, <dev@dpdk.org>
Subject: RE: mbuf fast-free requirements analysis
Date: Wed, 14 Jan 2026 19:05:44 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35F6565E@smartserver.smartshare.dk>
In-Reply-To: <aWfGCglFfkpOAxgh@bricha3-mobl1.ger.corp.intel.com>

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Wednesday, 14 January 2026 17.36
> 
> On Wed, Jan 14, 2026 at 04:31:31PM +0100, Morten Brørup wrote:
> > > > If I'm not mistaken, the mbuf library is not a barrier for fast-freeing
> > > > segmented packet mbufs, and thus fast-free of jumbo frames is possible.
> > > >
> > > > We need a driver developer to confirm that my suggested approach -
> > > > resetting the mbuf fields, incl. 'm->nb_segs' and 'm->next', when
> > > > preparing the Tx descriptor - is viable.
> > > >
> > > Excellent analysis, Morten. If I get a chance some time this release cycle,
> > > I will try implementing this change in our drivers, see if any difference
> > > is made.
> >
> > Bruce,
> >
> > Have you had a chance to look into the driver change requirements?
> > If not, could you please try scratching the surface, to build a gut feeling.
> 
> I'll try and take a look this week. Juggling a few things at the moment, so
> I had forgotten about this. Sorry.
> 
> More comments inline below.
> 
> /Bruce
> 
> >
> > I wonder if the vector implementations have strong requirements that packets are not segmented...
> >
> > The i40e driver only sets "tx_simple_allowed" and "tx_vec_allowed" flags when MBUF_FAST_FREE is set:
> > https://elixir.bootlin.com/dpdk/v25.11/source/drivers/net/intel/i40e/i40e_rxtx.c#L3502
> >
> 
> Actually, it allows but does not require FAST_FREE. The check is just
> verifying that the flags with everything *but* FAST_FREE masked out is the
> same as the original flags, i.e. FAST_FREE is just ignored.

That's not how I read the code:
ad->tx_simple_allowed =
	(txq->offloads ==
	 (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) &&
	 txq->tx_rs_thresh >= I40E_TX_MAX_BURST);

Look at it with offloads=(MULTI_SEGS|FAST_FREE), ignoring the tx_rs_thresh term:
simple_allowed = (MULTI_SEGS|FAST_FREE) == ((MULTI_SEGS|FAST_FREE) & FAST_FREE)
i.e.:
simple_allowed = (MULTI_SEGS|FAST_FREE) == FAST_FREE
i.e.: false
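
To make the evaluation above easy to check, here is a minimal standalone sketch of the offload part of that expression. The bit values and the function name offloads_allow_simple() are stand-ins invented for illustration (real code would use the RTE_ETH_TX_OFFLOAD_* macros), and the tx_rs_thresh term is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values, chosen for illustration only. They are NOT
 * the real RTE_ETH_TX_OFFLOAD_* values from rte_ethdev.h. */
#define MULTI_SEGS UINT64_C(0x1)
#define FAST_FREE  UINT64_C(0x2)

/* Offload part of the tx_simple_allowed expression quoted above
 * (hypothetical helper name; the tx_rs_thresh term is omitted). */
bool
offloads_allow_simple(uint64_t offloads)
{
	return offloads == (offloads & FAST_FREE);
}

/* Walk through the interesting combinations. */
void
check_offload_cases(void)
{
	assert(offloads_allow_simple(0));                      /* no offloads */
	assert(offloads_allow_simple(FAST_FREE));              /* FAST_FREE only */
	assert(!offloads_allow_simple(MULTI_SEGS));            /* MULTI_SEGS only */
	assert(!offloads_allow_simple(MULTI_SEGS | FAST_FREE));
}
```

Under these assumptions, the expression is true only when no offload other than FAST_FREE is requested. So FAST_FREE on its own is neither required nor harmful, while MULTI_SEGS, with or without FAST_FREE, rules out the simple path.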

> 
> > And only when these two flags are set, it uses a vector Tx function:
> > https://elixir.bootlin.com/dpdk/v25.11/source/drivers/net/intel/i40e/i40e_rxtx.c#L3550
> > And a special Tx Prep function:
> > https://elixir.bootlin.com/dpdk/v25.11/source/drivers/net/intel/i40e/i40e_rxtx.c#L3584
> > Which fails if nb_segs != 1:
> > https://elixir.bootlin.com/dpdk/v25.11/source/drivers/net/intel/i40e/i40e_rxtx.c#L1675
> >
> > So currently it does.
> > But does it need to?... That is the question.
> > Paraphrasing:
> > Can the Tx function only be vectorized when the code path doesn't have branches depending on the number of segments?
> > If so, then this may be the main reason for not supporting segmented packets with FAST_FREE.
> >
> > In that case, we cannot remove the single-segment requirement from FAST_FREE without sacrificing the performance boost from vectorizing.
> 
> No, based on what I state above, this should not be a blocker. The vector
> paths do require us to guarantee only one segment per packet - without
> additional context descriptors - so only one descriptor per packet
> (generally, or always one + ctx, in one code-path case). FAST_FREE can be
> used in conjunction with that but should not be a requirement. See [1]
> where in vector cleanup we explicitly check for FAST_FREE.
> 
> Similarly for scalar code path, in my latest rework, I am attempting to
> standardize the use of FAST_FREE optimizations even when we have a slightly
> slower Tx path [2].

Good point:
The Tx path has two steps:
1) Pre-transmission Tx descriptor setup.
2) Post-transmission mbuf free.

FAST_FREE requirements for optimizing each of these two steps might differ.

As suggested in my other email, the post-transmission step can hopefully be vectorized, also for multi-segment packets, if the pre-transmission step assists it, i.e. by preparing the FAST_FREE segments for direct release to the mempool.

Then we can consider single-segment requirements for the pre-transmission step.
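
A rough sketch of the two steps, to illustrate how the pre-transmission step could assist the post-transmission step. Everything here (toy_mbuf, toy_txq, toy_pool and the toy_* functions) is a hypothetical stand-in, not the DPDK mbuf/mempool API and not any existing driver code; ring-full and pool-full checks are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 64

/* Toy stand-ins for illustration only, NOT the DPDK structures. */
struct toy_mbuf {
	struct toy_mbuf *next; /* next segment of the packet, or NULL */
	uint16_t nb_segs;      /* segment count, valid in the first segment */
};

struct toy_txq {
	struct toy_mbuf *sw_ring[TOY_RING_SIZE]; /* one slot per descriptor */
	unsigned int tail; /* next slot to fill */
	unsigned int done; /* next slot to clean up */
};

struct toy_pool {
	struct toy_mbuf *objs[TOY_RING_SIZE];
	unsigned int count;
};

/* Step 1: pre-transmission. One descriptor slot per segment; the chain
 * fields are reset here, while the segment is being touched anyway. */
unsigned int
toy_tx_prepare(struct toy_txq *q, struct toy_mbuf *pkt)
{
	unsigned int used = 0;

	while (pkt != NULL) {
		struct toy_mbuf *seg = pkt;

		pkt = seg->next; /* read the link before clearing it */
		/* ... a real driver would fill the Tx descriptor here ... */
		seg->next = NULL;
		seg->nb_segs = 1;
		q->sw_ring[q->tail] = seg;
		q->tail = (q->tail + 1) % TOY_RING_SIZE;
		used++;
	}
	return used;
}

/* Step 2: post-transmission. Segments go straight back to the pool,
 * without inspecting or following any per-mbuf fields. */
void
toy_tx_free(struct toy_txq *q, struct toy_pool *pool, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++) {
		pool->objs[pool->count++] = q->sw_ring[q->done];
		q->done = (q->done + 1) % TOY_RING_SIZE;
	}
}

/* Usage: a three-segment packet is prepared, then released in bulk. */
void
toy_tx_demo(void)
{
	struct toy_mbuf s2 = { NULL, 1 }, s1 = { &s2, 1 }, s0 = { &s1, 3 };
	struct toy_txq q = { { NULL }, 0, 0 };
	struct toy_pool pool = { { NULL }, 0 };
	unsigned int n = toy_tx_prepare(&q, &s0);

	assert(n == 3 && s0.next == NULL && s0.nb_segs == 1);
	toy_tx_free(&q, &pool, n);
	assert(pool.count == 3 && pool.objs[2] == &s2);
}
```

The point of the sketch is that the free loop in step 2 has no branches on segment count, so in principle it is the same loop for single-segment and multi-segment packets. Whether real drivers can afford the extra stores in step 1 is exactly what needs measuring.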

> 
> [1] https://github.com/DPDK/dpdk/blob/main/drivers/net/intel/common/tx.h
> [2] https://patches.dpdk.org/project/dpdk/patch/20260113151505.1871271-31-bruce.richardson@intel.com/
> 
> >
> > But then we can proceed pursuing alternative optimizations, as suggested by Konstantin.
> >
> > Here's another idea:
> > The Tx function could pre-scan each Tx burst for multi-segment packets, to decide if the burst should be processed by the vector code path or a fallback code path (which can also handle multi-segment packets).
> >
> >
> > -Morten
> >
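
For reference, the pre-scan idea quoted above could look roughly like the sketch below. The scan_mbuf type, the scan_tx_path enum and scan_select_tx_path() are made-up names for illustration, not existing DPDK or driver symbols, and the scan reads only a segment count field.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in carrying only the field the scan needs. */
struct scan_mbuf {
	uint16_t nb_segs;
};

enum scan_tx_path { SCAN_TX_VECTOR, SCAN_TX_FALLBACK };

/* Pre-scan a burst: accumulate "any packet has nb_segs != 1" without
 * an early exit, then pick one code path for the whole burst. */
enum scan_tx_path
scan_select_tx_path(struct scan_mbuf **pkts, uint16_t nb_pkts)
{
	unsigned int multi = 0;

	for (uint16_t i = 0; i < nb_pkts; i++)
		multi |= (pkts[i]->nb_segs != 1);

	return multi ? SCAN_TX_FALLBACK : SCAN_TX_VECTOR;
}

/* Usage: an all-single-segment burst and a mixed burst. */
void
scan_demo(void)
{
	struct scan_mbuf a = { 1 }, b = { 1 }, c = { 3 };
	struct scan_mbuf *single[] = { &a, &b };
	struct scan_mbuf *mixed[] = { &a, &c, &b };

	assert(scan_select_tx_path(single, 2) == SCAN_TX_VECTOR);
	assert(scan_select_tx_path(mixed, 3) == SCAN_TX_FALLBACK);
}
```

The cost is an extra pass over the burst that touches every mbuf before any descriptor is written, so whether this pays off would have to be measured against simply selecting the path once at queue setup.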
