From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Bruce Richardson" <bruce.richardson@intel.com>
Cc: "Konstantin Ananyev" <konstantin.ananyev@huawei.com>,
<techboard@dpdk.org>, <dev@dpdk.org>
Subject: RE: mbuf fast-free requirements analysis
Date: Wed, 14 Jan 2026 19:05:44 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35F6565E@smartserver.smartshare.dk>
In-Reply-To: <aWfGCglFfkpOAxgh@bricha3-mobl1.ger.corp.intel.com>
> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Wednesday, 14 January 2026 17.36
>
> On Wed, Jan 14, 2026 at 04:31:31PM +0100, Morten Brørup wrote:
> > > > If I'm not mistaken, the mbuf library is not a barrier for fast-freeing
> > > > segmented packet mbufs, and thus fast-free of jumbo frames is possible.
> > > >
> > > > We need a driver developer to confirm that my suggested approach -
> > > > resetting the mbuf fields, incl. 'm->nb_segs' and 'm->next', when
> > > > preparing the Tx descriptor - is viable.
> > > >
> > > Excellent analysis, Morten. If I get a chance some time this release cycle,
> > > I will try implementing this change in our drivers, see if any difference
> > > is made.
> >
> > Bruce,
> >
> > Have you had a chance to look into the driver change requirements?
> > If not, could you please try scratching the surface, to build a gut feeling.
>
> I'll try and take a look this week. Juggling a few things at the moment, so
> I had forgotten about this. Sorry.
>
> More comments inline below.
>
> /Bruce
>
> >
> > I wonder if the vector implementations have strong requirements that packets are not segmented...
> >
> > The i40e driver only sets "tx_simple_allowed" and "tx_vec_allowed" flags when MBUF_FAST_FREE is set:
> > https://elixir.bootlin.com/dpdk/v25.11/source/drivers/net/intel/i40e/i40e_rxtx.c#L3502
> >
>
> Actually, it allows but does not require FAST_FREE. The check is just
> verifying that the flags with everything *but* FAST_FREE masked out is the
> same as the original flags, i.e. FAST_FREE is just ignored.
That's not how I read the code:
    ad->tx_simple_allowed =
        (txq->offloads ==
         (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) &&
         txq->tx_rs_thresh >= I40E_TX_MAX_BURST);
Look at it with offloads=(MULTI_SEGS|FAST_FREE), leaving the tx_rs_thresh term aside:
    simple_allowed = (MULTI_SEGS|FAST_FREE) == ((MULTI_SEGS|FAST_FREE) & FAST_FREE)
i.e.:
    simple_allowed = (MULTI_SEGS|FAST_FREE) == FAST_FREE
i.e.: false
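
The evaluation above can be checked with a small standalone C sketch. It is not the driver code: the two offload bit values and the burst threshold are stand-ins chosen for illustration, and tx_simple_allowed() is a hypothetical helper that only mirrors the shape of the quoted condition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values for illustration only; the real flags are defined
 * by the ethdev library, not here. */
#define TX_OFFLOAD_MULTI_SEGS     (UINT64_C(1) << 15)
#define TX_OFFLOAD_MBUF_FAST_FREE (UINT64_C(1) << 16)

/* Assumed threshold, standing in for I40E_TX_MAX_BURST. */
#define TX_MAX_BURST 32

/*
 * Hypothetical helper with the same shape as the quoted condition:
 * true only when no offload bit other than FAST_FREE is set and the
 * RS threshold is at least the burst size.
 */
static bool
tx_simple_allowed(uint64_t offloads, uint16_t tx_rs_thresh)
{
	return offloads == (offloads & TX_OFFLOAD_MBUF_FAST_FREE) &&
	       tx_rs_thresh >= TX_MAX_BURST;
}
```

Under these assumptions, tx_simple_allowed(TX_OFFLOAD_MULTI_SEGS | TX_OFFLOAD_MBUF_FAST_FREE, 32) is false, tx_simple_allowed(TX_OFFLOAD_MBUF_FAST_FREE, 32) is true, and tx_simple_allowed(0, 32) is also true, so any offload bit besides FAST_FREE disables the simple path.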
>
> > And only when these two flags are set, it uses a vector Tx function:
> > https://elixir.bootlin.com/dpdk/v25.11/source/drivers/net/intel/i40e/i40e_rxtx.c#L3550
> > And a special Tx Prep function:
> > https://elixir.bootlin.com/dpdk/v25.11/source/drivers/net/intel/i40e/i40e_rxtx.c#L3584
> > Which fails if nb_segs != 1:
> > https://elixir.bootlin.com/dpdk/v25.11/source/drivers/net/intel/i40e/i40e_rxtx.c#L1675
> >
> > So currently it does.
> > But does it need to?... That is the question.
> > Paraphrasing:
> > Can the Tx function only be vectorized when the code path doesn't have branches depending on the number of segments?
> > If so, then this may be the main reason for not supporting segmented packets with FAST_FREE.
> >
> > In that case, we cannot remove the single-segment requirement from FAST_FREE without sacrificing the performance boost from vectorizing.
>
> No, based on what I state above, this should not be a blocker. The vector
> paths do require us to guarantee only one segment per packet - without
> additional context descriptors - so only one descriptor per packet
> (generally, or always one + ctx, in one code-path case). FAST_FREE can be
> used in conjunction with that but should not be a requirement. See [1]
> where in vector cleanup we explicitly check for FAST_FREE.
>
> Similarly for scalar code path, in my latest rework, I am attempting to
> standardize the use of FAST_FREE optimizations even when we have a slightly
> slower Tx path [2].
Good point.
The Tx path has two steps:
1) Pre-transmission Tx descriptor setup.
2) Post-transmission mbuf free.
The FAST_FREE requirements for optimizing each of these two steps might differ.
As suggested in my other email, the post-transmission step can hopefully be vectorized, also for multi-segment packets, if the pre-transmission step assists it by preparing the FAST_FREE segments for direct release to the mempool.
Then we can consider the single-segment requirements for the pre-transmission step separately.
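
To make the two-step split concrete, here is a minimal sketch using mock types instead of the real mbuf and mempool structures. Everything in it (struct mock_mbuf, struct mock_pool, tx_prepare_segs(), tx_free_bulk()) is hypothetical and only illustrates the idea; it also assumes the usual FAST_FREE preconditions, i.e. all mbufs on the queue come from the same pool and have a reference count of 1.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_POOL_SIZE 64

/* Hypothetical cut-down mbuf holding only the two fields named in this thread. */
struct mock_mbuf {
	struct mock_mbuf *next;
	uint16_t nb_segs;
};

/* Hypothetical stand-in for a mempool: a plain stack of free objects. */
struct mock_pool {
	struct mock_mbuf *objs[MOCK_POOL_SIZE];
	unsigned int count;
};

/*
 * Step 1, pre-transmission: store each segment in the software ring (one
 * entry per Tx descriptor) and reset its chain fields while the segment is
 * being touched anyway. Returns the number of entries used, or 0 if the
 * chain does not fit.
 */
static unsigned int
tx_prepare_segs(struct mock_mbuf *pkt, struct mock_mbuf **sw_ring,
		unsigned int ring_size)
{
	unsigned int n = 0;
	struct mock_mbuf *seg;

	for (seg = pkt; seg != NULL; seg = seg->next)
		n++;
	if (n > ring_size)
		return 0;

	n = 0;
	while (pkt != NULL) {
		struct mock_mbuf *next = pkt->next;

		/* A real driver would fill the Tx descriptor for this segment here. */
		pkt->next = NULL;
		pkt->nb_segs = 1;
		sw_ring[n++] = pkt;
		pkt = next;
	}
	return n;
}

/*
 * Step 2, post-transmission: hand n completed segments back in one bulk
 * operation, without inspecting any mbuf field. In DPDK this role would be
 * played by a bulk mempool put such as rte_mempool_put_bulk().
 */
static unsigned int
tx_free_bulk(struct mock_pool *pool, struct mock_mbuf *const *sw_ring,
	     unsigned int n)
{
	unsigned int i;

	if (n > MOCK_POOL_SIZE - pool->count)
		return 0;
	for (i = 0; i < n; i++)
		pool->objs[pool->count++] = sw_ring[i];
	return n;
}
```

The point of the sketch is that step 2 never reads 'next' or 'nb_segs': the per-segment work has been moved into step 1, where the driver walks the chain anyway. Whether that extra store per segment is acceptable in the real descriptor setup is exactly what needs confirming by a driver developer.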
>
> [1] https://github.com/DPDK/dpdk/blob/main/drivers/net/intel/common/tx.h
> [2] https://patches.dpdk.org/project/dpdk/patch/20260113151505.1871271-31-bruce.richardson@intel.com/
>
> >
> > But then we can proceed pursuing alternative optimizations, as suggested by Konstantin.
> >
> > Here's another idea:
> > The Tx function could pre-scan each Tx burst for multi-segment packets, to decide if the burst should be processed by the vector code path or a fallback code path (which can also handle multi-segment packets).
> >
> >
> > -Morten
> >
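
For reference, the pre-scan idea quoted above could look roughly like the following sketch. The types and names (struct scan_mbuf, burst_is_single_seg(), select_tx_path()) are hypothetical and exist only for illustration; a real implementation would read nb_segs from the actual mbuf structure and call the driver's own vector or scalar burst function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down mbuf: only the segment count matters for the scan. */
struct scan_mbuf {
	uint16_t nb_segs;
};

enum tx_path {
	TX_PATH_VECTOR,   /* single-segment packets only */
	TX_PATH_FALLBACK  /* can also handle multi-segment packets */
};

/* Returns true if every packet in the burst consists of exactly one segment. */
static bool
burst_is_single_seg(struct scan_mbuf *const *pkts, uint16_t nb_pkts)
{
	uint16_t multi = 0;
	uint16_t i;

	/* No early exit: the result is accumulated, so the loop body has no
	 * data-dependent branch. */
	for (i = 0; i < nb_pkts; i++)
		multi |= (uint16_t)(pkts[i]->nb_segs != 1);

	return multi == 0;
}

/* Decide per burst which code path to use. */
static enum tx_path
select_tx_path(struct scan_mbuf *const *pkts, uint16_t nb_pkts)
{
	return burst_is_single_seg(pkts, nb_pkts) ?
	       TX_PATH_VECTOR : TX_PATH_FALLBACK;
}
```

The scan touches every mbuf once before transmission, so its cost would have to be weighed against the gain from keeping the vector path for the common single-segment case.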