From: Bruce Richardson <bruce.richardson@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>, <dev@dpdk.org>,
"Thomas Monjalon" <thomas@monjalon.net>,
Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Subject: Re: [RFC] ethdev: clarify rte_eth_tx_burst() return value and ownership semantics
Date: Wed, 18 Feb 2026 08:52:12 +0000
Message-ID: <aZV9vI-17jfQ2MQu@bricha3-mobl1.ger.corp.intel.com>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35F65728@smartserver.smartshare.dk>
On Wed, Feb 18, 2026 at 09:48:04AM +0100, Morten Brørup wrote:
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Monday, 16 February 2026 19.00
> >
> > The documentation for rte_eth_tx_burst() uses the word "sent" to
> > describe the return value, which is misleading. Packets returned as
> > consumed may not have been transmitted yet; they have been accepted
> > by the driver and are no longer the caller's responsibility.
> >
> > This matters because the common usage pattern is:
> >
> >     n = rte_eth_tx_burst(port, txq, mbufs, nb_pkts);
> >     for (i = n; i < nb_pkts; i++)
> >         rte_pktmbuf_free(mbufs[i]);
> >
> > For this to work correctly, the contract must be:
> > - tx_pkts[0..n-1]: ownership transferred to the driver.
> > - tx_pkts[n..nb_pkts-1]: untouched, still owned by the caller.
> >
> > Several drivers (and AI-assisted reviews) misinterpret the current
> > wording and treat packets with errors as unconsumed, returning a
> > short count. This causes callers to retry those packets indefinitely.
> > The correct behavior is that the driver must consume (and free)
> > erroneous packets, counting them via tx_errors.
> >
> > Replace "sent" with "consumed" in the return value description,
> > spell out the mbuf ownership contract, clarify the error handling
> > expectation, and update the @return block to match.
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> > lib/ethdev/rte_ethdev.h | 21 ++++++++++++++++-----
> > 1 file changed, 16 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index 0d8e2d0236..9e49c4a945 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -6639,13 +6639,24 @@ uint16_t rte_eth_call_tx_callbacks(uint16_t port_id, uint16_t queue_id,
> >   * of the ring.
> >   *
> >   * The rte_eth_tx_burst() function returns the number of packets it
> > - * actually sent. A return value equal to *nb_pkts* means that all packets
> > - * have been sent, and this is likely to signify that other output packets
> > + * has consumed from the *tx_pkts* array. The driver takes ownership of
> > + * the mbufs for all consumed packets (tx_pkts[0] to tx_pkts[n-1]);
> > + * the caller must not access them afterward. The remaining packets
> > + * (tx_pkts[n] to tx_pkts[nb_pkts-1]) are not modified and remain the
> > + * caller's responsibility.
> > + *
> > + * A return value equal to *nb_pkts* means that all packets have been
> > + * consumed, and this is likely to signify that other output packets
> >   * could be immediately transmitted again. Applications that implement a
> >   * "send as many packets to transmit as possible" policy can check this
> >   * specific case and keep invoking the rte_eth_tx_burst() function until
> >   * a value less than *nb_pkts* is returned.
> >   *
> > + * If a packet cannot be transmitted due to an error (for example, an
> > + * invalid offload flag), the driver must still consume it and free the
> > + * mbuf, rather than stopping at that point. Such packets should be
> > + * counted in the *tx_errors* port statistic.
>
> The above paragraph is driver centric, it should be application centric.
> Suggest rephrasing as:
>
> If a packet cannot be transmitted due to an error (for example, an invalid offload flag), the rte_eth_tx_burst() function will still consume it, rather than stopping at that point.
> Such packets are counted in the *oerrors* port statistic.
>
> NB: In struct rte_eth_stats [1], the error counter is named "oerrors", not "tx_errors".
>
> [1]: https://elixir.bootlin.com/dpdk/v25.11/source/lib/ethdev/rte_ethdev.h#L273
>
> While discussing details...
> Let's say a packet has 4 segments, and the driver only has 2 descriptors remaining available.
> In that case, I think the driver should not consume the packet, but leave it for the application to either drop it or retry transmitting it later.
> Do we want to mention this case too, or is it a semi-obvious case of the descriptor ring having no more room?
>
I would tend towards treating it as covered by the descriptor ring not
having room. If we try to cover all edge cases here, the documentation will
get too long and will therefore be less likely to be read.
/Bruce
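
For readers following the thread, here is a minimal, self-contained C sketch
of the contract under discussion. It is a mock, not DPDK code: mock_pkt,
mock_txq, mock_pkt_alloc(), mock_pkt_free(), mock_tx_burst(), send_or_drop()
and the mock_pkts_freed counter are all hypothetical names invented for
illustration. It assumes the behavior proposed above: a packet with an error
is consumed, freed, and counted in oerrors, while a multi-segment packet that
does not fit in the remaining descriptors ends the burst and stays with the
caller. A real driver would free mbufs on Tx completion; the mock frees them
at once to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf chain; NOT struct rte_mbuf. */
struct mock_pkt {
	uint16_t nb_segs;  /* descriptors this packet needs */
	bool bad_offload;  /* simulates an invalid offload request */
};

/* Hypothetical stand-in for a Tx queue plus its port statistics. */
struct mock_txq {
	uint16_t free_descs; /* descriptors still available in the ring */
	uint64_t opackets;   /* packets accepted for transmission */
	uint64_t oerrors;    /* packets consumed but dropped on error */
};

static unsigned int mock_pkts_freed; /* counts frees, for the demo only */

static struct mock_pkt *
mock_pkt_alloc(uint16_t nb_segs, bool bad_offload)
{
	struct mock_pkt *p = malloc(sizeof(*p));

	if (p != NULL) {
		p->nb_segs = nb_segs;
		p->bad_offload = bad_offload;
	}
	return p;
}

static void
mock_pkt_free(struct mock_pkt *p)
{
	free(p);
	mock_pkts_freed++;
}

/* Mock "driver" side: returns the number of packets consumed. */
static uint16_t
mock_tx_burst(struct mock_txq *q, struct mock_pkt **pkts, uint16_t nb_pkts)
{
	uint16_t n;

	assert(q != NULL && pkts != NULL);
	for (n = 0; n < nb_pkts; n++) {
		struct mock_pkt *p = pkts[n];

		if (p->bad_offload) {
			/* Error: consume and free, count it, keep going. */
			mock_pkt_free(p);
			q->oerrors++;
			continue;
		}
		if (p->nb_segs > q->free_descs)
			break; /* Ring has no room: leave pkts[n..] untouched. */
		q->free_descs -= p->nb_segs;
		q->opackets++;
		mock_pkt_free(p); /* Mock only: "transmit" completes at once. */
	}
	return n;
}

/* Caller side: the common pattern of dropping whatever was not consumed. */
static uint16_t
send_or_drop(struct mock_txq *q, struct mock_pkt **pkts, uint16_t nb_pkts)
{
	uint16_t n = mock_tx_burst(q, pkts, nb_pkts);

	for (uint16_t i = n; i < nb_pkts; i++)
		mock_pkt_free(pkts[i]); /* Still owned by the caller. */
	return n;
}
```

With three free descriptors and a burst of four packets (one good
single-segment packet, one with a bad offload, one with four segments, one
more single-segment packet), the mock consumes the first two and stops at the
four-segment packet with two descriptors left, which is the case Morten
describes. The caller then frees the last two packets itself, so nothing is
retried indefinitely and nothing is freed twice.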